app package
Subpackages
- app.components package
- Subpackages
- app.components.api_tools package
- app.components.enrichment package
- app.components.figures package
- Submodules
- app.components.figures.bar_graph module
- app.components.figures.before_after_plot module
- app.components.figures.color_tools module
- app.components.figures.commonality_graph module
- app.components.figures.comparative_plot module
- app.components.figures.cvplot module
- app.components.figures.figure_legends module
- app.components.figures.heatmaps module
- app.components.figures.histogram module
- app.components.figures.imputation_histogram module
- app.components.figures.network_plot module
- app.components.figures.reproducibility_graph module
- app.components.figures.scatter module
- app.components.figures.tic_graph module
- app.components.figures.volcano_plot module
- Module contents
- app.components.tools package
- Submodules
- app.components.EnrichmentAdmin module
- app.components.MS_run_json_parser module
- app.components.cleanup_tasks module
- app.components.db_functions module
add_column(), add_multiple_records(), add_record(), create_connection(), delete_multiple_records(), delete_record(), drop_table(), dump_full_database_to_csv(), export_snapshot(), generate_database_table_templates_as_tsvs(), get_contaminants(), get_database_versions(), get_from_table(), get_from_table_by_list_criteria(), get_from_table_match_with_priority(), get_full_table_as_pd(), get_last_update(), get_table_column_names(), is_test_db(), list_tables(), map_protein_info(), modify_multiple_records(), modify_record(), remove_column(), rename_column()
- app.components.figure_functions module
- app.components.file_upload_api module
- app.components.infra module
- app.components.interactomics module
logger, add_bait_column(), add_crapome(), count_knowns(), create_dummy_list_txt(), do_ms_microscopy(), do_network(), enrich(), filter_controls_by_similarity(), generate_saint_container(), get_saint_matrix(), known_plot(), make_saint_dict(), map_intensity(), network_display_data(), pca(), prepare_controls(), prepare_crapome(), run_saint(), saint_cmd(), saint_counts(), saint_filtering(), saint_histogram()
- app.components.mathparser module
- app.components.matrix_functions module
compute_zscore(), compute_zscore_based_deviation_from_control(), count_per_sample(), do_pca(), filter_missing(), hierarchical_clustering(), impute(), impute_gaussian(), impute_minprob(), impute_minprob_df(), impute_minval(), median_normalize(), normalize(), quantile_normalize(), ranked_dist(), ranked_dist_n_per_run(), reverse_log2()
- app.components.ms_microscopy module
- app.components.parsing module
check_bait(), check_comparison_file(), check_numeric(), check_required_columns(), check_sample_table_column(), clean_column_name(), clean_sample_names(), delete_samples(), format_data(), format_sample_group_name(), generate_replicate_name(), get_distribution_title(), guess_controls(), handle_mztab(), identify_columns(), parse_comparisons(), parse_data_file(), parse_parameters(), parse_sample_table(), read_data_from_content(), read_df_from_content(), read_dia_nn(), read_fragpipe(), read_matrix(), remove_all_na(), remove_duplicate_protein_groups(), remove_file_path(), remove_filepath_from_columns(), remove_from_table(), remove_rawfile_ending(), rename_columns_and_update_expdesign(), sdrf_to_table(), unmix_dtypes(), update_nested_dict(), validate_basic_inputs()
- app.components.proteomics module
- app.components.qc_analysis module
- app.components.quick_stats module
- app.components.text_handling module
- app.components.tooltips module
- app.components.ui_components module
HEADER_DICT, checklist(), discard_samples_checklist(), interactomics_area(), interactomics_control_col(), interactomics_crapome_col(), interactomics_enrichment_col(), interactomics_inbuilt_control_col(), interactomics_input_card(), main_content_div(), main_sidebar(), make_du_uploader(), modals(), navbar(), post_saint_container(), proteomics_area(), proteomics_input_card(), qc_area(), range_input(), saint_filtering_container(), table_of_contents(), upload_area(), workflow_area()
- Module contents
- Subpackages
- app.pages package
- app.pipeline_module package
- Submodules
- app.pipeline_module.batch_data_store_builder module
- app.pipeline_module.batch_figure_builder_from_divs module
- app.pipeline_module.pipeline_batch module
BatchConfig, BatchConfig.additional_controls, BatchConfig.chosen_enrichments, BatchConfig.comparison_file, BatchConfig.control_group, BatchConfig.crapome_fc_threshold, BatchConfig.crapome_percentage_threshold, BatchConfig.crapome_sets, BatchConfig.data_table_path, BatchConfig.fc_threshold, BatchConfig.figure_template, BatchConfig.force_supervenn, BatchConfig.imputation, BatchConfig.keep_batch_output, BatchConfig.n_controls, BatchConfig.na_filter_percent, BatchConfig.na_filter_type, BatchConfig.normalization, BatchConfig.outdir, BatchConfig.p_threshold, BatchConfig.plot_formats, BatchConfig.proximity_filtering, BatchConfig.remove_common_contaminants, BatchConfig.rename_replicates, BatchConfig.rescue_enabled, BatchConfig.saint_bfdr_threshold, BatchConfig.sample_table_path, BatchConfig.test_type, BatchConfig.unique_only, BatchConfig.uploaded_controls, BatchConfig.workflow
dash_to_wire(), run_pipeline()
- app.pipeline_module.pipeline_from_toml module
- app.pipeline_module.pipeline_input_watcher module
- Module contents
- app.resources package
Submodules
app.app module
Main application module for ProteoGyver.
This module initializes and configures the Dash application with Celery for long callbacks, sets up logging, creates the navigation bar, and defines the main layout structure.
- app.app.celery_app
Celery application instance for handling long callbacks
- Type:
Celery
- app.app.app
Main Dash application instance
- Type:
Dash
- app.app.server
Flask server instance from Dash app
- Type:
Flask
- app.app.logger
Application logger instance
- Type:
Logger
Create the application navigation bar.
- Parameters:
parameters (dict) – App parameters containing navbar configuration.
- Return type:
Navbar
- Returns:
Bootstrap Navbar with pages and branding.
Toggle the navbar collapse state.
- Parameters:
n (int) – Number of clicks on the toggle button.
is_open (bool) – Current collapse state.
- Return type:
bool
- Returns:
New collapse state.
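The toggle behaviour described above can be sketched as a small pure function. This is a minimal sketch, not the module's code: the name `toggle_navbar_collapse` is hypothetical (the actual function name is not shown in this reference), and the Dash callback decorator that would wire it to the toggle button is omitted.

```python
from typing import Optional


def toggle_navbar_collapse(n: Optional[int], is_open: bool) -> bool:
    """Return the new collapse state after a click on the toggle button.

    Hypothetical name; illustrates the documented n / is_open contract only.
    """
    # Before the first click the click count is None (or 0): keep the state.
    if n:
        return not is_open
    return is_open


# One click on a closed navbar opens it.
opened = toggle_navbar_collapse(1, False)
```

Keeping the logic free of Dash objects makes it easy to test without starting the application.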
app.database_admin module
Administrative entrypoints and helpers for database lifecycle operations.
Tasks include schema creation, snapshotting, external data updates, and periodic cleanup of old versions.
- app.database_admin.clean_database(versions_to_keep_dict)
Remove old database directories, keeping a configured number of versions.
- Parameters:
versions_to_keep_dict – Mapping with keys ‘<name>’ (keep count), ‘<name>_path’ (path list), and ‘<name>_regex’ (folder regex with group 1 sortable for recency).
- Return type:
None
- Returns:
None.
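The selection step implied by the regex convention (group 1 sorts by recency) might look like the following. It is a sketch under stated assumptions: `select_old_versions` is a hypothetical helper, it works on a plain list of folder names, and it only reports what would be removed instead of deleting anything.

```python
import re


def select_old_versions(folder_names, pattern, keep):
    """Return folder names to delete, keeping the `keep` most recent matches.

    Hypothetical helper: folders that do not match `pattern` are never
    selected for deletion.
    """
    regex = re.compile(pattern)
    matched = []
    for name in folder_names:
        m = regex.match(name)
        if m:
            # Assumption: group 1 sorts by recency (e.g. a YYYYMMDD stamp).
            matched.append((m.group(1), name))
    matched.sort(reverse=True)  # newest first
    return [name for _, name in matched[keep:]]


folders = ["db_20240101", "db_20240301", "db_20240201", "notes"]
to_remove = select_old_versions(folders, r"db_(\d{8})$", keep=2)
```

Here only the oldest matching folder is selected; `notes` is ignored because it does not match the pattern.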
- app.database_admin.create_sqlite_from_schema(schema_file, db_file, overwrite=False, pragmas=('foreign_keys=ON', 'journal_mode=WAL'))
Create a SQLite database from a .sql schema file.
- Parameters:
schema_file (str | Path) – Path to the schema file.
db_file (str | Path) – Path of the database to create.
overwrite (bool) – Whether to overwrite an existing DB file.
pragmas (Optional[Iterable[str]]) – PRAGMAs to apply after connecting (e.g., (“foreign_keys=ON”,)).
- Return type:
Path
- Returns:
Absolute path to the created database.
- Raises:
FileNotFoundError – If schema_file does not exist.
FileExistsError – If db_file exists and overwrite is False.
sqlite3.Error – If executing the schema fails.
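The documented contract (existence checks, PRAGMAs, schema execution, absolute path returned) can be illustrated with the standard `sqlite3` module. This is a minimal sketch, not the ProteoGyver implementation: `create_db_sketch` and the `proteins` demo table are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def create_db_sketch(schema_file, db_file, overwrite=False,
                     pragmas=("foreign_keys=ON", "journal_mode=WAL")):
    """Create a SQLite file from a .sql schema (illustrative only)."""
    schema_path, db_path = Path(schema_file), Path(db_file)
    if not schema_path.exists():
        raise FileNotFoundError(schema_path)
    if db_path.exists():
        if not overwrite:
            raise FileExistsError(db_path)
        db_path.unlink()
    conn = sqlite3.connect(db_path)
    try:
        # Apply PRAGMAs first so they are in effect while the schema runs.
        for pragma in pragmas or ():
            conn.execute(f"PRAGMA {pragma}")
        conn.executescript(schema_path.read_text(encoding="utf-8"))
        conn.commit()
    finally:
        conn.close()
    return db_path.resolve()


# Usage with a throwaway directory and a one-table demo schema.
with tempfile.TemporaryDirectory() as tmp:
    schema = Path(tmp) / "schema.sql"
    schema.write_text("CREATE TABLE proteins (uniprot_id TEXT PRIMARY KEY);")
    created = create_db_sketch(schema, Path(tmp) / "demo.db")
    check = sqlite3.connect(created)
    tables = [row[0] for row in check.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    check.close()
```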
- app.database_admin.get_external_versions(conn, externals)
Get the versions of the external databases.
- Return type:
dict
- app.database_admin.last_update(conn, uptype, interval, time_format)
Return the last update time for a given update type or a default.
If the log lookup fails, defaults to now minus interval seconds.
- Parameters:
conn (Connection) – SQLite database connection.
uptype (str) – Update type label to query (e.g., ‘external’).
interval (int) – Interval in seconds to compute a safe default.
time_format (str) – Timestamp format string used in the log table.
- Return type:
datetime
- Returns:
Datetime of the last update or a computed default.
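The fallback behaviour can be sketched as follows. The table name `update_log` and its columns `update_type` and `timestamp` are assumptions made for illustration; the real log table layout is not documented here.

```python
import sqlite3
from datetime import datetime, timedelta


def last_update_sketch(conn, uptype, interval, time_format):
    """Return the latest logged update time, or now minus `interval` seconds."""
    try:
        # Hypothetical table and column names.
        row = conn.execute(
            "SELECT timestamp FROM update_log WHERE update_type = ? "
            "ORDER BY rowid DESC LIMIT 1",
            (uptype,),
        ).fetchone()
        # row is None when nothing is logged yet, which raises TypeError.
        return datetime.strptime(row[0], time_format)
    except (sqlite3.Error, TypeError, ValueError):
        return datetime.now() - timedelta(seconds=interval)


fmt = "%Y-%m-%d %H:%M:%S"
conn = sqlite3.connect(":memory:")
# No log table yet: the lookup fails and the default is used.
fallback = last_update_sketch(conn, "external", 3600, fmt)
conn.execute("CREATE TABLE update_log (update_type TEXT, timestamp TEXT)")
conn.execute("INSERT INTO update_log VALUES ('external', '2024-05-01 12:00:00')")
logged = last_update_sketch(conn, "external", 3600, fmt)
conn.close()
```

Returning a default one interval in the past means a missing or unreadable log makes the next update look due instead of raising.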
app.database_updater module
Utilities to update and synchronize the SQLite database from TSV inputs and external APIs (UniProt, IntAct, BioGRID).
This module provides helpers for:
- Creating TSV-based inserts/updates with schema reconciliation
- Merging interaction datasets and exporting incremental updates
- Recording update logs and packaging outputs
- app.database_updater.get_dataframe_differences(df1, df2, ignore_columns=None)
Compare two DataFrames and return modified/new indices and missing indices.
Columns listed in ignore_columns are dropped before comparison. The two DataFrames must have identical columns after dropping.
- Parameters:
df1 (DataFrame) – Baseline DataFrame.
df2 (DataFrame) – New DataFrame to compare against baseline.
ignore_columns (list[str] | None) – Columns to ignore during comparison.
- Return type:
tuple[list[str], list[str]]
- Returns:
Tuple of (new_or_modified_indices, missing_indices).
- app.database_updater.handle_merg_chunk(existing, organisms, timestamp, L, last_update_date, odir, parameters)
Merge IntAct and BioGRID chunks, writing new and modified interactions.
- Parameters:
existing (DataFrame) – Existing interactions DataFrame (index ‘interaction’).
organisms (set | None) – Optional set of organism IDs to include.
timestamp (str) – Current update timestamp string.
L (str) – Chunk prefix letter.
last_update_date (datetime | None) – Optional cutoff date for remote queries.
odir (str) – Output directory for TSVs.
parameters (dict) – Updater parameters including ‘Ignore diffs’ and paths.
- Return type:
None
- Returns:
None.
- app.database_updater.handle_mods(check_for_mods, existing, timestamp, L, parameters, odir)
Write modified interactions to TSV and optionally queue deletions.
- Parameters:
check_for_mods – List of candidate modified interaction dicts.
existing – Existing interactions DataFrame (indexed by ‘interaction’).
timestamp – Current update timestamp string.
L – Chunk prefix/letter for file naming.
parameters – Updater parameters including deletion settings.
odir – Output directory.
- Return type:
None
- Returns:
None.
- app.database_updater.handle_new(new_interactions, odir, timestamp, L)
Write new interactions to a timestamped TSV file in the output directory.
- Parameters:
new_interactions – List of interaction dicts keyed by ‘interaction’.
odir – Output directory.
timestamp – Current update timestamp string.
L – Chunk prefix/letter for file naming.
- Return type:
None
- Returns:
None.
- app.database_updater.merge_multiple_string_dataframes(dfs)
Merge DataFrames with semicolon-separated string fields by union of values.
Each cell is split on ‘;’ and de-duplicated across dataframes; merged rows are indexed by ‘interaction’.
- Parameters:
dfs (list[DataFrame]) – List of input DataFrames.
- Return type:
DataFrame
- Returns:
Merged DataFrame with unioned semicolon-joined values.
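The split, union and re-join idea can be shown without pandas. This is a simplified sketch: plain lists of dicts stand in for the DataFrames, `merge_semicolon_rows` is a hypothetical name, and sorting the joined values is an assumption made here to give deterministic output.

```python
from collections import defaultdict


def merge_semicolon_rows(tables):
    """Union ';'-separated cell values across tables, keyed by 'interaction'."""
    merged = defaultdict(lambda: defaultdict(set))
    for rows in tables:
        for row in rows:
            key = row["interaction"]
            for column, cell in row.items():
                if column == "interaction" or not cell:
                    continue
                # Split each cell and collect the distinct values per column.
                merged[key][column].update(v for v in cell.split(";") if v)
    return {
        key: {col: ";".join(sorted(vals)) for col, vals in cols.items()}
        for key, cols in merged.items()
    }


# Invented example rows for one interaction reported by two sources.
intact = [{"interaction": "P1_P2", "source": "IntAct", "pmid": "111;222"}]
biogrid = [{"interaction": "P1_P2", "source": "BioGRID", "pmid": "222;333"}]
result = merge_semicolon_rows([intact, biogrid])
```

The shared value `222` appears once in the merged `pmid` field, and both sources are retained.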
- app.database_updater.stream_flattened_rows(df)
Yield rows as dictionaries with set-unioned values split on ‘;’.
- Parameters:
df (DataFrame) – Input DataFrame; uses ‘interaction’ as index when present.
- Return type:
Iterator[dict]
- Returns:
Iterator of dict rows with ‘interaction’ key and set values per column.
- app.database_updater.update_database(conn, parameters, cc_cols, cc_types, timestamp)
Update multiple database tables using TSV files from configured directories.
- Parameters:
conn – SQLite database connection.
parameters – Parameters with ‘Update files’ table→directory mappings and limits.
cc_cols – Expected column names for creating fresh tables.
cc_types – SQL column types aligned with cc_cols.
timestamp – Current update timestamp string.
- Returns:
Tuple of (inmod_names, inmod_vals) listing counts per table and action.
- app.database_updater.update_external_data(conn, parameters, timestamp, organisms=None, last_update_date=None, versions=None, ncpu=1)
Update external data tables (UniProt and known interactions).
- Parameters:
conn – SQLite database connection.
parameters – Updater parameters; includes external update intervals.
timestamp – Current update timestamp string.
organisms (set | None) – Optional set of organism IDs to update.
last_update_date (datetime | None) – Cutoff date; ignore data older than this.
ncpu (int) – Number of CPUs to use for merging chunks.
- Returns:
None.
- app.database_updater.update_knowns(conn, parameters, timestamp, uniprots, organisms, versions, last_update_date=None, ncpu=1)
Update known interaction TSVs in parallel by merging external sources.
- Parameters:
conn – SQLite connection for reading existing interactions.
parameters – Updater parameters with file paths.
timestamp – Current update timestamp string.
uniprots – Set of UniProt IDs to filter by.
organisms – Optional set of organism IDs to include.
versions (dict[str, list[str] | str]) – Dictionary of versions for each external source.
last_update_date (datetime | None) – Cutoff datetime; ignore older remote entries.
ncpu (int) – Number of worker processes.
- Return type:
list[tuple[str, str]]
- Returns:
List of new versions for each external source.
- app.database_updater.update_log_table(conn, inmod_names, inmod_vals, timestamp, uptype)
Record database update info in a log table.
- Parameters:
conn – SQLite database connection.
inmod_names – Names like ‘table action’ (e.g., ‘proteins insertions’).
inmod_vals – Counts aligned with inmod_names.
timestamp – Update timestamp string.
uptype (str) – Update category label (e.g., ‘external’, ‘snapshot’).
- Return type:
None
- Returns:
None.
- app.database_updater.update_table_with_file(cursor, table_name, file_path, parameters, timestamp, add_info='')
Update a table with data from a TSV file, adding columns if needed.
- Parameters:
cursor – SQLite database cursor.
table_name – Target table name.
file_path – Path to the TSV file with new data.
parameters – Configuration parameters; expects keys like ‘Allowed new columns’, ‘Allowed missing columns’, ‘Ignore diffs’.
timestamp – Current update timestamp string.
add_info (str) – Optional progress info for logging.
- Returns:
Tuple of (insertions, modifications).
- Raises:
ValueError – If too many new or missing columns are detected.
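The column check behind the documented ValueError might be sketched as below. This assumes that the ‘Allowed new columns’ and ‘Allowed missing columns’ parameters hold integer limits, which this reference does not confirm; `check_column_drift` and the column names are hypothetical.

```python
def check_column_drift(table_columns, file_columns, parameters):
    """Return (new, missing) columns, raising if either exceeds its limit."""
    table_set, file_set = set(table_columns), set(file_columns)
    new = [c for c in file_columns if c not in table_set]
    missing = [c for c in table_columns if c not in file_set]
    # Assumption: both parameter values are integer limits.
    if len(new) > parameters["Allowed new columns"]:
        raise ValueError(f"Too many new columns: {new}")
    if len(missing) > parameters["Allowed missing columns"]:
        raise ValueError(f"Too many missing columns: {missing}")
    return new, missing


limits = {"Allowed new columns": 1, "Allowed missing columns": 0}
# One new column in the file is within the limit, so it is only reported.
new_cols, missing_cols = check_column_drift(
    ["uniprot_id", "gene_name"],
    ["uniprot_id", "gene_name", "organism"],
    limits,
)
```

Columns reported as new are the ones a caller would add to the table before inserting rows.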
- app.database_updater.update_uniprot(conn, parameters, timestamp, versions, organisms=None)
Download and stage UniProt data, writing TSV updates if differences found.
- Parameters:
conn – SQLite database connection.
parameters – Updater parameters (paths, ignore diffs, deletion policy).
timestamp – Current update timestamp string.
versions (list) – List of current versions of the UniProt database.
organisms (set | None) – Optional set of organism IDs to include.
- Returns:
Set of UniProt IDs present in the fetched dataset.
app.element_styles module
Styles for Dash interface elements.
Defines style dictionaries used throughout the UI, including sidebar, content area, upload components, and status indicators.
app.embedded_page_updater module
Module for creating embedded page files from a list of websites.
This module reads a text file containing website names and URLs, then generates Dash pages that embed these websites using html.Embed.
Limitations:
- Not all sites can be embedded due to content security policies (CSP).
- HTTPS support is untested but may help with embedding restrictions.
- Successfully tested only with sites served from the same server and www.proteomics.fi.
- All testing has been done without HTTPS.
- app.embedded_page_updater.create_page_file(output_dir, site_name, url)
Create a new Dash page file for embedding a website.
- Parameters:
output_dir (str) – Directory where the page file should be created.
site_name (str) – Name of the website (used for the page title and route).
url (str) – URL of the website to embed.
- Return type:
None
- Returns:
None.
- app.embedded_page_updater.parse_embed_file(filename)
Parse the embed configuration file containing website names and URLs.
- Parameters:
filename (str) – Path to the text file containing site information.
- Return type:
List[Tuple[str, str]]
- Returns:
List of (site_name, url) tuples.
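The file layout is not specified in this reference. The sketch below assumes one site per line, with the URL as the last whitespace-separated token and everything before it as the site name; blank lines and lines starting with `#` are skipped. `parse_embed_lines` and the example site name are hypothetical.

```python
from typing import Iterable, List, Tuple


def parse_embed_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse 'site name <whitespace> url' lines into (site_name, url) tuples."""
    sites = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Assumed layout: the URL is the final token, the rest is the name.
        parts = line.rsplit(None, 1)
        if len(parts) == 2:
            sites.append((parts[0], parts[1]))
    return sites


example = ["# name url", "", "Proteomics Unit http://www.proteomics.fi"]
parsed = parse_embed_lines(example)
```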
- app.embedded_page_updater.update_pages(output_dir, embed_file)
Update embedded pages based on the configuration file.
- Parameters:
output_dir (str) – Directory where page files should be created.
embed_file (str) – Path to the text file containing site information.
- Return type:
None
- Returns:
None.
app.run_as_pipeline module
ProteoGyver Batch Pipeline
This script runs the complete batch pipeline using the same infrastructure as the GUI, ensuring identical behavior and maintainability.
Module contents
ProteoGyver - A web-based platform for proteomics and interactomics data analysis.