app.components package

Subpackages

Submodules

app.components.EnrichmentAdmin module

class app.components.EnrichmentAdmin.EnrichmentAdmin(parameters_file)[source]

Bases: object

Manage enrichment handlers and orchestrate enrichment runs.

Parameters:

parameters_file (str) – Path to parameters TOML used to locate handler modules.

enrich_all(data_table, enrichment_strings, id_column=None, id_list=None, split_by_column=None, split_name=None)[source]

Run all requested enrichments via their handlers.

Parameters:
  • data_table (DataFrame) – Input table with identifiers and optional split column.

  • enrichment_strings (list) – List of enrichment names to run.

  • id_column (str) – Column containing identifiers to enrich.

  • id_list (list) – Explicit list of identifiers if not using id_column.

  • split_by_column (str) – Optional column to split input by groups/baits.

  • split_name (str) – Label for the split dimension (defaults to ‘Sample group’).

Return type:

list

Returns:

Tuple of (result_names, return_dataframes, information).

Raises:

AssertionError – If neither id_column nor id_list is provided.

get_available()[source]

List all available enrichment names across handlers.

Return type:

list

Returns:

Sorted list of enrichment names.

get_default()[source]

List default enrichment names suggested by handlers.

Return type:

list

Returns:

Sorted list of default enrichment names.

get_disabled()[source]

List enrichments disabled by configuration.

Return type:

list

Returns:

Sorted list of disabled enrichment names.

import_handlers()[source]

Import all enrichment handler modules from configured directory.

Return type:

dict

Returns:

Dict mapping module name -> handler instance.

app.components.MS_run_json_parser module

app.components.cleanup_tasks module

app.components.db_functions module

app.components.db_functions.add_column(db_conn, tablename, colname, coltype)[source]

Add a column to a table.

Parameters:
  • db_conn – SQLite connection.

  • tablename – Target table name.

  • colname – New column name.

  • coltype – Column type string.

Returns:

None

Raises:

sqlite3.Error – On SQL failure.

app.components.db_functions.add_multiple_records(db_conn, tablename, column_names, list_of_values)[source]

Insert multiple records into a table.

Parameters:
  • db_conn – SQLite connection.

  • tablename – Table name.

  • column_names – List of column names.

  • list_of_values – Iterable of row value sequences.

Return type:

None

Returns:

None

Raises:

sqlite3.Error – On SQL failure.
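
The batch insert described above can be sketched with the standard sqlite3 module. This is a minimal illustration, not the module's implementation: insert_rows and the demo table are hypothetical names, and it assumes table and column names come from trusted code, because only values can be bound as parameters.

```python
import sqlite3


def insert_rows(db_conn, tablename, column_names, list_of_values):
    """Hypothetical helper: insert many rows with one parameterized statement."""
    columns = ", ".join(column_names)
    placeholders = ", ".join("?" for _ in column_names)
    # Identifiers cannot be bound as parameters, so table and column names
    # are interpolated and must be trusted; only the values are bound.
    sql = f"INSERT INTO {tablename} ({columns}) VALUES ({placeholders})"
    db_conn.executemany(sql, list_of_values)
    db_conn.commit()


# Demonstration on an in-memory database with a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proteins (uniprot_id TEXT, gene_name TEXT)")
insert_rows(
    conn,
    "proteins",
    ["uniprot_id", "gene_name"],
    [("P00001", "GENE1"), ("P00002", "GENE2")],
)
row_count = conn.execute("SELECT COUNT(*) FROM proteins").fetchone()[0]
```

A single-row insert is the same statement executed once, so add_record can be thought of as the one-element case.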

app.components.db_functions.add_record(db_conn, tablename, column_names, values)[source]

Insert a single record into a table.

Parameters:
  • db_conn – SQLite connection.

  • tablename – Table name.

  • column_names – List of column names.

  • values – List of values.

Returns:

None

Raises:

sqlite3.Error – On SQL failure.

app.components.db_functions.create_connection(db_file, error_file=None, mode='ro')[source]

Create a database connection to an SQLite file.

Parameters:
  • db_file – Database file path; returns None if file doesn’t exist.

  • error_file (str | None) – Optional path to append exception messages.

  • mode (str) – 'ro' for read-only (default); any other value opens the database read-write.

Returns:

Connection object or None.

app.components.db_functions.delete_multiple_records(db_conn, table, deletes)[source]

Delete multiple records according to delete specs.

Parameters:
  • db_conn – SQLite connection.

  • table – Table name.

  • deletes – List of dicts with keys criteria_col and criteria.

Returns:

None

Raises:

sqlite3.Error – On SQL failure.

app.components.db_functions.delete_record(db_conn, tablename, criteria_col, criteria)[source]

Convenience wrapper that deletes a single record.

Parameters:
  • db_conn – SQLite connection.

  • tablename – Table name.

  • criteria_col – WHERE column.

  • criteria – WHERE value.

Returns:

None

app.components.db_functions.drop_table(conn, table_name)[source]

Drop a table from the database if it exists.

Parameters:
  • conn (Connection) – SQLite connection.

  • table_name (str) – Table name to drop.

Return type:

None

Returns:

None

app.components.db_functions.dump_full_database_to_csv(database_file, output_directory)[source]

Dump all tables to TSV files in an output directory.

Parameters:
  • database_file – Path to database file.

  • output_directory – Destination directory.

Return type:

None

Returns:

None

app.components.db_functions.export_snapshot(source_path, snapshot_dir, snapshots_to_keep)[source]

Create a timestamped SQLite snapshot and prune old backups.

Parameters:
  • source_path (str) – Source DB path.

  • snapshot_dir (str) – Directory to store backups.

  • snapshots_to_keep (int) – Keep at most this many snapshots (None to skip pruning).

Return type:

None

Returns:

None

Raises:

FileNotFoundError – If source DB does not exist.

app.components.db_functions.generate_database_table_templates_as_tsvs(db_conn, output_dir, primary_keys)[source]

Generate TSV templates (headers only) for selected tables.

Parameters:
  • db_conn – SQLite connection (not closed by this function).

  • output_dir – Directory to write TSV files.

  • primary_keys – Mapping table -> primary key column name to place first.

Returns:

None

app.components.db_functions.get_contaminants(db_file, protein_list=None, error_file=None)[source]

Retrieve contaminant UniProt IDs from the contaminants table.

Parameters:
  • db_file (str) – Database file path.

  • protein_list (list) – If provided, intersect results with this list.

  • error_file (str) – Optional error log path for connection errors.

Return type:

list

Returns:

List of contaminant UniProt IDs.

app.components.db_functions.get_database_versions(db_file_path, subset=None)[source]

Get database version information, optionally restricted to a subset of update types.

Parameters:
  • db_file_path (str) – Path to the database file.

  • subset (list[str] | None) – Subset of update types to get versions for.

Return type:

dict

Returns:

Dict of version information for the database.

app.components.db_functions.get_from_table(conn, table_name, criteria_col=None, criteria=None, select_col=None, as_pandas=False, pandas_index_col=None, operator='=')[source]

Query a table with optional WHERE and return list or DataFrame.

Parameters:
  • conn (Connection) – SQLite connection.

  • table_name (str) – Table name.

  • criteria_col (str | None) – Optional WHERE column.

  • criteria (str | tuple | None) – WHERE value or tuple for two-parameter condition.

  • select_col (str | list | None) – Column(s) to select (default all).

  • as_pandas (bool) – If True, return DataFrame; else list.

  • pandas_index_col (str | None) – Index column for DataFrame.

  • operator (str) – SQL operator to use (default =).

Return type:

list[tuple] | DataFrame

Returns:

DataFrame or list of values (first column) depending on as_pandas.

app.components.db_functions.get_from_table_by_list_criteria(conn, table_name, criteria_col, criteria, as_pandas=True, select_col=None, pandas_index_col=None)[source]

Query rows where a column matches any of the given values.

Parameters:
  • conn (Connection) – SQLite connection.

  • table_name (str) – Table name.

  • criteria_col (str) – Column name for the IN clause.

  • criteria (list) – List of values for the IN clause.

  • as_pandas (bool) – If True, return DataFrame; else list of tuples.

  • select_col (str) – Column(s) to select (default all).

  • pandas_index_col (str | None) – Optional index column for DataFrame.

Returns:

DataFrame or list depending on as_pandas.
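
The IN-clause lookup described above can be sketched as follows. select_where_in and the demo table are hypothetical, the sketch returns plain tuples rather than a DataFrame, and it assumes identifiers are trusted while the criteria values are bound as parameters.

```python
import sqlite3


def select_where_in(conn, table_name, criteria_col, criteria, select_col="*"):
    """Hypothetical helper: fetch rows whose column matches any listed value."""
    criteria = list(criteria)
    if not criteria:
        return []  # nothing to match, skip the query entirely
    # One "?" per value keeps the values out of the SQL text.
    placeholders = ", ".join("?" for _ in criteria)
    sql = (
        f"SELECT {select_col} FROM {table_name} "
        f"WHERE {criteria_col} IN ({placeholders})"
    )
    return conn.execute(sql, criteria).fetchall()


# Demonstration on an in-memory database with a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proteins (uniprot_id TEXT, gene_name TEXT)")
conn.executemany(
    "INSERT INTO proteins VALUES (?, ?)",
    [("P00001", "GENE1"), ("P00002", "GENE2"), ("P00003", "GENE3")],
)
matches = select_where_in(conn, "proteins", "uniprot_id", ["P00001", "P00003"])
```

SQLite limits the number of bound parameters per statement, so very long value lists may need to be queried in chunks.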

app.components.db_functions.get_from_table_match_with_priority(conn, criteria_list, table, criteria_cols, *, case_insensitive=False, key_col=None, extra_tiebreak=None, return_cols=None)[source]

Find the best-matching row per value using priority columns.

Parameters:
  • conn (Connection) – SQLite connection.

  • criteria_list (Iterable[str]) – Values to match.

  • table (str) – Table name.

  • criteria_cols (List[str]) – Columns to try in priority order.

  • case_insensitive (bool) – If True, match case-insensitively.

  • key_col (Optional[str]) – Column ensuring deterministic ordering; defaults to rowid.

  • extra_tiebreak (Optional[List[Tuple[str, str]]]) – Extra (column, direction) order terms.

  • return_cols (Optional[List[str]]) – Columns to return (default all).

Return type:

Dict[str, Optional[Dict[str, Any]]]

Returns:

Mapping value -> row dict (or None if no match).

Raises:

ValueError – For invalid table/columns.
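
Priority matching as described above can be illustrated with a simple loop: for each value, try the columns in order and keep the first hit. This sketch is an assumption about the general idea only; match_with_priority and the demo table are hypothetical, it issues one query per value and column, and it omits the case-insensitive, tiebreak, and validation options of the documented function.

```python
import sqlite3


def match_with_priority(conn, values, table, criteria_cols):
    """Hypothetical sketch: the first column in priority order that matches wins."""
    results = {}
    for value in values:
        results[value] = None
        for col in criteria_cols:
            cursor = conn.execute(
                f"SELECT * FROM {table} WHERE {col} = ? ORDER BY rowid LIMIT 1",
                (value,),
            )
            row = cursor.fetchone()
            if row is not None:
                names = [description[0] for description in cursor.description]
                results[value] = dict(zip(names, row))
                break  # lower-priority columns are not consulted
    return results


# Demonstration: "XYZ" is an alias of P1 but the gene name of P2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proteins (uniprot_id TEXT, gene_name TEXT, alias TEXT)")
conn.executemany(
    "INSERT INTO proteins VALUES (?, ?, ?)",
    [("P1", "ABC", "XYZ"), ("P2", "XYZ", "ABC2")],
)
best = match_with_priority(conn, ["XYZ", "ABC2", "NOPE"], "proteins", ["gene_name", "alias"])
```

Because gene_name is listed before alias, "XYZ" resolves to P2 even though P1 appears earlier in the table.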

app.components.db_functions.get_full_table_as_pd(db_conn, table_name, index_col=None, filter_col=None, startswith=None)[source]

Read an entire table into a pandas DataFrame with optional prefix filter.

Parameters:
  • db_conn – SQLite connection.

  • table_name – Table name to read.

  • index_col (str | None) – Column to set as index.

  • filter_col (str | None) – Column to apply LIKE 'startswith%' on.

  • startswith (str | None) – Prefix for the filter.

Return type:

DataFrame

Returns:

DataFrame converted to pandas nullable dtypes.

app.components.db_functions.get_last_update(conn, uptype)[source]

Get the last update timestamp of a given type from update_log.

Parameters:
  • conn – SQLite connection.

  • uptype (str) – Update type value to filter on.

Return type:

str

Returns:

Latest timestamp string.

app.components.db_functions.get_table_column_names(db_conn, table_name)[source]

Get column names for a table.

Parameters:
  • db_conn – SQLite connection.

  • table_name (str) – Table name.

Return type:

list[str]

Returns:

List of column names.

app.components.db_functions.is_test_db(db_path)[source]

Check if an SQLite DB has metadata key is_test set to true.

Parameters:

db_path (str) – Path to database file.

Return type:

bool

Returns:

True if DB indicates test, else False.

app.components.db_functions.list_tables(database_file)[source]

List table names in an SQLite database.

Parameters:

database_file – Path to database file.

Return type:

list[str]

Returns:

List of table names.
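
Table and column listings like those described above are available from SQLite's own catalog. The sketch below is illustrative: table_names and column_names are hypothetical helpers that take an open connection, whereas the documented list_tables takes a database file path.

```python
import sqlite3


def table_names(conn):
    """Hypothetical helper: list user tables from the sqlite_master catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def column_names(conn, table_name):
    """Hypothetical helper: PRAGMA table_info yields one row per column."""
    # The column name is the second field of each PRAGMA result row.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table_name})")]


# Demonstration on an in-memory database with made-up tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proteins (uniprot_id TEXT, gene_name TEXT)")
conn.execute("CREATE TABLE contaminants (uniprot_id TEXT)")
tables = table_names(conn)
protein_columns = column_names(conn, "proteins")
```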

app.components.db_functions.map_protein_info(uniprot_ids, info=None, placeholder=None, db_file_path=None)[source]

Map requested columns from the proteins table for UniProt IDs.

Parameters:
  • uniprot_ids (list) – UniProt IDs to map; order is preserved in result.

  • info (list | str) – Column name or list of column names to return (default 'gene_name').

  • placeholder (list | str) – Placeholder(s) for missing IDs; str or list aligned to info.

  • db_file_path (str | None) – SQLite DB path; if None, all IDs are treated as missing.

Returns:

List (or list of lists) with mapped values per input ID.

app.components.db_functions.modify_multiple_records(db_conn, table, updates)[source]

Modify multiple records according to update specs.

Parameters:
  • db_conn – SQLite connection.

  • table – Table name.

  • updates – List of dicts with keys criteria_col, criteria, columns, values.

Returns:

None

Raises:

sqlite3.Error – On SQL failure.

app.components.db_functions.modify_record(db_conn, table, criteria_col, criteria, columns, values)[source]

Convenience wrapper that modifies a single record.

Parameters:
  • db_conn – SQLite connection.

  • table – Table name.

  • criteria_col – WHERE column.

  • criteria – WHERE value.

  • columns – Columns to update.

  • values – Values to set.

Returns:

Executed SQL template string.

app.components.db_functions.remove_column(db_conn, tablename, colname)[source]

Remove a column from a table.

Parameters:
  • db_conn – SQLite connection.

  • tablename – Table name.

  • colname – Column name to drop.

Returns:

None

Raises:

sqlite3.Error – On SQL failure.

app.components.db_functions.rename_column(db_conn, tablename, old_col, new_col)[source]

Rename a column.

Parameters:
  • db_conn – SQLite connection.

  • tablename – Table name.

  • old_col – Existing column name.

  • new_col – New column name.

Returns:

None

Raises:

sqlite3.Error – On SQL failure.

app.components.figure_functions module

app.components.figure_functions.improve_text_position(data_frame)[source]

Generate alternating text positions for annotations.

Parameters:

data_frame (DataFrame) – DataFrame whose number of rows determines list length.

Return type:

list

Returns:

List of Plotly-compatible text positions cycling through corners/center.
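
Cycling through a fixed set of text positions, as described above, can be sketched with itertools. The position order below is an assumption (the strings themselves are valid Plotly textposition values), and alternating_positions is a hypothetical helper that takes a row count instead of a DataFrame.

```python
from itertools import cycle, islice

# Assumed order: four corners, then the center.
POSITIONS = ["top left", "top right", "bottom left", "bottom right", "middle center"]


def alternating_positions(n_rows):
    """Hypothetical helper: one text position per row, repeating the cycle."""
    return list(islice(cycle(POSITIONS), n_rows))


positions = alternating_positions(7)
```

Spreading labels over several positions reduces the chance that annotations of neighbouring points overlap.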

app.components.file_upload_api module

API endpoint and Celery task for accepting and saving pipeline input files.

This module provides a Flask API endpoint that accepts file uploads (data_table, sample_table, pipeline_toml, and optionally proteomics_comparisons) and a Celery background task that saves these files to a specified directory.

The API runs continuously as part of the Flask server, and file saving is handled asynchronously by Celery workers.

app.components.file_upload_api.register_file_upload_api(server, celery_app_instance=None)[source]

Register the file upload API endpoint with the Flask server.

This function sets up a POST endpoint at ‘/api/upload-pipeline-files’ that accepts multipart/form-data file uploads. The endpoint accepts:

  • data_table (required): Data table file

  • sample_table (required): Sample table file

  • pipeline_toml (required): Pipeline configuration TOML file

  • proteomics_comparisons (optional): Proteomics comparisons file

Files are processed asynchronously via a Celery task.

Parameters:
  • server – Flask server instance (typically app.server from Dash).

  • celery_app_instance – Optional Celery app instance to use for task execution.

Returns:

None

app.components.infra module

Infrastructure components for the Proteogyver web application.

This module provides core infrastructure components and utilities for the Proteogyver web app, including data storage, figure export, and utility functions.

Key functionality includes:

  • Data store configuration and management

  • Figure export in multiple formats (HTML, PDF, PNG)

  • Input parameter tracking and export

  • Utility functions for component traversal and data formatting

  • Hidden utility components for app functionality

The module defines configurations for data store exports and figure directories, and provides functions for saving data, figures, and input parameters to files. It also creates various Dash components used throughout the application.

Functions

  • app.components.infra.save_data_stores() – Saves data from data stores to files.

  • app.components.infra.save_figures() – Exports figures in various formats.

  • app.components.infra.save_input_information() – Saves input parameters to TSV.

  • app.components.infra.get_all_props() – Utility function for traversing Dash components.

  • app.components.infra.get_all_types() – Utility function for traversing Dash components.

  • app.components.infra.upload_data_stores() – Creates Dash components for uploaded data stores.

  • app.components.infra.working_data_stores() – Creates Dash components for processed/working data stores.

Constants

  • app.components.infra.DATA_STORE_IDS – List of all data store IDs.

  • app.components.infra.data_store_export_configuration – Export settings for each data store.

  • app.components.infra.figure_export_directories – Output directory mapping for figures.

app.components.infra.format_nested_list(input_list)[source]

Format a nested list into a comma-separated string.

Parameters:

input_list (list) – Nested list to format.

Returns:

Comma-separated string representation.
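
A nested list can be turned into a comma-separated string by flattening it recursively. The exact output format of format_nested_list is not stated above, so this sketch assumes full flattening and a ", " separator; flatten_to_string is a hypothetical name.

```python
def flatten_to_string(input_list):
    """Hypothetical helper: flatten all nesting levels and join with commas."""
    flat = []

    def walk(items):
        for item in items:
            if isinstance(item, (list, tuple)):
                walk(item)  # descend into nested sequences
            else:
                flat.append(str(item))

    walk(input_list)
    return ", ".join(flat)


formatted = flatten_to_string([1, [2, [3, "a"]], "b"])
```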

app.components.infra.get_all_props(elements, marker_key, match_partial=True)[source]

Find all elements whose props contain a marker key.

Parameters:
  • elements – Nested dash component-like dict/list structure.

  • marker_key – Key to search for within props.

  • match_partial – Whether to allow partial matches (unused).

Return type:

list

Returns:

List of tuples (marker_key, element).

app.components.infra.get_all_types(elements, get_types)[source]

Find all elements of specified types in a nested structure.

Parameters:
  • elements – Nested dash component-like dict/list structure.

  • get_types – List of element type strings to collect (e.g., ['h4','graph']).

Return type:

list

Returns:

List of matching elements.
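
Traversal of a nested component structure, as described above, can be sketched recursively. The sketch assumes the serialized Dash layout shape, where each component is a dict with a type and a props dict whose children entry holds nested components, and it assumes type matching is case-insensitive; collect_types is a hypothetical name.

```python
def collect_types(elements, get_types):
    """Hypothetical helper: depth-first search for components of given types."""
    wanted = {type_name.lower() for type_name in get_types}
    found = []
    if isinstance(elements, list):
        for item in elements:
            found.extend(collect_types(item, get_types))
    elif isinstance(elements, dict):
        if str(elements.get("type", "")).lower() in wanted:
            found.append(elements)
        props = elements.get("props", {})
        children = props.get("children") if isinstance(props, dict) else None
        if isinstance(children, (list, dict)):
            found.extend(collect_types(children, get_types))
    return found


# Demonstration on a small made-up layout.
layout = {
    "type": "Div",
    "props": {
        "children": [
            {"type": "H4", "props": {"children": "Title"}},
            {"type": "Div", "props": {"children": [{"type": "Graph", "props": {"id": "plot"}}]}},
        ]
    },
}
hits = collect_types(layout, ["h4", "graph"])
```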

app.components.infra.invisible_utilities()[source]

Create hidden utility components container.

Return type:

Div

Returns:

Hidden Div with utility components.

app.components.infra.notifiers()[source]

Create notification components used by callbacks.

Return type:

Div

Returns:

Hidden Div with notifier children.

app.components.infra.save_data_stores(data_stores, export_dir)[source]

Save data from data stores into files under an export directory.

Parameters:
  • data_stores – List of Store-like elements with props.data.

  • export_dir – Destination directory path.

Return type:

dict

Returns:

Dict of failures keyed by store id (empty if none).

app.components.infra.save_figures(analysis_divs, export_dir, output_formats, commonality_pdf_data, workflow)[source]

Save figures found in analysis divs in requested formats.

Parameters:
  • analysis_divs – Analysis container elements to scan.

  • export_dir – Base directory to save figures to.

  • output_formats – Formats to export (e.g., ['html','pdf','png']).

  • commonality_pdf_data – Optional PDF bytes (base64) for commonality figure.

  • workflow – Workflow name used in paths.

Return type:

None

Returns:

None

app.components.infra.save_input_information(input_divs, export_dir)[source]

Save user input parameters/settings to a TSV file.

Parameters:
  • input_divs – Input container elements to scan.

  • export_dir – Destination directory.

Return type:

None

Returns:

None

app.components.infra.temporary_download_button_loading_divs()[source]

Create hidden loading indicators for download buttons.

Returns:

Div with Loading components used during downloads.

app.components.infra.temporary_download_divs()[source]

Create temporary divs used for downloads.

Returns:

Div containing temporary download children.

app.components.infra.upload_data_stores()[source]

Create Store components for uploaded data.

Return type:

Div

Returns:

Div containing uploaded data Stores.

app.components.infra.working_data_stores()[source]

Create Store components for processed/working data.

Return type:

Div

Returns:

Div containing working data Stores.

app.components.infra.write_README(save_dir, guide_file)[source]

Write a README.html file rendered from markdown.

Parameters:
  • save_dir – Target directory for README.

  • guide_file – Path to markdown file to render.

Return type:

None

Returns:

None

app.components.interactomics module

Functions for processing and visualizing protein-protein interaction data.

This module provides functionality for analyzing mass spectrometry-based interactomics data, including:

  • Running and processing SAINT analysis for scoring protein interactions

  • Filtering results based on BFDR and CRAPome metrics

  • Creating visualizations (networks, heatmaps, PCA plots)

  • Performing enrichment analysis

  • MS-microscopy analysis for protein localization

  • Processing known interaction data

The module integrates with a SQLite database for retrieving reference data and uses Dash components for creating interactive visualizations.

Typical usage example:

>>> saint_dict = make_saint_dict(spc_table, rev_sample_groups, control_table, protein_table)
>>> saint_output, saint_missing = run_saint(saint_dict, saint_tempdir, session_uid, bait_uniprots)
>>> # crapome_fc=3 is an illustrative placeholder, not a documented default
>>> filtered_output = saint_filtering(saint_output, bfdr_threshold=0.01, crapome_percentage=0.1, crapome_fc=3)
>>> network_div, elements, interactions = do_network(filtered_output, plot_height=600)

app.components.interactomics.logger

Logger instance for module-level logging

app.components.interactomics.add_bait_column(saint_output, bait_uniprot_dict)[source]

Add bait UniProt and bait-self flags to SAINT output.

Parameters:
  • saint_output (DataFrame) – SAINT output DataFrame with Bait and Prey.

  • bait_uniprot_dict (Dict[str, str]) – Mapping bait name -> UniProt IDs (multiple IDs may be separated by ;).

Return type:

DataFrame

Returns:

DataFrame with the columns Bait uniprot and Prey is bait added.

app.components.interactomics.add_crapome(saint_output_json, crapome_json)[source]

Merge CRAPome annotations into SAINT output JSON.

Parameters:
  • saint_output_json (str) – SAINT output in pandas split-JSON format.

  • crapome_json (str) – CRAPome table in pandas split-JSON format.

Return type:

str

Returns:

Merged SAINT output JSON.

app.components.interactomics.count_knowns(saint_output, replicate_colors)[source]

Count known interactions per bait protein.

Parameters:
  • saint_output (DataFrame) – SAINT output with columns including Bait and Known interaction.

  • replicate_colors (Dict[str, Dict[str, Dict[str, str]]]) – Mapping with structure {'contaminant': {'sample groups': {bait: color}}, 'non-contaminant': {...}}.

Return type:

DataFrame

Returns:

DataFrame with columns Bait, Known interaction, Prey count, and Color.

app.components.interactomics.create_dummy_list_txt(temp_dir, saint_input)[source]

Create a dummy SAINT list.txt when SAINTexpressSpc is unavailable.

Generates a plausible-looking SAINT output file using random values so that downstream steps can proceed in demo or fallback mode.

Parameters:
  • temp_dir (str) – Target directory to write list.txt and marker file.

  • saint_input (Dict[str, List[List[str]]]) – SAINT input dict with keys bait, prey, int.

Return type:

None

Returns:

None

app.components.interactomics.do_ms_microscopy(saint_output_json, db_file, figure_defaults, version='v1.0')[source]

Perform MS-microscopy localization analysis and visualize.

Parameters:
  • saint_output_json (str) – SAINT output in pandas split-JSON format.

  • db_file (str) – SQLite DB path for MS-microscopy reference.

  • figure_defaults (Dict[str, Any]) – Figure defaults.

  • version (str) – Analysis version tag.

Return type:

Tuple[Div, str]

Returns:

Tuple of (plots Div, results JSON).

app.components.interactomics.do_network(saint_output_json, plot_height)[source]

Create a Cytoscape network from filtered SAINT output.

Parameters:
  • saint_output_json (str) – SAINT output in pandas split-JSON format.

  • plot_height (int) – Height of the network plot in pixels.

Return type:

Tuple[Div, List[Dict[str, Any]], Dict[str, Any]]

Returns:

Tuple of (plot container Div, cytoscape elements, interactions dict).

app.components.interactomics.enrich(saint_output_json, chosen_enrichments, figure_defaults, keep_all=False, sig_threshold=0.01, parameters_file='config/parameters.toml')[source]

Run selected enrichment methods and visualize results.

Parameters:
  • saint_output_json (str) – SAINT output in pandas split-JSON format.

  • chosen_enrichments (List[str]) – List of enrichment method names.

  • figure_defaults (Dict[str, Any]) – Figure defaults for plotting.

  • keep_all (bool) – If True, include non-significant rows meeting fold criteria.

  • sig_threshold (float) – Significance cutoff.

  • parameters_file (str) – Path to parameters TOML used by enrichment admin.

Return type:

Tuple[List[Div], Dict[str, Any], List[Any]]

Returns:

Tuple of (list of result Divs, dict of enrichment data, list of info).

app.components.interactomics.filter_controls_by_similarity(spc_table, controls, top_n)[source]

Filter control runs by similarity to experiment runs.

Parameters:
  • spc_table (DataFrame) – Spectral count table for experiment samples.

  • controls (List[DataFrame]) – List of candidate control tables.

  • top_n (int) – Number of top similar controls to keep per sample.

Return type:

DataFrame

Returns:

Filtered control table with selected columns.

app.components.interactomics.generate_saint_container(input_data_dict, uploaded_controls, additional_controls, crapomes, db_file, select_most_similar_only=False, n_controls=30)[source]

Build SAINT UI container and prepare inputs.

Parameters:
  • input_data_dict (Dict[str, Any]) – Input data and metadata including sample groups.

  • uploaded_controls (List[str]) – Uploaded control group names.

  • additional_controls (List[str]) – Additional DB control names.

  • crapomes (List[str]) – CRAPome dataset names.

  • db_file (str) – SQLite database path.

  • select_most_similar_only (bool) – If True, filter controls by similarity.

  • n_controls (int) – Number of controls to keep when filtering.

Return type:

Tuple[Div, Dict[str, List[List[str]]], str]

Returns:

Tuple of (container Div, SAINT input dict, CRAPome JSON).

app.components.interactomics.get_saint_matrix(saint_data_json)[source]

Convert SAINT output JSON to prey x bait matrix of AvgSpec.

Parameters:

saint_data_json (str) – SAINT output in pandas split-JSON format.

Return type:

DataFrame

Returns:

Pivot table DataFrame (rows=Prey, cols=Bait, values=AvgSpec).
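
The pivot described above can be illustrated without pandas. In the pandas split orientation, the JSON holds columns, index, and data keys, so the sketch below parses it with the json module and builds a nested dict keyed by prey and then bait. saint_matrix is a hypothetical name, and on duplicate prey and bait pairs the last row wins, which may differ from how the documented function aggregates.

```python
import json


def saint_matrix(saint_data_json):
    """Hypothetical helper: prey -> bait -> AvgSpec from split-format JSON."""
    table = json.loads(saint_data_json)
    columns = table["columns"]
    bait_i = columns.index("Bait")
    prey_i = columns.index("Prey")
    spec_i = columns.index("AvgSpec")
    matrix = {}
    for row in table["data"]:
        matrix.setdefault(row[prey_i], {})[row[bait_i]] = row[spec_i]
    return matrix


# Demonstration with made-up SAINT rows.
saint_json = json.dumps(
    {
        "columns": ["Bait", "Prey", "AvgSpec"],
        "index": [0, 1, 2],
        "data": [["BaitA", "P1", 5.0], ["BaitB", "P1", 2.0], ["BaitA", "P2", 7.5]],
    }
)
matrix = saint_matrix(saint_json)
```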

app.components.interactomics.known_plot(filtered_saint_input_json, db_file, rep_colors_with_cont, figure_defaults, isoform_agnostic=False)[source]

Plot known interactions per bait.

Parameters:
  • filtered_saint_input_json (str) – Filtered SAINT output in pandas split-JSON format.

  • db_file (str) – Path to SQLite database file.

  • rep_colors_with_cont (Dict[str, Dict[str, str]]) – Mapping for contaminant and non-contaminant colors by bait.

  • figure_defaults (Dict[str, Any]) – Figure defaults for plotting.

  • isoform_agnostic (bool) – If True, match using base UniProt IDs (no isoforms).

Return type:

Tuple[Div, str]

Returns:

Tuple of (plot Div, processed SAINT output JSON).

app.components.interactomics.make_saint_dict(spc_table, rev_sample_groups, control_table, protein_table)[source]

Create SAINT input dict from SPC and metadata tables.

Parameters:
  • spc_table (DataFrame) – Spectral count data.

  • rev_sample_groups (Dict[str, str]) – Mapping sample -> group.

  • control_table (DataFrame) – Control spectral count table.

  • protein_table (DataFrame) – Protein info with columns uniprot_id, length, gene_name.

Return type:

Dict[str, List[List[str]]]

Returns:

Dict with keys bait, prey, int as lists of rows.

app.components.interactomics.map_intensity(saint_output_json, intensity_table_json, sample_groups)[source]

Map averaged intensity per group onto SAINT output rows.

Parameters:
  • saint_output_json (str) – SAINT output in pandas split-JSON format.

  • intensity_table_json (str) – Intensity table in pandas split-JSON format.

  • sample_groups (Dict[str, str]) – Mapping bait -> group name.

Return type:

str

Returns:

SAINT output JSON with optional Averaged intensity column.

app.components.interactomics.network_display_data(node_data, int_data, table_height, datatype='Cytoscape')[source]

Create a table for network connections.

Parameters:
  • node_data (dict[str, list[dict]]) – Node data; for Cytoscape use {'edgesData': [{'source','target'},...]}; for visdcc use {'edges': ['source_-_target', ...]}.

  • int_data (dict[str, dict[str, list[str | float]]]) – Mapping source -> target -> [gene_name, avg_spec].

  • table_height (int) – Table height in pixels.

  • datatype (str) – 'Cytoscape' or 'visdcc'.

Return type:

list[Label | DataTable]

Returns:

List containing a label and a DataTable with Bait, Prey, PreyGene, AvgSpec.

app.components.interactomics.pca(saint_output_data, defaults, replicate_colors)[source]

Perform PCA on SAINT output and plot bait relationships.

Parameters:
  • saint_output_data (str) – SAINT output in pandas split-JSON format.

  • defaults (Dict[str, Any]) – Figure defaults.

  • replicate_colors (Dict[str, str]) – Mapping 'sample groups' -> color.

Return type:

Tuple[Div, str]

Returns:

Tuple of (plot Div, PCA data JSON). Returns an empty plot if there are fewer than 2 baits.

app.components.interactomics.prepare_controls(input_data_dict, uploaded_controls, additional_controls, db_conn, select_most_similar_only=False, top_n=30)[source]

Assemble uploaded and DB controls for SAINT.

Parameters:
  • input_data_dict (Dict[str, Any]) – Inputs including sample groups and SPC data tables.

  • uploaded_controls (List[str]) – Names of uploaded control groups.

  • additional_controls (List[str]) – Additional DB control table names.

  • db_conn (Connection) – SQLite connection.

  • select_most_similar_only (bool) – If True, keep only most similar controls.

  • top_n (int) – Number of controls to keep per-sample when filtering.

Return type:

Tuple[DataFrame, DataFrame]

Returns:

Tuple of (SPC table without control columns, combined control table).

app.components.interactomics.prepare_crapome(db_conn, crapomes)[source]

Prepare CRAPome tables for downstream filtering.

Parameters:
  • db_conn (Connection) – SQLite connection.

  • crapomes (List[str]) – List of CRAPome table names (possibly with suffixes).

Return type:

DataFrame

Returns:

DataFrame with per-CRAPome frequency and spc averages plus max frequency.

app.components.interactomics.run_saint(saint_input, saint_tempdir, session_uid, bait_uniprots, cleanup=True)[source]

Execute SAINT pipeline and return processed output.

Parameters:
  • saint_input (Dict[str, List[List[str]]]) – SAINT input dict.

  • saint_tempdir (List[str]) – Temp directory base as path segments.

  • session_uid (str) – Unique run identifier.

  • bait_uniprots (Dict[str, str]) – Mapping bait -> UniProt IDs.

  • cleanup (bool) – If True, remove temp files after success.

Return type:

Tuple[str, bool]

Returns:

Tuple of (output JSON or error string, saint_missing_flag).

app.components.interactomics.saint_cmd(saint_input, saint_tempdir, session_uid)[source]

Run SAINTexpressSpc on prepared input files.

Parameters:
  • saint_input (Dict[str, List[List[str]]]) – Dict with keys bait, prey, int containing row lists.

  • saint_tempdir (List[str]) – List of path segments for temp dir base.

  • session_uid (str) – Unique identifier to isolate run directory.

Return type:

str

Returns:

Path to directory containing list.txt (or dummy if SAINT missing).

Raises:
  • OSError – On temp dir creation failure.

  • sh.CommandNotFound – If SAINTexpressSpc is not available.

app.components.interactomics.saint_counts(filtered_output_json, figure_defaults, replicate_colors)[source]

Count prey per bait and plot as a bar chart.

Parameters:
  • filtered_output_json (str) – Filtered SAINT output in pandas split-JSON format.

  • figure_defaults (Dict[str, Any]) – Figure defaults for plotting.

  • replicate_colors (Dict[str, str]) – Mapping 'sample groups' -> color.

Return type:

Tuple[Div, str]

Returns:

Tuple of (bar plot Div, count data JSON).

app.components.interactomics.saint_filtering(saint_output_json, bfdr_threshold, crapome_percentage, crapome_fc, do_rescue=False)[source]

Filter SAINT output by BFDR and CRAPome thresholds.

Parameters:
  • saint_output_json (str) – SAINT output in pandas split-JSON format.

  • bfdr_threshold (float) – BFDR threshold for filtering.

  • crapome_percentage (float) – CRAPome frequency threshold.

  • crapome_fc (float) – CRAPome fold-change threshold for rescue.

  • do_rescue (bool) – If True, keep preys that pass in any bait.

Return type:

str

Returns:

Filtered SAINT output JSON.

app.components.interactomics.saint_histogram(saint_output_json, figure_defaults)[source]

Create a histogram of BFDR scores from SAINT output.

Parameters:
  • saint_output_json (str) – SAINT output in pandas split-JSON format.

  • figure_defaults (Dict[str, Any]) – Figure defaults for plotting.

Return type:

Tuple[Div, str]

Returns:

Tuple of (histogram Div, histogram data JSON).

app.components.mathparser module

class app.components.mathparser.MathParser(vars, math=True)[source]

Bases: object

Basic math expression parser with variable support.

Courtesy of user3240484.

Parameters:
  • vars – Mapping where vars[name] -> numeric value used for evaluation.

  • math – If True (default), expose functions/constants from math module.

Example

>>> data = {'r': 3.4, 'theta': 3.141592653589793}
>>> parser = MathParser(data)
>>> round(parser.parse('r*cos(theta)'), 1)
-3.4
>>> data['theta'] = 0.0
>>> parser.parse('r*cos(theta)')
3.4

eval_(node)[source]

Evaluate an AST node recursively.

Parameters:

node – AST node to evaluate.

Returns:

Result of evaluating the expression.

Raises:

TypeError – If node type is not supported.

parse(expr)[source]

Parse and evaluate a mathematical expression string.

Parameters:

expr – Expression string to parse.

Returns:

Numerical result of evaluating the expression.
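
The following is a minimal sketch of how an AST-based evaluator of this kind can work. It is not the MathParser source: the function name evaluate, the operator tables, and the set of supported node types are assumptions made for illustration.

```python
import ast
import math
import operator

# Hypothetical operator tables; the real MathParser may support a different set.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate(expr, variables):
    """Evaluate an arithmetic expression by walking its AST (no eval())."""
    # Expose public names from the math module, then let variables override them.
    names = {k: v for k, v in vars(math).items() if not k.startswith("_")}
    names.update(variables)

    def eval_node(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return names[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](eval_node(node.left), eval_node(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](eval_node(node.operand))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            return names[node.func.id](*[eval_node(arg) for arg in node.args])
        # Anything else (lists, attribute access, ...) is rejected.
        raise TypeError(f"Unsupported node type: {type(node).__name__}")

    return eval_node(ast.parse(expr, mode="eval").body)


result = evaluate("r*cos(theta)", {"r": 3.4, "theta": 0.0})
```

Walking the tree and rejecting unknown node types is what makes this approach safer than passing user input to eval().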

app.components.matrix_functions module

app.components.matrix_functions.compute_zscore(data, test_samples, control_samples, measure='median', std=2)[source]

Compute Z-scores of test samples relative to control samples.

Parameters:
  • data (DataFrame) – DataFrame with proteins in index and samples in columns (log2).

  • test_samples (list) – List of test sample column names.

  • control_samples (list) – List of control sample column names.

  • measure (str) – 'mean' or 'median' center for controls.

  • std (int) – Z-score threshold; values below it are set to 0 in the result.

Returns:

DataFrame of Z-scores for test samples.

app.components.matrix_functions.compute_zscore_based_deviation_from_control(df, sample_groups, control_group, top_n=50)[source]

Compute group-wise Z-score deviations relative to a control group.

Parameters:
  • df (DataFrame) – DataFrame with proteins in rows and samples in columns (log2).

  • sample_groups (dict) – Mapping group -> list of sample names.

  • control_group (str) – Name of the control group.

  • top_n (int) – Number of top proteins to aggregate per group.

Return type:

tuple

Returns:

Tuple of (dict of Z-score summaries, per-protein summary DataFrame, top-N proteins DataFrame).

app.components.matrix_functions.count_per_sample(data_table, rev_sample_groups)[source]

Count non-NA values per sample for given sample list.

Parameters:
  • data_table (DataFrame) – Input DataFrame.

  • rev_sample_groups (dict) – Mapping sample -> group; keys define sample order.

Return type:

Series

Returns:

Series indexed by sample name with non-NA counts.

Raises:

ValueError – If inputs are empty.

app.components.matrix_functions.do_pca(data_df, rev_sample_groups, n_components)[source]

Compute PCA of samples and return labeled components and DataFrame.

Parameters:
  • data_df (DataFrame) – DataFrame with features in rows and samples in columns.

  • rev_sample_groups (dict) – Mapping sample -> group for labeling.

  • n_components – Number of components to compute (>=2).

Return type:

tuple

Returns:

Tuple (pc1_label, pc2_label, pca_result_df).

app.components.matrix_functions.filter_missing(data_table, sample_groups, filter_type, threshold_percentage=60)[source]

Filter rows with excessive missing values.

Keeps a row if it meets the threshold either per group (sample-group) or across the whole table (sample-set).

Parameters:
  • data_table (DataFrame) – Input DataFrame.

  • sample_groups (dict) – Mapping group -> list of column names.

  • filter_type (str) – 'sample-group' or 'sample-set'.

  • threshold_percentage (int) – Minimum non-NA percentage required.

Return type:

DataFrame

Returns:

Filtered copy of data_table.
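
The keep/drop rule described above can be sketched for a single row in plain Python. This is an illustration only: keep_row is a hypothetical helper, missing values are represented as None, and two details are assumptions not confirmed by this page (that 'sample-group' keeps a row when at least one group meets the threshold, and that the comparison is inclusive).

```python
def keep_row(row, sample_groups, filter_type, threshold_percentage=60):
    """Return True if a row has enough non-missing values.

    row maps sample name -> value, with None marking a missing value.
    """
    def percent_present(samples):
        present = sum(1 for s in samples if row.get(s) is not None)
        return 100 * present / len(samples)

    if filter_type == "sample-group":
        # Assumption: one group meeting the threshold is enough to keep the row.
        return any(percent_present(samples) >= threshold_percentage
                   for samples in sample_groups.values())
    if filter_type == "sample-set":
        all_samples = [s for samples in sample_groups.values() for s in samples]
        return percent_present(all_samples) >= threshold_percentage
    raise ValueError(f"Unknown filter_type: {filter_type}")


groups = {"ctrl": ["c1", "c2", "c3"], "treat": ["t1", "t2", "t3"]}
row = {"c1": 20.1, "c2": 19.8, "c3": None, "t1": None, "t2": None, "t3": None}

kept_per_group = keep_row(row, groups, "sample-group")  # 2/3 present in ctrl
kept_per_set = keep_row(row, groups, "sample-set")      # 2/6 present overall
```

With the default 60 % threshold, this row passes the per-group check (67 % in ctrl) but fails the whole-set check (33 %).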

app.components.matrix_functions.hierarchical_clustering(df, cluster='both', method='ward', fillval=0.0)[source]

Perform hierarchical clustering on a DataFrame.

Parameters:
  • df – DataFrame with numerical values.

  • cluster – One of 'rows', 'columns', or 'both'.

  • method – Linkage method for clustering (e.g., 'ward').

  • fillval (float) – Value used to fill NaNs before distance computation.

Returns:

Reordered DataFrame according to hierarchical clustering.

Raises:

ValueError – If cluster is not one of the allowed values.

app.components.matrix_functions.impute(data_table, errorfile, method, random_seed, rev_sample_groups)[source]

Impute missing values using the specified method.

Parameters:
  • data_table (DataFrame) – Input DataFrame.

  • errorfile (str) – Path used by external methods for diagnostics.

  • method (str) – One of 'minprob', 'minvalue', 'gaussian', 'qrilc', 'random_forest'.

  • random_seed (int) – Random seed for reproducibility.

  • rev_sample_groups (dict) – Mapping sample -> group (used by some methods).

Return type:

DataFrame

Returns:

Imputed DataFrame.

Raises:

ValueError – For invalid imputation method.

app.components.matrix_functions.impute_gaussian(data_table, random_seed, dist_width=0.15, dist_down_shift=2)[source]

Impute values by sampling from a shifted/scaled Gaussian.

Based on Perseus’ method.

Parameters:
  • data_table (DataFrame) – Numeric DataFrame with missing values.

  • random_seed (int) – Random seed for reproducibility.

  • dist_width (float) – Width as a fraction of column standard deviation.

  • dist_down_shift (float) – Downward shift in standard deviations.

Return type:

DataFrame

Returns:

DataFrame with imputed values.
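
As a rough illustration of the Perseus-style idea, the sketch below imputes one column held in a plain list. The helper name impute_gaussian_column and the use of None for missing values are assumptions; the actual function operates on a DataFrame and may differ in detail.

```python
import random
import statistics


def impute_gaussian_column(values, random_seed, dist_width=0.15, dist_down_shift=2):
    """Fill None entries by sampling from a narrowed, down-shifted normal."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    stdev = statistics.stdev(observed)
    rng = random.Random(random_seed)
    # Centre the sampling distribution below the observed mean and narrow it.
    shifted_mean = mean - dist_down_shift * stdev
    return [rng.gauss(shifted_mean, dist_width * stdev) if v is None else v
            for v in values]


column = [20.0, 22.0, 24.0, None]
imputed = impute_gaussian_column(column, random_seed=13)
```

Here the observed values have mean 22 and standard deviation 2, so the missing entry is drawn from a normal distribution centred on 18 with standard deviation 0.3. Observed values are left unchanged.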

app.components.matrix_functions.impute_minprob(series_to_impute, random_seed, scale=1.0, tune_sigma=0.01, impute_zero=True)[source]

Impute values from a distribution near the lowest non-NA values.

Parameters:
  • series_to_impute (Series) – Series with possible missing values.

  • random_seed (int) – Random seed for reproducibility.

  • scale (float) – Scale parameter for numpy.random.normal.

  • tune_sigma (float) – Fraction of the lowest values to define the distribution.

  • impute_zero – If True, treat zeros as missing.

Return type:

Series

Returns:

Series with imputed values.

app.components.matrix_functions.impute_minprob_df(dataframe, *args, **kwargs)[source]

Impute an entire DataFrame using the minprob method.

Parameters:
  • dataframe (DataFrame) – Numeric DataFrame to impute.

  • args – Positional args forwarded to impute_minprob.

  • kwargs – Keyword args forwarded to impute_minprob.

Return type:

DataFrame

Returns:

Imputed DataFrame.

app.components.matrix_functions.impute_minval(dataframe, impute_zero=False)[source]

Impute missing values with the per-column minimum.

Parameters:
  • dataframe (DataFrame) – Numeric DataFrame with missing values.

  • impute_zero (bool) – If True, treat zeros as missing.

Return type:

DataFrame

Returns:

DataFrame with imputed values.

app.components.matrix_functions.median_normalize(data_frame)[source]

Median-normalize a log2-transformed DataFrame.

Parameters:

data_frame (DataFrame) – DataFrame with samples as columns (log2-transformed).

Return type:

DataFrame

Returns:

Median-normalized DataFrame.
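
One common form of median normalization on log2 data is sketched below: each column is shifted so that its median equals the mean of all column medians. Whether median_normalize uses exactly this target is an assumption, and median_normalize_columns is a hypothetical stand-in that works on plain lists.

```python
import statistics


def median_normalize_columns(columns):
    """columns maps sample name -> list of log2 values (None = missing)."""
    medians = {name: statistics.median([v for v in vals if v is not None])
               for name, vals in columns.items()}
    # Assumed target: the mean of the per-sample medians.
    target = statistics.mean(list(medians.values()))
    return {name: [None if v is None else v - medians[name] + target for v in vals]
            for name, vals in columns.items()}


data = {"s1": [10.0, 12.0, 14.0], "s2": [11.0, 13.0, 15.0]}
normalized = median_normalize_columns(data)
# Both columns now share the median 12.5.
```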

app.components.matrix_functions.normalize(data_table, normalization_method, errorfile, random_seed=13)[source]

Normalize a DataFrame using a specified method.

Parameters:
  • data_table – Input DataFrame (log2 for median/quantile; raw for VSN).

  • normalization_method – One of 'no_normalization', 'median', 'quantile', 'vsn'.

  • errorfile (str) – Path used by VSN routine for diagnostics.

  • random_seed (int) – Random seed for VSN reproducibility.

Return type:

DataFrame

Returns:

Normalized DataFrame.

Raises:

ValueError – For invalid normalization method.

app.components.matrix_functions.quantile_normalize(dataframe)[source]

Quantile-normalize a DataFrame.

Parameters:

dataframe (DataFrame) – DataFrame to normalize.

Return type:

DataFrame

Returns:

Quantile-normalized DataFrame.
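
The textbook quantile normalization procedure can be sketched in plain Python as follows. This is not the library code: quantile_normalize_columns is a hypothetical helper, it assumes columns of equal length with no missing values, and ties are broken by row order rather than averaged.

```python
def quantile_normalize_columns(columns):
    """columns maps sample name -> list of values (equal length, no missing)."""
    names = list(columns)
    n_rows = len(columns[names[0]])
    # Mean of the i-th smallest value across all columns.
    sorted_cols = [sorted(columns[name]) for name in names]
    rank_means = [sum(col[i] for col in sorted_cols) / len(names)
                  for i in range(n_rows)]
    result = {}
    for name in names:
        # Row indices ordered from smallest to largest value in this column.
        order = sorted(range(n_rows), key=lambda i: columns[name][i])
        normalized = [0.0] * n_rows
        for rank, row in enumerate(order):
            normalized[row] = rank_means[rank]
        result[name] = normalized
    return result


qn = quantile_normalize_columns({"s1": [5.0, 2.0, 3.0], "s2": [4.0, 1.0, 6.0]})
# qn == {"s1": [5.5, 1.5, 3.5], "s2": [3.5, 1.5, 5.5]}
```

After normalization every column contains the same set of values (the rank means), arranged according to each column's original ordering.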

app.components.matrix_functions.ranked_dist(main_df, supplemental_df)[source]

Rank supplemental columns by summed distance to main columns.

Parameters:
  • main_df – DataFrame providing reference columns.

  • supplemental_df – DataFrame to compare against.

Returns:

List of [column_name, distance_sum] sorted ascending by distance.

app.components.matrix_functions.ranked_dist_n_per_run(main_df, supplemental_df, per_run)[source]

Select top-N closest supplemental columns per main column.

Parameters:
  • main_df – DataFrame providing reference columns.

  • supplemental_df – DataFrame to compare against.

  • per_run – Number of closest supplemental columns to take per main column.

Returns:

Sorted unique list of chosen supplemental column names.

app.components.matrix_functions.reverse_log2(value)[source]

Reverse a log2 transformation.

Parameters:

value – Log2-transformed numeric value.

Returns:

Original value on the linear scale (2 ** value).

app.components.ms_microscopy module

app.components.ms_microscopy.draw_localization_heatmap(defaults, localization_results)[source]

Draw a heatmap of localization scores (bait x localization).

Parameters:
  • defaults (dict) – Dict containing height and width.

  • localization_results (DataFrame) – DataFrame of scores (index=baits, columns=locs).

Return type:

Figure

Returns:

Plotly Figure heatmap.

app.components.ms_microscopy.draw_localization_plot(defaults, datarow, cmap=[[255, 255, 255], [0, 0, 255]], nsteps=10, plot_min=0, plot_max=100)[source]

Draw a polar localization plot for a single bait row.

Parameters:
  • defaults (dict) – Dict containing height and width.

  • datarow (Series) – Series of localization scores (columns=locations).

  • cmap (list) – Two RGB triplets defining start and end colors.

  • nsteps (int) – Number of discrete color steps between min and max.

  • plot_min (int) – Minimum plotting value for radial axis.

  • plot_max (int) – Maximum plotting value for radial axis.

Returns:

Plotly Figure.

app.components.ms_microscopy.generate_msmic_dataframes(saint_data, reference_data, plot_min=0, plot_max=100)[source]

Generate MS microscopy localization score DataFrame.

Parameters:
  • saint_data (DataFrame) – SAINT results with columns Bait, Prey, AvgSpec.

  • reference_data (DataFrame) – Reference localization data with columns Loc, Prey and precomputed scores.

  • plot_min (int) – Minimum plotting value (for normalization).

  • plot_max (int) – Maximum plotting value (for normalization).

Return type:

tuple

Returns:

DataFrame of bait x localization integer scores (0..plot_max).

app.components.ms_microscopy.localization_graph(graph_id, defaults, plot_type, baitname, *args, **kwargs)[source]

Create a Dash Graph for localization visualization.

Parameters:
  • graph_id (str) – Component ID for the graph.

  • defaults (dict) – Dict with config, height, width.

  • plot_type (str) – 'polar' or 'heatmap'.

  • baitname (str) – Bait name for the naming of the downloadable figure file.

  • args – Positional args forwarded to the specific drawing function.

  • kwargs – Keyword args forwarded to the specific drawing function.

Return type:

Graph

Returns:

Dash Graph.

app.components.ms_microscopy.tweak_fig_size_hw(height, width, desired_ratio, method='reduce')[source]

Adjust figure height/width to target aspect ratio.

Parameters:
  • height (int) – Current height in pixels.

  • width (int) – Current width in pixels.

  • desired_ratio (float) – Target height/width ratio.

  • method – 'reduce' to reduce the larger dimension, 'inflate' to increase the smaller.

Return type:

tuple

Returns:

Tuple of (new_height, new_width) as ints.
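
A possible reading of the two methods is sketched below. The helper fit_to_ratio is hypothetical, and it interprets "larger" and "smaller" relative to the desired ratio (the dimension that is too large or too small for it), which is an assumption about the real function's behavior.

```python
def fit_to_ratio(height, width, desired_ratio, method="reduce"):
    """Return (height, width) adjusted so that height / width matches desired_ratio."""
    if method not in ("reduce", "inflate"):
        raise ValueError(f"Unknown method: {method}")
    too_tall = height / width > desired_ratio
    if method == "reduce":
        # Shrink whichever dimension is too large for the target ratio.
        if too_tall:
            height = width * desired_ratio
        else:
            width = height / desired_ratio
    else:
        # Grow whichever dimension is too small for the target ratio.
        if too_tall:
            width = height / desired_ratio
        else:
            height = width * desired_ratio
    return int(round(height)), int(round(width))


reduced = fit_to_ratio(600, 1000, 1.0)                      # (600, 600)
inflated = fit_to_ratio(600, 1000, 1.0, method="inflate")   # (1000, 1000)
```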

app.components.parsing module

File parsing functions for ProteoGyver.

Functions for parsing and processing various data formats, handling data type conversions, and managing parameter configurations used throughout the application.

app.components.parsing.check_bait(bait_entry)[source]

Checks if a string contains a valid bait name.

Parameters:

bait_entry (str) – The bait entry to validate

Returns:

A string representation of the bait. Returns ‘No bait uniprot’ if the entry is empty, None, or ‘nan’.

Return type:

str

Examples

>>> check_bait('P12345')
'P12345'
>>> check_bait(None)
'No bait uniprot'
>>> check_bait('nan')
'No bait uniprot'

app.components.parsing.check_comparison_file(file_contents, file_name, sgroups, new_upload_style)[source]

Validate and parse a comparison file with sample-control pairs.

Parameters:
  • file_contents (str) – Base64 encoded contents of the uploaded comparison file.

  • file_name (str) – Name of the uploaded file.

  • sgroups (Dict[str, List[str]]) – Dictionary of valid sample groups.

  • new_upload_style (Dict[str, str]) – Style dict updated with status color.

Return type:

Tuple[Dict[str, str], List[List[str]]]

Returns:

Tuple of (updated style dict, list of valid [sample, control] pairs).

app.components.parsing.check_numeric(st)[source]

Check if a string can be converted to a numeric value.

Parameters:

st (Union[str, number]) – String or numpy number to check for numeric conversion.

Return type:

Dict[str, Union[bool, int, float, str]]

Returns:

Dict with keys success and value (converted value or original string).
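
The return shape described above could be produced along these lines. The helper to_number is a hypothetical stand-in, and trying int before float is an assumption about the conversion order.

```python
def to_number(text):
    """Try to convert text to int, then float; report success in a dict."""
    candidate = str(text).strip()
    for convert in (int, float):
        try:
            return {"success": True, "value": convert(candidate)}
        except ValueError:
            continue
    # Neither conversion worked: hand back the original input unchanged.
    return {"success": False, "value": text}


as_int = to_number("42")       # {'success': True, 'value': 42}
as_float = to_number("3.5")    # {'success': True, 'value': 3.5}
not_numeric = to_number("abc") # {'success': False, 'value': 'abc'}
```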

app.components.parsing.check_required_columns(columns)[source]

Validates presence of required columns in sample table.

Parameters:

columns (list) – List of column names to check

Returns:

Contains:
  • dict: Mapping of standardized names to actual column names

  • set: Set of required column types that were found

Return type:

tuple

Notes

  • Required columns: sample name, sample group

  • Optional columns: bait uniprot/id

  • Case-insensitive matching of column names

app.components.parsing.check_sample_table_column(column, accepted_values)[source]

Checks if a column name matches any accepted values.

Parameters:
  • column (str) – Column name to check

  • accepted_values (list) – List of valid column name variations

Returns:

Original column name if match found, None otherwise

Return type:

str

Notes

  • Case-insensitive matching

  • Returns exact original column name if match found
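
The matching rule in the notes above amounts to a case-insensitive membership test. A minimal sketch, with match_column as a hypothetical name:

```python
def match_column(column, accepted_values):
    """Return column unchanged if it matches an accepted value, ignoring case."""
    accepted = {value.lower() for value in accepted_values}
    return column if column.lower() in accepted else None


found = match_column("Sample Name", ["sample name", "sample_name"])  # 'Sample Name'
missing = match_column("Intensity", ["sample name", "sample_name"])  # None
```

Returning the original spelling lets the caller look the column up in the table without renaming it first.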

app.components.parsing.clean_column_name(col_name)[source]

Removes file paths and extensions from column names.

Parameters:

col_name (str) – Original column name potentially containing path and extensions

Returns:

Cleaned column name with paths and extensions removed

Return type:

str

Notes

  • Handles both Windows and Unix style paths

  • Removes _SPC suffix

  • Removes .d extension

  • Processes path components from right to left
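
A simplified version of this cleaning step is sketched below. It only keeps the last path component and strips the two suffixes named in the notes; the real function's right-to-left handling of path components may be more involved, and clean_name is a hypothetical helper.

```python
import re


def clean_name(col_name):
    """Strip directory parts, a trailing _SPC, and a .d extension."""
    # Split on both Unix and Windows separators and keep the last component.
    parts = [part for part in re.split(r"[\\/]", col_name) if part]
    name = parts[-1] if parts else col_name
    for suffix in ("_SPC", ".d"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return name


unix_style = clean_name("/home/user/run2.d")     # 'run2'
windows_style = clean_name("C:\\data\\run1.d")   # 'run1'
spc_column = clean_name("sample3_SPC")           # 'sample3'
```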

app.components.parsing.clean_sample_names(expdesign, bait_id_column_names)[source]

Clean and validate the experimental design dataframe.

Parameters:
  • expdesign (pd.DataFrame) – Input experimental design dataframe containing at minimum ‘Sample group’ and ‘Sample name’ columns

  • bait_id_column_names (list) – List of possible column names that could contain bait identifiers (e.g., [‘bait id’, ‘bait uniprot’])

Returns:

Cleaned experimental design dataframe with:
  • Rows containing missing required values removed

  • All values converted to strings

  • Sample names cleaned of file paths and special characters

  • Standardized bait column name if present

Return type:

pd.DataFrame

Notes

  • Required columns are ‘Sample group’ and ‘Sample name’

  • Rows with NA values in required columns are dropped

  • File paths in sample names are removed (handles both Windows and Unix paths)

  • Special characters in sample names are replaced with underscores

  • If a bait identifier column exists, it is renamed to ‘Bait uniprot’

  • All modifications are done on a copy of the input dataframe

Example

>>> expd = pd.DataFrame({
...     'Sample name': ['path/to/sample1.raw', 'sample2'],
...     'Sample group': ['group1', 'group2'],
...     'bait id': ['P12345', 'P67890']
... })
>>> cleaned = clean_sample_names(expd, ['bait id', 'bait uniprot'])
>>> cleaned['Sample name']
0    sample1
1    sample2
Name: Sample name, dtype: object

app.components.parsing.delete_samples(discard_samples, data_dictionary)[source]

Removes specified samples from all tables in the data dictionary.

Parameters:
  • discard_samples (list) – List of sample names to remove

  • data_dictionary (dict) – Dictionary containing all experimental data tables and metadata

Returns:

Updated data dictionary with samples removed and sample groups adjusted

Return type:

dict

Notes

  • Processes all tables including intensity, spectral counts, and experimental design

  • Updates sample group mappings to reflect removed samples

  • Adds list of discarded samples to dictionary

  • Handles both regular and contaminant-containing tables

  • Removes empty sample groups after sample deletion

app.components.parsing.format_data(session_uid, data_tables, data_info, expdes_table, expdes_info, contaminants_to_remove, replace_replicate_names, use_unique_only, control_indicators, bait_id_column_names)[source]

Formats experimental data into a standardized dictionary structure for analysis.

Parameters:
  • session_uid (str) – Unique identifier for the analysis session

  • data_tables (dict) – Dictionary containing intensity and spectral count tables in JSON format

  • data_info (dict) – Metadata about the data tables including file info and data type

  • expdes_table (dict) – Experimental design table in JSON format

  • expdes_info (dict) – Metadata about the experimental design table

  • contaminants_to_remove (list) – List of contaminant proteins to filter out

  • replace_replicate_names (bool) – Whether to replace sample names with standardized replicate names

  • use_unique_only (bool) – Whether to use only unique peptides/proteins

  • control_indicators (list) – List of terms that indicate control samples

  • bait_id_column_names (list) – List of possible column names for bait identifiers

Returns:

A structured dictionary containing:
  • sample_groups: Sample grouping information

  • data_tables: Processed data tables (intensity, spectral counts, etc.)

  • info: Processing metadata and experiment type

  • file_info: Source file information

  • other: Additional data including protein lengths and bait information

Return type:

dict

Notes

  • Intensity values are log2 transformed if present

  • Zero values are replaced with NaN

  • Tables are stored in JSON split format

  • Experiment type is determined based on presence of bait information

  • Control samples are guessed based on provided indicators

app.components.parsing.format_sample_group_name(sample_group)[source]

Format sample group names, handling numeric cases.

Parameters:

sample_group (Union[str, int, float]) – The sample group identifier to format. Can be numeric or string.

Returns:

Formatted sample group name. Returns None if input is NaN.

For numeric inputs, returns “SampleGroup_<number>”. For string inputs, returns the string value.

Return type:

str

Examples

>>> format_sample_group_name(1)
'SampleGroup_1'
>>> format_sample_group_name("Control")
'Control'
>>> format_sample_group_name(np.nan) is None
True

app.components.parsing.generate_replicate_name(group_name, sample_name, existing_names, replace_names)[source]

Generate unique replicate names for samples within groups.

Parameters:
  • group_name (str) – Name of the sample group.

  • sample_name (str) – Original name of the sample.

  • existing_names (Set[str]) – Set of already assigned replicate names.

  • replace_names (bool) – If True, generates names like “Group_Rep_1”. If False, preserves original sample names with numeric suffixes if needed.

Returns:

A unique replicate name that doesn’t exist in existing_names.

Return type:

str

Note

When replace_names is True:
  • Names follow pattern: “{group_name}_Rep_{i}”

  • i increments until a unique name is found

When replace_names is False:
  • Uses cleaned original sample name as base

  • Adds “_i” suffix only if needed for uniqueness

  • i starts at 0 and increments until unique

Examples

>>> generate_replicate_name("Control", "sample1", {"Control_Rep_1"}, True)
'Control_Rep_2'
>>> generate_replicate_name("Control", "sample1", {"sample1"}, False)
'sample1_0'

app.components.parsing.get_distribution_title(used_table_type)[source]

Gets appropriate title for value distribution plots.

Parameters:

used_table_type (str) – Type of table being plotted

Returns:

Plot title indicating value type and transformation

Return type:

str

app.components.parsing.guess_controls(sample_groups, ctrl_indicators)[source]

Guesses control samples from sample group names based on indicator terms.

Parameters:
  • sample_groups (dict) – Dictionary mapping group names to sample lists

  • ctrl_indicators (list) – List of strings that indicate control samples

Returns:

Contains:
  • list: Control group names

  • list: Lists of samples in each control group

Return type:

tuple

Notes

  • Case-insensitive matching of control indicators

  • Returns empty lists if no controls are found

  • Each control group’s samples are kept together
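
The behavior in the notes above can be approximated as follows. The sketch assumes an indicator matches when it occurs anywhere in the group name (substring matching), which this page does not state explicitly; find_controls is a hypothetical name.

```python
def find_controls(sample_groups, ctrl_indicators):
    """Return (control group names, list of sample lists) by name matching."""
    indicators = [term.lower() for term in ctrl_indicators]
    names, samples = [], []
    for group, members in sample_groups.items():
        # Assumption: a case-insensitive substring match marks a control group.
        if any(term in group.lower() for term in indicators):
            names.append(group)
            samples.append(list(members))
    return names, samples


groups = {
    "GFP_Control": ["gfp_1", "gfp_2"],
    "BaitA": ["a_1", "a_2"],
    "ctrl_empty": ["e_1"],
}
control_names, control_samples = find_controls(groups, ["control", "ctrl"])
```

Here control_names is ['GFP_Control', 'ctrl_empty'] and control_samples holds the corresponding sample lists in the same order.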

app.components.parsing.handle_mztab(mz_filecontents)[source]

app.components.parsing.identify_columns(df, column_criteria_list, keep_logic)[source]

Return type:

tuple[str, bool]

app.components.parsing.parse_comparisons(control_group, comparison_data, sgroups)[source]

Parses control group and comparison data into pairwise comparisons.

Parameters:
  • control_group (str) – Name of the main control group

  • comparison_data (list) – List of explicit [sample, control] comparisons

  • sgroups (dict) – Dictionary of all sample groups

Returns:

List of [sample, control] pairs representing comparisons

Return type:

list

Notes

  • If control_group is specified, creates comparisons against all other groups

  • Appends any explicit comparisons from comparison_data

  • Skips invalid group names

  • Returns empty list if no valid comparisons found
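
The rules in the notes above can be sketched in a few lines. build_comparisons is a hypothetical stand-in; the ordering of the generated pairs and the handling of a missing control group are assumptions.

```python
def build_comparisons(control_group, comparison_data, sgroups):
    """Combine control-vs-all comparisons with explicit [sample, control] pairs."""
    comparisons = []
    if control_group in sgroups:
        # Compare every other group against the chosen control group.
        comparisons.extend([group, control_group]
                           for group in sgroups if group != control_group)
    for sample, control in comparison_data or []:
        # Skip pairs that mention a group that does not exist.
        if sample in sgroups and control in sgroups:
            comparisons.append([sample, control])
    return comparisons


sgroups = {"ctrl": ["c1", "c2"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
pairs = build_comparisons("ctrl", [["a", "b"], ["a", "missing"]], sgroups)
# pairs == [['a', 'ctrl'], ['b', 'ctrl'], ['a', 'b']]
```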

app.components.parsing.parse_data_file(data_file_contents, data_file_name, data_file_modified_data, new_upload_style, parameters)[source]

Parses a data file and validates its contents.

Parameters:
  • data_file_contents (str) – The contents of the uploaded file

  • data_file_name (str) – Name of the uploaded file

  • data_file_modified_data (int) – Last modified timestamp of the file

  • new_upload_style (dict) – Style dictionary for UI feedback

  • parameters (dict) – Processing parameters including max PSM threshold

Returns:

Contains:
  • dict: Updated upload style with background color indicating status

  • dict: File info including metadata and data type

  • dict: Tables dictionary with intensity and spectral count data in split JSON format

  • list: List of warnings

  • str: Sample table in split JSON format, if the uploaded file was mzTab and a sample table could be generated from it.

Return type:

tuple

Notes

  • Validates file has sufficient numeric columns (>=3)

  • Sets background-color to ‘green’ if valid, ‘red’ if invalid

  • Tables are stored in split JSON format for serialization

app.components.parsing.parse_parameters(parameters_file)[source]

Parse and enrich parameters from a TOML configuration file.

Parameters:

parameters_file (Union[str, Path]) – Path to parameters TOML file.

Return type:

Dict[str, Any]

Returns:

Enriched parameters dictionary (controls, CRAPome, enrichment).

app.components.parsing.parse_sample_table(data_file_contents, data_file_name, data_file_modified_data, new_upload_style, sdrf_parameters)[source]

Parse and validate a sample metadata table.

Parameters:
  • data_file_contents (str) – Contents of the uploaded sample table file.

  • data_file_name (str) – Name of the uploaded file.

  • data_file_modified_data (int) – Last modified timestamp of the file.

  • new_upload_style (Dict[str, str]) – Style dictionary for UI feedback.

  • sdrf_parameters (dict) – Parameters for identifying sample name and group columns from SDRF files.

Return type:

Tuple[Dict[str, str], Dict[str, Any], str | None]

Returns:

Tuple of (new style, info dict, table JSON split).

app.components.parsing.read_data_from_content(file_contents, filename, maxpsm)[source]

Determine and apply the appropriate read function for a data file.

Parameters:
  • file_contents (str) – Contents of the uploaded file.

  • filename (str) – Name of the uploaded file.

  • maxpsm (int) – Maximum theoretical PSM value for spectral counting.

Return type:

Tuple[Dict[str, str], Dict[str, Any], str | None]

Returns:

Tuple of (tables dict in JSON split format, info dict, sample table as a JSON split string if one could be generated from mzTab input, otherwise None).

app.components.parsing.read_df_from_content(content, filename, lowercase_columns=False)[source]

Read a dataframe from uploaded file content.

Parameters:
  • content (str) – Base64 encoded file content.

  • filename (str) – Original filename with extension.

  • lowercase_columns (bool) – Whether to convert column names to lowercase.

Return type:

DataFrame

Returns:

Parsed DataFrame.
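
The decoding step might look roughly like the sketch below, which uses only the standard library and returns a list of row dicts instead of a DataFrame. It assumes the content arrives as a data-URL style string ("<prefix>,<base64 payload>", as browser upload components typically provide) and that the delimiter follows the file extension. Both are assumptions, and rows_from_upload is a hypothetical helper.

```python
import base64
import csv
import io


def rows_from_upload(content, filename, lowercase_columns=False):
    """Decode an uploaded text table into a list of row dicts."""
    # Assumed layout: "data:<mime>;base64,<payload>".
    _, encoded = content.split(",", 1)
    text = base64.b64decode(encoded).decode("utf-8")
    # Assumption: .csv files are comma-separated, everything else tab-separated.
    delimiter = "," if filename.lower().endswith(".csv") else "\t"
    header, *rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if lowercase_columns:
        header = [column.lower() for column in header]
    return [dict(zip(header, row)) for row in rows]


raw = "Sample name,Sample group\ns1,ctrl\n"
upload = "data:text/csv;base64," + base64.b64encode(raw.encode("utf-8")).decode("ascii")
table = rows_from_upload(upload, "samples.csv", lowercase_columns=True)
# table == [{'sample name': 's1', 'sample group': 'ctrl'}]
```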

app.components.parsing.read_dia_nn(data_table)[source]

Reads DIA-NN report file into an intensity matrix.

Parameters:

data_table (pd.DataFrame) – Raw DIA-NN data table

Returns:

Contains:
  • pd.DataFrame: Processed intensity matrix

  • pd.DataFrame: Empty placeholder table

  • dict: Protein length information if available

Return type:

list

Notes

  • Handles both report and matrix formats

  • Extracts protein length information

  • Replaces zeros with NaN values

  • Pivots data if in report format

app.components.parsing.read_fragpipe(data_table)[source]

Reads FragPipe report into spectral count and intensity tables.

Parameters:

data_table (pd.DataFrame) – Raw FragPipe data table

Returns:

Contains:
  • pd.DataFrame: Intensity table

  • pd.DataFrame: Spectral count table

  • dict: Protein length information if available

Return type:

tuple

Notes

  • Identifies intensity and spectral count columns

  • Handles unique peptide counts

  • Supports MaxLFQ intensity values

  • Replaces zeros with NaN values

app.components.parsing.read_matrix(data_table, is_spc_table=False, max_spc_ever=0)[source]

Reads a generic matrix into a data table.

Parameters:
  • data_table (pd.DataFrame) – Input data matrix

  • is_spc_table (bool, optional) – Whether matrix contains spectral counts. Defaults to False

  • max_spc_ever (int, optional) – Maximum expected spectral count value. Defaults to 0

Returns:

Contains:
  • pd.DataFrame: Intensity table

  • pd.DataFrame: Spectral count table

  • dict: Protein length information if available

Return type:

tuple

Notes

  • Automatically detects spectral count tables

  • Handles protein length information

  • Removes non-numeric columns

  • Replaces zeros with NaN values

app.components.parsing.remove_all_na(data_table, subset=None, inplace=False)[source]

Removes rows with all missing values from a data table.

Return type:

DataFrame

app.components.parsing.remove_duplicate_protein_groups(data_table)[source]

Remove duplicate protein groups by aggregating their values.

Parameters:

data_table (DataFrame) – Input data table with protein groups as index.

Return type:

DataFrame

Returns:

Table with unique protein groups and aggregated values.

app.components.parsing.remove_file_path(column_name)[source]

Removes the file path from a column name. For example, if the column name is ‘data/run1.raw’, it will be changed to ‘run1’.

Return type:

str

app.components.parsing.remove_filepath_from_columns(data_table)[source]

Removes filepath from column names. For example, if the column name is ‘data/run1.raw’, it will be changed to ‘run1’. Column renaming will be done in place.

Return type:

None

app.components.parsing.remove_from_table(table_name, table, discard_samples)[source]

Removes specified samples from a data table based on table type.

Parameters:
  • table_name (str) – Name of the table being processed

  • table (pd.DataFrame) – Data table to remove samples from

  • discard_samples (list) – List of sample names to remove

Returns:

Table with specified samples removed

Return type:

pd.DataFrame

Notes

  • For experimental design tables, removes rows where Sample name matches discard list

  • For other tables, removes columns matching discard list

app.components.parsing.remove_rawfile_ending(column_name)[source]

Removes the raw file ending from a column name. For example, if the column name is ‘run1.raw’, it will be changed to ‘run1’.

Return type:

str

app.components.parsing.rename_columns_and_update_expdesign(expdesign, tables, bait_id_column_names, replace_names=True)[source]

Standardize sample names and update experimental design.

Parameters:
  • expdesign (DataFrame) – Experimental design DataFrame with ‘Sample group’ and ‘Sample name’.

  • tables (List[DataFrame]) – DataFrames to rename columns in.

  • bait_id_column_names (List[str]) – Possible column names containing bait identifiers.

  • replace_names (bool) – Whether to generate standardized replicate names.

Return type:

Tuple[Dict[str, Dict[str, List[str]]], List[str], List[Dict[str, str]], DataFrame]

Returns:

Tuple of (sample groups mapping, discarded columns, used columns, updated expdesign).

app.components.parsing.sdrf_to_table(sdrf_df, parameters)[source]

Convert SDRF file to sample table.

Parameters:
  • sdrf_df – SDRF file as pandas DataFrame

  • parameters – Parameters dictionary

Returns:

Contains:
  • pd.DataFrame: Sample table

  • list: List of problems

Return type:

tuple

app.components.parsing.unmix_dtypes(df)[source]

Convert mixed dtype columns in a dataframe to strings in place.

Parameters:

df (DataFrame) – DataFrame to process.

Raises:

TypeError – If conversion still results in mixed dtype.

Return type:

None

app.components.parsing.update_nested_dict(base_dict, update_dict)[source]

Update a nested dictionary with values from another.

Parameters:
  • base_dict (Dict[str, Any]) – Base dictionary to update.

  • update_dict (Dict[str, Any]) – Dictionary containing update values.

Return type:

Dict[str, Any]

Returns:

Updated base dictionary.
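
A recursive merge of this kind is commonly written as shown below. This is a sketch, not the library source: merge_nested is a hypothetical name, and updating the base dictionary in place is an assumption.

```python
def merge_nested(base, updates):
    """Recursively copy values from updates into base and return base."""
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            # Both sides are dicts: merge one level deeper.
            merge_nested(base[key], value)
        else:
            base[key] = value
    return base


defaults = {"figure": {"height": 400, "width": 600}, "theme": "light"}
merged = merge_nested(defaults, {"figure": {"height": 500}, "debug": True})
# merged == {'figure': {'height': 500, 'width': 600}, 'theme': 'light', 'debug': True}
```

Unlike dict.update, this keeps sibling keys such as width that the update dictionary does not mention.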

app.components.parsing.validate_basic_inputs(*args, fail_on_None=True)[source]

Validate basic inputs for ProteoGyver analysis.

Parameters:
  • args (Any) – Arbitrary inputs; last two are style dicts with ‘background-color’.

  • fail_on_None (bool) – If True, treat any None as invalid.

Return type:

bool

Returns:

True if validation fails, False otherwise.

app.components.proteomics module

Proteomics analysis figure builders.

Implements plots and computations for missing value filtering, normalization, distributions, imputation, PCA, clustermap, CV plots, and volcano analyses used by the proteomics workflow.

app.components.proteomics.clustermap(imputed_data_json, defaults)[source]

Draws a correlation clustergram figure from the imputed data table.

Parameters:
  • imputed_data_json (str) – Imputed data in JSON split format.

  • defaults (dict) – Figure defaults and component config.

Return type:

tuple

Returns:

Tuple of (graph div, correlation matrix JSON).

app.components.proteomics.differential_abundance(imputed_data_json, sample_groups, comparisons, fc_thr, p_thr, defaults, test_type='independent', db_file_path=None)[source]

Run differential analysis and generate volcano plots.

Parameters:
  • imputed_data_json (str) – Imputed data in JSON split format.

  • sample_groups (dict) – Mapping group -> list of columns.

  • comparisons (list) – List of (sample, control) group pairs.

  • fc_thr (float) – Absolute log2 fold change threshold.

  • p_thr (float) – Adjusted p-value threshold.

  • defaults (dict) – Figure defaults and component config.

  • test_type (str) – 'independent' or 'paired'.

  • db_file_path (str) – Optional DB path for gene mapping.

Return type:

tuple

Returns:

Tuple of (components, significant data JSON).

app.components.proteomics.imputation(filtered_data_json, imputation_option, defaults, errorfile, sample_groups_rev, title=None)[source]

Impute missing values and render distribution comparison.

Parameters:
  • filtered_data_json – Filtered data in JSON split format.

  • imputation_option – Imputation method name.

  • defaults – Figure defaults and component config.

  • errorfile (str) – Path for logging/diagnostics from imputation.

  • sample_groups_rev (dict) – Mapping sample -> group (used by imputation).

  • title (str) – Optional plot title.

Return type:

tuple

Returns:

Tuple of (graph div, imputed table JSON).

app.components.proteomics.missing_values_in_other_samples(filtered_data_json, defaults)[source]

Histogram comparing intensities of proteins with/without missing values.

Parameters:
  • filtered_data_json – Filtered data in JSON split format.

  • defaults – Figure defaults and component config.

Return type:

Div

Returns:

Div containing the histogram and legend.

app.components.proteomics.na_filter(input_data_dict, filtering_percentage, figure_defaults, title=None, filter_type='sample-group')[source]

Apply NA filtering and visualize before/after counts.

Parameters:
  • input_data_dict – Data dictionary containing intensity tables and sample groups.

  • filtering_percentage – Threshold percentage for presence filtering.

  • figure_defaults – Figure defaults and component config.

  • title (str) – Optional plot title.

  • filter_type (str) – 'sample-group' or 'sample-set'.

Return type:

tuple

Returns:

Tuple of (graph div, filtered data JSON).
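As an illustration of what a 'sample-group' presence filter might look like, the sketch below keeps a protein when it has values in at least the given percentage of samples in at least one group. The helper name filter_by_sample_group, the "at least one group" rule, and the inclusive threshold are all assumptions made for this example; the actual na_filter logic may differ.

```python
import numpy as np
import pandas as pd


def filter_by_sample_group(table, sample_groups, filtering_percentage):
    """Keep rows present in >= filtering_percentage % of samples of any one group."""
    threshold = filtering_percentage / 100.0
    keep = pd.Series(False, index=table.index)
    for columns in sample_groups.values():
        # Fraction of non-missing values per protein within this group.
        present_fraction = table[columns].notna().mean(axis=1)
        keep |= present_fraction >= threshold
    return table[keep]


# Hypothetical intensity table with missing values.
table = pd.DataFrame(
    {
        "A1": [1.0, np.nan, np.nan],
        "A2": [2.0, 5.0, np.nan],
        "B1": [3.0, np.nan, np.nan],
        "B2": [4.0, np.nan, 7.0],
    },
    index=["P1", "P2", "P3"],
)
groups = {"A": ["A1", "A2"], "B": ["B1", "B2"]}
filtered = filter_by_sample_group(table, groups, 60)
```

With a 60 % threshold only P1 survives in this toy table; lowering the threshold to 50 % keeps all three proteins.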

app.components.proteomics.normalization(filtered_data_json, normalization_option, defaults, errorfile, title=None)[source]

Normalize filtered data and show distributions before/after.

Parameters:
  • filtered_data_json (str) – Filtered data in JSON split format.

  • normalization_option (str) – Normalization method name.

  • defaults (dict) – Figure defaults and component config.

  • errorfile (str) – Path for logging/diagnostics from normalization.

  • title (str) – Optional plot title.

Return type:

tuple

Returns:

Tuple of (graph div, normalized table JSON).

app.components.proteomics.pca(imputed_data_json, sample_groups_rev, defaults, replicate_colors)[source]

Compute PCA and plot the first two components.

Parameters:
  • imputed_data_json (str) – Imputed data in JSON split format.

  • sample_groups_rev (dict) – Mapping sample -> group.

  • defaults (dict) – Figure defaults and component config.

  • replicate_colors (dict) – Mapping for group colors.

Return type:

tuple

Returns:

Tuple of (graph div, PCA result JSON).
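For readers unfamiliar with the computation, here is a minimal sketch of projecting samples onto the first two principal components using a NumPy singular value decomposition. The function name first_two_components and the toy matrix are hypothetical, and this is not necessarily the PCA implementation used by pca.

```python
import numpy as np


def first_two_components(matrix):
    """Project samples (rows) onto the first two principal components."""
    centered = matrix - matrix.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal axes.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:2].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:2]


# Hypothetical data: 4 samples (rows) x 3 proteins (columns).
samples = np.array(
    [[1.0, 2.0, 0.5], [2.0, 4.1, 0.4], [3.0, 6.0, 0.6], [4.0, 8.2, 0.5]]
)
scores, explained = first_two_components(samples)
```

Because the imputed table stores proteins as rows and samples as columns, it would need to be transposed before being passed to a helper like this one.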

app.components.proteomics.perc_cvplot(raw_int_data, na_filtered_data, sample_groups, replicate_colors, defaults)[source]

Compute and plot coefficient of variation per group.

Parameters:
  • raw_int_data (str) – Raw intensity data in JSON split format.

  • na_filtered_data (str) – NA-filtered intensity data in JSON split format.

  • sample_groups (dict) – Mapping group -> list of columns.

  • replicate_colors (dict) – Mapping for group colors.

  • defaults (dict) – Figure defaults and component config.

Return type:

tuple

Returns:

Tuple of (graph div, stats JSON).
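The coefficient of variation is conventionally the standard deviation divided by the mean, expressed as a percentage. The sketch below computes it per protein within each group; the helper name percent_cv and the choice of the sample standard deviation (ddof=1) are assumptions for illustration only.

```python
import pandas as pd


def percent_cv(table, sample_groups):
    """Return per-protein %CV (100 * std / mean) for each group."""
    result = {}
    for group, columns in sample_groups.items():
        values = table[columns]
        result[group] = 100.0 * values.std(axis=1, ddof=1) / values.mean(axis=1)
    return pd.DataFrame(result)


# Hypothetical raw intensities (not log-transformed).
table = pd.DataFrame(
    {"A1": [10.0, 5.0], "A2": [10.0, 15.0], "B1": [2.0, 4.0], "B2": [6.0, 4.0]},
    index=["P1", "P2"],
)
cv = percent_cv(table, {"A": ["A1", "A2"], "B": ["B1", "B2"]})
```

A %CV is only meaningful on linear-scale intensities, which is presumably why the function takes the raw intensity data as input.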

app.components.qc_analysis module

QC analysis figure builders and containers.

Generates standard QC visualizations (counts, coverage, missing values, distribution, reproducibility, commonality, TIC) and their wrappers for use in the UI.

app.components.qc_analysis.common_proteins(data_table, db_file, figure_defaults, additional_groups=None, id_str='qc')[source]

Summarize common proteins by class and sample.

Parameters:
  • data_table (str) – Input matrix in JSON split format.

  • db_file (str) – Path to SQLite database file.

  • figure_defaults (dict) – Figure defaults and component config.

  • additional_groups (dict) – Optional mapping of group name to protein list to include.

  • id_str (str) – ID prefix for generated components.

Return type:

tuple

Returns:

Tuple of (graph div, plot data as JSON split).

app.components.qc_analysis.commonality_plot(pandas_json, rev_sample_groups, defaults, only_groups=None)[source]

Plot commonality across groups using heatmap or Supervenn.

Parameters:
  • pandas_json (str) – Data table in JSON split format.

  • rev_sample_groups (dict) – Mapping sample -> group.

  • defaults (dict) – Figure defaults and component config.

  • only_groups (list) – Optional subset of group names to include.

Return type:

tuple

Returns:

Tuple of (graph div, common proteins string, optional base64 PDF string).

app.components.qc_analysis.count_plot(pandas_json, replicate_colors, contaminant_list, defaults, title=None)[source]

Generate protein count bar plot per sample.

Parameters:
  • pandas_json (str) – Data table in JSON split format.

  • replicate_colors (dict) – Color mappings for samples and contaminants.

  • contaminant_list (list) – Optional list of contaminants to exclude.

  • defaults (dict) – Figure defaults and component config.

  • title (str) – Optional plot title.

Return type:

tuple

Returns:

Tuple of (graph div, count data as JSON split).

app.components.qc_analysis.coverage_plot(pandas_json, defaults, title=None)[source]

Create coverage bar plot (proteins identified in N samples).

Parameters:
  • pandas_json (str) – Data table in JSON split format.

  • defaults (dict) – Figure defaults and component config.

  • title (str) – Optional plot title.

Return type:

tuple

Returns:

Tuple of (graph div, coverage data as JSON split).

app.components.qc_analysis.distribution_plot(pandas_json, replicate_colors, sample_groups, defaults, title=None)[source]

Plot value distributions per group as box plots.

Parameters:
  • pandas_json (str) – Data table in JSON split format.

  • replicate_colors (dict) – Mapping for group colors.

  • sample_groups (dict) – Mapping sample -> group.

  • defaults (dict) – Figure defaults and component config.

  • title (str) – Optional title.

Return type:

tuple

Returns:

Tuple of (graph div, original JSON string).

app.components.qc_analysis.generate_commonality_container(sample_groups)[source]

Build the selection controls and display container for commonality.

Parameters:

sample_groups – List of group names.

Returns:

Bootstrap Row with controls and graph area.

app.components.qc_analysis.mean_plot(pandas_json, replicate_colors, defaults, title=None)[source]

Plot mean of values per sample.

Parameters:
  • pandas_json (str) – Data table in JSON split format.

  • replicate_colors (dict) – Mapping for sample colors.

  • defaults (dict) – Figure defaults and component config.

  • title (str) – Optional title.

Return type:

tuple

Returns:

Tuple of (graph div, mean data JSON).

app.components.qc_analysis.missing_plot(pandas_json, replicate_colors, defaults, title=None)[source]

Plot missing value percentage per sample.

Parameters:
  • pandas_json (str) – Data table in JSON split format.

  • replicate_colors (dict) – Mapping for sample colors.

  • defaults (dict) – Figure defaults and component config.

  • title (str) – Optional title.

Return type:

tuple

Returns:

Tuple of (graph div, NA data JSON).

app.components.qc_analysis.parse_tic_data(expdesign_json, replicate_colors, db_file, defaults)[source]

Prepare TIC/BPC/MSn trace bundles for plotting.

Parameters:
  • expdesign_json (str) – Experimental design table in JSON split format.

  • replicate_colors (dict) – Mapping of sample names to colors.

  • db_file (str) – Path to SQLite database file.

  • defaults (dict) – Figure defaults and component config.

Return type:

tuple

Returns:

Tuple of (graph div scaffold, trace dictionary).

app.components.qc_analysis.reproducibility_plot(pandas_json, sample_groups, table_type, defaults, title=None)[source]

Plot per-replicate deviations from group mean.

Parameters:
  • pandas_json (str) – Data table in JSON split format.

  • sample_groups (dict) – Mapping group -> list of columns.

  • table_type (str) – Label for axis title.

  • defaults (dict) – Figure defaults and component config.

  • title (str) – Optional title.

Return type:

tuple

Returns:

Tuple of (graph div, reproducibility data JSON).
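One way to picture "per-replicate deviations from group mean" is to subtract each group's per-protein mean from the replicate columns of that group. The sketch below does exactly that; the helper name deviations_from_group_mean is hypothetical and the real function may summarize the deviations differently before plotting.

```python
import pandas as pd


def deviations_from_group_mean(table, sample_groups):
    """Subtract each group's per-protein mean from its replicate columns."""
    parts = []
    for columns in sample_groups.values():
        values = table[columns]
        parts.append(values.sub(values.mean(axis=1), axis=0))
    return pd.concat(parts, axis=1)


# Hypothetical table with two groups of two replicates each.
table = pd.DataFrame(
    {"A1": [10.0, 4.0], "A2": [12.0, 4.0], "B1": [1.0, 3.0], "B2": [2.0, 9.0]},
    index=["P1", "P2"],
)
deviations = deviations_from_group_mean(
    table, {"A": ["A1", "A2"], "B": ["B1", "B2"]}
)
```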

app.components.qc_analysis.sum_plot(pandas_json, replicate_colors, defaults, title=None)[source]

Plot sum of values per sample.

Parameters:
  • pandas_json (str) – Data table in JSON split format.

  • replicate_colors (dict) – Mapping for sample colors.

  • defaults (dict) – Figure defaults and component config.

  • title (str) – Optional title.

Return type:

tuple

Returns:

Tuple of (graph div, sum data JSON).

app.components.quick_stats module

Quick statistical utilities for QC and differential analysis.

Includes ANOVA, pairwise differential testing, and helpers to compute basic per-sample metrics used across figures.

app.components.quick_stats.anova(dataframe, sample_groups)[source]

Run one-way ANOVA across sample groups for each row.

Parameters:
  • dataframe (DataFrame) – Wide matrix with samples as columns and proteins as rows.

  • sample_groups (dict) – Mapping group -> list of column names.

Return type:

DataFrame

Returns:

DataFrame with F-statistic, p-value, and Benjamini-Hochberg q-value.
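The Benjamini-Hochberg q-value mentioned above can be computed from a list of p-values in a few lines. The following is a plain-Python sketch of the standard procedure; the function name benjamini_hochberg is chosen for this example and says nothing about how anova obtains its q-values.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotonic.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        q_values[index] = running_min
    return q_values


q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each p-value is scaled by n / rank, and the running minimum ensures that a smaller p-value never ends up with a larger q-value than a bigger one.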

app.components.quick_stats.differential(data_table, sample_groups, comparisons, data_is_log2_transformed=True, namemap=None, adj_p_thr=0.01, fc_thr=1.0, test_type='independent', db_file_path=None)[source]

Compute pairwise differential statistics for specified comparisons.

Parameters:
  • data_table (DataFrame) – Wide matrix with samples as columns and proteins as rows.

  • sample_groups (dict) – Mapping from group name to list of column names.

  • comparisons (list) – List of (sample_group, control_group) pairs.

  • data_is_log2_transformed (bool) – If True, data is already log2; otherwise log2 means are computed.

  • namemap (dict) – Optional mapping from index to display name.

  • adj_p_thr (float) – FDR q-value threshold used for the Significant flag.

  • fc_thr (float) – Absolute log2 fold change threshold for Significant.

  • test_type (str) – 'independent' or 'paired' t-test.

  • db_file_path (str) – Optional database path for gene mapping.

Return type:

DataFrame

Returns:

Long-form DataFrame with FC, p-values, q-values and metadata per comparison.
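To make the roles of fc_thr and adj_p_thr concrete, the sketch below derives a log2 fold change for one (sample, control) pair and combines it with precomputed q-values into a Significant flag. Everything here is an assumption for illustration: the helper name, the output column names, the direction of the subtraction, and the exact comparison operators. The statistical test itself is omitted.

```python
import pandas as pd


def fold_change_and_flag(table, sample_groups, comparison, q_values,
                         adj_p_thr=0.01, fc_thr=1.0):
    """Log2 fold change (sample minus control) plus a Significant flag."""
    sample, control = comparison
    # On log2 data, a difference of means is a log2 fold change.
    sample_mean = table[sample_groups[sample]].mean(axis=1)
    control_mean = table[sample_groups[control]].mean(axis=1)
    fold_change = sample_mean - control_mean
    significant = (fold_change.abs() >= fc_thr) & (q_values < adj_p_thr)
    return pd.DataFrame(
        {"fold_change": fold_change, "q_value": q_values, "Significant": significant}
    )


# Hypothetical log2 intensities and hypothetical q-values.
table = pd.DataFrame(
    {"T1": [12.0, 8.0, 9.5], "T2": [12.4, 8.2, 9.5],
     "C1": [10.0, 8.1, 8.0], "C2": [10.2, 8.1, 8.0]},
    index=["P1", "P2", "P3"],
)
q_values = pd.Series([0.001, 0.5, 0.2], index=table.index)
result = fold_change_and_flag(
    table, {"treated": ["T1", "T2"], "control": ["C1", "C2"]},
    ("treated", "control"), q_values,
)
```

In this toy table P3 has a large fold change but fails the q-value threshold, so only P1 is flagged.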

app.components.quick_stats.get_common_data(data_table, rev_sample_groups, only_groups=None)[source]

Collect sets of identified proteins per group.

Parameters:
  • data_table (DataFrame) – Input DataFrame with proteins as index and samples as columns.

  • rev_sample_groups (dict) – Mapping from sample name to group name.

  • only_groups (list | None) – Optional subset of group names to include.

Return type:

dict

Returns:

Dict mapping group name to set of identified proteins.
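A sketch of how such per-group sets could be assembled is shown below. It treats a protein as identified in a group when any sample of that group has a non-missing value, which is an assumption; the helper name identified_per_group is likewise hypothetical.

```python
import numpy as np
import pandas as pd


def identified_per_group(table, rev_sample_groups, only_groups=None):
    """Collect, per group, the proteins with a value in any of its samples."""
    group_sets = {}
    for sample in table.columns:
        group = rev_sample_groups[sample]
        if only_groups is not None and group not in only_groups:
            continue
        identified = set(table[sample].dropna().index)
        group_sets.setdefault(group, set()).update(identified)
    return group_sets


# Hypothetical table and sample -> group mapping.
table = pd.DataFrame(
    {"S1": [1.0, np.nan, 3.0], "S2": [np.nan, np.nan, 2.0], "S3": [np.nan, 5.0, 1.0]},
    index=["P1", "P2", "P3"],
)
rev_groups = {"S1": "A", "S2": "A", "S3": "B"}
sets_by_group = identified_per_group(table, rev_groups)
shared = set.intersection(*sets_by_group.values())
```

The intersection of the resulting sets gives the proteins common to all groups, which is the kind of information a commonality plot summarizes.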

app.components.quick_stats.get_comparative_data(data_table, sample_groups)[source]

Split data into a list of group-specific DataFrames.

Parameters:
  • data_table – Wide matrix with samples as columns.

  • sample_groups – Mapping from sample to group name.

Return type:

tuple

Returns:

Tuple of (list of group names, list of DataFrames).

app.components.quick_stats.get_count_data(data_table, contaminant_list=None)[source]

Count non-NA entries per column, optionally excluding contaminants.

Parameters:
  • data_table (DataFrame) – Input DataFrame.

  • contaminant_list (list) – Optional list of identifiers to treat as contaminants.

Return type:

DataFrame

Returns:

DataFrame with counts and an Is contaminant flag when applicable.

app.components.quick_stats.get_coverage_data(data_table)[source]

Compute identification coverage counts.

Parameters:

data_table (DataFrame) – Input DataFrame.

Return type:

DataFrame

Returns:

DataFrame of value counts of the number of samples each protein is identified in.
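The coverage computation described here can be pictured as two pandas steps: count the non-missing samples per protein, then count how many proteins share each of those totals. The snippet is a sketch on hypothetical data and returns a Series rather than the DataFrame the function documents.

```python
import numpy as np
import pandas as pd

# Hypothetical table: proteins as rows, samples as columns.
table = pd.DataFrame(
    {"S1": [1.0, np.nan, 3.0], "S2": [1.5, np.nan, np.nan], "S3": [2.0, 4.0, np.nan]},
    index=["P1", "P2", "P3"],
)

# Number of samples in which each protein has a non-missing value.
identified_in = table.notna().sum(axis=1)

# Coverage: how many proteins were identified in exactly N samples.
coverage = identified_in.value_counts().sort_index()
```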

app.components.quick_stats.get_mean_data(data_table)[source]

Compute column means.

Parameters:

data_table – Input DataFrame.

Return type:

DataFrame

Returns:

DataFrame with Value mean per sample.

app.components.quick_stats.get_na_data(data_table)[source]

Compute NA percentage per column.

Parameters:

data_table (DataFrame) – Input DataFrame.

Return type:

DataFrame

Returns:

DataFrame with Missing value % per sample.

app.components.quick_stats.get_sum_data(data_table)[source]

Compute column sums.

Parameters:

data_table – Input DataFrame.

Return type:

DataFrame

Returns:

DataFrame with Value sum per sample.

app.components.text_handling module

Text handling utilities for cleaning and normalizing strings.

Utilities for:
  • Removing accent marks from characters

  • Replacing special characters with specified replacements

  • Combined accent and special character handling

  • Simplified text cleaning interface

app.components.text_handling.clean_text(text)[source]

Simplified alias for replace_accent_and_special_characters.

Parameters:

text (str) – Input string to clean.

Return type:

str

Returns:

Cleaned string with default handling.

app.components.text_handling.remove_accent_characters(text)[source]

Replace accented characters with their unaccented equivalents.

Parameters:

text (str) – Input string containing accented characters.

Return type:

str

Returns:

String with accented characters replaced by unaccented equivalents.
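A common standard-library approach to this kind of accent removal is Unicode decomposition followed by dropping the combining marks. The sketch below shows that technique; strip_accents is a hypothetical name, and remove_accent_characters may use a different method (for example an explicit replacement table).

```python
import unicodedata


def strip_accents(text):
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


cleaned = strip_accents("Šírén café")
```

Characters without a decomposition, such as "ø" or "ß", pass through unchanged with this technique, so a replacement table is still needed for them.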

app.components.text_handling.replace_accent_and_special_characters(text, **kwargs)[source]

Replace both accented and special characters in a string.

Parameters:
  • text (str) – Input string containing accented and special characters.

  • kwargs – Passed through to replace_special_characters.

Return type:

str

Returns:

Cleaned string.

app.components.text_handling.replace_special_characters(text, replacewith='.', dict_and_re=False, replacement_dict=None, stripresult=True, remove_duplicates=False, make_lowercase=True, allow_numbers=True, allow_space=False, mask_first_digit=None)[source]

Replace special characters in a string with specified replacements.

Parameters:
  • text (str) – Input string containing special characters.

  • replacewith (str) – Character to use for replacement.

  • dict_and_re (bool) – Whether to apply both dictionary replacements and regex.

  • replacement_dict (Optional[Dict[str, str]]) – Mapping of specific substrings to replacements.

  • stripresult (bool) – Strip whitespace and replacement characters from result.

  • remove_duplicates (bool) – Collapse consecutive replacement characters.

  • make_lowercase (bool) – Convert result to lowercase.

  • allow_numbers (bool) – Allow numbers in the result.

  • allow_space (bool) – Allow spaces in the result.

  • mask_first_digit (str | None) – Character to prefix when first char is a digit.

Return type:

str

Returns:

String with special characters replaced.

app.components.text_handling.sanitize_for_database_use(text)[source]

Sanitize a string for use in a database column name.

Parameters:

text (str) – Input string to sanitize.

Return type:

str

Returns:

Sanitized string (alnum/underscore, prefixed if starting with digit).
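The documented result (alphanumerics and underscores, with a prefix when the name starts with a digit) can be approximated with a single regular expression. In the sketch below the function name sanitize_column_name and the "c_" prefix are assumptions; the prefix actually used by sanitize_for_database_use is not stated in this reference.

```python
import re


def sanitize_column_name(text, digit_prefix="c_"):
    """Reduce text to [A-Za-z0-9_] and prefix names that start with a digit."""
    # Collapse every run of disallowed characters into one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", text).strip("_")
    if cleaned and cleaned[0].isdigit():
        cleaned = digit_prefix + cleaned
    return cleaned


column = sanitize_column_name("2nd run (log2)")
```

Sanitizing identifiers this way complements, but does not replace, parameter binding for values in SQL statements.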

app.components.tooltips module

Tooltip helpers for the UI.

Provides convenience wrappers for Bootstrap tooltips used throughout the application to document inputs and options.

app.components.tooltips.generic_tooltip(target, text)[source]

Create a generic tooltip.

Parameters:
  • target (str) – ID of the target element.

  • text (str) – Tooltip text content.

Return type:

Tooltip

Returns:

Bootstrap Tooltip component.

app.components.tooltips.interactomics_select_top_controls_tooltip(target='interactomics-num-controls')[source]

Tooltip for selecting most similar inbuilt controls.

Parameters:

target – ID of the target element.

Return type:

Tooltip

Returns:

Bootstrap Tooltip component.

app.components.tooltips.na_tooltip(target='filtering-label')[source]

Tooltip for NA filtering control.

Parameters:

target – ID of the target element.

Return type:

Tooltip

Returns:

Bootstrap Tooltip component.

app.components.tooltips.nearest_tooltip(target='interactomics-nearest-control-filtering')[source]

Tooltip for nearest-control selection in interactomics.

Parameters:

target – ID of the target element.

Return type:

Tooltip

Returns:

Bootstrap Tooltip component.

app.components.tooltips.rescue_tooltip(target='interactomics-rescue-filtered-out')[source]

Tooltip explaining the rescue option for SAINT filtering.

Parameters:

target – ID of the target element.

Return type:

Tooltip

Returns:

Bootstrap Tooltip component.

app.components.tooltips.test_type_tooltip(target='proteomics-test-type')[source]

Tooltip describing test type selection (paired vs independent).

Parameters:

target – ID of the target element.

Return type:

Tooltip

Returns:

Bootstrap Tooltip component.

app.components.ui_components module

Components for the user interface.

Reusable Dash/Bootstrap UI components for building the application interface, including checklists, range inputs, uploaders, sidebars, workflow containers, and navigation.

app.components.ui_components.HEADER_DICT

Mapping of header levels to HTML header components.

Type:

dict

app.components.ui_components.checklist(label, options, default_choice, disabled_options=None, id_prefix=None, id_only=False, prefix_list=None, postfix_list=None, clean_id=True, style_override=None)[source]

Create a Bootstrap checklist with customizable options.

Parameters:
  • label (str) – Label text for the checklist.

  • options (List[str]) – Options to display in the checklist.

  • default_choice (List[str]) – Pre-selected options.

  • disabled_options (Optional[List[str]]) – Options to disable.

  • id_prefix (Optional[str]) – Prefix for the component ID.

  • id_only (bool) – If True, removes label from display.

  • prefix_list (Optional[List[Any]]) – Elements to prepend to the checklist.

  • postfix_list (Optional[List[Any]]) – Elements to append to the checklist.

  • clean_id (bool) – If True, sanitize the ID string.

  • style_override (Optional[Dict[str, Any]]) – Custom CSS styles for the component.

Return type:

List[Any]

Returns:

List of components constituting the labeled checklist.

app.components.ui_components.discard_samples_checklist(count_plot, list_of_samples)[source]

Create a checklist UI for selecting samples to discard.

Parameters:
  • count_plot (Div) – Plot component showing sample counts.

  • list_of_samples (List[str]) – List of sample names that can be discarded.

Return type:

List[Any]

Returns:

List of components containing the count plot and checklist.

app.components.ui_components.interactomics_area(parameters, data_dictionary)[source]

Create the main interactomics analysis area and results container.

Parameters:
  • parameters (Dict[str, Any]) – Interactomics configuration parameters.

  • data_dictionary (Dict[str, Any]) – Data required for interactomics analysis.

Return type:

List[Div]

Returns:

List with input and results sections for interactomics.

app.components.ui_components.interactomics_control_col(all_sample_groups, chosen)[source]

Create a column with controls for selecting uploaded control samples.

Parameters:
  • all_sample_groups (List[str]) – All available sample groups.

  • chosen (List[str]) – Pre-selected sample groups.

Return type:

Col

Returns:

Column with a “Select all” checkbox and a checklist.

app.components.ui_components.interactomics_crapome_col(crapome_dict)[source]

Create a column for selecting CRAPome control sets.

Parameters:

crapome_dict (Dict[str, List[str]]) – Dict with available, default, and disabled lists.

Return type:

Col

Returns:

Column with select-all and checklist for CRAPome sets.

app.components.ui_components.interactomics_enrichment_col(enrichment_dict)[source]

Create a column for selecting enrichment analysis options.

Parameters:

enrichment_dict (Dict[str, List[str]]) – Dict with available, default, and disabled lists.

Return type:

Col

Returns:

Column with a deselect-all button and checklist.

app.components.ui_components.interactomics_inbuilt_control_col(controls_dict)[source]

Create a column for selecting built-in control sets.

Parameters:

controls_dict (Dict[str, List[str]]) – Dict with available, default, and disabled lists.

Return type:

Col

Returns:

Column containing a select-all control and checklist.

app.components.ui_components.interactomics_input_card(parameters, data_dictionary)[source]

Create the main input card for interactomics configuration.

Parameters:
  • parameters (Dict[str, Any]) – Dict with controls, crapome, and enrichment options.

  • data_dictionary (Dict[str, Any]) – Dict with normalized sample groups and guessed controls.

Return type:

Div

Returns:

Div containing control selection columns and filtering options.

app.components.ui_components.main_content_div()[source]

Create the main content area for displaying analysis results.

Return type:

Div

Returns:

Div with workflow-specific inputs and result areas.

app.components.ui_components.main_sidebar(figure_templates, implemented_workflows)[source]

Create the main sidebar component with input controls.

Parameters:
  • figure_templates (List[str]) – Available figure style templates.

  • implemented_workflows (List[str]) – Available workflow types.

Return type:

Div

Returns:

Sidebar Div containing inputs, options, and downloads.

app.components.ui_components.make_du_uploader(id_str, message)[source]

Create a dash-uploader component with a success indicator.

Parameters:
  • id_str (str) – ID for the upload component.

  • message (str) – Display message for the upload area.

Return type:

Tuple[Div, str]

Returns:

Tuple of (upload component container, unique session ID).

app.components.ui_components.modals()[source]

Create modal dialogs for the application.

Return type:

Div

Returns:

Div containing modal components (discard samples modal).

app.components.ui_components.navbar(navbar_pages)[source]

Create the main navigation bar for the application.

Parameters:

navbar_pages (List[Tuple[str, str]]) – List of tuples of (name, link) for navigation items.

Return type:

NavbarSimple

Returns:

Bootstrap NavbarSimple with navigation items and branding.

app.components.ui_components.post_saint_container()[source]

Create a container for post-SAINT analysis visualizations.

Return type:

List[Div]

Returns:

List containing a Div with loading indicators for post-SAINT plots.

app.components.ui_components.proteomics_area(parameters, data_dictionary)[source]

Create the main proteomics analysis area and results container.

Parameters:
  • parameters (Dict[str, Any]) – Proteomics-specific configuration parameters.

  • data_dictionary (Dict[str, Any]) – Data required for proteomics analysis.

Return type:

List[Div]

Returns:

List containing input and results sections with loading indicators for NA filtering, normalization, missing values, imputation, CV, PCA, clustermap, and volcano plots.

app.components.ui_components.proteomics_input_card(parameters, data_dictionary)[source]

Create a card containing proteomics analysis input controls.

Parameters:
  • parameters (Dict[str, Any]) – Configuration parameters (NA filter default, imputation/normalization options and defaults).

  • data_dictionary (Dict[str, Any]) – Data containing sample groups and normalization info.

Return type:

Card

Returns:

Bootstrap Card with controls for filtering, imputation, normalization, and thresholds.

app.components.ui_components.qc_area()[source]

Create the quality control analysis area with multiple plots.

Return type:

Div

Returns:

Div containing loading indicators and containers for QC plots.

app.components.ui_components.range_input(label, min_val, max_val, id_str, typestr='number', style_float='center', stepsize=1)[source]

Create a range input component with min and max fields.

Parameters:
  • label (str) – Label text for the range input.

  • min_val (float) – Initial minimum value.

  • max_val (float) – Initial maximum value.

  • id_str (str) – Base ID for the component.

  • typestr (str) – Input type ('number', 'text', etc.).

  • style_float (str) – CSS float for positioning.

  • stepsize (float) – Step size for number inputs.

Return type:

Div

Returns:

Div containing the range input.

app.components.ui_components.saint_filtering_container(defaults, rescue, saint_found)[source]

Create the SAINT filtering controls and visualization container.

Parameters:
  • defaults (Dict[str, Any]) – Default configuration (expects config).

  • rescue (bool) – Whether rescue filtering is enabled.

  • saint_found (bool) – Whether the SAINT executable was found (controls warning visibility).

Return type:

Div

Returns:

Div with SAINT histogram, thresholds, and controls.

app.components.ui_components.table_of_contents(main_div_children, itern=0)[source]

Recursively generate a table of contents from header elements.

Parameters:
  • main_div_children (List[Dict[str, Any]]) – List of HTML component-like dicts to process.

  • itern (int) – Current recursion depth.

Return type:

List[Any]

Returns:

List of HTML components representing the table of contents.
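The recursive traversal can be sketched on plain dictionaries. The example below assumes each component-like dict has a "type" key and a "props" dict with optional "children", which mirrors how Dash serializes components; the helper name collect_headers, the HEADER_LEVELS mapping, and the returned tuples are hypothetical and simpler than the HTML components the real function returns.

```python
HEADER_LEVELS = {"H1": 1, "H2": 2, "H3": 3, "H4": 4, "H5": 5, "H6": 6}


def collect_headers(children, depth=0):
    """Recursively gather (level, text, depth) for header-like dicts."""
    found = []
    for child in children:
        if not isinstance(child, dict):
            continue  # plain strings and numbers carry no headers
        inner = child.get("props", {}).get("children")
        if child.get("type") in HEADER_LEVELS:
            found.append((HEADER_LEVELS[child["type"]], str(inner), depth))
        elif isinstance(inner, dict):
            found.extend(collect_headers([inner], depth + 1))
        elif isinstance(inner, list):
            found.extend(collect_headers(inner, depth + 1))
    return found


# Hypothetical layout fragment in serialized form.
layout = [
    {"type": "H1", "props": {"children": "QC"}},
    {"type": "Div", "props": {"children": [
        {"type": "H2", "props": {"children": "Counts"}},
        {"type": "P", "props": {"children": "text"}},
    ]}},
]
toc = collect_headers(layout)
```

The depth argument plays the same role as itern in the signature above: it records how far the recursion has descended into nested containers.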

app.components.ui_components.upload_area(id_text, upload_id, indicator=True)[source]

Create a drag-and-drop upload area with optional success indicator.

Parameters:
  • id_text (str) – ID for the upload component.

  • upload_id (str) – Display text for the upload area.

  • indicator (bool) – Whether to show upload success indicator.

Return type:

Div

Returns:

Div containing the upload area and optional success indicator.

app.components.ui_components.workflow_area(workflow, workflow_specific_parameters, data_dictionary)[source]

Create the appropriate workflow area based on workflow type.

Parameters:
  • workflow (str) – Workflow type ('Proteomics', 'Interactomics').

  • workflow_specific_parameters (Dict[str, Any]) – Parameters for each workflow type.

  • data_dictionary (Dict[str, Any]) – Data required for the workflow analysis.

Return type:

Div

Returns:

Workflow-specific component tree.

Module contents

Components package for ProteoGyver.