app.pipeline_module package

Submodules

app.pipeline_module.batch_data_store_builder module

Batch Data Store Builder for ProteoGyver

This module constructs data stores in the exact format expected by the GUI, allowing the batch pipeline to use the same infra.save_data_stores function as the interactive GUI.

The module converts batch pipeline results into Dash Store components that match the structure and content expected by the GUI export system.

app.pipeline_module.batch_data_store_builder.build_data_stores_from_batch_output(batch_output_dir, workflow)

Build the complete list of data stores from batch output directory.

Parameters:

batch_output_dir (str) – Directory containing batch output JSON files.
workflow (str) – Workflow name (‘proteomics’ or ‘interactomics’).

Return type:

List[Dict]

Returns:

Complete list of data store components ready for infra.save_data_stores.

app.pipeline_module.batch_data_store_builder.build_interactomics_data_stores(batch_output_dir)

Build interactomics-specific data stores from batch output.

Parameters: batch_output_dir (str) – Directory containing batch output JSON files.
Return type: List[Dict]
Returns: List of interactomics data store components.

app.pipeline_module.batch_data_store_builder.build_proteomics_data_stores(batch_output_dir)

Build proteomics-specific data stores from batch output.

Parameters: batch_output_dir (str) – Directory containing batch output JSON files.
Return type: List[Dict]
Returns: List of proteomics data store components.

app.pipeline_module.batch_data_store_builder.build_qc_data_stores(qc_data)

Build QC-related data stores.

Parameters: qc_data (Dict) – QC artifacts from batch output.
Return type: List[Dict]
Returns: List of QC data store components.

app.pipeline_module.batch_data_store_builder.build_replicate_colors_stores(data_dict)

Build replicate color data stores.

Parameters: data_dict (Dict) – Data dictionary from batch output.
Return type: List[Dict]
Returns: List containing replicate color data store components.

app.pipeline_module.batch_data_store_builder.build_upload_data_store(data_dict)

Build the main upload data store from batch data dictionary.

Parameters: data_dict (Dict) – Data dictionary from batch output.
Return type: Dict
Returns: Data store component for ‘upload-data-store’.

app.pipeline_module.batch_data_store_builder.create_data_store_component(store_id, data, timestamp=None)

Create a Dash data store component.

Parameters:

store_id (str) – Data store ID (e.g., ‘proteomics-volcano-data-store’).
data (Any) – Data to store (dict/JSON string/etc.).
timestamp (Optional[float]) – Optional milliseconds since epoch; uses current time if None.

Return type:

Dict

Returns:

Dash Store component dict structure.
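
As a rough illustration of what such a component dict might look like, the sketch below builds a plain dict shaped like a serialized dcc.Store. The keys (props, type, namespace, modified_timestamp) follow the way Dash serializes components, but whether create_data_store_component produces exactly this layout is an assumption, and make_store_dict is a hypothetical stand-in rather than the module's code.

```python
import time
from typing import Any, Dict, Optional


def make_store_dict(store_id: str, data: Any,
                    timestamp: Optional[float] = None) -> Dict[str, Any]:
    """Hypothetical stand-in: build a dict shaped like a serialized dcc.Store."""
    if timestamp is None:
        # Milliseconds since the epoch, matching the documented unit.
        timestamp = time.time() * 1000
    return {
        "props": {
            "id": store_id,
            "data": data,
            "modified_timestamp": timestamp,  # assumed prop name
        },
        "type": "Store",
        "namespace": "dash_core_components",  # assumed namespace string
    }


# Fixed timestamp so the example is reproducible.
store = make_store_dict("proteomics-volcano-data-store", {"rows": []}, timestamp=0.0)
```

Passing an explicit timestamp keeps exports reproducible; leaving it as None stamps the store with the current time.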

app.pipeline_module.batch_data_store_builder.save_batch_data_using_infra(batch_output_dir, export_dir, workflow)

Save batch data using the GUI’s infra.save_data_stores function.

Parameters:

batch_output_dir (str) – Directory containing batch output JSON files.
export_dir (str) – Directory to save exported data.
workflow (str) – Workflow name (‘proteomics’ or ‘interactomics’).

Return type:

Dict[str, Any]

Returns:

Summary dict of the export operation.

app.pipeline_module.batch_figure_builder_from_divs module

Batch Figure Builder using saved Dash divs

This module loads the saved div pickle files generated by the batch pipeline and uses them directly with infra.save_figures, just like the GUI does.

This approach is much more reliable than trying to reconstruct figures from JSON.

app.pipeline_module.batch_figure_builder_from_divs.build_analysis_divs_from_saved_divs(batch_output_dir, workflow, params)

Build analysis_divs list from saved div pickle files.

The GUI export expects a flat list of individual div components.

Parameters:

batch_output_dir (str) – Directory containing the batch output files.
workflow (str) – Workflow type (‘proteomics’ or ‘interactomics’).
params (dict) – Parsed parameters dict (for TIC rendering defaults).

Return type:

List[Any]

Returns:

List of analysis div components ready for infra.save_figures.

app.pipeline_module.batch_figure_builder_from_divs.get_commonality_pdf_data(batch_output_dir)

Get commonality PDF data if available.

Parameters: batch_output_dir (str) – Directory containing batch output.
Return type: Optional[str]
Returns: PDF data string, or None if not available.

app.pipeline_module.batch_figure_builder_from_divs.load_div_pickle(pickle_path)

Load a div pickle file.

Parameters: pickle_path (str) – Path to the pickle file.
Return type: Dict[str, Any]
Returns: Dict of div components, or an empty dict if the file is not found or cannot be loaded.
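
The "empty dict on failure" contract can be sketched as follows. This is a minimal illustration, not the module's implementation: load_pickle_or_empty is a hypothetical name, and the set of exceptions treated as "error" is an assumption.

```python
import os
import pickle
import tempfile
from typing import Any, Dict


def load_pickle_or_empty(pickle_path: str) -> Dict[str, Any]:
    """Hypothetical stand-in: return the unpickled dict, or {} on any failure."""
    if not os.path.isfile(pickle_path):
        return {}
    try:
        with open(pickle_path, "rb") as handle:
            loaded = pickle.load(handle)
    except (OSError, pickle.UnpicklingError, EOFError, AttributeError, ImportError):
        return {}
    # Only a dict satisfies the documented return type.
    return loaded if isinstance(loaded, dict) else {}


with tempfile.TemporaryDirectory() as demo_dir:
    good_path = os.path.join(demo_dir, "divs.pkl")
    with open(good_path, "wb") as handle:
        pickle.dump({"qc": ["div-1", "div-2"]}, handle)
    empty_path = os.path.join(demo_dir, "empty.pkl")
    open(empty_path, "wb").close()  # zero-byte file, unreadable as a pickle

    loaded = load_pickle_or_empty(good_path)
    missing = load_pickle_or_empty(os.path.join(demo_dir, "absent.pkl"))
    truncated = load_pickle_or_empty(empty_path)
```

Because unpickling can execute arbitrary code, a loader like this should only be pointed at pickle files the pipeline itself produced.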

app.pipeline_module.batch_figure_builder_from_divs.main()

Command-line entry point for figure generation using saved divs.

Returns: None.

app.pipeline_module.batch_figure_builder_from_divs.save_batch_figures_using_saved_divs(batch_output_dir, export_dir, workflow, parameters, output_formats=None)

Save batch figures using saved div pickle files and GUI infrastructure.

Parameters:

batch_output_dir (str) – Directory containing batch output and div pickle files.
export_dir (str) – Directory for figure export.
workflow (str) – Workflow type (‘proteomics’ or ‘interactomics’).
parameters (dict) – Parsed parameters dict for figure defaults.
output_formats (Optional[List[str]]) – Output format list, default [‘html’, ‘pdf’, ‘png’].

Return type:

Dict[str, Any]

Returns:

Summary dict with export details and counts.

app.pipeline_module.pipeline_batch module

class app.pipeline_module.pipeline_batch.BatchConfig(data_table_path, sample_table_path, outdir='batch_out', figure_template='plotly_white', remove_common_contaminants=True, rename_replicates=False, unique_only=False, workflow='proteomics', plot_formats=<factory>, keep_batch_output=False, na_filter_percent=70, na_filter_type='sample-group', normalization='no_normalization', imputation='QRILC', control_group=None, comparison_file=None, fc_threshold=2, p_threshold=0.05, test_type='independent', uploaded_controls=<factory>, additional_controls=<factory>, crapome_sets=<factory>, proximity_filtering=False, n_controls=3, saint_bfdr_threshold=0.05, crapome_percentage_threshold=20, crapome_fc_threshold=2, rescue_enabled=False, chosen_enrichments=<factory>, force_supervenn=False)

Bases: object

additional_controls: List[str]

chosen_enrichments: List[str]

comparison_file: Optional[str] = None

control_group: Optional[str] = None

crapome_fc_threshold: int = 2

crapome_percentage_threshold: int = 20

crapome_sets: List[str]

data_table_path: str

fc_threshold: float = 2

figure_template: str = 'plotly_white'

force_supervenn: bool = False

imputation: str = 'QRILC'

keep_batch_output: bool = False

n_controls: int = 3

na_filter_percent: int = 70

na_filter_type: str = 'sample-group'

normalization: str = 'no_normalization'

outdir: str = 'batch_out'

p_threshold: float = 0.05

plot_formats: List[str]

proximity_filtering: bool = False

remove_common_contaminants: bool = True

rename_replicates: bool = False

rescue_enabled: bool = False

saint_bfdr_threshold: float = 0.05

sample_table_path: str

test_type: str = 'independent'

unique_only: bool = False

uploaded_controls: List[str]

workflow: str = 'proteomics'
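
The `<factory>` placeholders in the signature belong to the list-typed fields, which is how Sphinx renders a dataclass field declared with a default factory. The sketch below shows that pattern on a trimmed-down, hypothetical MiniConfig; that BatchConfig is declared this way, and that its factories produce empty lists, are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MiniConfig:
    """Hypothetical, trimmed-down analogue of BatchConfig."""
    data_table_path: str
    sample_table_path: str
    workflow: str = "proteomics"
    # Mutable defaults need a factory so instances do not share one list.
    plot_formats: List[str] = field(default_factory=list)


first = MiniConfig("data.tsv", "samples.tsv")
second = MiniConfig("data.tsv", "samples.tsv")
first.plot_formats.append("pdf")  # only affects `first`
```

Only the two path fields lack defaults, so they are the only required arguments when constructing a config by hand.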

app.pipeline_module.pipeline_batch.dash_to_wire(obj)

Recursively convert Dash/Plotly components to JSON-serializable structures.

Leaves primitives (str, int, float, bool, None) untouched.
Converts any object exposing to_plotly_json() (Dash components, go.Figure).
Recurses through dicts and lists/tuples.
Dataclasses are converted via asdict() then recursed.

Parameters: obj – Any Python object (Dash component, go.Figure, dict/list, primitives).
Returns: JSON-serializable structure with components replaced by dicts/lists.
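
The four rules above can be sketched as a small recursive function. This is a hypothetical re-implementation for illustration only: to_wire, FakeComponent, and Wrapper are invented names, the order of the checks is a guess, and passing unknown types through unchanged is an assumption.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


def to_wire(obj: Any) -> Any:
    """Hypothetical re-implementation of the four documented rules."""
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    if hasattr(obj, "to_plotly_json"):
        # The returned dict may still hold nested components, so recurse.
        return to_wire(obj.to_plotly_json())
    if is_dataclass(obj) and not isinstance(obj, type):
        return to_wire(asdict(obj))
    if isinstance(obj, dict):
        return {key: to_wire(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_wire(item) for item in obj]
    return obj  # assumption: unknown types pass through unchanged


class FakeComponent:
    """Stand-in for a Dash component; only to_plotly_json() matters here."""

    def __init__(self, children: Any) -> None:
        self.children = children

    def to_plotly_json(self) -> dict:
        return {"type": "Div", "props": {"children": self.children}}


@dataclass
class Wrapper:
    label: str
    payload: Any


wire = to_wire(Wrapper("demo", (FakeComponent([FakeComponent("leaf")]), 1)))
encoded = json.dumps(wire)  # succeeds because no component objects remain
```

Recursing on the result of to_plotly_json() matters because a component's children are themselves components until converted.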

app.pipeline_module.pipeline_batch.run_pipeline(cfg, params)

Execute the batch pipeline mirroring the app’s QC and analysis steps.

Parameters:

cfg (BatchConfig) – Batch configuration object.
params (dict) – Parsed application parameters.

Return type:

Dict[str, Any]

Returns:

Summary dict. As a side effect, JSON artifacts are written to cfg.outdir.

app.pipeline_module.pipeline_from_toml module

app.pipeline_module.pipeline_from_toml.load_config(toml_path, default_toml_dir=None)

Load a complete BatchConfig from a user TOML and defaults.

Parameters:

toml_path (str) – Path to user TOML file.
default_toml_dir (Path | None) – Optional directory of default TOMLs; when provided, a fully expanded TOML is emitted next to the user TOML for transparency.

Return type:

BatchConfig

Returns:

Populated BatchConfig instance.

app.pipeline_module.pipeline_from_toml.load_pipeline_parameters(user_toml, defaults_dir)

Build the final parameters by merging three sources in order of precedence.

Precedence (lowest to highest): common defaults, workflow defaults, user TOML. Each later source overrides the earlier ones.

Parameters:

user_toml (Path) – User-provided TOML path.
defaults_dir (Path) – Directory containing default TOMLs.

Return type:

dict[str, Any]

Returns:

Merged parameters dictionary.

Raises:

KeyError – If workflow is not defined in the user TOML.
ValueError – If workflow is unsupported.
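
The layering and the two documented errors can be sketched with plain dicts standing in for parsed TOML. Everything here is illustrative: merge_parameters and deep_merge are hypothetical names, the keys general and imputation are made up, and both the top-level position of workflow and the recursive merging of nested tables are assumptions about the real loader.

```python
from typing import Any, Dict

SUPPORTED_WORKFLOWS = ("proteomics", "interactomics")


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new top-level dict in which values from override win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_parameters(common: Dict[str, Any],
                     per_workflow: Dict[str, Dict[str, Any]],
                     user: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical merge: common defaults, then workflow defaults, then user."""
    workflow = user["workflow"]  # raises KeyError when the key is missing
    if workflow not in SUPPORTED_WORKFLOWS:
        raise ValueError(f"Unsupported workflow: {workflow!r}")
    return deep_merge(deep_merge(common, per_workflow[workflow]), user)


common = {"general": {"figure_template": "plotly_white", "p_threshold": 0.05}}
per_workflow = {
    "proteomics": {"general": {"p_threshold": 0.01}, "imputation": "QRILC"},
    "interactomics": {},
}
user = {"workflow": "proteomics", "general": {"figure_template": "plotly_dark"}}

params = merge_parameters(common, per_workflow, user)
```

In this example the user file changes only figure_template, the workflow defaults tighten p_threshold, and everything else falls through from the common defaults.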

app.pipeline_module.pipeline_input_watcher module

Directory watcher that runs the batch pipeline on stable input trees.

It selects a pipeline TOML under each subdirectory, validates it, ensures the directory is quiescent, and invokes the pipeline runner. Results and errors are written next to the inputs.
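
One common way to decide that a directory is quiescent is to require that nothing beneath it has been modified for some quiet period. The sketch below takes that approach; it is an assumption about how such a check could work, not the watcher's actual logic, and newest_mtime, is_quiescent, and the 30-second window are all hypothetical.

```python
import os
import tempfile
import time
from typing import Optional


def newest_mtime(root: str) -> float:
    """Return the most recent modification time found under root."""
    latest = os.path.getmtime(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                latest = max(latest, os.path.getmtime(os.path.join(dirpath, name)))
            except OSError:
                continue  # entry vanished mid-scan
    return latest


def is_quiescent(root: str, quiet_seconds: float = 30.0,
                 now: Optional[float] = None) -> bool:
    """True when nothing under root changed during the last quiet_seconds."""
    current = time.time() if now is None else now
    return current - newest_mtime(root) >= quiet_seconds


with tempfile.TemporaryDirectory() as demo_dir:
    # Dummy input file; the name and content are placeholders.
    with open(os.path.join(demo_dir, "pipeline.toml"), "w") as handle:
        handle.write('workflow = "proteomics"\n')
    last_change = newest_mtime(demo_dir)
    just_written = is_quiescent(demo_dir, quiet_seconds=30.0, now=last_change + 1)
    settled = is_quiescent(demo_dir, quiet_seconds=30.0, now=last_change + 60)
```

The now parameter exists only so the check can be exercised without sleeping; a watcher loop would normally leave it unset.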

Module contents

Pipeline module for background analyses and watchers.