app.pipeline_module package

Submodules

app.pipeline_module.batch_data_store_builder module

Batch Data Store Builder for ProteoGyver

This module constructs data stores in the exact format expected by the GUI, allowing the batch pipeline to use the same infra.save_data_stores function as the interactive GUI.

The module converts batch pipeline results into Dash Store components that match the structure and content expected by the GUI export system.

app.pipeline_module.batch_data_store_builder.build_data_stores_from_batch_output(batch_output_dir, workflow)[source]

Build the complete list of data stores from batch output directory.

Parameters:
  • batch_output_dir (str) – Directory containing batch output JSON files.

  • workflow (str) – Workflow name (‘proteomics’ or ‘interactomics’).

Return type:

List[Dict]

Returns:

Complete list of data store components ready for infra.save_data_stores.

app.pipeline_module.batch_data_store_builder.build_interactomics_data_stores(batch_output_dir)[source]

Build interactomics-specific data stores from batch output.

Parameters:

batch_output_dir (str) – Directory containing batch output JSON files.

Return type:

List[Dict]

Returns:

List of interactomics data store components.

app.pipeline_module.batch_data_store_builder.build_proteomics_data_stores(batch_output_dir)[source]

Build proteomics-specific data stores from batch output.

Parameters:

batch_output_dir (str) – Directory containing batch output JSON files.

Return type:

List[Dict]

Returns:

List of proteomics data store components.

app.pipeline_module.batch_data_store_builder.build_qc_data_stores(qc_data)[source]

Build QC-related data stores.

Parameters:

qc_data (Dict) – QC artifacts from batch output.

Return type:

List[Dict]

Returns:

List of QC data store components.

app.pipeline_module.batch_data_store_builder.build_replicate_colors_stores(data_dict)[source]

Build replicate color data stores.

Parameters:

data_dict (Dict) – Data dictionary from batch output.

Return type:

List[Dict]

Returns:

List containing replicate color data store components.

app.pipeline_module.batch_data_store_builder.build_upload_data_store(data_dict)[source]

Build the main upload data store from batch data dictionary.

Parameters:

data_dict (Dict) – Data dictionary from batch output.

Return type:

Dict

Returns:

Data store component for ‘upload-data-store’.

app.pipeline_module.batch_data_store_builder.create_data_store_component(store_id, data, timestamp=None)[source]

Create a Dash data store component.

Parameters:
  • store_id (str) – Data store ID (e.g., ‘proteomics-volcano-data-store’).

  • data (Any) – Data to store (dict/JSON string/etc.).

  • timestamp (Optional[float]) – Optional milliseconds since epoch; uses current time if None.

Return type:

Dict

Returns:

Dash Store component dict structure.
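As a rough illustration of what such a component dict might look like, the sketch below builds the props/type/namespace shape that Dash components serialize to. The helper name make_store_component, the exact keys, and the use of the modified_timestamp prop are assumptions made for this example; the real function's output is not shown in this reference and may differ.

```python
import time
from typing import Any, Dict, Optional


def make_store_component(store_id: str, data: Any,
                         timestamp: Optional[float] = None) -> Dict[str, Any]:
    """Hypothetical stand-in for create_data_store_component."""
    if timestamp is None:
        # Milliseconds since the Unix epoch, matching the documented unit.
        timestamp = time.time() * 1000
    # Assumed layout: the props/type/namespace shape of a serialized
    # Dash component. The real function's keys may differ.
    return {
        "props": {
            "id": store_id,
            "data": data,
            "modified_timestamp": timestamp,
        },
        "type": "Store",
        "namespace": "dash_core_components",
    }


store = make_store_component(
    "proteomics-volcano-data-store", {"rows": []}, timestamp=0.0
)
```

Passing an explicit timestamp, as above, keeps exported stores reproducible between runs; leaving it as None stamps the store with the current time.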

app.pipeline_module.batch_data_store_builder.save_batch_data_using_infra(batch_output_dir, export_dir, workflow)[source]

Save batch data using the GUI’s infra.save_data_stores function.

Parameters:
  • batch_output_dir (str) – Directory containing batch output JSON files.

  • export_dir (str) – Directory to save exported data.

  • workflow (str) – Workflow name (‘proteomics’ or ‘interactomics’).

Return type:

Dict[str, Any]

Returns:

Summary dict of the export operation.

app.pipeline_module.batch_figure_builder_from_divs module

Batch Figure Builder using saved Dash divs

This module loads the saved div pickle files generated by the batch pipeline and uses them directly with infra.save_figures, just like the GUI does.

This approach is much more reliable than trying to reconstruct figures from JSON.

app.pipeline_module.batch_figure_builder_from_divs.build_analysis_divs_from_saved_divs(batch_output_dir, workflow, params)[source]

Build analysis_divs list from saved div pickle files.

The GUI export expects a flat list of individual div components.

Parameters:
  • batch_output_dir (str) – Directory containing the batch output files.

  • workflow (str) – Workflow type (‘proteomics’ or ‘interactomics’).

  • params (dict) – Parsed parameters dict (for TIC rendering defaults).

Return type:

List[Any]

Returns:

List of analysis div components ready for infra.save_figures.
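The "flat list" requirement can be pictured with a small sketch. It assumes the saved divs arrive as a mapping whose values are either a single div or a list of divs; that layout and the helper name flatten_divs are hypothetical, and plain strings stand in for Dash components.

```python
from typing import Any, Dict, List


def flatten_divs(saved: Dict[str, Any]) -> List[Any]:
    """Flatten a mapping of saved divs into one flat list (hypothetical helper)."""
    flat: List[Any] = []
    for value in saved.values():
        if isinstance(value, (list, tuple)):
            # A section that stored several divs contributes each one separately.
            flat.extend(value)
        elif value is not None:
            # A single div is appended as-is; missing sections are skipped.
            flat.append(value)
    return flat


# Strings stand in for Dash div components in this example.
saved_divs = {"qc": ["qc-div-1", "qc-div-2"], "volcano": "volcano-div", "missing": None}
analysis_divs = flatten_divs(saved_divs)
```

Only one level of nesting is flattened here; deeper nesting would need a recursive variant.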

app.pipeline_module.batch_figure_builder_from_divs.get_commonality_pdf_data(batch_output_dir)[source]

Get commonality PDF data if available.

Parameters:

batch_output_dir (str) – Directory containing batch output.

Return type:

Optional[str]

Returns:

PDF data string, or None if not available.

app.pipeline_module.batch_figure_builder_from_divs.load_div_pickle(pickle_path)[source]

Load a div pickle file.

Parameters:

pickle_path (str) – Path to the pickle file.

Return type:

Dict[str, Any]

Returns:

Dict of div components, or an empty dict if the file is not found or cannot be loaded.
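The documented fallback behaviour (an empty dict when the file is missing or unreadable) could be implemented along the following lines. This is a minimal sketch, not the module's code: the name load_pickle_or_empty and the choice to also reject non-dict payloads are assumptions.

```python
import pickle
from pathlib import Path
from typing import Any, Dict


def load_pickle_or_empty(pickle_path: str) -> Dict[str, Any]:
    """Hypothetical stand-in for load_div_pickle."""
    path = Path(pickle_path)
    if not path.is_file():
        return {}
    try:
        # Only unpickle files produced by a trusted pipeline run:
        # unpickling can execute arbitrary code.
        with path.open("rb") as handle:
            loaded = pickle.load(handle)
    except Exception:
        # Corrupt or incompatible pickles are treated like a missing file.
        return {}
    # Assumption: anything other than a dict is treated as unusable.
    return loaded if isinstance(loaded, dict) else {}
```

Returning an empty dict instead of raising lets a caller iterate over the result unconditionally, at the cost of hiding the reason a file could not be read.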

app.pipeline_module.batch_figure_builder_from_divs.main()[source]

Command-line entry point for figure generation using saved divs.

Returns:

None.

app.pipeline_module.batch_figure_builder_from_divs.save_batch_figures_using_saved_divs(batch_output_dir, export_dir, workflow, parameters, output_formats=None)[source]

Save batch figures using saved div pickle files and GUI infrastructure.

Parameters:
  • batch_output_dir (str) – Directory containing batch output and div pickle files.

  • export_dir (str) – Directory for figure export.

  • workflow (str) – Workflow type (‘proteomics’ or ‘interactomics’).

  • parameters (dict) – Parsed parameters dict for figure defaults.

  • output_formats (Optional[List[str]]) – Output format list, default [‘html’, ‘pdf’, ‘png’].

Return type:

Dict[str, Any]

Returns:

Summary dict with export details and counts.

app.pipeline_module.pipeline_batch module

class app.pipeline_module.pipeline_batch.BatchConfig(data_table_path, sample_table_path, outdir='batch_out', figure_template='plotly_white', remove_common_contaminants=True, rename_replicates=False, unique_only=False, workflow='proteomics', plot_formats=<factory>, keep_batch_output=False, na_filter_percent=70, na_filter_type='sample-group', normalization='no_normalization', imputation='QRILC', control_group=None, comparison_file=None, fc_threshold=2, p_threshold=0.05, test_type='independent', uploaded_controls=<factory>, additional_controls=<factory>, crapome_sets=<factory>, proximity_filtering=False, n_controls=3, saint_bfdr_threshold=0.05, crapome_percentage_threshold=20, crapome_fc_threshold=2, rescue_enabled=False, chosen_enrichments=<factory>, force_supervenn=False)[source]

Bases: object

additional_controls: List[str]
chosen_enrichments: List[str]
comparison_file: Optional[str] = None
control_group: Optional[str] = None
crapome_fc_threshold: int = 2
crapome_percentage_threshold: int = 20
crapome_sets: List[str]
data_table_path: str
fc_threshold: float = 2
figure_template: str = 'plotly_white'
force_supervenn: bool = False
imputation: str = 'QRILC'
keep_batch_output: bool = False
n_controls: int = 3
na_filter_percent: int = 70
na_filter_type: str = 'sample-group'
normalization: str = 'no_normalization'
outdir: str = 'batch_out'
p_threshold: float = 0.05
plot_formats: List[str]
proximity_filtering: bool = False
remove_common_contaminants: bool = True
rename_replicates: bool = False
rescue_enabled: bool = False
saint_bfdr_threshold: float = 0.05
sample_table_path: str
test_type: str = 'independent'
unique_only: bool = False
uploaded_controls: List[str]
workflow: str = 'proteomics'

app.pipeline_module.pipeline_batch.dash_to_wire(obj)[source]

Recursively convert Dash/Plotly components to JSON-serializable structures.

  • Leaves primitives (str, int, float, bool, None) untouched.

  • Converts any object exposing to_plotly_json() (Dash components, go.Figure).

  • Recurses through dicts and lists/tuples.

  • Dataclasses are converted via asdict() then recursed.

Parameters:

obj – Any Python object (Dash component, go.Figure, dict/list, primitives).

Returns:

JSON-serializable structure with components replaced by dicts/lists.
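The four rules listed above can be sketched as a single recursive function. This is an illustration of the described behaviour, not the project's implementation: to_wire, FakeComponent, and Meta are names invented for the example, and FakeComponent only mimics the to_plotly_json() method so the snippet runs without Dash installed.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


def to_wire(obj: Any) -> Any:
    """Sketch of the documented conversion rules (not the project's code)."""
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    if hasattr(obj, "to_plotly_json"):
        # Recurse on the result: it can still contain nested components.
        return to_wire(obj.to_plotly_json())
    if is_dataclass(obj) and not isinstance(obj, type):
        return to_wire(asdict(obj))
    if isinstance(obj, dict):
        return {key: to_wire(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_wire(item) for item in obj]
    return obj  # Assumption: unknown types pass through unchanged.


class FakeComponent:
    """Stand-in for a Dash component so the example runs without Dash."""

    def __init__(self, children: Any) -> None:
        self.children = children

    def to_plotly_json(self) -> dict:
        return {"type": "Div", "props": {"children": self.children}}


@dataclass
class Meta:
    title: str
    tags: tuple


wired = to_wire({
    "layout": FakeComponent([FakeComponent("inner"), 3]),
    "meta": Meta("qc", ("a", "b")),
})
```

After conversion, wired contains only dicts, lists, and primitives, so it can be passed to json.dumps. Tuples come back as lists, which matches how JSON would represent them anyway.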

app.pipeline_module.pipeline_batch.run_pipeline(cfg, params)[source]

Execute the batch pipeline mirroring the app’s QC and analysis steps.

Parameters:
  • cfg (BatchConfig) – Batch configuration object.

  • params (dict) – Parsed application parameters.

Return type:

Dict[str, Any]

Returns:

Summary dict of the run. JSON artifacts are written to cfg.outdir as a side effect.

app.pipeline_module.pipeline_from_toml module

app.pipeline_module.pipeline_from_toml.load_config(toml_path, default_toml_dir=None)[source]

Load a complete BatchConfig from a user TOML and defaults.

Parameters:
  • toml_path (str) – Path to user TOML file.

  • default_toml_dir (Path | None) – Optional directory of default TOMLs; when provided, a fully expanded TOML is emitted next to the user TOML for transparency.

Return type:

BatchConfig

Returns:

Populated BatchConfig instance.

app.pipeline_module.pipeline_from_toml.load_pipeline_parameters(user_toml, defaults_dir)[source]

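Build the final parameters by merging defaults and user settings in order of precedence.

Precedence, lowest to highest: common defaults <- workflow defaults <- user TOML. A value from a later source overrides the same key from an earlier one.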

Parameters:
  • user_toml (Path) – User-provided TOML path.

  • defaults_dir (Path) – Directory containing default TOMLs.

Return type:

dict[str, Any]

Returns:

Merged parameters dictionary.

Raises:
  • KeyError – If workflow is not defined in the user TOML.

  • ValueError – If workflow is unsupported.
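The precedence chain and the two documented exceptions can be sketched on already-parsed dictionaries. Whether the real merge is deep or shallow, and whether workflow is a top-level key, are not stated in this reference; both are assumptions here, as are the names deep_merge and merge_parameters and the example keys.

```python
from typing import Any, Dict

SUPPORTED_WORKFLOWS = ("proteomics", "interactomics")


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from override win over base."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested tables are merged key by key rather than replaced.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_parameters(common: Dict[str, Any],
                     workflow_defaults: Dict[str, Dict[str, Any]],
                     user: Dict[str, Any]) -> Dict[str, Any]:
    workflow = user["workflow"]  # raises KeyError if the user TOML omits it
    if workflow not in SUPPORTED_WORKFLOWS:
        raise ValueError(f"Unsupported workflow: {workflow!r}")
    # Apply sources from lowest to highest precedence.
    return deep_merge(deep_merge(common, workflow_defaults[workflow]), user)


# Hypothetical parsed TOML contents.
common = {"figure_template": "plotly_white",
          "stats": {"p_threshold": 0.05, "fc_threshold": 2}}
workflow_defaults = {"proteomics": {"imputation": "QRILC"},
                     "interactomics": {"n_controls": 3}}
user = {"workflow": "proteomics", "stats": {"fc_threshold": 1.5}}

params = merge_parameters(common, workflow_defaults, user)
```

In this example the user's fc_threshold replaces the common default while p_threshold survives, because the nested table is merged rather than overwritten.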

app.pipeline_module.pipeline_input_watcher module

Directory watcher that runs the batch pipeline on stable input trees.

It selects a pipeline TOML under each subdirectory, validates it, ensures the directory is quiescent, and invokes the pipeline runner. Results and errors are written next to the inputs.
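One common way to decide that a directory is quiescent is to check that no file in it has been modified for some quiet period. The sketch below shows that idea only; the watcher's actual criterion, the 30-second default, and the name is_quiescent are assumptions for illustration.

```python
import os
import time
from typing import Optional


def is_quiescent(directory: str, quiet_seconds: float = 30.0,
                 now: Optional[float] = None) -> bool:
    """True if no file under directory was modified in the last quiet_seconds."""
    now = time.time() if now is None else now
    newest = 0.0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            try:
                mtime = os.path.getmtime(os.path.join(root, name))
            except OSError:
                # A file vanished mid-scan, so the tree is still changing.
                return False
            newest = max(newest, mtime)
    # An empty tree has newest == 0.0 and therefore counts as quiescent.
    return (now - newest) >= quiet_seconds
```

A watcher loop would typically call such a check on each subdirectory and start the pipeline only once it returns True, so that partially copied inputs are not processed.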

Module contents

Pipeline module for background analyses and watchers.