app.pipeline_module package
Submodules
app.pipeline_module.batch_data_store_builder module
Batch Data Store Builder for ProteoGyver
This module constructs data stores in the exact format expected by the GUI, allowing the batch pipeline to use the same infra.save_data_stores function as the interactive GUI.
The module converts batch pipeline results into Dash Store components that match the structure and content expected by the GUI export system.
- app.pipeline_module.batch_data_store_builder.build_data_stores_from_batch_output(batch_output_dir, workflow)
Build the complete list of data stores from batch output directory.
- Parameters:
batch_output_dir (str) – Directory containing batch output JSON files.
workflow (str) – Workflow name (‘proteomics’ or ‘interactomics’).
- Return type:
List[Dict]
- Returns:
Complete list of data store components ready for infra.save_data_stores.
- app.pipeline_module.batch_data_store_builder.build_interactomics_data_stores(batch_output_dir)
Build interactomics-specific data stores from batch output.
- Parameters:
batch_output_dir (str) – Directory containing batch output JSON files.
- Return type:
List[Dict]
- Returns:
List of interactomics data store components.
- app.pipeline_module.batch_data_store_builder.build_proteomics_data_stores(batch_output_dir)
Build proteomics-specific data stores from batch output.
- Parameters:
batch_output_dir (str) – Directory containing batch output JSON files.
- Return type:
List[Dict]
- Returns:
List of proteomics data store components.
- app.pipeline_module.batch_data_store_builder.build_qc_data_stores(qc_data)
Build QC-related data stores.
- Parameters:
qc_data (Dict) – QC artifacts from batch output.
- Return type:
List[Dict]
- Returns:
List of QC data store components.
- app.pipeline_module.batch_data_store_builder.build_replicate_colors_stores(data_dict)
Build replicate color data stores.
- Parameters:
data_dict (Dict) – Data dictionary from batch output.
- Return type:
List[Dict]
- Returns:
List containing replicate color data store components.
- app.pipeline_module.batch_data_store_builder.build_upload_data_store(data_dict)
Build the main upload data store from batch data dictionary.
- Parameters:
data_dict (Dict) – Data dictionary from batch output.
- Return type:
Dict
- Returns:
Data store component for ‘upload-data-store’.
- app.pipeline_module.batch_data_store_builder.create_data_store_component(store_id, data, timestamp=None)
Create a Dash data store component.
- Parameters:
store_id (str) – Data store ID (e.g., ‘proteomics-volcano-data-store’).
data (Any) – Data to store (dict/JSON string/etc.).
timestamp (Optional[float]) – Optional milliseconds since epoch; uses current time if None.
- Return type:
Dict
- Returns:
Dash Store component dict structure.
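The exact dict layout is not spelled out here. As a rough sketch, assuming the result mirrors the shape Dash components serialize to (a type, a namespace, and a props mapping) and that the timestamp is stored in the Store's modified_timestamp prop, a stand-in could look like this. The function name sketch_store_component and the precise keys are assumptions for illustration, not the module's actual code.

```python
import time
from typing import Any, Dict, Optional


def sketch_store_component(store_id: str, data: Any,
                           timestamp: Optional[float] = None) -> Dict[str, Any]:
    """Hypothetical stand-in for create_data_store_component."""
    if timestamp is None:
        # Milliseconds since the epoch, as the parameter description states.
        timestamp = time.time() * 1000.0
    return {
        # Assumed layout: the general shape of a serialized Dash component.
        "type": "Store",
        "namespace": "dash_core_components",
        "props": {
            "id": store_id,
            "data": data,
            "modified_timestamp": timestamp,
        },
    }


# An explicit timestamp keeps the result reproducible.
store = sketch_store_component("proteomics-volcano-data-store", {"rows": []}, timestamp=0.0)
```

Passing an explicit timestamp is mainly useful for reproducible exports; leaving it as None stamps the component with the current time.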
- app.pipeline_module.batch_data_store_builder.save_batch_data_using_infra(batch_output_dir, export_dir, workflow)
Save batch data using the GUI’s infra.save_data_stores function.
- Parameters:
batch_output_dir (str) – Directory containing batch output JSON files.
export_dir (str) – Directory to save exported data.
workflow (str) – Workflow name (‘proteomics’ or ‘interactomics’).
- Return type:
Dict[str, Any]
- Returns:
Summary dict of the export operation.
app.pipeline_module.batch_figure_builder_from_divs module
Batch Figure Builder using saved Dash divs
This module loads the saved div pickle files generated by the batch pipeline and uses them directly with infra.save_figures, just like the GUI does.
This approach is much more reliable than trying to reconstruct figures from JSON.
- app.pipeline_module.batch_figure_builder_from_divs.build_analysis_divs_from_saved_divs(batch_output_dir, workflow, params)
Build analysis_divs list from saved div pickle files.
The GUI export expects a flat list of individual div components.
- Parameters:
batch_output_dir (str) – Directory containing the batch output files.
workflow (str) – Workflow type (‘proteomics’ or ‘interactomics’).
params (dict) – Parsed parameters dict (for TIC rendering defaults).
- Return type:
List[Any]
- Returns:
List of analysis div components ready for infra.save_figures.
- app.pipeline_module.batch_figure_builder_from_divs.get_commonality_pdf_data(batch_output_dir)
Get commonality PDF data if available.
- Parameters:
batch_output_dir (str) – Directory containing batch output.
- Return type:
Optional[str]
- Returns:
PDF data string, or None if not available.
- app.pipeline_module.batch_figure_builder_from_divs.load_div_pickle(pickle_path)
Load a div pickle file.
- Parameters:
pickle_path (str) – Path to the pickle file.
- Return type:
Dict[str, Any]
- Returns:
Dict of div components or empty dict if not found or on error.
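The "empty dict if not found or on error" contract can be illustrated with a small sketch. This is not the module's implementation; the helper name load_pickle_or_empty and the file names are hypothetical, and the only behavior taken from the description above is the fallback to an empty dict.

```python
import os
import pickle
import tempfile
from typing import Any, Dict


def load_pickle_or_empty(pickle_path: str) -> Dict[str, Any]:
    """Hypothetical helper: return the unpickled object, or {} on any failure."""
    if not os.path.isfile(pickle_path):
        return {}
    try:
        with open(pickle_path, "rb") as handle:
            return pickle.load(handle)
    except Exception:
        # Corrupt or unreadable files are treated the same as missing ones.
        return {}


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "qc_divs.pkl")  # hypothetical file name
    with open(good, "wb") as handle:
        pickle.dump({"tic": "placeholder div"}, handle)

    bad = os.path.join(tmp, "corrupt.pkl")
    with open(bad, "wb") as handle:
        handle.write(b"not a pickle")

    loaded = load_pickle_or_empty(good)
    corrupt = load_pickle_or_empty(bad)
    missing = load_pickle_or_empty(os.path.join(tmp, "absent.pkl"))
```

Unpickling can execute arbitrary code, so a loader like this should only be pointed at files the pipeline itself produced.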
- app.pipeline_module.batch_figure_builder_from_divs.main()
Command-line entry point for figure generation using saved divs.
- Returns:
None.
- app.pipeline_module.batch_figure_builder_from_divs.save_batch_figures_using_saved_divs(batch_output_dir, export_dir, workflow, parameters, output_formats=None)
Save batch figures using saved div pickle files and GUI infrastructure.
- Parameters:
batch_output_dir (str) – Directory containing batch output and div pickle files.
export_dir (str) – Directory for figure export.
workflow (str) – Workflow type (‘proteomics’ or ‘interactomics’).
parameters (dict) – Parsed parameters dict for figure defaults.
output_formats (Optional[List[str]]) – Output format list, default [‘html’, ‘pdf’, ‘png’].
- Return type:
Dict[str, Any]
- Returns:
Summary dict with export details and counts.
app.pipeline_module.pipeline_batch module
- class app.pipeline_module.pipeline_batch.BatchConfig(data_table_path, sample_table_path, outdir='batch_out', figure_template='plotly_white', remove_common_contaminants=True, rename_replicates=False, unique_only=False, workflow='proteomics', plot_formats=<factory>, keep_batch_output=False, na_filter_percent=70, na_filter_type='sample-group', normalization='no_normalization', imputation='QRILC', control_group=None, comparison_file=None, fc_threshold=2, p_threshold=0.05, test_type='independent', uploaded_controls=<factory>, additional_controls=<factory>, crapome_sets=<factory>, proximity_filtering=False, n_controls=3, saint_bfdr_threshold=0.05, crapome_percentage_threshold=20, crapome_fc_threshold=2, rescue_enabled=False, chosen_enrichments=<factory>, force_supervenn=False)
Bases: object
- additional_controls: List[str]
- chosen_enrichments: List[str]
- comparison_file: Optional[str] = None
- control_group: Optional[str] = None
- crapome_fc_threshold: int = 2
- crapome_percentage_threshold: int = 20
- crapome_sets: List[str]
- data_table_path: str
- fc_threshold: float = 2
- figure_template: str = 'plotly_white'
- force_supervenn: bool = False
- imputation: str = 'QRILC'
- keep_batch_output: bool = False
- n_controls: int = 3
- na_filter_percent: int = 70
- na_filter_type: str = 'sample-group'
- normalization: str = 'no_normalization'
- outdir: str = 'batch_out'
- p_threshold: float = 0.05
- plot_formats: List[str]
- proximity_filtering: bool = False
- remove_common_contaminants: bool = True
- rename_replicates: bool = False
- rescue_enabled: bool = False
- saint_bfdr_threshold: float = 0.05
- sample_table_path: str
- test_type: str = 'independent'
- unique_only: bool = False
- uploaded_controls: List[str]
- workflow: str = 'proteomics'
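The <factory> placeholders in the signature are how a dataclass displays fields declared with field(default_factory=...), which gives every instance its own list. The sketch below reproduces only a handful of fields under a hypothetical class name, BatchConfigSketch. What the real factories return is not documented here, so an empty list is assumed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BatchConfigSketch:
    """Hypothetical subset of BatchConfig; not the real class."""
    data_table_path: str
    sample_table_path: str
    outdir: str = "batch_out"
    workflow: str = "proteomics"
    # Assumption: the real factory's return value is unknown; list() is a placeholder.
    plot_formats: List[str] = field(default_factory=list)


# File names are placeholders for illustration.
first = BatchConfigSketch("data.tsv", "samples.tsv")
second = BatchConfigSketch("data.tsv", "samples.tsv", workflow="interactomics")

# Mutating one instance's list leaves the other untouched.
first.plot_formats.append("pdf")
```

The two required fields come first because dataclass fields without defaults cannot follow fields with defaults.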
- app.pipeline_module.pipeline_batch.dash_to_wire(obj)
Recursively convert Dash/Plotly components to JSON-serializable structures.
Leaves primitives (str, int, float, bool, None) untouched.
Converts any object exposing to_plotly_json() (Dash components, go.Figure).
Recurses through dicts and lists/tuples.
Dataclasses are converted via asdict(), then recursed.
- Parameters:
obj – Any Python object (Dash component, go.Figure, dict/list, primitives).
- Returns:
JSON-serializable structure with components replaced by dicts/lists.
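The conversion rules listed for dash_to_wire can be sketched without importing Dash by relying on duck typing. The function to_wire_sketch and the FakeComponent class are hypothetical stand-ins. The order of the checks, and the choice to return unrecognized objects unchanged, are assumptions rather than documented behavior.

```python
from dataclasses import asdict, is_dataclass
from typing import Any


def to_wire_sketch(obj: Any) -> Any:
    """Hypothetical re-implementation of the rules described for dash_to_wire."""
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj  # primitives pass through untouched
    if hasattr(obj, "to_plotly_json"):
        # Components may nest further components, so recurse on the result.
        return to_wire_sketch(obj.to_plotly_json())
    if is_dataclass(obj) and not isinstance(obj, type):
        return to_wire_sketch(asdict(obj))
    if isinstance(obj, dict):
        return {key: to_wire_sketch(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_wire_sketch(item) for item in obj]
    return obj  # assumption: anything else is returned as-is


class FakeComponent:
    """Stand-in for a Dash component; only to_plotly_json() matters here."""

    def __init__(self, children: Any) -> None:
        self.children = children

    def to_plotly_json(self) -> dict:
        return {"type": "Div", "props": {"children": self.children}}


wire = to_wire_sketch({"a": (1, FakeComponent([FakeComponent("x")]))})
```

In this sketch tuples come back as lists, which matches how JSON would represent them anyway.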
- app.pipeline_module.pipeline_batch.run_pipeline(cfg, params)
Execute the batch pipeline mirroring the app’s QC and analysis steps.
- Parameters:
cfg (BatchConfig) – Batch configuration object.
params (dict) – Parsed application parameters.
- Return type:
Dict[str, Any]
- Returns:
Summary dict and JSON artifacts written to cfg.outdir.
app.pipeline_module.pipeline_from_toml module
- app.pipeline_module.pipeline_from_toml.load_config(toml_path, default_toml_dir=None)
Load a complete BatchConfig from a user TOML and defaults.
- Parameters:
toml_path (str) – Path to user TOML file.
default_toml_dir (Path | None) – Optional directory of default TOMLs; when provided, a fully expanded TOML is emitted next to the user TOML for transparency.
- Return type:
BatchConfig
- Returns:
Populated BatchConfig instance.
- app.pipeline_module.pipeline_from_toml.load_pipeline_parameters(user_toml, defaults_dir)
Build the final parameters by merging defaults with the user TOML.
Precedence, lowest to highest: common defaults <- workflow defaults <- user TOML. Each later source overrides the ones before it.
- Parameters:
user_toml (Path) – User-provided TOML path.
defaults_dir (Path) – Directory containing default TOMLs.
- Return type:
dict[str, Any]
- Returns:
Merged parameters dictionary.
- Raises:
KeyError – If workflow is not defined in the user TOML.
ValueError – If workflow is unsupported.
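The precedence rule and the two error cases can be sketched with plain dictionaries standing in for parsed TOML files. Everything about the layout is an assumption: that workflow is a top-level key, that nested tables are merged recursively rather than replaced, and the key names themselves (the stats table in particular is invented for illustration). The helper names are hypothetical as well.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values from override win over base."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_parameters_sketch(common: Dict[str, Any],
                            workflow_defaults: Dict[str, Dict[str, Any]],
                            user: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical merge: common defaults <- workflow defaults <- user."""
    workflow = user["workflow"]  # raises KeyError when the user omits it
    if workflow not in workflow_defaults:
        raise ValueError(f"Unsupported workflow: {workflow!r}")
    return deep_merge(deep_merge(common, workflow_defaults[workflow]), user)


# Illustrative inputs; the key layout is invented.
common = {"figure_template": "plotly_white",
          "stats": {"p_threshold": 0.05, "fc_threshold": 2}}
workflow_defaults = {"proteomics": {"imputation": "QRILC"},
                     "interactomics": {"n_controls": 3}}
user = {"workflow": "proteomics", "stats": {"fc_threshold": 1.5}}

merged = merge_parameters_sketch(common, workflow_defaults, user)
```

Here the user's fc_threshold replaces the common default while p_threshold survives untouched, which is the practical effect of the precedence chain.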
app.pipeline_module.pipeline_input_watcher module
Directory watcher that runs the batch pipeline on stable input trees.
It selects a pipeline TOML under each subdirectory, validates it, ensures the directory is quiescent, and invokes the pipeline runner. Results and errors are written next to the inputs.
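One plausible way to decide that a directory is quiescent is to require that no file under it has been modified for some quiet period. The sketch below takes that approach. The function name is_quiescent, the quiet_seconds parameter, and the use of modification times are all assumptions, since the watcher's actual criterion is not described here.

```python
import os
import tempfile
import time
from typing import Optional


def is_quiescent(directory: str, quiet_seconds: float,
                 now: Optional[float] = None) -> bool:
    """Hypothetical check: True when no file changed within quiet_seconds."""
    if now is None:
        now = time.time()
    newest = 0.0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            newest = max(newest, os.path.getmtime(os.path.join(root, name)))
    # An empty tree has newest == 0.0 and therefore counts as quiescent.
    return (now - newest) >= quiet_seconds


with tempfile.TemporaryDirectory() as tmp:
    toml_path = os.path.join(tmp, "pipeline.toml")  # hypothetical file name
    with open(toml_path, "w") as handle:
        handle.write("workflow = 'proteomics'\n")

    fresh = is_quiescent(tmp, quiet_seconds=60.0)  # file was just written

    # Backdate the file to simulate an upload that finished long ago.
    old = time.time() - 1000.0
    os.utime(toml_path, (old, old))
    settled = is_quiescent(tmp, quiet_seconds=60.0)
```

A check like this guards against starting a run while input files are still being copied into the watched tree.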
Module contents
Pipeline module for background analyses and watchers.