app.components.enrichment package

Submodules

app.components.enrichment.pantherdb module

PANTHER DB enrichment utilities.

This module integrates with the PANTHER overrepresentation API and dataset listings to support enrichment analysis workflows in ProteoGyver.

Main entry points

handler: stateful helper exposing dataset discovery and enrichment
handler.enrich: runs overrepresentation across configured datasets

class app.components.enrichment.pantherdb.handler(get_datasets=False)[source]

Bases: object

Stateful PANTHER enrichment helper.

Provides dataset discovery, file retrieval utilities, and convenience wrappers to run PANTHER overrepresentation analysis and assemble results for downstream visualization.

enrich(data_lists, options, filter_out_negative=True)[source]

Run PANTHER overrepresentation for multiple bait lists.

Parameters:

data_lists (list) – List of pairs (bait_name, prey_list).
options (str) – 'defaults' to use the default panel, or a semicolon-delimited string of dataset names.
filter_out_negative (bool) – If True, filter rows with non-positive fold enrichment.

Return type:

tuple

Returns:

Tuple of (result_names, result_dataframes, result_legends) suitable for plotting.
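The input shapes described above can be sketched as follows. This is a minimal illustration, not the package's own code: the bait names and dataset names are made up, and split_options is a hypothetical helper that only mirrors the documented meaning of the options argument.

```python
def split_options(options, default_panel):
    """Hypothetical helper: resolve an ``options`` string to dataset names."""
    if options == "defaults":
        return list(default_panel)
    # Semicolon-delimited names; drop empty pieces and surrounding whitespace.
    return [name.strip() for name in options.split(";") if name.strip()]


# Each entry pairs a bait name with the list of preys detected for it.
data_lists = [
    ("BaitA", ["P04637", "Q00987"]),
    ("BaitB", ["P38398"]),
]

# Placeholder names; real ones would come from handler.get_available().
default_panel = ["Dataset one", "Dataset two"]
options = "Dataset one; Dataset three"

selected = split_options(options, default_panel)
print(selected)
```

With the real class, the call would presumably take the form handler().enrich(data_lists, options), with the return value unpacked into names, dataframes, and legends.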

get_available()[source]

Return display names of available datasets.

Return type:

list

Returns:

Sorted list of dataset names.

get_default_panel()[source]

Return the default set of dataset display names.

Return type:

list

Returns:

List of default dataset names.

get_pantherdb_datasets()[source]

Retrieve available PANTHER datasets from the service.

Return type:

dict

Returns:

Mapping from annotation ID to tuple of (name, description).

property nice_name: str

panel_to_usable(entries)[source]

Convert display names/IDs to internal dataset metadata.

Parameters:

entries (list) – List of dataset display names or IDs.

Return type:

list

Returns:

List of triples [annotation_id, display_name, (annotation_id, name, description)].
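The conversion can be pictured with a small sketch. It is a hypothetical re-implementation, not the method's actual code. It assumes that the display name equals the dataset name, that the lookup table has the annotation-ID-to-(name, description) shape returned by get_pantherdb_datasets, and that unknown entries are skipped. The descriptions are placeholders.

```python
# Toy stand-in for the dataset mapping: annotation ID -> (name, description).
# Descriptions are placeholders, not text from the PANTHER service.
DATASETS = {
    "GO:0008150": ("Biological process", "placeholder description"),
    "GO:0005575": ("Cellular component", "placeholder description"),
}


def to_usable(entries, datasets):
    """Hypothetical sketch: resolve display names or IDs to triples."""
    # Reverse lookup so that display names resolve to their annotation ID.
    id_by_name = {name: annot_id for annot_id, (name, _) in datasets.items()}
    usable = []
    for entry in entries:
        annot_id = entry if entry in datasets else id_by_name.get(entry)
        if annot_id is None:
            continue  # Assumption: unknown entries are skipped silently.
        name, description = datasets[annot_id]
        usable.append([annot_id, name, (annot_id, name, description)])
    return usable


result = to_usable(["Biological process", "GO:0005575", "unknown"], DATASETS)
for triple in result:
    print(triple)
```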

retrieve_pantherdb_gene_classification(species=None, savepath='PANTHER datafiles', progress=False)[source]

Download PANTHER gene classification files for desired organisms.

Files that already exist under the same name in the save directory are not downloaded again.

Parameters:

species (list | str | None) – List of species to download. If None, downloads human only. If 'all', downloads all species files.
savepath (str) – Directory in which to save the files.
progress (bool) – If True, print progress information.

Return type:

None

Returns:

None
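The skip-if-present behaviour can be sketched as below. This is an illustration under stated assumptions, not the method's implementation: download_missing and fake_fetch are hypothetical names, the file names are invented, and the network request is replaced by a stub so the example runs offline.

```python
import tempfile
from pathlib import Path


def download_missing(filenames, savepath, fetch, progress=False):
    """Save each file via ``fetch`` unless it already exists in ``savepath``."""
    target_dir = Path(savepath)
    target_dir.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for filename in filenames:
        target = target_dir / filename
        if target.exists():
            if progress:
                print(f"skipping {filename}: already present")
            continue
        target.write_bytes(fetch(filename))
        downloaded.append(filename)
        if progress:
            print(f"saved {filename}")
    return downloaded


def fake_fetch(filename):
    # Stand-in for the real network request; returns dummy bytes.
    return f"contents of {filename}".encode()


with tempfile.TemporaryDirectory() as tmp:
    first = download_missing(["human.txt"], tmp, fake_fetch)
    # The second call finds human.txt on disk and fetches only mouse.txt.
    second = download_missing(["human.txt", "mouse.txt"], tmp, fake_fetch)
```

Checking for the file before fetching keeps repeated calls cheap, at the cost of never refreshing a file that has changed upstream.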

run_panther_overrepresentation_analysis(datasets, protein_list, data_set_name=None, background_list=None, organism=9606, test_type='FISHER', correction_type='FDR')[source]

Run PANTHER overrepresentation analysis for a protein list.

The returned dictionary contains:

Name: name of the enrichment database
Description: description of the database
Reference information: information about tool, database, and analysis
Results: pandas DataFrame with the full enrichment results

Parameters:

datasets (list) – Datasets to run against; see get_pantherdb_datasets.
protein_list (list) – List of identified proteins (UniProt accessions).
data_set_name (str | None) – Label for the incoming dataset; if None, a date-stamped name is used.
background_list (list | None) – Optional background proteins; if None, the entire annotation DB is used.
organism (int) – NCBI TaxID of the organism (e.g., human is 9606).
test_type (str) – Statistical test type; see the PANTHER docs.
correction_type (str) – Multiple testing correction.

Return type:

dict

Returns:

Mapping from dataset key to result bundle.
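A short sketch of how the returned mapping might be consumed. The dataset key, field values, and the row fields term and fold_enrichment are placeholders chosen for illustration. A list of dicts also stands in for the pandas DataFrame so that the example has no third-party dependency.

```python
# Mocked return value; the dataset key and all field values are placeholders.
# In the real result, "Results" is a pandas DataFrame, not a list of dicts.
mock_results = {
    "GO:0008150": {
        "Name": "Biological process",
        "Description": "placeholder description",
        "Reference information": "placeholder tool and database versions",
        "Results": [
            {"term": "term A", "fold_enrichment": 2.5},
            {"term": "term B", "fold_enrichment": 0.0},
        ],
    },
}


def summarize(results):
    """Return one summary line per dataset in the result mapping."""
    lines = []
    for key, bundle in results.items():
        n_rows = len(bundle["Results"])  # len() also works on a DataFrame
        lines.append(f"{key}: {bundle['Name']} ({n_rows} rows)")
    return lines


summary = summarize(mock_results)
print("\n".join(summary))
```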

Module contents

Enrichment tools subpackage.