app.components.enrichment package
Submodules
app.components.enrichment.pantherdb module
PANTHER DB enrichment utilities.
This module integrates with the PANTHER overrepresentation API and dataset listings to support enrichment analysis workflows in ProteoGyver.
Main entry points
handler: stateful helper exposing dataset discovery and enrichment
handler.enrich: runs overrepresentation across configured datasets
- class app.components.enrichment.pantherdb.handler(get_datasets=False)
Bases: object
Stateful PANTHER enrichment helper.
Provides dataset discovery, file retrieval utilities, and convenience wrappers to run PANTHER overrepresentation analysis and assemble results for downstream visualization.
- enrich(data_lists, options, filter_out_negative=True)
Run PANTHER overrepresentation for multiple bait lists.
- Parameters:
data_lists (list) – List of pairs (bait_name, prey_list).
options (str) – 'defaults' to use the default panel, or semicolon-delimited dataset names.
filter_out_negative (bool) – If True, remove rows with non-positive fold enrichment.
- Return type:
list
- Returns:
Tuple of (result_names, result_dataframes, result_legends) suitable for plotting.
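The argument shapes described above can be sketched as follows. This is only an illustration of how the inputs might be assembled, not the module's own code: `build_enrich_inputs`, the bait names, and the dataset names are hypothetical placeholders.

```python
from typing import Dict, List, Tuple


def build_enrich_inputs(
    baits: Dict[str, List[str]], dataset_names: List[str]
) -> Tuple[List[Tuple[str, List[str]]], str]:
    """Hypothetical helper: shape inputs the way enrich() is documented to take them."""
    # data_lists: one (bait_name, prey_list) pair per bait
    data_lists = [(bait, list(preys)) for bait, preys in baits.items()]
    # options: 'defaults' for the default panel, else semicolon-delimited names
    options = ";".join(dataset_names) if dataset_names else "defaults"
    return data_lists, options


data_lists, options = build_enrich_inputs(
    {"BAIT1": ["P04637", "Q00987"], "BAIT2": ["P38398"]},
    ["Dataset A", "Dataset B"],  # placeholder names, not real PANTHER dataset names
)
```

Assuming a `handler` instance is available, the resulting pair would then be passed as `enrich(data_lists, options)`.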
- get_available()
Return display names of available datasets.
- Return type:
dict
- Returns:
Sorted list of dataset names.
- get_default_panel()
Return the default set of dataset display names.
- Return type:
list
- Returns:
List of default dataset names.
- get_pantherdb_datasets()
Retrieve available PANTHER datasets from the service.
- Return type:
list
- Returns:
Mapping from annotation ID to tuple of (name, description).
- property nice_name: str
- panel_to_usable(entries)
Convert display names/IDs to internal dataset metadata.
- Parameters:
entries (list) – List of dataset display names or IDs.
- Return type:
list
- Returns:
List of triples [annotation_id, display_name, (annotation_id, name, description)].
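The conversion described above can be sketched as a lookup against the mapping documented under get_pantherdb_datasets (annotation ID to (name, description)). This is a sketch under stated assumptions rather than the actual implementation: `DATASETS`, its entries, and `panel_to_usable_sketch` are invented for illustration, and silently skipping unknown entries is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the mapping annotation ID -> (name, description).
# The entries are invented placeholders, not real PANTHER datasets.
DATASETS: Dict[str, Tuple[str, str]] = {
    "ID:1": ("Dataset A", "First example dataset"),
    "ID:2": ("Dataset B", "Second example dataset"),
}


def panel_to_usable_sketch(
    entries: List[str], datasets: Dict[str, Tuple[str, str]]
) -> List[list]:
    """Resolve names or IDs to [annotation_id, display_name, (id, name, description)]."""
    # Reverse index so display names can be resolved to annotation IDs
    by_name = {name: ann_id for ann_id, (name, _desc) in datasets.items()}
    usable = []
    for entry in entries:
        # Accept either an annotation ID or a display name
        ann_id = entry if entry in datasets else by_name.get(entry)
        if ann_id is None:
            continue  # assumption: unknown entries are skipped silently
        name, description = datasets[ann_id]
        usable.append([ann_id, name, (ann_id, name, description)])
    return usable


usable = panel_to_usable_sketch(["Dataset A", "ID:2", "missing"], DATASETS)
```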
- retrieve_pantherdb_gene_classification(species=None, savepath='PANTHER datafiles', progress=False)
Download PANTHER gene classification files for desired organisms.
Will not download when files with the same name already exist in the save directory.
- Parameters:
species (list) – List of species to download. If None, downloads human only. If 'all', downloads all species files.
savepath (str) – Directory in which to save the files.
progress (bool) – If True, print progress information.
- Return type:
None
- Returns:
None
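The skip-if-present behaviour described above can be sketched with `pathlib`. This is a minimal illustration, not the module's code: `download_missing` and its `fetch` callable are hypothetical, and `fetch` stands in for whatever actually retrieves a file's bytes.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def download_missing(
    filenames: Iterable[str],
    savepath: str,
    fetch: Callable[[str], bytes],
    progress: bool = False,
) -> List[str]:
    """Save each file under savepath unless a file with that name already exists."""
    target_dir = Path(savepath)
    target_dir.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for name in filenames:
        target = target_dir / name
        if target.exists():
            # Same-named file already present: do not download again
            if progress:
                print(f"skipping {name}: already present")
            continue
        target.write_bytes(fetch(name))  # fetch is a caller-supplied placeholder
        downloaded.append(name)
        if progress:
            print(f"saved {name}")
    return downloaded
```

Passing the retrieval step in as a callable keeps the existence check testable without network access.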
- run_panther_overrepresentation_analysis(datasets, protein_list, data_set_name=None, background_list=None, organism=9606, test_type='FISHER', correction_type='FDR')
Run PANTHER overrepresentation analysis for a protein list.
The returned dictionary contains:
Name: name of the enrichment database
Description: description of the database
Reference information: information about tool, database, and analysis
Results: pandas DataFrame with the full enrichment results
- Parameters:
datasets (list) – Datasets to run against; see get_pantherdb_datasets.
protein_list (list) – List of identified proteins (UniProt accessions).
data_set_name (str | None) – Label for the incoming dataset; if None, a date-stamped name is used.
background_list (list) – Optional background proteins; if None, the entire annotation DB is used.
organism (int) – NCBI TaxID of the organism (e.g., human is 9606).
test_type (str) – Statistical test type, see PANTHER docs.
correction_type (str) – Multiple testing correction.
- Return type:
dict
- Returns:
Mapping from dataset key to result bundle.
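A caller might walk the returned mapping along these lines. The sketch assumes each result bundle carries the four documented keys; the mock contents, the dataset key, the column names, and `summarize_bundles` are invented for illustration, and a plain list stands in for the pandas DataFrame.

```python
from typing import Any, Dict, List


def summarize_bundles(results: Dict[str, Dict[str, Any]]) -> List[str]:
    """Build one summary line per dataset from the documented bundle keys."""
    lines = []
    for key, bundle in results.items():
        # 'Results' is documented as a pandas DataFrame; len() works on a
        # DataFrame and on the plain list used as a stand-in below.
        n_rows = len(bundle["Results"])
        lines.append(f"{key}: {bundle['Name']}, rows={n_rows}")
    return lines


# Mock return value with invented contents, for illustration only.
mock = {
    "dataset_a": {
        "Name": "Dataset A",
        "Description": "Example description",
        "Reference information": {},  # real structure not specified here
        "Results": [{"term": "example term", "fold_enrichment": 2.0}],
    }
}
summary = summarize_bundles(mock)
```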
Module contents
Enrichment tools subpackage.