app.components.enrichment package

Submodules

app.components.enrichment.pantherdb module

PANTHER DB enrichment utilities.

This module integrates with the PANTHER overrepresentation API and dataset listings to support enrichment analysis workflows in ProteoGyver.

Main entry points

  • handler: stateful helper exposing dataset discovery and enrichment

  • handler.enrich: runs overrepresentation across configured datasets

class app.components.enrichment.pantherdb.handler(get_datasets=False)[source]

Bases: object

Stateful PANTHER enrichment helper.

Provides dataset discovery, file retrieval utilities, and convenience wrappers to run PANTHER overrepresentation analysis and assemble results for downstream visualization.

enrich(data_lists, options, filter_out_negative=True)[source]

Run PANTHER overrepresentation for multiple bait lists.

Parameters:
  • data_lists (list) – List of pairs (bait_name, prey_list).

  • options (str) – 'defaults' to use the default panel, or a semicolon-delimited string of dataset names.

  • filter_out_negative (bool) – If True, filter out rows with non-positive fold enrichment.

Return type:

list

Returns:

Three-item sequence (result_names, result_dataframes, result_legends), suitable for plotting.
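
A minimal sketch of how the arguments to enrich might be prepared, based only on the parameter descriptions above. The bait names and dataset names are invented, the accessions are example UniProt IDs, and build_options and drop_non_positive are hypothetical helpers that are not part of this module. drop_non_positive only illustrates what filter_out_negative is described as doing; the column name fold_enrichment is an assumption.

```python
# Illustrative inputs only: bait names and dataset names are invented.
data_lists = [
    ("BAIT_A", ["P04637", "Q00987"]),  # (bait_name, prey_list)
    ("BAIT_B", ["P38398"]),
]


def build_options(dataset_names=None):
    """Return 'defaults', or a semicolon-delimited string of dataset names."""
    if not dataset_names:
        return "defaults"
    return ";".join(dataset_names)


def drop_non_positive(rows, key="fold_enrichment"):
    """Hypothetical stand-in for filter_out_negative=True.

    Keeps only rows whose fold enrichment is strictly positive.
    """
    return [row for row in rows if row[key] > 0]


options = build_options(["Dataset one", "Dataset two"])

rows = [
    {"term": "a", "fold_enrichment": 2.5},
    {"term": "b", "fold_enrichment": 0.0},
    {"term": "c", "fold_enrichment": -1.2},
]
kept = drop_non_positive(rows)

# With the package importable, the call would then look like:
# result = handler().enrich(data_lists, options, filter_out_negative=True)
```

Passing no dataset names to build_options falls back to 'defaults', which matches the documented way of selecting the default panel.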

get_available()[source]

Return display names of available datasets.

Return type:

list

Returns:

Sorted list of dataset names.

get_default_panel()[source]

Return the default set of dataset display names.

Return type:

list

Returns:

List of default dataset names.

get_pantherdb_datasets()[source]

Retrieve available PANTHER datasets from the service.

Return type:

dict

Returns:

Mapping from annotation ID to tuple of (name, description).

property nice_name: str

panel_to_usable(entries)[source]

Convert display names/IDs to internal dataset metadata.

Parameters:

entries (list) – List of dataset display names or IDs.

Return type:

list

Returns:

List of triples [annotation_id, display_name, (annotation_id, name, description)].
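
The conversion described above can be sketched as follows. This is not the module's implementation: the dataset mapping, the annotation IDs, and the choice to skip unknown entries are all assumptions made for illustration, and panel_to_usable_sketch is a hypothetical name.

```python
# Hypothetical dataset mapping: annotation ID -> (name, description).
datasets = {
    "ANNOT_1": ("Dataset one", "First example dataset"),
    "ANNOT_2": ("Dataset two", "Second example dataset"),
}


def panel_to_usable_sketch(entries, datasets):
    """Resolve display names or IDs to triples.

    Each triple is [annotation_id, display_name,
    (annotation_id, name, description)].
    """
    # Reverse lookup so that display names can be resolved to IDs.
    by_name = {name: annot_id for annot_id, (name, _desc) in datasets.items()}
    usable = []
    for entry in entries:
        annot_id = entry if entry in datasets else by_name.get(entry)
        if annot_id is None:
            continue  # assumption: unknown entries are silently skipped
        name, desc = datasets[annot_id]
        usable.append([annot_id, name, (annot_id, name, desc)])
    return usable


triples = panel_to_usable_sketch(["Dataset one", "ANNOT_2", "missing"], datasets)
```

Here the first entry is resolved by display name, the second by ID, and the third is dropped because it matches neither.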

retrieve_pantherdb_gene_classification(species=None, savepath='PANTHER datafiles', progress=False)[source]

Download PANTHER gene classification files for desired organisms.

Files that already exist under the same name in the save directory are not downloaded again.

Parameters:
  • species (list | str | None) – List of species to download. If None, only the human file is downloaded; if the string 'all', the files for all species are downloaded.

  • savepath (str) – Directory in which to save the files.

  • progress (bool) – If True, print progress information.

Return type:

None

Returns:

None
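
The skip-if-present behaviour described above can be sketched as follows. The file names, the fetch callable, and the download_missing helper are hypothetical; the real method's file naming and download mechanism are not documented here. The stand-in fetcher avoids any network access.

```python
import tempfile
from pathlib import Path


def download_missing(filenames, savepath, fetch, progress=False):
    """Save each file via fetch(name) -> bytes unless it already exists."""
    target_dir = Path(savepath)
    target_dir.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for name in filenames:
        target = target_dir / name
        if target.exists():
            if progress:
                print(f"skipping {name}: already present")
            continue
        target.write_bytes(fetch(name))
        downloaded.append(name)
        if progress:
            print(f"saved {name}")
    return downloaded


def fake_fetch(name):
    """Stand-in for a real download; returns placeholder bytes."""
    return f"contents of {name}".encode()


with tempfile.TemporaryDirectory() as tmp:
    first = download_missing(["human.txt", "mouse.txt"], tmp, fake_fetch)
    # The second call only fetches the file that is not on disk yet.
    second = download_missing(
        ["human.txt", "mouse.txt", "yeast.txt"], tmp, fake_fetch
    )
```

Checking for the target file before fetching makes repeated calls cheap, at the cost of never refreshing a file that has changed upstream.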

run_panther_overrepresentation_analysis(datasets, protein_list, data_set_name=None, background_list=None, organism=9606, test_type='FISHER', correction_type='FDR')[source]

Run PANTHER overrepresentation analysis for a protein list.

The returned dictionary contains:

  • Name: name of the enrichment database

  • Description: description of the database

  • Reference information: information about tool, database, and analysis

  • Results: pandas DataFrame with the full enrichment results

Parameters:
  • datasets (list) – Datasets to run against; see get_pantherdb_datasets.

  • protein_list (list) – List of identified proteins (UniProt accessions).

  • data_set_name (str | None) – Label for the incoming dataset; if None, a date-stamped name is used.

  • background_list (list | None) – Optional background proteins; if None, the entire annotation database is used.

  • organism (int) – NCBI TaxID of the organism (e.g., human is 9606).

  • test_type (str) – Statistical test type; see the PANTHER documentation.

  • correction_type (str) – Multiple-testing correction method.

Return type:

dict

Returns:

Mapping from dataset key to result bundle.
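
A sketch of the shape of the returned mapping, built from the four keys listed above. The dataset key, the label format, and the helper names are assumptions: the exact date-stamp format used when data_set_name is None is not documented here, and a plain list of dicts stands in for the pandas DataFrame stored under Results.

```python
from datetime import date


def default_data_set_name(today=None):
    """Hypothetical date-stamped label for when data_set_name is None."""
    today = today or date.today()
    return f"dataset_{today.isoformat()}"


def make_bundle(name, description, reference_info, results):
    """Assemble one result bundle with the documented keys."""
    return {
        "Name": name,
        "Description": description,
        "Reference information": reference_info,
        "Results": results,  # a pandas DataFrame in the real return value
    }


label = default_data_set_name(date(2024, 1, 31))

# Mapping from dataset key to result bundle; all values are placeholders.
results = {
    "ANNOT_1": make_bundle(
        name="Dataset one",
        description="First example dataset",
        reference_info={"input": label, "test_type": "FISHER", "correction": "FDR"},
        results=[{"term": "a", "fold_enrichment": 2.5}],
    )
}

for key, bundle in results.items():
    print(key, bundle["Name"], len(bundle["Results"]))
```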

Module contents

Enrichment tools subpackage.