alphabase.pg_reader.alphapept_pg_reader
AlphaPept protein group reader.
Classes:
AlphaPeptPGReader: Reader for protein group matrices from the AlphaPept search engine.
- class alphabase.pg_reader.alphapept_pg_reader.AlphaPeptPGReader(*, column_mapping: dict[str, Any] | None = None, measurement_regex: str | Literal['raw', 'lfq'] | None = 'raw')
Bases: PGReaderBase
Reader for protein group matrices from the AlphaPept search engine.
By default, the reader reads raw intensities from the protein group matrix. Passing a suitable regular expression also makes it possible to extract LFQ-corrected intensities.
Notes:
AlphaPept protein group matrices contain both raw intensities and LFQ-corrected intensities. The LFQ-corrected intensities are marked by an _LFQ suffix.
To read AlphaPept .hdf output, install the package with the optional HDF dependencies: pip install "alphabase[hdf]".
Example:
Get example data
    import os
    import tempfile

    from alphabase.tools.data_downloader import DataShareDownloader
    from alphabase.pg_reader import AlphaPeptPGReader

    # Download to temporary directory
    URL = "https://datashare.biochem.mpg.de/s/6G6KHJqwcRPQiOO"
    download_dir = tempfile.mkdtemp()
    download_path = DataShareDownloader(url=URL, output_dir=download_dir).download()
By default, the reader returns the raw intensities. Additional protein features are stored in the dataframe index; samples are stored as columns.
    # Get raw intensities
    reader = AlphaPeptPGReader()
    results = reader.import_file(download_path)

    results.index.names
    > FrozenList(['proteins', 'uniprot_ids', 'ensembl_ids', 'source_db', 'is_decoy'])

    results.columns
    > Index(['A', 'B'], dtype='object')
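Because the protein features sit in the index rather than in columns, downstream filtering works on index levels. The following is a minimal sketch using plain pandas on a hand-built toy frame; the protein identifiers and intensity values are hypothetical placeholders, and only the index names and sample columns mirror the output shown above.

```python
import pandas as pd

# Hypothetical toy data: identifiers and intensities are made up for illustration.
# Only the index level names and the sample columns follow the reader output above.
index = pd.MultiIndex.from_tuples(
    [
        ("P1", "P1", "ENSP1", "sp", False),
        ("P2", "P2", "ENSP2", "sp", False),
        ("REV__P3", "P3", "ENSP3", "sp", True),
    ],
    names=["proteins", "uniprot_ids", "ensembl_ids", "source_db", "is_decoy"],
)
toy = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}, index=index)

# Build a boolean mask from the `is_decoy` index level and keep only target entries.
decoy_mask = toy.index.get_level_values("is_decoy").to_numpy(dtype=bool)
targets = toy[~decoy_mask]

# Move the protein features out of the index into ordinary columns.
flat = targets.reset_index()
```

Here `targets` keeps the multi-level index, while `flat` carries the five feature levels as regular columns next to the sample columns.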
To read the LFQ values, pass the preconfigured key "lfq" to the reader. The key stands for a regular expression that extracts the LFQ columns from the protein group table.
    # Get LFQ-corrected intensities
    reader = AlphaPeptPGReader(measurement_regex="lfq")
    results = reader.import_file(download_path)

    results.index.names
    > FrozenList(['proteins', 'uniprot_ids', 'ensembl_ids', 'source_db', 'is_decoy'])

    results.columns
    > Index(['A_LFQ', 'B_LFQ'], dtype='object')
To check out all preconfigured regular expressions, use the get_preconfigured_regex method:
    AlphaPeptPGReader.get_preconfigured_regex()
    > {'raw': '^.*(?<!_LFQ)$', 'lfq': '_LFQ$'}
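To see what the two preconfigured patterns select, they can be applied to the column names from the examples with Python's standard re module. This is a sketch only: it assumes a column is selected when the pattern is found in its name (re.search) and does not claim to reproduce the reader's internal matching logic.

```python
import re

# Patterns as reported by get_preconfigured_regex().
PATTERNS = {"raw": r"^.*(?<!_LFQ)$", "lfq": r"_LFQ$"}

# Column names taken from the example outputs.
columns = ["A", "B", "A_LFQ", "B_LFQ"]

# Assumption: a column counts as selected if re.search finds the pattern in its name.
selected = {
    key: [col for col in columns if re.search(pattern, col)]
    for key, pattern in PATTERNS.items()
}

print(selected)
# {'raw': ['A', 'B'], 'lfq': ['A_LFQ', 'B_LFQ']}
```

The "raw" pattern relies on a negative lookbehind, (?<!_LFQ), so that any name ending in _LFQ is rejected, while the "lfq" pattern simply anchors the suffix at the end of the name.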
Methods:
__init__(*[, column_mapping, measurement_regex]): Initialize AlphaPept protein group matrix reader.
- __init__(*, column_mapping: dict[str, Any] | None = None, measurement_regex: str | Literal['raw', 'lfq'] | None = 'raw')
Initialize AlphaPept protein group matrix reader.
- Parameters:
column_mapping – Dictionary mapping alphabase column names (keys) to AlphaPept column names (values). If None, uses default mapping from configuration file.
measurement_regex – Pattern to select quantity columns:
    "raw" (default): Raw intensities (excludes _LFQ columns)
    "lfq": LFQ-corrected intensities (_LFQ suffix)
    str: Custom regular expression pattern
    None: All quantity columns
See class documentation for usage examples and get_preconfigured_regex() for available patterns.
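The four accepted forms of measurement_regex can be summarized in a small stand-alone sketch. The helper select_quantity_columns below is hypothetical and not part of alphabase; it assumes that a preconfigured key is first resolved to its pattern, that any other string is treated as a custom regular expression applied with re.search, and that None keeps every candidate column.

```python
import re
from typing import Optional

# Preconfigured patterns as listed by get_preconfigured_regex().
PRECONFIGURED = {"raw": r"^.*(?<!_LFQ)$", "lfq": r"_LFQ$"}


def select_quantity_columns(
    columns: list[str], measurement_regex: Optional[str] = "raw"
) -> list[str]:
    """Hypothetical helper illustrating the documented parameter semantics."""
    # None: keep all candidate quantity columns.
    if measurement_regex is None:
        return list(columns)
    # Preconfigured key ("raw" or "lfq") resolves to its pattern;
    # any other string is used as a custom regular expression.
    pattern = PRECONFIGURED.get(measurement_regex, measurement_regex)
    return [col for col in columns if re.search(pattern, col)]


columns = ["A", "B", "A_LFQ", "B_LFQ"]

raw_cols = select_quantity_columns(columns)            # default "raw"
lfq_cols = select_quantity_columns(columns, "lfq")     # preconfigured key
custom_cols = select_quantity_columns(columns, r"^A")  # custom pattern: sample A only
all_cols = select_quantity_columns(columns, None)      # no filtering
```

With these inputs, the custom pattern ^A keeps both the raw and the LFQ column of sample A, which shows that a custom expression is not restricted to one intensity type.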