alphabase.pg_reader.mztab_pg_reader

MZTab protein group reader.

Classes:

MZTabPGReader(*[, column_mapping, ...])

Reader for MZTab search engine output.

class alphabase.pg_reader.mztab_pg_reader.MZTabPGReader(*, column_mapping: dict[str, str] | None = None, measurement_regex: str | Literal['assay', 'study_variable'] | None = 'assay')

Bases: PGReaderBase

Reader for MZTab search engine output.

MZTab is a standardized tab-delimited format for reporting proteomics and metabolomics results. The format organizes data into distinct sections: metadata (MTD), protein groups (PRH/PRT), peptides (PEH/PEP), PSMs (PSH/PSM), and small molecules (SMH/SML), with each section identified by specific three-letter prefixes. This reader extracts protein-level quantification data from the PRT lines, which contain protein abundances across samples or study variables.
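To make the section layout described above concrete, the following is a minimal sketch that pulls the protein rows out of a tiny, made-up mzTab excerpt using only the standard library. It is not how MZTabPGReader is implemented; the sample text, the helper name read_protein_section, and the choice of columns are assumptions made for illustration.

```python
import csv
import io

# Tiny, made-up mzTab excerpt (tab-separated); real files have many more columns.
MZTAB_TEXT = (
    "MTD\tmzTab-version\t1.0.0\n"
    "PRH\taccession\tdescription\tprotein_abundance_assay[1]\tprotein_abundance_assay[2]\n"
    "PRT\tP12345\tExample protein A\t1000.0\t1200.5\n"
    "PRT\tQ67890\tExample protein B\t250.0\tnull\n"
    "PSH\tsequence\tPSM_ID\n"
    "PSM\tPEPTIDEK\t1\n"
)


def read_protein_section(text):
    """Return the protein rows (PRT lines) of an mzTab document as dicts.

    Hypothetical helper for illustration only.
    """
    header = None
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        prefix, values = fields[0], fields[1:]
        if prefix == "PRH":
            header = values  # column names of the protein section
        elif prefix == "PRT":
            if header is None:
                raise ValueError("PRT line found before the PRH header")
            rows.append(dict(zip(header, values)))
    return rows


proteins = read_protein_section(MZTAB_TEXT)
print(proteins[0]["accession"])
```

Lines with other prefixes (MTD, PSH, PSM) are simply ignored here, and abundance values are left as strings, including the "null" placeholder for missing values.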

Example:

By default, the reader returns the raw intensities from the razor method. Additional protein features are stored in the dataframe index; samples are stored as columns.

from alphabase.pg_reader import MZTabPGReader

# Placeholder path to an mzTab file; replace with your own
path = "path/to/results.mzTab"

# Get raw intensities
reader = MZTabPGReader()
results = reader.import_file(path)

Methods:

__init__(*[, column_mapping, measurement_regex])

Read protein group (PG) matrices into the standardized alphabase format.

__init__(*, column_mapping: dict[str, str] | None = None, measurement_regex: str | Literal['assay', 'study_variable'] | None = 'assay')

Read protein group (PG) matrices into the standardized alphabase format.

Parameters:
  • column_mapping – A dictionary mapping alphabase columns (keys) to the corresponding columns in the search engine output (values). If None, the mapping is loaded from the column_mapping key of the respective search engine in pg_reader.yaml.

  • measurement_regex – Regular expression that identifies the correct measurement type. Only relevant if the PG matrix contains multiple measurement types. For example, alphapept returns the raw protein intensity per sample in column A and the LFQ-corrected value in A_LFQ. If None, all columns are used. For this reader, the signature also accepts the literals 'assay' and 'study_variable', with 'assay' as the default.
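The role of measurement_regex can be illustrated with a small standalone sketch based on the alphapept example above (raw intensity in column A, LFQ-corrected value in A_LFQ). The helper select_columns and the sample column names are hypothetical, and the use of re.search is an assumption; how alphabase applies the pattern internally may differ.

```python
import re

# Hypothetical column names of a protein group table: raw intensity per
# sample ("A", "B") plus an LFQ-corrected column per sample ("A_LFQ", "B_LFQ").
columns = ["A", "A_LFQ", "B", "B_LFQ"]


def select_columns(names, pattern=None):
    """Keep the names matching ``pattern``; keep all names if it is None."""
    if pattern is None:
        return list(names)
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Only the LFQ-corrected columns.
lfq_columns = select_columns(columns, r"_LFQ$")

# Only the raw columns: a negative lookahead rejects names ending in "_LFQ".
raw_columns = select_columns(columns, r"^(?!.*_LFQ$).+$")

# No pattern: every column is kept.
all_columns = select_columns(columns)

print(lfq_columns)
print(raw_columns)
```

Anchoring the pattern (with ^ or $) avoids accidental matches when one sample name is a substring of another.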

column_mapping

Dictionary structure mapping alphabase columns (keys) to the corresponding columns in the other search engine (values), see parameters.

measurement_regex

Regular expression that matches the quantity of interest for all samples.
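The direction of column_mapping (standardized name as key, source column as value) is easy to get backwards, so here is a minimal sketch of how such a mapping could be used to rename columns. The mapping entries, the sample row, and the helper to_standard_names are all hypothetical; they are not taken from pg_reader.yaml or from the alphabase implementation.

```python
# Hypothetical mapping: standardized column names (keys) to the column names
# used in a search engine's output (values). The names are illustrative only.
column_mapping = {
    "proteins": "accession",
    "description": "description",
}

# One row as it might appear in the search engine output.
source_row = {
    "accession": "P12345",
    "description": "Example protein",
    "extra": "x",
}


def to_standard_names(row, mapping):
    """Rename mapped source columns to their standardized names.

    Columns without an entry in ``mapping`` keep their original name.
    """
    # Invert the mapping so that lookups go from source name to standard name.
    source_to_standard = {source: standard for standard, source in mapping.items()}
    return {source_to_standard.get(name, name): value for name, value in row.items()}


standard_row = to_standard_names(source_row, column_mapping)
print(standard_row)
```

Because the mapping is keyed by the standardized name, it has to be inverted before it can be applied to source columns.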

Notes

Standardizes protein group reports to a protein group dataframe (features x samples) in wide format. Contains at least

Additional feature-level metadata might be available in the index.