alphabase.scoring.ml_scoring

Classes:

`Percolator`()
`SupervisedPercolator`()	DIA-NN-like scoring.

class alphabase.scoring.ml_scoring.Percolator

Bases: object

Methods:

`__init__`()
`extract_features`(psm_df, *args, **kwargs)	Extract features for rescoring.
`rescore`(df)	Estimate ML scores and then FDRs (q-values)
`run_rerank_workflow`(top_k_psm_df[, ...])	Run the percolator workflow, reranking the peptides for each spectrum.
`run_rescore_workflow`(psm_df, *args, **kwargs)	Run the percolator rescoring workflow.

Attributes:

`feature_extractor`	The feature extractor inherited from BaseFeatureExtractor
`feature_list`	Get extracted feature_list.
`ml_model`	ML model in Percolator.

extract_features(psm_df: DataFrame, *args, **kwargs) → DataFrame

Extract features for rescoring.

*args and **kwargs are passed to `self.feature_extractor.extract_features`.

property feature_extractor: BaseFeatureExtractor

The feature extractor inherited from BaseFeatureExtractor.

property feature_list: list

The extracted feature list. Read-only property.

property ml_model

The ML model used by Percolator. It can be an sklearn model or any other model that implements fit() and decision_function() (or predict_proba()) with the same interface as sklearn models.
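To illustrate the interface described for `ml_model`, here is a minimal sketch of a non-sklearn model that exposes `fit()` and `decision_function()` in the sklearn style. The class name `CentroidDifferenceModel` and its scoring rule are hypothetical and chosen only for illustration; they are not part of alphabase.

```python
import numpy as np


class CentroidDifferenceModel:
    """Hypothetical toy model with an sklearn-style fit()/decision_function()."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y).astype(bool)
        mean_pos = X[y].mean(axis=0)   # mean feature vector of class 1
        mean_neg = X[~y].mean(axis=0)  # mean feature vector of class 0
        # Direction pointing from the class-0 mean to the class-1 mean.
        self.weights_ = mean_pos - mean_neg
        # Offset so the midpoint between the two class means scores 0.
        self.bias_ = -float(((mean_pos + mean_neg) / 2.0) @ self.weights_)
        return self

    def decision_function(self, X):
        # Larger values mean "more like class 1" (for example, target-like).
        return np.asarray(X, dtype=float) @ self.weights_ + self.bias_


# Toy data: two features, two samples per class.
X = np.array([[0.0, 1.0], [1.0, 0.0], [4.0, 5.0], [5.0, 4.0]])
y = np.array([0, 0, 1, 1])

model = CentroidDifferenceModel().fit(X, y)
scores = model.decision_function(X)
```

Whether such an object can be assigned directly, for example with `percolator.ml_model = model`, is an assumption about the property having a setter and should be checked against the source.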

rescore(df: DataFrame) → DataFrame

Estimate ML scores and then FDRs (q-values)
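The page does not show how `rescore` derives q-values from the ML scores. As a rough sketch of one common target-decoy approach (not necessarily what alphabase does), the snippet below sorts PSMs by score, estimates the FDR at each threshold as decoys divided by targets, and takes the running minimum from the bottom up. The function name `estimate_q_values`, the `decoy` column, and the `q_value` output column are assumptions made for this example.

```python
import numpy as np
import pandas as pd


def estimate_q_values(df, score_col="ml_score", decoy_col="decoy"):
    """Hypothetical helper: target-decoy q-values from a score column."""
    # Best-scoring PSMs first.
    out = df.sort_values(score_col, ascending=False).reset_index(drop=True)
    is_decoy = out[decoy_col].to_numpy().astype(bool)
    n_decoy = np.cumsum(is_decoy)
    n_target = np.cumsum(~is_decoy)
    # FDR at each score threshold; guard against division by zero targets.
    fdr = n_decoy / np.maximum(n_target, 1)
    # q-value: the smallest FDR at this threshold or any more permissive one.
    out["q_value"] = np.minimum.accumulate(fdr[::-1])[::-1]
    return out


# Toy input: 1 marks a decoy PSM, 0 a target PSM (assumed encoding).
toy_df = pd.DataFrame({
    "ml_score": [0.9, 0.8, 0.7, 0.6, 0.5],
    "decoy": [0, 0, 1, 0, 1],
})
result = estimate_q_values(toy_df)
```

Tied scores are not handled specially in this sketch.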

run_rerank_workflow(top_k_psm_df: DataFrame, rerank_column: str = 'spec_idx', *args, **kwargs) → DataFrame

Run the percolator workflow, reranking the peptides for each spectrum.

*args and **kwargs are passed to `self.feature_extractor.extract_features`.

Parameters:

top_k_psm_df (pd.DataFrame) – PSM DataFrame
rerank_column (str) –
The column used to rerank PSMs.

For example, use the following code to select the top-ranked peptide for each spectrum:

```
rerank_column = 'spec_idx' # scan_num
idx = top_k_psm_df.groupby(['raw_name', rerank_column])['ml_score'].idxmax()
psm_df = top_k_psm_df.loc[idx].copy()
```

Returns:

Only the top-scored PSM is returned for each group defined by rerank_column.

Return type:

pd.DataFrame
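The top-ranked selection described for `rerank_column` can be tried on a small hand-made table. The data below are invented for illustration; only the `raw_name`, `spec_idx`, and `ml_score` column names come from the documentation above, and the `sequence` column is an assumed label for the candidate peptide.

```python
import pandas as pd

# Toy top-k PSM table: several candidate peptides per (raw_name, spec_idx).
top_k_psm_df = pd.DataFrame({
    "raw_name": ["run1", "run1", "run1", "run2", "run2"],
    "spec_idx": [0, 0, 1, 0, 0],
    "sequence": ["PEPTIDEA", "PEPTIDEB", "PEPTIDEC", "PEPTIDED", "PEPTIDEE"],
    "ml_score": [0.2, 0.9, 0.5, 0.7, 0.1],
})

rerank_column = "spec_idx"
# Row label of the highest-scoring PSM within each (raw_name, spec_idx) group.
idx = top_k_psm_df.groupby(["raw_name", rerank_column])["ml_score"].idxmax()
# Keep only those rows; copy so later edits do not touch the original table.
psm_df = top_k_psm_df.loc[idx].copy()
```

Grouping by both `raw_name` and `rerank_column` matters because the same spectrum index can occur in more than one raw file, as with `spec_idx` 0 in this toy table.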

run_rescore_workflow(psm_df: DataFrame, *args, **kwargs) → DataFrame

Run the percolator rescoring workflow.

*args and **kwargs are passed to `self.feature_extractor.extract_features`.

class alphabase.scoring.ml_scoring.SupervisedPercolator

DIA-NN-like scoring.

Methods:

`__init__`()
`rescore`(psm_df)	Estimate ML scores and then FDRs (q-values)

rescore(psm_df: DataFrame) → DataFrame

Estimate ML scores and then FDRs (q-values)