alphabase.scoring.ml_scoring
Classes:
Percolator
SupervisedPercolator – DIA-NN like scoring.
- class alphabase.scoring.ml_scoring.Percolator
Bases: object
Methods:
__init__()
extract_features(psm_df, *args, **kwargs) – Extract features for rescoring.
rescore(df) – Estimate ML scores and then FDRs (q-values).
run_rerank_workflow(top_k_psm_df[, ...]) – Run percolator workflow with reranking the peptides for each spectrum.
run_rescore_workflow(psm_df, *args, **kwargs) – Run percolator workflow: extract features, then rescore.
Attributes:
feature_extractor – The feature extractor inherited from BaseFeatureExtractor.
feature_list – Get extracted feature_list.
ml_model – ML model in Percolator.
- extract_features(psm_df: DataFrame, *args, **kwargs) → DataFrame
Extract features for rescoring.
*args and **kwargs are passed to self.feature_extractor.extract_features.
- Parameters:
psm_df (pd.DataFrame) – PSM DataFrame
- Returns:
psm_df with the feature columns appended in place.
- Return type:
pd.DataFrame
- property feature_extractor: BaseFeatureExtractor
The feature extractor, an instance of a class that inherits from BaseFeatureExtractor.
- property feature_list: list
The list of extracted features (feature_list). Read-only property.
- property ml_model
The ML model used by Percolator. It can be an sklearn model or any other model that implements fit() and decision_function() (or predict_proba()) with the same interface as sklearn models.
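As a rough illustration of the interface described above, the sketch below defines a toy classifier that exposes fit() and decision_function() in the sklearn style. The class name `CentroidModel` and its scoring rule are hypothetical and are not part of alphabase or sklearn. How such an object is attached to a Percolator instance is not covered in this section, so that step is left out.

```python
class CentroidModel:
    """Hypothetical toy classifier with an sklearn-like interface."""

    def fit(self, X, y):
        # X: list of feature rows; y: list of labels (1 = target, 0 = decoy)
        n_features = len(X[0])
        pos = [row for row, label in zip(X, y) if label == 1]
        neg = [row for row, label in zip(X, y) if label == 0]
        mean_pos = [sum(r[j] for r in pos) / len(pos) for j in range(n_features)]
        mean_neg = [sum(r[j] for r in neg) / len(neg) for j in range(n_features)]
        # The weight vector points from the decoy centroid to the target centroid.
        self.weights_ = [p - n for p, n in zip(mean_pos, mean_neg)]
        # The bias places the decision boundary midway between the two centroids.
        self.bias_ = -sum(
            w * (p + n) / 2 for w, p, n in zip(self.weights_, mean_pos, mean_neg)
        )
        return self

    def decision_function(self, X):
        # Higher values mean "more target-like", as with sklearn linear models.
        return [
            sum(w * x for w, x in zip(self.weights_, row)) + self.bias_
            for row in X
        ]


# Toy training data: two target-like rows and two decoy-like rows.
model = CentroidModel().fit(
    [[0.9, 0.8], [0.8, 0.7], [0.2, 0.1], [0.1, 0.3]],
    [1, 1, 0, 0],
)
scores = model.decision_function([[0.85, 0.75], [0.15, 0.2]])
```

Any object with these two methods follows the same calling pattern as an sklearn estimator, which is the only requirement the property description states.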
- rescore(df: DataFrame) → DataFrame
Estimate ML scores and then FDRs (q-values)
- Parameters:
df (pd.DataFrame) – psm_df
- Returns:
psm_df with the ml_score and fdr columns updated in place.
- Return type:
pd.DataFrame
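This section does not say how the FDRs (q-values) are derived from the ML scores. A common approach in target-decoy scoring, shown here only as an assumed sketch and not as alphabase's implementation, is to sort PSMs by score, estimate the FDR at each rank as decoys divided by targets, and take the running minimum from the lowest-scoring rank upward. The function name `estimate_q_values` and the list-based inputs are hypothetical.

```python
def estimate_q_values(scores, is_decoy):
    """Return one q-value per PSM, in the original input order."""
    # Rank PSMs from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # FDR estimate at each rank: cumulative decoys / cumulative targets.
    fdrs = []
    n_target = 0
    n_decoy = 0
    for i in order:
        if is_decoy[i]:
            n_decoy += 1
        else:
            n_target += 1
        fdrs.append(n_decoy / max(n_target, 1))

    # q-value: the smallest FDR at this rank or at any lower-scoring rank.
    q_values = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q_values[order[rank]] = running_min
    return q_values


scores = [5.0, 4.0, 3.0, 2.0, 1.0]
is_decoy = [False, False, True, False, True]
q_values = estimate_q_values(scores, is_decoy)
```

The running minimum makes the q-values monotonic in the score, so any score threshold corresponds to a single FDR level.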
- run_rerank_workflow(top_k_psm_df: DataFrame, rerank_column: str = 'spec_idx', *args, **kwargs) → DataFrame
Run percolator workflow with reranking the peptides for each spectrum.
self.extract_features()
self.rescore()
*args and **kwargs are passed to self.feature_extractor.extract_features.
- Parameters:
top_k_psm_df (pd.DataFrame) – PSM DataFrame
rerank_column (str) –
The column used to rerank PSMs.
For example, use the following code to select the top-ranked peptide for each spectrum.
```python
rerank_column = 'spec_idx'  # scan_num
# Index of the highest ml_score within each (raw_name, rerank_column) group
idx = top_k_psm_df.groupby(['raw_name', rerank_column])['ml_score'].idxmax()
# Keep only those top-scored rows
psm_df = top_k_psm_df.loc[idx].copy()
```
- Returns:
Only the top-scored PSM is returned for each group of rerank_column.
- Return type:
pd.DataFrame
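To make the reranking selection concrete, the sketch below applies the same groupby and idxmax pattern to a small hand-made table. The `raw_name`, `spec_idx` and `ml_score` columns come from the description above. The `sequence` column and all values are made up for illustration, and the table is not produced by alphabase.

```python
import pandas as pd

# Toy top-k table: two candidate peptides for spectrum 0 in each run,
# one candidate for spectrum 1 in run1.
top_k_psm_df = pd.DataFrame({
    "raw_name": ["run1", "run1", "run1", "run2", "run2"],
    "spec_idx": [0, 0, 1, 0, 0],
    "sequence": ["PEPTIDEA", "PEPTIDEB", "PEPTIDEC", "PEPTIDED", "PEPTIDEE"],
    "ml_score": [0.2, 0.9, 0.5, 0.7, 0.4],
})

rerank_column = "spec_idx"
# Row label of the best-scoring candidate within each (run, spectrum) group.
idx = top_k_psm_df.groupby(["raw_name", rerank_column])["ml_score"].idxmax()
# One row per group: the top-scored PSM.
psm_df = top_k_psm_df.loc[idx].copy()
```

Grouping on both `raw_name` and the rerank column keeps spectra with the same index in different raw files apart.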
- run_rescore_workflow(psm_df: DataFrame, *args, **kwargs) → DataFrame
Run percolator workflow:
self.extract_features()
self.rescore()
*args and **kwargs are passed to self.feature_extractor.extract_features.
- Parameters:
psm_df (pd.DataFrame) – PSM DataFrame
- Returns:
psm_df with the feature columns appended in place.
- Return type:
pd.DataFrame
- class alphabase.scoring.ml_scoring.SupervisedPercolator
Bases:
Percolator
DIA-NN like scoring.
Methods: