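alphabase.constants.isotope
Classes:
IsotopeDistribution([max_elem_num_dict]) | Faster calculation of isotope abundance distribution by pre-building isotope distribution tables for most common elements.
Functions:
abundance_convolution(d1, mono1, d2, mono2) | Convolve two isotope distributions into one distribution.
formula_dist(formula) | Generate the isotope distribution and the mono index for a given formula (a str, or a list such as [('H', 2), ('C', 2), ('O', 1)]).
one_element_dist(elem, n, chem_isotope_dist, chem_mono_idx) | Calculate the isotope distribution for a given number of atoms of one element.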
- class alphabase.constants.isotope.IsotopeDistribution(max_elem_num_dict: dict = {'C': 2000, 'H': 5000, 'N': 1000, 'O': 1000, 'P': 200, 'S': 200})
Bases:
object
Methods:
__init__([max_elem_num_dict]) | Faster calculation of isotope abundance distribution by pre-building isotope distribution tables for most common elements.
calc_formula_distribution(formula) | Calculate isotope abundance distribution for a given formula.
- __init__(max_elem_num_dict: dict = {'C': 2000, 'H': 5000, 'N': 1000, 'O': 1000, 'P': 200, 'S': 200})
Faster calculation of isotope abundance distribution by pre-building isotope distribution tables for most common elements.
The default element limits are large enough for shotgun proteomics. max_elem_num_dict can be increased to support larger peptides or top-down proteomics in the future. However, the current MAX_ISOTOPE_LEN is not suitable for top-down and would have to be extended to a larger number (100?). Note that non-standard amino acids are assigned 1000000 C atoms in AlphaBase; this count is clipped to the maximal number of C in max_elem_num_dict. Because their masses are far too large to be identified, their isotope distributions do not matter.
- Parameters:
max_elem_num_dict (dict, optional) – Maximal number of atoms for each element. Defaults to {'C': 2000, 'H': 5000, 'N': 1000, 'O': 1000, 'S': 200, 'P': 200}.
- element_to_cum_dist_dict
{element: cumulative isotope distribution array}. Each array is a 2-D float np.ndarray with shape (element_max_number, MAX_ISOTOPE_LEN).
- Type:
dict
- element_to_cum_mono_idx
{element: mono position array of the cumulative isotope distribution}. Each mono position array is a 1-D int np.ndarray.
- Type:
dict
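The pre-built table idea described above can be illustrated with a minimal NumPy sketch. This is not AlphaBase's implementation: `build_cum_table`, `lookup`, and `MAX_LEN` are hypothetical names, row `n` is assumed to hold the distribution of `n` atoms, and the truncation simply keeps the first `MAX_LEN` peaks.

```python
import numpy as np

MAX_LEN = 10  # assumed window length, standing in for MAX_ISOTOPE_LEN


def build_cum_table(single_dist, single_mono, max_num):
    """Row n holds the distribution of n atoms; mono[n] is its mono index.

    Simplification: each row keeps only the first MAX_LEN peaks, so the
    sketch assumes the mono peak stays inside that window.
    """
    table = np.zeros((max_num, MAX_LEN))
    mono = np.zeros(max_num, dtype=np.int64)
    table[0, 0] = 1.0  # zero atoms: all abundance at offset 0
    for n in range(1, max_num):
        row = np.convolve(table[n - 1], single_dist)[:MAX_LEN]
        table[n] = row / row.sum()  # renormalize after truncation
        mono[n] = mono[n - 1] + single_mono
    return table, mono


def lookup(table, mono, n):
    """Return the pre-built row for n atoms, clipping n to the table size."""
    n = min(n, len(table) - 1)
    return table[n], int(mono[n])


# Approximate natural abundances of 12C and 13C; the mono peak is at index 0.
carbon = np.array([0.9893, 0.0107])
c_table, c_mono = build_cum_table(carbon, 0, max_num=50)
dist, mono_idx = lookup(c_table, c_mono, 1_000_000)  # clipped to row 49
```

Building the table costs one convolution per row, after which any atom count up to the limit is a single array lookup; oversized counts fall back to the last row, mirroring the clipping described for non-standard amino acids.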
- calc_formula_distribution(formula: List[Tuple[str, int]]) → Tuple[ndarray, int]
Calculate isotope abundance distribution for a given formula
- Parameters:
formula (List[tuple(str,int)]) – chemical formula, e.g. [('H', 1), ('C', 2), ('O', 3)].
- Returns:
np.ndarray, isotope abundance distribution.
int, monoisotope position in the distribution array.
- Return type:
tuple[np.ndarray, int]
Examples
>>> from alphabase.constants import IsotopeDistribution, parse_formula
>>> iso = IsotopeDistribution()
>>> formula = 'C(100)H(100)O(10)Na(1)Fe(1)'
>>> formula = parse_formula(formula)
>>> dist, mono = iso.calc_formula_distribution(formula)
>>> dist
array([1.92320044e-02, 2.10952666e-02, 3.13753566e-01, 3.42663681e-01,
       1.95962632e-01, 7.69157517e-02, 2.31993814e-02, 5.71948249e-03,
       1.19790438e-03, 2.18815385e-04])
>>> # Fe's mono position is 2 Da larger than its smallest mass,
>>> # so the mono position of this formula shifts by +2 (Da).
>>> mono
2

>>> formula = 'C(100)H(100)O(10)13C(1)Na(1)'
>>> formula = parse_formula(formula)
>>> dist, mono = iso.calc_formula_distribution(formula)
>>> dist
array([3.29033438e-03, 3.29352217e-01, 3.59329960e-01, 2.01524592e-01,
       7.71395498e-02, 2.26229845e-02, 5.41229894e-03, 1.09842389e-03,
       1.94206388e-04, 3.04911585e-05])
>>> # 13C's mono position is +1 Da shifted
>>> mono
1

>>> formula = 'C(100)H(100)O(10)Na(1)'
>>> formula = parse_formula(formula)
>>> dist, mono = iso.calc_formula_distribution(formula)
>>> dist
array([3.29033438e-01, 3.60911319e-01, 2.02775462e-01, 7.76884706e-02,
       2.27963906e-02, 5.45578135e-03, 1.10754072e-03, 1.95857410e-04,
       3.07552058e-05, 4.35047710e-06])
>>> # mono position is normal (=0) for regular formulas
>>> mono
0
- alphabase.constants.isotope.abundance_convolution(d1: ndarray, mono1: int, d2: ndarray, mono2: int) → Tuple[ndarray, int]
If we have two isotope distributions, we can convolute them into one distribution.
- Parameters:
d1 (np.ndarray) – first isotope distribution to convolve.
mono1 (int) – mono position of d1.
d2 (np.ndarray) – second isotope distribution to convolve.
mono2 (int) – mono position of d2.
- Returns:
np.ndarray, convolved isotope distribution.
int, new mono position.
- Return type:
tuple[np.ndarray,int]
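As a rough illustration of what convolving two isotope distributions means, the sketch below uses `np.convolve` and adds the two mono positions. `convolve_isotopes` is a hypothetical helper, not the library function, and it skips any truncation to a fixed length that the real routine may apply; the second distribution is a made-up element whose most abundant isotope sits at index 1.

```python
import numpy as np


def convolve_isotopes(d1, mono1, d2, mono2):
    """Combine two isotope distributions (hypothetical helper).

    Index i of a distribution is the abundance at +i Da above the lightest
    isotopologue, so offsets add and abundances multiply: a convolution.
    """
    dist = np.convolve(d1, d2)
    # The combined mono peak sits at the sum of the two mono offsets.
    return dist / dist.sum(), mono1 + mono2


carbon = np.array([0.9893, 0.0107])  # approximate 12C/13C abundances, mono at 0
toy = np.array([0.1, 0.9])           # made-up element, mono at index 1
dist, mono = convolve_isotopes(carbon, 0, toy, 1)
```

Here `mono` is 1 and `dist` has three entries, with the largest at index 1, showing how a shifted mono position in one input carries over to the result.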
- alphabase.constants.isotope.formula_dist(formula: list | str) → Tuple[ndarray, int]
Generate the isotope distribution and the mono index for a given formula, given either as a str or as a list, e.g. [('H', 2), ('C', 2), ('O', 1)].
- Parameters:
formula (Union[list, str]) – chemical formula, either str or list. If str: "H(1)N(2)O(3)". If list: [('H', 1), ('N', 2), ('O', 3)].
- Returns:
np.ndarray, isotope distribution.
int, mono position.
- Return type:
tuple[np.ndarray,int]
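The two accepted formula forms carry the same information. The sketch below shows one way to normalize either form into a list of (element, count) tuples; `normalize_formula` and its regular expression are assumptions made for illustration and are not AlphaBase's `parse_formula`.

```python
import re


def normalize_formula(formula):
    """Return a list of (element, count) tuples for a str or list formula."""
    if isinstance(formula, str):
        # Matches tokens such as 'C(100)', 'Na(1)' or the labeled '13C(1)'.
        pairs = re.findall(r"(\d*[A-Za-z]+)\((\d+)\)", formula)
        return [(elem, int(num)) for elem, num in pairs]
    return [(elem, int(num)) for elem, num in formula]


as_list = normalize_formula("H(1)N(2)O(3)")
same = normalize_formula([("H", 1), ("N", 2), ("O", 3)])
```

Both calls yield `[('H', 1), ('N', 2), ('O', 3)]`, so downstream code only has to handle the list form.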
- alphabase.constants.isotope.one_element_dist(elem: str, n: int, chem_isotope_dist: Dict, chem_mono_idx: Dict) → Tuple[ndarray, int]
Calculate the isotope distribution for a given number of atoms of one element.
- Parameters:
elem (str) – element symbol.
n (int) – number of atoms of the element.
chem_isotope_dist (numba.typed.Dict) – pass CHEM_ISOTOPE_DIST.
chem_mono_idx (numba.typed.Dict) – pass CHEM_MONO_IDX.
- Returns:
np.ndarray, isotope distribution of the element.
int, mono position in the distribution.
- Return type:
tuple[np.ndarray, int]
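The distribution of n atoms of one element is the single-atom distribution convolved with itself n times. A minimal sketch of that idea, using halving to keep the number of convolutions small, is given below. `n_atoms_dist` is a hypothetical helper that takes plain arrays instead of the numba dictionaries, does not truncate the result, and uses approximate natural abundances for iron.

```python
import numpy as np


def n_atoms_dist(single_dist, single_mono, n):
    """Distribution and mono index for n atoms of one element (sketch)."""
    if n == 0:
        return np.array([1.0]), 0  # no atoms: all abundance at offset 0
    if n == 1:
        return np.asarray(single_dist, dtype=float), single_mono
    # Solve for n // 2 atoms, then convolve that result with itself.
    half, half_mono = n_atoms_dist(single_dist, single_mono, n // 2)
    dist, mono = np.convolve(half, half), 2 * half_mono
    if n % 2 == 1:
        # Odd n: add one more atom.
        dist, mono = np.convolve(dist, single_dist), mono + single_mono
    return dist, mono


# Approximate abundances of 54Fe, (55), 56Fe, 57Fe, 58Fe; mono (56Fe) at index 2.
iron = np.array([0.05845, 0.0, 0.91754, 0.02119, 0.00282])
dist3, mono3 = n_atoms_dist(iron, 2, 3)
```

For three iron atoms the mono index is 6, that is, three times the single-atom offset of 2, and the largest peak of `dist3` lies at that index.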