alphabase.constants.isotope

Classes:

IsotopeDistribution([max_elem_num_dict])

Functions:

abundance_convolution(d1, mono1, d2, mono2)

Convolve two isotope distributions into a single distribution.

formula_dist(formula)

Generate the isotope distribution and the mono index for a given formula (as a list, e.g. [('H', 2), ('C', 2), ('O', 1)]).

one_element_dist(elem, n, chem_isotope_dist, ...)

Calculate the isotope distribution for a given number of atoms of one element.

class alphabase.constants.isotope.IsotopeDistribution(max_elem_num_dict: dict = {'C': 2000, 'H': 5000, 'N': 1000, 'O': 1000, 'P': 200, 'S': 200})

Bases: object

Methods:

__init__([max_elem_num_dict])

Speed up the calculation of isotope abundance distributions by pre-building isotope distribution tables for the most common elements.

calc_formula_distribution(formula)

Calculate the isotope abundance distribution for a given formula.

__init__(max_elem_num_dict: dict = {'C': 2000, 'H': 5000, 'N': 1000, 'O': 1000, 'P': 200, 'S': 200})

Speed up the calculation of isotope abundance distributions by pre-building isotope distribution tables for the most common elements.

The default limits are large enough for shotgun proteomics. max_elem_num_dict can be increased to support larger peptides or top-down proteomics in the future. However, the current MAX_ISOTOPE_LEN is not suitable for top-down and would have to be extended to a larger number (100?). Note that non-standard amino acids are assigned 1000000 C atoms in AlphaBase; this count is clipped to the maximal number of C in max_elem_num_dict. Because their masses are so large that they cannot be identified, their isotope distributions do not matter.

Parameters:

max_elem_num_dict (dict, optional) – Maximal number of atoms per element for which tables are pre-built. Defaults to {'C': 2000, 'H': 5000, 'N': 1000, 'O': 1000, 'P': 200, 'S': 200}.
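
The clipping described above can be illustrated with a minimal sketch. The helper name clip_formula is hypothetical and not part of AlphaBase, and whether the library treats the limit as inclusive is an assumption here; only the default limits are copied from the documented signature.

```python
# Default limits copied from the documented signature.
MAX_ELEM_NUM = {'C': 2000, 'H': 5000, 'N': 1000, 'O': 1000, 'P': 200, 'S': 200}

def clip_formula(formula, max_elem_num=MAX_ELEM_NUM):
    """Hypothetical helper: cap each tabulated element at its configured maximum."""
    clipped = []
    for elem, n in formula:
        if elem in max_elem_num:
            # Assumption: the limit itself is still an allowed count.
            n = min(n, max_elem_num[elem])
        clipped.append((elem, n))
    return clipped

# A placeholder residue with 1000000 C atoms is reduced to the table limit;
# elements without a table (here Na) are left unchanged.
print(clip_formula([('C', 1000000), ('H', 2), ('Na', 1)]))
# [('C', 2000), ('H', 2), ('Na', 1)]
```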

element_to_cum_dist_dict

{element: cumulated isotope distribution array}, where each array is a 2-D float np.ndarray with shape (element_max_number, MAX_ISOTOPE_LEN).

Type:

dict

element_to_cum_mono_idx

{element: mono position array of the cumulated isotope distribution}, where each mono position array is a 1-D int np.ndarray.

Type:

dict
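
To make the table layout concrete, here is a rough sketch of how such a pre-built table could be filled. It assumes that row n holds the distribution of n atoms, uses a stand-in length of 10 instead of the library's MAX_ISOTOPE_LEN, and takes approximate natural abundances for 12C and 13C; the function name build_cum_table is hypothetical, and the mono position array is omitted for brevity.

```python
import numpy as np

SKETCH_ISOTOPE_LEN = 10  # assumed stand-in for MAX_ISOTOPE_LEN

def build_cum_table(single_atom_dist, max_num, length=SKETCH_ISOTOPE_LEN):
    """Row n holds the distribution of n atoms, truncated to `length` peaks."""
    table = np.zeros((max_num, length))
    table[0, 0] = 1.0  # zero atoms: all abundance sits at offset 0
    for n in range(1, max_num):
        # Adding one more atom is one more convolution with the single-atom distribution.
        table[n] = np.convolve(table[n - 1], single_atom_dist)[:length]
    return table

# Approximate natural abundances of 12C and 13C (assumed values).
carbon_table = build_cum_table(np.array([0.9893, 0.0107]), max_num=5)
print(carbon_table.shape)  # (5, 10)

# Once the table exists, the distribution of two carbons is a row lookup.
two_carbons = carbon_table[2]
```

The point of the design is that the per-atom convolutions are paid once at construction time, so later formula calculations only combine a handful of pre-computed rows.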

calc_formula_distribution(formula: List[Tuple[str, int]]) → Tuple[ndarray, int]

Calculate the isotope abundance distribution for a given formula.

Parameters:

formula (List[Tuple[str, int]]) – chemical formula as a list of (element, count) tuples, e.g. [('H', 1), ('C', 2), ('O', 3)].

Returns:

np.ndarray, isotope abundance distribution

int, monoisotope position in the distribution array

Return type:

tuple[np.ndarray, int]

Examples

>>> from alphabase.constants import IsotopeDistribution, parse_formula
>>> iso = IsotopeDistribution()
>>> formula = 'C(100)H(100)O(10)Na(1)Fe(1)'
>>> formula = parse_formula(formula)
>>> dist, mono = iso.calc_formula_distribution(formula)
>>> dist
array([1.92320044e-02, 2.10952666e-02, 3.13753566e-01, 3.42663681e-01,
        1.95962632e-01, 7.69157517e-02, 2.31993814e-02, 5.71948249e-03,
        1.19790438e-03, 2.18815385e-04])
>>> # Fe's mono position is 2 Da larger than its smallest mass,
>>> # so the mono position of this formula shifts by +2 (Da).
>>> mono
2
>>> formula = 'C(100)H(100)O(10)13C(1)Na(1)'
>>> formula = parse_formula(formula)
>>> dist, mono = iso.calc_formula_distribution(formula)
>>> dist
array([3.29033438e-03, 3.29352217e-01, 3.59329960e-01, 2.01524592e-01,
        7.71395498e-02, 2.26229845e-02, 5.41229894e-03, 1.09842389e-03,
        1.94206388e-04, 3.04911585e-05])
>>> # 13C's mono position is +1 Da shifted
>>> mono
1
>>> formula = 'C(100)H(100)O(10)Na(1)'
>>> formula = parse_formula(formula)
>>> dist, mono = iso.calc_formula_distribution(formula)
>>> dist
array([3.29033438e-01, 3.60911319e-01, 2.02775462e-01, 7.76884706e-02,
        2.27963906e-02, 5.45578135e-03, 1.10754072e-03, 1.95857410e-04,
        3.07552058e-05, 4.35047710e-06])
>>> # mono position is normal (=0) for regular formulas
>>> mono
0
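
The mono shifts in the examples above can be checked by hand. The sketch below takes the distribution of C(100)H(100)O(10)Na(1) from the last example and convolves it with a single-atom distribution using np.convolve. The Fe abundances are assumed natural abundances (54Fe, 56Fe, 57Fe, 58Fe, with a zero for the absent 55Fe), and the [0.01, 0.99] distribution for 13C is inferred from the example output rather than taken from the library source.

```python
import numpy as np

# Distribution of C(100)H(100)O(10)Na(1), copied from the last example (mono = 0).
base = np.array([
    3.29033438e-01, 3.60911319e-01, 2.02775462e-01, 7.76884706e-02,
    2.27963906e-02, 5.45578135e-03, 1.10754072e-03, 1.95857410e-04,
    3.07552058e-05, 4.35047710e-06,
])
base_mono = 0

# Assumed single-atom distribution of Fe: index 0 is 54Fe, index 2 is 56Fe.
fe = np.array([0.05845, 0.0, 0.91754, 0.02119, 0.00282])
fe_mono = 2  # 56Fe is the monoisotope, 2 Da above the lightest isotope

# Assumed single-atom distribution of a 13C label, inferred from the example output.
c13 = np.array([0.01, 0.99])
c13_mono = 1

# Distributions convolve; mono positions add.
with_fe = np.convolve(base, fe)[:10]
with_fe_mono = base_mono + fe_mono      # 2

with_c13 = np.convolve(base, c13)[:10]
with_c13_mono = base_mono + c13_mono    # 1
```

Under these assumptions, the leading peaks of with_fe and with_c13 agree with the arrays printed in the first and second examples, and the summed mono positions match the reported values of 2 and 1.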

alphabase.constants.isotope.abundance_convolution(d1: ndarray, mono1: int, d2: ndarray, mono2: int) → Tuple[ndarray, int]

Convolve two isotope distributions into a single distribution.

Parameters:
  • d1 (np.ndarray) – first isotope distribution to convolve.

  • mono1 (int) – mono position of d1.

  • d2 (np.ndarray) – second isotope distribution to convolve.

  • mono2 (int) – mono position of d2.

Returns:

np.ndarray, convolved isotope distribution

int, new mono position

Return type:

tuple[np.ndarray,int]
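
As an illustration of what the convolution computes, the following is a minimal re-implementation sketch, not the library's code: convolve_sketch is a hypothetical name, the result is neither truncated to MAX_ISOTOPE_LEN nor renormalized, and the 12C/13C abundances are assumed values.

```python
import numpy as np

def convolve_sketch(d1, mono1, d2, mono2):
    """Hypothetical re-implementation: full convolution, no truncation."""
    out = np.zeros(len(d1) + len(d2) - 1)
    for i, a in enumerate(d1):
        for j, b in enumerate(d2):
            # Peak i of d1 combined with peak j of d2 lands at offset i + j.
            out[i + j] += a * b
    # Mono positions are offsets too, so they simply add.
    return out, mono1 + mono2

carbon = np.array([0.9893, 0.0107])  # assumed 12C/13C abundances
dist, mono = convolve_sketch(carbon, 0, carbon, 0)
# dist is approximately [0.97871449, 0.02117102, 0.00011449]; mono is 0
```

The double loop is equivalent to np.convolve(d1, d2); it is spelled out here only to show where each product of abundances ends up.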

alphabase.constants.isotope.formula_dist(formula: list | str) → Tuple[ndarray, int]

Generate the isotope distribution and the mono index for a given formula (as a list, e.g. [('H', 2), ('C', 2), ('O', 1)]).

Parameters:

formula (Union[list, str]) – chemical formula, either a str or a list. If str: "H(1)N(2)O(3)". If list: [('H', 1), ('N', 2), ('O', 3)].

Returns:

np.ndarray, isotope distribution

int, mono position

Return type:

tuple[np.ndarray,int]

alphabase.constants.isotope.one_element_dist(elem: str, n: int, chem_isotope_dist: Dict, chem_mono_idx: Dict) → Tuple[ndarray, int]

Calculate the isotope distribution for a given number of atoms of one element.

Parameters:
  • elem (str) – element symbol.

  • n (int) – number of atoms of the element.

  • chem_isotope_dist (numba.typed.Dict) – pass CHEM_ISOTOPE_DIST.

  • chem_mono_idx (numba.typed.Dict) – pass CHEM_MONO_IDX.

Returns:

np.ndarray, isotope distribution of the element

int, mono position in the distribution

Return type:

tuple[np.ndarray, int]
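
For an element with only two stable isotopes that are 1 Da apart (C, H or N, for example), the distribution of n atoms has a simple closed form: the number of heavy atoms follows a binomial distribution. The sketch below computes it directly; two_isotope_dist is a hypothetical helper, the 13C abundance is an assumed value, and the library itself may compute the distribution differently.

```python
from math import comb

def two_isotope_dist(n, heavy_abundance, length=10):
    """Binomial probability of k heavy atoms among n atoms, for k = 0 .. length - 1."""
    p = heavy_abundance
    max_k = min(n, length - 1)
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(max_k + 1)]

# 100 carbon atoms with an assumed 13C abundance of 1.07 %.
carbon_100 = two_isotope_dist(100, 0.0107)

# The mono position is 0 here because the light isotope is the most abundant one.
carbon_100_mono = 0
```

Elements with more than two isotopes, such as O or S, do not reduce to a single binomial and need the general convolution.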