Popitam: Towards new heuristic strategies to improve protein identification from tandem mass spectrometry data

Swiss Institute of Bioinformatics, Geneva, Switzerland.
PROTEOMICS (Impact Factor: 3.97). 01/2003; 3(6):870-8. DOI: 10.1002/pmic.200300402
Source: PubMed

ABSTRACT: In recent years, proteomics research has gained importance due to increasingly powerful techniques in protein purification, mass spectrometry and identification, and due to the development of extensive protein and DNA databases from various organisms. Nevertheless, current methods for identification from spectrometric data have difficulty handling modifications or mutations in the source peptide. Moreover, they perform poorly when run on large databases (such as genomic databases) or on low-quality data, for example data affected by bad calibration or low fragmentation of the source peptide. We present a new algorithm dedicated to automated protein identification from tandem mass spectrometry (MS/MS) data by searching a peptide sequence database. Our identification approach shows promising properties for solving the specific difficulties enumerated above. It consists of matching theoretical peptide sequences taken from a database against a structured representation of the source MS/MS spectrum. The representation is similar to the spectrum graphs commonly used by de novo sequencing software. The identification process involves parsing the graph in order to emphasize the sections relevant to each theoretical sequence, and leads to a list of peptides ranked by a correlation score. The parsing of the graph, which can be a highly combinatorial task, is performed by a bio-inspired algorithm called the Ant Colony Optimization algorithm.
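
As a rough illustration of the idea in the abstract, the sketch below builds a toy spectrum graph and scores candidate peptides against it. It is not Popitam's implementation: the function names `build_spectrum_graph` and `score_peptide`, the tolerance value, and the assumption that peaks have already been converted to prefix masses are all hypothetical, and a simple greedy walk stands in for the Ant Colony Optimization parsing described above.

```python
from itertools import combinations

# Monoisotopic residue masses in Da for a handful of amino acids.
# The table is deliberately truncated to keep the example short.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "L": 113.08406,
}


def build_spectrum_graph(prefix_masses, tol=0.02):
    """Build a toy spectrum graph (hypothetical helper).

    Nodes are masses sorted in increasing order. An edge (i, j, aa) links
    two nodes whose mass difference equals the mass of residue aa within
    the tolerance tol.
    """
    nodes = sorted(prefix_masses)
    edges = []
    for i, j in combinations(range(len(nodes)), 2):
        diff = nodes[j] - nodes[i]
        for aa, mass in RESIDUE_MASS.items():
            if abs(diff - mass) <= tol:
                edges.append((i, j, aa))
    return nodes, edges


def score_peptide(peptide, nodes, edges):
    """Count the leading residues of peptide that can be walked along
    graph edges, starting from node 0 (assumed to be the zero mass).

    This greedy walk is a stand-in for the ACO-based graph parsing and is
    only meant to show how a database sequence is laid over the graph.
    """
    adjacency = {}
    for i, j, aa in edges:
        adjacency.setdefault((i, aa), j)
    current, matched = 0, 0
    for aa in peptide:
        nxt = adjacency.get((current, aa))
        if nxt is None:
            break
        current, matched = nxt, matched + 1
    return matched


# Prefix masses consistent with the peptide "GAS".
nodes, edges = build_spectrum_graph([0.0, 57.02146, 128.05857, 215.0906])
candidates = ["AGS", "GAV", "GAS"]
ranked = sorted(candidates, key=lambda p: score_peptide(p, nodes, edges), reverse=True)
```

Here `ranked` orders the candidates by how much of each sequence the graph supports, which mirrors the ranked peptide list mentioned in the abstract; a real scoring function would also account for missing peaks, modifications and several ion types.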

Available from: Robin Gras, Jul 07, 2015
  • Source
    ABSTRACT: Mitochondria are the primary locus for the generation of reactive nitrogen species including peroxynitrite and subsequent protein tyrosine nitration. Protein tyrosine nitration may have important functional and biological consequences such as alteration of enzyme catalytic activity. In the present study, mouse liver mitochondria were incubated with peroxynitrite, and the mitochondrial proteins were separated by 1D and 2D gel electrophoresis. Nitrotyrosinylated proteins were detected with an anti-nitrotyrosine antibody. One of the major proteins nitrated by peroxynitrite was carbamoyl phosphate synthetase 1 (CPS1) as identified by LC-MS protein analysis and Western blotting. The band intensity of nitration normalized to CPS1 was increased in a peroxynitrite concentration-dependent manner. In addition, CPS1 activity was decreased by treatment with peroxynitrite in a peroxynitrite concentration- and time-dependent manner. The decreased CPS1 activity was not recovered by treatment with reduced glutathione, suggesting that the decrease of the CPS1 activity is due to tyrosine nitration rather than cysteine oxidation. LC-MS analysis of in-gel digested samples, and a Popitam-based modification search located 5 out of 36 tyrosine residues in CPS1 that were nitrated. Taken together with previous findings regarding CPS1 structure and function, homology modeling of mouse CPS1 suggested that nitration at Y1450 in an α-helix of allosteric domain prevents activation of CPS1 by its activator, N-acetyl-l-glutamate. In conclusion, this study demonstrated the tyrosine nitration of CPS1 by peroxynitrite and its functional consequence. Since CPS1 is responsible for ammonia removal in the urea cycle, nitration of CPS1 with attenuated function might be involved in some diseases and drug-induced toxicities associated with mitochondrial dysfunction.
    Biochemical and Biophysical Research Communications 02/2012; 420(1):54-60. DOI:10.1016/j.bbrc.2012.02.114 · 2.28 Impact Factor
  • Source
    ABSTRACT: Biomarker detection is one of the greatest challenges in Clinical Proteomics. Today, great hopes are placed in tandem mass spectrometry (MS/MS) to discover potential biomarkers. MS/MS is a technique that allows large-scale data analysis, including the identification, characterization, and quantification of molecules. The identification process in particular, which involves comparing experimental spectra with theoretical amino acid sequences stored in specialized databases, has been the subject of extensive research in bioinformatics for many years. Dozens of identification programs have been developed to address different aspects of the identification process, but in general clinicians use only a single tool for their data analysis, along with a single set of specific parameters. Hence, a significant proportion of the experimental spectra do not lead to a confident identification score because of inappropriate parameters or scoring schemes in the analysis software applied. The swissPIT (Swiss Protein Identification Toolbox) project was initiated to provide the scientific community with an expandable multi-tool platform for automated and in-depth analysis of mass spectrometry data. swissPIT uses multiple identification tools to analyze mass spectra automatically. The tools are concatenated into analysis workflows. To run these calculation-intensive workflows, we use the Swiss Bio Grid infrastructure. A first version of the web-based front-end is available and can be freely accessed after requesting an account. The source code of the project will also be made available in the near future.
    Studies in health technology and informatics 02/2007; 126:13-22.
  • Source
    ABSTRACT: Filtration techniques in the form of rapid elimination of candidate sequences while retaining the true one are key ingredients of database searches in genomics. Although SEQUEST and Mascot perform a conceptually similar task to the tool BLAST, the key algorithmic idea of BLAST (filtration) was never implemented in these tools. As a result MS/MS protein identification tools are becoming too time-consuming for many applications including search for post-translationally modified peptides. Moreover, matching millions of spectra against all known proteins will soon make these tools too slow in the same way that "genome vs genome" comparisons instantly made BLAST too slow. We describe the development of filters for MS/MS database searches that dramatically reduce the running time and effectively remove the bottlenecks in searching the huge space of protein modifications. Our approach, based on a probability model for determining the accuracy of sequence tags, achieves superior results compared to GutenTag, a popular tag generation algorithm. Our tag generating algorithm along with our de novo sequencing algorithm PepNovo can be accessed via the URL
    Journal of Proteome Research 08/2005; 4(4):1287-95. DOI:10.1021/pr050011x · 5.00 Impact Factor
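
The filtration idea in the last abstract above can be sketched in a few lines: candidate peptides that contain none of the sequence tags derived from a spectrum are discarded before any expensive scoring. This is only a minimal illustration under simplifying assumptions. The function name `filter_by_tags` is hypothetical, tags are treated as plain substrings, and the flanking-mass checks and probability model that a real tag-based filter relies on are left out.

```python
def filter_by_tags(peptides, tags):
    """Keep only the peptides that contain at least one tag as a substring.

    Hypothetical helper: a real filter would also check that the masses
    flanking the tag agree with the spectrum.
    """
    return [p for p in peptides if any(tag in p for tag in tags)]


database = ["GASPVLK", "LLPVAG", "SSGGAA"]
survivors = filter_by_tags(database, ["SPV", "GGA"])
```

Only the peptides in `survivors` would then be passed to the full spectrum-matching step, which is where the running-time savings described in the abstract come from.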