Popitam: towards new heuristic strategies to improve protein identification from tandem mass spectrometry data.

Swiss Institute of Bioinformatics, Geneva, Switzerland.
PROTEOMICS (Impact Factor: 4.13). 07/2003; 3(6):870-8. DOI: 10.1002/pmic.200300402
Source: PubMed

ABSTRACT In recent years, proteomics research has gained importance due to increasingly powerful techniques in protein purification, mass spectrometry and identification, and due to the development of extensive protein and DNA databases from various organisms. Nevertheless, current methods for identification from spectrometric data have difficulty handling modifications or mutations in the source peptide. Moreover, they perform poorly when run on large databases (such as genomic databases), or with low quality data, for example due to bad calibration or low fragmentation of the source peptide. We present a new algorithm dedicated to automated protein identification from tandem mass spectrometry (MS/MS) data by searching a peptide sequence database. Our identification approach shows promising properties for solving the specific difficulties enumerated above. It consists of matching theoretical peptide sequences drawn from a database with a structured representation of the source MS/MS spectrum. The representation is similar to the spectrum graphs commonly used by de novo sequencing software. The identification process involves parsing the graph in order to emphasize the sections relevant to each theoretical sequence, and leads to a list of peptides ranked by a correlation score. The parsing of the graph, which can be a highly combinatorial task, is performed by a bio-inspired algorithm called Ant Colony Optimization.
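
The idea of matching a database peptide against a spectrum graph can be illustrated with a minimal Python sketch. It is not Popitam's implementation: it assumes the peaks are already neutral prefix masses, uses only a subset of the monoisotopic residue masses, replaces the Ant Colony Optimization search with a plain dynamic-programming pass, and reports a simple run length instead of the correlation score mentioned in the abstract. All function names (`build_spectrum_graph`, `longest_supported_run`, `prefix_masses`) and the synthetic peak list are hypothetical.

```python
# Monoisotopic residue masses in Da (a subset, enough for the illustration).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "D": 115.02694,
    "K": 128.09496, "E": 129.04259, "F": 147.06841, "R": 156.10111,
}


def prefix_masses(peptide):
    """Cumulative residue masses of a peptide, starting at 0 (synthetic peaks)."""
    masses, total = [0.0], 0.0
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        masses.append(total)
    return masses


def build_spectrum_graph(masses, tol=0.02):
    """Nodes are sorted peak masses; an edge (i, j) is labelled with every
    residue whose mass matches the gap between peaks i and j within tol."""
    nodes = sorted(masses)
    edges = {}
    for i, low in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            gap = nodes[j] - low
            labels = [aa for aa, m in RESIDUE_MASS.items() if abs(gap - m) <= tol]
            if labels:
                edges[(i, j)] = labels
    return nodes, edges


def longest_supported_run(peptide, edges):
    """Length of the longest stretch of consecutive residues in `peptide`
    that can be read along a connected path of edges in the graph."""
    best = 0
    prev = {}  # node index -> run length ending there at the previous residue
    for aa in peptide:
        cur = {}
        for (i, j), labels in edges.items():
            if aa in labels:
                cur[j] = max(cur.get(j, 0), prev.get(i, 0) + 1)
        best = max([best, *cur.values()])
        prev = cur
    return best


# Synthetic peaks for the peptide GASP plus one noise peak at 150.0 Da.
peaks = prefix_masses("GASP") + [150.0]
nodes, edges = build_spectrum_graph(peaks)
print(longest_supported_run("GASP", edges))  # 4
print(longest_supported_run("GAVP", edges))  # 2
```

In this toy setting a candidate sequence is ranked by how much of it the graph supports; a real search would also have to handle multiple ion types, charge states, modifications and missing peaks, which is where the combinatorial cost described in the abstract arises.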

  • ABSTRACT: Proteins play a fundamental role in establishing the diversity of cellular processes in health or disease systems. This diversity is accomplished by a vast array of protein functions. In fact, a protein rarely has a single function. The majority of proteins are involved in numerous cellular processes, and these multiple functions are made possible by interactions with other molecules. The complexity of interactions is substantially increased by the spatial and temporal diversity of proteins. For example, proteins can be part of distinct complexes within different subcellular compartments or at different stages of the cell cycle. Posttranslational modifications can regulate and further expand the ability of proteins to establish localization- or temporal-dependent interactions. This complexity and functional divergence of interactions is further increased by the simultaneous presence of stable, transient, direct, and indirect protein interactions. Thus, protein functions cannot be fully understood without knowledge of their interactions. Characterizing these interactions is therefore critical to understanding the biology of health and disease systems.
    Analytical Chemistry 11/2012; · 5.70 Impact Factor
  • ABSTRACT: In shotgun proteomics, protein mixtures are proteolytically digested before tandem mass spectrometry (MS/MS) analysis. Biological samples are generally characterized by a very high complexity, so a peptide fractionation step before MS analysis is essential. This step reduces the sample complexity and increases its compatibility with the sampling performance of the instrument. Among all the existing approaches for peptide fractionation, isoelectric focusing has several peculiarities that are theoretically known but practically rarely exploited by the proteomics community. The main aim of this review is to draw the readers' attention to these unique qualities, which are not accessible with other common approaches, and that represent important tools to increase confidence in the identification of proteins and some post-translational modifications. The general characteristics of different methods to perform peptide isoelectric focusing with natural and artificial pH gradients, the existing instrumentation, and the informatics tools available for isoelectric point calculation are also critically described. Finally, we give some general conclusions on this strategy, underlining its principal limitations.
    Journal of Chromatography A 04/2013; · 4.61 Impact Factor
  • ABSTRACT: High throughput protein identification and quantification analysis based on mass spectrometry are fundamental steps in most proteomics projects. Here, we present EasyProt (available at, a new platform for mass spectrometry data processing, protein identification, quantification and unexpected post-translational modification characterization. EasyProt provides a fully integrated graphical experience to perform a large part of the proteomic data analysis workflow. Our goal was to develop a software platform that would fulfill the needs of scientists in the field, while emphasizing ease-of-use for non-bioinformatician users. Protein identification is based on OLAV scoring schemes and protein quantification is implemented for both isobaric labeling and label-free methods. Additional features are available, such as peak list processing, isotopic correction, spectra filtering, charge-state deconvolution and spectra merging. To illustrate the EasyProt platform, we present two identification and quantification workflows based on isobaric tagging and label-free methods.
    Journal of Proteomics 12/2012; · 5.07 Impact Factor

