Sample classification from protein mass spectrometry, by 'peak probability contrasts'.

Department of Health Research and Policy, Stanford University, CA 94305, USA.
Bioinformatics (Impact Factor: 4.62). 12/2004; 20(17):3034-44. DOI: 10.1093/bioinformatics/bth357
Source: PubMed

ABSTRACT: MOTIVATION: Early cancer detection has always been a major research focus in solid tumor oncology. Early tumor detection can theoretically result in lower-stage tumors, more treatable disease and ultimately higher cure rates with less treatment-related morbidity. Protein mass spectrometry is a potentially powerful tool for early cancer detection. We propose a novel method for sample classification from protein mass spectrometry data. When applied to spectra from both diseased and healthy patients, the 'peak probability contrast' technique provides a list of all common peaks among the spectra, their statistical significance and their relative importance in discriminating between the two groups. We illustrate the method on matrix-assisted laser desorption and ionization mass spectrometry data from a study of ovarian cancers. RESULTS: Compared to other statistical approaches for class prediction, the peak probability contrast method performs as well as or better than several methods that require the full spectra, rather than just labelled peaks. It is also much more interpretable biologically. The peak probability contrast method is a potentially useful tool for sample classification from protein mass spectrometry data.

  • Source
    ABSTRACT: Mixture modeling of mass spectra is an approach with many potential applications, including peak detection and quantification, smoothing, de-noising, feature extraction and spectral signal compression. However, existing algorithms do not allow for automatic analyses of whole spectra. Therefore, although several papers have highlighted the potential advantages of mixture modeling of mass spectra of peptide/protein mixtures and presented some preliminary results, the mixture modeling approach has so far not been developed to a stage that enables systematic comparisons with existing software packages for proteomic mass spectra analyses. In this paper we present an efficient algorithm for Gaussian mixture modeling of proteomic mass spectra of different types (e.g., MALDI-ToF profiling, MALDI-IMS). The main idea is automatic partitioning of the protein mass spectral signal into fragments. The obtained fragments are separately decomposed into Gaussian mixture models. The parameters of the mixture models of the fragments are then aggregated to form the mixture model of the whole spectrum. We compare the proposed algorithm to existing algorithms for peak detection and demonstrate the improvements in peak detection efficiency obtained by using Gaussian mixture modeling. We also show applications of the proposed algorithm to real proteomic datasets of low and high resolution.
  • Source
    ABSTRACT: We present a novel nonparametric Bayesian model using Lévy random field priors for identifying the presence and abundance of proteins from mass spectrometry data. Informed prior distributions, based on expert opinion and on preliminary laboratory experiments, help distinguish true peaks from background noise and help resolve uncertainty about peak multiplicity.
  • Source
    ABSTRACT: Mass spectrometry (MS) is an increasingly used technique in proteomics. MALDI and SELDI-TOF techniques enable the study of biological fluids, e.g. human blood. Analysis of these samples can lead to the discovery of new biomarkers which can ease the diagnosis and prognosis of several diseases, e.g. various cancers. In this work, we focus on MS data from MALDI-TOF or SELDI-TOF experiments, generating data to identify potential new biomarkers. Signal processing from raw spectra (output by mass spectrometers) to differential analysis is considered here, reviewing methods based on local features detected in MS signal.
