Improved validation of peptide MS/MS assignments using spectral intensity prediction
ABSTRACT A major limitation in identifying peptides from complex mixtures by shotgun proteomics is the ability of search programs to accurately assign peptide sequences using mass spectrometric fragmentation spectra (MS/MS spectra). Manual analysis is used to assess borderline identifications; however, it is error-prone and time-consuming, and criteria for acceptance or rejection are not well defined. Here we report a Manual Analysis Emulator (MAE) program that evaluates results from search programs by implementing two commonly used criteria: 1) consistency of fragment ion intensities with predicted gas phase chemistry and 2) whether a high proportion of the ion intensity (proportion of ion current (PIC)) in the MS/MS spectra can be derived from the peptide sequence. To evaluate chemical plausibility, MAE utilizes similarity (Sim) scoring against theoretical spectra simulated by MassAnalyzer software (Zhang, Z. (2004) Prediction of low-energy collision-induced dissociation spectra of peptides. Anal. Chem. 76, 3908-3922) using known gas phase chemical mechanisms. The results show that Sim scores provide significantly greater discrimination between correct and incorrect search results than achieved by Sequest XCorr scoring or Mascot Mowse scoring, allowing reliable automated validation of borderline cases. To evaluate PIC, MAE simplifies the DTA text files summarizing the MS/MS spectra and applies heuristic rules to classify the fragment ions. MAE output also provides data mining functions, which are illustrated by using PIC to identify spectral chimeras, where two or more peptide ions were sequenced together, as well as cases where fragmentation chemistry is not well predicted.
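The two criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the MAE implementation: the abstract does not give MAE's heuristic rules or the exact form of the Sim score, so the sketch assumes a simple m/z tolerance match for PIC and uses cosine similarity as a stand-in for Sim. The function names, the 0.5 m/z tolerance, and the fragment labels are all hypothetical.

```python
import math

def proportion_of_ion_current(peaks, theoretical_mz, tol=0.5):
    """Fraction of total intensity explained by the candidate peptide.

    peaks: list of (mz, intensity) pairs from the observed MS/MS spectrum.
    theoretical_mz: fragment m/z values derived from the peptide sequence.
    tol: assumed matching tolerance in m/z units (hypothetical default).
    """
    total = sum(intensity for _, intensity in peaks)
    if total == 0:
        return 0.0
    explained = sum(
        intensity
        for mz, intensity in peaks
        if any(abs(mz - t) <= tol for t in theoretical_mz)
    )
    return explained / total

def cosine_similarity(observed, predicted):
    """Cosine similarity between two spectra keyed by fragment label.

    A stand-in for the Sim score; the published scoring function may differ.
    """
    labels = set(observed) | set(predicted)
    dot = sum(observed.get(k, 0.0) * predicted.get(k, 0.0) for k in labels)
    norm_obs = math.sqrt(sum(v * v for v in observed.values()))
    norm_pred = math.sqrt(sum(v * v for v in predicted.values()))
    if norm_obs == 0 or norm_pred == 0:
        return 0.0
    return dot / (norm_obs * norm_pred)

# Toy spectrum: two of three peaks fall within tolerance of a predicted fragment.
peaks = [(100.0, 50.0), (200.2, 30.0), (350.0, 20.0)]
pic = proportion_of_ion_current(peaks, theoretical_mz=[100.1, 200.0])

# Toy intensity patterns for the same three fragment labels.
sim = cosine_similarity({"b2": 1.0, "y3": 2.0, "y4": 0.5},
                        {"b2": 1.1, "y3": 1.9, "y4": 0.4})
```

Under these assumptions, a low PIC for an otherwise well-scored match would flag unexplained ion current of the kind the abstract associates with spectral chimeras.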
ABSTRACT: Accurate prediction of peptide fragment ion mass spectra is a critical factor in confident peptide identification by protein sequence database search in bottom-up proteomics. To predict such spectra accurately and comprehensively, a framework named MS2PBPI is proposed. MS2PBPI first extracts fragment ions from large-scale MS/MS spectra datasets according to the peptide fragmentation pathways and uses binary trees to divide the resulting bulk data into tens to more than one thousand regions. For each adequate region, a stochastic gradient boosting tree regression model is constructed. By constructing hundreds of these models, MS2PBPI is able to predict MS/MS spectra for unmodified and modified peptides with reasonable accuracy. Moreover, high consistency is achieved between predicted and experimental MS/MS spectra derived from different ion trap instruments with low and high resolving power. MS2PBPI outperforms the existing algorithms MassAnalyzer and PeptideART.
Analytical Chemistry 07/2014; 86(15). DOI:10.1021/ac501094m · 5.83 Impact Factor
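The partition-then-fit idea in this abstract can be sketched in plain Python. This is a simplified illustration, not MS2PBPI itself: regions are formed by a hypothetical categorical key instead of binary trees, each model is plain gradient boosting over one-feature decision stumps, and the random subsampling that makes boosting "stochastic" is omitted. All names, features, and toy data are assumptions.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Fit a one-split regression stump minimizing squared error."""
    best = None
    # Dropping the largest value keeps the right-hand side non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = mean(left), mean(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:
        # Only one distinct feature value: fall back to a constant stump.
        m = mean(residuals)
        return (float("inf"), m, m)
    return best[1:]

def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits current residuals."""
    base = mean(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate

def predict_boosted(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

def fit_per_region(examples, n_rounds=50, learning_rate=0.1):
    """Group (region, feature, intensity) examples and fit one model per region."""
    regions = {}
    for region, x, y in examples:
        xs, ys = regions.setdefault(region, ([], []))
        xs.append(x)
        ys.append(y)
    return {region: fit_boosted(xs, ys, n_rounds, learning_rate)
            for region, (xs, ys) in regions.items()}

# Toy data: the region key is a hypothetical ion type, the feature a fragment
# position, and the target a relative intensity.
examples = [("b", x, 0.2 if x <= 5 else 0.8) for x in range(1, 11)]
examples += [("y", x, 0.5) for x in range(1, 11)]
models = fit_per_region(examples)
```

Fitting a separate model per region keeps each regression problem small and homogeneous, which is presumably why the framework partitions the data before boosting.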
ABSTRACT: Glycosylation is a common post-translational modification that plays a pivotal role in many aspects of host protein function. Aberrant protein glycosylation is strongly associated with malignant transformation of cells. Therefore, a great deal of effort in the field of glycomics has been devoted to the discovery of glycan-based cancer biomarkers. Analytical separation, mainly in the form of liquid chromatography, and modern mass spectrometry are the two key technologies that allow mining of ultra-complex biological specimens for disease-associated or disease-specific changes in glycosylation. This dissertation describes the development of novel analytical platforms for structural characterization and quantitation of N-glycans released from an isolated plasma glycoprotein as well as from the whole plasma glycome. Chapter 1 provides an overview of protein glycosylation, the molecular structures of N-glycans and their biosynthesis, as well as the association between aberrant protein glycosylation and the onset and progression of cancer. The biological importance of glycosylation and its potential to serve as a source of cancer biomarkers has driven the development of more efficient separation and more sensitive detection methods. The major analytical approaches and technologies currently available to study the glycome, which utilize liquid chromatography, mass spectrometry and high-specificity enzyme tools, are reviewed along with their strengths and pitfalls. Aiming to find protein candidates with potential alterations in their glycosylation, a glycoproteomic analysis of plasma samples derived from Renal Cell Carcinoma (RCC) patients was conducted and is described in Chapter 2. For this study, a multi-dimensional platform combining depletion of the 14 most abundant proteins and multi-lectin affinity fractionation followed by isoelectric focusing sub-fractionation was used to reduce the dynamic range of the plasma glycoproteome prior to LC-MS/MS analysis.
This pilot study generated a list of protein candidates, among which is clusterin (apolipoprotein J), a glycoprotein previously implicated in renal cell carcinoma. Chapter 3 describes the development of a multi-dimensional analytical platform for profiling alterations of clusterin N-glycosylation in the plasma of patients with RCC. In this work, an automated multi-dimensional HPLC platform enabling high-throughput affinity enrichment of clusterin from plasma samples was developed. Integrated with two-dimensional gel electrophoresis, it isolated high-purity clusterin in microgram quantities suitable for glycan characterization. The analytical platform was applied to study clusterin glycosylation in a small group of RCC patients before and after nephrectomy. Structural characterization of N-glycans released from the purified clusterin was carried out using ultra performance liquid chromatography (UPLC) profiling of fluorescently labeled N-linked oligosaccharides on a sub-2 µm hydrophilic interaction (HILIC)-based stationary phase, integrated with sequential digestions with exoglycosidase enzymes. A statistically significant decrease was observed in the levels of A2G2S(3)2 glycans, while the levels of FA2G2S(6)2 and A3G3S(6)2 were increased in the post-surgery plasma samples. Chapter 4 describes the development and optimization of a novel analytical platform integrating differential chemical derivatization with chromatographic separation for in-depth characterization of isomeric tri-sialylated N-glycans released from glycoproteins. Here, we combined linkage-specific derivatization of sialic acids by reaction with the condensation reagent 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMT-MM) in methanol with nano-scale liquid chromatographic separation prior to high-resolution, high-mass-accuracy Orbitrap MS analysis.
Our analytical strategy not only allows for linkage-specific characterization of sialylated glycans directly from the precursor mass, but also improves the preceding HILIC separation by increasing the hydrophobicity and altering the selectivity of the oligosaccharide analytes. In the last chapter, the methodology developed in Chapter 4 was applied to investigate whether structural alterations in tri-sialylated N-glycans, released and enriched from the sera of pathological stage- and sex-matched patients bearing lung, breast, ovarian, pancreatic or gastric cancer, demonstrate any degree of cancer specificity, or whether changes in expression levels are purely cancer associated. The results of this pilot study indicated a limited degree of cancer specificity, particularly for pancreatic cancer, based on alterations in the relative abundance of specific tri-sialylated isomers.
08/2013, Degree: PhD, Supervisor: William S. Hancock
ABSTRACT: The 2p3/2 core-level binding energy (BE) shifts of the Fe surface and Fe nanoparticles have been analyzed by considering the bond order-length-strength (BOLS) correlation mechanism in decomposition of the X-ray photoelectron spectrum (XPS). It turns out that the Fe 2p3/2 BE shifts positively by 2.17 eV from the atomic value of 704.52 eV to the bulk value of 706.69 eV and that a further 0.32 and 0.16 eV positive shift occurs, respectively, to the top and the second atomic layers. Consistency between BOLS predictions and the measured size dependence of the BE shift clarifies the dominance of the broken-bond-induced local strain and quantum trapping in perturbing the Hamiltonian and hence the positive shift of the 2p3/2 BE of the Fe surface and Fe nanostructures.
Chemical Physics Letters 10/2009; 480(4):243-246. DOI:10.1016/j.cplett.2009.09.017 · 1.99 Impact Factor
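The binding energies quoted in this abstract can be checked with simple arithmetic. The layer values below assume that the "further" 0.32 eV and 0.16 eV shifts are measured relative to the bulk value, which the abstract implies but does not state outright; the symbols are labels introduced here for illustration.

```latex
\begin{align*}
E_{\text{bulk}}   &= E_{\text{atom}} + \Delta E_{\text{bulk}} = 704.52 + 2.17 = 706.69~\text{eV} \\
E_{\text{top}}    &= E_{\text{bulk}} + 0.32 = 707.01~\text{eV} \\
E_{\text{second}} &= E_{\text{bulk}} + 0.16 = 706.85~\text{eV}
\end{align*}
```

The first line reproduces the bulk value stated in the abstract, so the quoted atomic value, bulk shift, and bulk value are mutually consistent.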