Automated Workflows for Accurate Mass-based Putative Metabolite Identification in LC/MS-derived Metabolomic Datasets

School of Biomedicine, The University of Manchester, Manchester M13 9PT, UK.
Bioinformatics (Impact Factor: 4.98). 02/2011; 27(8):1108-12. DOI: 10.1093/bioinformatics/btr079
Source: PubMed


The study of metabolites (metabolomics) is increasingly being applied to investigate microbial, plant, environmental and mammalian systems. One limiting factor is the chemical identification of metabolites from the mass spectrometric signals present in complex datasets.
Three workflows have been developed to allow rapid, automated and high-throughput annotation and putative metabolite identification of electrospray LC-MS-derived metabolomic datasets. The collection of workflows is named PUTMEDID_LCMS and performs feature annotation, matching of accurate m/z to the accurate mass of neutral molecules and their associated molecular formulae, and matching of those molecular formulae to a reference file of metabolites. The software is independent of the instrument and of the data pre-processing applied. The number of false positives is reduced by eliminating the inaccurate matching of many artifact, isotope, multiply charged and complex adduct peaks through complex interrogation of the experimental data.
The workflows, standard operating procedure and further information are publicly available at
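
The accurate-mass matching step described in the abstract can be illustrated with a minimal sketch. This is not the PUTMEDID_LCMS implementation: the adduct table, the two-entry reference list, the function names and the 3 ppm default tolerance are all assumptions chosen for illustration, and only positive-ion adducts are considered.

```python
# Illustrative sketch only; NOT the PUTMEDID_LCMS code.
# All names, tables and the default tolerance below are assumptions.

PROTON = 1.007276  # mass of a proton in Da

# Hypothetical, deliberately small adduct table:
# adduct name -> (mass added to the neutral molecule in Da, charge)
ADDUCTS = {
    "[M+H]+": (PROTON, 1),
    "[M+Na]+": (22.989218, 1),
    "[M+2H]2+": (2 * PROTON, 2),
}

# Hypothetical reference file: metabolite -> monoisotopic neutral mass (Da)
REFERENCE = {
    "glucose (C6H12O6)": 180.063388,
    "citric acid (C6H8O7)": 192.027003,
}


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def neutral_mass(mz, adduct):
    """Convert an observed m/z to the neutral mass implied by an adduct."""
    shift, charge = ADDUCTS[adduct]
    return mz * charge - shift


def putative_matches(mz, tolerance_ppm=3.0):
    """Return (metabolite, adduct, ppm error) for every reference entry
    whose mass lies within the tolerance of an adduct-corrected m/z."""
    hits = []
    for adduct in ADDUCTS:
        mass = neutral_mass(mz, adduct)
        for name, ref_mass in REFERENCE.items():
            err = ppm_error(mass, ref_mass)
            if abs(err) <= tolerance_ppm:
                hits.append((name, adduct, round(err, 2)))
    return hits


if __name__ == "__main__":
    # m/z of a protonated glucose ion: 180.063388 + 1.007276
    print(putative_matches(181.070664))
```

Because each m/z is tested against every adduct hypothesis, a single feature can return several putative identities. The abstract describes reducing such false positives by first annotating artifact, isotope, multiply charged and adduct peaks, a step this sketch does not attempt.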



Available from: Louise C Kenny,
    • "While the mapping of gene and protein IDs is in most cases straightforward, m/z ratios from non-targeted metabolomics experiments cannot be directly mapped to entries in the corresponding databases and the identification of metabolites is a major bottleneck in such experiments (Dunn et al. 2013; Scalbert et al. 2009). A common approach is to calculate putative monoisotopic masses and molecular formulas for all MS data set features and match these with known metabolites (Brown et al. 2011; Kuhl et al. 2012; Kaever et al. 2012; Lee et al. 2013). In order to identify relevant pathways, a popular approach is the Gene/ Metabolite Set Enrichment Analysis (G/M SEA) and "
    ABSTRACT: A central aim in the evaluation of non-targeted metabolomics data is the detection of intensity patterns that differ between experimental conditions as well as the identification of the underlying metabolites and their association with metabolic pathways. In this context, the identification of metabolites based on non-targeted mass spectrometry data is a major bottleneck. In many applications, this identification needs to be guided by expert knowledge, and interactive tools for exploratory data analysis can significantly support this process. Additionally, the integration of data from other omics platforms, such as DNA microarray-based transcriptomics, can provide valuable hints and thereby facilitate the identification of metabolites via the reconstruction of related metabolic pathways. We here introduce the MarVis-Pathway tool, which allows the user to identify metabolites by annotation of pathways from cross-omics data. The analysis is supported by an extensive framework for pathway enrichment and meta-analysis. The tool allows the mapping of data set features by ID, name, and accurate mass, and can incorporate information from adduct and isotope correction of mass spectrometry data. MarVis-Pathway was integrated in the MarVis-Suite, which features the seamless, highly interactive filtering, combination, clustering, and visualization of omics data sets. The functionality of the new software tool is illustrated using combined mass spectrometry and DNA microarray data. This application confirms jasmonate biosynthesis as an important metabolic pathway that is upregulated during the wound response of Arabidopsis plants.
    Metabolomics 10/2014; 11(3). DOI:10.1007/s11306-014-0734-y · 3.86 Impact Factor
    • "Metabolic features characterized by measuring both the accurate m/z and retention time, and corresponding putative molecular annotations were assigned by standard methods as described [50]. One or more molecular formulae within available databases were assigned to each feature with mass accuracy of ±3 ppm. "
    ABSTRACT: Background: Blood-vessel dysfunction arises before overt hyperglycemia in type-2 diabetes (T2DM). We hypothesised that a metabolomic approach might identify metabolites/pathways perturbed in this pre-hyperglycemic phase. To test this hypothesis and for specific metabolite hypothesis generation, serum metabolic profiling was performed in young women at increased, intermediate and low risk of subsequent T2DM. Methods: Participants were stratified by glucose tolerance during a previous index pregnancy into three risk-groups: overt gestational diabetes (GDM; n = 18); those with glucose values in the upper quartile but below GDM levels (UQ group; n = 45); and controls (n = 43, below the median glucose values). Follow-up serum samples were collected at a mean 22 months postnatally. Samples were analysed in a random order using Ultra Performance Liquid Chromatography coupled to an electrospray hybrid LTQ-Orbitrap mass spectrometer. Statistical analysis included principal component (PCA) and multivariate methods. Findings: Significant between-group differences were observed at follow-up in waist circumference (86, 95%CI (79–91) vs 80 (76–84) cm for GDM vs controls, p<0.05), adiponectin (about 33% lower in GDM group, p = 0.004), fasting glucose, post-prandial glucose and HbA1c, but the latter 3 all remained within the ‘normal’ range. Substantial differences in metabolite profiles were apparent between the 2 ‘at-risk’ groups and controls, particularly in concentrations of phospholipids (4 metabolites with p≤0.01), acylcarnitines (3 with p≤0.02), short- and long-chain fatty acids (3 with p≤0.03), and diglycerides (4 with p≤0.05). Interpretation: Defects in adipocyte function from excess energy storage as relatively hypoxic visceral and hepatic fat, and impaired mitochondrial fatty acid oxidation may initiate the observed perturbations in lipid metabolism. Together with evidence from the failure of glucose-directed treatments to improve cardiovascular outcomes, these data and those of others indicate that a new, quite different definition of type-2 diabetes is required. This definition would incorporate disturbed lipid metabolism prior to hyperglycemia.
    PLoS ONE 09/2014; 9(9):e103217. DOI:10.1371/journal.pone.0103217 · 3.23 Impact Factor
    • "However, it is the flexible nature of LC-MS technology that is also its greatest advantage. The extreme diversity of metabolite chemophysical properties or concentration, within a sample, precludes global assessment by any single analytical technology [38]. The ability to customize parameters to separate, detect, or even target a wide range of diverse molecules at low concentrations (e.g. "
    ABSTRACT: Background: Metabolomics is a well-established rapidly developing research field involving quantitative and qualitative metabolite assessment within biological systems. Recent improvements in metabolomics technologies reveal the unequivocal value of metabolomics tools in natural products discovery, gene-function analysis, systems biology and diagnostic platforms. Scope of review: We review here some of the prominent metabolomics methodologies employed in data acquisition and analysis of natural products and disease-related biomarkers. Major conclusions: This review demonstrates that metabolomics represents a highly adaptable technology with diverse applications ranging from environmental toxicology to disease diagnosis. Metabolomic analysis is shown to provide a unique snapshot of the functional genetic status of an organism by examining its biochemical profile, with relevance toward resolving phylogenetic associations involving horizontal gene transfer and distinguishing subgroups of genera possessing high genetic homology, as well as an increasing role in both elucidating biosynthetic transformations of natural products and detecting preclinical biomarkers of numerous disease states. General significance: This review expands the interest in multiplatform combinatorial metabolomic analysis. The applications reviewed range from phylogenetic assignment, biosynthetic transformations of natural products, and the detection of preclinical biomarkers.
    Biochimica et Biophysica Acta (BBA) - General Subjects 08/2014; 1840(12). DOI:10.1016/j.bbagen.2014.08.007 · 4.38 Impact Factor