ABSTRACT: Studies aimed at identifying serum markers of cellular metabolism (biomarkers) that are associated at baseline with aerobic capacity (VO2max) in young, healthy individuals have yet to be reported. Therefore, the goal of the present study was to use the standard chemistry screen and untargeted mass spectrometry (MS)-based metabolomic profiling to identify significant associations between baseline levels of serum analytes or metabolites and VO2max (77 subjects, age range 18-35 years). Use of multivariable linear regression identified three analytes (standard chemistry screen) and twenty-three metabolites (MS-based metabolomics) with significant, sex-adjusted associations with VO2max. In addition, fourteen metabolites were found to have sex-specific associations with aerobic capacity. Subsequent stepwise multivariable linear regression identified the combination of SGOT, 4-ethylphenylsulfate, tryptophan, γ-tocopherol, and α-hydroxyisovalerate as overall, sex-adjusted baseline predictors of VO2max (adjusted R² = 0.66). However, the results of the stepwise model were found to be sensitive to outliers; therefore, random forest (RF) regression was performed. Use of RF regression identified a combination of seven covariates that explained 57.6% of the variability inherent in VO2max. Furthermore, inclusion of significant analytes, metabolites and sex-specific metabolites into a stepwise regression model identified the combination of five metabolites in males and seven metabolites in females as being able to explain 80% and 58% of the variability inherent in VO2max, respectively. In conclusion, the current report presents the first attempt to identify baseline serum biomarkers that are significantly associated with VO2max in young, healthy adult humans.
ABSTRACT: The interpretation of nuclear magnetic resonance (NMR) experimental results for metabolomics studies requires intensive signal processing and multivariate data analysis techniques. Standard quantification techniques attempt to minimize effects from variations in peak positions caused by sample pH, ionic strength, and composition. These techniques fail to account for adjacent signals which can lead to drastic quantification errors. Attempts at full spectrum deconvolution have been limited in adoption and development due to the computational resources required. Herein, we develop a novel localized deconvolution algorithm for general purpose quantification of NMR-based metabolomics studies. Localized deconvolution decreases average absolute quantification error by 97% and average relative quantification error by 88%. When applied to a 1H metabolomics study, the cross-validation metric, Q², improved 16% by reducing within group variability. This increase in accuracy leads to additional computing costs that are overcome by translating the algorithm to the map-reduce design paradigm.
International Conference on Bioinformatics & Computational Biology, Las Vegas, Nevada; 07/2012
ABSTRACT: 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) elicits a broad spectrum of species-specific effects that have not yet been fully characterized. This study compares the temporal effects of TCDD on hepatic aqueous and lipid metabolite extracts from immature ovariectomized C57BL/6 mice and Sprague-Dawley rats using gas chromatography-mass spectrometry and nuclear magnetic resonance-based metabolomic approaches and integrates published gene expression data to identify species-specific pathways affected by treatment. TCDD elicited metabolite and gene expression changes associated with lipid metabolism and transport, choline metabolism, bile acid metabolism, glycolysis, and glycerophospholipid metabolism. Lipid metabolism is altered in mice resulting in increased hepatic triacylglycerol as well as mono- and polyunsaturated fatty acid (FA) levels. Mouse-specific changes included the induction of CD36 and other cell surface receptors as well as lipases- and FA-binding proteins consistent with hepatic triglyceride and FA accumulation. In contrast, there was minimal hepatic fat accumulation in rats and decreased CD36 expression. However, choline metabolism was altered in rats, as indicated by decreases in betaine and increases in phosphocholine with the concomitant induction of betaine-homocysteine methyltransferase and choline kinase gene expression. Results from these studies show that aryl hydrocarbon receptor-mediated differential gene expression could be linked to metabolite changes and species-specific alterations of biochemical pathways.
ABSTRACT: The interpretation of nuclear magnetic resonance (NMR) experimental results for metabolomics studies requires intensive signal
processing and multivariate data analysis techniques. A key step in this process is the quantification of spectral features,
which is commonly accomplished by dividing an NMR spectrum into several hundred integral regions or bins. Binning attempts
to minimize effects from variations in peak positions caused by sample pH, ionic strength, and composition, while reducing
the dimensionality for multivariate statistical analyses. Herein we develop an improved novel spectral quantification technique,
dynamic adaptive binning. With this technique, bin boundaries are determined by optimizing an objective function using a dynamic
programming strategy. The objective function measures the quality of a bin configuration based on the number of peaks per
bin. This technique shows a significant improvement over both traditional uniform binning and other adaptive binning techniques.
This improvement is quantified via synthetic validation sets by analyzing an algorithm’s ability to create bins that do not
contain more than a single peak and that maximize the distance from peak to bin boundary. The validation sets are developed
by characterizing the salient distributions in experimental NMR spectroscopic data. Further, dynamic adaptive binning is applied
to a 1H NMR-based experiment to monitor rat urinary metabolites to empirically demonstrate improved spectral quantification.
ABSTRACT: As metabolomic technology expands, validated techniques for analyzing highly dimensional categorical data are becoming increasingly
important. This manuscript presents a novel latent vector-based methodology for analyzing complex data sets with multiple
groups that include both high and low doses using orthogonal projections to latent structures (OPLS) coupled with hierarchical
clustering. This general methodology allows complex experimental designs (e.g., multiple dose and time combinations) to be
encoded and directly compared. Further, it allows for the inclusion of low dose samples that do not exhibit a strong enough
individual response to be modeled independently. A dose- and time-responsive metabolomic study was completed to evaluate and
demonstrate this methodology. Single doses (0.1–100 mg/kg body weight) of α-naphthylisothiocyanate (ANIT), a common model
of hepatic cholestasis, were administered orally in corn oil to male Fischer 344 rats. Urine samples were collected pre-dose
and daily through day-4 post-dose. Blood samples were collected pre and post-dose to assess indices of clinical toxicity.
Urine samples were analyzed by 1H-NMR spectroscopy, and the spectra were adaptively binned to reduce dimensionality. The proposed methodology for NMR-based
urinary metabolomics was sensitive enough to detect ANIT-induced effects with respect to both dose and time at doses below
the threshold of clinical toxicity. A pattern of ANIT-dependent effects established at the highest dose was seen in the 50
and 20 mg/kg dose groups, an effect not directly identifiable with individual principal component analysis (PCA). Coupling
the pattern found by the OPLS algorithm and hierarchical clustering revealed a relationship between the 100, 50 and 20 mg/kg
dose groups, suggesting a characteristic effect of ANIT exposure. These studies demonstrate that the use of a metabolomics
approach with flexible binning of 1H spectra and appropriate application of multivariate analyses can reveal biologically relevant information about the temporal
metabolic perturbations caused by exposure and toxicity.
Keywords: NMR metabolomics – High dimension categorical data – Adaptive binning
ABSTRACT: Common contemporary practice within the nuclear magnetic resonance (NMR) metabolomics community is to evaluate and validate novel algorithms on empirical data or simplified simulated data. Empirical data captures the complex characteristics of experimental data, but the optimal or most correct analysis is unknown a priori; therefore, researchers are forced to rely on indirect performance metrics, which are of limited value. In order to achieve fair and complete analysis of competing techniques more exacting metrics are required. Thus, metabolomics researchers often evaluate their algorithms on simplified simulated data with a known answer. Unfortunately, the conclusions obtained on simulated data are only of value if the data sets are complex enough for results to generalize to true experimental data. Ideally, synthetic data should be indistinguishable from empirical data, yet retain a known best analysis.
We have developed a technique for creating realistic synthetic metabolomics validation sets based on NMR spectroscopic data. The validation sets are developed by characterizing the salient distributions in sets of empirical spectroscopic data. Using this technique, several validation sets are constructed with a variety of characteristics present in 'real' data. A case study is then presented to compare the relative accuracy of several alignment algorithms using the increased precision afforded by these synthetic data sets.
These data sets are available for download at http://birg.cs.wright.edu/nmr_synthetic_data_sets.
ABSTRACT: The work described in the following report was initiated to investigate the possibility of using novel biotechnologies for the discovery, down-selection, and pre-validation of biomarkers of toxic substance effects within the warfighter prior to health and operational performance decrement. Using the biotechnology of metabonomics, this effort focused on using nuclear magnetic resonance (NMR) spectroscopy and ultra pressure liquid chromatography mass spectrometry (UPLC/MS) for identification of liver-selective toxic effects following exposure to a known hepatotoxicant (alpha-naphthylisothiocyanate; ANIT) that induces cholestasis. Urine samples were analyzed by NMR spectroscopy and UPLC/MS and the data processed and analyzed by principal component analysis, linear discriminant analysis, and hierarchical clustering analysis. NMR- and UPLC/MS-based urinary metabonomics were sensitive enough to detect ANIT-induced toxic effects with respect to both dose and time. Understanding the cellular response to chemical exposure at the molecular level will not only facilitate the elucidation of the mechanism of chemical toxicity, but also allow accurate prediction of chemical toxicity and phenotypic outcome. Ultimately, this will lead to the identification of novel biomarkers for rapid monitoring and prediction of health hazards to the warfighter associated with chemical exposure.
ABSTRACT: In this study we examined the urinary metabolite profiles from rats following a single exposure to the kidney toxicants D-serine, puromycin, hippuric acid and amphotericin B at various doses, and as a function of time post-dose. In toxicology, such dose-time metabonomics studies are important for an accurate determination of the severity of biological effects, and for biomarker identification that may be associated with toxicity. The metabonomics analysis yielded a dose-response curve in principal component analysis space, and was able to detect exposure to D-serine and puromycin at much lower doses than standard clinical chemistry measures. Additionally, characteristic features in the urinary metabolite profiles could be ascertained as a function of dose. The results showed common features and some unique features in urinary metabolite profiles when analyzed by NMR and LC-MS, respectively.
ABSTRACT: Metabolomics offers the potential to assess the effects of toxicants on metabolite levels. To fully realize this potential,
a robust analytical workflow for identifying and quantifying treatment-elicited changes in metabolite levels by nuclear magnetic
resonance (NMR) spectrometry has been developed that isolates and aligns spectral regions across treatment and vehicle groups
to facilitate analytical comparisons. The method excludes noise regions from the resulting reduced spectra, significantly
reducing data size. Principal components analysis (PCA) identifies data clusters associated with experimental parameters.
Cluster-centroid scores, derived from the principal components that separate treatment from vehicle samples, are used to reconstruct
the mean spectral estimates for each treatment and vehicle group. Peak amplitudes are determined by scanning the reconstructed
mean spectral estimates. Confidence levels from Mann–Whitney order statistics and amplitude change ratios are used to identify
treatment-related changes in peak amplitudes. As a demonstration of the method, analysis of 13C NMR data from hepatic lipid extracts of immature, ovariectomized C57BL/6 mice treated with 30 μg/kg 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) or sesame oil vehicle, sacrificed at 72, 120, or 168 h, identified 152 salient peaks. PCA clustering showed
a prominent treatment effect at all three time points studied, and very little difference between time points of treated animals.
Phenotypic differences between two animal cohorts were also observed. Based on spectral peak identification, hepatic lipid
extracts from treated animals exhibited redistribution of unsaturated fatty acids, cholesterols, and triacylglycerols. This
method identified significant changes in peaks without the loss of information associated with spectral binning, increasing
the likelihood of identifying treatment-elicited metabolite changes.
ABSTRACT: In many metabolomics studies, NMR spectra are divided into bins of fixed width. This spectral quantification technique, known
as uniform binning, is used to reduce the number of variables for pattern recognition techniques and to mitigate effects from
variations in peak positions; however, shifts in peaks near the boundaries can cause dramatic quantitative changes in adjacent
bins due to non-overlapping boundaries. Here we describe a new Gaussian binning method that incorporates overlapping bins
to minimize these effects. A Gaussian kernel weights the signal contribution relative to distance from bin center, and the
overlap between bins is controlled by the kernel standard deviation. Sensitivity to peak shift was assessed for a series of
test spectra where the offset frequency was incremented in 0.5 Hz steps. For a 4 Hz shift within a bin width of 24 Hz, the
error for uniform binning increased by 150%, while the error for Gaussian binning increased by 50%. Further, using a urinary
metabolomics data set (from a toxicity study) and principal component analysis (PCA), we showed that the information content
in the quantified features was equivalent for Gaussian and uniform binning methods. The separation between groups in the PCA
scores plot, measured by the J2 quality metric, is as good as or better for Gaussian binning than for uniform binning. The Gaussian method is shown to be robust
with regard to peak shift, while still retaining the information needed by classification and multivariate statistical techniques
for NMR-metabolomics data.
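The contrast between hard-edged and overlapping bins described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unnormalized kernel, the toy single-point "peak", and the choice of a 12 Hz standard deviation are all assumptions made for illustration.

```python
import math

def uniform_bin(freqs_hz, intensities, edges_hz):
    """Sum intensity into hard-edged, non-overlapping bins [edge_i, edge_{i+1})."""
    bins = [0.0] * (len(edges_hz) - 1)
    for f, y in zip(freqs_hz, intensities):
        for i in range(len(bins)):
            if edges_hz[i] <= f < edges_hz[i + 1]:
                bins[i] += y
                break
    return bins

def gaussian_bin(freqs_hz, intensities, centers_hz, sigma_hz):
    """Weight every point by a Gaussian of its distance to each bin center.

    sigma_hz (the kernel standard deviation) controls how much neighbouring
    bins overlap. The kernel is left unnormalized here (an assumption).
    """
    bins = []
    for c in centers_hz:
        total = 0.0
        for f, y in zip(freqs_hz, intensities):
            total += y * math.exp(-0.5 * ((f - c) / sigma_hz) ** 2)
        bins.append(total)
    return bins

# Toy case (hypothetical values): two 24 Hz bins, one unit-intensity point
# that drifts across the shared boundary at 24 Hz.
edges = [0.0, 24.0, 48.0]
centers = [12.0, 36.0]
before = ([23.0], [1.0])   # just below the boundary
after = ([25.0], [1.0])    # just above the boundary

uniform_change = abs(uniform_bin(*after, edges)[0] - uniform_bin(*before, edges)[0])
gaussian_change = abs(gaussian_bin(*after, centers, 12.0)[0]
                      - gaussian_bin(*before, centers, 12.0)[0])
print(uniform_change, gaussian_change)
```

In this toy case the first uniform bin loses the point entirely when it crosses the boundary, while the Gaussian-weighted bin changes only gradually, which is the boundary effect the abstract sets out to reduce.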
ABSTRACT: In many metabolomics studies, NMR spectra are divided into bins of fixed width to reduce the number of variables for pattern recognition techniques and to mitigate effects from variations in peak positions. Using this method, shifts in peaks near bin boundaries can cause dramatic quantitative changes in adjacent bins. Here we describe a quantization technique using a Gaussian kernel that incorporates overlapping bins to minimize these effects. Sensitivity to peak shift was assessed for a series of test spectra where the offset frequency was incremented in 1 Hz steps. For a 4 Hz shift within a bin width of 24 Hz, the error for uniform binning increased by 150% while the error for Gaussian binning increased by 50%. Using a urinary metabolomics dataset (from a toxicity study) and principal component analysis (PCA), we showed that the information content in the quantified features was equivalent for Gaussian and uniform binning methods.
U.S. Army Center for Health Promotion and Preventive Medicine 11th Annual Force Health Protection Conference; 01/2008
ABSTRACT: Nuclear magnetic resonance (NMR) spectroscopy is a non-invasive method of acquiring a metabolic profile from biofluids. This metabolic information may provide keys to the early detection of exposure to a toxin. A typical NMR toxicology data set has low sample size and high dimensionality. Thus, traditional pattern recognition techniques are not always feasible. In this paper, we evaluate several common alternatives for isolating these biomarkers. The fold test, unpaired t-test, and paired t-test were performed on an NMR-derived toxicological data set and results were compared. The paired t-test method was preferred, due to its ability to attribute statistical significance, to take into consideration consistency of a single subject over a time course, and to mitigate the low sample, high dimensionality problem. We then grouped the resulting statistically salient potential biomarkers based on their significance patterns and compared results to several known metabolites affected by the tested toxin. Based on these results, we present a statistical protocol of sequential t-tests and clustering techniques for identifying putative biomarkers. We then present the results of this protocol applied to a specific real world toxicological data set.
Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2007, October 14-17, 2007, Harvard Medical School, Boston, MA, USA; 01/2007
ABSTRACT: Nuclear magnetic resonance (NMR) spectroscopy is a non-invasive method of acquiring a metabolic profile from biofluids. Identifying biomarkers from these profiles may provide keys to the early detection of exposure to a toxin. Two common features of NMR data sets are small sample size and a large number of variables (i.e. high dimensionality). The high dimensionality arises from each sample spectrum being divided into a large number of regions, each of which is a dimension. Pattern recognition techniques can then be used to identify biomarkers from a data set that consists of metabolic profiles from a small number of samples. A typical first step of this analysis is to individually identify responsive spectral regions, followed by associating these regions with metabolites and biomarkers. In this paper, we evaluate several common alternatives to identify responsive regions, including the fold test, paired t-test, and logistic regression. Further, when performing these types of analyses, the issues of multiple-comparisons and false positive rates must be addressed. We compare several corrections for these issues including the Bonferroni, Holm’s, Westfall and Young, permutation, and bootstrap methods. The results of these statistical tests in combination with the multiple-comparison corrections were compared on both a simulated data set and an NMR-derived toxicology data set. Based on these results, we present a statistical protocol for determining putative biomarkers, designed to mitigate the low sample size, high dimensionality, and false positive issues associated with NMR data.
Proceedings of the Ohio Collaborative Conference on Bioinformatics; 01/2007
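Two of the multiple-comparison corrections named in the abstract above, Bonferroni and Holm's step-down method, can be sketched in a few lines. This is a generic textbook formulation, not the protocol from the paper; the function names and the example p-values are hypothetical, and each p-value is imagined as coming from a per-region test such as a paired t-test.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each p by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Hypothetical p-values for four spectral regions.
p = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(p))
print(holm(p))
```

Holm's adjusted values are never larger than the Bonferroni ones, so it rejects at least as many hypotheses while controlling the same family-wise error rate, which matters when the number of spectral regions is large relative to the sample size.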
ABSTRACT: An increasingly important issue in force protection is the toxicology associated with toxic chemical and mixture exposure at uncharacterized deployed sites. Current methods for determining or monitoring toxic exposures to the warfighter in their working or living environment are not adequate to prevent serious health effects. Deployed personnel may be exposed to toxic chemicals as a result of industrial accidents, intentional or unintentional activities of enemy or friendly forces or sabotage. Rapid risk assessment of these scenarios requires the development of new testing methods. In order to prevent serious injury to the deployed warfighter exposed to toxic substances and to minimize mission degradation due to environmentally related adverse health effects, novel human monitoring methodologies that provide near real-time detection of potential toxic injury must be developed. It is necessary to devise methodologies that will predict or identify exposure of personnel to low concentrations of harmful substances before they cause harm to an individual. It is also important to identify methodologies that are relatively non-invasive, which could include collection of urine, blood, saliva or epithelial cells from humans. Emerging biotechnologies, such as toxicogenomics, proteomics and metabonomics will be investigated for their effectiveness to identify toxic effects upon the warfighter before they can induce a reduction in health and/or operational performance or before they can induce a disease process that would not manifest for several years.
ABSTRACT: The antioxidant capabilities of phosphatidylethanolamine plasmalogen (PlsEtn), in vivo, against lipid peroxidation were investigated via acute phosphine (PH3) administration in rats. Oxidative stress was assessed from measures of malondialdehyde and various enzyme activities, while NMR analyses of lipid and aqueous tissue extracts provided metabolic information in cerebellum, brainstem, and cortex. Brainstem had the highest basal [PlsEtn], and showed only moderate PH3-induced oxidative damage with no loss of ATP. The lowest basal [PlsEtn] was observed in cortex, where PH3 caused a 51% decrease in [ATP]. The largest oxidative effect occurred in cerebellum, but [ATP] was unaffected. Myo-inositol+ethanolamine pretreatment attenuated all PH3 effects. Specifically, the pretreatment attenuated the ATP decrease in cortex, and elevated brain [PlsEtn] in the cerebellum, nearly abolishing the cerebellar oxidative effects. Our data suggest a high basal [PlsEtn], or the capacity to synthesize new ethanolamine lipids (particularly PlsEtn) may protect against PH3 toxicity.
Neurochemical Research 06/2006; 31(5):639-56. · 2.13 Impact Factor
ABSTRACT: There are few studies of total body water (TBW) volume in children. Such studies are needed, as are new prediction equations for the clinical management of children with renal insufficiency and those receiving dialysis.
Mixed longitudinal data were from 124 white boys and 116 white girls 8 to 20 years of age. TBW volume was measured by deuterium nuclear magnetic resonance spectroscopy, and random effects models were used to determine patterns of change over time. Sex-specific TBW prediction equations were developed using regression analysis.
Boys had significantly greater (P < 0.05) mean TBW volumes than girls at all but 3 ages. TBW was significantly (P < 0.05) associated with age and maturation in the boys and the girls. In boys, mean TBW/WT varied from 0.55 to 0.59, while in the girls the mean declined from 0.53 to 0.49 by 16 years of age. Boys had significantly larger means for TBW/WT than girls, who had a significant, slight negative trend with age. The prediction equations were TBW = -25.87 + 0.23 (stature) + 0.37 (weight) for boys and TBW = -14.77 + 0.18 (stature) + 0.25 (weight) for girls.
Means are provided for TBW in white children from 8 to 20 years of age, whose average fatness affected the percentage of TBW in body weight. These updated TBW prediction equations perform better than those available from the past.
Kidney International 11/2005; 68(5):2317-22. · 8.52 Impact Factor
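The two sex-specific prediction equations quoted in the abstract above translate directly into a small helper. Only the coefficients come from the abstract; the function name is hypothetical, and the units (stature in cm, weight in kg, TBW in liters) are an assumption, since the abstract does not state them.

```python
# Coefficients as quoted in the abstract: (intercept, stature term, weight term).
TBW_COEFFICIENTS = {
    "boy": (-25.87, 0.23, 0.37),
    "girl": (-14.77, 0.18, 0.25),
}

def predict_tbw(sex, stature, weight):
    """Predicted total body water from stature and weight.

    Units are assumed (cm, kg, liters); the abstract does not specify them.
    The equations were fitted on white children aged 8 to 20 years, so any
    use outside that population is extrapolation.
    """
    intercept, b_stature, b_weight = TBW_COEFFICIENTS[sex]
    return intercept + b_stature * stature + b_weight * weight

# Hypothetical children, both 150 (cm assumed) tall and 40 (kg assumed) in weight.
print(round(predict_tbw("boy", 150, 40), 2))
print(round(predict_tbw("girl", 150, 40), 2))
```

Dividing the predicted value by body weight gives a TBW/WT ratio that can be compared against the ranges reported in the abstract as a rough plausibility check.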
ABSTRACT: A methodology has been implemented for analyzing microarray and NMR spectral data obtained from the same set of toxic-exposure dose-response experiments. The NMR spectra additionally track the time course of exposure. Analyses consist of screening the data to eliminate variates with insignificant signal, normalization appropriate to the experimental design, principal components analysis, and nonlinear classification using a support vector machine. It is found that exposure at subtoxic levels can be detected.
Fourth International IEEE Computer Society Computational Systems Bioinformatics Conference Workshops & Poster Abstracts (CSB 2005 Workshops), 8-11 August 2005, Stanford, CA, USA; 01/2005
ABSTRACT: Plasmalogens are ether-linked phospholipids highly abundant in nervous tissue. Previously we demonstrated that acute administration of myo-inositol (myo-Ins) + [2-13C]ethanolamine ([2-13C]Etn) significantly elevated phosphatidylethanolamine plasmalogen (PlsEtn) in rat whole brain. Current experiments investigated the effects of acute myo-Ins + [2-13C]Etn administration on [PlsEtn] and the biosynthesis of new Etn lipids using NMR spectroscopy in rat cerebral cortex, hippocampus, brainstem, midbrain and cerebellum. Treated rats received a single dose of myo-Ins + [2-13C]Etn and controls received saline rather than myo-Ins. Data reveal that the cerebellum is the brain region most affected by treatment, which resulted in a 22% increase in [PlsEtn] and 89% increase in newly synthesized Etn lipids relative to controls (P < 0.05). Furthermore, the cerebellar PlsEtn/phosphatidylethanolamine ratio and molar percentage of PlsEtn were significantly elevated by 12% and 8%, respectively (P < 0.05). These data suggest that myo-Ins influences Etn lipid metabolism in brain, particularly in the cerebellum where there is a stimulation in the biosynthesis of new Etn lipids with a preference towards PlsEtn.
Neurochemical Research 01/2005; 30(1):47-60. · 2.13 Impact Factor