RESEARCH ARTICLE
Classification and adaptive behavior prediction of children with autism spectrum disorder based upon multivariate data analysis of markers of oxidative stress and DNA methylation
Daniel P. Howsmon¹,², Uwe Kruger³, Stepan Melnyk⁴, S. Jill James⁴, Juergen Hahn¹,²,³*
¹ Department of Chemical and Biological Engineering, Rensselaer Polytechnic Institute, Troy, New York, United States of America
² Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York, United States of America
³ Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, New York, United States of America
⁴ Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
* hahnj@rpi.edu
Abstract
The number of diagnosed cases of Autism Spectrum Disorders (ASD) has increased dramatically
over the last four decades; however, there is still considerable debate regarding
the underlying pathophysiology of ASD. This lack of biological knowledge restricts diagnosis
to behavioral observations and psychometric tools. However, physiological measurements
should support these behavioral diagnoses in the future in order to
enable earlier and more accurate diagnoses. Stepping towards this goal of incorporating
biochemical data into ASD diagnosis, this paper analyzes measurements of metabolite
concentrations of the folate-dependent one-carbon metabolism and transsulfuration pathways
taken from blood samples of 83 participants with ASD and 76 age-matched neurotypical
peers. Fisher Discriminant Analysis enables multivariate classification of the participants as
on the spectrum or neurotypical, which results in 96.1% of all neurotypical participants being
correctly identified as such while still correctly identifying 97.6% of the ASD cohort.
Furthermore, kernel partial least squares is used to predict adaptive behavior, as measured by the
Vineland Adaptive Behavior Composite score, where measurement of five metabolites of
the pathways was sufficient to predict the Vineland score with an R² of 0.45 after
cross-validation. This level of accuracy for classification as well as severity prediction far exceeds
any other approach in this field and is a strong indicator not only that the metabolites under
consideration are strongly correlated with an ASD diagnosis but also that the statistical analysis used
here offers tremendous potential for extracting important information from complex
biochemical data sets.
PLOS Computational Biology | DOI:10.1371/journal.pcbi.1005385 March 16, 2017 1 / 15
OPEN ACCESS
Citation: Howsmon DP, Kruger U, Melnyk S,
James SJ, Hahn J (2017) Classification and
adaptive behavior prediction of children with
autism spectrum disorder based upon multivariate
data analysis of markers of oxidative stress and
DNA methylation. PLoS Comput Biol 13(3):
e1005385. doi:10.1371/journal.pcbi.1005385
Editor: Christos A. Ouzounis, Centre for Research
and Technology-Hellas, GREECE
Received: August 12, 2016
Accepted: January 28, 2017
Published: March 16, 2017
Copyright: ©2017 Howsmon et al. This is an open
access article distributed under the terms of the
Creative Commons Attribution License, which
permits unrestricted use, distribution, and
reproduction in any medium, provided the original
author and source are credited.
Data Availability Statement: De-identified data are
available in S1 Dataset.
Funding: DPH and JH gratefully acknowledge
partial financial support from the National Institutes
of Health (https://www.nih.gov/, Grant
1R01AI110642). The funders had no role in study
design, data collection and analysis, decision to
publish, or preparation of the manuscript.
Competing interests: The authors have declared
that no competing interests exist.
Author summary
Autism spectrum disorder (ASD) encompasses a family of neurological disorders charac-
terized by limited social interaction and restricted repetitive behaviors. The number of
children diagnosed with ASD has grown exponentially over the last four decades and is
now estimated to affect ~1.5% of children. Although ASD is currently diagnosed and
treated based solely on psychometric tools, a biochemical view applicable to at least a subset
of ASD cases is emerging. Abnormalities in folate-dependent one-carbon metabolism
and transsulfuration pathways can summarize a large number of observations of genetic
and environmental effects that increase ASD predisposition. However, these complex,
highly interconnected pathways require more advanced statistical models than the typical
univariate models presented in the literature. Therefore, we developed multivariate statis-
tical models that classify participants based on their neurological status and predict adap-
tive behavior in ASD. We emphasize that these models are cross-validated, helping to
ensure that the results will generalize to new samples. The models developed herein have
much stronger predictability than any existing approaches from the scientific literature.
Introduction
Autism Spectrum Disorder (ASD) encompasses a large group of early-onset neurological dis-
eases characterized by difficulties with social communication/interaction and expression of
restricted repetitive behaviors and interests [1]. In addition to these defining behavioral symp-
toms, individuals with ASD frequently have one or more co-occurring conditions, including
intellectual disability, ADHD, speech and language delays, psychiatric diagnoses, epilepsy,
sleep disorders, and gastrointestinal problems [2–5]. ASD affects ~1.5% of the population and
affects males disproportionately [6–8]. It is associated with an impaired quality of life [9], and
the lifetime cost of supporting an individual with ASD amounts to $1.4–2.4 million, depending on
co-existing disorders [10].
It is generally acknowledged that ASD has a strong genetic component, but environmental
effects have also recently emerged as important contributors to the etiology and pathophysiol-
ogy of ASD in at least a subpopulation of cases. Early twin studies suggested that the heritabil-
ity of ASD was 80–90% [11]; however, twin studies since 2010 suggest a lower heritability of
only 37–55% [12,13]. Despite this high genetic association, only 15% of ASD cases have a
known genetic source [1]. Although genetic studies continue to provide new evidence for con-
tributing factors to ASD etiology [14], environmental effects such as maternal/paternal age,
toxic chemical exposure, maternal rubella infection, etc. are also emerging as key factors con-
tributing to ASD liability [13].
No generally accepted biomarkers for diagnosing ASD or assessing its severity exist
to date. Instead, diagnostic evaluation involves a multi-disciplinary team of doctors usually
including a pediatrician, psychologist, speech and language pathologist, and occupational ther-
apist. Despite this current state of the art, work in identifying biomarkers that can support the
diagnosis process is ongoing. In particular, abnormalities in folate-dependent one-carbon
metabolism (FOCM) and transsulfuration (TS) likely contribute to the genetic and environ-
mental predisposition to ASD [15]. FOCM contributes to epigenetic gene expression through
DNA methylation and TS is the major contributor to intracellular redox status. An illustration
of these pathways overlaid with genetic and environmental contributions to ASD predisposi-
tion is presented in Fig 1.
Mutations or altered expression levels of several genes in these pathways have been associ-
ated with increased risk of ASD. Adenylosuccinate lyase (ADSL) deficiency leads to a purely
genetic form of autism by re-directing a large proportion of FOCM toward purine synthesis to
compensate for a reduction in de novo purine synthesis [15,16]. Methylenetetrahydrofolate
reductase (MTHFR) is responsible for generating 5-methyltetrahydrofolate, which in turn is
responsible for re-methylating homocysteine to methionine. In particular, the C677T poly-
morphism has been shown to increase ASD liability, especially in countries where prenatal
folate supplementation is low [17]. Limited evidence linking mutations in reduced folate car-
rier (RFC1) [18,19], transcobalamin II (TCII) [18], serine hydroxymethyltransferase I
(SHMT1) [20], 5-methyltetrahydrofolate-homocysteine methyltransferase reductase (MTRR)
[18,20], and catechol-O-methyltransferase (COMT) [18,21] to altered prevalence of ASD has
also been presented, although these contributions to ASD liability are currently contested [22].
Evidence for the association between environmentally-rooted FOCM/TS dysfunction and
ASD predisposition can be seen in prenatal valproate and toxic chemical exposure as well as
lack of maternal folate supplementation. Maternal valproate use during pregnancy has been
associated with higher incidence rates of ASD [23,24] and in utero valproate exposure has
been used to develop rodent models of autism [25]. Valproate exposure causes DNA hypo-
methylation [26,27] in key neurodevelopmental processes that have been mitigated by folate
supplementation [28] in vitro. Other chemicals such as heavy metals, ethyl alcohol, pesticides,
phthalates, polychlorinated biphenyls, and traffic-related air pollution (TRAP) have also been
shown to affect neurodevelopment and increase ASD liability [13,29]. These organic toxins
induce oxidative stress and heavy metals disrupt transsulfuration by binding glutathione, the
major contributor to intracellular redox homeostasis [30]. Additionally, glutathione is an
important regulator in the intracellular processing of methylcobalamin (vitamin B12), an
Fig 1. Illustration of folate-dependent one-carbon metabolism and transsulfuration pathways. Genetic and environmental effects
that increase ASD predisposition are shown in red whereas those that decrease ASD liability are shown in blue.
doi:10.1371/journal.pcbi.1005385.g001
essential cofactor for methionine synthase and the TS pathway [31]. Air dispersion models
coupled with traffic patterns/roadway geometry, meteorological data, and vehicle emission
data have been used to find a dose response between ASD prevalence and TRAP exposure
[32]. Additionally, common organic pollutants have been associated with increased autism
severity in children on the autism spectrum [33]. Two independent studies linked maternal
folate supplementation to a reduced risk of having a child with ASD [34,35]. This protective
effect is usually attributed to the involvement of FOCM in early epigenetic regulation of neuro-
development and neural tube formation [21,36]. For a more complete description of the evi-
dence for the potential contributions of FOCM/TS dysfunction to the ASD phenotype, see the
excellent review by Deth et al. [15].
Although differences between FOCM and TS pathways in children with ASD versus neuro-
typical controls have been shown previously [18,37,38], investigators have struggled with
identifying a single, predictive measurement of these pathways that separates individuals with
ASD from neurotypical controls or that correlates well with ASD severity. However, in many
complex problems one particular measurement may be insufficient and important informa-
tion can only be extracted by using multivariate statistical analysis. Indeed, incorporating mul-
tiple measurements of environmental toxins has been shown to increase the separability of
control and ASD participants [39] and better predict autism severity [33,39].
Latent variable techniques enable the discovery of important multivariate interactions, lead-
ing to improved classification and regression performance. Furthermore, latent variable tech-
niques allow assessing the importance of individual variables and are more robust to
uninformative variables. One popular latent variable technique for classification problems is
Fisher Discriminant Analysis (FDA), which achieves an optimal linear separability using a typ-
ically small set of latent variables that are linear combinations of the original variable set. FDA
has a long history in biological classification problems and was first used by Rao in 1948 to
interpret anthropological data [40]. Extensions of FDA, such as Kernel FDA (KFDA), exist
which can take nonlinear relationships into account for classification [41]. Latent variable
regression techniques include partial least squares (PLS) and its nonlinear counterpart kernel
PLS (KPLS) [42,43]. Using FDA for classification and KPLS for regression allows multivariate
interactions to surface, which are often hidden when only univariate analysis is considered. To
ensure a statistically independent assessment of the multivariate classification and regression
models, the presented study uses a cross-validatory approach, in which the set of samples
used for model identification does not contain the samples used to evaluate the performance
of the identified models.
The presented work makes use of these advanced modeling and statistical analysis tools to
examine metabolite data of the FOCM/TS pathway in neurotypical participants (NEU) and
those on the autism spectrum (ASD) as well as their siblings (SIB). Using FDA, it is possible to
clearly distinguish the participants on the spectrum from their neurotypical peers and KPLS
unveils a strong correlation between metabolite concentrations of these pathways and adaptive
behavior as measured by the Vineland Adaptive Behavior Composite. This work not only ana-
lyzes the largest data set of its kind of these pathways in the scientific literature [38], but also
results in the strongest evidence to date of the association of FOCM/TS dysfunction with ASD.
Results
Classification into ASD, NEU, and SIB cohorts
Associating dysfunction of FOCM/TS pathways with ASD requires a distinction between or
separation of ASD and NEU groups based on FOCM/TS metabolites. Therefore, cross-
validatory FDA was performed using measurements of the FOCM/TS metabolites listed in
Table 1. A linear classifier based on these FDA scores is then used to classify ASD and NEU
participants. FDA scores and estimated probability distribution functions (PDFs) are provided
in Fig 2. The cross-validated misclassification rates of only 4.9% and 3.4% for the NEU and
ASD samples, respectively, eliminated more complex, nonlinear KFDA analysis from
consideration.
The performance of the classifier was then evaluated on the SIB cohort, a more challenging
classification problem due to partially shared genetic and environmental effects with the ASD
cohort. Using all measurements in Table 1, an FDA model was trained to separate the ASD
and NEU cohorts. Then, the trained FDA model was used to evaluate the SIB cohort (which
was not used for training). The resulting separation of ASD, NEU, and SIB presented in Fig 3
shows a slight increase in the overlap with the ASD cohort when compared with the perfor-
mance of the ASD vs. NEU classification. Furthermore, the SIB PDF shows significantly more
Table 1. FOCM/TS metabolites considered for analysis.
Methionine | SAM | SAH
SAM/SAH | % DNA methylation | 8-OHG
Adenosine | Homocysteine | Cysteine
Glu.-Cys. | Cys.-Gly. | tGSH
fGSH | GSSG | fGSH/GSSG
tGSH/GSSG | Chlorotyrosine | Nitrotyrosine
Tyrosine | Tryptophane | fCystine
fCysteine | fCystine/fCysteine | % oxidized glutathione
doi:10.1371/journal.pcbi.1005385.t001
Fig 2. Classification into ASD and NEU cohorts using FDA on all FOCM/TS metabolites. The plotted scores were
obtained via cross-validation and the probability distribution functions were obtained from fitting.
doi:10.1371/journal.pcbi.1005385.g002
overlap with the NEU PDF than the ASD PDF. These results support the hypothesis proposed
by James et al. [38] that the siblings of the participants on the spectrum have FOCM/TS metabolite
profiles that are significantly more similar to those of their neurotypical peers than to those of
their siblings with ASD, even though genetically they are likely closer to their siblings than to
participants in the neurotypical control group.
Analysis of important metabolites for classification
The simultaneous use of multiple measurements promises to increase the separability of the
cohorts; however, increasing the number of measurements increases the number of parameters
in the projection vector w that maximizes the separability of the two groups (see Materials
and methods). Although cross-validation can help mitigate these effects, the increased number
of parameters can lead to over-fitting, which would indicate good performance for separation
on the existing data set, but poor separation performance when the analysis results are trans-
lated to new data. These over-fitting problems can be further mitigated by selecting only the
minimum number of variables required to adequately separate the two groups. Therefore, all
combinations of up to six variables were evaluated for separability. Select combinations of
higher numbers of variables were chosen in a greedy fashion to sequentially add measurements
that best improve the separation of the best six variables. Cross-validatory FDA was performed
on all variable combinations and probability distribution functions (PDFs) of the FDA scores
of the two cohorts were estimated. A receiver-operating-characteristic (ROC) curve was gener-
ated based on these PDFs. The C-statistic of the ROC curve provides a measure of the ability of
the classifier to separate into ASD and neurotypical groups. A C-statistic of 0.5 represents ran-
dom classification and a C-statistic of 1.0 represents perfect classification. Fig 4 plots the
Fig 3. Classification performance on the SIB cohort (yellow). There is significantly more overlap of the SIB cohort with
the NEU cohort (red) than with the ASD cohort (blue).
doi:10.1371/journal.pcbi.1005385.g003
maximum C-statistic for all combinations of a given number of variables. As the number of
variables increases, the C-statistic increases, saturates at 0.997, and then slightly decreases
when over-fitting occurs.
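As a rough illustration of the C-statistic described above, the following Python sketch computes it directly from two sets of scores as the fraction of (ASD, NEU) pairs that are ranked correctly, with ties counted as one half. This empirical, rank-based form is an assumption made for illustration only: the analysis in the text derives the ROC curve from the fitted PDFs rather than from the raw scores. The function name and the toy scores are hypothetical.

```python
def c_statistic(pos_scores, neg_scores):
    """Empirical C-statistic (area under the ROC curve).

    Counts the fraction of (positive, negative) score pairs in which the
    positive sample scores higher; ties contribute one half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical discriminant scores, not taken from the study.
asd_scores = [2.1, 1.7, 2.6, 0.4]
neu_scores = [-1.9, -2.3, 0.9, -1.2]
print(c_statistic(asd_scores, neu_scores))
```

A value of 0.5 results when the two groups are indistinguishable by score, and 1.0 when every ASD score exceeds every NEU score, matching the interpretation given in the text.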
From these results, five variables (DNA methylation, 8-OHG, Glu.-Cys., fCystine/fCysteine,
% oxidized glutathione) were considered for further analysis; however, it should be noted that
select variable combinations distinct from this one provided similar performance for separat-
ing ASD and NEU participants. Chlorotyrosine and tGSH/GSSG were added to this set to
improve separability of the ASD and SIB groups, increasing the number of metabolites under
consideration to seven. The separability of the final minimal classifier based on these seven
variables is presented in Fig 5 with Type I and Type II error plots in S1 Fig.
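The variable search described in this section can be sketched in Python as an exhaustive pass over small subsets followed by greedy additions. The sketch assumes that all 24 quantities in Table 1 enter the search, in which case subsets of one to six variables number 190,050. The `score` argument stands in for the cross-validated C-statistic; the additive toy score and all function names below are hypothetical.

```python
from itertools import combinations
from math import comb


def count_subsets(n_vars, max_size):
    """Number of non-empty subsets with at most max_size variables."""
    return sum(comb(n_vars, k) for k in range(1, max_size + 1))


def best_subset(variables, size, score):
    """Exhaustively evaluate every subset of the given size."""
    return max(combinations(variables, size), key=score)


def greedy_add(current, variables, score):
    """Add the single variable that most improves the score."""
    candidates = [v for v in variables if v not in current]
    return max((tuple(current) + (v,) for v in candidates), key=score)


print(count_subsets(24, 6))  # 190050

# Hypothetical additive score used only to exercise the functions.
weights = {"a": 0.5, "b": 0.1, "c": 0.3, "d": 0.2}
toy_score = lambda subset: sum(weights[v] for v in subset)

start = best_subset(list(weights), 2, toy_score)
print(start, greedy_add(start, list(weights), toy_score))
```

The greedy step keeps the cost linear in the number of remaining variables, which is why it becomes attractive once exhaustive enumeration grows too large.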
Prediction of adaptive behavior in ASD
In addition to separation into neurologically distinct cohorts, metabolites in the FOCM/TS
pathway were investigated for predictability of adaptive behavior. Due to the inter-dependency
of pathway metabolites and possible nonlinear effects on psychological outcomes, nonlinear
regression via KPLS was used to evaluate the ability of pathway metabolites to predict adaptive
behavior in ASD (as measured by the Vineland Adaptive Behavior Composite score). Just as
was done in the FDA analysis, all combinations of a given number of variables were evaluated
for predictability. The cross-validatory R² of the regression was then used to determine the
optimal number of variables in the regression analysis. From the results in Fig 6, the R² begins
to decrease when more than five variables are used in the KPLS analysis. The maximum
cross-validatory R² was 0.45, corresponding to the KPLS model with the variable combination
GSSG, tGSH/GSSG, Nitrotyrosine, Tyrosine, and fCysteine used as inputs. These regression
results are plotted in Fig 6. (It is important to note that a few other variable combinations
provided similar results, but only the best regression model is illustrated for clarity.) This strong
Fig 4. Selecting the Number of Variables for FDA based on C-statistic. Five variables were found to be sufficient for
separating the ASD and NEU groups while an additional two variables (totaling seven variables) were incorporated to
retain separation between ASD and SIB cohorts.
doi:10.1371/journal.pcbi.1005385.g004
correlation even after cross-validation indicates the importance of FOCM/TS dysfunction in
the pathophysiology of ASD.
Discussion
The multivariate statistical analysis presented herein provides unprecedented quantitative clas-
sification results for separating participants into ASD and NEU cohorts based solely on
Fig 5. FDA analysis and binary classification using the variables DNA methylation, 8-OHG, Glu.-Cys., fCystine/
fCysteine, % oxidized glutathione, Chlorotyrosine, and tGSH/GSSG. (a) individual cross-validated FDA scores and
fitted probability distribution functions and (b) the cross-validated confusion matrix for separation of ASD and neurotypical
(NEU) groups. TPR = TP/(TP + FN) is the True Positive Rate, FPR = FP/(FP + TN) is the False Positive Rate, PPV = TP/
(TP + FP) is the Positive Predictive Value, and NPV = TN/(TN + FN) is the Negative Predictive Value.
doi:10.1371/journal.pcbi.1005385.g005
biochemical data. Existing analyses report differences in mean metabolite levels or provide
qualitative illustrations of separating these two groups based on FOCM/TS metabolites [18,37,
38]. However, these strategies are not designed for classification and thus fail to successfully
classify participants. Here, FDA on seven metabolites allows sufficient separation such that a
linear classifier can correctly resolve 96.9% of participants. Such low misclassification rates dis-
suaded the use of more complex, nonlinear methods such as KFDA. Although FOCM/TS dys-
function likely does not completely detail ASD etiology, this biochemical analysis approaches
the accuracy needed for a clinical diagnostic tool.
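The rates defined in the caption of Fig 5 can be reproduced with a few lines of Python. The counts used here are an assumption: 81 of 83 ASD and 73 of 76 NEU participants are the integer counts that round to the 97.6% and 96.1% quoted in the abstract, and together they give the 96.9% overall figure quoted above. The confusion matrix itself is not reproduced in this text, so the PPV and NPV values that follow from these counts are illustrative.

```python
def rates(tp, fn, tn, fp):
    """Classification rates using the definitions from the Fig 5 caption."""
    return {
        "TPR": tp / (tp + fn),
        "FPR": fp / (fp + tn),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Assumed counts (ASD treated as the positive class); see the note above.
metrics = rates(tp=81, fn=2, tn=73, fp=3)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```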
Classification performance on the SIB group fortifies the argument for FOCM/TS involve-
ment in ASD since the large degree of shared genetic and environmental effects with the ASD
population only slightly worsens the separation. The sibling recurrence rate for ASD is esti-
mated to be 6.9–18.7% [7,44,45] and many siblings perform behaviorally and/or cognitively
at intermediate levels between those of ASD and NEU cohorts [45–47] or express traits
characteristic of ASD [47–49]. Therefore, the classification performance placing the SIB group
between the ASD and NEU groups, albeit much closer to the NEU group, is consistent with
the broader scientific literature on psychometric analysis of siblings of people with ASD.
Future work would benefit from assessing the SIB and NEU groups on measurements of the
Broader Autism Phenotype to validate these hypotheses on mild FOCM/TS dysfunction in the
SIB group.
Comparison or meta-analysis of regression analyses across studies is difficult due to differ-
ences in metabolites measured, origin of metabolites, available psychometric data, and metrics
of model performance. It is emphasized that extreme caution should be used when evaluating
fitted versus cross-validated metrics; for example, in [39], the best linear model can achieve a
fitting R² of 0.296, while obtaining a cross-validated R² of only 0.192. In general, fitting results
always surpass cross-validation results; nevertheless, the top-performing KPLS model in this
study achieved a cross-validatory R² of 0.45 due to its ability to reflect nonlinear behaviors/
interactions, which surpasses or compares with previous fitting [50,51] and cross-validated
results [39].
Fig 6. KPLS regression results. (a) maximum cross-validated R² for a given number of variables and (b) cross-validated
model predictions versus actual data points for the best combination of five variables (GSSG, tGSH/GSSG, Nitrotyrosine,
Tyrosine, and fCysteine).
doi:10.1371/journal.pcbi.1005385.g006
Nonlinear regression analysis of FOCM/TS metabolites enables identification of the key FOCM/TS
metabolites that are associated with adaptive behavior in ASD. Based upon all variable combinations
evaluated in the KPLS regression analysis, top-performing models always incorporated
(1) nitrotyrosine, (2) tyrosine, (3) fGSH or tGSH/GSSG, and (4) fCysteine or fCystine/fCysteine.
Interestingly, these variables are affected by high-quality vitamin supplementation
that also decreases ASD severity in at least a subset of cases [51–53]. While this forms an
intriguing direction for future studies, it should be noted that these studies should be repli-
cated and empirically tested on a wider scale before more definite conclusions can be drawn.
Furthermore, this approach can be extended to include other psychometric instruments (e.g.
the Autism Diagnostic Observation Schedule (ADOS) or Childhood Autism Rating Scales
(CARS)) that are more appropriate for diagnosis of ASD.
Developmental pediatricians, psychologists and other professionals can effectively use the
wealth of information provided by psychometric instruments to diagnose and evaluate patients
with ASD. However, these tests can rarely diagnose children under two years old since they
are based solely on behavioral assessment. As it is generally acknowledged that an earlier diag-
nosis can lead to a more favorable outcome in the long run [54], the identification of biomark-
ers which can be used in conjunction with psychometric measurements would be of
significant importance for ASD diagnosis. Furthermore, identification of these biomarkers can
facilitate the understanding of these complex disorders, which offers significant potential for
developing intervention strategies targeted to normalize these biomarkers in the future. How-
ever, it is important to note that these biomarkers may not simply be measurements of certain
metabolites but may require nonlinear statistical analysis of the measurements, as is done in
this work.
Materials and methods
Description of data
The data used in this study come from the Arkansas Children’s Hospital Research Institute’s
autism IMAGE study [38]. The protocol was approved by the Institutional Review Board at
the University of Arkansas for Medical Sciences and all parents signed informed consent. The
interested reader is referred to [38] for detailed study design, including demographic informa-
tion and inclusion/exclusion criteria. Briefly, children between the ages of 3 and 10 years were
enrolled to assess levels of oxidative stress. ASD was defined by the Diagnostic and Statistical
Manual for Mental Disorders, Fourth Edition, the Autism Diagnostic Observation Schedule
(ADOS), and/or the Childhood Autism Rating Scales (CARS; score >30). FOCM/TS metabo-
lites from 83, 47, and 76 case (ASD), sibling (SIB), and age-matched control (NEU) children,
respectively, were used for classification. The metabolites under investigation are tabulated in
Table 1 and additional details of these measurements and derivations are presented in [38]. Of
the 83 participants on the autism spectrum, 55 also had Vineland II Scores recorded for use in
regression analysis (range 46–106). The Vineland Adaptive Behavior Composite evaluates
adaptive skills across the domains of communication, socialization, daily living skills, and
motor skills through a semi-structured caregiver interview [55]. Data are available in S1
Dataset.
Fisher Discriminant Analysis
Fisher Discriminant Analysis (FDA) is a dimensionality reduction tool that seeks to maximize
differences between multiple classes. Specifically, for $n$ samples of $m$ measurements associated
with $k$ different classes, the between-cluster variability $S_B$ is defined to be

$$S_B = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^T$$

where $\bar{x}_i$ represents the mean vector of class $i$, $\bar{x}$ represents the mean vector of all samples, and
$n_i$ represents the number of samples in class $i$. The within-cluster variation is defined as

$$S_W = \sum_{i=1}^{k} n_i \sum_{j \in i} (x_j - \bar{x}_i)(x_j - \bar{x}_i)^T$$

where $x_j$ represents an individual sample. FDA seeks to find at most $k - 1$ vectors $w$ that maximize

$$J(w) = \frac{w^T S_B w}{w^T S_W w}$$

In other words, FDA seeks to find linear combinations of variables that project samples in
the same group close to each other and project samples in different groups far away from each
other. The solution to this optimization problem is the generalized eigenvectors associated
with the $k - 1$ largest generalized eigenvalues of $S_W^{-1} S_B$.
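For the two-class case used in this study (k = 2) there is a single discriminant direction, and it is proportional to the inverse of S_W applied to the difference of the two class means. The Python sketch below uses that shortcut instead of a general eigenvalue solver and relies only on the standard library. It follows the definition of S_W as written above, including the n_i factor; dropping that factor gives the more common unweighted pooled scatter. All function names and the toy data are hypothetical, and this is not the authors' implementation.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, center):
    """Sum of outer products of deviations from the given center."""
    m = len(center)
    s = [[0.0] * m for _ in range(m)]
    for row in rows:
        d = [a - b for a, b in zip(row, center)]
        for p in range(m):
            for q in range(m):
                s[p][q] += d[p] * d[q]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("within-class scatter matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fda_direction(class_a, class_b):
    """Two-class discriminant direction w solving S_W w = mean_a - mean_b."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    na, nb = len(class_a), len(class_b)
    m = len(ma)
    # Class scatters weighted by class size, as in the definition in the text.
    s_w = [[na * sa[p][q] + nb * sb[p][q] for q in range(m)] for p in range(m)]
    return solve(s_w, [a - b for a, b in zip(ma, mb)])


def project(w, x):
    """FDA score of one sample: its projection onto w."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical two-variable data, not taken from the study.
group_a = [[2.0, 2.1], [2.5, 1.9], [3.0, 2.4], [2.2, 2.6]]
group_b = [[0.0, 0.1], [0.5, -0.2], [1.0, 0.3], [0.3, 0.6]]
w = fda_direction(group_a, group_b)
print([round(project(w, x), 3) for x in group_a + group_b])
```

A linear classifier then reduces to a threshold on the one-dimensional scores returned by `project`.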
Kernel density estimation
Kernel density estimation attempts to determine the underlying probability distribution function
from a set of reference samples. The main assumption is that additional samples are likely
to be found near the reference samples [56–58]. Using a Gaussian kernel, this assumption is
formulated into an algorithm by associating a kernel function

$$K\left(\frac{x - x_i}{\sigma}\right)$$

with each observation $x_i$. Here, $x$ is the additional sample and $\sigma$ is the kernel parameter that
controls the shape of the distribution function. The estimated density function $\hat{f}(x)$ is then
given by

$$\hat{f}(x) = \frac{1}{n\sigma} \sum_{i=1}^{n} K\left(\frac{x - x_i}{\sigma}\right)$$

where $n$ is the number of reference samples. The kernel parameter $\sigma$ is chosen to minimize the
mean integrated squared error (MISE) between the unknown density function $f(x)$ and the
estimated density function $\hat{f}(x)$:

$$\mathrm{MISE}(\sigma) = \int_{-\infty}^{\infty} \left( f(x) - \hat{f}(x) \right)^2 \, dx$$

using a cross-validatory approach [56].
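The density estimate above can be sketched in a few lines of plain Python for the univariate case. The function names `gaussian_kernel` and `kde` are hypothetical, the kernel is assumed to be the standard normal density, and the kernel parameter is passed in as a fixed value; the cross-validatory selection of the kernel parameter is not shown here.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, reference, sigma):
    """Estimated density at x from the reference samples and kernel parameter sigma."""
    n = len(reference)
    total = sum(gaussian_kernel((x - xi) / sigma) for xi in reference)
    return total / (n * sigma)
```

A larger `sigma` spreads each reference sample's contribution more widely and yields a smoother estimate, while a smaller value produces a sharper peak around each reference sample.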
Kernel partial least squares
Kernel techniques provide general nonlinear extensions to the popular linear partial
least squares (PLS) regression. The kernel PLS (KPLS) algorithm commences by defining a nonlinear
transformation $f = \psi(x)$ on the predictor set $x$. In this work, $\psi(x)$ is a Gaussian kernel. Rather
than regressing $y$ on $x$ as in linear PLS, $y$ is regressed onto the high-dimensional feature space
$f$ [42,43].
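In practice, kernel methods of this type work with the matrix of pairwise kernel evaluations rather than with the feature space itself. The sketch below shows only that first step, not a full KPLS regression. It assumes the Gaussian kernel form exp(-||x - x'||^2 / (2 sigma^2)) and the usual centering of the kernel matrix in feature space; the function names `gaussian_gram` and `center_gram` and the parameter `sigma` are hypothetical, and the exact parameterization used in the study is not stated in the text.

```python
import numpy as np

def gaussian_gram(X, sigma):
    """Pairwise Gaussian kernel matrix for the rows of X (assumed kernel form)."""
    X = np.asarray(X, dtype=float)
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def center_gram(K):
    """Center the kernel matrix, i.e. remove the mean in feature space."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J
```

The response $y$ would then be regressed on latent scores extracted from the centered matrix, which is where the PLS iterations described in [42,43] take over.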
Cross-validation
To avoid over-fitting and over-stating results, leave-one-out cross-validation is employed in
both the FDA and KPLS analyses. The approach leaves out a single sample, fits an FDA or
KPLS model to the remaining samples, and evaluates the prediction for the sample left out.
This scheme is repeated for each sample.
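The leave-one-out scheme can be sketched generically as follows. The function `leave_one_out` and the `fit` and `predict` callables are hypothetical names; a trivial mean predictor stands in for the FDA or KPLS model purely to keep the example short.

```python
def leave_one_out(samples, targets, fit, predict):
    """Return one held-out prediction per sample.

    samples and targets are lists of equal length; fit(train_x, train_y)
    returns a model and predict(model, x) returns a prediction for x.
    """
    predictions = []
    for i in range(len(samples)):
        # Train on every sample except the i-th one.
        train_x = samples[:i] + samples[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = fit(train_x, train_y)
        # Evaluate only on the sample that was left out.
        predictions.append(predict(model, samples[i]))
    return predictions

def fit_mean(train_x, train_y):
    """Stand-in model: the mean of the training targets."""
    return sum(train_y) / len(train_y)

def predict_mean(model, x):
    """Stand-in prediction: the stored mean, regardless of x."""
    return model
```

Because every prediction comes from a model that never saw the corresponding sample, any preprocessing that is learned from the data (for example scaling or variable selection) would need to sit inside `fit` for the estimate to stay free of leakage.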
Supporting information
S1 Dataset. Biochemical and Adaptive Behavior Data from ASD, NEU, and SIB Participants.
(CSV)
S1 Fig. Type I and II Errors for the Final FDA Model. Cross-validated type I and type II
errors for the FDA model using the variables DNA methylation, 8-OHG, Glu.-Cys., fCystine/
fCysteine, % oxidized, Chlorotyrosine, and tGSH/GSSG.
(TIF)
Author Contributions
Conceptualization: DPH UK SJJ JH.
Data curation: SM SJJ.
Formal analysis: DPH UK.
Funding acquisition: JH.
Investigation: DPH SM UK.
Methodology: DPH UK SM SJJ JH.
Project administration: JH.
Resources: SM SJJ.
Software: DPH UK.
Supervision: SJJ JH.
Validation: DPH UK JH.
Visualization: DPH.
Writing – original draft: DPH.
Writing – review & editing: DPH UK SJJ JH.
References
1. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 5th ed.
American Psychiatric Association; 2013.
2. Levy SE, Giarelli E, Lee LC, Schieve LA, Kirby RS, Cunniff C, et al. Autism spectrum disorder and co-
occurring developmental, psychiatric, and medical conditions among children in multiple populations of
the United States. Journal of Developmental & Behavioral Pediatrics. 2010; 31(4):267–275.
3. Perrin JM, Coury DL, Hyman SL, Cole L, Reynolds AM, Clemons T. Complementary and Alternative
Medicine Use in a Large Pediatric Autism Sample. Pediatrics. 2012; 130(Supplement 2):S77–S82. doi:
10.1542/peds.2012-0900E PMID: 23118257
4. Pulcini CD, Perrin JM, Houtrow AJ, Sargent J, Shui A, Kuhlthau K. Examining Trends and Coexisting
Conditions Among Children Qualifying for SSI Under ADHD, ASD, and ID. Academic Pediatrics. 2015;
15(4):439–443. doi: 10.1016/j.acap.2015.05.002 PMID: 26142070
5. Saunders A, Kirk IJ, Waldie KE. Autism Spectrum Disorder and Co-Existing Conditions: A Lexical
Decision Erp Study. Clin Exp Psychol. 2015; 1(001). doi: 10.4172/2471-2701.1000101
6. Centers for Disease Control and Prevention. Prevalence of Autism Spectrum Disorders—Autism and
Developmental Disabilities Monitoring Network, 14 Sites, United States, 2008. Morbidity and Mortality
Weekly Report. 2012; 61:1–19.
7. Grønborg TK, Schendel DE, Parner ET. Recurrence of autism spectrum disorders in full- and half-
siblings and trends over time: a population-based cohort study. JAMA pediatrics. 2013; 167(10):
947–953. doi: 10.1001/jamapediatrics.2013.2259 PMID: 23959427
8. Maenner MJ, Rice CE, Arneson CL, Cunniff C, Schieve LA, Van Naarden Braun K, et al. Potential
impact of DSM-5 criteria on autism spectrum disorder prevalence estimates. JAMA Psychiatry. 2014;
71(3):292–300. doi: 10.1001/jamapsychiatry.2013.3893 PMID: 24452504
9. van Heijst BF, Geurts HM. Quality of life in autism across the lifespan: A meta-analysis. Autism. 2015;
19(2):158–167. doi: 10.1177/1362361313517053 PMID: 24443331
10. Buescher AVS, Cidav Z, Knapp M, Mandell DS. Costs of Autism Spectrum Disorders in the United Kingdom
and the United States. JAMA Pediatrics. 2014; 168(8):721–728. doi: 10.1001/jamapediatrics.2014.210
PMID: 24911948
11. Ronald A, Hoekstra RA. Autism spectrum disorders and autistic traits: A decade of new twin studies.
American Journal of Medical Genetics Part B: Neuropsychiatric Genetics. 2011; 156(3):255–274. doi:
10.1002/ajmg.b.31159
12. Gaugler T, Klei L, Sanders SJ, Bodea CA, Goldberg AP, Lee AB, et al. Most genetic risk for autism
resides with common variation. Nature Genetics. 2014; 46(8):881–885. doi: 10.1038/ng.3039 PMID:
25038753
13. Mandy W, Lai MC. Annual Research Review: The role of the environment in the developmental psycho-
pathology of autism spectrum condition. Journal of Child Psychology and Psychiatry. 2016; 57(3):
271–292. doi: 10.1111/jcpp.12501 PMID: 26782158
14. Scherer SW, Dawson G. Risk factors for autism: translating genomic discoveries into diagnostics.
Human Genetics. 2011; 130(1):123–148. doi: 10.1007/s00439-011-1037-2 PMID: 21701786
15. Deth R, Muratore C, Benzecry J, Power-Charnitsky VA, Waly M. How environmental and genetic factors
combine to cause autism: A redox/methylation hypothesis. NeuroToxicology. 2008; 29(1):190–201.
doi: 10.1016/j.neuro.2007.09.010 PMID: 18031821
16. Jurecka A, Zikanova M, Kmoch S, Tylki-Szymańska A. Adenylosuccinate lyase deficiency. Journal of
Inherited Metabolic Disease. 2014; 38(2):231–242. doi: 10.1007/s10545-014-9755-y PMID: 25112391
17. Pu D, Shen Y, Wu J. Association between MTHFR Gene Polymorphisms and the Risk of Autism Spectrum
Disorders: A Meta-Analysis. Autism Research. 2013; 6(5):384–392. doi: 10.1002/aur.1300 PMID: 23653228
18. James SJ, Melnyk S, Jernigan S, Cleves MA, Halsted CH, Wong DH, et al. Metabolic endophenotype
and related genotypes are associated with oxidative stress in children with autism. American Journal of
Medical Genetics Part B: Neuropsychiatric Genetics. 2006; 141B(8):947–956. doi: 10.1002/ajmg.b.30366
19. James SJ, Melnyk S, Jernigan S, Pavliv O, Trusty T, Lehman S, et al. A functional polymorphism in the
reduced folate carrier gene and DNA hypomethylation in mothers of children with autism. American
Journal of Medical Genetics Part B: Neuropsychiatric Genetics. 2010; 153B(6):1209–1220.
20. Mohammad NS, Jain JMN, Chintakindi KP, Singh RP, Naik U, Akella RRD. Aberrations in folate meta-
bolic pathway and altered susceptibility to autism. Psychiatric Genetics. 2009; 19(4):171–176. PMID:
19440165
21. Schmidt RJ, Hansen RL, Hartiala J, Allayee H, Schmidt LC, Tancredi DJ, et al. Prenatal vitamins, one-
carbon metabolism gene variants, and risk for autism. Epidemiology (Cambridge, Mass). 2011; 22(4):
476–485.
22. Schaevitz LR, Berger-Sweeney JE. Gene-Environment Interactions and Epigenetic Pathways in
Autism: The Importance of One-Carbon Metabolism. ILAR Journal. 2012; 53(3–4):322–340.
doi: 10.1093/ilar.53.3-4.322 PMID: 23744970
23. Bromley RL, Mawer GE, Briggs M, Cheyne C, Clayton-Smith J, García-Fiñana M, et al. The prevalence
of neurodevelopmental disorders in children prenatally exposed to antiepileptic drugs. Journal of
Neurology, Neurosurgery & Psychiatry. 2013; 84(6):637–643. doi: 10.1136/jnnp-2012-304270
24. Christensen J, Grønborg TK, Sørensen MJ, Schendel D, Parner ET, Pedersen LH, et al. Prenatal
valproate exposure and risk of autism spectrum disorders and childhood autism. JAMA. 2013; 309(16):
1696–1703. doi: 10.1001/jama.2013.2270 PMID: 23613074
25. Roullet FI, Lai JKY, Foster JA. In utero exposure to valproic acid and autism—A current review of clini-
cal and animal studies. Neurotoxicology and Teratology. 2013; 36:47–56. PMID: 23395807
26. Dong E, Chen Y, Gavin DP, Grayson DR, Guidotti A. Valproate induces DNA demethylation in nuclear
extracts from adult mouse brain. Epigenetics. 2010; 5(8):730–735. doi: 10.4161/epi.5.8.13053 PMID:
20716949
27. Wang Z, Xu L, Zhu X, Cui W, Sun Y, Nishijo H, et al. Demethylation of Specific Wnt/β-Catenin Pathway
Genes and its Upregulation in Rat Brain Induced by Prenatal Valproate Exposure. The Anatomical
Record: Advances in Integrative Anatomy and Evolutionary Biology. 2010; 293(11):1947–1953.
doi: 10.1002/ar.21232
28. Fuller LC, Cornelius SK, Murphy CW, Wiens DJ. Neural crest cell motility in valproic acid. Reproductive
Toxicology. 2002; 16(6):825–839. doi: 10.1016/S0890-6238(02)00059-X PMID: 12401512
29. Landrigan PJ. What causes autism? Exploring the environmental contribution. Current Opinion in
Pediatrics. 2010; 22(2):219–225. PMID: 20087185
30. Shelton JF, Hertz-Picciotto I, Pessah IN. Tipping the Balance of Autism Risk: Potential Mechanisms
Linking Pesticides and Autism. Environmental Health Perspectives. 2012; 120(7):944–951.
doi: 10.1289/ehp.1104553 PMID: 22534084
31. Kim J, Hannibal L, Gherasim C, Jacobsen DW, Banerjee R. A Human Vitamin B12 Trafficking Protein
Uses Glutathione Transferase Activity for Processing Alkylcobalamins. Journal of Biological Chemistry.
2009; 284(48):33418–33424. doi: 10.1074/jbc.M109.057877 PMID: 19801555
32. Volk, Lurmann F, Penfold B, Hertz-Picciotto I, McConnell R. Traffic-related air pollution, particulate
matter, and autism. JAMA Psychiatry. 2013; 70(1):71–77. doi: 10.1001/jamapsychiatry.2013.266
PMID: 23404082
33. Boggess A, Faber S, Kern J, Kingston HMS. Mean serum-level of common organic pollutants is
predictive of behavioral severity in children with autism spectrum disorders. Scientific Reports. 2016; 6:26185.
doi: 10.1038/srep26185 PMID: 27174041
34. Schmidt RJ, Tancredi DJ, Ozonoff S, Hansen RL, Hartiala J, Allayee H, et al. Maternal periconceptional
folic acid intake and risk of autism spectrum disorders and developmental delay in the CHARGE
(CHildhood Autism Risks from Genetics and Environment) case-control study. The American Journal of
Clinical Nutrition. 2012; 96(1):80–89. doi: 10.3945/ajcn.110.004416 PMID: 22648721
35. Surén P, Roth C, Bresnahan M, Haugen M, Hornig M, Hirtz D, et al. Association between maternal use
of folic acid supplements and risk of autism spectrum disorders in children. JAMA. 2013; 309(6):570–577.
doi: 10.1001/jama.2012.155925
36. Blom HJ. Folic acid, methylation and neural tube closure in humans. Birth Defects Research Part A:
Clinical and Molecular Teratology. 2009; 85(4):295–302. doi: 10.1002/bdra.20581
37. James SJ, Cutler P, Melnyk S, Jernigan S, Janak L, Gaylor DW, et al. Metabolic biomarkers of
increased oxidative stress and impaired methylation capacity in children with autism. The American
Journal of Clinical Nutrition. 2004; 80(6):1611–1617. PMID: 15585776
38. Melnyk S, Fuchs GJ, Schulz E, Lopez M, Kahler SG, Fussell JJ, et al. Metabolic Imbalance Associated
with Methylation Dysregulation and Oxidative Damage in Children with Autism. Journal of Autism and
Developmental Disorders. 2012; 42(3):367–377. doi: 10.1007/s10803-011-1260-7 PMID: 21519954
39. Adams J, Howsmon DP, Kruger U, Geis E, Gehn E, Fimbres V, et al. Significant Association of Urinary
Toxic Metals and Autism-Related Symptoms: A Nonlinear Statistical Analysis with Cross Validation.
PLOS One. 2017; 12(1):e0169526. doi: 10.1371/journal.pone.0169526 PMID: 28068407
40. Rao CR. The utilization of multiple measurements in problems of biological classification. Journal of the
Royal Statistical Society Series B (Methodological). 1948; 10(2):159–203.
41. Mika S, Rätsch G, Weston J, Schölkopf B, Müller KR. Fisher Discriminant Analysis with Kernels. In:
Proceedings of the Neural Networks for Signal Processing IX Workshop; 1999. p. 41–48.
42. Rosipal R, Trejo LJ. Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. J
Mach Learn Res. 2002; 2:97–123.
43. Kim K, Lee JM, Lee IB. A novel multivariate regression approach based on kernel partial least squares
with orthogonal signal correction. Chemometrics and Intelligent Laboratory Systems. 2005; 79(1–2):
22–30. doi: 10.1016/j.chemolab.2005.03.003
44. Ozonoff S, Young GS, Carter A, Messinger D, Yirmiya N, Zwaigenbaum L, et al. Recurrence Risk
for Autism Spectrum Disorders: A Baby Siblings Research Consortium Study. Pediatrics. 2011; p.
2010–2825.
45. Constantino JN, Zhang Y, Frazier T, Abbacchi AM, Law P. Sibling recurrence and the genetic
epidemiology of autism. The American journal of psychiatry. 2010; 167(11):1349–1356.
doi: 10.1176/appi.ajp.2010.09101470 PMID: 20889652
46. Gizzonio V, Avanzini P, Fabbri-Destro M, Campi C, Rizzolatti G. Cognitive abilities in siblings of children
with autism spectrum disorders. Experimental Brain Research. 2014; 232(7):2381–2390.
doi: 10.1007/s00221-014-3935-8 PMID: 24710667
47. Ruzich E, Allison C, Smith P, Watson P, Auyeung B, Ring H, et al. Subgrouping siblings of people with
autism: Identifying the broader autism phenotype. Autism Research. 2016; 9(6):658–665.
doi: 10.1002/aur.1544 PMID: 26332889
48. Messinger D, Young GS, Ozonoff S, Dobkins K, Carter A, Zwaigenbaum L, et al. Beyond Autism: A
Baby Siblings Research Consortium Study of High-Risk Children at Three Years of Age. Journal of the
American Academy of Child and Adolescent Psychiatry. 2013; 52(3):300–308.
doi: 10.1016/j.jaac.2012.12.011 PMID: 23452686
49. Pisula E, Ziegart-Sadowska K. Broader Autism Phenotype in Siblings of Children with ASD—A Review.
International Journal of Molecular Sciences. 2015; 16(6):13217–13258. doi: 10.3390/ijms160613217
PMID: 26068453
50. Adams JB, Audhya T, McDonough-Means S, Rubin RA, Quig D, Geis E, et al. Nutritional and metabolic
status of children with autism vs. neurotypical children, and the association with autism severity.
Nutrition & Metabolism. 2011; 8:34. doi: 10.1186/1743-7075-8-34
51. Frye RE, Melnyk S, Fuchs G, Reid T, Jernigan S, Pavliv O, et al. Effectiveness of Methylcobalamin and
Folinic Acid Treatment on Adaptive Behavior in Children with Autistic Disorder Is Related to Glutathione
Redox Status. Autism Research and Treatment. 2013; 2013:e609705. doi: 10.1155/2013/609705
52. James SJ, Melnyk S, Fuchs G, Reid T, Jernigan S, Pavliv O, et al. Efficacy of methylcobalamin and
folinic acid treatment on glutathione redox status in children with autism. The American Journal of Clinical
Nutrition. 2009; 89(1):425–430. doi: 10.3945/ajcn.2008.26615 PMID: 19056591
53. Adams JB, Audhya T, McDonough-Means S, Rubin RA, Quig D, Geis E, et al. Effect of a
vitamin/mineral supplement on children and adults with autism. BMC Pediatrics. 2011; 11:111.
doi: 10.1186/1471-2431-11-111 PMID: 22151477
54. Zwaigenbaum L, Bryson S, Garon N. Early identification of autism spectrum disorders. Behavioural
Brain Research. 2013; 251:133–146. doi: 10.1016/j.bbr.2013.04.004 PMID: 23588272
55. Sparrow SS, Cicchetti DV, Balla DA. Vineland Adaptive Behavior Scales. 2nd ed. Minneapolis, MN:
Pearson Assessments; 2005.
56. Silverman BW. Density Estimation for Statistics and Data Analysis. CRC Press; 1986.
57. Chen Q, Kruger U, Leung AT. Regularised kernel density estimation for clustered process data. Control
Engineering Practice. 2004; 12(3):267–274. doi: 10.1016/S0967-0661(03)00083-2
58. Scott DW. Multivariate Density Estimation: Theory, Practice, and Visualization. John Wiley & Sons;
2015.
Article
Full-text available
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by impaired social interaction and restricted, repetitive behavior. Multiple studies have suggested mitochondrial dysfunction, glutamate excitotoxicity, and impaired detoxification mechanisms as accepted etiological mechanisms of ASD that can be targeted for therapeutic intervention. In the current study, blood samples were collected from 40 people with autism and 40 control participants after informed consent and full approval from the Institutional Review Board of King Saud University. Sodium (Na⁺), potassium (K⁺), lactate dehydrogenase (LDH), glutathione S-transferase (GST), and mitochondrial respiratory chain complex I (MRC1) were measured in plasma of both groups. Predictive models were established to discriminate individuals with ASD from controls. The predictive power of these five variables, individually and in combination, was compared using the area under a ROC curve (AUC). We compared the performance of principal component analysis (PCA), discriminant analysis (DA), and binary logistic regression (BLR) as ways to combine single variables and create the predictive models. K⁺ had the highest AUC (0.801) of any single variable, followed by GST, LDH, Na⁺, and MRC1, respectively. Combining the five variables resulted in higher AUCs than those obtained using single variables across all models. Both DA and BLR were superior to PCA and comparable to each other. In our study, the combination of Na⁺, K⁺, LDH, GST, and MRC1 showed the highest promise in discriminating individuals with autism from controls. These results provide a platform that can potentially be used to verify the efficacy of our models with a larger sample size or evaluate other biomarkers.
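The comparison of single-variable and combined-variable AUCs described in this abstract can be illustrated with a minimal sketch. The data below are invented toy values, the names `marker_a` and `marker_b` are hypothetical, and the equal-weight sum of z-scores is only a simple stand-in for the PCA, DA, and BLR combinations the study actually compared.

```python
from statistics import mean, pstdev


def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs in which the case
    scores higher than the control; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def zscores(values):
    """Standardize values to zero mean and unit (population) variance."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]


# Toy data (assumption): 1 = case, 0 = control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
marker_a = [5, 4, 2, 4, 3, 3, 1, 2]
marker_b = [2, 3, 5, 4, 3, 1, 3, 2]

# Equal-weight composite of the standardized markers (stand-in combiner).
composite = [a + b for a, b in zip(zscores(marker_a), zscores(marker_b))]

auc_a = auc(marker_a, labels)
auc_b = auc(marker_b, labels)
auc_combined = auc(composite, labels)
print(auc_a, auc_b, auc_combined)
```

On these toy values each marker alone separates the groups only partially, while the composite ranks every case above every control. An AUC computed on the same data used to build the combination is optimistic, so a real analysis would estimate it on held-out samples.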
Article
Full-text available
Introduction: A number of previous studies examined a possible association of toxic metals and autism, and over half of those studies suggest that toxic metal levels are different in individuals with Autism Spectrum Disorders (ASD). Additionally, several studies found that those levels correlate with the severity of ASD. Methods: In order to further investigate these points, this paper performs the most detailed statistical analysis to date of a data set in this field. First morning urine samples were collected from 67 children and adults with ASD and 50 neurotypical controls of similar age and gender. The samples were analyzed to determine the levels of 10 urinary toxic metals (UTM). Autism-related symptoms were assessed with eleven behavioral measures. Statistical analysis was used to distinguish participants on the ASD spectrum and neurotypical participants based upon the UTM data alone. The analysis also included examining the association of autism severity with toxic metal excretion data using linear and nonlinear analysis. “Leave-one-out” cross-validation was used to ensure statistical independence of results. Results and Discussion: Average excretion levels of several toxic metals (lead, tin, thallium, antimony) were significantly higher in the ASD group. However, ASD classification using univariate statistics proved difficult due to large variability, but nonlinear multivariate statistical analysis significantly improved ASD classification with Type I/II errors of 15% and 18%, respectively. These results clearly indicate that the urinary toxic metal excretion profiles of participants in the ASD group were significantly different from those of the neurotypical participants. Similarly, nonlinear methods determined a significantly stronger association between the behavioral measures and toxic metal excretion. 
The association was strongest for the Aberrant Behavior Checklist (including subscales on Irritability, Stereotypy, Hyperactivity, and Inappropriate Speech), but significant associations were found for UTM with all eleven autism-related assessments with cross-validation R² values ranging from 0.12–0.48.
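The leave-one-out cross-validated R² reported in this abstract can be sketched as follows. This is a minimal illustration only: it uses a one-predictor least-squares line rather than the nonlinear multivariate methods of the study, the data are invented, and defining the cross-validated R² as 1 - PRESS / SS_tot is one common convention assumed here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def total_sum_of_squares(ys):
    my = sum(ys) / len(ys)
    return sum((y - my) ** 2 for y in ys)


def r2_in_sample(xs, ys):
    """R² when the model is fitted and evaluated on the same samples."""
    slope, intercept = fit_line(xs, ys)
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - rss / total_sum_of_squares(ys)


def r2_leave_one_out(xs, ys):
    """R² from predictions made for each sample by a model that never saw it."""
    press = 0.0  # predicted residual sum of squares
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Toy data (assumption): a predictor and a behavioral score.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 5.9]

r2_fit = r2_in_sample(xs, ys)
r2_cv = r2_leave_one_out(xs, ys)
print(round(r2_fit, 3), round(r2_cv, 3))
```

For a least-squares fit the held-out errors are never smaller than the in-sample residuals, so the cross-validated R² cannot exceed the in-sample value. That is why cross-validated figures are the more conservative ones to report.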
Article
Full-text available
Autism spectrum disorders (ASD), and their pathogenesis, are growing public health concerns. This study evaluated serum concentrations of common organic pollutants in children as they related to behavioral severity determined by rating scales and the Autism Diagnostic Observation Schedule (ADOS). Thirty children, ages 2-9, with ASD and thirty controls matched by age, sex, and socioeconomic status were evaluated using direct blood serum sampling and ADOS. Pooling concentrations of all studied pollutants into a single variable yielded cohort-specific neurobehavioral relationships. Pooled serum concentration correlated significantly with increasing behavioral severity on the ADOS in the ASD cohort (p = 0.011, r = 0.54), but not in controls (p = 0.60, r = 0.11). Logistic regression significantly correlated mean pollutant serum concentration with the probability of diagnosis of behaviorally severe autism, defined as ADOS >14, across all participants (odds ratio = 3.43 [95% confidence interval: 1.14-10.4], p = 0.0287). No specific analyte correlated with ADOS in either cohort. The ASD cohort displayed greater quantitative variance of analyte concentrations than controls (p = 0.006), suggesting a wide range of detoxification functioning in the ASD cohort. This study supports the hypothesis that environmental exposure to organic pollutants may play a significant role in the behavioral presentation of autism.
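As a worked check of how the odds ratio, its confidence interval, and the p-value reported in this abstract relate to one another, the sketch below back-calculates the log-odds coefficient and its standard error. It assumes a Wald interval symmetric on the log scale and a normal approximation, which may differ from the procedure the authors used; the function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-odds coefficient, its standard error, the z statistic
    and the two-sided p-value from an odds ratio and its 95% interval,
    assuming a Wald interval (symmetric on the log scale)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p_two_sided


# Values as reported in the abstract.
beta, se, z, p = wald_from_ci(3.43, 1.14, 10.4)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.3f}")
```

Under these assumptions the recovered p-value is about 0.029, in line with the reported p = 0.0287 once rounding of the interval limits is taken into account.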
Article
Full-text available
The current study sought to clarify the nature of lexical decision-making information processing, using a lexical decision paradigm during EEG, in 4 groups: pure-ASD; pure-ADHD; pure-anxiety; and neurotypical controls. We also aimed to understand whether there were differences between groups when ASD presents as a comorbid condition (ASD + ADHD). The P100 and the N170 components of the event-related potentials (ERPs) were the focus of analyses. Overall, we found larger P100 amplitudes in the right (relative to the left) hemisphere in neurotypical controls. This early ERP component likely reflects pre-linguistic processing (e.g., the sorting of nouns into categories) at a stage before the language-dominant left hemisphere takes over. We also found that those with pure-ADHD had longer P100 latencies than both the pure-anxiety and pure-ASD groups towards all lexical stimuli. The pure-ADHD group also showed smaller amplitudes toward word stimuli than toward pseudowords and nonwords. The ASD + ADHD group had significantly longer latencies towards pseudowords than the pure-ASD group. A unique pattern of ERPs was therefore observed in the comorbid group, which suggests that the two conditions are separate. This finding is in accord with the latest revision of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5).
Article
Full-text available
Background: Although autism spectrum condition (ASC) is strongly genetic in origin, accumulating evidence points to the critical roles of various environmental influences on its emergence and subsequent developmental course. Methods: A developmental psychopathology framework was used to synthesise literature on environmental factors associated with the onset and course of ASC (based on a systematic search of the literature using PubMed, PsycINFO and Google Scholar databases). Particular emphasis was placed on gene-environment interplay, including gene-environment interaction (G × E) and gene-environment correlation (rGE). Results: Before conception, advanced paternal and maternal ages may independently enhance offspring risk for ASC. Exogenous prenatal risks are evident (e.g. valproate and toxic chemicals) or possible (e.g. selective serotonin reuptake inhibitors), and processes endogenous to the materno-foeto-placental unit (e.g. maternal diabetes, enhanced steroidogenic activities and maternal immune activation) likely heighten offspring vulnerability to ASC. Folate intake is a prenatal protective factor, with a particular window of action around 4 weeks preconception and during the first trimester. These prenatal risks and protective mechanisms appear to involve G × E and potentially rGE. A variety of perinatal risks are related to offspring ASC risk, possibly reflecting rGE. Postnatal social factors (e.g. caregiver-infant interaction, severe early deprivation) during the first years of life may operate through rGE to influence the likelihood of manifesting a full ASC phenotype from a 'prodromal' phase (a proposal distinct from the discredited and harmful 'refrigerator mother hypothesis'); and later postnatal risks, after the full manifestation of ASC, shape life span development through transactions mediated by rGE. There is no evidence that vaccination is a postnatal risk for ASC. 
Conclusions: Future investigations should consider the specificity of risks for ASC versus other atypical neurodevelopmental trajectories, timing of risk and protective mechanisms, animal model systems to study mechanisms underlying gene-environment interplay, large-sample genome-envirome designs to address G × E and longitudinal studies to elucidate how rGE plays out over time. Clinical and public health implications are discussed.
Book
Clarifies modern data analysis through nonparametric density estimation for a complete working knowledge of the theory and methods. Featuring a thoroughly revised presentation, Multivariate Density Estimation: Theory, Practice, and Visualization, Second Edition maintains an intuitive approach to the underlying methodology and supporting theory of density estimation. Including new material and updated research in each chapter, the Second Edition presents additional clarification of theoretical opportunities, new algorithms, and up-to-date coverage of the unique challenges presented in the field of data analysis. The new edition focuses on the various density estimation techniques and methods that can be used in the field of big data. Defining optimal nonparametric estimators, the Second Edition demonstrates the density estimation tools to use when dealing with various multivariate structures in univariate, bivariate, trivariate, and quadrivariate data analysis. Continuing to illustrate the major concepts in the context of the classical histogram, Multivariate Density Estimation: Theory, Practice, and Visualization, Second Edition also features: over 150 updated figures to clarify theoretical results and to show analyses of real data sets; an updated presentation of graphic visualization using computer software such as R; a clear discussion of selections of important research during the past decade, including mixture estimation, robust parametric modeling algorithms, and clustering; more than 130 problems to help readers reinforce the main concepts and ideas presented; boxed theorems and results allowing easy identification of crucial ideas; figures in color in the digital versions of the book; and a website with related data sets. 
Multivariate Density Estimation: Theory, Practice, and Visualization, Second Edition is an ideal reference for theoretical and applied statisticians, practicing engineers, as well as readers interested in the theoretical aspects of nonparametric estimation and the application of these methods to multivariate data. The Second Edition is also useful as a textbook for introductory courses in kernel statistics, smoothing, advanced computational statistics, and general forms of statistical distributions.