Healthy Cognitive Aging: a Hybrid Random Vector Functional-Link Model for the
Analysis of Alzheimer’s Disease
Peng Dai1,2,3, Femida Gwadry-Sridhar1,2,3, Michael Bauer1, Michael Borrie4, Xue Teng3, for the ADNI∗
1Department of Computer Science, University of Western Ontario, London, ON, Canada
2Robarts Research, London, ON, Canada
3Pulse Infoframe Inc., London, ON, Canada
4Division of Geriatric Medicine, University of Western Ontario, London, ON, Canada
peng.dai.ca@ieee.org, {fgwadrys, bauer}@uwo.ca, michael.borrie@sjhc.london.on.ca, xt.biomedical@gmail.com
Abstract
Alzheimer’s disease (AD) is a genetically complex neurode-
generative disease, which leads to irreversible brain damage,
severe cognitive problems and ultimately death. A number
of clinical trials and study initiatives have been set up to
investigate AD pathology, leading to large amounts of high
dimensional heterogeneous data (biomarkers) for analysis.
This paper focuses on combining clinical features from dif-
ferent modalities, including medical imaging, cerebrospinal
fluid (CSF), etc., to diagnose AD and predict potential pro-
gression. Due to privacy and legal issues involved with clin-
ical research, the study cohort (number of patients) is rela-
tively small, compared to thousands of available biomark-
ers (predictors). We propose a hybrid pathological analysis
model, which integrates manifold learning and Random Vec-
tor functional-link network (RVFL) so as to achieve better
ability to extract discriminant information with limited train-
ing materials. Furthermore, we model (current and future)
cognitive healthiness as a regression problem about age. By
comparing the difference between predicted age and actual
age, we manage to show statistical differences between dif-
ferent pathological stages. Verification tests are conducted
based on the Alzheimer’s Disease Neuroimaging Initiative
(ADNI) database. Extensive comparison is made against dif-
ferent machine learning algorithms, i.e. Support Vector Ma-
chine (SVM), Random Forest (RF), Decision Tree and Mul-
tilayer Perceptron (MLP). Experimental results show that our
proposed algorithm achieves better results than the compari-
son targets, which indicates promising robustness for practi-
cal clinical implementation.
Introduction
According to the 2015 World Alzheimer Report, an estimated 46 million
people worldwide live with dementia at a total cost of over $818 billion,
which is estimated to rise to a trillion dollars by 2018 (Alzheimer’s
Disease International, 2015). Alzheimer’s disease (AD) is one of the
most common causes of dementia, accounting for about 60% of the total.
The disease presents a tremendous burden and challenge to public health,
health care delivery, social services and the family (Alzheimer’s Disease
International, 2015). AD usually develops in situ while the patient is
cognitively normal. At some point in time, sufficient brain damage
accumulates to result in cognitive symptoms, which may further
deteriorate to disability and ultimately death. There is currently no
effective cure to reverse the damage caused by Alzheimer’s. Treatments
mainly aim to ease cognitive symptoms, delay progression and improve
quality of life via assistive technologies. Therefore, it is crucial to
diagnose or predict AD as early as possible so that treatment can start
early, which helps patients maintain cognitive functionality.
∗Data used in preparation of this article were obtained from the
Alzheimer’s Disease Neuroimaging Initiative (ADNI) database
(adni.loni.usc.edu). As such, the investigators within the ADNI
contributed to the design and implementation of ADNI and/or provided
data but did not participate in analysis or writing of this report.
Copyright © 2017, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
Clinical diagnosis of AD often includes establishing the
presence of dementia, amnesia and a deficit in one or
more cognitive functions, such as skilled movements (limb
apraxia), language (aphasia) or executive function (e.g.,
planning, attention and abstract reasoning) (Scott and Barrett
2007; American Psychiatric Association, 2013). The diag-
nosis process is complex, which involves a number of as-
sessments, e.g. medical history review, physical examina-
tion, neurological examination, cognitive testing, laboratory
testing and brain imaging. Physicians usually evaluate the
above mentioned tests based on experience with quantitative
guidelines. It is very challenging especially for early AD pa-
tients without clear cognitive symptoms.
With recent advances in artificial intelligence, evidence
has shown that effective application of machine learning al-
gorithms can greatly improve the efficiency of many tasks.
Machine learning offers valuable tools for advanced diag-
nostic techniques, which can assist the clinicians to bet-
ter understand the information underlying the high dimen-
sional heterogeneous medical variables. Automatic diagno-
sis of AD can be formulated as a classification problem. The
problem is particularly challenging due to the inherent dif-
ficulty in distinguishing between normal aging, mild cogni-
tive impairment (MCI), and early signs of AD. For example,
patients with dementia may not complain of cognitive dif-
ficulty owing to loss of self-awareness, while patients with
depression often complain of memory difficulties and seek
medical attention on their own initiative (Scott and Barrett
2007).
Because of the fast development of medical technology, a number of
medical tests have been developed for AD analysis, which yield a large
amount of high dimensional heterogeneous data. However, due to privacy
and legal issues of clinical research, the study cohort (i.e. the number
of available patients) is relatively small. If well trained,
state-of-the-art machine learning algorithms, e.g. deep learning, can
usually achieve very good performance (Hinton et al. 2012), but they
require a large amount of training material, i.e. a large cohort.
Therefore, we have to balance algorithm complexity against the amount of
data required. In this paper, we propose a hybrid system which combines
manifold learning and random vector functional link network (RVFL) to
better capture high dimensional nonlinear information from clinical data.
Unlike a traditional Artificial Neural Network (ANN), RVFL sets most of
its parameters completely at random, so they do not need to be tuned
during training. In addition, manifold learning is integrated as part of
the system, which helps to construct a better representation of the high
dimensional heterogeneous data. Combined with manifold learning, RVFL is
able to obtain a satisfactory approximation of the original problem. It
is particularly suitable for problems where only limited data are
available.
Related Work
Alzheimer’s disease causes progressive damage to the human brain,
causing massive brain cell death and thus atrophy in various brain
regions. Magnetic resonance imaging (MRI) techniques utilize strong
magnetic fields to form anatomical images of the body, which provides a
valuable tool to directly observe brain changes such as cerebral atrophy
or ventricular expansion. Therefore, MRI has become one of the most
widely used means to assist AD diagnosis. A large amount of work has
been done on applying image processing techniques to MRIs. For example,
Keraudren et al. proposed to use Scale-Invariant Feature Transform
(SIFT) to analyze brain atrophy (Keraudren et al. 2013). Another
important approach is to establish a 3D brain model and extract
volumetric information of various brain regions. Freesurfer is one of
the most commonly used packages for the analysis and visualization of
structural and functional neuroimaging data (Dale, Fischl, and Sereno
1999). In our current implementation, brain volume information together
with genome and demographics (age, gender, education) forms the feature
vector.
The diagnosis of Alzheimer’s disease can be formulated as a
classification problem, where the clinical diagnosis can serve as labels
and the high dimensional medical variables can serve as features.
Accordingly, a number of related works have been reported during the
past decades. Lebedev et al. utilized the random forest algorithm for AD
diagnosis (Lebedev et al. 2014). López et al. utilized support vector
machine (SVM) to detect early signs of AD (López et al. 2011). Feature
selection algorithms, e.g. statistical significance, are widely used for
dimension reduction in clinical studies. Recently, manifold learning
algorithms have been introduced to relevant studies. Conventional
manifold learning refers to nonlinear dimensionality reduction methods
based on the assumption that high-dimensional input data are sampled
from a smooth manifold so that one can embed these data into the
low-dimensional manifold while preserving some structural (or geometric)
properties that exist in the original input space (Lin and Zha 2008).
Instead of removing redundant feature dimensions, manifold learning
algorithms construct a low dimensional representation based on the
original data. López et al. implemented PCA as part of their system
(López et al. 2011). Dai et al. proposed an improved isometric mapping
algorithm for feature embedding and utilized ensemble learning
algorithms for similar tasks (Dai et al. 2015; 2016a).
Recently, deep learning has become one of the most powerful machine
learning techniques, showing superior performance in various practical
applications, e.g. natural language processing and image recognition
(Hinton et al. 2012; LeCun, Bengio, and Hinton 2015). Inspired by its
promising performance, researchers have been trying to apply deep
learning in dementia research. Li et al. proposed a hybrid system which
combined principal component analysis (PCA) and a deep learning
autoencoder to extract discriminative features for AD diagnosis (Li et
al. 2015). Payan et al. proposed a deep learning algorithm based on 3D
convolution of MRI images (Payan and Montana 2015). Dai et al. utilized
multilayer perceptron (MLP) for AD diagnosis and prognosis (Dai et al.
2016b). It has to be noted that all the above mentioned works mainly
implement a relatively ‘easy’ or ‘shallow’ version of the deep learning
algorithms. They have only a small number of hidden layers and barely
use any complex structures, e.g. convolution layers. This is mainly due
to the fact that the study cohort is relatively small compared with the
imaging databases used in deep learning studies. Therefore, we have to
balance algorithm complexity against the issues caused by limited
training data.
In this paper, we study the random vector functional link network
(RVFL), in which only the output weights are chosen as adaptable
parameters, while the remaining parameters are constrained to random
values independently selected in advance (Husmeier 1999). RVFL
simplifies the artificial neural network to a linear regression problem
on top of a series of randomly assigned transition functions (hidden
layers), which is an efficient approximation of the original nonlinear
optimization problem.
Methods
Problem Formulation
As the patient develops AD, there are pathological changes in various
regions of the human brain, which can be measured by volumetric changes.
Besides, medical history, laboratory testing, physical examination and
cognitive testing are all closely related to the final diagnosis
(American Psychiatric Association, 2013). All those medical variables
form the original feature vector, $f$. There are $N_p$ participants in
total. For each participant $u$, the diagnostic analysis is repeated
(follow-up medical tests) every 6 months, which yields $r \times n$
features arranged as a vector $f_{1 \times (r \cdot n)}$, where $n$ is
the number of features obtained in each test and $r$ is the number of
follow-up tests.
There are generally two problems in Alzheimer’s disease
research, i.e. diagnosis and prognosis. Diagnosis intends to
identify if the patient is cognitively normal or AD (see Prob-
lem 1). Prognosis is to tell how the patient will evolve in
AD pathology (see Problem 2). For example, if the patient
is currently healthy, the prognosis task is to determine if the
patient will stay healthy or likely develop AD.
Problem 1 (Diagnosis) Given different patients, described
as feature B, how to decide the patient’s pathological sta-
tus, e.g. Healthy, Mild Cognitive Impairment (MCI), or
Alzheimer’s disease (AD)?
Problem 2 (Prognosis) Given a patient, described as B,
and his/her historical mental status label, D, how to pre-
dict if the patient will stay at the same stage or progress in
the pathological path?
The aging process can be understood as an interactive process between
the human body and the environment, described as a sequence of medical
variables. Diagnosis reveals the current cognitive status and is thus a
classification problem. For prognosis, a straightforward approach is to
construct a time series model, e.g. a Hidden Markov Model, to capture
the temporal evolution trajectory. Nevertheless, due to the lack of
valid data, it is very difficult to fully train advanced time series
models. We therefore formulate the prognosis problem as a classification
problem based on the sequence of clinical diagnoses from ADNI. The
prognosis is generated as ‘progression’ or ‘no progression’. In the
prognosis task, we group the current and preceding observations,
$\{B_{t,f}, B_{t-1,f}\}$, to form the new feature vector so as to
account for the temporal cognitive changes.
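The pairing of consecutive visits described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the array layout, the integer stage coding (0 = HC, 1 = MCI, 2 = AD), the function name `build_prognosis_pairs`, and the rule that a pair is labelled ‘progression’ when the stage at the following visit is later than at the current visit are all hypothetical choices.

```python
import numpy as np


def build_prognosis_pairs(visits, stages):
    """Pair each visit with its predecessor for one patient.

    visits: array of shape (r, n), one row of n features per follow-up visit.
    stages: length-r integer stages (assumed coding: 0=HC, 1=MCI, 2=AD).
    Returns (X, y): X has shape (r-2, 2n) holding [B_t, B_{t-1}];
    y[i] is 1 if the stage at visit t+1 is later than at visit t, else 0.
    """
    visits = np.asarray(visits, dtype=float)
    stages = np.asarray(stages)
    if visits.shape[0] < 3:
        # Not enough visits to form a pair plus a following outcome.
        return np.empty((0, 2 * visits.shape[1])), np.empty(0, dtype=int)
    # Current visit t = 1..r-2 next to preceding visit t-1 = 0..r-3.
    X = np.hstack([visits[1:-1], visits[:-2]])
    # Assumed label: did the diagnosis move to a later stage at visit t+1?
    y = (stages[2:] > stages[1:-1]).astype(int)
    return X, y


# Toy patient (synthetic): 4 visits, 3 features each, MCI -> AD at the last visit.
toy_visits = np.arange(12, dtype=float).reshape(4, 3)
toy_stages = [1, 1, 1, 2]
X_pairs, y_pairs = build_prognosis_pairs(toy_visits, toy_stages)
```

Concatenating the two visits keeps the model a plain classifier while still exposing the change between visits, which is the motivation given in the text for avoiding a full time series model.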
Problem 3 (Healthy Aging) Given different patients, described as
feature B, what causes some people to develop AD while other people
remain healthy?
Moreover, we investigate a third problem about healthy aging within our
proposed framework. With aging, various functions of the human body
gradually degrade. The healthy aging problem intends to explain the
differences that characterize various aging pathways, i.e. healthy or
dementia.
Figure 1 shows the diagram of our proposed system. Our
proposed system mainly consists of two parts, i.e. mani-
fold learning and Random vector functional link network
(RVFL), which will be discussed in detail in the following
sections.
Manifold Learning
The available clinical data are high dimensional and heterogeneous, and
are obtained from different sources, e.g. medical imaging and blood
tests. It is of vital importance to preprocess the data so as to remove
noise, normalize scaling factors, etc. Another important step is to
reduce the feature dimension. Because of the curse of dimensionality,
the amount of data required to fully represent the hidden mechanism
increases exponentially as the feature dimension increases. Therefore,
dimension reduction is one of the simplest and most effective ways to
boost system performance.
Manifold learning algorithms are designed to construct a low dimensional
representation, which preserves the topological or structural properties
of the original data (Lee and Verleysen 2007). In this paper, we compare
the performance of different manifold learning algorithms in our RVFL
based framework, including Principal Component Analysis (PCA),
Neighborhood Preserving Embedding (NPE) (Xiaofei He et al. 2005),
Locality Preserving Projections (LPP) (Xiaofei He 2003) and stochastic
proximity embedding (SPE). NPE and SPE are based on a neighborhood
graph. Our experimental results show that LPP performs better in our
current framework.
[Figure 1: Schematic overview of the proposed methodology. Inputs
(medical imaging; other biomarkers: amyloid-β, etc.; demographic
information: age, ApoE, etc.) pass through manifold learning and the
Random Vector Functional-Link network (RVFL) to give diagnosis and
prognosis, and aging speed.]
Locality Preserving Projection (LPP) is a linear approximation of the
nonlinear Laplacian Eigenmap (Belkin and Niyogi 2001). A neighborhood
graph is first constructed with weights defined as
$$W_{i,j} = e^{-\frac{\|f_i - f_j\|}{t}} \tag{1}$$
where $f_i$ is the feature vector of the $i$-th patient. Then, the LPP
embedding can be calculated by solving the generalized eigenvector
problem
$$F L F^T a = \lambda F D F^T a \tag{2}$$
where $F$ is the matrix whose columns are the feature vectors $f_i$,
$D$ is a diagonal matrix whose entries are column sums of $W$,
$D_{ii} = \sum_j W_{ji}$, and $L = D - W$ is the Laplacian matrix.
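Equations (1) and (2) can be illustrated with a short NumPy sketch. Several details here are assumptions rather than the authors' settings: the k-nearest-neighbour graph with `k=5`, the kernel width `t=1.0`, the small ridge term `reg` added for numerical stability, and the function name `lpp`. The heat kernel follows Equation (1) as printed (unsquared distance), and, as in standard LPP, the projection is taken from the eigenvectors with the smallest eigenvalues.

```python
import numpy as np


def lpp(X, n_components=2, k=5, t=1.0, reg=1e-6):
    """Locality Preserving Projection sketch.

    X: (N, d) data matrix, one patient per row, so F = X.T in Eq. (2).
    Returns (A, eigvals): A is a (d, n_components) projection matrix.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                       # centre the features
    N, d = X.shape

    # Pairwise Euclidean distances between patients.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Symmetric k-nearest-neighbour graph (column 0 of argsort is the point itself).
    neighbours = np.argsort(dist, axis=1)[:, 1:k + 1]
    mask = np.zeros((N, N), dtype=bool)
    mask[np.repeat(np.arange(N), k), neighbours.ravel()] = True
    mask = mask | mask.T

    W = np.where(mask, np.exp(-dist / t), 0.0)   # Eq. (1) on graph edges
    D = np.diag(W.sum(axis=0))                   # column sums of W
    L = D - W                                    # graph Laplacian

    F = X.T
    S = F @ L @ F.T
    M = F @ D @ F.T + reg * np.eye(d)            # ridge keeps M positive definite

    # Reduce S a = lambda M a to a standard symmetric problem via M = C C^T.
    C = np.linalg.cholesky(M)
    C_inv = np.linalg.inv(C)
    eigvals, eigvecs = np.linalg.eigh(C_inv @ S @ C_inv.T)   # ascending order
    A = C_inv.T @ eigvecs[:, :n_components]
    return A, eigvals[:n_components]


# Synthetic data only, used to show the call pattern.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 8))
A_demo, lam_demo = lpp(X_demo, n_components=3)
Z_demo = (X_demo - X_demo.mean(axis=0)) @ A_demo   # embedded patients
```

Because the projection is a single matrix `A_demo`, new patients can be embedded with one matrix product, which is the practical advantage of LPP being a linear approximation.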
Random vector functional-link (RVFL) network
The idea of functional link network was suggested by Pao
and co-workers in 1988 (Klassen, Pao, and Chen 1988). A
typical artificial neural network consists of a linear link of
inputs together with nonlinear activation functions, while
Pao and co-workers suggested that a link can also be non-
linear. In a semi-linear net, the pattern vector at any layer is
multiplied linearly by a matrix of link weights to yield the
vector input to the next layer. Pao suggested that a nonlinear
functional transform be carried out along a nonlinear func-
tional link to yield a new pattern vector in a larger space
(Klassen, Pao, and Chen 1988). Since functional link net-
work incorporates nonlinearity by variations of additional
input nodes, generally ‘flat’ nets with no hidden layers are
sufficient for most practical tasks (Klassen, Pao, and Chen
1988).
Random vector functional link network (RVFL) is one of the practical
implementations of functional link network. It is a multilayer
perceptron (MLP) in which only the output weights are chosen as
adaptable parameters, while the remaining parameters are constrained to
random values independently selected in advance (Husmeier 1999). A
standard single hidden layer neural network can be modeled as
$$o_i = \sum_j \beta_j G(A_j x + b_j) \tag{3}$$
The random-vector version of the functional-link net generates $A$ and
$b$ randomly and learns only $\beta$.
[Figure 2: Diagram of (a) hidden-layer net and (b) functional-link
network architectures. (a) Hidden layer neural network: d input nodes,
N hidden nodes, k output nodes. (b) Random vector functional link
network (RVFL): d input nodes, N random neurons (equivalent to hidden
nodes), k output nodes.]
Figure 2(b) shows the architecture of a single hidden layer RVFL net.
Although random vectors (or features) have generally been believed to be
less powerful than learned features, they have shown reasonable success
in many practical applications (Saxe et al. 2011; Rahimi and Recht
2008). Recently, Huang et al. further developed RVFL into the extreme
learning machine (ELM), which achieves very promising results with a
simple network structure (Huang, Zhu, and Siew 2006). Random vectors can
significantly reduce algorithm complexity, since a large number of the
parameters are randomly selected and do not need to be tuned. The RVFL
can also be considered to consist of input and output layers, but no
hidden layers. The input layer then holds enhanced input values, which
are created by applying various functional links to the original input
values.
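The idea behind Equation (3), random $A$ and $b$ with only $\beta$ learned, reduces training to a linear least-squares problem. The sketch below shows this under assumptions of our own: a `tanh` activation for $G$, Gaussian random weights, a small ridge penalty, and the hypothetical class name `RandomVectorNet`. As in the ELM variant, direct links from inputs to outputs are omitted. It is not the Python-ELM implementation used in the paper.

```python
import numpy as np


class RandomVectorNet:
    """Single-hidden-layer net with random, fixed hidden weights (ELM-style)."""

    def __init__(self, n_hidden=50, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.seed = seed

    def _hidden(self, X):
        # G(A x + b) for every sample; A and b are never trained.
        return np.tanh(X @ self.A + self.b)

    def fit(self, X, T):
        X = np.asarray(X, dtype=float)
        T = np.asarray(T, dtype=float)
        rng = np.random.default_rng(self.seed)
        self.A = rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Only beta is learned: ridge-regularised least squares on H.
        gram = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(gram, H.T @ T)
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta


# Synthetic regression target, used only to exercise the class.
x_demo = np.linspace(-3.0, 3.0, 200)[:, None]
t_demo = np.sin(x_demo).ravel()
net = RandomVectorNet(n_hidden=50).fit(x_demo, t_demo)
mse_demo = float(np.mean((net.predict(x_demo) - t_demo) ** 2))
```

Since fitting is one linear solve, the number of trained parameters equals the number of random neurons per output, which is what makes the approach attractive for small cohorts.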
The key advantage of RVFL related approaches is the ability to obtain
promising results with limited data, where state-of-the-art algorithms,
e.g. deep learning, probably cannot be properly trained. In our present
implementation, we use the Extreme Learning Machine (ELM) version of
RVFL (see footnote 1).
Healthy Aging
Although people follow different aging paths (healthy or dementia),
there will always be degradation in various functions of daily living.
Patients with dementia ‘move faster’ in the aging process than their
healthy aging counterparts. Therefore, a straightforward approach to
evaluating cognitive health is to study the ‘pathological’ aging status.
A regression model is constructed based on RVFL to estimate the
patient’s age,
$$A = f(B) \tag{4}$$
where $B$ is the feature matrix consisting of the relevant medical
variables. The model is trained using the healthy participants. We
assume that the predictive power of our proposed algorithm is reasonably
good, so that the predicted age reflects the actual status of the
patient’s brain. When applied to an AD patient, the prediction is the
age a healthy person with the same measurements would be expected to
have.
Then, we study the difference between the predicted age and the actual
age,
$$A_{dif} = A_p - A_{real} \tag{5}$$
where $A_{real}$ is the actual age and $A_p$ is the predicted age. Based
on our study, $A_{dif}$ follows a normal distribution. It reflects the
aging speed of the target patient and shows different statistical
properties for different AD stages. More details will be given in the
results section.
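The procedure in Equations (4) and (5) can be sketched as follows. To stay short, the sketch uses ordinary least squares as a stand-in for the RVFL regressor $f$. The data are synthetic, with a single made-up feature, and the function names `fit_age_model` and `age_difference` are hypothetical. None of the numbers relate to ADNI.

```python
import numpy as np


def fit_age_model(B_healthy, age_healthy):
    """Fit A = f(B) (Eq. 4) on healthy participants only.

    Least squares with an intercept is an assumed stand-in for RVFL.
    """
    B1 = np.column_stack([np.ones(len(B_healthy)), B_healthy])
    w, *_ = np.linalg.lstsq(B1, age_healthy, rcond=None)
    return w


def age_difference(w, B, age_real):
    """A_dif = A_p - A_real (Eq. 5) for any group of participants."""
    B1 = np.column_stack([np.ones(len(B)), B])
    return B1 @ w - age_real


# Synthetic cohort: one feature that grows with age; the 'patient' group
# is simulated as if its brain were five years older than its actual age.
rng = np.random.default_rng(0)
age_hc = rng.uniform(55, 90, size=300)
age_ad = rng.uniform(55, 90, size=300)
B_hc = (0.1 * age_hc + rng.normal(0, 0.2, size=300))[:, None]
B_ad = (0.1 * (age_ad + 5) + rng.normal(0, 0.2, size=300))[:, None]

w_age = fit_age_model(B_hc, age_hc)
dif_hc = age_difference(w_age, B_hc, age_hc)
dif_ad = age_difference(w_age, B_ad, age_ad)
```

Training on the healthy group only is what gives the sign of `dif_ad` its meaning: a positive mean indicates that the simulated patients look older than their actual age to a model of healthy aging.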
Results and Discussion
In this section, detailed descriptions of the database and experiment
settings are presented. Extensive comparisons are made to show the
performance of our proposed algorithm.
Data acquisition and pre-processing
Verification tests are performed based on the Alzheimer’s Disease
Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI
was launched in 2003 by the National Institute on Aging (NIA), the
National Institute of Biomedical Imaging and Bioengineering (NIBIB),
the Food and Drug Administration (FDA), private pharmaceutical
companies and non-profit organizations, as a $60 million, 5-year
public/private partnership. The primary goal of ADNI has been to test
whether serial magnetic resonance imaging (MRI), positron emission
tomography (PET), other biological markers, and clinical and
neuropsychological assessment can be combined to measure the
progression of mild cognitive impairment (MCI) and early Alzheimer’s
disease (AD) (Weiner et al. 2012).
The Principal Investigator of this initiative is Michael W. Weiner, MD,
VA Medical Center and University of California - San Francisco. ADNI is
the result of efforts of many co-investigators from a broad range of
academic institutions and private corporations, and subjects have been
recruited from over 50 sites across the U.S. and Canada. The initial
goal of ADNI was to recruit 800 subjects but ADNI has been followed by
ADNI-GO and ADNI-2. For up-to-date information, see www.adni-info.org.
1 The implementation code is provided by the authors at
https://github.com/dclambert/Python-ELM (Huang, Zhu, and Siew 2006).
Experiment setup
As described in previous sections, there are multiple feature points
for the same patient, corresponding to the patient’s different visits
to the ADNI site. After removing invalid entries, there are 2158 data
points in total, with 636 Healthy Control (HC) records, 1056 MCI
records, and 466 AD records. Ten-fold cross validation is used in our
experiments. Comparison is made against Multilayer Perceptron (MLP),
Support Vector Machine (SVM), Random Forest (Breiman 2001) and Decision
Tree. The optimal result of MLP is achieved with 2 hidden layers. SVM
is implemented with a radial basis function (RBF) kernel.
Experimental results
Aging Speed
Alzheimer’s disease is a geriatric disease, and thus age is a strong
risk factor for the disease. Since aging is (to the best of our
knowledge) inevitable, what makes the difference is aging speed. Aging
speed is defined as the difference between structural age (predicted
age) and demographic age (actual age). A higher aging speed indicates a
higher likelihood of (or vulnerability to) dementia. We apply the
proposed algorithm to a regression task to estimate the patient’s age
based on the healthy control set. Then, we calculate the estimation
difference, $A_{dif}$ in Equation (5). Figure 3 shows the results. It
can be seen that at younger ages (e.g. <70) the predicted age tends to
be smaller than the actual age, while at older ages (e.g. >70) the
predicted age tends to be larger than the actual age. This is mainly due
to the fact that age is a strong indicator of AD. Therefore, there are
more occurrences in the senior population. In our current experiment
settings, mean($A_{dif}$) = −1.33 and std($A_{dif}$) = 8.72. Based on
normality tests, i.e. the Kolmogorov-Smirnov test and the
Anderson-Darling test, the difference between predicted age and real age
can be modeled by a normal distribution.
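The two normality checks named above can be run with SciPy as sketched below. The samples are synthetic: one is drawn from a normal distribution using the reported mean and standard deviation purely as illustrative parameters, the other is a deliberately bimodal contrast case. The function name `normality_report` is hypothetical. Because the Kolmogorov-Smirnov test is applied after standardising with the sample's own mean and standard deviation, its p-value is only approximate.

```python
import numpy as np
from scipy import stats


def normality_report(a_dif):
    """Kolmogorov-Smirnov and Anderson-Darling statistics for one sample."""
    a = np.asarray(a_dif, dtype=float)
    # Standardise, then compare against the standard normal CDF.
    z = (a - a.mean()) / a.std(ddof=1)
    ks = stats.kstest(z, "norm")
    ad = stats.anderson(a, dist="norm")
    return {
        "ks_stat": float(ks.statistic),
        "ks_p": float(ks.pvalue),
        "ad_stat": float(ad.statistic),
    }


# Synthetic samples only; these are not the study's A_dif values.
rng = np.random.default_rng(0)
normal_like = rng.normal(-1.33, 8.72, size=2000)
bimodal = np.concatenate([rng.normal(-5, 0.5, 1000), rng.normal(5, 0.5, 1000)])

rep_normal = normality_report(normal_like)
rep_bimodal = normality_report(bimodal)
```

A small statistic (and, for the KS test, a large p-value) is compatible with normality, while the bimodal sample shows what a clear rejection looks like.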
[Figure 3: Predicted age vs. actual age. Axes: Age and Predicted Age
(ticks 50 to 90); legend: HC, MCI, AD, Predicted Age ± std.]
The mean estimation differences, mean($A_{dif}$), for HC, MCI and AD are
−1.02, 1.65 and 2.44, respectively. It can be seen that HC participants
tend to be older than the predicted age, while MCI and AD patients tend
to be younger than the predicted age. The physical meaning of the
results is that healthy people possess a younger brain (or one at nearly
the same age), whereas the brain of an AD patient seems older than the
actual age. Besides, MCI patients are more concentrated around the
predicted age, while AD patients are more scattered. Different
pathological phases show different statistical properties. Although HC,
MCI and AD participants show different mean estimation differences, the
corresponding standard deviations are relatively large: 4.35, 6.31 and
6.71, respectively. There is large overlap between neighboring
categories. This is mainly due to the complex pathology of AD.
Although severe brain damage can lead to cognitive disorder, there is
no direct causal relationship between structural anomaly and dementia
symptoms in geriatric cohorts. The underlying anatomical mechanism of
dementia (cognitive disorder) is still unclear. The majority of the
input features are brain volumes extracted from MRIs. Therefore, our
proposed framework is better suited to identifying structural
anomalies. When it comes to cognitive disorder, cognitive assessment
scores, e.g. the Mini Mental State Examination (MMSE), may be more
suitable, since they directly address all the criteria (in the form of
interactive questionnaires) of clinical dementia diagnosis. However,
the objective of our research project is to help identify risk factors
associated with brain structural changes and other related biomarkers,
while cognitive assessments treat internal pathological changes as a
black box. This work offers a valuable tool to model the aging process
as different pathways characterized by different aging speeds, which
can be calculated based on anatomical properties of the brain. It
explains how pathological brain aging affects the aging pathways of
different patients (Problem 3).
Automatic Diagnosis
One of the most important problems in AD study is diagnosis. The key
contribution of this paper is an automatic diagnosis system based on the
RVFL network. Manifold learning is incorporated as part of the system
to remove noise and construct a low dimensional representation of the
original high dimensional data. Figure 4 shows the comparison results.
In the experiments, all the results are based on RVFL. We compare the
results from principal component analysis (PCA), Neighborhood
Preserving Embedding (NPE), Locality Preserving Projections (LPP) and
stochastic proximity embedding (SPE). It can be seen that manifold
learning algorithms generally improve the performance of the
classification algorithm. In particular, NPE and LPP improve the system
performance by about 2%. The optimal results are achieved at about 40
selected dimensions.
Table 1 gives the experimental results. It can be seen that our proposed
algorithm achieves very promising results, with 93.28% overall accuracy.
The precisions for HC, MCI and AD are 92.91%, 92.21% and 96.64%,
respectively. The sensitivities for HC, MCI and AD are 94.81%, 95.36%
and 86.48%, respectively. Although the sensitivity for AD is relatively
low, 86.48%, AD is mainly misclassified as MCI (54 of 466 records)
rather than as HC (9 records), so these errors are not very harmful. It
has to be noted that the progression of AD is a gradual process, which
may take decades. The boundaries between different pathology phases are
relatively vague. Therefore, classification of the transition stages is
very challenging.
[Figure 4: Comparison of different manifold learning algorithms.
x-axis: Number of Selected Dimension (0 to 50); y-axis: Diagnosis
Accuracy (%) (30 to 100); curves: No manifold, PCA, NPE, LPP, SPE.]
Table 1: Confusion matrix, ten-fold cross validation.
                     Predicted class
True class           HC       MCI      AD       Rate (sensitivity)
HC                   603      31       2        94.81%
MCI                  37       1007     12       95.36%
AD                   9        54       403      86.48%
Rate (precision)     92.91%   92.21%   96.64%   93.28% (overall)
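The rates in Table 1 follow directly from the confusion matrix: sensitivity is the diagonal entry divided by its row total, precision is the diagonal entry divided by its column total, and overall accuracy is the sum of the diagonal divided by all records. The short sketch below recomputes them. The helper name `per_class_rates` is hypothetical.

```python
def per_class_rates(cm):
    """Return (sensitivity, precision, accuracy) from a square confusion matrix.

    cm[i][j] counts records of true class i predicted as class j.
    """
    n = len(cm)
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    sensitivity = [cm[i][i] / row_totals[i] for i in range(n)]
    precision = [cm[j][j] / col_totals[j] for j in range(n)]
    accuracy = sum(cm[i][i] for i in range(n)) / sum(row_totals)
    return sensitivity, precision, accuracy


# Counts copied from Table 1 (rows: true HC, MCI, AD).
table1 = [
    [603, 31, 2],
    [37, 1007, 12],
    [9, 54, 403],
]
sens1, prec1, acc1 = per_class_rates(table1)
```

The same helper applies unchanged to the two-class prognosis matrix in Table 3.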
Extensive comparison is made against Support Vector Machine (SVM),
Random Subspace (RS), Multilayer Perceptron (MLP), Random Forest (RF)
and Decision Tree (DT). Table 2 shows the results for the comparison
targets. It can be seen that our proposed algorithm achieves superior
results. The improvements in accuracy over SVM, MLP, RF and DT are
9.45, 3.95, 9.07 and 25.91 percentage points, respectively. Besides, we
also show that with only medical imaging data (‘Imaging’ in Table 2),
the proposed system achieves 84.13% accuracy. The integration of
multiple sources of medical data shows clear synergy and adds value to
the overall performance.
Table 2: Recognition results for comparison targets (%).
              SVM     MLP     RF      DT      Imaging
Accuracy      83.83   89.33   84.21   67.37   84.13
Rel. Imp.     9.45    3.95    9.07    25.91   9.15
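The ‘Rel. Imp.’ row of Table 2 can be reproduced as the difference, in percentage points, between the overall accuracy of the proposed system in Table 1 (93.28%) and each accuracy in the table. The check below makes that reading explicit. The variable names are illustrative only.

```python
# Overall accuracy of the proposed system (Table 1) and the accuracies in Table 2.
proposed_accuracy = 93.28
table2_accuracy = {"SVM": 83.83, "MLP": 89.33, "RF": 84.21, "DT": 67.37, "Imaging": 84.13}

# Difference in percentage points, rounded to two decimals as in the table.
improvement = {
    name: round(proposed_accuracy - acc, 2) for name, acc in table2_accuracy.items()
}
```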
Prognosis
Another important problem in Alzheimer’s disease research is prognosis,
i.e. the prediction of AD progression (Problem 2). In this task, only
patients with clear pathological progression are included in the
experiments. Besides, since we utilize coupled observations as input,
only those patients with at least 2 consecutive observations at the same
stage are chosen. There are 425 valid records in total in the prognosis
task.
Table 3: Confusion matrix for prognosis, ten-fold cross validation.
                     Predicted class
True class           Progression   No Prog.   Rate (sensitivity)
Progression          32            26         55.17%
No Prog.             1             366        99.73%
Rate (precision)     96.97%        93.37%     93.65% (overall)
Table 3 shows the confusion matrix for the prognosis task. It can be
seen that our proposed algorithm achieves a nearly perfect result in
predicting no progression, 99.73%. However, when it comes to
progression, the results are mixed. On the one hand, the proposed
algorithm achieves very high precision, 96.97%. On the other hand, the
sensitivity is relatively low, 55.17%. This is mainly due to the fact
that clinical diagnosis of the various stages of Alzheimer’s involves
many subjective decisions. Moreover, although the anatomical structure
of the brain is closely correlated with dementia symptoms, there is no
clear qualitative causal relationship between structural change and the
corresponding symptoms. Besides, as shown in the table, the data in the
prognosis problem are extremely unbalanced. The ‘progression’ class has
only 58 records against 367 for ‘No Progression’ (a ratio of about
0.16), which makes classifiers tend to overfit on the ‘No Progression’
class.
Conclusion
In this study, we show that Random Vector Functional-link
(RVFL) network is very suitable for Alzheimer’s disease
analysis, due to its ability to incorporate nonlinear relation-
ship with a single layer structure. The proposed algorithm
achieves very promising results in the diagnosis task. On
the other hand, our proposed algorithm achieves very high
prognosis precision with relatively low sensitivity. This may
be due to the complex nature of Alzheimer’s disease pathology.
Moreover, we present a novel analysis framework to study
the aging speed of the participants, which clearly follows the
biological aging process. Preliminary results are presented
to show the potential of aging speed as a strong indicator
for diagnosis and prognosis applications as well as a tool to
assess future machine learning algorithms. Future work will
be focused on the investigation of a complete aging model
to describe different aging styles.
Acknowledgments
Data collection and sharing for this project was funded
by the Alzheimer’s Disease Neuroimaging Initiative
(ADNI) (National Institutes of Health Grant U01
AG024904) and DOD ADNI (Department of Defense
award number W81XWH-12-2-0012). Please refer to
http://adni.loni.usc.edu/ for more details.
References
Alzheimer’s Disease International. 2015. World Alzheimer
Report 2015: The Global Impact of Dementia.
American Psychiatric Association. 2013. Diagnostic and
statistical manual of mental disorders. 5th edition.
Belkin, M., and Niyogi, P. 2001. Laplacian eigenmaps and
spectral techniques for embedding and clustering. In Ad-
vances in Neural Information Processing Systems 14, 585–
591. MIT Press.
Breiman, L. 2001. Random Forests. Machine Learning
45(1):5–32.
Dai, P.; Gwadry-Sridhar, F.; Bauer, M.; and Borrie, M. 2015.
A hybrid manifold learning algorithm for the diagnosis and
prognostication of Alzheimer’s disease. In AMIA 2015 An-
nual Symposium, San Francisco, CA, USA, 14-18 Nov.
Dai, P.; Gwadry-Sridhar, F.; Bauer, M.; and Borrie, M.
2016a. Bagging Ensembles for the Diagnosis and Prognosti-
cation of Alzheimer’s Disease. In the Thirtieth AAAI Confer-
ence on Artificial Intelligence (AAAI-16), Phoenix, Arizona,
USA, 12-17 Feb.
Dai, P.; Gwadry-Sridhar, F.; Bauer, M.; and Borrie,
M. 2016b. Longitudinal Brain Structure Changes
in Health/MCI Patients: A Deep Learning Approach
for the Diagnosis and Prognosis of Alzheimer’s Disease.
In Alzheimer’s Association International Conference
(AAIC), Toronto, ON, Canada, 24-27 July.
Dale, A.; Fischl, B.; and Sereno, M. I. 1999. Cortical
surface-based analysis: I. segmentation and surface reconstruction.
NeuroImage 9(2):179–194.
Hinton, G.; Deng, L.; Yu, D.; Dahl, G.; Mohamed, A.-r.;
Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, T.;
and Kingsbury, B. 2012. Deep Neural Networks for Acous-
tic Modeling in Speech Recognition: The Shared Views of
Four Research Groups. IEEE Signal Processing Magazine
29(6):82–97.
Huang, G.-B.; Zhu, Q.-Y.; and Siew, C.-K. 2006. Extreme
learning machine: Theory and applications. Neurocomput-
ing 70(1):489–501.
Husmeier, D. 1999. Neural Networks for Conditional Prob-
ability Estimation. Springer-Verlag London.
Keraudren, K.; Kyriakopoulou, V.; Rutherford, M.; Hajnal,
J. V.; and Rueckert, D. 2013. Localisation of the brain
in fetal MRI using bundled SIFT features. Medical image
computing and computer-assisted intervention : MICCAI ...
International Conference on Medical Image Computing and
Computer-Assisted Intervention 16(Pt 1):582–9.
Klassen; Pao; and Chen. 1988. Characteristics of the func-
tional link net: a higher order delta rule net. In IEEE In-
ternational Conference on Neural Networks, 507–513 vol.1.
IEEE.
Lebedev, A. V.; Westman, E.; Van Westen, G. J. P.; Kram-
berger, M. G.; Lundervold, A.; Aarsland, D.; Soininen, H.;
Koszewska, I.; Mecocci, P.; Tsolaki, M.; Vellas, B.; Love-
stone, S.; and Simmons, A. 2014. Random Forest ensem-
bles for detection and prediction of Alzheimer’s disease with
a good between-cohort robustness. NeuroImage: Clinical
6:115–25.
LeCun, Y.; Bengio, Y.; and Hinton, G. 2015. Deep learning.
Nature 521(7553):436–444.
Lee, J. J. A., and Verleysen, M. 2007. Nonlinear dimension-
ality reduction. Springer.
Li, F.; Tran, L.; Thung, K.-H.; Ji, S.; Shen, D.; and Li, J.
2015. A Robust Deep Model for Improved Classification of
AD/MCI Patients. IEEE journal of biomedical and health
informatics 19(5):1610–6.
Lin, T., and Zha, H. 2008. Riemannian manifold learning.
Pattern Analysis and Machine Intelligence, IEEE Transac-
tions on 30(5):796–809.
López, M.; Ramírez, J.; Górriz, J.; Álvarez, I.; Salas-Gonzalez,
D.; Segovia, F.; Chaves, R.; Padilla, P.; and
Gómez-Río, M. 2011. Principal component analysis-based
techniques and supervised classification schemes for the
early detection of Alzheimer’s disease. Neurocomputing
74(8):1260–1271.
Payan, A., and Montana, G. 2015. Predicting Alzheimer’s
disease: a neuroimaging study with 3D convolutional neural
networks. CoRR abs/1502.02506.
Rahimi, A., and Recht, B. 2008. Random features for large-
scale kernel machines. In Platt, J. C.; Koller, D.; Singer,
Y.; and Roweis, S. T., eds., Advances in Neural Information
Processing Systems 20. Curran Associates, Inc. 1177–1184.
Saxe, A.; Koh, P. W.; Chen, Z.; Bhand, M.; Suresh, B.; and
Ng, A. Y. 2011. On random weights and unsupervised
feature learning. In Getoor, L., and Scheffer, T., eds., Pro-
ceedings of the 28th International Conference on Machine
Learning (ICML-11), 1089–1096. New York, NY, USA:
ACM.
Scott, K. R., and Barrett, A. M. 2007. Dementia syndromes:
evaluation and treatment. Expert review of neurotherapeu-
tics 7(4):407–22.
Weiner, M.; Veitch, D.; Aisen, P.; Beckett, L.; Cairns, N.;
Green, R.; Harvey, D.; Jack, C.; Jagust, W.; Liu, E.; Mor-
ris, J.; Petersen, R.; Saykin, A.; Schmidt, M.; Shaw, L.;
Siuciak, J. A.; Soares, H.; Toga, A.; and Trojanowski, J.
2012. The Alzheimer’s Disease Neuroimaging Initiative: a
review of papers published since its inception. Alzheimer’s
& dementia : the journal of the Alzheimer’s Association 8(1
Suppl):S1–68.
He, X.; Cai, D.; Yan, S.; and Zhang, H.-J.
2005. Neighborhood preserving embedding. In
Tenth IEEE International Conference on Computer Vision
(ICCV’05), volume 2, 1208–1213. IEEE.
He, X., and Niyogi, P. 2003. Locality Preserving Projections.