Research Paper
Linguistic markers predict onset of Alzheimer's disease
Elif Eyigoz*, Sachin Mathur, Mar Santamaria, Guillermo Cecchi*, Melissa Naylor
IBM Thomas J. Watson Research Center, IBM Research, Yorktown Heights, NY 10598, United States
Pfizer Worldwide Research and Development, Cambridge, MA 02139, United States
Article History:
Received 27 April 2020
Revised 19 September 2020
Accepted 22 September 2020
Available online 22 October 2020
Background: The aim of this study is to use classification methods to predict future onset of Alzheimer's disease in cognitively normal subjects through automated linguistic analysis.
Methods: To study linguistic performance as an early biomarker of AD, we performed predictive modeling of future diagnosis of AD from a cognitively normal baseline of Framingham Heart Study participants. The linguistic variables were derived from written responses to the cookie-theft picture-description task. We compared the predictive performance of linguistic variables with clinical and neuropsychological variables. The study included 703 samples from 270 participants, out of which a dataset consisting of a single sample from each of 80 participants was held out for testing. Half of the participants in the test set developed AD symptoms before 85 years old, while the other half did not. All samples in the test set were collected during the cognitively normal period (before MCI). The mean time to diagnosis of mild AD was 7.59 years.
Findings: Significant predictive power was obtained, with AUC of 0.74 and accuracy of 0.70 when using linguistic variables. The linguistic variables most relevant for predicting onset of AD have been identified in the literature as associated with cognitive decline in dementia.
Interpretation: The results suggest that language performance in naturalistic probes exposes subtle early signs of progression to AD in advance of clinical diagnosis of impairment.
Funding: Pfizer, Inc. provided funding to obtain data from the Framingham Heart Study Consortium, and to support the involvement of IBM Research in the initial phase of the study. The data used in this study were supported by the Framingham Heart Study's National Heart, Lung, and Blood Institute contract (N01-HC-25195), and by grants from the National Institute on Aging (R01-AG016495, R01-AG008122) and the National Institute of Neurological Disorders and Stroke (R01-NS017950).
© 2020 Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license
1. Introduction
A key priority in Alzheimer's disease (AD) research is the identification of early intervention strategies that will decrease the risk, delay the onset, or slow the progression of disease. Early interventions can only be effectively tested and implemented if the population that stands to benefit can be identified. While many variables have been associated with risk of AD, there is still a great need for the development of cheap, reliable biomarkers of preclinical AD. Aging-related cognitive decline manifests itself in almost all aspects of language comprehension and production. Even seemingly mundane linguistic abilities, such as object naming, engage extensive brain networks [1]. As a result, these linguistic abilities can easily be disrupted, which makes language competence a sensitive indicator of mental dysfunction. The influential Nun Study [2] provided initial evidence of a correlation between lower linguistic performance early in life and higher incidence of cognitive decline and conversion rates to AD late in life.
The aim of this study is to test to what extent linguistic performance at a single time point can be utilized as a prognostic marker of conversion to AD. We used data from the Framingham Heart Study (FHS) [3], a large cohort longitudinal study spanning several decades. As a part of FHS, qualifying participants were administered a neuropsychological (NP) test battery in successive visits [4–6], which included the cookie-theft picture description task (CTT) from the Boston Aphasia Diagnostic Examination [7]. Picture description tasks are used to assess discourse in subjects with disorders such as aphasia and dementia, and CTT has become the most frequently used picture description task in clinical settings [8]. We applied computational techniques to extract linguistic variables from written responses to the CTT and compared their prognostic value with that of more traditional clinical variables that could easily be obtained in the screening period of a clinical trial, including NP test scores, demographic and genetic information, and medical history. Using the variables obtained when the participants were assessed to be cognitively normal, we developed models to predict whether or not a particular participant will develop MCI due to AD on or before 85 years old.
* Corresponding authors.
E-mail addresses: (E. Eyigoz), (G. Cecchi), (M. Naylor).
EClinicalMedicine 28 (2020) 100583
Our work differs significantly from the current literature on predicting future onset of AD in the following ways: First, our prediction is based on data collected while the participants were cognitively healthy. Second, we focus exclusively on variables readily attainable as part of the screening phase of an early-intervention trial and assess predictive performance using only linguistic metrics derived from a single administration of the Cookie Theft Task, a relatively simple and naturalistic language probe. Third, we utilized a machine learning approach to deal with a multivariate representation of linguistic performance. Finally, we compare the predictive ability of language features with that of more traditional variables associated with identification of high risk for AD (e.g., for inclusion in a clinical trial of potentially disease-modifying therapy).
2. Methods
2.1. Cognitive assessment in the Framingham heart study
The FHS is a well-documented, community-based cohort study initiated in 1948, with the purpose of longitudinal monitoring of participants' health [3,9]. Cognitive status monitoring of the original cohort began in 1975, and since 1981 the participants' cognitive status has been assessed with the Mini-Mental State Examination (MMSE) [10] at examinations taking place every 4 years [4,11]. Participants in the offspring cohort have undergone MMSEs since 1991, and have undergone NP examinations every 5 or 6 years since 1999 [6]. Annual neurologic and neuropsychological examinations were performed when cognitive decline was reported by a family member of the participant, upon referral by a physician or by the investigators of the FHS, or after review of the participant's medical records [11]. Cognitive status monitoring of the participants was reviewed by the Institutional Review Board of Boston University, and informed consent was obtained from the participants.
The neuropsychological test battery resulted in a dementia rating, which represents the impression of the examiner who administered the test battery [11]. The test battery included the cookie-theft picture description task (CTT) from the Boston Aphasia Diagnostic Examination, in which participants were asked to write down the description of the cookie-theft picture. As highlighted above, picture description tasks are commonly used to assess discourse in subjects with disorders such as aphasia and dementia, and, given its sensitivity to cognitive impairments, CTT has become the most frequently used picture description task in clinical settings [8]. The FHS study participants who qualified for inclusion in our study were among the oldest participants of the FHS, mostly from the original cohort, which was limited in its representativeness of the wider population [3].
A dementia-review panel with at least one neurologist and one neuropsychologist reviewed possible cognitive decline and dementia cases documented in the FHS [12,13]. Diagnosis of dementia was based on criteria from DSM-IV [14], and diagnosis of Alzheimer's disease was based on criteria from NINCDS–ADRDA [15,16].
2.2. Predictive modeling approach
To fit predictive models of future diagnosis of AD, we had to determine which participants to label as cases and which to label as controls. FHS participants varied in terms of whether their data was comprehensively reviewed by a panel of experts to determine dementia and AD status. We first identified a clinically defined test set by using these dementia reviews to label cases, selecting one CTT sample from each case, and matching it to a CTT sample collected from a control of approximately the same age, gender and level of education. Because the FHS data available to us included a dementia review for only 39% of participants, only 80 of the participants qualified for inclusion in this test set. This left a very large number of participants unused. While most of the participants did not have dementia review data allowing for definitive labeling of cases, a dementia rating was available for the majority of the administrations of the neuropsychological test battery. Using these dementia ratings, we used additional participants to create a training set. In semi-supervised learning terminology, the clinical dementia review provided the ground-truth labels of the test data, whereas the dementia ratings provided the weak labels of the training data. This weakly-labeled training set was used only for machine learning training.
We validated predictive models in two ways: the hold-out method and the cross-validation method. For the hold-out method, we made use of the weakly-labeled training data by fitting the model to weakly labeled training data and then validating it on the held-out ground-truth test data. For the cross-validation method we implemented 20-fold cross validation on the test data (see the Supplementary Material for details).
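The two validation schemes can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the helper `make_synthetic` and the 20% label-noise rate are assumptions made only for illustration, and scikit-learn's logistic regression stands in for whichever classifier is being evaluated. Only the set sizes (623 weakly labeled samples, 80 ground-truth samples) and the 20 folds come from the text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)

def make_synthetic(n_samples, n_features=5, label_noise=0.0):
    """Hypothetical data: balanced classes whose feature means differ slightly."""
    y_true = np.arange(n_samples) % 2
    X = rng.normal(size=(n_samples, n_features)) + 0.8 * y_true[:, None]
    flip = rng.random(n_samples) < label_noise  # simulate weak (noisy) labels
    return X, np.where(flip, 1 - y_true, y_true)

X_weak, y_weak = make_synthetic(623, label_noise=0.2)  # weakly labeled training set
X_test, y_test = make_synthetic(80)                    # ground-truth test set

# Hold-out method: train on weak labels, evaluate on ground truth.
clf = LogisticRegression(max_iter=1000).fit(X_weak, y_weak)
auc_holdout = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

# Cross-validation method: 20 folds within the ground-truth set only.
cv = StratifiedKFold(n_splits=20, shuffle=True, random_state=0)
proba_cv = cross_val_predict(
    LogisticRegression(max_iter=1000), X_test, y_test, cv=cv, method="predict_proba"
)[:, 1]
auc_cv = roc_auc_score(y_test, proba_cv)
```

In the hold-out scheme the ground-truth samples never influence the fitted model, whereas in the cross-validation scheme each ground-truth sample is predicted by a model trained on the other 19 folds.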
2.3. Selection of participants and samples
In this study, the onset of AD was defined as the onset of mild cognitive impairment (MCI) in a participant who later received a diagnosis of AD. MCI is a heterogeneous condition; however, for those MCI patients who eventually convert to AD, MCI is considered by many to represent early-stage AD [17–19]. AD patients who developed MCI on or before age 85 (denoted as ≤85) were defined as cases.
We defined the normal-aging group as the participants who were recorded to be dementia-free on or after age 85 (≥85). The control group was defined as the combination of the normal-aging group and AD patients whose onset of cognitive impairment was after 85 (>85) years old, as depicted in Fig. 1. According to this definition, all cases have already developed cognitive impairment due to AD at 85, and none of the controls have developed cognitive impairment due to AD at 85. Age 85 was chosen as the threshold because it provided the largest balanced test set from the FHS data that was available to this study. As the age threshold increases, fewer participants qualify to be controls, and more participants qualify to be cases. Conversely, as the age threshold decreases, more participants qualify to be controls, and fewer participants qualify to be cases. In addition to providing the largest test set from FHS, age 85 has been widely used as a threshold to define the "oldest-old" in AD studies [20]. It has been suggested that very-late-onset AD (VLOAD), defined by AD onset after the second half of the ninth decade, differs from earlier-onset AD with respect to genetic and environmental patterns: genetic risk factors for AD are more influential at relatively earlier ages, with decreasing influence as age increases, while environmental factors may play a larger role in developing VLOAD [20–22].
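The case and control definitions above amount to a simple decision rule, sketched below. The function name `label_participant` and its three arguments are hypothetical; the FHS data are not stored in this form, and the sketch ignores the further inclusion criteria described in the Supplementary Material.

```python
from typing import Optional

AGE_THRESHOLD = 85  # threshold used in the text

def label_participant(
    mci_onset_age: Optional[float],
    later_diagnosed_ad: bool,
    last_dementia_free_age: Optional[float],
) -> Optional[str]:
    """Return "case", "control", or None if the participant does not qualify.

    mci_onset_age: age at onset of cognitive impairment, if any (hypothetical field).
    later_diagnosed_ad: whether the participant later received an AD diagnosis.
    last_dementia_free_age: oldest age at which the participant was recorded
        as dementia-free (hypothetical field).
    """
    if later_diagnosed_ad and mci_onset_age is not None:
        # AD patients: onset on or before 85 -> case; onset after 85 -> control.
        return "case" if mci_onset_age <= AGE_THRESHOLD else "control"
    if last_dementia_free_age is not None and last_dementia_free_age >= AGE_THRESHOLD:
        # Normal-aging group: dementia-free on or after age 85.
        return "control"
    return None  # neither definition can be established

examples = [
    label_participant(82.0, True, None),   # AD, onset before 85
    label_participant(88.0, True, None),   # AD, onset after 85
    label_participant(None, False, 86.0),  # dementia-free at 86
    label_participant(None, False, 80.0),  # status at 85 unknown
]
```

Raising or lowering `AGE_THRESHOLD` shifts participants between the two groups in the way the paragraph describes, which is why the threshold determines the size of the largest balanced test set.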
The test data set included only one sample per participant, and each case sample was matched to a control sample using age (±2 years), gender, and education. As our purpose was to predict conversion in cognitively normal subjects, we included only samples collected prior to any cognitive impairment onset. Samples from participants who did not meet criteria for inclusion as ground-truth were used for training; see the Supplementary Material for the details. Fig. 2 shows a diagram summarizing the selection of participants and samples for the test set and the weakly-labeled training set. The demographics of participants in the test and training data sets can be seen in Table 1. For the test cases, the mean time to diagnosis with mild AD from cognitive normality was 7.59 years with standard deviation of 4.91, and the mean time to cognitive impairment onset from cognitive normality was 3.93 years with standard deviation of 3.69.
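The matching step can be sketched as follows. The matching algorithm itself is not specified in this section, so the greedy nearest-age strategy below is an assumption, as are the `Sample` record and the example values; only the criteria (age within 2 years, same gender, same education level) come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sample:
    participant_id: str  # hypothetical identifier
    age: float           # age at the time the CTT sample was written
    gender: str
    college: bool        # education dichotomized as college degree or not

def match_controls(cases, controls, max_age_gap=2.0):
    """Greedily pair each case with the closest-in-age unused control."""
    unused = list(controls)
    pairs = []
    for case in cases:
        candidates = [
            c for c in unused
            if c.gender == case.gender
            and c.college == case.college
            and abs(c.age - case.age) <= max_age_gap
        ]
        if not candidates:
            continue  # an unmatched case is left out of the test set
        best = min(candidates, key=lambda c: abs(c.age - case.age))
        unused.remove(best)  # each control sample is used at most once
        pairs.append((case, best))
    return pairs

# Made-up records, not FHS data.
cases = [Sample("c1", 78.0, "F", False), Sample("c2", 80.0, "M", True)]
controls = [
    Sample("n1", 79.5, "F", False),
    Sample("n2", 77.0, "F", False),
    Sample("n3", 90.0, "M", True),
]
pairs = match_controls(cases, controls)
```

Here the first case is paired with the control one year younger, and the second case stays unmatched because the only candidate of the same gender and education differs by ten years.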
Fig. 1. The diagram depicts the method for selecting cases vs. controls and the predictive model setting. Participants who developed MCI due to AD on or before age 85 were selected as cases, and participants who remained dementia-free until age 85 were selected as controls. The three predictive models included only non-linguistic variables, only linguistic variables, or both (see Table 3), collected when participants were considered cognitively normal, and were trained to predict conversion status by age 85 vs. later or no conversion.
Fig. 2. Method of selection of participants for creating a test data set and a training data set from the FHS data. The available data consisted of 3113 samples from 1254 participants, 486 of whom have been reviewed by a panel for dementia status. The participants who were reviewed by the panel were candidates for creating a test data set. Their samples were eliminated according to the inclusion criteria, and then the qualifying samples were passed through age, education and gender matching. This resulted in a test set of 80 samples. The participants who were not reviewed by the dementia review panel were used for creating a larger weakly-labeled data set, only for the purpose of machine learning training. Validation of predictive modeling consisted of the hold-out method (train on weak labels, test on ground truth), and cross-validation (train on ground truth, test on ground truth).
2.4. Psycholinguistic analyses
In this section, we provide an overview of the psycholinguistic analyses that were performed automatically for this study. See the Supplementary Material for the variables computed for the analyses presented in this section.
Verbosity, lexical richness, and repetitiveness were assessed by using metrics such as number of words, number of unique words, and frequencies of repetitions (Fig. 3). Misspellings, use of punctuation, and uppercasing were analyzed to assess writing performance and style. Language-modeling analyses were performed to model the distributions of word sequences. Syntactic complexity was assessed through analysis of parse trees. Semantic content was assessed through analysis of participants' mention of information content units. Finally, propositional idea density analysis was used to quantify syntactic and semantic complexity.
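The simplest of these metrics (verbosity, lexical richness, repetitiveness) can be computed with a few lines of code. The sketch below is illustrative only: the tokenization rule, the metric names and the example sentence are assumptions, and the actual variable definitions are those listed in the Supplementary Material.

```python
import re
from collections import Counter

def lexical_metrics(text: str) -> dict:
    """Count words, unique words, and repeated tokens in a written description."""
    words = re.findall(r"[a-z']+", text.lower())  # crude tokenizer (assumption)
    counts = Counter(words)
    n_words = len(words)
    n_unique = len(counts)
    return {
        "n_words": n_words,                                   # verbosity
        "n_unique_words": n_unique,                           # lexical richness
        "type_token_ratio": n_unique / n_words if n_words else 0.0,
        "n_repeated_tokens": sum(c - 1 for c in counts.values()),  # repetitiveness
    }

# Made-up sentence, not an FHS sample.
metrics = lexical_metrics("The boy is on the stool. The girl is laughing.")
```

For the example sentence this gives 10 words, 7 unique words, and 3 repeated tokens ("the" twice more, "is" once more).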
2.5. Non-linguistic variables
The non-linguistic variables are age, gender, education (dichotomized as college degree vs. no college degree, and high-school vs. no high-school degree), number of APOE ε4 alleles, two binary indicator variables capturing evidence of hypertension or diabetes, and variables resulting from the NP tests. The NP tests used in this study include assessment of visuospatial and executive reasoning, object naming, memory, attention, abstraction, and language skills. A total of 13 NP tests, as listed in Table 2, resulting in 32 NP variables (see Supplementary Table 6), were used in this study. Consequently, the comprehensiveness of the neuropsychological assessment used in this study surpasses that of concise assessments, such as the Montreal Cognitive Assessment (MoCA) [23]. The clinical measures MMSE and the dementia ratings were not included in the models, as all samples in the ground-truth labeled test set, for both controls and cases, were collected during the periods of cognitive normality and had no significant variance.
Fig. 3. CTT examples from FHS, including an unimpaired sample (a), an impaired sample showing telegraphic speech and lack of punctuation (b), and an even more impaired sample showing in addition significant misspellings and minimal grammatic complexity, e.g. lack of subjects (c).
Table 1
Age demographic and the education level of the ground-truth labeled and weakly-labeled data sets. The number of samples and the number of participants are the same in the ground-truth labeled set, as only one sample per participant was included.
                       Ground-truth labels                       Weak labels
Group    Subgroup      Age (mean ± SD)   Participants/Samples    Age (mean ± SD)   Samples   Participants
Control  Female        78.86 ± 6.01      22                      84.34 ± 5.18      326       86
Control  Male          79.0 ± 4.39       18                      83.72 ± 4.99      191       46
Case     Female        78.45 ± 5.36      22                      71.79 ± 5.1       61        29
Case     Male          79.17 ± 4.29      18                      73.76 ± 5.43      45        29
Control  No college    78.74 ± 5.89      23                      83.83 ± 5.14      204       60
Control  College       79.18 ± 4.48      17                      84.36 ± 5.08      313       72
Case     No college    78.22 ± 5.19      23                      72.59 ± 5.71      22        16
Case     College       79.53 ± 4.42      17                      72.63 ± 5.23      84        42
Total                                    80 / 80                                   623       190
2.6. Variable selection and training of predictive models
In total, 87 linguistic variables were computed (see Supplementary Table 6). Two NP test scores were excluded, because they were missing for more than half of the samples, leaving 31 NP test scores, three clinical and two demographic variables as non-linguistic variables. Before training predictive models, variable selection was performed strictly on the training data by using a univariate test between the preclinical AD cases and the control groups for each variable and eliminating variables that were not statistically significant (p ≥ 0.05). The t-test was used in the cross-validation experiments and the Wilcoxon signed rank test was used in the hold-out experiments. The use of different univariate tests for different experiment conditions was justified due to difference in data size and the noise in the weak labels. See Supplementary Material for details of the variable selection for the hold-out and the cross-validation methods. For training predictive models, we experimented with linear SVM, logistic regression and Naïve Bayes classifiers. The hyperparameters of the classifiers were set using nested cross-validation.
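A minimal sketch of univariate variable selection followed by hyperparameter tuning is given below. It is an illustration under stated assumptions, not the authors' code: the data are synthetic, `select_variables` is a hypothetical helper, only the t-test variant is shown, the grid of `C` values is arbitrary, and `GridSearchCV` shows just the inner tuning loop that a nested cross-validation would wrap in an outer evaluation loop.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

def select_variables(X, y, alpha=0.05):
    """Indices of columns that differ between cases and controls (t-test, p < alpha)."""
    _, p_values = ttest_ind(X[y == 1], X[y == 0], axis=0)
    return np.flatnonzero(p_values < alpha)

# Synthetic training data: 10 variables, only the first three are informative.
rng = np.random.default_rng(0)
y_train = np.arange(200) % 2
X_train = rng.normal(size=(200, 10))
X_train[:, :3] += 1.0 * y_train[:, None]

# Selection uses the training data only, never the held-out samples.
keep = select_variables(X_train, y_train)

# Inner loop: choose the regularization strength by cross-validated AUC.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",
)
search.fit(X_train[:, keep], y_train)
best_model = search.best_estimator_
```

At prediction time the same column indices in `keep` must be applied to the evaluation data before calling `best_model`, so that no information from the evaluation samples enters the selection.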
2.7. Longitudinal analysis of linguistic and NP variables
To identify possible longitudinal trends present in our multidimensional assessment of cognitive status, we implemented a factorization analysis, using all available samples from each eligible participant, taking into account the correlational structure between both linguistic variables and the NP test scores. For this, we used the cases and the normal-aging participants who have a record of cognitive impairment onset. We aligned their samples temporally by their cognitive impairment date. The frequency of administration of the NP exams varied across participants and was on average 2.2 years. In order to normalize this variance across participants, we created synthetic samples by linear interpolation with a frequency of six months. We then used Nonnegative Matrix Factorization (NMF) on the up-sampled dataset. To compare the progression of cases and the normal-aging group, we projected the latter onto the factors learned for the former. The projections of each sample on the first factor were computed, and then averaged over all samples in each six-month interval to obtain the loading of each interval. For this analysis, we used all NP variables, and linguistic variables that were statistically significant on the test set with t-test, that were statistically significant on the training set with Wilcoxon signed rank test, and linguistic variables that were statistically significant with the Cox proportional-hazards model analysis, which is described in the following section.
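The interpolation, factorization and projection steps can be sketched as follows. Everything specific here is an assumption made for illustration: the visit times and variable values are synthetic, the helper names are hypothetical, and the number of NMF components (2) is arbitrary because the text does not state it. scikit-learn's `NMF` is used as one available implementation.

```python
import numpy as np
from sklearn.decomposition import NMF

STEP = 0.5  # six months, in years

def resample_half_yearly(times, values, step=STEP):
    """Linearly interpolate one participant's visits onto a six-month grid.

    times: increasing visit times in years relative to impairment onset (0).
    values: non-negative array of shape (n_visits, n_variables).
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    start = np.ceil(times[0] / step) * step
    grid = np.arange(start, times[-1] + 1e-9, step)
    cols = [np.interp(grid, times, values[:, j]) for j in range(values.shape[1])]
    return grid, np.column_stack(cols)

rng = np.random.default_rng(0)
N_VARS = 6

def fake_participant(times):
    """Synthetic non-negative measurements at irregular visit times."""
    return times, np.abs(rng.normal(loc=2.0, size=(len(times), N_VARS)))

cases = [fake_participant([-6.0, -4.1, -1.8, 0.0]), fake_participant([-5.0, -2.5, 0.0])]
controls = [fake_participant([-6.0, -3.2, -1.0, 0.0])]

case_grids, case_blocks = zip(*(resample_half_yearly(t, v) for t, v in cases))
ctrl_grids, ctrl_blocks = zip(*(resample_half_yearly(t, v) for t, v in controls))

# Learn the factors on the cases, then project the controls onto them.
nmf = NMF(n_components=2, init="nndsvda", max_iter=2000, random_state=0)
W_cases = nmf.fit_transform(np.vstack(case_blocks))
W_controls = nmf.transform(np.vstack(ctrl_blocks))

def interval_loadings(grids, W, component=0):
    """Average the loading on one factor over all samples at each time point."""
    t = np.round(np.concatenate(grids), 1)
    return {float(u): float(W[t == u, component].mean()) for u in np.unique(t)}

case_trend = interval_loadings(case_grids, W_cases)
control_trend = interval_loadings(ctrl_grids, W_controls)
```

Note that the factorization never sees the time axis; time enters only afterwards, when the first-factor loadings are grouped by interval, which is what allows a temporal trend to be read as a finding rather than an input.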
2.8. Analysis of time to diagnosis with mild AD
To assess whether linguistic variables were associated with the time to diagnosis with mild AD, we used Cox proportional-hazards models. Date of mild AD diagnosis was obtained from the dementia review, and the participants who were recorded as dementia-free in their dementia review were censored. If a censored participant was alive at the date of the review, then the review date was used as the censor date. If the participant was no longer alive at the date of the review, then the oldest age at which the participant is known to be not demented was used as the censor date. Models for each single linguistic variable included as additional covariates age, gender, and education (i.e., college degree vs. no college degree).
2.9. Role of funding source
Pfizer, Inc. provided funding to obtain the data from the Framingham Heart Study Consortium, and to support the involvement of IBM Research in the initial phase of the study. The funder had no role in data analysis or interpretation, which were the responsibility of the contributing authors.
3. Results
Univariate tests of individual variables between cases and controls showed that future onset of AD was associated with telegraphic speech, repetitiveness, and misspellings (see Supplementary Table 3). Telegraphic speech, as exemplified in Fig. 3, is a common symptom of non-fluent aphasia. In telegraphic speech, grammatical structure is reduced or absent, such that language contains simplified phrases consisting mainly of content words, with morphology and function words largely missing [24,25]. As shown in the examples from cognitively impaired participants in Fig. 3, telegraphic speech is not only simpler in grammatical structure, but also marked by lack of determiners ("the", "a"), auxiliaries ("is", "are") and entire subjects. Furthermore, samples from impaired participants demonstrate misspellings and lack of punctuation.
Fig. 4. The ROC curve of the test set with the hold-out method for the linguistic-based model (see Table 3). This result was obtained by a Logistic Regression classifier.
Table 2
Neuropsychological tests (NP) included in the predictive models along with clinical and demographic variables, separately and in conjunction with linguistic variables. See Supplementary Table 1 for the full list of NP variables obtained through these NP tests.
Cognitive Domain                       Description
Word retrieval                         Boston Naming Test
Learning                               The paired associate learning subtest from the Wechsler Memory Scale (WMS)
Attention and concentration            Wechsler Adult Intelligence Scale (WAIS) score for digit span
Verbal memory                          The logical memory subtest from the Wechsler Memory Scale (WMS)
Premorbid intelligence                 The reading subset of the Wide range achievement test WRAT-3
Verbal ability and executive control   Verbal fluency
Attention and concentration            Trail making tests A and B
Abstract reasoning                     The similarities test from Wechsler Adult Intelligence Scale (WAIS)
Visuoperceptual organization           Hooper Visual Organization Test
Visual memory                          The visual reproduction subtest from the Wechsler Memory Scale (WMS-R)
Verbal Comprehension                   The information subset of the Wechsler Adult Intelligence Scale (WAIS-R)
Spatial visualization                  Block design test
Psychomotor speed                      Finger tapping test
Prediction performance in each experimental setting obtained by the best performing classifier is shown in Table 3. The plots showing the separation of the test and the training datasets by the best performing classifier reported in Table 3 can be seen in Supplementary Figure 2. In Table 3, all metrics for each experimental setting were obtained by the same classifier. The results obtained by other classifiers that are not reported in Table 3 can be found in Supplementary Table 2.
Supplementary Table 4 shows the weights assigned to the linguistic variables by the best performing classifier reported in Table 3 in the hold-out method. Similarly, Supplementary Table 5 shows the weights assigned to the non-linguistic variables by the best performing classifier reported in Table 3 in the hold-out method. We performed a step-wise classification analysis by ranking the variables with respect to the weights assigned to them by the best performing classifier, and by incrementally adding variables for classification until all variables with p-value <0.05 were exhausted. Supplementary Figure 4 shows that the highest AUC of 0.76 was obtained using the 10 highest-ranked linguistic variables, which can be found in Supplementary Table 4.
Table 3
Results of prediction experiments for the three models. AUC stands for the area-under-the-curve statistic. Accuracy is the ratio of correctly predicted samples to the total number of samples. Positive predictive value is the ratio of correctly predicted positive samples to the total predicted positive samples. Sensitivity is the ratio of correctly predicted positive samples to all observations in the patient class. All metrics for each experimental setting were obtained by the same classifier. The best performing classifiers in the hold-out experiments were Logistic Regression in the linguistic and non-linguistic settings, and Naïve Bayes with the combination of linguistic and non-linguistic features. The best performing classifiers in the CV experiments were Logistic Regression in the linguistic settings, and Naïve Bayes in the non-linguistic setting and the combination of linguistic and non-linguistic features.
                                   CV method             Hold-out method
Linguistic variables from single CTT samples during cognitive normalcy
  Best classifier                  Logistic Regression   Logistic Regression
  Accuracy                         0.65                  0.70
  AUC                              0.73                  0.74
  Positive predictive value        0.64                  0.74
  Sensitivity                      0.67                  0.62
Non-linguistic variables (age, gender, education, APOE, hypertension, diabetes, NP)
  Best classifier                  Naïve Bayes           Logistic Regression
  Accuracy                         0.60                  0.59
  AUC                              0.64                  0.60
  Positive predictive value        0.64                  0.61
  Sensitivity                      0.44                  0.48
Aggregation of linguistic and non-linguistic variables
  Best classifier                  Naïve Bayes           Naïve Bayes
  Accuracy                         0.67                  0.69
  AUC                              0.72                  0.67
  Positive predictive value        0.81                  0.71
  Sensitivity                      0.44                  0.62
Fig. 5. The results of the non-negative matrix factorization (NMF) analysis of the linguistic and the NP variables on longitudinal data. This plot demonstrates that the factorization of the variables without using time information reveals a temporal trend as well as a differentiation between cases and controls, which starts several years before cognitive impairment. The controls' samples are projected onto the factorization learned from the cases' samples and averaged over six-month intervals. Controls are shown in blue, and cases in red. The horizontal axis is years to/from cognitive impairment onset, where 0 stands for the date of cognitive impairment.
In order to assess statistical significance, we computed a null distribution of AUCs for chance classification outcomes, and applied z-statistics to estimate the probability of the AUCs obtained by the predictive models [26]. The z-score indicated that the AUC of 0.74 (see Fig. 4 for the ROC curve) corresponds to a 4.26-fold increase in predictability over chance (p<0.001). We observed a ten-point increase in accuracy obtained by adding linguistic variables to the non-linguistic variables (non-linguistic alone 0.59, combined 0.69). This indicates that linguistic variables offer significant information over the non-linguistic variables in terms of their predictive diagnostic ability. The ratio of z-scores relative to the null hypothesis indicated that the linguistic variables yielded a classification performance 2.4 times better than non-linguistic variables; the ratio of AUC gains with respect to chance provides a comparable value (0.19/0.09 = 2.11).
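One common way to obtain a null distribution of AUCs is to shuffle the labels repeatedly and recompute the AUC each time; the observed AUC is then expressed as a z-score against that distribution. The sketch below follows this permutation approach as an assumption, since the exact procedure is the one given in [26] and may differ. The scores are synthetic, and the function names are hypothetical.

```python
import random
import statistics

def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_z_score(labels, scores, n_permutations=1000, seed=0):
    """Observed AUC and its z-score against a label-permutation null distribution."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    null_aucs = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between labels and scores
        null_aucs.append(auc(shuffled, scores))
    mu = statistics.fmean(null_aucs)
    sigma = statistics.stdev(null_aucs)
    return observed, (observed - mu) / sigma

# Synthetic classifier scores for a balanced set of 80 samples (not study data).
data_rng = random.Random(1)
labels = [i % 2 for i in range(80)]
scores = [1.5 * y + data_rng.gauss(0.0, 1.0) for y in labels]
observed_auc, z = auc_z_score(labels, scores)
```

The null AUCs cluster around 0.5, so the z-score measures how many null standard deviations separate the observed AUC from chance; ratios of such z-scores are what the text uses to compare models.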
To examine the effects of education and sex on performance of the model using linguistic variables and the hold-out method, AUC scores were computed for participants with college degree vs participants without a college degree, and for females vs males. The participants with a college degree were harder to predict than participants without a college degree (AUC of 0.70 for college-degree vs 0.76 for no-college degree, see Supplementary Figure 3 for the ROC curves). The ratio of z-scores indicated that classification of the participants with no college degree was 1.52 times better than for the participants with college degree (as above, the gain ratio is 0.26/0.20 = 1.3). Similarly, females were both more accurately and more confidently predicted than males, and the difference is substantial (AUC of 0.83 for females vs 0.64 for males, see Supplementary Figure 3 for the ROC curves). The ratio of z-scores indicated that the females were classified 2.61 times better than males when compared to chance (the gain ratio is 0.33/0.14 = 2.35).
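The gain ratios quoted in this paragraph can be reproduced with a one-line calculation, assuming that "gain" means the AUC minus the chance level of 0.5. That reading is an inference from the two quotients given above (0.26/0.20 and 0.33/0.14), and the function name `gain_ratio` is hypothetical.

```python
CHANCE_AUC = 0.5  # AUC of a classifier that guesses at random

def gain_ratio(auc_a: float, auc_b: float, chance: float = CHANCE_AUC) -> float:
    """Ratio of the two AUCs' improvements over chance."""
    return (auc_a - chance) / (auc_b - chance)

no_college_vs_college = gain_ratio(0.76, 0.70)  # 0.26 / 0.20
female_vs_male = gain_ratio(0.83, 0.64)         # 0.33 / 0.14
```

With the rounded AUCs from the text, the first ratio comes out at about 1.3 and the second at about 2.36, close to the 2.35 quoted above; the small difference is what rounding of the reported AUCs would produce.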
The longitudinal analysis in Fig. 5 shows the results of NMF factorization of linguistic and NP variables, and demonstrates that an unsupervised grouping of the variables without using time information indeed shows a clear temporal trend, as well as a differentiation between cases and controls which starts several years before cognitive impairment. The plot shows the change in the loading of each time interval on the first component obtained by NMF, where 0 in the horizontal axis stands for the date of cognitive impairment onset. The green line shows the controls' progression in time, whereas the blue line shows the cases' progression in time, with a steeper decline for the cases. Supplementary Figure 5 shows the loading of the factors on the first component from the NMF analysis, which shows the respective contribution of linguistic and NP variables in the computation of the plot in Fig. 5 in the manuscript.
For the Cox proportional-hazards analysis, we used 143 participants,
of which 28 were censored, with a total of 1159 person-years
and an average of 8.10 years per person. See Table 4 for all statistically
significant linguistic variables according to the Wald statistic.
Our results show that using the referentially generic terms "boy", "girl",
"woman" instead of the more specific "son", "brother", "sister", "daughter",
"mother" to refer to the subjects in the picture is associated with higher
risk of AD. Our results also show that mentioning the details in the
picture, such as the dishcloth and the dishes, is associated with lower
risk of AD. Consequently, this analysis revealed that the strongest
prognostic factors of AD involved semantic processing.
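To make the link between the hazard ratios, confidence intervals, and Wald p-values in Table 4 concrete, the sketch below recovers the Wald statistic for the "falling" row (HR 1.3148, 95% CI 1.053 to 1.6417). It assumes the intervals are Wald intervals built on the log-hazard scale, which is a common construction but is not stated explicitly in this section; the function name is illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wald_from_hazard_ratio(hr, ci_low, ci_high):
    """Recover the log-hazard coefficient, its standard error, the Wald z
    statistic, and the two-sided p-value from a hazard ratio and its 95% CI."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return beta, se, z, p_value


# "falling" row of Table 4.
beta, se, z, p_value = wald_from_hazard_ratio(1.3148, 1.053, 1.6417)
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p_value:.4f}")

# Average follow-up quoted in the text: 1159 person-years over 143 participants.
mean_follow_up_years = 1159 / 143
print(f"mean follow-up: {mean_follow_up_years:.2f} years")
```

Under that assumption the recovered p-value is close to the 0.0157 reported in Table 4, and the mean follow-up matches the 8.10 years per person quoted above. Because the HRs are per 1 SD increase, a value above 1 (as for "falling") indicates higher risk with more frequent use of the term, and a value below 1 (as for "dishes") indicates lower risk.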
4. Discussion
Our results demonstrate that it is possible to predict future onset
of Alzheimer's disease using language samples obtained from cognitively
normal individuals. Moreover, we showed that a model using linguistic
variables from a single administration of the cookie-theft
picture-description task performed better than predictive models that
incorporated APOE, demographic variables, and NP test results.
Linguistic competence is a behavioral marker of educational and
occupational attainment, both of which have been suggested by
epidemiologic studies to increase "cognitive reserve". Higher cognitive
reserve allows some people to be more resilient to brain pathology
than others [27], such that they can compensate for the dysfunction
and delay diagnosis of AD [28]. In this regard, we found a significant
differentiation between participants with and without college education.
Furthermore, it is well known that the prevalence of AD is significantly
higher in women as compared to men, and that women
show a faster rate of progression after onset of cognitive impairment
[29–31]. Similar to what we observed with educational attainment,
we found that it is much easier to predict conversion in women than
in men, suggesting that prodromal changes are more prominent in
females than in males.
The linguistic variables that we identified as most relevant for
predicting future onset of AD, prominently agraphia, telegraphic
speech and repetitiveness (see Supplementary Table 3), have been
consistently identified in the literature as associated with cognitive
decline in dementia. Repetitive speech that involves repetitive
questioning, repetitive stories/statements, and repetitive themes has been
reported in patients with dementia [32,33]. Studies on agraphia in
dementia and in AD participants have shown that patients made
more writing errors compared to controls [34]. Declines in structural
complexity of utterances have been extensively investigated in people
with Alzheimer's disease and dementia [35,36]. Another linguistic
element that has been associated with dementia, referential specificity,
was identified as having a strong weight in the survival analysis,
which is supported by a large number of studies showing that
semantic impairments are the earliest linguistic markers of
dementia [37,38].
While the Cox proportional-hazards analysis identified semantic/lexical
factors, these factors did not prove to be discriminatory in the
classification tasks. We believe that this is due to the differences in
the design of these analyses. An age threshold was used for inclusion
in the control vs case group in the classification task, whereas the
Cox analysis treated all participants with a diagnosis of AD equally as
non-censored participants. As a result, among the 115 non-censored
participants in the Cox analysis, 48 had MCI onset after 85
years old, which would put them in the control group in the
classification task. The contrasting results in these analyses indicate that, in
accordance with prior literature, semantic factors are predictive of
future diagnosis of AD for all subjects regardless of the age of onset,
as opposed to being predictive of AD onset before the mid-eighties.
Similarly, verbosity and lexical richness metrics, which stand out as
strong markers of cognitive impairment in already demented
patients [39], were not among the strong predictors of future
diagnosis of AD in cognitively normal individuals in our study.
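The design difference described above can be illustrated with a toy labelling scheme in which the same participant is a control for the classifier but an event for the survival model. The record type, field names, and the handling of participants with no observed onset are simplifying assumptions made for illustration; they are not the authors' cohort definition.

```python
from dataclasses import dataclass
from typing import Optional

AGE_THRESHOLD = 85  # age cut-off used for the case/control split in the text


@dataclass
class Participant:
    """Hypothetical record; field names are illustrative only."""
    participant_id: str
    onset_age: Optional[float]  # age at onset of impairment, None if never observed


def classification_label(p: Participant) -> str:
    """Case only if onset was observed before the age threshold.
    Treating everyone else as a control is a simplification."""
    if p.onset_age is not None and p.onset_age < AGE_THRESHOLD:
        return "case"
    return "control"


def cox_event_observed(p: Participant) -> bool:
    """In the survival analysis every observed onset counts as an event,
    whatever the age; participants without an observed onset are censored."""
    return p.onset_age is not None


cohort = [
    Participant("A", onset_age=78.0),  # onset before 85
    Participant("B", onset_age=88.5),  # onset after 85
    Participant("C", onset_age=None),  # no onset observed
]
for p in cohort:
    print(p.participant_id, classification_label(p), cox_event_observed(p))
```

Participant B is the interesting case: a control in the classification task but a non-censored event in the Cox analysis, like the 48 participants with onset after 85 mentioned above.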
The result of the longitudinal analysis of linguistic and NP varia-
bles, depicted in Fig. 5, shows a steeper decline in the trajectory of
aging for the AD group as compared with normal aging, which starts
during the preclinical phase. Similar clinical trajectories for AD and
normal aging were suggested in the literature [40].
The analysis of the written version of the CTT may be considered a
limitation of our study. The spoken version of the task may reveal dif-
ferent aspects of linguistic dysfunction. Another limitation of our
study is that a thorough analysis of the correlational structure of
Table 4
Results of the Cox proportional hazards models: HR
stands for hazard ratio, CI for lower 95 and upper 95 confidence
interval for the hazard ratio. HRs are for 1 SD
increase in these measures.
ICU         HR       CI                 P-value
falling     1.3148   (1.053–1.6417)     0.0157
dishes      0.8172   (0.6901–0.9677)    0.0193
girl        1.1895   (1.0231–1.3829)    0.024
dishcloth   0.8368   (0.712–0.9835)     0.0307
boy         1.1704   (1.0116–1.354)     0.0344
woman       1.2066   (1.0013–1.4539)    0.0484
E. Eyigoz et al. / EClinicalMedicine 28 (2020) 100583 7
linguistic features and neuropsychological test scores is outside the
scope of the present article. Finally, our definition of the "case" and
"control" labels, while designed to be as clinically relevant as possible,
is ultimately discretionary and open to interpretation.
Biomarkers such as cerebrospinal fluid or brain imaging [41] and
neuropsychological tests [41,42] have been used to predict progression
of MCI to AD/dementia. Most recently, very promising results
were reported using neurofilament light chain (NfL) for disease
progression at the early pre-symptomatic stages of familial Alzheimer's
disease [43]. However, these are still technologically or logistically
demanding, and require significant specialist involvement. On the
other hand, simple, naturalistic and inexpensive speech probes, as
our results suggest, can provide an assistive tool for the early detection
and progression monitoring of AD, particularly given that such
probes can be easily adapted to remote digital platforms with low
patient burden.
Elif Eyigoz contributed to the research design, implemented the
coding and ran the experiments. She performed the literature review
and drafted and edited the manuscript. Finally, she contributed to
the interpretation of the results. Melissa Naylor contributed to the
research design, reviewed and edited the manuscript, and contrib-
uted to the interpretation of the results. Guillermo Cecchi contributed
to the research design, reviewed and edited the manuscript, and
contributed to the interpretation of the results. Sachin Mathur
contributed to the coding and research design, and reviewed and edited the
manuscript. Mar Santamaria contributed to the research design and
reviewed the manuscript.
Funding
Pfizer, Inc. provided funding to obtain the data from the Framingham
Study Consortium, and funding to IBM Corp. for the initial phase
of the study. The data used in this study was supported by Framingham
Heart Study's National Heart, Lung, and Blood Institute contract
(N01-HC-25195), and by grants from the National Institute on Aging
(R01-AG016495, R01-AG008122) and the National Institute of
Neurological Disorders and Stroke (R01-NS017950).
Data sharing statement
In order to gain access to the Framingham Heart Study (FHS) data,
investigators have to submit a research proposal for review by one or
more FHS review committees. Approved study proposals further
require a fully executed Data and Materials Distribution Agreement,
and an IRB approval. The Data and Materials Distribution Agreement
can be accessed from the following link: https://framingham
Declaration of Competing Interests
Elif Eyigoz and Guillermo Cecchi have worked as salaried employees
of IBM Corp. for the full duration of this project. Melissa Naylor
was a salaried employee of Pfizer, Inc. when assigned to this project,
until October 2018, and since then has been a salaried employee of
Takeda Pharmaceuticals. Sachin Mathur and Mar Santamaria have
worked as salaried employees of Pfizer, Inc. for the full duration of
this project. Guillermo Cecchi declares that IBM holds a patent
(US-9508360-B2) for the extraction of one of the features used in the
linguistic model.
Supplementary materials
Supplementary material associated with this article can be found,
in the online version, at doi:10.1016/j.eclinm.2020.100583.
References
[1] Roth CR, Helm-Estabrooks N. Boston naming test. Encyclopedia of clinical
neuropsychology. Springer International Publishing; 2018. p. 611–5.
[2] Snowdon DA. Linguistic ability in early life and cognitive function and
Alzheimer's disease in late life. Findings from the Nun study. JAMA: J Am Med Assoc
[3] Dawber TR, Meadors GF, Moore Jr FE. Epidemiological approaches to heart
disease: the Framingham study. Am J Public Health Nations Health 1951;41:279–86.
[4] Farmer ME, White LR, Kittner SJ, Kaplan E, Moes E, McNamara P, et al.
Neuropsychological test performance in Framingham: a descriptive study. Psychol Rep
[5] Seshadri S, Wolf PA, Beiser A, Au R, McNulty K, White R, et al. Lifetime risk of
dementia and Alzheimer's disease: the impact of mortality on risk estimates in
the Framingham study. Neurology 1997;49:1498–504.
[6] Au R, Seshadri S, Wolf PA, Elias MF, Elias PK, Sullivan L, et al. New norms for a new
generation: cognitive performance in the Framingham offspring cohort. Exp
Aging Res 2004;30:333–58.
[7] Goodglass H, Kaplan E. The assessment of aphasia and related disorders. Lea &
Febiger; 1972.
[8] Cummings L. Describing the cookie theft picture: sources of breakdown in
Alzheimer's dementia. Pragm Soc 2019;10:153–76.
[9] Kannel WB, Feinleib M, McNamara PM, Garrison RJ, Castelli WP. An investigation
of coronary heart disease in families: the Framingham offspring study. Am J
Epidemiol 1979;110:281–90.
[10] Folstein MF, Folstein SE, McHugh PR. Mini-mental state: a practical method for
grading the cognitive state of patients for the clinician. J Psychiatr Res
[11] Seshadri S, Wolf P, Beiser A, Au R, McNulty K, White R, et al. Lifetime risk of
dementia and Alzheimer's disease: the impact of mortality on risk estimates in
the Framingham Study. Neurology 1997;49:1498–504.
[12] Seshadri S, Beiser A, Au R, Wolf PA, Evans DA, Wilson RS, et al. Operationalizing
diagnostic criteria for Alzheimer's disease and other age-related cognitive
impairment-part 2. Alzheimers Dement 2011;7:35–52.
[13] Au R, Seshadri S, Knox K, Beiser A, Himali JJ, Cabral HJ, et al. The Framingham
brain donation program: neuropathology along the cognitive continuum. Curr
Alzheimer Res 2012;9:673–86.
[14] American Psychiatric Association. Diagnostic and statistical manual of mental
disorders (DSM-5®). American Psychiatric Pub; 2013.
[15] McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan EM. Clinical
diagnosis of Alzheimer's disease: report of the NINCDS-ADRDA Work Group
under the auspices of department of health and human services task force on
Alzheimer's disease. Neurology 1984;34:939.
[16] Bachman DL, Wolf PA, Linn RT, Knoefel JE, Cobb JL, Belanger AJ, et al. Incidence of
dementia and probable Alzheimer's disease in a general population: the
Framingham study. Neurology 1993;43:515.
[17] Morris JC, Storandt M, Miller JP, McKeel DW, Price JL, Rubin EH, et al. Mild
cognitive impairment represents early-stage Alzheimer disease. Arch Neurol 2001:58.
[18] Morris JC. Mild cognitive impairment is early-stage Alzheimer disease: time to
revise diagnostic criteria. Arch Neurol 2006;63:15–6.
[19] Stephan B, Hunter S, Harris D, Llewellyn D, Siervo M, Matthews F, et al. The
neuropathological profile of mild cognitive impairment (MCI): a systematic review. Mol
Psychiatry 2012;17:1056.
[20] Silverman JM, Smith CJ, Marin DB, Mohs RC, Propper CB. Familial patterns of risk
in very late-onset Alzheimer disease. Arch Gen Psychiatry 2003;60:190–7.
[21] Silverman JM, Li G, Zaccario ML, Smith CJ, Schmeidler J, Mohs RC, et al. Patterns of
risk in first-degree relatives of patients with Alzheimer's disease. Arch Gen
Psychiatry 1994;51:577–86.
[22] Silverman JM, Ciresi G, Smith CJ, Marin DB, Schnaider-Beeri M. Variability of
familial risk of Alzheimer disease across the late life span. Arch Gen Psychiatry
[23] Nasreddine ZS, Phillips NA, Bédirian V, Charbonneau S, Whitehead V, Collin I,
et al. The Montreal cognitive assessment, MoCA: a brief screening tool for mild
cognitive impairment. J Am Geriatr Soc 2005;53:695–9.
[24] Goodglass H. Understanding aphasia. Academic Press; 1993.
[25] Thompson CK. Treatment of syntactic and morphologic deficits in agrammatic
aphasia: treatment of underlying forms. Language intervention strategies in
aphasia and related neurogenic communication disorders: fifth edition. Wolters
Kluwer Health Adis (ESP); 2012. p. 735–55.
[26] Mason SJ, Graham NE. Areas beneath the relative operating characteristics (ROC)
and relative operating levels (ROL) curves: statistical significance and
interpretation. Quart J R Meteorol Soc: J Atmos Sci Appl Meteorol Phys Oceanogr
[27] Stern Y. Influence of education and occupation on the incidence of Alzheimer's
disease. JAMA 1994;271:1004–10.
[28] Katzman R, Terry R, DeTeresa R, Brown T, Davies P, Fuld P, et al. Clinical,
pathological, and neurochemical changes in dementia: a subgroup with preserved mental
status and numerous neocortical plaques. Ann Neurol: Off J Am Neurol Assoc
Child Neurol Soc 1988;23:138–44.
[29] Andersen K, Launer LJ, Dewey ME, Letenneur L, Ott A, Copeland JRM, et al. Gender
differences in the incidence of AD and vascular dementia: the EURODEM studies.
Neurology 1999;53:1992.
[30] Viña J, Lloret A. Why women have more Alzheimer's disease than men: gender
and mitochondrial toxicity of amyloid-β peptide. J Alzheimers Dis 2010;20:
[31] Mielke M, Vemuri P, Rocca W. Clinical epidemiology of Alzheimer's disease:
assessing sex and gender differences. Clin Epidemiol 2014:37.
[32] Barton S, Findlay D, Blake RA. The management of inappropriate vocalisation in
dementia: a hierarchical approach. Int J Geriatr Psychiatry 2005;20:1180–6.
[33] de Lira JO, Ortiz KZ, Campanha AC, Bertolucci PHF, Minett TSC. Microlinguistic
aspects of the oral narrative in patients with Alzheimer's disease. Int
Psychogeriatr 2010;23:404–12.
[34] Lambert J, Eustache F, Viader F, Dary M, Rioux P, Lechevalier B, et al. Agraphia in
Alzheimer's disease: an independent lexical impairment. Brain Lang
[35] Kempler D, Almor A, Tyler LK, Andersen ES, MacDonald MC. Sentence
comprehension deficits in Alzheimer's disease: a comparison of off-line vs. on-line
sentence processing. Brain Lang 1998;64:297–316.
[36] Lyons K, Kemper S, Labarge E, Ferraro FR, Balota D, Storandt M. Oral language and
Alzheimer's disease: a reduction in syntactic complexity. Aging, Neuropsychol
Cogn 1994;1:271–81.
[37] Martin A, Fedio P. Word production and comprehension in Alzheimer's disease:
the breakdown of semantic knowledge. Brain Lang 1983;19:124–41.
[38] Appell J, Kertesz A, Fisman M. A study of language functioning in Alzheimer
patients. Brain Lang 1982;17:73–91.
[39] Bucks RS, Singh S, Cuerden JM, Wilcock GK. Analysis of spontaneous,
conversational speech in dementia of Alzheimer type: evaluation of an objective technique
for analysing lexical performance. Aphasiology 2000;14:71–91.
[40] Sperling RA, Aisen PS, Beckett LA, Bennett DA, Craft S, Fagan AM, et al. Toward
defining the preclinical stages of Alzheimer's disease: recommendations from the National
Institute on Aging-Alzheimer's Association workgroups on diagnostic guidelines for
Alzheimer's disease. Alzheimers Dement 2011;7:280–92.
[41] Cui Y, Liu B, Luo S, Zhen X, Fan M, Liu T, et al. Identification of conversion from
mild cognitive impairment to Alzheimer's disease using multivariate predictors.
PLoS ONE 2011;6:e21896.
[42] Pereira T, Lemos L, Cardoso S, Silva D, Rodrigues A, Santana I, et al. Predicting
progression of mild cognitive impairment to dementia using neuropsychological
data: a supervised learning approach using time windows. BMC Med Inform Decis
Mak 2017:17.
[43] Preische O, Schultz S, Apel A, Kuhle J, Kaeser S, Barro C, et al. Dominantly Inherited
Alzheimer Network. Serum neurofilament dynamics predicts neurodegeneration
and clinical progression in presymptomatic Alzheimer's disease. Nat Med
E. Eyigoz et al. / EClinicalMedicine 28 (2020) 100583 9
... onset. 14,18 In particular, numerous studies have investigated spontaneous, connected speech during picture description tasks [14][15][16] and reported linguistic differences in AD patients such as increased repetition and reduced informative content related to word-finding difficulties. 7,8,15,16 In addition, speech and language impairments do not constitute a central feature for DLB, 19 but they are also observed in the course of DLB. ...
... onset. 14,18 In particular, numerous studies have investigated spontaneous, connected speech during picture description tasks [14][15][16] and reported linguistic differences in AD patients such as increased repetition and reduced informative content related to word-finding difficulties. 7,8,15,16 In addition, speech and language impairments do not constitute a central feature for DLB, 19 but they are also observed in the course of DLB. ...
... From a clinical perspective, a classification model using speech analysis may assist clinicians in differential diagnosis as a screening tool because speech data can be acquired in routine clinical practice. In fact, a number of studies on speech analysis for data collected during neuropsychological assessment have succeeded in detecting MCI and dementia, 14,16,[47][48][49] and several of them have shown that the addition of speech analysis to the neuropsychological assessment has been found to improve its accuracy. 48,49 Another clinical implication of this study is that our findings using the tablet-based application might help develop a self-administrated tool for the early detection of AD and DLB. ...
Full-text available
Introduction: Early differential diagnosis of Alzheimer's disease (AD) and dementia with Lewy bodies (DLB) is important, but it remains challenging. Different profiles of speech and language impairments between AD and DLB have been suggested, but direct comparisons have not been investigated. Methods: We collected speech responses from 121 older adults comprising AD, DLB, and cognitively normal (CN) groups and investigated their acoustic, prosodic, and linguistic features. Results: The AD group showed larger differences from the CN group than the DLB group in linguistic features, while the DLB group showed larger differences in prosodic and acoustic features. Machine-learning classifiers using these speech features achieved 87.0% accuracy for AD versus CN, 93.2% for DLB versus CN, and 87.4% for AD versus DLB. Discussion: Our findings indicate the discriminative differences in speech features in AD and DLB and the feasibility of using these features in combination as a screening tool for identifying/differentiating AD and DLB.
... The first two steps uses shared components that involve text pre-processing and natural language processing (NLP) feature engineering algorithms. The final stage can be a regression step to evaluate an individual's mini mental state, or a classification step to compute the probability of them having AD today, or a service to predict the onset of AD before turning 85. [4], [9]. ...
... By simply replacing the final scoring stage with regression models and selecting a different subset of extracted linguistic features, a new service is created to provide a snapshot of an individual's mini mental state. Likewise, a service for predicting the onset of AD before turning 85 can be created [9]. ...
Full-text available
This paper highlights the design philosophy and architecture of the Health Guardian, a platform developed by the IBM Digital Health team to accelerate discoveries of new digital biomarkers and development of digital health technologies. The Health Guardian allows for rapid translation of artificial intelligence (AI) research into cloud-based microservices that can be tested with data from clinical cohorts to understand disease and enable early prevention. The platform can be connected to mobile applications, wearables, or Internet of things (IoT) devices to collect health-related data into a secure database. When the analytics are created, the researchers can containerize and deploy their code on the cloud using pre-defined templates, and validate the models using the data collected from one or more sensing devices. The Health Guardian platform currently supports time-series, text, audio, and video inputs with 70+ analytic capabilities and is used for non-commercial scientific research. We provide an example of the Alzheimer's disease (AD) assessment microservice which uses AI methods to extract linguistic features from audio recordings to evaluate an individual's mini-mental state, the likelihood of having AD, and to predict the onset of AD before turning the age of 85. Today, IBM research teams across the globe use the Health Guardian internally as a test bed for early-stage research ideas, and externally with collaborators to support and enhance AI model development and clinical study efforts.
... In a medical study [23], Elif Eyigoz et al. proposed predicting the development of Alzheimer's disease based on the early detection of cognitive impairments, which are identified by analyzing patient responses and presented as linguistic variables. ...
Full-text available
The solution of most MCDM-problems involves measuring the characteristics of a research object, converting the estimations into a confidence distribution specified on a set of qualitative gradations and aggregating the estimations in accordance with the structure of the criteria system. The quality of the problems solution as a whole directly depends on the quality of measuring the characteristics of a research object. Data for obtaining estimations of the characteristics are often inaccurate, incomplete, approximate. Modern researches either fragmentarily touch on the questions of measurement quality, or focus on other questions. Our goal is to choose such parameters for converting the value of the quantitative characteristic of a research object into a confidence distribution, which provide the best measurement quality. Based on the observation channel (OC) concept proposed by G. Klir, we refined the measurement quality criteria, determined the composition of the OC parameters, developed an algorithm for calculating the measurement quality criteria and choosing the best OC for the most common MCDM-problems. As calculations have shown, in the most common MCDM-problems, the best is OC, which is built on the basis of a bell-shaped membership function and has a scale of seven blocks. The obtained result will allow researchers to justify the choice of OC parameters from the view-point of the maximum quality of measuring the quantitative characteristics of a research object in MCDM-problems and uncertainty conditions.
... Some researchers have attempted to link changes in language abilities to cognitive decline in AD [13]. Several linguistic variables have been used to predict the onset of AD [14]. While these independently conducted assessments with speech and language components can be bene cial for identifying the early stage of AD, neurophysiological evidence of neural dysfunction during speaking [15] may provide a more sensitive prognostic measure of disease progression in AD. ...
Full-text available
Background: Alzheimer’s disease (AD) is a neurodegenerative disease involving cognitive impairment and abnormalities in speech and language. Here, we examine how AD affects the fidelity of auditory feedback predictions during speaking. We focus on the phenomenon of speaking-induced suppression (SIS), the auditory cortical responses’ suppression during auditory feedback processing. SIS is determined by subtracting the magnitude of auditory cortical responses during speaking from listening to playback of the same speech. Our state feedback control model of speech motor control explains SIS as arising from the onset of auditory feedback matching a prediction of that feedback onset during speaking – a prediction that is absent during passive listening to playback of the auditory feedback. Our model hypothesizes that the auditory cortical response to auditory feedback reflects the mismatch with the prediction: small during speaking, large during listening, with the difference being SIS. Normally, during speaking, auditory feedback matches its predictions, then SIS will be large. Any reductions in SIS will indicate inaccuracy in auditory feedback prediction not matching the actual feedback. Methods: We investigated SIS in AD patients (n = 20; mean (SD) age, 60.77 (10.04); female (%), 55.00) and healthy controls (n = 12; mean (SD) age, 63.68 (6.07); female (%), 83.33) through magnetoencephalography-based functional imaging. Results: We found a significant reduction in SIS at approximately 100 ms in AD patients compared to healthy controls (linear mixed effects model, F(1, 57.5) = 6.849, P= 0.011). Conclusions: The results suggest that AD patients generate inaccurate auditory feedback predictions, contributing to abnormalities in AD speech.
... Looking for non-invasive, low-cost, and non-adversarial approaches to diagnosing and classifying Alzheimer's patients, researchers are trying to use speech analysis and linguistic parameters such as lexical richness, lexical-syntactic diversity, word-to-speech ratio, and MMSE score to help in preclinical stages [44][45][46]. Linguistic markers such as telegraphic speech, prominently agraphia, repetitiveness of questions and statement, more writing errors, declines in structural complexity of the utterances, and semantic impairments can be helpful in predicting the onset of AD [47]. ...
Full-text available
Introduction: Linguistic disorders are one of the common problems in Alzheimer's disease, which in recent years has been considered as one of the key parameters in the diagnosis of Alzheimer (AD). Given that changes in sentence processing and working memory and the relationship between these two activities may be a diagnostic parameter in the early and preclinical stages of AD, the present study examines the comprehension and production of sentences and working memory in AD patients and healthy aged people. Methods: Twenty-five people with mild Alzheimer's and 25 healthy elderly people participated in the study. In this study, we used the digit span to evaluate working memory. Syntactic priming and sentence completion tasks in canonical and non-canonical conditions were used for evaluating sentence production. We administered sentence picture matching and cross-modal naming tasks to assess sentence comprehension. Results: The results of the present study revealed that healthy elderly people and patients with mild Alzheimer's disease have a significant difference in comprehension of relative clause sentences (P <0.05). There was no significant difference between the two groups in comprehension of simple active, simple active with noun phrase and passive sentences (P> 0.05). They had a significant difference in auditory and visual reaction time (P <0.05). Also there was a significant difference between the two groups in syntactic priming and sentence completion tasks. However, in non-canonical condition of sentence completion, the difference between the two groups was not significant (P> 0.05). Conclusion: The results of the present study showed that the mean scores related to comprehension, production and working memory in people with mild Alzheimer's were lower than healthy aged people, which indicate sentence processing problems at this level of the disease. 
People with Alzheimer have difficulty comprehending and producing complex syntactic structures and have poorer performance in tasks that required more memory demands. It seems that the processing problems of these people are due to both working memory and language problems, which are not separate from each other and both are involved in.
... Linguistic components other than verbal fluency have been less extensively explored. A number of writing (e.g., misspelling and number of commas) and syntactic (e.g., dependency labels and determiners) parameters may be useful in the prediction of ACS development in CN individuals (Eyigoz et al., 2020). Measures of semantic degradation and namely lexical impoverishment reflected as increased production of high-frequency words during spontaneous speech appear to be fine predictors of future cognitive testing performance in individuals with normal cognition (Ostrand & Gunstad, 2021). ...
Full-text available
Objectives There is limited research on the prognostic value of language tasks regarding mild cognitive impairment (MCI) and Alzheimer’s clinical syndrome (ACS) development in the cognitively normal (CN) elderly, as well as MCI to ACS conversion. Methods Participants were drawn from the population-based Hellenic Longitudinal Investigation of Aging and Diet (HELIAD) cohort. Language performance was evaluated via verbal fluency [semantic (SVF) and phonemic (PVF)], confrontation naming [Boston Naming Test short form (BNTsf)], verbal comprehension, and repetition tasks. An additional language index was estimated using both verbal fluency tasks: SVF-PVF discrepancy. Cox proportional hazards analyses adjusted for important sociodemographic parameters (age, sex, education, main occupation, and socioeconomic status) and global cognitive status [Mini Mental State Examination score (MMSE)] were performed. Results A total of 959 CN and 118 MCI older (>64 years) individuals had follow-up investigations after a mean of ∼3 years. Regarding the CN group, each standard deviation increase in the composite language score reduced the risk of ACS and MCI by 49% (8–72%) and 32% (8–50%), respectively; better SVF and BNTsf performance were also independently associated with reduced risk of ACS and MCI. On the other hand, using the smaller MCI participant set, no language measurement was related to the risk of MCI to ACS conversion. Conclusions Impaired language performance is associated with elevated risk of ACS and MCI development. Better SVF and BNTsf performance are associated with reduced risk of ACS and MCI in CN individuals, independent of age, sex, education, main occupation, socioeconomic status, and MMSE scores at baseline.
... To date, the actual temporal nature of this has not been formally addressed in the existing literature . Thus, seemingly impressive NLP findings in research in high risk youth identifies a signal that has predictive value several years later (e.g., Bedi et al., 2015;Corcoran et al., 2018), and likewise in mild cognitive impairment (MCI) finds a subtle signal in language usage that predicts an elevated risk of developing dementia several years later (e.g., Eyigoz et al., 2020). Yet are these signals the same across illnesses and also time? ...
Full-text available
Modern advances in computational language processing methods have enabled new approaches to the measurement of mental processes. However, the field has primarily focused on model accuracy in predicting performance on a task or a diagnostic category. Instead the field should be more focused on determining which computational analyses align best with the targeted neurocognitive/psychological functions that we want to assess. In this paper we reflect on two decades of experience with the application of language-based assessment to patients' mental state and cognitive function by addressing the questions of what we are measuring, how it should be measured and why we are measuring the phenomena. We address the questions by advocating for a principled framework for aligning computational models to the constructs being assessed and the tasks being used, as well as defining how those constructs relate to patient clinical states. We further examine the assumptions that go into the computational models and the effects that model design decisions may have on the accuracy, bias and generalizability of models for assessing clinical states. Finally, we describe how this principled approach can further the goal of transitioning language-based computational assessments to part of clinical practice while gaining the trust of critical stakeholders.
Introduction Limitations in effective dementia therapies mean that early diagnosis and monitoring are critical for disease management, but current clinical tools are impractical and/or unreliable, and disregard short-term symptom variability. Behavioural biomarkers of cognitive decline, such as speech, sleep and activity patterns, can manifest prodromal pathological changes. They can be continuously measured at home with smart sensing technologies, and permit leveraging of interpersonal interactions for optimising diagnostic and prognostic performance. Here we describe the ContinUous behavioural Biomarkers Of cognitive Impairment (CUBOId) study, which explores the feasibility of multimodal data fusion for in-home monitoring of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD). The report focuses on a subset of CUBOId participants who perform a novel speech task, the ‘TV task’, designed to track changes in ecologically valid conversations with disease progression. Methods and analysis CUBOId is a longitudinal observational study. Participants have diagnoses of MCI or AD, and controls are their live-in partners with no such diagnosis. Multimodal activity data were passively acquired from wearables and in-home fixed sensors over timespans of 8–25 months. At two time points participants completed the TV task over 5 days by recording audio of their conversations as they watched a favourite TV programme, with further testing to be completed after removal of the sensor installations. Behavioural testing is supported by neuropsychological assessment for deriving ground truths on cognitive status. Deep learning will be used to generate fused multimodal activity-speech embeddings for optimisation of diagnostic and predictive performance from speech alone. Ethics and dissemination CUBOId was approved by an NHS Research Ethics Committee (Wales REC; ref: 18/WA/0158) and is sponsored by University of Bristol. 
It is supported by the National Institute for Health Research Clinical Research Network West of England. Results will be reported at conferences and in peer-reviewed scientific journals.
Accumulation of amyloid-beta (Aβ) in the brain is associated with neurodegeneration in Alzheimer's disease and can be an indicator of early disease progression. Thus, the non-invasively and inexpensively observable features related to Aβ accumulation are promising biomarkers. However, in the experimental discovery of biomarkers in preclinical models, Aβ and biomarker candidates are usually not observed in identical sample populations. This study established a hierarchical Bayesian model that predicts Aβ accumulation level solely from biomarker candidates by integrating incomplete information. The model was applied to 5xFAD mouse behavioral experimental data. The predicted Aβ accumulation level obeyed the observed amount of Aβ when multiple features were used for learning and prediction. Based on the evaluation of predictability, the results suggest that the proposed model can contribute to discovering novel biomarkers, that is, multivariate biomarkers relevant to the accumulation state of abnormal proteins.
Comparing the performance of select SVM and DWD implementations in the context of schizophrenia detection via FNC and SBM modalities extracted from MRIs
Speech-language pathologists routinely use picture description tasks to assess expository discourse in clients with disorders such as aphasia and dementia. One picture description task, the Cookie Theft picture from the Boston Diagnostic Aphasia Examination, has come to dominate clinical settings more than any other task. In this article, I examine why this particular picture description task has proven to be so successful in assessing expository discourse in clients with language and cognitive disorders. Using data from the University of Pittsburgh Alzheimer and Related Dementias Study, recurrent cognitive-linguistic impairments in the Cookie Theft picture descriptions of clients with Alzheimer's dementia are explored. These impairments are mostly pragmatic in nature. It is argued that the sensitivity of the Cookie Theft picture description task to these impairments makes it an ideal assessment tool for any investigation which aims to identify pragmatic markers of neurodegenerative diseases such as the dementias.
Neurofilament light chain (NfL) is a promising fluid biomarker of disease progression for various cerebral proteopathies. Here we leverage the unique characteristics of the Dominantly Inherited Alzheimer Network and ultrasensitive immunoassay technology to demonstrate that NfL levels in the cerebrospinal fluid (n = 187) and serum (n = 405) are correlated with one another and are elevated at the presymptomatic stages of familial Alzheimer’s disease. Longitudinal, within-person analysis of serum NfL dynamics (n = 196) confirmed this elevation and further revealed that the rate of change of serum NfL could discriminate mutation carriers from non-mutation carriers almost a decade earlier than cross-sectional absolute NfL levels (that is, 16.2 versus 6.8 years before the estimated symptom onset). Serum NfL rate of change peaked in participants converting from the presymptomatic to the symptomatic stage and was associated with cortical thinning assessed by magnetic resonance imaging, but less so with amyloid-β deposition or glucose metabolism (assessed by positron emission tomography). Serum NfL was predictive for both the rate of cortical thinning and cognitive changes assessed by the Mini–Mental State Examination and Logical Memory test. Thus, NfL dynamics in serum predict disease progression and brain neurodegeneration at the early presymptomatic stages of familial Alzheimer’s disease, which supports its potential utility as a clinically useful biomarker. © 2019, The Author(s), under exclusive licence to Springer Nature America, Inc.
Background Predicting progression from a stage of Mild Cognitive Impairment to dementia is a major pursuit in current research. It is broadly accepted that cognition declines along a continuum between MCI and dementia. As such, cohorts of MCI patients are usually heterogeneous, containing patients at different stages of the neurodegenerative process. This hampers the prognostic task. Nevertheless, when learning prognostic models, most studies use the entire cohort of MCI patients regardless of their disease stages. In this paper, we propose a Time Windows approach to predict conversion to dementia, learning with patients stratified using time windows, thus fine-tuning the prognosis regarding the time to conversion. Methods In the proposed Time Windows approach, we grouped patients based on the clinical information of whether they converted (converter MCI) or remained MCI (stable MCI) within a specific time window. We tested time windows of 2, 3, 4 and 5 years. We developed a prognostic model for each time window using clinical and neuropsychological data and compared this approach with the one commonly used in the literature, named the First Last approach, where all patients are used to learn the models. This makes it possible to move from the traditional question “Will an MCI patient convert to dementia somewhere in the future” to the question “Will an MCI patient convert to dementia in a specific time window”. Results The proposed Time Windows approach outperformed the First Last approach. The results showed that we can predict conversion to dementia as early as 5 years before the event with an AUC of 0.88 in the cross-validation set and 0.76 in an independent validation set. Conclusions Prognostic models using time windows have higher performance when predicting progression from MCI to dementia, when compared to the prognostic approach commonly used in the literature.
Furthermore, the proposed Time Windows approach is more relevant from a clinical point of view, predicting conversion within a temporal interval rather than sometime in the future and allowing clinicians to timely adjust treatments and clinical appointments. Electronic supplementary material The online version of this article (doi:10.1186/s12911-017-0497-2) contains supplementary material, which is available to authorized users.
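The time-window grouping described in this abstract can be illustrated with a short sketch. This is not the authors' implementation: the `Patient` fields, the function names, and in particular the handling of edge cases are assumptions made for illustration. Here a patient who converts after the window counts as stable for that window, and a patient with too little follow-up and no conversion is left out of that window's cohort.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Patient:
    # Hypothetical record layout, not taken from the cited study.
    patient_id: str
    followup_years: float              # time from baseline to last assessment
    conversion_years: Optional[float]  # time from baseline to dementia, None if not observed


def label_for_window(p: Patient, window_years: float) -> Optional[str]:
    """Return 'converter', 'stable', or None (not usable for this window)."""
    if p.conversion_years is not None and p.conversion_years <= window_years:
        return "converter"
    # Assumption: still MCI at the end of the window requires enough follow-up.
    if p.followup_years >= window_years:
        return "stable"
    return None


def build_window_cohorts(
    patients: Iterable[Patient], windows=(2, 3, 4, 5)
) -> Dict[int, Dict[str, str]]:
    """Build one labelled cohort per time window (2 to 5 years, as in the abstract)."""
    patients = list(patients)
    cohorts: Dict[int, Dict[str, str]] = {}
    for w in windows:
        labelled = {}
        for p in patients:
            label = label_for_window(p, w)
            if label is not None:
                labelled[p.patient_id] = label
        cohorts[w] = labelled
    return cohorts


example = [
    Patient("A", followup_years=6.0, conversion_years=1.5),
    Patient("B", followup_years=4.5, conversion_years=None),
    Patient("C", followup_years=5.0, conversion_years=4.5),
]
cohorts = build_window_cohorts(example)
```

A separate prognostic model would then be trained on each cohort, so the same patient can carry different labels in different windows.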
The Framingham Heart Study (FHS) was started in 1948 as a prospective investigation of cardiovascular disease in a cohort of adult men and women. Continuous surveillance of this sample of 5209 subjects has been maintained through biennial physical examinations. In 1971 examinations were begun on the children of the FHS cohort. This study, called the Framingham Offspring Study (FOS), was undertaken to expand upon knowledge of cardiovascular disease, particularly in the area of familial clustering of the disease and its risk factors. This report reviews the sampling design of the FHS and describes the nature of the FOS sample. The FOS families appear to be of typical size and age structure for families with parents born in the late 19th or early 20th century. In addition, there is little evidence that coronary heart disease (CHD) experience and CHD risk factors differ in parents of those who volunteered for this study and the parents of those who did not volunteer.
Spontaneous, conversational speech in probable dementia of Alzheimer type (DAT) participants and healthy older controls was analysed using eight linguistic measures. These were evaluated for their usefulness in discriminating between healthy and demented individuals. The measures were: noun rate, pronoun rate, verb rate, adjective rate, and clause-like semantic unit rate (all per 100 words), together with three lexical richness measures: type token ratio (TTR), Brunet's Index (W) and Honore's statistic (R). Results suggest that these measures offer a sensitive method of assessing spontaneous speech output in DAT. Comparison between DAT and healthy older participants demonstrates that these measures discriminate well between these groups. This method shows promise as a diagnostic and prognostic tool, and as a measure for use in clinical trials. Further validation in a large sample of patient versus control 'norms' in addition to evaluation in other types of dementia is considered.
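The three lexical richness measures named above can be sketched as follows. The abstract does not give formulas, so this assumes the commonly cited definitions: TTR = V/N, Brunet's W = N^(V^-0.165), and Honoré's R = 100·ln(N) / (1 − V1/V), where N is the number of tokens, V the number of distinct word types, and V1 the number of types used exactly once. Whitespace tokenization and the function name are illustrative choices, not the study's pipeline.

```python
import math
from collections import Counter
from typing import Dict, List


def lexical_richness(tokens: List[str]) -> Dict[str, float]:
    """Compute TTR, Brunet's W and Honore's R from a list of word tokens."""
    n = len(tokens)
    if n == 0:
        raise ValueError("at least one token is required")
    counts = Counter(tokens)
    v = len(counts)                                   # distinct word types
    v1 = sum(1 for c in counts.values() if c == 1)    # types occurring once

    ttr = v / n
    brunet_w = n ** (v ** -0.165)
    # Honore's R is undefined when every type occurs once (V1 == V);
    # returning infinity in that case is a choice made for this sketch.
    honore_r = 100 * math.log(n) / (1 - v1 / v) if v1 < v else math.inf
    return {"ttr": ttr, "brunet_w": brunet_w, "honore_r": honore_r}


sample = "the boy takes the cookie and the girl laughs".split()
scores = lexical_richness(sample)
```

Lower values of W and higher values of R are usually read as richer vocabulary. All three measures depend on text length, so comparisons are most meaningful between samples of similar size.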
Background: Although an increased cumulative risk for primary progressive dementia (PPD) has been repeatedly demonstrated in relatives of probands with Alzheimer's disease (AD), an examination of their rates of illness at different ages has not been previously undertaken. Such an examination might reveal possible age-related characteristics associated with a more familial variety of AD. Methods: Using family history interviews and survival analysis, the cumulative risk for and 5-year age-specific hazard rates of PPD were assessed in the first-degree relatives of 200 probands with AD and two nondemented control groups: 179 elderly ascertained through the Alzheimer's Disease Research Center (ADRC-derived controls) and 427 elderly ascertained from community senior centers (community controls). Results: The PPD risk curve for the relatives of probands with AD rose to about 30% and was significantly higher than the curves for the relatives of the ADRC-derived and community controls, where comparable rates were observed (approximately 12%). The age-specific hazard rates of PPD were calculated in three groups of relatives for each 5-year interval from ages 45 to 49 years through ages 85 to 89 years. The age-specific relative risk (RRi) for PPD in the relatives of probands with AD began to steadily diminish from the 75- to 79-year age interval (RRi=13.49) through the 85- to 89-year age interval (RRi=0.96) compared with the relatives of ADRC-derived controls and from the 60- to 64-year age interval (RRi=16.15) through the 85- to 89-year age interval (RRi=2.03) compared with the relatives of the community controls. Conclusions: These data indicate that, for relatives of probands with AD, while the lifetime risk for PPD is greater than in relatives of controls, the familial contribution to the risk for PPD decreases with increasing age. The higher risk for PPD in relatives of probands with AD may be substantially diminished or even eliminated by the latter half of the ninth decade.