Literature Review

A Systematic Review of Expressive and Receptive Prosody in People With Dementia

American Speech-Language-Hearing Association
Journal of Speech, Language, and Hearing Research

Abstract

Purpose: This review was designed to provide a systematic overview of prosody in people with a primary diagnosis of dementia (PwD) and evaluate the potential use of prosodic features for diagnosis of dementia.
Method: A systematic search of five databases was conducted using Medical Subject Headings and keywords. Studies included in the review were evaluated for their methodological quality using the modified Joanna Briggs Institute checklist.
Results: A total of 14 articles were identified as being relevant for this review. Among the 14 articles, the methodological quality ranged, with eight rated as weak, four rated as moderate, and two rated as strong. Ten of the 14 articles had people with Alzheimer's disease (AD) as participants, and the remaining four had people with frontotemporal dementia as participants. Four articles focused on receptive prosody, another six focused on expressive prosody, and the remaining four articles were investigations into both. The 14 articles presented inconsistent findings, and various tasks were used to measure prosodic features in PwD in the articles. Prosody was studied as a diagnostic tool for dementia in four of the articles, all of which were based on expressive prosody in individuals with AD. Among the four articles, three proposed the use of automatic speech analysis for diagnosis of AD.
Conclusions: This review demonstrates that prosody in PwD is an underinvestigated area. In particular, it was concerning that most articles were of weak methodological quality. Nevertheless, it was found that prosody may be a potential diagnostic tool for assessing dementia. More studies that replicate the existing studies and those with stronger methodology are needed to confirm that receptive and/or expressive prosody can be used for dementia diagnosis.
Chorong Oh (a), Richard J. Morris (b), and Xianhui Wang (a)
Around the world, approximately 50 million people have dementia, and roughly 10 million people develop the disease every year (World Health Organization [WHO], 2018). Consequently, a growing need exists for clinical assessment and management of people with dementia (PwD; Brookmeyer et al., 1998). Speech-language pathologists (SLPs) play an essential role in supporting PwD, their family members, and the general population (Bourgeois et al., 2016) by assessing, diagnosing, and treating PwD and by educating caregivers (American Speech-Language-Hearing Association, 2016). In doing this work, SLPs and other clinicians need to maintain awareness of the distinct phenotypes of dementia, which include Alzheimer's dementia (AD), Lewy body dementia (LBD), vascular dementia (VaD), frontotemporal dementia (FTD), and mixed dementia (WHO, 2018). These distinct phenotypes highlight the need among SLPs and other health professionals for evidence-based, objective measures they can use to evaluate accurately the cognitive deficits presented by people with different types of dementia. These measures can help SLPs provide their services effectively and efficiently. Furthermore, with a better understanding of the differences in cognitive-linguistic deficits among PwD, SLPs will be able to design and evaluate interventions that better serve the different phenotypes of dementia. Consequently, this line of research may contribute to an improved quality of life for PwD and their caregivers.

To date, research to differentiate among the dementia variants has relied heavily upon a weighted combination of genetic and protein biomarkers at the cellular level, neuroanatomical integrity at the organ level, and physical/psychological behavior at the organism level (Reilly et al., 2010). However, only a small number of studies report on
(a) School of Rehabilitation and Communication Sciences, Ohio University, Athens
(b) School of Communication Science & Disorders, Florida State University, Tallahassee
Correspondence to Chorong Oh: ohc@ohio.edu
Editor-in-Chief: Bharath Chandrasekaran
Editor: Cara E. Stepp
Received January 7, 2021
Revision received April 15, 2021
Accepted June 13, 2021
https://doi.org/10.1044/2021_JSLHR-21-00013
Disclosure: The authors have declared that no competing financial or nonfinancial
interests existed at the time of publication.
Journal of Speech, Language, and Hearing Research, Vol. 64, 3803–3825, October 2021. Copyright © 2021 American Speech-Language-Hearing Association
... Prosody comprises different voice features such as intonation, accent, speech rate, or loudness (Wells, 2007). However, prior research has yet to focus on the caregivers' background and knowledge about the importance of communication and, specifically, the role of different prosody strategies when speaking to AD patients (Oh et al., 2021). Therefore, the goal of this research is twofold: first, to know the relevance caregivers (professional and family) give to communication with AD patients, and second, to determine what prosody strategies they consider most effective in communicating with them. ...
... Most communication strategies studied are primarily content-oriented (Shaw et al., 2022); only the recommendation to speak slowly can be considered a prosody element in nonverbal communication. This lack of references is due to the scarce studies on prosody strategies to speak with AD patients (Oh et al., 2021). Only in recent years has research paid attention to the phonic aspects and, specifically, prosody. ...
... In this section, we provide a few studies that have tried to establish a connection between prosodic feature variation and communication effects on elders and AP patients. However, evidence is limited, nonconclusive, and even contradictory (Oh et al., 2021), possibly due to the variability of patients' situations and the challenges of measuring effects on them. ...
... Similarly, a review on AD patients reported conflicting results on linguistic prosody comprehension and found that most articles were of weak methodological quality (see Oh et al. 2021 for a review). Nonetheless, patients with major depressive disorders were found to be impaired in the comprehension of pragmatic prosody on emphatic stresses (Zurlo & Ruggiero 2021). ...
... Finally, intervention and training programs that aim at pragmatic prosody recognition with materials and conditions tailored toward individual emotion, cognition, and language abilities are expected to be developed for ASD (Schreibman et al. 2015) and other mental disorders. Although impairments in pragmatic or affective prosody recognition are systematically documented in research studies on individuals with various mental disorders (Icht et al. 2021, Lin et al. 2018, Oh et al. 2021), prosody comprehension/production tasks are rarely included in clinical settings. ...
Article
In response to uncovering brain mechanisms underlying vocal communication and searching for biomarkers for mental illnesses, speech prosody has been increasingly studied in recent years in multiple disciplines, including psycholinguistics. In this article, we provide an up-to-date synthesis of the theoretical foundation and empirical evidence to profile linguistic and emotional prosody in the proper characterization of mental disorders, including schizophrenia, autism, Alzheimer's disease, and depression. Our review reveals a need to develop theoretically motivated and methodologically integrated approaches to the study of context-driven comprehension and expression of pragmatic-affective prosody, which will help elucidate the core features of socio-communicative problems in individuals with mental disorders. We propose that comprehensive models within and across the conventional cognition-emotion-language trichotomy need to be developed to integrate current findings and guide future research. In particular, there needs to be due emphasis on investigating multisensory and cross-modal effects in normal and pathological prosody research. Our review calls for multidisciplinary efforts to address the challenging issues to inform and inspire the advancement of linguistic theories and psychiatric diagnosis and treatment. Expected final online publication date for the Annual Review of Linguistics, Volume 9 is January 2023. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
... These novel observations suggest that the dominance of communicative channels in emotion perception can be modulated by cognitive decline to varying degrees. It is plausible that recognizing emotional prosody necessitates more cognitive resources than the other two channels, thus leading to perceptual challenges across various special populations, including individuals with schizophrenia, mild cognitive impairments, and Parkinson's disease (Amlerova et al., 2022; Lin, Li, Wang, et al., 2023; Monetta et al., 2008; Oh et al., 2021; Oron et al., 2020; Paulmann & Pell, 2010; Taitelbaum-Swead et al., 2022). Another possible explanation is that our stimulus set was developed and validated by young individuals, which might be more familiar and less effortful to process for younger people than for older people. ...
Article
Purpose: Prior research extensively documented challenges in recognizing verbal and nonverbal emotion among older individuals when compared with younger counterparts. However, the nature of these age-related changes remains unclear. The present study investigated how older and younger adults comprehend four basic emotions (i.e., anger, happiness, neutrality, and sadness) conveyed through verbal (semantic) and nonverbal (facial and prosodic) channels. Method: A total of 73 older adults (43 women, M_age = 70.18 years) and 74 younger adults (37 women, M_age = 22.01 years) partook in a fixed-choice test for recognizing emotions presented visually via facial expressions or auditorily through prosody or semantics. Results: The results confirmed age-related decline in recognizing emotions across all channels except for identifying happy facial expressions. Furthermore, the two age groups demonstrated both commonalities and disparities in their inclinations toward specific channels. While both groups displayed a shared dominance of visual facial cues over auditory emotional signals, older adults indicated a preference for semantics, whereas younger adults displayed a preference for prosody in auditory emotion perception. Notably, the dominance effects observed in older adults for visual and semantic cues were less pronounced for sadness and anger compared to other emotions. These challenges in emotion recognition and the shifts in channel preferences among older adults were correlated with their general cognitive capabilities. Conclusion: Together, the findings underscore that age-related obstacles in perceiving emotions and alterations in channel dominance, which vary by emotional category, are significantly intertwined with overall cognitive functioning. Supplemental Material: https://doi.org/10.23641/asha.27307251
... In addition to the symptoms of language disorders, the vocabulary level, complexity of syntactic structure, and use of irregular words are significantly affected by factors such as age, educational background, and cognitive ability; therefore, it is difficult to use these predictors as indicators of early AD [37]. In contrast, the frequency of hesitation, impaired affective prosody, emphasis of specific syllables, changes in tempo or timing, differences in pitch and intonation, and irregular breathing can be used as indicators in speech analysis and processing of voice signals [38][39][40]. Language analysis is important owing to its suitability for classification; some studies have shown that it can be used to distinguish between people with and without AD with over 91.2% accuracy [41,42]. ...
Article
Background: Alzheimer's disease (AD) is the most common form of dementia, which makes the lives of patients and their families difficult for various reasons. Therefore, early detection of AD is crucial to alleviating the symptoms through medication and treatment. Objective: Given that AD strongly induces language disorders, this study aims to detect AD rapidly by analyzing the language characteristics. Materials and methods: The mini-mental state examination for dementia screening (MMSE-DS), which is most commonly used in South Korean public health centers, is used to obtain negative answers based on the questionnaire. Among the acquired voices, significant questionnaires and answers are selected and converted into mel-frequency cepstral coefficient (MFCC)-based spectrogram images. After accumulating the significant answers, validated data augmentation was achieved using the Densenet121 model. Five deep learning models, Inception v3, VGG19, Xception, Resnet50, and Densenet121, were used to train and confirm the results. Results: Considering the amount of data, the results of the five-fold cross-validation are more significant than those of the hold-out method. Densenet121 exhibits a sensitivity of 0.9550, a specificity of 0.8333, and an accuracy of 0.9000 in a five-fold cross-validation to separate AD patients from the control group. Conclusions: The potential for remote health care can be increased by simplifying the AD screening process. Furthermore, by facilitating remote health care, the proposed method can enhance the accessibility of AD screening and increase the rate of early AD detection.
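The sensitivity, specificity, and accuracy figures reported in the abstract above are standard confusion-matrix ratios. As a minimal sketch of how such figures are computed, the following uses a hypothetical helper, `screening_metrics`, with made-up counts; neither the function name nor the counts come from the cited study.

```python
def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: AD cases that were correctly flagged / missed.
    tn/fp: controls that were correctly cleared / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)  # share of AD cases correctly flagged
    specificity = tn / (tn + fp)  # share of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases classified correctly
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only; they are NOT taken from the study.
metrics = screening_metrics(tp=18, fn=2, tn=15, fp=5)
print(metrics)
```

When the two groups differ in size, accuracy is weighted toward the larger group, which is one reason screening studies tend to report sensitivity and specificity alongside it.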
... In particular, the current research is the first to include the VaD group. According to a recent systematic review (Oh et al., 2021), prosody and dementia studies included DAT and frontotemporal dementia groups only. Second, in this study, emotional prosody was clearly distinguished from linguistic prosody, supported by the neurotypical listeners' emotion evaluation. ...
Article
Introduction: This pilot research was designed to investigate if prosodic features from running spontaneous speech could differentiate dementia of the Alzheimer’s type (DAT), vascular dementia (VaD), mild cognitive impairment (MCI), and healthy cognition. The study included acoustic measurements of prosodic features (Study 1) and listeners’ perception of emotional prosody differences (Study 2). Methods: For Study 1, prerecorded speech samples describing the Cookie Theft picture from 10 individuals with DAT, 5 with VaD, 9 with MCI, and 10 neurologically healthy controls (NHC) were obtained from the DementiaBank. The descriptive narratives by each participant were separated into utterances. These utterances were measured on 22 acoustic features via the Praat software and analyzed statistically using the principal component analysis (PCA), regression, and Mahalanobis distance measures. Results: The analyses on acoustic data revealed a set of five factors and four salient features (i.e., pitch, amplitude, rate, and syllable) that discriminate the four groups. For Study 2, a group of 28 listeners served as judges of emotions expressed by the speakers. After a set of training and practice sessions, they were instructed to indicate the emotions they heard. Regression measures were used to analyze the perceptual data. The perceptual data indicated that the factor underlying pitch measures had the greatest strength for the listeners to separate the groups. Discussion: The present pilot work showed that using acoustic measures of prosodic features may be a functional method for differentiating among DAT, VaD, MCI, and NHC. Future studies with data collected under a controlled environment using better stimuli are warranted.
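The Mahalanobis distance named in the abstract above measures how far an observation lies from a group's mean, scaled by the group's covariance. The following is a minimal pure-Python sketch for two features only. The feature pair (mean pitch in Hz, speech rate in syllables per second), the sample values, and the function names are assumptions made for illustration; the study itself worked with 22 Praat features and PCA factors, which this sketch does not reproduce.

```python
from statistics import mean


def mean_and_covariance_2d(points):
    """Return the mean vector and sample covariance terms of (x, y) pairs."""
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point, points):
    """Mahalanobis distance from `point` to the distribution of `points`."""
    (mx, my), (sxx, sxy, syy) = mean_and_covariance_2d(points)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form d^T S^-1 d, using the closed-form inverse of a 2x2 matrix.
    d_squared = (dx * dx * syy - 2 * dx * dy * sxy + dy * dy * sxx) / det
    return d_squared ** 0.5


# Hypothetical (mean pitch in Hz, speech rate in syllables/s) values for one group.
hypothetical_group = [(180.0, 4.0), (190.0, 4.4), (200.0, 3.8), (210.0, 4.2)]
# Hypothetical new speaker to compare against that group.
distance = mahalanobis_2d((230.0, 3.0), hypothetical_group)
print(round(distance, 2))
```

A speaker could then be assigned to whichever group yields the smallest distance, although the abstract does not state that the study classified speakers this way.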
... Different subtypes of affective prosody disorders exist and result in discrete profiles of behavioural symptoms characterised by distinctive performance patterns depending on the type of task completed by patients (Sheppard et al., 2021;Wright et al., 2018). Indeed, affective-prosodic deficits have been reported for both processing modes (comprehension and production) in adults with several neurological conditions (Hawthorne & Fischer, 2020;Oh et al., 2021;Ross, 2021). Impairments in the comprehension of affective prosody include deficits in different receptive abilities: the discrimination of affective features from prosody, the recognition of the affective meaning of prosodic patterns, and the integration of affective information from prosody and other communication modalities, such as facial or verbal cues (e.g., Bucks & Radford, 2004;Perry et al., 2001;Rymarczyk & Grabowska, 2007;Sidtis & Van Lancker Sidtis, 2003;Wright et al., 2018). ...
Article
Background: Individuals with affective-prosodic deficits have difficulty understanding or expressing emotions and attitudes through prosody. Affective prosody disorders can occur in multiple neurological conditions, but the limited knowledge about the clinical groups prone to deficits complicates their identification in clinical settings. Additionally, the nature of the disturbance underlying affective prosody disorder observed in different neurological conditions remains poorly understood. Aims: To bridge these knowledge gaps and provide relevant information to speech-language pathologists for the management of affective prosody disorders, this study provides an overview of research findings on affective-prosodic deficits in adults with neurological conditions by answering two questions: (1) Which clinical groups present with acquired affective prosodic impairments following brain damage? (2) Which aspects of affective prosody comprehension and production are negatively affected in these neurological conditions? Methods & procedures: We conducted a scoping review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews guidelines. A literature search was undertaken in five electronic databases (MEDLINE, PsycINFO, EMBASE, CINAHL and Linguistics, and Language Behavior Abstracts) to identify primary studies reporting affective prosody disorders in adults with neurological impairments. We extracted data on clinical groups and characterised their deficits based on the assessment task used. Outcomes & results: The review of 98 studies identified affective-prosodic deficits in 17 neurological conditions. The task paradigms typically used in affective prosody research (discrimination, recognition, cross-modal integration, production on request, imitation and spontaneous production) do not target the processes underlying affective prosody comprehension and production. 
Therefore, based on the current state of knowledge, it is not possible to establish the level of processing at which impairment occurs in clinical groups. Nevertheless, deficits in the comprehension of affective prosody are observed in 14 clinical groups (mainly recognition deficits) and deficits in the production of affective prosody (either on request or spontaneously) in 10 clinical groups. Neurological conditions and types of deficits that have not been investigated in many studies are highlighted. Conclusions & implications: The aim of this scoping review was to provide an overview on acquired affective prosody disorders and to identify gaps in knowledge that warrant further investigation. Deficits in the comprehension or production of affective prosody are common to numerous clinical groups with various neurological conditions. However, the underlying cause of affective prosody disorders across them is still unknown. Future studies should implement standardised assessment methods with specific tasks based on a cognitive model to identify the underlying deficits of affective prosody disorders. What this paper adds: What is already known on the subject: Affective prosody is used to share emotions and attitudes through speech and plays a fundamental role in communication and social interactions. Affective prosody disorders can occur in various neurological conditions, but the limited knowledge about the clinical groups prone to affective-prosodic deficits and about the characteristics of different phenotypes of affective prosody disorders complicates their identification in clinical settings. Distinct abilities underlying the comprehension and production of affective prosody can be selectively impaired by brain damage, but the nature of the disturbance underlying affective prosody disorders in different neurological conditions remains unclear.
What this study adds: Affective-prosodic deficits are reported in 17 neurological conditions, despite being recognised as a core feature of the clinical profile in only a few of them. The assessment tasks typically used in affective prosody research do not provide accurate information about the specific neurocognitive processes impaired in the comprehension or production of affective prosody. Future studies should implement assessment methods based on a cognitive approach to identify underlying deficits. The assessment of cognitive/executive dysfunctions, motor speech impairment and aphasia might be important for distinguishing primary affective prosodic dysfunctions from those secondarily impacting affective prosody. What are the potential clinical implications of this study? Raising awareness about the possible presence of affective-prosodic disorders in numerous clinical groups will facilitate their recognition by speech-language pathologists and, consequently, their management in clinical settings. A comprehensive assessment covering multiple affective-prosodic skills could highlight specific aspects of affective prosody that warrant clinical intervention.
Conference Paper
This study was conducted to determine whether acoustic analysis of speech prosody can assist in differential diagnosis of cognitive impairment. Speech samples describing the Cookie Theft picture were obtained from the DementiaBank and analyzed acoustically using 22 speech prosody features via the Praat software. Included in this speech dataset were 10 people with dementia of the Alzheimer’s type, 9 people with mild cognitive impairment, 5 people with vascular dementia, and 10 neurologically healthy controls. Principal component analysis and Mahalanobis distance tests revealed that using acoustic measures of speech prosody may be a functional method for differential diagnosis of the cognitive impairment types.
Article
Purpose: Diagnosis and classification of primary progressive aphasia (PPA) requires confirmation of specific speech and language symptoms, highlighting the important role of speech-language pathologists in the evaluation process. The purpose of this case report is to inform speech-language pathologists regarding current practices for diagnostic assessment in PPA, describing standard approaches as well as complementary, state-of-the-art procedures that may improve diagnostic precision. Method: We describe the diagnostic evaluation of a 49-year-old woman with complaints of progressive word-finding difficulty. She completed standard neurological, neuropsychological, and speech-language evaluations, as well as magnetic resonance and positron emission tomography imaging of her brain. In addition, a history of developmental speech, language, and learning abilities was obtained, as well as genetic testing and assessment of cerebrospinal fluid biomarkers. We discuss the evaluation results in the context of the most current research related to PPA diagnosis. Conclusion: Detailed behavioral assessment, thorough intake of symptom history and neurodevelopmental differences, multimodal neuroimaging, and comprehensive examination of genes and biomarkers are of paramount importance for detecting and characterizing PPA, with ramifications for early behavioral and/or pharmacological intervention. Supplemental Material: https://doi.org/10.23641/asha.12771113
Article
Purpose: The aim of the study was to use systematic review and meta-analysis to quantitatively assess the currently available acoustic evidence for prosodic production impairments as a result of right-hemisphere damage (RHD), as well as to develop methodological recommendations for future studies. Method: We systematically reviewed papers reporting acoustic features of prosodic production in RHD in order to identify shortcomings in the literature and make recommendations for future studies. We estimated the meta-analytic effect size of the acoustic features. We extracted standardized mean differences from 16 papers and estimated aggregated effect sizes using hierarchical Bayesian regression models. Results: RHD did present reduced fundamental frequency variation, but the trait was shared with left-hemisphere damage. RHD also presented evidence for increased pause duration. No meta-analytic evidence for an effect of prosody type (emotional vs. linguistic) was found. Conclusions: Taken together, the currently available acoustic data show only a weak specific effect of RHD on prosody production. However, the results are not definitive, as more reliable analyses are hindered by small sample sizes, lack of detail on lesion location, and divergent measuring techniques. We propose recommendations to overcome these issues: Cumulative science practices (e.g., open data and code sharing), more nuanced speech signal processing techniques, and the integration of acoustic measures and perceptual judgments are recommended to more effectively investigate prosody in RHD.
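The standardized mean difference mentioned in the abstract above expresses a group difference in units of standard deviation, which lets studies with different measurement scales be compared. A minimal sketch follows, assuming the pooled-standard-deviation form (Cohen's d); the abstract does not say which variant the authors used, and the function name and sample values are hypothetical. The hierarchical Bayesian aggregation step is not shown.

```python
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_variance ** 0.5


# Hypothetical fundamental-frequency variation values (Hz); not from any cited paper.
hypothetical_patients = [18.0, 22.0, 20.0, 19.0]
hypothetical_controls = [27.0, 31.0, 29.0, 30.0]
effect_size = standardized_mean_difference(hypothetical_patients, hypothetical_controls)
print(round(effect_size, 2))
```

A negative value here means the first group scored lower on the feature than the second.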
Article
Speech analysis could provide an indicator of Alzheimer's disease and help develop clinical tools for automatically detecting and monitoring disease progression. While previous studies have employed acoustic (speech) features for characterisation of Alzheimer's dementia, these studies focused on a few common prosodic features, often in combination with lexical and syntactic features which require transcription. We present a detailed study of the predictive value of purely acoustic features automatically extracted from spontaneous speech for Alzheimer's dementia detection, from a computational paralinguistics perspective. The effectiveness of several state-of-the-art paralinguistic feature sets for Alzheimer's detection was assessed on a balanced sample of DementiaBank's Pitt spontaneous speech dataset, with patients matched by gender and age. The feature sets assessed were the extended Geneva minimalistic acoustic parameter set (eGeMAPS), the emobase feature set, the ComParE 2013 feature set, and new Multi-Resolution Cochleagram (MRCG) features. Furthermore, we introduce a new active data representation (ADR) method for feature extraction in Alzheimer's dementia recognition. Results show that classification models based solely on acoustic speech features extracted through our ADR method can achieve accuracy levels comparable to those achieved by models that employ higher-level language features. Analysis of the results suggests that all feature sets contribute information not captured by other feature sets. We show that while the eGeMAPS feature set provides slightly better accuracy than other feature sets individually (71.34%), “hard fusion” of feature sets improves accuracy to 78.70%.
Conference Paper
Picture description tasks are used for the detection of cognitive decline associated with Alzheimer’s disease (AD). Recent years have seen work on automatic AD detection in picture descriptions based on acoustic and word-based analysis of the speech. These methods have shown some success but lack an ability to capture any higher level effects of cognitive decline on the patient’s language. In this paper, we propose a novel model that encompasses both the hierarchical and sequential structure of the description and detects its informative units with an attention mechanism. Automatic speech recognition (ASR) and punctuation restoration are used to transcribe and segment the data. Using the DementiaBank database of people with AD as well as healthy controls (HC), we obtain an F-score of 84.43% and 74.37% when using manual and automatic transcripts, respectively. We further explore the effect of adding additional data (a total of 33 descriptions collected using a ‘digital doctor’) during model training, and increase the F-score when using ASR transcripts to 76.09%. This outperforms baseline models, including a bidirectional LSTM and a bidirectional hierarchical neural network without an attention mechanism, and demonstrates that the use of hierarchical models with an attention mechanism improves the AD/HC discrimination performance.
Article
Communication accommodation describes how individuals adjust their communicative style to that of their conversational partner. We predicted that interpersonal prosodic correlation related to pitch and timing would be decreased in behavioral variant frontotemporal dementia (bvFTD). We predicted that the interpersonal correlation in a timing measure and a pitch measure would be increased in right temporal FTD (rtFTD) due to sparing of the neural substrate for speech timing and pitch modulation but loss of social semantics. We found no significant effects in bvFTD, but conversations including rtFTD demonstrated higher interpersonal correlations in speech rate than healthy controls.
Chapter
Frontotemporal dementia (FTD) is the second commonest cause of young onset dementia. Our understanding of FTD and its related syndromes has advanced significantly in recent years. Among the most prominent areas of progress is the overlap between FTD, MND, and other neurodegenerative conditions at a clinicopathologic and genetic level. In parallel major advances in neuroimaging techniques, the discovery of new genetic mutations as well as the development of potential biomarkers may serve to further expand knowledge of the biologic processes at play in FTD and may in turn propel research toward identifying curative and preventative pharmacologic therapies. The aim of this chapter is to discuss the clinical, pathologic, and genetic complexities of FTD and related disorders.
Article
Background: Recently, many studies have been carried out to detect Alzheimer's disease (AD) from continuous speech by linguistic analysis and modeling. However, few of them utilize language models (LMs) to extract linguistic features and to investigate the lexical-level differences between AD and healthy speech. Objective: Our goals include obtaining state-of-the-art performance of automatic AD detection, emphasizing N-gram LMs as powerful tools for distinguishing AD patients' narratives from those of healthy controls, and discovering the differences in lexical usage between AD patients and healthy people. Method: We utilize a subset of the DementiaBank corpus, including 240 control samples from 98 control participants and 256 AD samples from 194 "PossibleAD" or "ProbableAD" participants. Baseline models are built through area under curve-based feature selection and using five machine learning algorithms for comparison. Perplexity features are extracted using LMs to build enhanced detection models. Finally, the differences in lexical usage between AD patients and healthy people are investigated by a proportion test based on unigram probabilities. Results: Our baseline model obtains a detection accuracy of 80.7%. This accuracy increases to 85.4% after integrating the perplexity features derived from LMs. Further investigations show that AD patients tend to use more general, less informative, and less accurate words to describe characters and actions than healthy controls. Conclusion: The perplexity features extracted by LMs can benefit the automatic AD detection from continuous speech. There exist lexical-level differences between AD and healthy speech that can be captured by statistical N-gram LMs.
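Perplexity, the feature named in the abstract above, summarises how poorly a language model predicts a transcript: the higher the value, the less expected the word sequence. The following is a minimal sketch using a unigram model with add-one smoothing. The cited study used N-gram LMs without the abstract specifying order or smoothing, so the model choice, the function names, and the toy token lists here are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_unigram(tokens):
    """Count word frequencies; returns (counts, total token count)."""
    return Counter(tokens), len(tokens)


def perplexity(tokens, counts, total):
    """Perplexity of `tokens` under an add-one smoothed unigram model."""
    vocab_size = len(counts) + 1  # one extra slot for any unseen word
    log_prob = 0.0
    for token in tokens:
        # Counter returns 0 for unseen words, so smoothing keeps p above zero.
        probability = (counts[token] + 1) / (total + vocab_size)
        log_prob += math.log(probability)
    return math.exp(-log_prob / len(tokens))


# Toy training text and test transcripts; hypothetical, not from DementiaBank.
counts, total = train_unigram("the boy takes the cookie from the jar".split())
specific = perplexity("the boy takes the cookie".split(), counts, total)
vague = perplexity("he is getting that thing".split(), counts, total)
print(round(specific, 2), round(vague, 2))
```

In this toy case the vague description scores a higher perplexity because none of its words appear in the training text. A detection model could then use such scores as input features, which is the general idea the abstract describes.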