Article

Abstract

Purpose: Individuals with neurogenic speech disorders require ongoing therapeutic support to achieve functional communication goals. Alternative methods of service delivery, such as tablet-based speech therapy applications, may help bridge the gap and bring therapeutic interventions to the patient in an engaging way. The purpose of this study was to evaluate an iPad-based speech therapy app that uses automatic speech recognition (ASR) software to provide feedback on speech accuracy, to determine the ASR's accuracy against human judgment, and to assess whether participants' speech improved with this ASR-based feedback. Method: Five participants with apraxia of speech plus aphasia secondary to stroke completed an intensive 4-week at-home therapy program using a novel word-training app with built-in ASR. A multiple-baseline design across participants and behaviors was employed, with weekly probes and follow-up at 1 month posttreatment. Four sessions per week of 100 practice trials each were prescribed, with one session clinician-run and the remainder completed independently. The dependent variables of interest were ASR–human agreement on accuracy during practice trials and human-judged word production accuracy over time in probes. User experience surveys were also completed immediately posttreatment. Results: ASR–human agreement on accuracy averaged ~80%, a common threshold applied for interrater agreement. All participants demonstrated improved word production accuracy over time with the ASR-based feedback, and gains were maintained at 1 month. All participants reported enjoying using the app with the support of a speech pathologist. Conclusion: For these participants with apraxia of speech plus aphasia due to stroke, satisfactory gains in word production accuracy were made with an app-based therapy program providing ASR-based feedback on accuracy. The findings support further testing of this ASR-based approach as a supplement to clinician-run sessions, to help clients with similar profiles achieve a higher amount and intensity of practice and to empower them to manage their own therapy program. Supplemental Material: https://doi.org/10.23641/asha.8206628
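The headline metric here is point-to-point agreement between the ASR's correct/incorrect call and a human judge's call on each practice trial. As a minimal sketch of that computation (the trial data below are invented for illustration, not taken from the study):

```python
# Point-to-point agreement between ASR and human accuracy judgments.
# "asr" and "human" are hypothetical parallel lists of binary
# correct/incorrect calls, one entry per practice trial.

def percent_agreement(asr, human):
    """Percentage of trials on which the ASR and the human judge agree."""
    assert len(asr) == len(human)
    matches = sum(a == h for a, h in zip(asr, human))
    return 100.0 * matches / len(asr)

# Example: 8 of 10 trials agree -> 80.0, the threshold the study cites.
asr_calls   = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
human_calls = [1, 0, 0, 1, 0, 1, 1, 1, 1, 1]
print(percent_agreement(asr_calls, human_calls))  # 80.0
```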

... In a related study, an application-based treatment program using ASR technology was provided to adults with aphasia and apraxia of speech (Ballard et al., 2019). ...
... Although not directly comparable to child-centered ASR intervention programs, this study's use of ASR with disordered speech and its reported effectiveness in comparison to clinician judgments provide valuable information for general comparisons. In their study, Ballard et al. (2019) present an ASR platform that provides feedback regarding the accuracy of whole-word productions, not phonemic targets. The authors reported an "acceptable" average ASR-clinician agreement of approximately 80%, ranging from 65% to 82% across participants. ...
... The analyses demonstrated notably high percentages of agreement across phonemes for the ASA algorithm with the clinicians (on average, 92% for sounds judged as "acceptable"). This high level of agreement, which is comparable to or greater than the threshold of human (clinician) interrater performance (Ballard et al., 2019), highlights the potential of the platform's use in the absence of the clinician to enhance practice and carryover of already-stimulable target phoneme productions. Practice in the absence of a clinician should focus on reinforcing accurate productions. ...
Article
Purpose: Automatic speech analysis (ASA) and automatic speech recognition systems are increasingly being used in the treatment of speech sound disorders (SSDs). When utilized as a home practice tool or in the absence of the clinician, the ASA system has the potential to facilitate treatment gains. However, the feedback accuracy of such systems varies, a factor that may impact these gains. The current research analyzes the feedback accuracy of a novel ASA algorithm (Amplio Learning Technologies), in comparison to clinician judgments. Method: A total of 3,584 consonant stimuli, produced by 395 American English–speaking children and adolescents with SSDs (age range: 4–18 years), were analyzed with respect to automatic classification of the ASA algorithm, clinician–ASA agreement, and interclinician agreement. Further analysis of results as related to phoneme acquisition categories (early-, middle-, and late-acquired phonemes) was conducted. Results: Agreement between clinicians and ASA classification for sounds produced accurately was above 80% for all phonemes, with some variation based on phoneme acquisition category (early, middle, late). This variation was also noted for ASA classification into "acceptable," "unacceptable," and "unknown" (which means no determination of phoneme accuracy) categories, as well as interclinician agreement. Clinician–ASA agreement was reduced for misarticulated sounds. Conclusions: The initial findings of Amplio's novel algorithm are promising for its potential use within the context of home practice, as it demonstrates high feedback accuracy for correctly produced sounds. Furthermore, complexity of sound influences consistency of perception, both by clinicians and by automated platforms, indicating variable performance of the ASA algorithm across phonemes. Taken together, the ASA algorithm may be effective in facilitating speech sound practice for children with SSDs, even in the absence of the clinician.
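The abstract reports agreement broken down by phoneme acquisition category and by the ASA's three-way output. A small sketch of that kind of breakdown, with made-up trial records; note that counting "unknown" as a disagreement is our scoring choice here, whereas the paper treats it as its own category:

```python
from collections import defaultdict

# Hypothetical trial records: (phoneme category, clinician call, ASA call).
# Categories follow the early/middle/late acquisition grouping the study uses;
# the records themselves are invented for illustration.
trials = [
    ("early",  "acceptable",   "acceptable"),
    ("early",  "acceptable",   "acceptable"),
    ("middle", "acceptable",   "unknown"),
    ("middle", "unacceptable", "unacceptable"),
    ("late",   "acceptable",   "unacceptable"),
    ("late",   "unacceptable", "unacceptable"),
]

agree = defaultdict(lambda: [0, 0])  # category -> [agreements, total]
for category, clinician, asa in trials:
    agree[category][1] += 1
    agree[category][0] += (clinician == asa)

for category, (hits, total) in agree.items():
    print(f"{category}: {100 * hits / total:.0f}% clinician-ASA agreement")
```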
... In another similar study, Hernandez et al. developed a serious game with an automatic feedback feature for hearing impaired children (Céspedes-Hernández et al., 2015). Ballard et al. performed a feasibility study of their tablet-based, fully automated therapy tool for children with apraxia without any role of SLP and other stakeholders (Ballard et al., 2019). Moreover, V. Robles-Bykbaev et al. proposed a framework imitating the main functionality of SLP along with a robotic assistant motivating children in therapy activity and automatically giving real time feedback (V. ...
... The system automatically evaluates the correctness of the exercises and gives real time feedback. On the other hand, Ballard et al. proposed a tablet-based therapy tool for children with apraxia (Ballard et al., 2019). Furthermore, Ng et al. and Sztaho et al. proposed a computer-based prosody teaching system for children with hearing impairment and a computer-based visual feedback system for the hearing impaired, respectively (Ng et al., 2018; Sztahó et al., 2018). ...
... Few studies (4 out of 24) compared the results of their automated tool with the conventional mode of speech therapy provided by SLPs (see Figure 12). Ballard et al. conducted an interrater agreement test between their ASR tool and SLPs and found ASR-human agreement averaged 80% (Ballard et al., 2019). In another study, Sztaho et al. found that their automated tool scores correspond to the subjective evaluation by SLPs (Sztahó et al., 2018). ...
Preprint
Full-text available
This paper presents a systematic literature review of published studies on AI-based automated speech therapy tools for persons with speech sound disorders (SSD). The COVID-19 pandemic has initiated the requirement for automated speech therapy tools for persons with SSD making speech therapy accessible and affordable. However, there are no guidelines for designing such automated tools and their required degree of automation compared to the conventional speech therapy given by Speech Language Pathologists (SLPs). In this systematic review, we followed the PRISMA framework to address four research questions: 1) what types of SSD do AI-based automated speech therapy tools address, 2) what is the level of autonomy achieved by such tools, 3) what are the different modes of intervention, and 4) how effective are such tools in comparison with the conventional mode of speech therapy. An extensive search was conducted on digital libraries to find research papers relevant to our study from 2007 to 2022. The results show that AI-based automated speech therapy tools for persons with SSD are increasingly gaining attention among researchers. Articulation disorders were the most frequently addressed SSD based on the reviewed papers. Further, our analysis shows that most researchers proposed fully automated tools without considering the role of other stakeholders. Our review indicates that mobile-based and gamified applications were the most frequent mode of intervention. The results further show that only a few studies compared the effectiveness of such tools compared to the conventional mode of speech therapy. Our paper presents the state-of-the-art in the field, contributes significant insights based on the research questions, and provides suggestions for future research directions.
... Mauszycki and Wambaugh (2020). The remaining five of the 22 were articulatory kinematic treatments with some modifications. Varley et al. (2016) and Ballard et al. (2019) reported the use of computer/tablet-based therapy. Varley et al. (2016) used an RCT with a cross-over design to compare a speech intervention with a sham intervention. ...
... Varley et al. (2016) used an RCT with a cross-over design to compare a speech intervention with a sham intervention. Ballard et al. (2019) used a tablet-based treatment using automatic speech recognition software (ASR) to determine the effect of ASR feedback on speech improvement. Marangolo et al. (2013) compared the effect of speech therapy plus bihemispheric tDCS or a sham condition on change in speed and accuracy of articulation. ...
... Treatment outcomes. The primary outcome measures of the treated behaviors of 22/27 studies were the accuracy of targeted phoneme/blends/clusters in words or the accuracy of complete words (Ballard et al., 2019; Bislick et al., 2014; Bislick, 2020; Farias et al., 2014; Haley et al., 2021; Hurkmans et al., 2015; Johnson, 2018; Johnson, Lott, & Prebor, 2018; Marangolo et al., 2013; Mauszycki, Wright, et al., 2016; Mozeiko et al., 2020; Preston & Leaman, 2014; Wambaugh et al., 2013, 2016, 2017, 2020; Wambaugh, Wright, Boss, et al., 2018; Zumbansen et al., 2014). Verbal communication in daily life (Hurkmans et al., 2015), changes in speech motor learning (Johnson, Lasker, et al., 2018), decrease in speech symptoms (Jungblut et al., 2014), and informativeness in connected speech (Zumbansen et al., 2014) were other primary outcomes reported in the studies. ...
Article
Purpose: This systematic review aims to summarize and evaluate the available literature on speech and language therapy interventions for acquired apraxia of speech since 2012. Method: A systematic search in six electronic databases was performed from 2013 to 2020. The following primary outcomes were summarized: (a) improvement in targeted behaviors, (b) generalization, and (c) maintenance of outcomes. Moreover, studies were evaluated for the level of evidence and the clinical phase. Results: Of the 3,070 records identified, 27 studies were included in this review. The majority of the studies (n = 22) used articulatory kinematic approaches, followed by intersystemic facilitation/reorganization treatments (n = 4) and other approaches (n = 1). According to the classes defined in the Clinical Practice Guideline Process Manual (Gronseth et al., 2017), one was Class II, 10 were Class III, 10 were Class III-b (fulfill Class III criteria except for the independence-of-assessors criterion), and five were Class IV. In terms of clinical phase, one study was classified as Phase III, 10 as Phase II, and 15 as Phase I. Conclusions: Among the interventions for apraxia of speech, articulatory kinematic treatments have become prominent over the last 8 years. Focusing on self-administered therapies, use of technology for therapy administration, and development of treatments that address apraxia of speech and aphasia simultaneously were identified as new advancements in the apraxia of speech literature. The methodological quality, clinical phase, and level of evidence of the studies have improved within the past 8 years. Large-scale randomized controlled trials for articulatory kinematic approaches and future studies on other treatment approaches are warranted. Supplemental material: https://doi.org/10.23641/asha.22223785.
... The search identified 1529 possible studies for screening, 99 studies underwent a full text review, and ultimately 11 RCTs [16-18, 20, 25-31] and 18 quasi-experimental studies [32-49] met the eligibility criteria for inclusion in this manuscript (Fig. 1). No qualitative study met all the inclusion criteria and thus no qualitative studies were included. ...
... Six quasi-experimental studies ( Table 2) used therapy apps to study aphasia, and each showed improvement in aphasia recovery. The mobile app designs focused on expressive and receptive communication by creating visual associations with pictures [40,45,46] or using voice recognition software to guide tasks [34,36,44]. One study also used a spatial awareness game, Bejeweled, to target chronic (> 1 year) expressive aphasia but it had no impact on recovery [44]. ...
... Three RCTs (Table 5) and 4 quasi-experimental studies ( Table 2) assessed adherence to exercise using therapy apps, rehab videos, reminders, or a combination of rehab videos with reminders. One RCT [20] and 2 quasi-experimental studies [36,47] used therapy apps and measured exercise adherence. The RCT [20] showed an improvement in adherence to ambulation and the 2 quasi-experimental studies [36,47] found that therapy apps did not improve adherence to exercise. ...
Article
Full-text available
Background: Stroke is a significant contributor to worldwide disability and morbidity with substantial economic consequences. Rehabilitation is a vital component of stroke recovery, but inpatient stroke rehabilitation programs can struggle to meet the recommended hours of therapy per day outlined by the Canadian Stroke Best Practices and American Heart Association. Mobile applications (apps) are an emerging technology which may help bridge this deficit; however, this area is understudied. The purpose of this study is to review the effect of mobile apps for stroke rehabilitation on stroke impairments and functional outcomes. Specifically, this paper will delve into the impact of varying mobile app types on stroke rehabilitation. Methods: This systematic review included 29 studies: 11 randomized control trials and 18 quasi-experimental studies. Data extrapolation mapped 5 mobile app types (therapy apps, education apps, rehab videos, reminders, and a combination of rehab videos with reminders) to stroke deficits (motor paresis, aphasia, neglect), adherence to exercise, activities of daily living (ADLs), quality of life, secondary stroke prevention, and depression and anxiety. Results: There were multiple studies supporting the use of therapy apps for motor paresis or aphasia, rehab videos for exercise adherence, and reminders for exercise adherence. For permutations involving other app types with stroke deficits or functional outcomes (adherence to exercise, ADLs, quality of life, secondary stroke prevention, depression and anxiety), the results were either non-significant or limited by a paucity of studies. Conclusion: Mobile apps demonstrate potential to assist with stroke recovery and augment face-to-face rehabilitation; however, development of a mobile app should be carefully planned when targeting specific stroke deficits or functional outcomes. This study found that mobile app types which mimicked principles of effective face-to-face therapy (massed practice, task-specific practice, goal-oriented practice, multisensory stimulation, rhythmic cueing, feedback, social interaction, and constraint-induced therapy) and education (interactivity, feedback, repetition, practice exercises, social learning) had the greatest benefits. Protocol registration: PROSPERO (ID CRD42021186534). Registered 21 February 2021.
... Eight studies had developed app solutions to support training in relation to speech-related deficits. In a study by Ballard et al. (42), chronic stroke survivors improved their word production accuracy after having trained their speech using an app in addition to face-to-face visits. Participants expressed that they enjoyed the app training, but that regular contact with a speech pathologist was also important. ...
... The use of apps can thus support stroke rehabilitation, promote self-management and empowerment in the later stages of the rehabilitation process (80). However, in the majority of apps targeting training in a home setting, a health professional continuously supported/facilitated/adjusted/evaluated the home-training intervention using either face-to-face visits (66), and/or phone/Skype (42,45,48,53,57,65,74). In 2 studies support was supplied by significant others (58,76) and in 1 study there was a lack of information regarding support (59). ...
... Only three studies included in this scoping review aimed to support several components of the rehabilitation process (not only information regarding how to manage stroke-related deficits (video/text/pictures), but also discharge support, in addition to providing exercise programmes (videos and text) (76) and goal-setting (49)). Stroke survivors, their significant others, and health professionals have expressed a need for more timely information, a more coordinated cross-sectional transition from inpatient to outpatient rehabilitation, a better overview of the entire rehabilitation process, and improved follow-up and contact with health professionals after discharge (6,7,42,82). An app solution aimed at supporting people with chronic diseases and accommodating patients' needs for a more comprehensive solution and a greater overview of the rehabilitation process has been tested in patients newly diagnosed with osteoporosis in Denmark (83). ...
Article
Full-text available
Aim: The aim of this study was to describe, and review evidence of, mobile and web-based applications being used to support the rehabilitation process after stroke. The secondary aim was to describe participants' stroke severity, and use of applications in relation to, respectively, the setting and phase of the rehabilitation process. Method: A scoping review methodology was used to identify studies, through databases such as PubMed, Cinahl, Embase and AMED. Additionally, grey literature was searched. The studies were categorized using the model of rehabilitation by Derick Wade. Results: The literature search resulted in 10,142 records. Thirty-six studies were included, in which applications were used to support: assessment (n=13); training (n=20); discharge from hospital (n=2); and both training and discharge from hospital (n=1). Of the 36 studies, 25 included participants with mild to moderate stroke, and four included participants with severe stroke. In seven studies the stroke severity was not reported. Eighteen studies included participants with chronic stroke, 12 with acute-subacute stroke, and three with acute and/or subacute and/or chronic stroke. In three studies, stroke onset was not reported. Applications were used in a rehabilitation setting (n=16), a home setting (n=13), or both settings (n=3). In four studies the setting was not reported. Conclusion: Most included studies of applications developed to support the rehabilitation process after stroke have been explorative. They included primarily participants with mild or moderate stroke and focused on a limited aspect of the rehabilitation process, e.g., assessment or training. Future applications to support stroke rehabilitation should accommodate stroke survivors' and caregivers' need for solutions, irrespective of stroke severity and throughout the entire rehabilitation process.
... Given the structure of current rehabilitation programs, and perhaps the comfort of learners with AOS, it is unlikely that one-on-one speech intervention protocols can offer optimal practice opportunities in most clinical settings. As others have observed, technology-supported home programs can help bridge the gap by complementing clinician-administered treatment (Ballard, Etter, Shen, Monroe, & Tien Tan, 2019;Bilda, 2011;Cherney, Halper, Holland, & Cole, 2008;Fink, Brecher, Schwartz, & Robey, 2002;Hoover & Carney, 2014;Palmer et al., 2019;Varley et al., 2016). Some programs also have the capacity to support continued learning after the formal therapy period. ...
... Though not developed specifically for AOS, script training has been applied successfully to treat this disorder with some generalization to communication contexts, but not to untrained items (Henry et al., 2018;Youmans et al., 2011). Other treatment programs for speech production after stroke have also used words and phrases as intervention targets with good success (Ballard et al., 2019;Fridriksson et al., 2012;Friedman et al., 2010;Johnson et al., 2018;Lasker, Stierwalt, Spence, & Cavin-Root, 2010). ...
... Based on the principle of autonomy-support, there may be additional advantages to provide feedback at the request of the learner. Though potentially challenging during home practice, this type of feedback might be accomplished through automated speech recognition, as has been incorporated in treatment programs that prioritize external feedback (Ballard et al., 2019) or, if delayed feedback is helpful, simply by saving select self-recordings to share with the treating clinician in upcoming sessions. Interestingly, the motor learning literature indicates that when learners are free to decide when to receive external feedback, they tend to request it for their most accurate trials rather than trials in which they struggle (Wulf & Lewthwaite, 2016). ...
Article
Background: Most treatments for acquired apraxia of speech (AOS) rely on clinician-controlled practice conditions and repeated exposure to unimpaired speaker models. Recent motor learning research indicates that autonomy-support, expectation of competence, and external attentional focus may be more beneficial for motivation and skill learning. Aims: We evaluated the feasibility and basic therapeutic effect for the initial phase of a new speech production treatment program, ActionSC, that uses self-modeling and clinician coaching to help learners with AOS build their own practice program. Methods and Procedures: The single participant was a woman with moderate AOS and nonfluent aphasia. She met with project staff twice per week to review practice strategies, develop and adjust self-modeled video cues, work on her speech, and monitor progress. The program was structured around a custom app installed on a tablet computer. Most practice was directed by the participant based on options provided by the treating clinician. We used a multiple baseline across behavior design to evaluate the relationship between this treatment and oral reading probe performance for 30 conversational phrases the participant wanted to learn to say. Outcomes and Results: Experimental control was demonstrated, with target phrases remaining at baseline levels, then improving at the time treatment was introduced sequentially across three conversation topics. Effect sizes were moderate to large after 9–12 therapy sessions plus independent home practice. The participant assumed an active role in evaluating her own performance, administering and adjusting cues, and organizing her home practice. Conclusions: An autonomy-supportive and confidence-building format for speech practice can be feasible and effective for people with AOS. Fixed cueing hierarchies, augmented feedback, and attentional focus on speech movements may be less important in AOS treatment than previously thought. In addition to replicating our preliminary results with other participants and circumstances, there is a need to extend treatment development to later learning phases in order to promote positive change in real-life settings.
... In many fields, strings are converted to numbers for practical use (Bird et al., 2009). For example, converting strings to numbers powers chatbots (Dale, 2016;Raj, 2019), speech-to-text software (Ballard, Etter, Shen, Monroe, & Tan, 2019;Hair et al., 2020;Malik, Malik, Mehmood, & Makhdoom, 2021), popular search engines such as Google (Rogers, 2002), and virtual assistant technologies such as Alexa or the Echo (Kim, 2018). Though publications with these technologies are not common in behavior analysis, there are examples of counting the number of times specific words are emitted (Dounavi, 2014;Petursdottir, Carr, & Michael, 2005), the proportion of times a verbal response is emitted following other verbal stimuli (Carp & Pettursdottir, 2015;Partington & Bailey, 1993), and quantitatively analyzing the behavior analytic published literature (Blair, Shawler, Debacher, Harper, & Dorsey, 2018;Dixon, Reed, Smith, Belisle, & Jackson, 2015). ...
... Common examples include Apple's Siri, Amazon's Alexa, and Google's Assistant. Such automated recording of verbal behavior is being used to improve speech intelligibility (Ballard et al., 2019;Hair et al., 2020) with easy-to-imagine future use cases of providing consistent automated feedback on pronunciation and automated reinforcement to shape vocal-verbal behavior. ...
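The premise of the preceding snippets, converting strings to numbers for practical use, is easy to make concrete: the usual first step is mapping each word type to an integer ID. A minimal sketch, with the vocabulary and sentence invented for illustration:

```python
# "Converting strings to numbers": assign each distinct token an integer ID,
# the first step shared by chatbots, speech-to-text systems, and search indexing.

def build_vocab(tokens):
    """Assign a stable integer ID to each distinct token, in order of first use."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

words = "say the word again say it slowly".split()
vocab = build_vocab(words)
ids = [vocab[w] for w in words]
print(vocab)  # {'say': 0, 'the': 1, 'word': 2, 'again': 3, 'it': 4, 'slowly': 5}
print(ids)    # [0, 1, 2, 3, 0, 4, 5]
```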
... Most crucially, the use of ASR has been shown to be effective in estimating speakers' intelligibility deficits for different clinical populations who may present with speech impairments [13], such as those resulting from a laryngectomy [14], a cleft palate [15], or head and neck cancer [16]. Additionally, the clinical validity of ASR has also been explored in individuals with apraxia of speech and aphasia with promising results [17,18]. Project Euphonia has achieved a large-scale data set with over 1 million recordings of disordered speech, with the ultimate goal to personalize ASR models to enhance communication in individuals who experience speech and language difficulties [19,20]. ...
... The potential capacity of ASR to outperform human listeners has been shown in recent studies [19], although further work is required with longer utterances and different speech tasks, as summarized in the limitations section below. Our findings also echo those reported with other clinical populations, such as those with a diagnosis of apraxia of speech and aphasia [17,18]. Additionally, our data provided no evidence that the mean probability of ASR success differed between the 2 groups of speakers, either a speaker with dysarthria or a healthy control. ...
Article
Full-text available
Background: Most individuals with Parkinson disease (PD) experience a degradation in their speech intelligibility. Research on the use of automatic speech recognition (ASR) to assess intelligibility is still sparse, especially when trying to replicate communication challenges in real-life conditions (ie, noisy backgrounds). Developing technologies to automatically measure intelligibility in noise can ultimately assist patients in self-managing their voice changes due to the disease. Objective: The goal of this study was to pilot-test and validate the use of a customized web-based app to assess speech intelligibility in noise in individuals with dysarthria associated with PD. Methods: In total, 20 individuals with dysarthria associated with PD and 20 healthy controls (HCs) recorded a set of sentences using their phones. The Google Cloud ASR API was used to automatically transcribe the speakers' sentences. An algorithm was created to embed speakers' sentences in +6-dB signal-to-noise multitalker babble. Results from ASR performance were compared to those from 30 listeners who orthographically transcribed the same set of sentences. Data were reduced into a single event, defined as a success if the artificial intelligence (AI) system transcribed a random speaker or sentence as well or better than the average of 3 randomly chosen human listeners. These data were further analyzed by logistic regression to assess whether AI success differed by speaker group (HCs or speakers with dysarthria) or was affected by sentence length. A discriminant analysis was conducted on the human listener data and AI transcriber data independently to compare the ability of each data set to discriminate between HCs and speakers with dysarthria. Results: The data analysis indicated a 0.8 probability (95% CI 0.65-0.91) that AI performance would be as good or better than the average human listener. AI transcriber success probability was not found to be dependent on speaker group. AI transcriber success was found to decrease with sentence length, losing an estimated 0.03 probability of transcribing as well as the average human listener for each word increase in sentence length. The AI transcriber data were found to offer the same discrimination of speakers into categories (HCs and speakers with dysarthria) as the human listener data. Conclusions: ASR has the potential to assess intelligibility in noise in speakers with dysarthria associated with PD. Our results hold promise for the use of AI with this clinical population, although a full range of speech severity needs to be evaluated in future work, as well as the effect of different speaking tasks on ASR.
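The noise-embedding step described above, placing each sentence in multitalker babble at a +6-dB signal-to-noise ratio, reduces to scaling the noise so that the speech-to-noise power ratio hits the target. A rough sketch, assuming same-length float arrays and ignoring details such as level normalization and padding:

```python
import numpy as np

def mix_at_snr(speech, babble, snr_db=6.0):
    """Scale babble so the speech-to-babble power ratio equals snr_db, then mix."""
    speech_power = np.mean(speech ** 2)
    babble_power = np.mean(babble ** 2)
    # Required babble power for the target SNR: P_speech / P_babble = 10^(snr/10)
    target_babble_power = speech_power / (10 ** (snr_db / 10))
    babble_scaled = babble * np.sqrt(target_babble_power / babble_power)
    return speech + babble_scaled

# Toy signals standing in for a recorded sentence and multitalker babble.
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.linspace(0, 1, 16000))
babble = rng.standard_normal(16000)
mixed = mix_at_snr(speech, babble, snr_db=6.0)
```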
... Other groups have made inroads in clinically validating ASR for dysarthria by investigating the relationship between perceptual severity measures and ASR transcription (Tu et al., 2016). For example, comparing human transcription and ASR transcription, Jacks et al. (2019) found very high correlations (Spearman ρ = .96-.98) using IBM Watson for speakers with aphasia and/or apraxia of speech (AOS) following a stroke; Maier et al. (2010) reported Spearman rho between −.88 and −.90 for a hidden Markov model (HMM)-based ASR system used on head and neck cancer patients with dysglossia and dysphonia; and Ballard et al. (2019) found agreement of 75.7% between human and ASR (using CMU Pocket-Sphinx) judgments of word-level productions by people with aphasia and AOS following stroke. Looking at the relationship between Google ASR accuracy and clinician-rated severity, Tu et al. (2016) found a moderate correlation (Pearson r = .69) ...
... for speakers with dysarthria when using the Google ASR engine. Still, the evidence for clinical validity of ASR-as applied to a specific clinical populationremains scant and has primarily been evaluated for aphasia and AOS (e.g., Ballard et al., 2019;Jacks et al., 2019). ...
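The validation strategy running through these snippets is rank correlation between ASR output and perceptual ratings. A minimal sketch of that check, with invented paired scores for eight speakers:

```python
from scipy.stats import spearmanr

# Hypothetical paired scores: clinician-rated severity (higher = more severe)
# and ASR word accuracy on the same recordings. Values are illustrative only.
severity     = [0, 1, 1, 2, 2, 3, 3, 4]
asr_accuracy = [0.95, 0.90, 0.88, 0.80, 0.74, 0.62, 0.55, 0.40]

rho, p_value = spearmanr(severity, asr_accuracy)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.4f}")  # strongly negative here
```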
Article
Purpose: There is increasing interest in using automatic speech recognition (ASR) systems to evaluate impairment severity or speech intelligibility in speakers with dysarthria. We assessed the clinical validity of one currently available off-the-shelf (OTS) ASR system (i.e., a Google Cloud ASR API) for indexing sentence-level speech intelligibility and impairment severity in individuals with amyotrophic lateral sclerosis (ALS), and we provided guidance for potential users of such systems in research and clinic. Method: Using speech samples collected from 52 individuals with ALS and 20 healthy control speakers, we compared word recognition rate (WRR) from the commercially available Google Cloud ASR API (Machine WRR) to clinician-provided judgments of impairment severity, as well as sentence intelligibility (Human WRR). We assessed the internal reliability of Machine and Human WRR by comparing the standard deviation of WRR across sentences to the minimally detectable change (MDC), a clinical benchmark that indicates whether results are within measurement error. We also evaluated Machine and Human WRR diagnostic accuracy for classifying speakers into clinically established categories. Results: Human WRR achieved better accuracy than Machine WRR when indexing speech severity, and, although related, Human and Machine WRR were not strongly correlated. When the speech signal was mixed with noise (noise-augmented ASR) to reduce a ceiling effect, Machine WRR performance improved. Internal reliability metrics were worse for Machine than Human WRR, particularly for typical and mildly impaired severity groups, although sentence length significantly impacted both Machine and Human WRRs. Conclusions: Results indicated that the OTS ASR system was inadequate for early detection of speech impairment and grading overall speech severity. While Machine and Human WRR were correlated, ASR should not be used as a one-to-one proxy for transcription speech intelligibility or clinician severity ratings. Overall, findings suggested that the tested OTS ASR system, Google Cloud ASR, has limited utility for grading clinical speech impairment in speakers with ALS.
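Word recognition rate (WRR), the core measure in this abstract, is the fraction of reference words recovered in a transcript. A rough sketch using a generic sequence alignment; the alignment method (difflib) and minimal text normalization here are our simplifications, not the study's protocol:

```python
import difflib

def word_recognition_rate(reference, hypothesis):
    """Fraction of reference words recovered in the transcript (0-1)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Align the two word sequences and count matched words.
    matcher = difflib.SequenceMatcher(a=ref, b=hyp)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)

print(word_recognition_rate("the quick brown fox jumps", "the brown fox jump"))  # 0.6
```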
... An updated version of this reported an average ASR accuracy of 82%, with ranges between 69% and 93% across patients (Abad et al., 2013). The second group (Ballard et al., 2019) evaluated a digitally delivered intervention using picture naming tasks in native Australian English speaking people with both apraxia and aphasia. They used the open-source ASR engine CMU PocketSphinx (Cmusphinx/Pocketsphinx, 2014/2020) to provide patients with 'correct'/'incorrect' feedback for each of their naming attempts during treatment. ...
... In the literature, to date, two key studies have evaluated ASR systems' performance on aphasic speakers' word naming attempts (Abad et al., 2013; Ballard et al., 2019). The level of heterogeneity between the three studies (including ours) is high (different languages spoken, types of aphasia, level of impairment, vocabulary assessed); nevertheless, NUVA does appear to offer both more accurate and less variable performance. ...
Article
Full-text available
Anomia (word-finding difficulties) is the hallmark of aphasia, an acquired language disorder most commonly caused by stroke. Assessment of speech performance using picture naming tasks is a key method for both diagnosis and monitoring of responses to treatment interventions by people with aphasia (PWA). Currently, this assessment is conducted manually by speech and language therapists (SLT). Surprisingly, despite advancements in automatic speech recognition (ASR) and artificial intelligence with technologies like deep learning, research on developing automated systems for this task has been scarce. Here we present NUVA, an utterance verification system incorporating a deep learning element that classifies 'correct' versus 'incorrect' naming attempts from aphasic stroke patients. When tested on eight native British-English speaking PWA, the system's performance accuracy ranged from 83.6% to 93.6%, with a 10-fold cross-validation mean of 89.5%. This performance was not only significantly better than a baseline created for this study using one of the leading commercially available ASRs (Google speech-to-text service) but also comparable in some instances with two independent SLT ratings for the same dataset.
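NUVA itself is a deep learning classifier, but the underlying task, utterance verification, can be illustrated with a much simpler transcript-versus-target baseline: accept a naming attempt if the normalized edit distance between the ASR transcript and the target word is small. This sketch is not NUVA's method, and the threshold is arbitrary:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def verify_attempt(target, transcript, threshold=0.25):
    """Call a naming attempt 'correct' if the normalized distance is small."""
    dist = edit_distance(target.lower(), transcript.lower())
    return dist / max(len(target), 1) <= threshold

print(verify_attempt("elephant", "elephant"))  # True
print(verify_attempt("elephant", "mellow"))    # False
```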
... AOS is a neurogenic motor speech disorder that typically co-occurs with aphasia. It has been shown to respond positively to behavioral intervention, even when chronic (Ballard, Etter, Shen, Monroe, & Tan, 2019). Symptoms of AOS are thought to result from disruptions in speech motor planning or programming and include articulation errors (often distortions), slow rate of speech, and prosodic abnormalities (Duffy, 2013;McNeil, Robin, & Schmidt, 2009). ...
... Warren, Fey, and Yoder (2007) described numerous intensity variables that are pertinent to the maximization of treatment effects (i.e., dose, dose form, dose frequency, total intervention duration, and cumulative intervention intensity; Warren et al., 2007). Ballard et al. (2019) and Wambaugh, Duffy, McNeil, Robin, and Rogers (2006) provided rudimentary data such as average number of sessions, intervention frequency, and intervention duration for AOS treatment reports included in their systematic review; however, these intensity variables were not evaluated relative to treatment response. Beyond summarizations in systematic reviews of the amount of treatment provided, dose frequency in combination with total intervention duration have been the only intensity variables studied with respect to AOS treatment response (Wambaugh et al., 2013, 2018). ...
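Warren, Fey, and Yoder (2007) define cumulative intervention intensity as the product of dose, dose frequency, and total intervention duration. As a toy calculation using the prescription from the Ballard et al. (2019) abstract above (100 trials per session, 4 sessions per week, 4 weeks), with the simplifying assumption that each practice trial counts as one teaching episode:

```python
# Cumulative intervention intensity per Warren, Fey, and Yoder (2007):
# dose x dose frequency x total intervention duration.

def cumulative_intensity(dose, dose_frequency, duration):
    """dose: teaching episodes per session; dose_frequency: sessions per week;
    duration: weeks of intervention."""
    return dose * dose_frequency * duration

# Values mirror the Ballard et al. (2019) prescription described above.
print(cumulative_intensity(dose=100, dose_frequency=4, duration=4))  # 1600 trials
```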
Article
Purpose: The aim of this study was to examine the effects of dose frequency, an aspect of treatment intensity, on articulation outcomes of sound production treatment (SPT). Method: Twelve speakers with apraxia of speech and aphasia received SPT administered with an intense dose frequency (SPT-Intense) and a nonintense/traditional dose frequency (SPT-T). Each participant received both treatment intensities in the context of multiple baseline designs across behaviors. SPT-Intense was provided in 3 hourly sessions per day, 3 days per week, and SPT-T in 1 hour-long session per day, 3 days per week. Twenty-seven treatment sessions were completed within each phase of treatment. Articulation accuracy was measured in probes of production of treated and untreated words. Results: All participants achieved improved articulation of treated words with both intensities; there were no notable differences in magnitude of improvement associated with dose frequency. Positive response generalization to untrained words was found in 21 of 24 treatment applications; the cases of negligible response generalization occurred with SPT-T words. Conclusions: Dose frequency (and corresponding total intervention duration) did not appear to impact treatment response for treated items. Disparate response generalization findings for 3 participants in the current study may relate to participant characteristics such as apraxia of speech severity and/or stimuli factors.
... Continually training machine learning models with the speech of PWA can improve the machine's evaluation of aphasic speech (49), thereby better assisting PWA in rehabilitation. Furthermore, AI technology can be integrated with specific equipment and technologies to assist communication for PWA, such as incorporating speech recognition technology into devices like iPads (50,51), thus providing automatic feedback and enabling PWA to undertake self-directed rehabilitation training. ...
Article
Full-text available
Aphasia is a language disorder caused by brain injury that often results in difficulties with speech production and comprehension, significantly impacting the affected individuals’ lives. Recently, artificial intelligence (AI) has been advancing in medical research. Utilizing machine learning and related technologies, AI develops sophisticated algorithms and predictive models, and can employ tools such as speech recognition and natural language processing to autonomously identify and analyze language deficits in individuals with aphasia. These advancements provide new insights and methods for assessing and treating aphasia. This article explores current AI-supported assessment and treatment approaches for aphasia and highlights key application areas. It aims to uncover how AI can enhance the process of assessment, tailor therapeutic interventions, and track the progress and outcomes of rehabilitation efforts. The article also addresses the current limitations of AI’s application in aphasia and discusses prospects for future research.
... In clinical settings, automated tools for detecting paraphasias in an individual's speech can ultimately allow for more efficient and consistent assessment procedures. Additionally, for supplementary treatment options such as remote, self-directed speech therapy (via smartphone), automatically identifying paraphasic errors is critical in providing constructive feedback to the user [6,7]. ...
... Studying automatic speech recognition (ASR) for children's speech has led to significant improvements in the areas of voice search [1], language learning for kids [2], [3], diagnosis and remedial therapy for pathological speech [4], and even toys and games. For more than two decades, researchers have developed a myriad of techniques [5]- [8] and collected children's speech corpora to further spur research in this field [9], [10]. ...
Preprint
Aphasia is a language disorder that can lead to speech errors known as paraphasias, which involve the misuse, substitution, or invention of words. Automatic paraphasia detection can help those with aphasia by facilitating clinical assessment and treatment planning options. However, most work on automatic paraphasia detection has focused solely on binary detection, which involves recognizing only the presence or absence of a paraphasia. Multiclass paraphasia detection represents an unexplored area of research that focuses on identifying multiple types of paraphasias and where they occur in a given speech segment. We present novel approaches that use a generative pretrained transformer (GPT) to identify paraphasias from transcripts, as well as two end-to-end approaches that model automatic speech recognition (ASR) and paraphasia classification as multiple sequences versus a single sequence. We demonstrate that a single sequence model outperforms GPT baselines for multiclass paraphasia detection.
... Indeed, most of the research available on ASR for writing involves users with learning difficulties (e.g. Ballard et al., 2019;Le et al., 2018;Quinlan, 2004). ...
Conference Paper
This study explores the potential of Automatic Speech Recognition (ASR) as a writing tool by investigating user behaviours (strategies henceforth) and text quality (lexical diversity) when users engage with the technology. Thirty English second language writers dictated texts into an ASR system (Google Voice Typing) while also using optional additional input devices, such as keyboards and mice. Analysis of video recordings and field observations revealed four strategies employed by users to produce texts: use of ASR exclusively, ASR in tandem with keyboarding, ASR followed by keyboarding, and ASR followed by both keyboarding and ASR. These strategies reflected cognitive differences and text generation challenges. Text quality was operationalized through lexical diversity metrics. Results showed that ASR use in tandem with keyboarding and ASR followed by both keyboarding and ASR yielded greater lexical diversity, whereas the use of ASR exclusively or ASR followed by keyboarding had lower diversity. Findings suggest that the integrated use of ASR and keyboarding activates dual channels, thus dispersing cognitive load and possibly improving text quality (i.e. lexical diversity). This exploratory study demonstrates potential for ASR as a complementary writing tool and lays groundwork for further research on the strategic integration of ASR and keyboarding to improve the quality of written texts.
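Lexical diversity can be operationalized in several ways, and the abstract does not name the specific metrics used, so as one common, length-robust example, here is a sketch of the moving-average type-token ratio (MATTR):

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio, a length-robust lexical diversity metric.

    Averages the type/token ratio over sliding windows; texts shorter than the
    window fall back to a single whole-text window.
    """
    window = min(window, len(tokens))
    ratios = [len(set(tokens[i:i + window])) / window
              for i in range(len(tokens) - window + 1)]
    return sum(ratios) / len(ratios)

# Toy text standing in for a dictated passage.
text = "the cat sat on the mat and the dog sat on the rug".split()
print(round(mattr(text, window=5), 3))
```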
... In a recent work [7], the authors showed the possibility of using EEG-based automatic speech recognition systems as a feedback tool in speech therapy for patients with aphasia. The results presented in another article show the first step towards demonstrating the feasibility of using non-invasive neural signals to develop a reliable real-time speech prosthesis for stroke survivors suffering from aphasia, apraxia and dysarthria [9]. ...
Chapter
Full-text available
Existing studies in the field of speech disorders do not provide a systematic understanding of the relationship between the bioelectrical activity of the brain and the nature of speech disorders, the characteristics of the processes of speech perception and internal pronunciation. This work is aimed at comparing the activity of the brain during the internal pronunciation of words by a group of people without speech disorders and a group of people with rhotacism. For the first time, an analysis and comparison of event-related potentials (ERP) of the brain in the process of internal pronunciation in people with and without rhotacism was carried out. The electroencephalographic (EEG) study involved 36 people, 18 of whom had a speech disorder in the form of rhotacism. The subjects were presented with auditory stimuli (words) spoken by a speaker with standard sound pronunciation. The subject's task was to mentally repeat the word, maintaining the intonation and pronunciation features, as in external speech. The results obtained in this study using a new method of localization of brain activity demonstrate significant differences in ERP during the mental pronunciation of words between the studied groups of people in a number of brain structures, including cortical and subcortical formations. The group of people with rhotacism is characterized by the presence of a pronounced ERP N200 in evolutionarily earlier brain structures, such as the midbrain, the medulla oblongata and the left insular lobe. The group of people without speech disorders is characterized by the presence of pronounced ERPs in the following structures: the caudate nuclei on the right and left, the right globus pallidus, the cingulate cortex, the striatum, the dorsomedial prefrontal cortex, the anterior cingulate cortex, area 17 on the right and left, Broca's area on the right, Wernicke's area on the right, the angular gyrus on the right, and the anterior prefrontal cortex on the right and left. All differences were estimated with 95% confidence intervals.
... For example, an evidence-based behavior change technique was used through interactive mobile applications [13], and a finger training app on tablet PCs was developed to restore the ability to use the affected hands of stroke patients [14]. Ballard et al. [15] developed a language therapy application to improve the word-production ability of stroke patients suffering from apraxia of speech and aphasia. ...
Article
Full-text available
Rehabilitation training is essential for a successful recovery of upper extremity function after stroke. Training programs are typically conducted in hospitals or rehabilitation centers, supervised by specialized medical professionals. However, frequent visits to hospitals can be burdensome for stroke patients with limited mobility. We consider a self-administered rehabilitation system based on a mobile application in which patients can periodically upload videos of themselves performing reach-to-grasp tasks to receive recommendations for self-managed exercises or progress reports. Sensing equipment aside from cameras is typically unavailable in the home environment. A key contribution of our work is to propose a deep learning-based assessment model trained only with video data. As all patients carry out identical tasks, a fine-grained assessment of task execution is required. Our model addresses this difficulty by learning RGB and optical flow data in a complementary manner. The correlation between the RGB and optical flow data is captured by a novel module for modality fusion using cross-attention with Transformers. Experiments showed that our model achieved higher accuracy in movement assessment than existing methods for action recognition. Based on the assessment model, we developed a patient-centered, solution-based mobile application for upper extremity exercises for hemiplegia, which can recommend 57 exercises with three levels of difficulty. A prototype of our application was evaluated by potential end-users and achieved a good quality score on the Mobile Application Rating Scale (MARS).
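The fusion module described here, cross-attention between RGB and optical-flow features using Transformers, can be sketched in a few lines of PyTorch. The dimensions, single-block design, and query/key assignment below are assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Illustrative cross-attention fusion of RGB and optical-flow features.

    RGB tokens act as queries over the flow tokens, so each appearance
    feature attends to the motion features that complement it.
    """
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rgb_tokens, flow_tokens):
        # rgb_tokens: (batch, n_rgb, dim); flow_tokens: (batch, n_flow, dim)
        fused, _ = self.attn(query=rgb_tokens, key=flow_tokens, value=flow_tokens)
        return self.norm(rgb_tokens + fused)  # residual connection

fusion = CrossModalFusion()
rgb = torch.randn(2, 16, 256)   # hypothetical per-frame RGB features
flow = torch.randn(2, 16, 256)  # hypothetical per-frame optical-flow features
print(fusion(rgb, flow).shape)  # torch.Size([2, 16, 256])
```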
... Achieving therapistindependent training for verbal speech production necessitates the use of automatic speech recognition (ASR) technology to recognize and assess spoken words. There is initial evidence suggesting that digital speech recognition technologies utilizing ASR can improve verbal word production in individuals with aphasia and apraxia of speech (Ballard et al., 2019). ...
Article
Full-text available
Introduction: LingoTalk is a German speech-language app designed to enhance lexical retrieval in individuals with aphasia. It incorporates automatic speech recognition (ASR) to provide therapist-independent feedback. The execution and effectiveness of a self-administered intervention with LingoTalk was explored in a case series study. Methods: Three individuals with chronic aphasia participated in a highly individualized, supervised self-administered intervention lasting 3 weeks. The LingoTalk app closely monitored the frequency, intensity and progress of the intervention. Treatment efficacy was assessed using a multiple baseline design, examining both item-specific treatment effects and generalization to untreated items, an untreated task, and spontaneous speech. Results: All participants successfully completed the intervention with LingoTalk, although one participant was not able to use the ASR feature. None of the participants fully adhered to the treatment protocol. All participants demonstrated significant and sustained improvement in the naming of practiced items, although there was limited evidence of generalization. Additionally, there was a slight reduction in word-finding difficulties during spontaneous speech. Discussion: This small-scale study indicates that self-administered intervention with LingoTalk can improve oral naming of treated items. Thus, it has the potential to complement face-to-face speech-language therapy, such as within a "flipped speech room" approach. The choice of feedback mode is discussed. Transparent progress monitoring of the intervention appears to positively influence patients' motivation.
... All the articles reviewed presented either 1) data that was collected from or 2) systems/models that were tested by people with aphasia who had a variety of language backgrounds (Cantonese, Chinese (Taiwan), English, Finnish, German, Italian, Portuguese (Brazil), and Swedish) and who had a variety of different aphasia syndromes (Broca, Wernicke, Global, Anomic, Transcortical, Conduction, or Residual aphasia) and severity levels. It is of interest to note that our search also found articles that investigated the use of an iPad™ as a tool in aphasia rehabilitation (for example, Ballard et al., 2019; Hoover & Carney, 2014; Stark & Warburton, 2018). However, these were not included because, while the iPad™ does use AI (and automatic speech recognition (ASR) technology), the main focus of these studies was not to examine how the AI used in the device was affecting the rehabilitation process; rather, the authors sought to determine whether and/or how the addition of an iPad™ to aphasia therapy could be beneficial. ...
Article
Background In recent years, artificial intelligence (AI) has become commonplace in our daily lives, making its way into many different settings, including health and rehabilitation. While there is an increase in research on AI use in different sectors, information is sparse regarding whether and how AI is used in aphasia rehabilitation. Aims The objective of this scoping review was to describe and understand how AI is currently being used in the rehabilitation of people with aphasia (PWA). Our secondary goal was to determine if and how AI is being integrated into augmentative and alternative communication (AAC) devices or applications for aphasia rehabilitation. Methods Using the Arksey and O’Malley (2005) and Levac and colleagues (2010) frameworks, we identified the research question: In what way is artificial intelligence (AI) used in language rehabilitation for people with aphasia (PWA)? We then selected search terms and searched six databases, which resulted in the identification of 663 studies. Based on the inclusion criteria, 28 suitable studies were retained. We then charted, collated and summarised the data in order to generate four main themes: (1) AI used for the classification or diagnosis of aphasia/aphasic syndromes or for the classification or diagnosis of primary progressive aphasia (PPA)/PPA variants; (2) AI used for aphasia therapy; (3) AI used to create models of lexicalization; and (4) AI used to classify paraphasic errors. Results None of the articles retained incorporated AI in AAC devices or applications in the context of aphasia rehabilitation. The majority of articles (n=17) used AI to classify aphasic syndromes or to differentiate PWA from healthy controls or persons with dementia. Another subset of articles (n=7) used AI in an attempt to augment an aphasia therapy intervention. Finally, two articles used AI to create a model of lexicalization and another two used AI to classify different types of paraphasias in the utterances of PWA. Conclusion Regarding the performance accuracy of the diagnostic tools, results show that, regardless of the type of AI approach used, models were able to differentiate between aphasic syndromes with a relatively high level of accuracy. Although significant advancements in AI and more interaction between the fields of aphasia rehabilitation and AI are required before AI can be integrated into aphasia rehabilitation, it nevertheless has the potential to be a central component of novel AAC devices or applications and be incorporated into innovative methods for aphasia assessment and therapy. However, for a transition to the clinic, new technologies or interventions using AI will need to be assessed to determine their efficacy and acceptance by both speech-language pathologists and PWA.
... In the field of clinical linguistics, the study conducted by Ballard et al. (2019) showed that a tablet- or Android-based language application containing word games with a combination of images, audio, and video can improve the language skills of people with aphasia caused by stroke. In another context, that of legal cases, Subyantoro's (2019) literature research reported that there are three objects of forensic linguistics, namely, (1) language as a legal product; (2) language in the judicial process; and (3) language as evidence. ...
Article
Full-text available
This study investigates hybridity across linguistic studies from 2017 to 2019. It specifically attempts to identify the trends of hybrid linguistic areas. To gain a clear insight into the issue, qualitative text analysis was adopted as the design of the study. As the data sources, 304 research articles in linguistics were retrieved from the digital databases of internationally reputable linguistics journals. From each year, the most recently released articles were purposively selected as the data sources. To gain a clear insight into the hybrid areas across linguistic studies, the initial analysis was carried out on the titles of the research articles. Further analysis was also conducted on the abstracts and research questions. Based on the analysis of the titles, abstracts and research questions, it was found that there were 16 types of hybridity across linguistic studies from 2017 to 2019. The two most frequent hybrid linguistic fields, in sequence, were ‘Critical Discourse Analysis + Multimodality’ and ‘Critical Discourse Analysis + Systemic Functional Linguistics’. It is expected that the results of this study will provide insight into the possibility of mixing different areas of linguistic studies as a way of addressing humanity's growing and complex problems.
... Sakar et al. [49] used machine learning to diagnose Parkinson's disease using a voice dataset collected from such patients. In a tablet-based therapy for aphasia, ASR has been used to provide feedback to the patient and improve the speech of aphasia patients [50]. A system to improve the quality of speech in aphasia patients using a "processing prosthesis", software that allows users to record speech fragments and build them into larger structures by manipulating visual icons, has also been used in combination with an ASR system [51]. ...
Article
Full-text available
Aphasia is a type of speech disorder that can cause speech defects in a person. Identifying the severity level of an aphasia patient is critical for the rehabilitation process. In this research, we identify ten aphasia severity levels, motivated by specific speech therapies, based on the presence or absence of identified characteristics in aphasic speech, in order to give more specific treatment to the patient. In the aphasia severity level classification process, we experiment with different speech feature extraction techniques, input audio sample lengths, and machine learning classifiers to compare classification performance. Aphasic speech is captured by an audio sensor, recorded, divided into audio frames, and passed through an audio feature extractor before being fed into the machine learning classifier. According to the results, the mel frequency cepstral coefficient (MFCC) is the most suitable audio feature extraction method for the aphasic speech level classification process, as it outperformed the mel-spectrogram, chroma, and zero-crossing-rate features by a large margin. Furthermore, the classification performance is higher when 20 s audio samples are used compared with 10 s chunks, even though the performance gap is narrow. Finally, the deep neural network approach resulted in the best classification performance, which was slightly better than both the K-nearest neighbor (KNN) and random forest classifiers, and significantly better than decision tree algorithms. Therefore, the study shows that aphasia level classification can be completed with accuracy, precision, recall, and F1-score values of 0.99 using MFCC on 20 s audio samples with the deep neural network approach, in order to recommend the corresponding speech therapy for the identified level. A web application was developed for English-speaking aphasia patients to self-diagnose the severity level and engage in speech therapies.
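As a concrete illustration of the pipeline this abstract describes (MFCC features from fixed-length audio chunks fed to a neural classifier), here is a minimal sketch using librosa and scikit-learn; the file paths, labels, and network size are placeholder assumptions, not the authors' data or model:

```python
# Sketch of an MFCC-based severity classifier (placeholder data and model).
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

def mfcc_features(path: str, duration: float = 20.0, n_mfcc: int = 13) -> np.ndarray:
    y, sr = librosa.load(path, sr=16000, duration=duration)  # one 20 s chunk
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # summarize the frame sequence with per-coefficient mean and std
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# hypothetical dataset: (wav_path, severity_level 0..9) pairs
samples = [("patient01_chunk1.wav", 3), ("patient02_chunk1.wav", 7)]  # etc.
X = np.array([mfcc_features(path) for path, _ in samples])
y = np.array([level for _, level in samples])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500).fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```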
... This increase in low-cost, readily available technology provides speech-language pathologists (SLPs) with flexible therapy options. However, in spite of the commercial proliferation of apps, relatively few studies have examined their feasibility or quality of instruction (Ballard et al., 2019; Furlong et al., 2018; McKechnie et al., 2020), and there is a dearth of studies examining their use by SLPs in clinical settings. Scant research into the use of apps by clinicians is not limited to the field of speech-language pathology. ...
Article
Purpose: Thousands of technological applications (apps) have emerged in the past decade, yet few studies have examined how apps are used by speech-language pathologists (SLPs), their effectiveness, and SLPs' feelings regarding their use. This study explored how SLPs use apps and their feelings regarding their use in schools, as well as considerations made by SLPs prior to implementing apps in therapy sessions. Method: A survey was distributed electronically to school-based SLPs in Ohio, yielding 69 valid responses. The study probed SLP demographics, patterns of app use in schools, and feelings toward their use in a school setting. Results: Results showed 77% of SLPs reported using apps in their treatment sessions and reported generally positive feelings regarding app use. SLPs considered factors such as age, cognitive ability, and treatment targets when using apps in treatment. SLPs who reported not using apps cited personal preference and price as the most common factors influencing their decision. SLPs also noted concerns about excessive screen time. Conclusions: Results of this study carry clinical implications for future development and assessment of technology to be used for service delivery in schools. Given that the majority of school-based SLPs report using apps with their students, research on the role of apps in supporting learning for speech-language services is urgently needed.
... A high PER indicates an even higher WER. In a recent work described in [10], the authors explored the possibility of using ASR systems as a feedback tool while providing speech therapy to aphasia patients. Their results demonstrated an increase in the effectiveness of the speech therapy when coupled with ASR technology. ...
Conference Paper
In this paper, we propose a deep learning-based algorithm to improve the performance of automatic speech recognition (ASR) systems for aphasia, apraxia, and dysarthria speech by utilizing electroencephalography (EEG) features recorded synchronously with the speech. We demonstrate a significant decoding performance improvement of more than 50% at test time for an isolated speech recognition task, and we also provide preliminary results indicating performance improvement for the more challenging continuous speech recognition task when utilizing EEG features. The results presented in this paper show the first step towards demonstrating the possibility of utilizing non-invasive neural signals to design a real-time robust speech prosthetic for stroke survivors recovering from aphasia, apraxia, and dysarthria. Our aphasia, apraxia, and dysarthria speech-EEG data set will be released to the public to help further advance this interesting and crucial research.
... Such devices enable AAC users to communicate with others via prestored symbols, pictures, and texts as an alternative communication modality [4,5]. In addition, educational speech therapy apps, including game apps that contain speech sound stimuli or language-based activities, have been implemented during therapy to target specific intervention domains [2,6,7]. ...
Article
Full-text available
Abstract Background: With the plethora of mobile apps available on the Apple App Store, more speech-language pathologists (SLPs) have adopted apps for speech-language therapy services, especially for pediatric clients. App Store reviews are publicly available data sources that can not only create avenues for communication between technology developers and consumers but also enable stakeholders such as parents and clinicians to share their opinions and view opinions about the app content and quality based on user experiences. Objective: This study examines the Apple App Store reviews from multiple key stakeholders (eg, parents, educators, and SLPs) to identify and understand user needs and challenges of using speech-language therapy apps (including augmentative and alternative communication [AAC] apps) for pediatric clients who receive speech-language therapy services. Methods: We selected 16 apps from a prior interview study with SLPs that covered multiple American Speech-Language-Hearing Association Big Nine competencies, including articulation, receptive and expressive language, fluency, voice, social communication, and communication modalities. Using an automatic Python (Python Software Foundation) crawler developed by our research team and a Really Simple Syndication feed generator provided by Apple, we extracted a total of 721 app reviews from 2009 to 2020. Using qualitative coding to identify emerging themes, we conducted a content analysis of 57.9% (418/721) of the reviews and synthesized user feedback related to app features and content, usability issues, recommendations for improvement, and multiple influential factors related to app design and use. Results: Our analyses revealed that key stakeholders such as family members, educators, and individuals with communication disorders have used App Store reviews as a platform to share their experiences with AAC and speech-language apps. User reviews for AAC apps were primarily written by parents who indicated that AAC apps consistently exhibited more usability issues owing to violations of design guidelines in areas of aesthetics, user errors, controls, and customization. Reviews for speech-language apps were primarily written by SLPs and educators who requested and recommended specific app features (eg, customization of visuals, recorded feedback within the app, and culturally diverse character roles) based on their experiences working with a diverse group of pediatric clients with a variety of communication disorders. Conclusions: To our knowledge, this is the first study to compile and analyze publicly available App Store reviews to identify areas for improvement within mobile apps for pediatric speech-language therapy apps from children with communication disorders and different stakeholders (eg, clinicians, parents, and educators). The findings contribute to the understanding of apps for children with communication disorders regarding content and features, app usability and accessibility issues, and influential factors that impact both AAC apps and speech-language apps for children with communication disorders who need speech therapy.
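For readers curious how such review corpora are typically gathered, the sketch below pulls reviews through Apple's public RSS customer-reviews feed; the endpoint format and its continued availability are assumptions on our part, not the study's actual crawler:

```python
# Sketch of an App Store review fetcher via the public RSS feed
# (endpoint format is an assumption; it has changed over the years).
import requests

def fetch_reviews(app_id: str, page: int = 1, country: str = "us") -> list:
    url = (f"https://itunes.apple.com/{country}/rss/customerreviews/"
           f"page={page}/id={app_id}/sortBy=mostRecent/json")
    feed = requests.get(url, timeout=10).json().get("feed", {})
    reviews = []
    for entry in feed.get("entry", []):
        reviews.append({
            "title": entry.get("title", {}).get("label", ""),
            "rating": entry.get("im:rating", {}).get("label", ""),
            "text": entry.get("content", {}).get("label", ""),
        })
    return reviews

# reviews = fetch_reviews("123456789")  # hypothetical app id
```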
... Nonetheless, there are still challenges related to automatic speech recognition (ASR) that must be solved in order to extend these therapy applications, since they fundamentally depend on adequate engines that can properly recognize aphasic speech. ASR systems are usually trained with the voices of people without any speech pathology, and their performance degrades when they are applied to aphasic speech [23][24][25][26][27]. Furthermore, ASR systems are usually language-dependent and have to be trained with hundreds or thousands of hours of transcribed speech. ...
Article
Full-text available
Automatic speech recognition in patients with aphasia is a challenging task for which studies have been published in only a few languages. Understandably, the systems reported in the literature within this field show significantly lower performance than those focused on transcribing non-pathological clean speech. This is mainly due to the difficulty of recognizing less intelligible speech, as well as to the scarcity of annotated aphasic data. This work is mainly focused on applying novel semi-supervised learning methods to the AphasiaBank dataset in order to deal with these two major issues, reporting improvements for the English language and providing the first benchmark for the Spanish language, for which less than one hour of transcribed aphasic speech was available for training. In addition, the influence of reinforcing the training and decoding processes with out-of-domain acoustic and text data is described, using different strategies and configurations to fine-tune the hyperparameters and the final recognition systems. The promising results obtained encourage extending this technological approach to other languages and scenarios where the scarcity of annotated data for training recognition models is a challenging reality.
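The semi-supervised strategy the abstract leans on is, at its core, self-training: a seed model trained on the small transcribed set pseudo-labels the untranscribed speech, and confident labels are folded back into training. A generic, runnable illustration of that loop on toy features (not the authors' ASR pipeline) using scikit-learn:

```python
# Generic self-training illustration: unlabeled examples are marked -1 and
# receive pseudo-labels only when the base classifier is confident.
import numpy as np
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))            # stand-in acoustic features
y = (X[:, 0] > 0).astype(int)            # stand-in transcription labels
y_semi = y.copy()
y_semi[20:] = -1                         # only 20 utterances are "transcribed"

model = SelfTrainingClassifier(LogisticRegression(), threshold=0.9)
model.fit(X, y_semi)
print("accuracy on all data:", model.score(X, y))
```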
... Their in-house ASR engine, AUDIMUS [7], uses a keyword spotting technique to score spoken naming attempts as 'correct'/'incorrect' and reported an average accuracy of 82%, with a range of 69% to 93% across patients [5]. The second group [8] evaluated a digitally delivered picture naming intervention in native Australian English-speaking people with apraxia plus aphasia. They used the open-source ASR engine CMU PocketSphinx [9] to provide patients with 'correct'/'incorrect' feedback. ...
Conference Paper
Full-text available
Anomia (word finding difficulty) is the hallmark of aphasia, an acquired language disorder most commonly caused by stroke. Assessment of speech performance using picture naming tasks is therefore a key method for identification of the disorder and for monitoring patients' response to treatment interventions. Currently, this assessment is conducted manually by speech and language therapists (SLTs). Surprisingly, despite advancements in ASR and artificial intelligence with technologies like deep learning, research on developing automated systems for this task has been scarce. Here we present an utterance verification system incorporating a deep learning element that classifies 'correct'/'incorrect' naming attempts from aphasic stroke patients. When tested on 8 native British-English speaking aphasic patients, the system's accuracy ranged between 83.6% and 93.6%, with a 10-fold cross-validation mean of 89.5%. This performance was not only significantly better than that of one of the leading commercially available ASRs (Google speech-to-text service) but also comparable in some instances with two independent SLT ratings for the same dataset.
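Both systems mentioned in the citing passage reduce to the same scoring step: spot the target word in the patient's attempt and return 'correct'/'incorrect'. A minimal sketch of that step, assuming the SpeechRecognition package with a PocketSphinx backend installed (illustrative only, not either group's implementation):

```python
# Keyword-spotting-based 'correct'/'incorrect' scoring of a naming attempt.
# Requires: pip install SpeechRecognition pocketsphinx
import speech_recognition as sr

def score_naming_attempt(wav_path: str, target_word: str,
                         sensitivity: float = 0.8) -> bool:
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole attempt
    try:
        # keyword_entries restricts decoding to spotting the target word
        hypothesis = recognizer.recognize_sphinx(
            audio, keyword_entries=[(target_word.lower(), sensitivity)])
    except sr.UnknownValueError:
        return False                       # nothing spotted -> 'incorrect'
    return target_word.lower() in hypothesis.lower()

# print("correct" if score_naming_attempt("attempt.wav", "elephant") else "incorrect")
```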
... Nowadays, it is indispensable not only for private use but also for various professions [3]. In several medical specialties, ASR has been tested and implemented in speech-language pathology research, diagnostics and therapeutics, such as for apraxia of speech [4,5]. Among medical professionals, ASR is commonly used to convert speech into text for data entry and has already been sufficiently tested in a mobile environment. ...
Article
Full-text available
Introduction For decades, automatic speech recognition (ASR) has been the subject of research, and its range of applications has broadened. Presently, ASR among physicians is mainly used to convert speech into text but not to implement instructions in the operating room (OR). This study aimed to evaluate physicians of different surgical professions regarding their personal experience with and attitudes towards ASR. Methods A 16-item survey was distributed electronically to hospitals and outpatient clinics in southern Germany, addressing physicians on the potential applications of ASR in the OR. Results The survey was completed by 185 of 2693 physicians (response rate: 6.9%) with a mean age of 41.8 ± 9.8 years. ASR is considered desirable in the OR regardless of the field of speciality (93.7%). While only 2.7% have used ASR, 87.9% evaluate its future potential as high. 91.0% of those working in a university hospital would consider testing ASR, compared with 67.5% of those in non-university hospitals and practices (p = 0.001). 90.1% of respondents in strictly surgical specialities see potential in ASR, while 73.7% of those in non-surgical specialities evaluate its future potential as high (p = 0.01). 58.3% of those over the age of 60 consider the use of ASR without a headset to be imaginable, compared with 96.3% of those under the age of 60. There were no statistically significant differences regarding sex or professional position. Conclusion Foreseeably, ASR is anticipated to be integrated into ORs and is seen as having high market potential. Our study provides information about the individual preferences of physicians from various surgical disciplines regarding ASR.
Article
Purpose This project explores the perceived implications of artificial intelligence (AI) tools and generative language tools, like ChatGPT, on practice in speech-language pathology. Method A total of 107 clinician ( n = 60) and student ( n = 47) participants completed an 87-item survey that included Likert-style questions and open-ended qualitative responses. The survey explored participants' current frequency of use, experience with AI tools, ethical concerns, and concern with replacing clinicians, as well as likelihood to use in particular professional and clinical areas. Results were analyzed in the context of qualitative responses to typed-response open-ended questions. Results A series of analyses indicated participants are somewhat knowledgeable and experienced with GPT software and other AI tools. Despite a positive outlook and the belief that AI tools are helpful for practice, programs like ChatGPT and other AI tools are infrequently used by speech-language pathologists and students for clinical purposes, mostly restricted to administrative tasks. Conclusion While impressions of GPT and other AI tools cite the beneficial ways that AI tools can enhance a clinician's workloads, participants indicate a hesitancy to use AI tools and call for institutional guidelines and training for its adoption.
Article
Purpose: The purpose of this study is to examine recent research trends regarding automatic speech recognition (ASR) as used in the evaluation and treatment of speech disorders. Methods: Through a search engine, articles published in domestic journals were searched. A total of 27 papers were selected from the retrieved documents and analyzed by year, research subject, speech task, and ASR system. Results: Most of the research was published between 2019 and 2021. The subjects who most frequently underwent speech evaluation and treatment using ASR systems were those with dysarthria. The speech production tasks used with ASR were at the word and sentence level, and commercialized and non-commercialized ASR systems were used with similar frequency. Conclusion: These results may serve as foundational data for establishing the most suitable evaluation methods and intervention plans for patients with speech disorders in clinical and research settings.
Article
Full-text available
As a multi-ethnic country with a large population, China is endowed with diverse dialects, which brings considerable challenges to speech recognition work. In fact, due to geographical location, population migration, and other factors, the research progress and practical application of Chinese dialect speech recognition are currently at different stages. Therefore, exploring the significant regional heterogeneities in specific recognition approaches and effects, dialect corpora, and other resources is of vital importance for Chinese speech recognition work. Based on this, we first start with the regional classification of dialects and analyze the pivotal acoustic characteristics of dialects, including specific vowel and tone patterns. Secondly, we comprehensively summarize the existing dialect phonetic corpora in China, which is of some assistance in exploring the general construction methods of dialect phonetic corpora. Moreover, we expound on the general process of dialect recognition. Several critical dialect recognition approaches are summarized and introduced in detail, especially the hybrid method of an Artificial Neural Network (ANN) combined with a Hidden Markov Model (HMM), as well as the End-to-End (E2E) approach. Thirdly, through an in-depth comparison of their principles, merits, disadvantages, and recognition performance for different dialects, the development trends and challenges in dialect recognition in the future are pointed out. Finally, some application examples of dialect speech recognition are collected and discussed.
Article
People with Broca's aphasia (PBA) commonly show difficulties in expressing language. Communication partners often assume that PBA's utterances do not make sense. In fact, PBA do not form ideas arbitrarily. Pragmatically, their utterances in communication can still be analyzed. This study aims to explain the verbal and non-verbal language characteristics of PBA. To achieve this goal, a qualitative approach was employed using the case study method with an individual with Broca's aphasia who had a hemorrhagic stroke. The framework used to reveal communication strategies was Sperber and Wilson's relevance theory of communication and cognition. The findings of the study show that verbal communication was conducted by retrieving words that were already available in the mental lexicon and then paraphrasing them through association and collocation. Non-verbal communication was carried out through cues, especially when the individual had difficulty recalling words. The individual's failure to produce language and derive ideas from the mental lexicon is the result of disturbances in the short-term memory area.
Article
Full-text available
Phraseological or multi-word-pattern corpus-driven analysis of language in use has offered significant insights in recent years into how linguistic discourse can vary. This variation has been researched across genres, registers, disciplines, and native or non-native differences. However, very few studies have presented a gender-based analysis of academic research discourse within the EFL/ESL perspective. The current study explored the use of lexical bundles by male and female researchers working in the EFL/ESL academic context within KSA. Corpora comprising almost 300,000 words, including 68 research articles (36 by female and 32 by male researchers), were collected and run through the Lancsbox 6.0 software package. The analysis was based on the frequency and structural patterns across the selected data. For the critical analysis of structural patterns, the structural taxonomy framework offered by Gezegin-Bal (2019), adapted from Biber et al. (1999), was employed. As established by the findings of the study, prepositional and noun phrases remained overwhelmingly more frequent and common in both corpora. No significant gender-based differences in the use of lexical bundles were found, which reflects that both male and female researchers used similar expressions in their use of the English language.
Article
Purpose: Motivation is a complex phenomenon that can influence a person's ability to make progress in treatment. We sought to understand how motivation is currently measured and utilized in aphasia rehabilitation by identifying treatment studies that (1) include measurement of motivation and (2) use motivation to predict treatment response. Method: A scoping review was conducted by systematically searching PubMed, CINAHL, EBSCO, Ovid MEDLINE, and APA PsycInfo using the following search terms: (measurement OR treatment OR rehabilitation OR predict*) AND (motiv* OR engagement OR adherence OR compliance) AND (aphasia OR dysphasia). Results: Two studies met our inclusion criteria. Motivation was measured differently across studies. No studies used motivation to predict treatment outcomes. Discussion/conclusions: Despite the importance of motivation in aphasia rehabilitation success, studies that include its measurement are sparse. Additional research is needed and should include development of measurement tools and evaluation of the predictive value of motivation on treatment outcomes.
Conference Paper
This research exposes the need for more user experience and usability research on speech recognition software for users with apraxia, a speech disability. It provides feedback about common speech recognition devices from users with apraxia and speech impediments. The relatively high prevalence of apraxia and other speech disorders suggests that a large population may need technology to help improve quality of life and socialization. Speech and audio processing software might help improve both. Voice-controlled software and personal assistants can only improve this community’s lives if they provide parity of user experience. The article provides an overview of research insights and public feedback to help designers create more user-centered speech recognition software for this population. First, the article offers an integrative review of article findings from 2009 to 2020. Only 9 of 120 provided sufficient detail about the 20% of the users diagnosed with apraxia. The studies covered therapeutic rather than mundane settings. Only about a fifth of the users and participants recruited for the studies were diagnosed with apraxia of speech, a particular disorder that directly impacts speech recognition accuracy and precision. The samples were often heterogeneous in speech diagnosis, gender, and age. Others were homogeneous in terms of race and ethnicity. These factors are important because they may impact tone, texture, intonation, and other speech detection variables. Study methods were primarily orthodox user testing involving task scenarios. Second, the research gathers user feedback from users with speech impediments on Twitter. Most of the 143 tweets were negative about the performance of speech recognition technologies. There was far more negative feedback about the technologies and their inability to understand users with apraxia and speech impediments. The tweets did not reveal a wide range of activities, suggesting that the technology is only marginally useful to users with apraxia or speech impediments. Future studies should include more homogeneous samples in terms of speech conditions and more heterogeneous samples in terms of demographics. Future studies should also gather more direct feedback from users and compare technologies, which might require modifying user experience and usability research methods. Furthermore, more research studies reporting product design for this community should detail the user experience and usability testing involved. Finally, product designers should not only test products with diverse populations, including those with disabilities, but they should also test in mundane and therapeutic settings and applications and develop personae to help them keep in mind their particular needs. While recruiting and retaining these users might be difficult, any extra effort will pay dividends in product quality and marketability.
Article
Purpose This study was conducted to explore the effectiveness of speech to text as a form of biofeedback intervention for speech sound production in children with articulation disorders. Method A multiple-baseline across-participants design was used for this study. Speech-to-text biofeedback was implemented with three children aged 7–9 years who demonstrated consonantal articulation errors. Data regarding accuracy of target phoneme production were repeatedly collected across baseline, treatment, and probe phases. Results Based on the preliminary data collected and analyzed during this study, results suggest that speech to text is an effective approach for addressing speech sound production. All three participants demonstrated improvement in the production of their target phonemes. In addition, all of the participants maintained their skills posttreatment. Conclusions The results of this study provide initial support for the use of speech to text for children who demonstrate articulation disorders. Implications for future research and practice based on the results are discussed.
Article
Digital games can make speech therapy exercises more enjoyable for children and increase their motivation during therapy. However, many such games developed to date have not been designed for long-term use. To address this issue, we developed Apraxia World, a speech therapy game specifically intended to be played over extended periods. In this study, we examined pronunciation improvements, child engagement over time, and caregiver and automated pronunciation evaluation accuracy while using our game over a multi-month period. Ten children played Apraxia World at home during two counterbalanced 4-week treatment blocks separated by a 2-week break. In one treatment phase, children received pronunciation feedback from caregivers and in the other treatment phase, utterances were evaluated with an automated framework built into the game. We found that children made therapeutically significant speech improvements while using Apraxia World, and that the game successfully increased engagement during speech therapy practice. Additionally, in offline mispronunciation detection tests, our automated pronunciation evaluation framework outperformed a traditional method based on goodness of pronunciation scoring. Our results suggest that this type of speech therapy game is a valid complement to traditional home practice.
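For context on the baseline mentioned at the end of this abstract, goodness-of-pronunciation (GOP) scoring is conventionally the duration-normalised log ratio between the likelihood of the intended phone and that of the best competing phone; this is the standard textbook formulation, not necessarily the exact variant the study compared against:

```latex
% Standard GOP score for a phone p, with O^{(p)} the acoustic frames
% aligned to p, NF(p) the number of frames, and Q the phone set.
\mathrm{GOP}(p) \;=\; \frac{1}{NF(p)}
  \left|\, \log \frac{p\!\left(O^{(p)} \mid p\right)}
                     {\max_{q \in Q}\; p\!\left(O^{(p)} \mid q\right)} \right|
```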
Article
Full-text available
Objective To conduct a scoping review of mobile health (mHealth) app interventions to support the needs of adults living with the effects of stroke reported in the literature. Data Sources PubMed, CINAHL, and Scopus were systematically searched for peer-reviewed publications. Articles were published between January 2007 and September 2020 and met predefined inclusion and exclusion criteria. Study Selection Articles included were written in English, involved adults older than 18 years of age, and described a mHealth app specifically tested and/or developed as an intervention for someone with stroke to be used remotely and/or independently without constant provider supervision or assistance. Articles were excluded if they focused on acute management of stroke only, focused on primary prevention, were animal studies, were not an app for smartphone or tablet, or did not describe an empirical study. Data Extraction Two researchers independently screened titles and abstracts for inclusion. The full-text articles were then reviewed for eligibility by the research team. Data was extracted and verified by a third reviewer. Data Synthesis The search yielded 2,123 studies and 49 were included for data extraction. The findings reveal that a global surge of studies on mHealth apps for people with stroke has emerged within the past two years. Most studies were developed for persons with stroke in the United States and the primary content foci included: upper extremity function (31.5%); lower extremity function (5.3%); general exercise, physical activity, and/or functional mobility (23.7%); trunk control (5.3%); medical management and secondary prevention (26.3%); language and speech skills (20.5%); cognitive skills (7.9%); general disability and activities of daily living (ADL; 5.3%); and home safety (2.6%). Of the included studies, a majority were preliminary in nature, with 36.7% being categorized as pilot or feasibility trials and 24.4% discussing initial design, development, and/or refinement. Conclusions Results from this study reveal that the number of apps specifically developed for people with stroke and described in the scientific literature is growing exponentially. The apps have widely varied content to meet the needs of persons with stroke; however, the studies are generally preliminary in nature, focusing on development, usability, and initial pilot testing. This review highlights the need for additional research and development of mHealth apps targeted for adults with stroke. Development should consider the various and complex needs of people living with the effects of chronic stroke, while large-scale trials are needed to build upon the existing evidence.
Article
Background: Empirical study of the effects of treatment for acquired apraxia of speech (AOS) has been ongoing for more than four decades. The evidence-base supporting behavioral therapies for AOS has been systematically reviewed previously. However, a substantial body of AOS treatment research has accumulated which has not been summarized in any form. Aims: The aim of the current report is to provide an overview of the extant AOS treatment research published since 2012. This summarization is intended to highlight advances that have strengthened or extended the AOS treatment evidence base and is not intended as a systematic review. Main Contribution: This report is a synopsis of recent AOS treatment research organized according to treatment foci and/or approaches. New research is described relative to the context of existing research to facilitate understanding of the current state of the evidence. Conclusions: AOS treatment investigations have continued to be focused primarily on articulatory-kinematic approaches, but with increasing attention to specific aspects of treatment that may contribute to outcomes. Interest in musical/rhythmic approaches has seen a resurgence with additional evidence indicating positive effects on speech production abilities. Data supporting approaches that combine AOS and aphasia treatments are now available and advances with technologically enhanced treatments continue. The AOS treatment evidence base continues to benefit from expansion.
Article
Full-text available
Purpose: The purpose of this study was to review treatment studies of semantic feature analysis (SFA) for persons with aphasia. The review documents how SFA is used, appraises the quality of the included studies, and evaluates the efficacy of SFA. Method: The following electronic databases were systematically searched (last search February 2017): Academic Search Complete, CINAHL Plus, E-journals, Health Policy Reference Centre, MEDLINE, PsycARTICLES, PsycINFO, and SocINDEX. The quality of the included studies was rated. Clinical efficacy was determined by calculating effect sizes (Cohen's d) or percent of nonoverlapping data when d could not be calculated. Results: Twenty-one studies were reviewed reporting on 55 persons with aphasia. SFA was used in 6 different types of studies: confrontation naming of nouns, confrontation naming of verbs, connected speech/discourse, group, multilingual, and studies where SFA was compared with other approaches. The quality of included studies was high (Single Case Experimental Design Scale average [range] = 9.55 [8.0-11]). Naming of trained items improved for 45 participants (81.82%). Effect sizes indicated that there was a small treatment effect. Conclusions: SFA leads to positive outcomes despite the variability of treatment procedures, dosage, duration, and variations to the traditional SFA protocol. Further research is warranted to examine the efficacy of SFA and generalization effects in larger controlled studies.
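The two effect-size metrics the review alternates between are easy to state concretely. Below is a small sketch of common single-case formulations (a Busk and Serlin style d against the baseline SD, and percent of nonoverlapping data); the exact conventions the reviewers applied may differ:

```python
# Common single-case effect-size arithmetic (illustrative conventions).
import statistics

def cohens_d(baseline: list, treatment: list) -> float:
    sd = statistics.stdev(baseline)        # undefined for a flat baseline,
    return (statistics.mean(treatment)     # which is when PND is used instead
            - statistics.mean(baseline)) / sd

def pnd(baseline: list, treatment: list) -> float:
    ceiling = max(baseline)                # most extreme baseline point
    above = sum(1 for x in treatment if x > ceiling)
    return 100.0 * above / len(treatment)

print(cohens_d([2, 3, 2, 3], [6, 7, 8, 7]))  # large positive effect
print(pnd([2, 3, 2, 3], [6, 7, 8, 7]))       # 100.0
```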
Article
Full-text available
Medication adherence is crucial for success in the management of patients with chronic conditions. This study analyzes whether a mobile application on a tablet aimed at supporting drug intake and vital sign parameter documentation affects adherence in elderly patients. Patients with coronary heart disease and no prior knowledge of tablet computers were recruited. They received a personal introduction to the mobile application Medication Plan, installed on an Apple iPad. The study was conducted using a crossover design with 3 sequences: initial phase, interventional phase (28 days of using the app system), and comparative phase (28 days of using a paper diary). Users experienced the interventional and comparative phases alternately. A total of 24 patients (12 males; mean age 73.8 years) were enrolled in the study. The mean for subjectively assessed adherence (A14-scale; 5-point Likert scale, from “never” to “very often”, which results in a score from 0 to 56) before the study was 50.0 (SD = 3.44). After both interventions there was a significant increase, which was more pronounced after the interventional phase (54.0; SD = 2.01) than after the comparative phase (52.6; SD = 2.49) (for all pairs after both interventions, P < 0.001). Neither medical conditions nor the number of drug intakes per day (amount and frequency of drug taking) affected subjective adherence. Logging data showed significantly stronger adherence for the medication app than the paper system for both blood pressure recordings (P < 0.001) and medication intake (P = 0.033). The majority of participants (n = 22) stated that they would like to use the medication app in their daily lives and would not need further assistance with the app. A mobile app for medication adherence increased objectively and subjectively measured adherence in elderly users undergoing rehabilitation. The findings have promising clinical implications: digital tools can assist chronic disease patients in achieving adherence to medication and to blood pressure measurement. Although this requires initial offline training, it can reduce complications and clinical overload caused by nonadherence.
Article
Full-text available
Self-delivered speech therapy provides an opportunity for individualised dosage as a complement to the speech-therapy regime in the long-term rehabilitation pathway. Few apps for speech therapy have been subject to clinical trials, especially on a self-delivered platform. In a crossover design study, the Comprehensive Aphasia Test (CAT) and Cookie Theft Picture Description (CTPD) were used to measure untrained improvement in a group of chronic expressive aphasic patients after using a speech therapy app. A pilot study (n = 3) and crossover design (n = 7) comparing the therapy app with a non-language mind-game were conducted. Patients self-selected their training on the app, with a recommended use of 20 minutes per day. There was significant post-therapy improvement on the CAT and CTPD but no significant improvement after the mind-game intervention, suggesting there were language-specific effects following use of the therapy app. Improvements on the CTPD, a functional measurement of speech, suggest that a therapy app can produce practical, important changes in speech. The improvements post-therapy were not due to the type of language category trained or the amount of training on the app, but an inverse relationship between severity at baseline and post-therapy improvement was shown. This study suggests that self-delivered therapy via an app is beneficial for chronic expressive aphasia.
Article
Full-text available
In pronunciation learning, students are often hampered in their attempts to study or practice autonomously by their limited abilities to monitor their speech for errors. Automatic Speech Recognition (ASR) has great potential for providing feedback, allowing students to become more autonomous pronunciation learners. This study examined the effect of ASR use as part of a three-week pronunciation workshop on students' autonomous learning beliefs and behaviors. The study utilized three groups: 1) CONV: conventional face-to-face pronunciation training workshop (n = 15), 2) STRAT: mostly conventional with minimal ASR strategy training (n = 17), and 3) HYBRID: hybrid with half of workshop time using ASR (n = 16). Changes in beliefs and behaviors were tracked using pre-, post-, and delayed post-workshop surveys, along with interviews and weekly learning logs. Results showed that while CONV reported no significant change, groups introduced to ASR, STRAT and HYBRID, significantly increased their beliefs of autonomy from the pre- to post-workshop survey and pointed to the feedback from ASR as enabling them to practice autonomously. However, after the workshop ended, HYBRID reported significantly more time spent on autonomous pronunciation learning and more use of ASR than STRAT and CONV, highlighting the need for a gradualist approach to autonomy through repeated practice with ASR.
Article
Full-text available
A systematic review of published intervention studies of acquired apraxia of speech (AOS) was conducted by an appointed committee of the Academy of Neurological Communication Disorders and Sciences, updating the previous committee's review from 2006. A systematic search of 11 databases identified 215 articles, with 26 meeting the inclusion criteria of (1) stating an intention to measure effects of treatment on AOS and (2) presenting data representing treatment effects for at least one individual stated to have AOS. All studies involved within-participant experimental designs, with sample sizes of 1 to 44 (median = 1). Confidence in diagnosis was rated high to reasonable in 18/26 studies. Most studies (24/26) reported on articulatory-kinematic approaches; two applied rhythm/rate control methods. Six studies had sufficient experimental control for a Class III rating (American Academy of Neurology Clinical Practice Guidelines Process Manual, 2011), with 15 others satisfying all criteria for Class III except the use of independent or objective outcome measurement. The most important global clinical conclusion from this review is that the weight of evidence supports a strong effect for both articulatory-kinematic and rate/rhythm approaches to AOS treatment. The quantity of work, experimental rigor, and reporting of diagnostic criteria continue to improve and strengthen confidence in the corpus of research.
Conference Paper
Full-text available
Lexical stress is a key diagnostic marker of disordered speech as it strongly affects speech perception. In this paper we introduce an automated method to classify between the different lexical stress patterns in children’s speech. A deep neural network is used to classify between strong-weak (SW), weak-strong (WS) and equal-stress (SS/WW) patterns in English by measuring the articulation change between the two successive syllables. The deep neural network architecture is trained using a set of acoustic features derived from pitch, duration and intensity measurements along with the energies in different frequency bands. We compared the performance of the deep neural classifier to a traditional single hidden layer MLP. Results show that the deep neural classifier outperforms the traditional MLP. The accuracy of the deep neural system is approximately 85% when classifying between the unequal stress patterns (SW/WS) and greater than 70% when classifying both equal and unequal stress patterns.
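The "articulation change between the two successive syllables" that drives this classifier can be pictured as a set of normalized pairwise contrasts. The sketch below constructs such features; the specific measurements and normalization are illustrative assumptions, and in practice the syllable-level values would come from a forced aligner:

```python
# Pairwise syllable-contrast features for stress classification (sketch).
from dataclasses import dataclass

@dataclass
class Syllable:
    duration: float   # seconds
    pitch: float      # mean F0 in Hz
    intensity: float  # mean energy in dB

def stress_contrast_features(s1: Syllable, s2: Syllable) -> list:
    def contrast(a: float, b: float) -> float:
        return (a - b) / ((a + b) / 2)     # normalized pairwise contrast
    return [contrast(s1.duration, s2.duration),
            contrast(s1.pitch, s2.pitch),
            contrast(s1.intensity, s2.intensity)]

# a strong-weak token: first syllable longer, higher, and louder
print(stress_contrast_features(Syllable(0.30, 220, 70), Syllable(0.15, 180, 62)))
```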
Article
Full-text available
Background: Apraxia of Speech (AOS) is partly characterised by impaired production of prosody in words and sentences. Identification of dysprosody is based on perceptual judgements of clinicians, with limited literature on potential quantitative objective measures. Aims: This study investigated whether an acoustic measure quantifying degree of lexical stress contrastiveness in three-syllable words, produced in isolation and in a carrier sentence, differentiated individuals with AOS with/without aphasia (AOS), aphasia only (APH), and healthy controls (CTL). Methods & Procedures: Eight individuals with aphasia, nine with AOS plus aphasia and eight age-matched control participants named pictures of strong–weak and weak–strong polysyllabic words in isolation and in a declarative carrier sentence. Pairwise Variability Indices (PVI) were used to measure the normalised relative vowel duration and peak intensity over the first two syllables of the polysyllabic words. Outcomes & Results: Individuals with aphasia performed similarly to control participants in all conditions. AOS participants demonstrated significantly lower PVI_vowel duration values for words with weak–strong stress produced in the sentence condition only, compared to controls and individuals with aphasia. This was primarily due to disproportionately long vowels in the word-initial weak syllable for AOS participants. There was no difference among groups on PVI_intensity. Conclusions: The finding of reduced lexical stress contrastiveness for weak–strong words in sentences for individuals with mild to moderate–severe AOS is consistent with the perceptual diagnostic feature of equal stress in AOS. Findings provide support for use of the objective PVI_vowel duration measure to help differentiate individuals with AOS (with/without aphasia) from those with aphasia only. Future research is warranted to explore the utility of this acoustic measure, and others, for reliable diagnosis of AOS.
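In its common pairwise form, the variability index this study computes over the first two syllables is the difference between the two measurements normalised by their mean (the study's exact normalisation may differ slightly):

```latex
% Pairwise variability index for successive syllables with measurements
% d_1 and d_2 (vowel duration or peak intensity). Positive values indicate
% strong-weak patterns, negative values weak-strong.
\mathrm{PVI} \;=\; 100 \times \frac{d_1 - d_2}{\left(d_1 + d_2\right)/2}
```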
Article
Full-text available
The delivery of tablet-based rehabilitation for individuals with post-stroke aphasia is relatively new; therefore, this study examined the effectiveness of an iPad-based therapy to demonstrate improvement in specific therapy tasks and how the tasks affect overall language and cognitive skills. Fifty-one individuals with aphasia due to a stroke or traumatic brain injury (TBI) were recruited to use an iPad-based software platform, Constant Therapy, for a 10-week therapy program. Participants were split into an experimental (N = 42) and control (N = 9) group. Both experimental and control participants received a 1 h clinic session with a clinician once a week; the experimental participants additionally practiced the therapy at home. Participants did not differ in the duration of the therapy, and both groups of participants showed improvement over time in the tasks used for the therapy. However, experimental participants used the application more often and showed greater changes in accuracy and latency on the tasks than the control participants; experimental participants' severity level at baseline, as measured by standardized tests of language and cognitive skills, was a factor in improvement on the tasks. Subgroups of task co-improvement appear to occur between different language tasks, between different cognitive tasks, and across both domains. Finally, experimental participants showed more significant and positive changes due to therapy in their standardized tests than control participants. These results provide preliminary evidence for the usefulness of a tablet-based platform to deliver tailored language and cognitive therapy to individuals with aphasia.
Article
Full-text available
We present word frequencies based on subtitles of British television programmes. We show that the SUBTLEX-UK word frequencies explain more of the variance in the lexical decision times of the British Lexicon Project than the word frequencies based on the British National Corpus and the SUBTLEX-US frequencies. In addition to the word form frequencies, we also present measures of contextual diversity, part-of-speech-specific word frequencies, word frequencies in children's programmes, and word bigram frequencies, giving researchers of British English access to the full range of norms recently made available for other languages. Finally, we introduce a new measure of word frequency, the Zipf scale, which we hope will stop the current misunderstandings of the word frequency effect.
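As commonly stated, the Zipf scale is a logarithmic transform of frequency per million words, shifted so that values run from roughly 1 (very rare) to 7 (very frequent):

```latex
% Zipf scale as commonly stated; e.g., a word occurring once per million
% words has Zipf = 3, and one occurring 100 times per million has Zipf = 5.
\mathrm{Zipf} \;=\; \log_{10}\!\left(\text{frequency per million words}\right) + 3
```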
Article
Full-text available
Abstract Determining the optimal amount of intervention is possibly the biggest challenge facing speech-language pathologists (SLPs) today. Baker (2012) has provided an erudite and pithy summary of the relevant literature in the field of optimizing intervention outcomes, and proposed a conceptual framework to measure all the inputs and acts that may contribute to the algorithm of intervention intensity. In the following article, two issues are discussed: first, that the use of technological advances to increase intensity should focus on everyday communication outcomes and, secondly, that measuring the effects of treatments which aim to increase intensity should include the perceptions of the client. Describing the evidence-based kernels underlying treatment success is a complex endeavour, particularly when the target treatment outcome is improved conversation. A recent qualitative study is described where clients with brain injury and their families were asked about their perceptions of a communication partner training program, to help determine which part of the treatment worked and why. It is argued that such an approach may provide important information regarding the "active ingredients" of treatment success.
Article
Full-text available
A primary goal of neurorehabilitation is to guide recovery of functional skills after injury through evidence-based interventions that operate to manipulate the sensorimotor environment of the client. While choice of intervention is an important decision for clinicians, we contend it is only one part of producing optimal activity-dependent neuroplastic changes. A key variable in the rehabilitation equation is engagement. Applying principles of engagement may yield greater neuroplastic changes and functional outcomes for clients. We review the principles of neuroplasticity and engagement and their potential linkage through the concepts of attention and motivation, and through strategies such as mental practice and enriched environments. Clinical applications and challenges for enhancing engagement during rehabilitation are presented. Engagement strategies, such as building trust and rapport, motivational interviewing, enhancing the client education process, and interventions that empower clients, are reviewed. Well-controlled research is needed to test our theoretical framework and suggested outcomes. Clinicians may enhance engagement by investing time and energy in the growth and development of the therapeutic relationship with clients, as this is paramount to maintaining clients' investment in continuing therapy and may also act as a driver of neuroplastic changes.
Article
Full-text available
Consideration of client values and preferences for service delivery is integral to engaging with the evidence-based practice triangle (E(3)BP), but as yet such preferences are under-researched. This exploratory study canvassed paediatric speech-language pathology services around Australia through an online survey of parents and compared reported service delivery to preferences, satisfaction, and external research evidence on recommended service delivery. Respondents were 154 parents with 192 children, living across a range of Australian locations and socio-economic status areas. Children had a range of speech and language disorders. A quarter of children waited over 6 months to receive an initial assessment. Reported session type, frequency, and length were incongruent with both research recommendations and parents' wishes. Sixty per cent of parents were happy or very happy with their experiences, while 27% were unhappy. Qualitative responses revealed concerns such as a lack of available, frequent, or local services; long waiting times; cut-off ages for eligibility; discharge processes; and an inability to afford private services. These findings challenge the profession to actively engage with E(3)BP, including being cognisant of evidence-based service delivery literature, keeping clients informed of service delivery policies, individualizing services, and exploring alternative service delivery methods.
Article
Full-text available
Word frequency is the most important variable in research on word processing and memory. Yet the main criterion for selecting word frequency norms has been the availability of the measure, rather than its quality. As a result, much research is still based on the old Kucera and Francis frequency norms. By using the lexical decision times of recently published megastudies, we show how poor this measure is and what must be done to improve it. In particular, we investigated the size of the corpus, the language register on which the corpus is based, and the definition of the frequency measure. We observed that corpus size is of practical importance for small sizes (depending on the frequency of the word), but not for sizes above 16-30 million words. As for the language register, we found that frequencies based on television and film subtitles are better than frequencies based on written sources, certainly for the monosyllabic and bisyllabic words used in psycholinguistic research. Finally, we found that lemma frequencies are not superior to word form frequencies in English and that a measure of contextual diversity is better than a measure based on raw frequency of occurrence. Part of the superiority of the contextual diversity measure is due to words that are frequently used as names. Assembling a new frequency norm on the basis of these considerations turned out to predict word processing times much better than did the existing norms (including Kucera & Francis and Celex). The new SUBTL frequency norms from the SUBTLEX(US) corpus are freely available for research purposes from http://brm.psychonomic-journals.org/content/supplemental, as well as from the University of Ghent and Lexique Web sites.
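To make the contrast between the two measures concrete, here is a hedged sketch (our illustration, not the authors' method): raw frequency counts every occurrence of a word, whereas contextual diversity counts how many documents (e.g., film subtitle files) contain it at least once. The toy documents below are invented.

    from collections import Counter

    def frequency_and_diversity(documents):
        raw, cd = Counter(), Counter()
        for doc in documents:            # doc = list of word tokens
            raw.update(doc)              # every occurrence counts
            cd.update(set(doc))          # at most once per document
        return raw, cd

    docs = [["the", "name", "name", "name"], ["the", "cat"], ["the", "dog"]]
    raw, cd = frequency_and_diversity(docs)
    print(raw["name"], cd["name"])  # 3 1: high raw frequency but low
                                    # diversity, the pattern noted for names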
Article
Full-text available
There has been renewed interest among speech-language pathologists in understanding how the motor system learns and in determining whether principles of motor learning, derived from studies of nonspeech motor skills, apply to treatment of motor speech disorders. The purpose of this tutorial is to introduce principles that enhance motor learning for nonspeech motor skills and to examine the extent to which these principles apply in treatment of motor speech disorders. This tutorial critically reviews various principles in the context of nonspeech motor learning by reviewing selected literature from the major journals in motor learning. The potential application of these principles to speech motor learning is then discussed by reviewing relevant literature on treatment of speech disorders. Specific attention is paid to how these principles may be incorporated into treatment for motor speech disorders. Evidence from nonspeech motor learning suggests that various principles may interact with each other and differentially affect diverse aspects of movements. Whereas few studies have directly examined these principles in speech motor (re)learning, available evidence suggests that these principles hold promise for treatment of motor speech disorders. Further research is necessary to determine which principles apply to speech motor (re)learning in impaired populations.
Article
Full-text available
A comprehensive evidence-based review of stroke rehabilitation was created to provide an up-to-date review of the current evidence in stroke rehabilitation and to offer specific, evidence-based conclusions that could be used to help direct stroke care at the bedside and at home. A literature search using multiple databases was used to identify all trials from 1968 to 2001. Methodological quality of the individual randomized controlled trials was assessed using the Physiotherapy Evidence Database (PEDro) quality assessment scale. A five-stage level-of-evidence approach was used to determine best practice in stroke rehabilitation. Over 403 treatment-based articles investigating various areas of stroke rehabilitation were identified, including 272 randomized controlled trials.
Article
Purpose: To assist in remote treatment, speech-language pathologists (SLPs) rely on mobile games, which, though entertaining, lack feedback mechanisms. Games integrated with automatic speech recognition (ASR) offer a solution in which speech productions control gameplay. We therefore performed a feasibility study to assess children's and SLPs' experiences with speech-controlled games, their game feature preferences and ASR accuracy. Method: Ten children with childhood apraxia of speech (CAS), six typically developing (TD) children and seven SLPs trialled five games and answered questionnaires. Researchers also compared the ASR's results to perceptual judgment. Result: Children and SLPs found speech-controlled games interesting and fun, despite ASR-human disagreements. They preferred games with rewards, challenge and multiple difficulty levels. ASR-human agreement was higher for SLPs than children, similar between the TD and CAS groups and unaffected by CAS severity (77% TD, 75% CAS - incorrect; 51% TD, 47% CAS, 71% SLP - correct). Manually stopping the recording yielded higher agreement than automatic stopping. Word length did not influence agreement. Conclusion: Children's and SLPs' positive responses towards speech-controlled games suggest that such games can engage children in higher intensity practice. Our findings can guide future improvements to the ASR, recording methods and game features to improve the user experience and therapy adherence.
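For readers wanting the arithmetic behind figures like those above, ASR-human agreement is typically the percentage of productions on which the ASR's correct/incorrect judgment matches the human listener's. A minimal sketch, with invented judgments:

    def percent_agreement(asr, human):
        """Share of trials where ASR and human judgments match, in percent."""
        assert len(asr) == len(human)
        matches = sum(a == h for a, h in zip(asr, human))
        return 100.0 * matches / len(asr)

    asr_judgments   = [True, True, False, False, True]   # True = "correct"
    human_judgments = [True, False, False, False, True]
    print(percent_agreement(asr_judgments, human_judgments))  # 80.0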
Article
Purpose: A systematic search and review of published studies was conducted on the use of automated speech analysis (ASA) tools for analysing and modifying the speech of typically developing children learning a foreign language and children with speech sound disorders, to determine (i) the types, attributes, and purposes of ASA tools being used; (ii) their accuracy against human judgment; and (iii) their performance as therapeutic tools. Method: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were applied. Across nine databases, 32 articles published between January 2007 and December 2016 met inclusion criteria: (i) focussed on children's speech; (ii) tools used for speech analysis or modification; and (iii) reporting quantitative data on accuracy. Result: Eighteen ASA tools were identified. These met the clinical threshold of 80% agreement with human judgment when used as predictors of intelligibility, impairment severity, or error category. Tool accuracy was typically below 80% for words containing mispronunciations. ASA tools have been used effectively to improve children's foreign language pronunciation. Conclusion: ASA tools show promise for automated analysis and modification of children's speech production within assessment and therapeutic applications. Further work is needed to train automated systems with larger samples of speech to increase accuracy for assessment and therapeutic feedback.
Conference Paper
This paper presents Apraxia World, a remote therapy tool for speech sound disorders that integrates speech exercises into an engaging platformer-style game. In Apraxia World, the player controls the avatar with virtual buttons/joystick, whereas speech input is associated with assets needed to advance from one level to the next. We tested performance and child preference of two strategies for delivering speech exercises: during each level, and after it. Most children indicated that doing exercises after completing each level was less disruptive and preferable to doing exercises scattered through the level. We also found that children liked having perceived control over the game (character appearance, exercise behavior). Our results indicate that (i) a familiar style of game successfully engages children, (ii) speech exercises function well when decoupled from game control, and (iii) children are willing to complete required speech exercises while playing a game they enjoy.
Article
Spontaneous speech analysis plays an important role in the study and treatment of aphasia, but can be difficult to perform manually due to the time-consuming nature of speech transcription and coding. Techniques in automatic speech recognition and assessment can potentially alleviate this problem by allowing clinicians to quickly process large amounts of speech data. However, automatic analysis of spontaneous aphasic speech has been relatively under-explored in the engineering literature, partly due to the limited amount of available data and the difficulties associated with aphasic speech processing. In this work, we perform one of the first large-scale quantitative analyses of spontaneous aphasic speech based on automatic speech recognition (ASR) output. We describe our acoustic modeling method, which sets a new recognition benchmark on AphasiaBank, a large-scale aphasic speech corpus. We propose a set of clinically relevant quantitative measures that are shown to be highly robust to automatic transcription errors. Finally, we demonstrate that these measures can be used to accurately predict the revised Western Aphasia Battery (WAB-R) Aphasia Quotient (AQ) without the need for manual transcripts. The results and techniques presented in our work will help advance the state of the art in aphasic speech processing and make ASR-based technology for aphasia treatment more feasible in real-world clinical applications.
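The final step described above, predicting WAB-R AQ from ASR-derived measures, amounts to fitting a regression from per-speaker speech features to the clinical score. The sketch below is our illustration under that reading, not the paper's code; the three feature columns and all values are synthetic placeholders for the paper's robust measures.

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_predict

    # rows = speakers; columns = hypothetical ASR-derived measures
    # (e.g., words per minute, type-token ratio, mean utterance length)
    X = np.array([[80, 0.45, 5.1], [40, 0.30, 3.2], [120, 0.55, 7.0],
                  [60, 0.35, 4.0], [100, 0.50, 6.3]])
    aq = np.array([72.0, 45.0, 88.0, 55.0, 80.0])  # synthetic AQ scores

    # cross-validated predictions of AQ from the speech measures
    predicted = cross_val_predict(Ridge(alpha=1.0), X, aq, cv=5)
    print(np.round(predicted, 1))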
Article
Background: McNeil and colleagues argued that individuals with pure apraxia of speech (AOS) have low variability of speech error type and error location within repeated multisyllabic words, compared to individuals with conduction aphasia. While this concept has been challenged, subsequent studies have varied in the stimuli and tasks used. Aims: Our aim was to re-examine the variability of segmental errors, as well as lexical prosodic errors, using the same stimuli and tasks as McNeil and colleagues, in a sample of individuals with AOS plus aphasia or aphasia alone. This sample is considered clinically relevant given the high concomitance of these disorders. Methods & Procedures: Participants were 20 individuals with stroke-related AOS plus aphasia and 21 with aphasia alone (APH), with diagnosis based on expert judgments using published criteria. Three consecutive repetitions of 10 polysyllabic words were elicited, and variability of error type, error location, and durational stress contrast was measured. Outcome & Results: Errors were significantly more variable in type and more consistent in location within words for the AOS group than for the APH group. The AOS group showed a greater number of errors overall, were less likely to improve production over the three repetition trials, and showed no clear difference in vowel duration between the first two syllables (i.e., reduced durational stress contrast) across repetitions. The measure of durational stress contrast was a stronger predictor of AOS presence than the measures of error variability. Conclusions: The divergence of our findings from previous work likely reflects the more complex profile of the AOS plus aphasia cases in the current study. While durational stress contrast was sufficient to predict diagnostic group, error variability measures were significantly associated with AOS and can contribute to developing targeted intervention goals.
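Durational stress contrast as described above can be quantified in several ways; one common choice (our assumption, not necessarily the study's exact formula) is a normalised difference between the vowel durations of the first two syllables:

    def duration_contrast(d1_ms, d2_ms):
        """Normalised vowel-duration difference, in percent (0 = no contrast)."""
        return 100.0 * abs(d1_ms - d2_ms) / ((d1_ms + d2_ms) / 2.0)

    print(duration_contrast(180.0, 90.0))   # ~66.7: clear stress contrast
    print(duration_contrast(120.0, 115.0))  # ~4.3: contrast largely absent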
Article
Background: Technologies are becoming increasingly popular in the treatment of language disorders and offer numerous possibilities, but little is known about their effectiveness and limitations. Aim: The aim of this systematic review was to investigate the effectiveness of treatments delivered by technology in the management of post-stroke anomia. Methods: As a guideline for conducting this review, we used the PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions. We conducted a systematic search of publications in PubMed, PsycInfo and Current Contents. We also consulted Google Scholar. Without any limitations as to publication date, we selected studies designed to assess the effectiveness of an intervention delivered by a technology, namely a computer or smart tablet, to specifically improve anomia in post-stroke participants. The main outcomes studied were improvement in naming skills and generalisation to untreated items and daily communication. Results: We examined 23 studies in this review. To date, computers constitute by far the most popular technology; only a few studies explored the effectiveness of smart tablets. In some studies, technology was used as a therapy tool in a clinical setting, in the presence of the clinician, while in others, therapy with technology was self-administered at home, without the clinician. All studies confirmed the effectiveness of therapy provided by technology in improving naming of trained items. However, generalisation to untrained items is unclear, and assessment of generalisation to daily communication is rare. Discussion: The results of this systematic review confirm that technology is an effective approach in the management of post-stroke anomia. In future studies, ecological tasks aimed at evaluating a therapy's effect on word retrieval in real-life situations should be added, since the ultimate goal of treating anomia is to help people retrieve words more easily in everyday life.
Article
Aphasia is a chronic condition that usually requires long-term rehabilitation. However, even if many effective treatments can be offered to patients and families, speech therapy services for individuals with aphasia often remain limited because of logistical and financial considerations, especially more than 6 months after stroke. Therefore, the need to develop tools to maximize rehabilitation potential is unquestionable. The aim of this study was to test the efficacy of a self-administered treatment delivered with a smart tablet to improve written verb naming skills in CP, a 63-year-old woman with chronic aphasia. An ABA multiple baseline design was used to compare CP's performance in verb naming across three equivalent lists of stimuli: one trained with a hierarchy of cues, one trained with no cues, and one not trained. Results suggest that graphemic cueing therapy, done four times a week for 3 weeks, led to better written verb naming compared to baseline and to the untrained list. Moreover, generalization of the effects of treatment was observed in verb production, assessed with a noun-to-verb production task. The results of this study suggest that self-administered training with a smart tablet is effective in improving naming skills in chronic aphasia. Future studies are needed to confirm the effectiveness of new technologies in self-administered treatment of acquired language deficits.
Article
Children with developmental disabilities such as childhood apraxia of speech (CAS) require repeated intervention sessions with a speech therapist, sometimes extending over several years. Technology-based therapy tools offer the potential to reduce the demanding workload of speech therapists as well as time and cost for families. In response to this need, we have developed “Tabby Talks,” a multi-tier system for remote administration of speech therapy. This paper describes the speech processing pipeline to automatically detect common errors associated with CAS. The pipeline contains modules for voice activity detection, pronunciation verification, and lexical stress verification. The voice activity detector evaluates the intensity contour of an utterance and compares it against an adaptive threshold to detect silence segments and measure voicing delays and total production time. The pronunciation verification module uses a generic search lattice structure with multiple internal paths that covers all possible pronunciation errors (substitutions, insertions and deletions) in the child’s production. Finally, the lexical stress verification module classifies the lexical stress across consecutive syllables into strong-weak or weak-strong patterns using a combination of prosodic and spectral measures. These error measures can be provided to the therapist through a web interface, to enable them to adapt the child’s therapy program remotely. When evaluated on a dataset of typically developing and disordered speech from children ages 4–16 years, the system achieves a pronunciation verification accuracy of 88.2% at the phoneme level and 80.7% at the utterance level, and a lexical stress classification rate of 83.3%.
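Of the three modules above, the voice activity detector is the simplest to illustrate. The sketch below follows the abstract's description (frame intensities compared against an adaptive threshold), but all constants are our assumptions, not the published system's values:

    import numpy as np

    def detect_voiced_frames(samples, sr, frame_ms=25):
        frame_len = int(sr * frame_ms / 1000)
        n_frames = len(samples) // frame_len
        frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
        energy_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-10)
        # adaptive threshold: a fixed offset above the quietest frames
        threshold = np.percentile(energy_db, 10) + 12.0
        return energy_db > threshold  # True = speech, False = silence

    sr = 16000
    audio = np.concatenate([np.zeros(8000),                 # 0.5 s silence
                            0.1 * np.random.randn(16000)])  # 1 s "speech"
    voiced = detect_voiced_frames(audio, sr)
    print(np.argmax(voiced) * 25, "ms voicing delay")  # ~500 ms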
Article
In an effort to responsibly incorporate evidence based on single-case designs (SCDs) into the What Works Clearinghouse (WWC) evidence base, the WWC assembled a panel of individuals with expertise in quantitative methods and SCD methodology to draft SCD standards. In this article, the panel provides an overview of the SCD standards recommended by the panel (henceforth referred to as the Standards) and adopted in Version 1.0 of the WWC's official pilot standards. The Standards are sequentially applied to research studies that incorporate SCDs. The design standards focus on the methodological soundness of SCDs, whereby reviewers assign the categories of Meets Standards, Meets Standards With Reservations, and Does Not Meet Standards to each study. Evidence criteria focus on the credibility of the reported evidence, whereby the outcome measures that meet the design standards (with or without reservations) are examined by reviewers trained in visual analysis and categorized as demonstrating Strong Evidence, Moderate Evidence, or No Evidence. An illustration of an actual research application of the Standards is provided. Issues that the panel did not address are presented as priorities for future consideration. Implications for research and the evidence-based practice movement in psychology and education are discussed. The WWC's Version 1.0 SCD standards are currently being piloted in systematic reviews conducted by the WWC. This document reflects the initial standards recommended by the authors as well as the underlying rationale for those standards. It should be noted that the WWC may revise the Version 1.0 standards based on the results of the pilot; future versions of the WWC standards can be found at http://www.whatworks.ed.gov.
Article
Background: Studies of verb anomia therapy in poststroke aphasia are rare, even though verbs are central to speech. In the great majority of studies, a traditional face-to-face setting with phonological, semantic, and/or sensorimotor cues is used to enhance lexical access. Technology-based therapy is expanding rapidly and has the potential to help individuals with aphasia improve verb retrieval abilities. Aims: This study aimed to assess the treatment outcome of and satisfaction with a therapy for verb anomia in chronic poststroke aphasia using a tablet for self-administered treatment at home. Methods & Procedures: Single-case studies were conducted with two participants who presented with chronic poststroke verb anomia. The following four phases were completed: (1) baseline measures; (2) training on the use of the tablet; (3) self-administered therapy with the tablet; and (4) follow-up, generalisation, and rating of the therapy measures. Twenty self-administered sessions were completed. Outcomes & Results: A significant improvement in verb naming was observed for the two participants following therapy. No generalisation was found for untreated verbs or for treated verbs in a novel task. Both participants were very satisfied with the self-administered treatment using a tablet. Conclusions: The use of tablet-based therapy to improve verb naming could be an interesting way to enhance rehabilitation of acquired language deficits. New technologies are promising, and further studies are needed to demonstrate their usefulness and determine the pros and cons of using them in clinical settings.
Conference Paper
Speaker-dependent (SD) ASR systems have significantly lower word error rates (WER) than speaker-independent (SI) systems. However, SD systems require sufficient training data from the target speaker, which is impractical to collect in a short time. We present a technique for training SD models using just a few minutes of a speaker's data. We compensate for the lack of adequate speaker-specific data by selecting neighbours from a database of existing speakers who are acoustically close to the target speaker. These neighbours provide ample training data, which is used to adapt the SI model to obtain an initial SD model for the new speaker with significantly lower WER. We evaluate various neighbour selection algorithms on a large-scale medical transcription task and report significant reductions in WER using only 5 minutes of speaker-specific data. We conduct a detailed analysis of various factors, such as gender and accent, in the neighbour selection. Finally, we study neighbour selection and adaptation in the context of discriminative objective functions.
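A hedged sketch of the neighbour-selection idea follows; the paper evaluates several selection algorithms, and representing each speaker as a fixed-length acoustic embedding ranked by cosine similarity is our simplification, not the authors' method:

    import numpy as np

    def select_neighbours(target, speakers, k=3):
        """Return indices of the k speakers acoustically closest to target."""
        sims = speakers @ target / (
            np.linalg.norm(speakers, axis=1) * np.linalg.norm(target))
        return np.argsort(-sims)[:k]

    rng = np.random.default_rng(0)
    database = rng.normal(size=(100, 64))  # 100 stored speaker embeddings
    target = database[42] + 0.05 * rng.normal(size=64)  # near speaker 42
    print(select_neighbours(target, database))  # speaker 42 ranks first

The selected neighbours' recordings would then supplement the target speaker's few minutes of data when adapting the SI model.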
Article
The current study investigated the effectiveness of a home practice program based on the iPad (Apple Inc., Cupertino, CA), implemented after 2 weeks of intensive language therapy, for maintaining and augmenting treatment gains in people with chronic poststroke aphasia. Five of eight original participants completed the 6-month home practice program in which they autonomously practiced retrieving words for objects and actions. Half of these words had been trained and half were untrained during therapy. Practice included tasks such as naming to confrontation, repeating from a video model, and picture/word matching presented on an iPad. All participants maintained advances made on words trained during the intensive treatment and additionally were able to learn new words by practicing daily over a 6-month period. The iPad and other tablet devices have great potential for personalized home practice to maintain and augment traditional aphasia rehabilitation. It appears that motivation to use the technology and adequate training are more important factors than age, aphasia type or severity, or prior experience with computers.
Article
The proliferation of tablet technology and the development of apps to support aphasia rehabilitation offer increasing opportunities for speech-language pathologists in a clinical setting. This article describes the components of an Intensive Comprehensive Aphasia Program at Boston University and details how the iPad (Apple Inc., Cupertino, CA) was incorporated. We describe how the iPad was customized for use in individual, dyadic, and group treatment formats and how its use was encouraged through home practice tasks. In addition to receiving step-by-step instructions for the usage of each new app, participants had multiple opportunities for practice across various treatment formats. Examples of how the participants continued using their iPads beyond the program suggest that usage of the device has generalized into their day-to-day lives. An overall summary of performance on targeted linguistic measures, as well as an analysis of functional and quality-of-life measures, reveals statistically significant improvements pre- to posttreatment.
Article
Objectives: The aim of this study was to examine the potential cost-effectiveness of self-managed computer therapy for people with long-standing aphasia post stroke and to estimate the value of further research. Methods: The incremental cost-effectiveness ratio of computer therapy in addition to usual stimulation, compared with usual stimulation alone, was considered in people with long-standing aphasia using data from the CACTUS trial. A model-based approach was taken. Where possible, the input parameters required for the model were obtained from the CACTUS trial data, a United Kingdom-based pilot randomized controlled trial that recruited thirty-four people with aphasia and randomized them to computer treatment or usual care. Cost-effectiveness was described using an incremental cost-effectiveness ratio (ICER) together with cost-effectiveness acceptability curves. A value of information analysis was undertaken to inform future research priorities. Results: The intervention had an ICER of £3,058 compared with usual care. The likelihood of the intervention being cost-effective was 75.8 percent at a cost-effectiveness threshold of £20,000 per QALY gained. The expected value of perfect information was £37 million. Conclusions: Our results suggest that computer therapy for people with long-standing aphasia is likely to represent a cost-effective use of resources. However, our analysis is exploratory given the small size of the trial on which it is based, and our results are therefore uncertain. Further research would be of high value, particularly with respect to the quality-of-life gain achieved by people who respond well to therapy.
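The headline statistic is easy to reconstruct: the ICER is the extra cost of the intervention divided by the extra QALYs it yields relative to usual care. The decomposition below is illustrative, not the CACTUS trial's actual inputs; only the resulting £3,058 figure comes from the study.

    def icer(cost_new, cost_usual, qaly_new, qaly_usual):
        """Incremental cost-effectiveness ratio, in cost per QALY gained."""
        return (cost_new - cost_usual) / (qaly_new - qaly_usual)

    # e.g., £1,529 of extra cost for 0.5 extra QALYs reproduces the
    # reported £3,058 per QALY, well under a £20,000 threshold
    print(icer(3529.0, 2000.0, 1.5, 1.0))  # 3058.0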