Article

Sensorimotor Learning in Children and Adults: Exposure to Frequency-Altered Auditory Feedback during Speech Production


Abstract

Auditory feedback plays an important role in the acquisition of fluent speech; however, this role may change once speech is acquired and individuals no longer experience persistent developmental changes to the brain and vocal tract. For this reason, we investigated whether the role of auditory feedback in sensorimotor learning differs between child and adult speakers. Participants produced vocalizations while they heard their vocal pitch predictably or unpredictably shifted downward by one semitone. The participants' vocal pitches were measured at the beginning of each vocalization, before auditory feedback was available, to assess the extent to which the deviant auditory feedback modified subsequent speech motor commands. Sensorimotor learning was observed in both children and adults, with participants' initial vocal pitch increasing following trials where they were exposed to predictable, but not unpredictable, frequency-altered feedback. Participants' vocal pitch was also measured across each vocalization, to index the extent to which the deviant auditory feedback was used to modify ongoing vocalizations. While both children and adults were found to increase their vocal pitch following predictable and unpredictable changes to their auditory feedback, adults produced larger compensatory responses. The results of the current study demonstrate that both children and adults rapidly integrate information derived from their auditory feedback to modify subsequent speech motor commands. However, these results also demonstrate that children and adults differ in their ability to use auditory feedback to generate compensatory vocal responses during ongoing vocalization. Since vocal variability also differed between the child and adult groups, these results also suggest that compensatory vocal responses to frequency-altered feedback manipulations initiated at vocalization onset may be modulated by vocal variability.
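
The one-semitone downward shift described above corresponds to −100 cents on the standard logarithmic pitch scale (1200 cents per octave). As a minimal sketch of that arithmetic, and not the study's own processing code, the following Python converts between frequencies and cents. The function names and the 220 Hz example value are hypothetical and chosen only for illustration.

```python
import math

CENTS_PER_OCTAVE = 1200.0  # standard definition: 100 cents per semitone


def shift_by_cents(f_hz: float, cents: float) -> float:
    """Return the frequency obtained by shifting f_hz by the given cents."""
    return f_hz * 2.0 ** (cents / CENTS_PER_OCTAVE)


def cents_between(f_ref_hz: float, f_hz: float) -> float:
    """Return the interval from f_ref_hz to f_hz, expressed in cents."""
    return CENTS_PER_OCTAVE * math.log2(f_hz / f_ref_hz)


# Hypothetical example: a speaker produces 220 Hz and hears it one semitone lower.
produced_hz = 220.0
heard_hz = shift_by_cents(produced_hz, -100.0)

# A compensatory response opposes the shift, so the produced pitch would rise;
# expressing it in cents makes responses comparable across speakers whose
# habitual pitch differs (for example, children and adults).
print(round(heard_hz, 2), round(cents_between(produced_hz, heard_hz), 1))
```

Reporting responses in cents rather than hertz is what allows a single shift size to be applied to speakers with very different baseline pitch.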


... All ten of the studies exploring responses to f0 altered auditory feedback involved unexpected (within trial) perturbations. Two studies also contrasted responses to unexpected perturbations with sustained (predictable) manipulations (Scheerer et al., 2016; Heller Murray and Stepp, 2020). In terms of manipulations applied, all ten of the studies included a negative manipulation of one semitone (−100 cents). ...
... All f0 manipulation studies explored compensatory responses to pitch perturbations, and generally found children compensated in the opposite direction of the shift. Following responses were examined in three studies (Russo et al., 2008; Liu P. et al., 2010; Scheerer et al., 2020b), with two studies excluding participants who followed the perturbation (Scheerer et al., 2016; Demopoulos et al., 2018). Results exploring the magnitude of compensation responses to unexpected perturbations were mixed. ...
... Results exploring the magnitude of compensation responses to unexpected perturbations were mixed. When contrasting across age groups, children were found to have: (a) reduced magnitude responses compared to adults (Scheerer et al., 2016), (b) increased magnitude responses compared to adults (Liu P. et al., 2010; Heller Murray and Stepp, 2020), (c) increased responses that followed the manipulation compared to adults (i.e., following responses; Liu P. et al., 2010), and (d) no effect of age across childhood (Liu et al., 2013; Scheerer et al., 2013b, 2020a) or compared to adults (Scheerer et al., 2013b; Heller Murray and Stepp, 2020). Heller Murray and Stepp (2020) found when analyzing opposing responses that only children with less sensitive pitch discrimination showed significantly larger responses, compared to adults and children with adult-like pitch discrimination. ...
Article
Purpose The ability to hear ourselves speak has been shown to play an important role in the development and maintenance of fluent and coherent speech. Despite this, little is known about the developing speech motor control system throughout childhood, in particular if and how vocal and articulatory control may differ throughout development. A scoping review was undertaken to identify and describe the full range of studies investigating responses to frequency altered auditory feedback in pediatric populations and their contributions to our understanding of the development of auditory feedback control and sensorimotor learning in childhood and adolescence. Method Relevant studies were identified through a comprehensive search strategy of six academic databases for studies that included (a) real-time perturbation of frequency in auditory input, (b) an analysis of immediate effects on speech, and (c) participants aged 18 years or younger. Results Twenty-three articles met inclusion criteria. Across studies, there was a wide variety of designs, outcomes and measures used. Manipulations included fundamental frequency (9 studies), formant frequency (12), frequency centroid of fricatives (1), and both fundamental and formant frequencies (1). Study designs included contrasts across childhood, between children and adults, and between typical, pediatric clinical and adult populations. Measures primarily explored acoustic properties of speech responses (latency, magnitude, and variability). Some studies additionally examined the association of these acoustic responses with clinical measures (e.g., stuttering severity and reading ability), and neural measures using electrophysiology and magnetic resonance imaging. Conclusion Findings indicated that children above 4 years generally compensated in the opposite direction of the manipulation; however, in several cases they did not do so as effectively as adults. Overall, results varied greatly due to the broad range of manipulations and designs used, making generalization challenging. Differences found between age groups in the features of the compensatory vocal responses, latency of responses, vocal variability and perceptual abilities suggest that maturational changes may be occurring in the speech motor control system, affecting the extent to which auditory feedback is used to modify internal sensorimotor representations. Varied findings suggest vocal control develops prior to articulatory control. Future studies with multiple outcome measures, manipulations, and more expansive age ranges are needed to elucidate findings.
... For these reasons, researchers have been exploring more ecologically valid methods to study auditory feedback control in speech production. In the laryngeal subsystem, researchers have proposed within-trial variability of voice fundamental frequency (fo) during speech tasks as a potential window into auditory feedback control (Scheerer & Jones, 2012; Scheerer et al., 2013, 2016). In the articulatory subsystem, a new method of "vowel centering," which measures within-utterance change in formant frequencies, has been proposed as a method to assess auditory feedback control (Niziolek & Guenther, 2013; Niziolek & Kiran, 2018; Niziolek et al., 2015). ...
... Other studies have found similar results but only in children aged 6-11 years and not in adults aged 18-28 years (Heller Murray & Stepp, 2020;Rathna Kumar et al., 2013). However, Scheerer et al. (2016) did not find this relationship in either adults or children. The relationship between vocal response magnitudes to unexpected pitch shifts and vocal baseline variability has also been explored in persons with Parkinson's disease. ...
... Our results do not support a linear relationship between baseline variability and reflexive responses in the laryngeal or the articulatory subsystem, although previous research reported positive correlations between fo reflexive responses and fo baseline variability (Heller Murray & Stepp, 2020; Scheerer & Jones, 2012; Scheerer et al., 2013, 2016). The correlation between fo reflexive responses and fo baseline variability found in Heller Murray and Stepp (2020) was only observed in children, which could be, in part, due to the incomplete development of the speech motor control subsystems in this age range (6-11 years), causing more individual variability in which some relied more heavily on feedback. ...
Article
Purpose Auditory feedback is thought to contribute to the online control of speech production. Yet, the standard method of estimating auditory feedback control (i.e., reflexive responses to auditory–motor perturbations), although sound, requires specialized instrumentation, meticulous calibration, unnatural tasks, and specific acoustic environments. The purpose of this study was to explore more ecologically valid features of speech production to determine their relationships with auditory feedback mechanisms. Method Two previously proposed measures of within-utterance variability (centering and baseline variability) were compared with reflexive response magnitudes in 30 adults with typical speech. These three measures were estimated for both the laryngeal and articulatory subsystems of speech. Results Regardless of the speech subsystem, neither centering nor baseline variability was shown to be related to reflexive response magnitudes. Likewise, no relationships were found between centering and baseline variability. Conclusions Despite previous suggestions that centering and baseline variability may be related to auditory feedback mechanisms, this study did not support these assertions. However, the detection of such relationships may have required a larger degree of variability in responses, relative to that found in those with typical speech. Future research on these relationships is warranted in populations with more heterogeneous responses, such as children or clinical populations. Supplemental Material https://doi.org/10.23641/asha.17330546
... In addition to experiments employing spectral perturbation, a number of studies have investigated the effects of pitch perturbations on sensorimotor learning in children. Scheerer et al. (2016) examined the effects of downward pitch perturbations in children aged 3-8 years old and young adults and reported further evidence of compensatory responses to auditory perturbation by children. Their study found that both groups demonstrated sensorimotor corrections, but compensatory responses to pitch shifts were higher in adults compared to children. ...
... In summary, reliable adaptive responses to perturbed auditory feedback have been found in adults (e.g., Cai et al., 2010;Houde and Jordan, 2002;Villacorta et al., 2007) but have not been systematically investigated in children compared to adults (MacDonald et al., 2012;Shiller et al., 2010;Daliri et al., 2018;Scheerer et al., 2016). In a previous study comparing typically developing children with children with speech sound disorder, we have found indications of motor learning for both groups. ...
... The results imply that outcome measures of token-to-token variability do not reflect the demands of auditory-motor integration in typically developing children but rather express a general immaturity in speech motor execution leading to the presence of background noise across all experimental phases. In other words, the higher variability is not borne out of task difficulties, perceptual limitations, or underdeveloped sensorimotor integration (although some of these issues might still exist), but the higher variability exists because of variability in production (MacDonald et al., 2012;Scheerer et al., 2016). In addition, the higher production variability in children might potentially affect their perceived reliability of the sensory input and impact sensory learning, which in turn may lead to reduced perceptual compensation effects (Shiller et al., 2010). ...
Article
Auditory feedback plays an important role in speech motor learning, yet little is known about the strength of motor learning and feedback control in speech development. This study investigated compensatory and adaptive responses to auditory feedback perturbation in children (aged 4-9 years old) and young adults (aged 18-29 years old). Auditory feedback was perturbed by near-real-time shifting of F1 and F2 of the vowel /I:/ during the production of consonant-vowel-consonant words. Children were able to compensate and adapt to a similar or larger degree compared to young adults. Higher token-to-token variability was found in children compared to adults, but it was not disproportionately higher during the perturbation phases compared to the unperturbed baseline. The added challenge to auditory-motor integration did not influence production variability in children, and compensation and adaptation effects were found to be strong and sustainable. Significant group differences were absent in the proportions of speakers displaying a compensatory or adaptive response, an amplifying response, or no consistent response. Within these categories, children produced significantly stronger compensatory, adaptive, or amplifying responses, which could be explained by less-ingrained existing representations. The results are interpreted as indicating that both auditory-motor integration and learning capacities are stronger in young children compared to adults.
... Of note, the DIVA model is primarily designed for speech motor control, yet ample behavioral work suggests that similar control systems are involved in vocal motor control [e.g., 6–8, 15–29]. According to DIVA, early vocalizations allow the auditory and somatosensory sensory feedback systems to learn the relationships between a given motor command and the sensory feedback stemming from the resultant vocalization. ...
... An alternative for why some individuals may have smaller response magnitudes to unexpected pitch-shifts is that they may decrease weighting on any sensory feedback system and become more reliant on a third control system, forward control [9, 10, 13, 14]. Behaviorally, the process of updating the forward system through sensorimotor adaptation is examined via evaluation of vocal response magnitudes to predictable, sustained auditory pitch-shifts [21–23, 28, 29, 31, 32]. Larger vocal response magnitudes in this type of experimental paradigm are suggestive of a system that can effectively incorporate error corrections from the auditory feedback system and use this information to update the stored motor plan. ...
... Larger vocal response magnitudes in this type of experimental paradigm are suggestive of a system that can effectively incorporate error corrections from the auditory feedback system and use this information to update the stored motor plan. Conversely, small or absent vocal responses suggest either a low weighting of forward control or decreased ability to execute sensorimotor adaptation [21–23, 28, 29, 31, 32]. Examination of sensorimotor adaptation in adults shows variable magnitudes of vocal responses to sustained pitch-shifts [e.g. ...
Article
The purpose of this study was to examine the relationships between vocal pitch discrimination abilities and vocal responses to auditory pitch-shifts. Twenty children (6.6–11.7 years) and twenty adults (18–28 years) completed a listening task to determine auditory discrimination abilities to vocal fundamental frequency (fo) as well as two vocalization tasks in which their perceived fo was modulated in real-time. These pitch-shifts were either unexpected, providing information on auditory feedback control, or sustained, providing information on sensorimotor adaptation. Children were subdivided into two groups based on their auditory pitch discrimination abilities; children within two standard deviations of the adult group were classified as having adult-like discrimination abilities (N = 11), whereas children outside of this range were classified as having less sensitive discrimination abilities than adults (N = 9). Children with less sensitive auditory pitch discrimination abilities had significantly larger vocal response magnitudes to unexpected pitch-shifts and significantly smaller vocal response magnitudes to sustained pitch-shifts. Children with less sensitive auditory pitch discrimination abilities may rely more on auditory feedback and thus may be less adept at updating their stored motor programs.
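
The grouping rule described above (children within two standard deviations of the adult group versus children outside that range) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the threshold values are invented, the sample standard deviation is assumed, and the function name and labels are hypothetical rather than taken from the study.

```python
import statistics


def classify_child(child_threshold, adult_thresholds, n_sd=2.0):
    """Label a child's discrimination threshold relative to the adult group.

    Assumption: the adult range is the adult mean plus or minus n_sd sample
    standard deviations; the study's exact computation may differ.
    """
    adult_mean = statistics.mean(adult_thresholds)
    adult_sd = statistics.stdev(adult_thresholds)  # sample SD (assumed)
    if abs(child_threshold - adult_mean) <= n_sd * adult_sd:
        return "adult-like"
    return "outside adult range"


# Invented pitch discrimination thresholds in cents, for illustration only.
adult_thresholds = [20.0, 25.0, 30.0, 35.0, 40.0]
child_thresholds = [32.0, 60.0]

labels = [classify_child(t, adult_thresholds) for t in child_thresholds]
print(labels)
```

In the study summarized above, the children falling outside the adult range were the ones described as having less sensitive discrimination abilities.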
... To maintain accurate speech articulation, the brain relies on auditory feedback, or the sound of one's voice, to monitor and regulate the activity of these muscles [Callan et al., 2000; Guenther, 1994, 2006; Max, Guenther, Gracco, Ghosh, & Wallace, 2004]. One of the most effective methods for assessing the importance of auditory feedback for regulating speech is utilizing frequency altered feedback (FAF) [Burnett, Freedland, Larson, & Hain, 1998; Burnett, Senner, & Larson, 1997; Elman, 1981; Scheerer, Jacobson, & Jones, 2016; Scheerer & Jones, 2012, 2014]. Previous research has demonstrated that when the fundamental frequency (F0) of a speaker's auditory feedback is briefly altered, a manipulation that is perceived as a change in vocal pitch, the speaker produces a compensatory response in the direction opposite to the manipulation [e.g. ...
... The size, timing, and variability of these responses to FAF have allowed researchers to gain valuable insight into how individuals use auditory feedback for speech motor control. For example, using this FAF paradigm, Scheerer et al. [2016] and Scheerer, Jacobson, and Jones [2019] have demonstrated that children as young as 2 years old produce compensatory responses to changes in their auditory feedback. Larger compensatory responses have also been found in individuals who produce more variable vocal productions [Scheerer & Jones, 2012]. ...
Article
Autism spectrum disorder (ASD) is a developmental disorder characterized by persistent deficits in social communication and interaction. Speech is an important form of social communication. Prosody (e.g. vocal pitch, rhythm, etc.), one aspect of the speech signal, is crucial for ensuring information about the emotionality, excitability, and intent of the speaker, is accurately expressed. The objective of this study was to gain a better understanding of how auditory information is used to regulate speech prosody in autistic and non‐autistic children, while exploring the relationship between the prosodic control of speech and social competence. Eighty autistic (M = 8.48 years, SD = 2.55) and non‐autistic (M = 7.36 years, SD = 2.51) participants produced vocalizations while exposed to unaltered and frequency altered auditory feedback. The parent‐report Multidimensional Social Competence Scale was used to assess social competence, while the Autism‐Spectrum Quotient and the Autism Spectrum Rating Scales were used to assess autism characteristics. Results indicate that vocal response magnitudes and vocal variability were similar across autistic and non‐autistic children. However, autistic children produced significantly faster responses to the auditory feedback manipulation. Hierarchical multiple regressions indicated that these faster responses were significantly associated with poorer parent‐rated social competence and higher autism characteristics. These findings suggest that prosodic speech production differences are present in at least a subgroup of autistic children. These results represent a key step in understanding how atypicalities in the mechanisms supporting speech production may manifest in social‐communication deficits, as well as broader social competence, and vice versa. Autism Res 2020, 13: 1880‐1892. 
© 2020 International Society for Autism Research and Wiley Periodicals LLC Lay Summary In this study, autistic and non‐autistic children produced vowel sounds while listening to themselves through headphones. When the children heard their vocal pitch shifted upward or downward, they compensated by shifting their vocal pitch in the opposite direction. Interestingly, autistic children were faster to correct for the perceived vowel sound changes than their typically developing peers. Faster responses in the children with ASD were linked to poorer ratings of their social abilities by their parent. These results suggest that autistic and non‐autistic children show differences in how quickly they control their speech, and these differences may be related to the social challenges experienced by autistic children.
... The importance of auditory feedback for monitoring and correcting ongoing vocalizations has been demonstrated by utilizing the frequency altered feedback (FAF) paradigm to synthetically alter speakers' auditory feedback (Burnett, Freedland, Larson, & Hain, 1998; Civier et al., 2010; Scheerer, Jacobson, & Jones, 2016; Scheerer & Jones, 2012, 2018a, 2018b; Scheerer, Liu, & Jones, 2013). When a speaker's auditory feedback is manipulated by changing properties such as the fundamental frequency (F0; Burnett et al., 1998; Civier et al., 2010; Scheerer et al., 2013, 2016; Scheerer & Jones, 2012, 2018a, 2018b), or the formant frequencies (Cai et al., 2012; Houde & Jordan, 1998; Purcell & Munhall, 2006; Villacorta, Perkell, & Guenther, 2007), the speaker reflexively responds by opposing the manipulation. These compensatory responses demonstrate that when a speaker detects changes in their auditory feedback, they use information from the deviant auditory feedback to modify their ongoing vocalization. ...
Article
Children maintain fluent speech despite dramatic changes to their articulators during development. Auditory feedback aids in the acquisition and maintenance of the sensorimotor mechanisms that underlie vocal motor control. MacDonald, Johnson, Forsythe, Plante, and Munhall (2012) reported that toddlers' speech motor control systems may "suppress" the influence of auditory feedback, since exposure to altered auditory feedback regarding their formant frequencies did not lead to modifications of their speech. This finding is not parsimonious with most theories of motor control. Here, we exposed toddlers to perturbations to the pitch of their auditory feedback as they vocalized. Toddlers compensated for the manipulations, producing significantly different responses to upward and downward perturbations. These data represent the first empirical demonstration that toddlers use auditory feedback for vocal motor control. Furthermore, our findings suggest toddlers are more sensitive to changes to the postural properties of their auditory feedback, such as fundamental frequency, relative to the phonemic properties, such as formant frequencies. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
... The interaction between speech feed-forward and feedback mechanisms is often measured by an online modification of the auditory feedback someone receives while speaking (Scheerer, Jacobson, & Jones, 2016). In these experiments, participants are usually asked to repeatedly produce a syllable, while being recorded. ...
... Factors thought to influence these individual differences in the response to altered auditory feedback include: the strength of the manipulation (Niziolek & Guenther, 2013), the developmental phase of the participants (e.g. very young children do not adapt as strongly as adults; MacDonald, Johnson, Forsythe, Plante, & Munhall, 2012;Scheerer et al., 2016), and the shape of the participants' vowel space (Niziolek, Nagarajan, & Houde, 2013). ...
Article
Although dyslexia is characterized by a deficit in phonological representations, the nature of this deficit is debated. Previously, it was shown that adults with dyslexia respond differently to online manipulations of auditory feedback. In the present study, we found that individual differences in reading and reading-related skills within a group of 30 children (10–13 years old) with dyslexia were associated with the response to altered feedback. The fractional anisotropy of the arcuate fasciculus/superior longitudinal fasciculus was not directly related to the response to altered feedback. This study corroborates that speech perception-production communication is important for phonological representations and reading.
... While the majority of pitch-perturbation studies to date have focused on neurotypical adult speakers, a growing number of studies have examined responses in children and individuals with communication disorders. Reflexive perturbation responses in children are evident as young as age 3 years (Russo et al., 2008; Scheerer et al., 2013, 2016; Heller Murray and Stepp, 2020) but are associated with longer response latencies and greater variability compared to adult responses. Studies have also investigated responses in individuals with Parkinson's disease (Kiran and Larson, 2001; Liu et al., 2012; Abur et al., 2021a), Alzheimer's disease (Ranasinghe et al., 2017), cerebellar degeneration (Houde et al., 2019; Li et al., 2019), apraxia of speech (Ballard et al., 2018), aphasia (Behroozmand et al., 2018, 2022), hyperfunctional voice disorders (Abur et al., 2021b), 16p11.2 ...
Article
Background Reflexive pitch perturbation experiments are commonly used to investigate the neural mechanisms underlying vocal motor control. In these experiments, the fundamental frequency–the acoustic correlate of pitch–of a speech signal is shifted unexpectedly and played back to the speaker via headphones in near real-time. In response to the shift, speakers increase or decrease their fundamental frequency in the direction opposing the shift so that their perceived pitch is closer to what they intended. The goal of the current work is to develop a quantitative model of responses to reflexive perturbations that can be interpreted in terms of the physiological mechanisms underlying the response and that captures both group-mean data and individual subject responses. Methods A model framework was established that allowed the specification of several models based on Proportional-Integral-Derivative and State-Space/Directions Into Velocities of Articulators (DIVA) model classes. The performance of 19 models was compared in fitting experimental data from two published studies. The models were evaluated in terms of their ability to capture both population-level responses and individual differences in sensorimotor control processes. Results A three-parameter DIVA model performed best when fitting group-mean data from both studies; this model is equivalent to a single-rate state-space model and a first-order low pass filter model. The same model also provided stable estimates of parameters across samples from individual subject data and performed among the best models to differentiate between subjects. The three parameters correspond to gains in the auditory feedback controller’s response to a perceived error, the delay of this response, and the gain of the somatosensory feedback controller’s “resistance” to this correction. 
Excellent fits were also obtained from a four-parameter model with an additional auditory velocity error term; this model was better able to capture multi-component reflexive responses seen in some individual subjects. Conclusion Our results demonstrate the stereotyped nature of an individual’s responses to pitch perturbations. Further, we identified a model that captures population responses to pitch perturbations and characterizes individual differences in a stable manner with parameters that relate to underlying motor control capabilities. Future work will evaluate the model in characterizing responses from individuals with communication disorders.
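
To make the roles of the three parameters named above more concrete (an auditory feedback gain, a response delay, and a somatosensory "resistance" gain), here is a minimal discrete-time Python sketch of a delayed feedback controller. It is not the published model's equations: the update rule, the parameter values, and all names are assumptions chosen only to show how such parameters could shape a response to a −100 cent shift.

```python
def simulate_response(shift_cents, gain_aud, gain_som, delay_s,
                      onset_s=0.5, duration_s=2.0, dt=0.001):
    """Simulate produced pitch deviation (cents from target) over time.

    Hypothetical first-order rule, integrated with forward Euler:
      d(produced)/dt = -gain_aud * heard_error(t - delay) - gain_som * produced(t)
    where heard_error is the produced deviation plus the applied shift.
    """
    n_steps = int(round(duration_s / dt))
    delay_steps = int(round(delay_s / dt))
    onset_step = int(round(onset_s / dt))
    produced = [0.0] * (n_steps + 1)
    for t in range(n_steps):
        past = t - delay_steps
        if past >= 0:
            shift = shift_cents if past >= onset_step else 0.0
            heard_error = produced[past] + shift  # what was heard, delayed
        else:
            heard_error = 0.0  # no feedback has arrived yet
        # Auditory term corrects the heard error; somatosensory term pulls
        # production back toward the unperturbed target.
        correction = -gain_aud * heard_error - gain_som * produced[t]
        produced[t + 1] = produced[t] + dt * correction
    return produced


# Assumed values for illustration: gains in 1/s, delay in seconds.
trace = simulate_response(shift_cents=-100.0, gain_aud=2.0,
                          gain_som=6.0, delay_s=0.1)
print(round(trace[-1], 1))
```

Under this assumed rule the steady-state response is `-shift * gain_aud / (gain_aud + gain_som)`, so the response opposes the shift but only partially, with the somatosensory gain limiting how far production moves.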
... In 1981, Elman introduced a new technique to study the influence of auditory feedback on vocal fo control: Vocal responses to sudden, unpredictable auditory feedback perturbations were examined (hereafter referred to as reflexive paradigms), providing information about feedback error correction (Bauer & Larson, 2003; Bauer et al., 2006; Burnett et al., 1997, 1998; Elman, 1981). Later on, sustained and predictable perturbations to auditory feedback (hereafter referred to as adaptive paradigms) were used to examine how auditory feedback corrections are incorporated into feedforward commands when errors persist over multiple vocal productions (Behroozmand & Sangtian, 2018; Hawco & Jones, 2010; Keough & Jones, 2009; Scheerer et al., 2016). In response to both auditory reflexive and adaptive paradigms, adults with typical voices tend to produce compensatory vocal responses (i.e., responses that are in opposition to the perturbation; e.g., vocal fo reflexive responses: Burnett et al., 1997; vocal SPL reflexive responses: Bauer et al., 2006, and Larson et al., 2007; vocal fo adaptive responses: Hawco & Jones, 2010, and Keough & Jones, 2009). ...
Article
Purpose The goal of this review article is to provide a summary of the progression of altered auditory feedback (AAF) as a method to understand the pathophysiology of voice disorders. This review article focuses on populations with voice disorders that have thus far been studied using AAF, including individuals with Parkinson's disease, cerebellar degeneration, hyperfunctional voice disorders, vocal fold paralysis, and laryngeal dystonia. Studies using AAF have found that individuals with Parkinson's disease, cerebellar degeneration, and laryngeal dystonia have hyperactive auditory feedback responses due to differing underlying causes. In persons with PD, the hyperactivity may be a compensatory mechanism for atypically weak feedforward motor control. In individuals with cerebellar degeneration and laryngeal dystonia, the reasons for hyperactivity remain unknown. Individuals with hyperfunctional voice disorders may have auditory–motor integration deficits, suggesting atypical updating of feedforward motor control. Conclusions These findings have the potential to provide critical insights to clinicians in selecting the most effective therapy techniques for individuals with voice disorders. Future collaboration between clinicians and researchers with the shared objective of improving AAF as an ecologically feasible and valid tool for clinical assessment may provide more personalized therapy targets for individuals with voice disorders.
...
Study | Perturbation and study paradigm | Delay (ms)
Jones & Munhall (2000) | fo perturbations, adaptive paradigm | 3-4
Keough et al. (2013) | fo perturbations, adaptive paradigm | 10
Daliri et al. (2018), Mollaei et al. (2016, 2013) | fo perturbations, adaptive paradigm | 14
Abur et al. (2018), Stepp et al. (2017) | fo perturbations, adaptive paradigm | 45
Burnett et al. (1998) | fo and vocal intensity perturbations, reflexive paradigm | 8-20
Burnett & Larson (2002), Burnett et al. (2008) | fo perturbations, reflexive paradigm | 8-20
Hain et al. (2000), Scheerer et al. (2013, 2016) | fo perturbations, reflexive paradigm | 10
Demopoulos et al. (2018), Houde et al. (2019) | fo perturbations, reflexive paradigm | 12
Larson et al. (2007) | fo perturbations, reflexive paradigm | 14
Max et al. (2003) | fo perturbations, reflexive paradigm | 20
Sares et al. (2018) | fo perturbations, reflexive paradigm | 25
Bauer et al. (2006), Bauer & Larson (2003), Behroozmand et al. (2012, 2018), Behroozmand & Larson (2011), Burnett et al. (1998, 1997) ...
Article
Purpose Gradual and sudden perturbations of vocal fundamental frequency (fo), also known as adaptive and reflexive fo perturbations, are techniques to study the influence of auditory feedback on voice fo control mechanisms. Previous vocal fo perturbations have incorporated varied setup-specific feedback delays and amplifications. Here, we investigated the effects of feedback delays (10-100 ms) and amplifications on both adaptive and reflexive fo perturbation paradigms, encapsulating the variability in equipment-specific delays (3-45 ms) and amplifications utilized in previous experiments. Method Responses to adaptive and reflexive fo perturbations were recorded in 24 typical speakers for four delay conditions (10, 40, 70, and 100 ms) or three amplification conditions (-10, +5, and +10 dB relative to microphone) in a counterbalanced order. Repeated-measures analyses of variance were carried out on the magnitude of fo responses to determine the effect of feedback condition. Results There was a statistically significant effect of the level of auditory feedback amplification on the response magnitude during adaptive fo perturbations, driven by the difference between +10- and -10-dB amplification conditions (hold phase difference: M = 38.3 cents, SD = 51.2 cents; after-effect phase: M = 66.1 cents, SD = 84.6 cents). No other statistically significant effects of condition were found for either paradigm. Conclusions Experimental equipment delays below 100 ms in behavioral paradigms do not affect the results of fo perturbation paradigms. As there is no statistically significant difference between the response magnitudes elicited by +5- and +10-dB auditory amplification conditions, this study is a confirmation that an auditory feedback amplification of +5 dB relative to microphone is sufficient to elicit robust compensatory responses for fo perturbation paradigms.
... For example, researchers can examine immediate reflexive responses to adjusted auditory feedback (e.g., pitch and loudness stimulus shifts), providing information about immediate feedback error correction (Bauer & Larson, 2003;Bauer et al., 2006;Burnett et al., 1998). Furthermore, there are paradigms that examine sensorimotor adaptation or how small shifts to auditory feedback over time result in a learned adjustment that persists for a period of time even when the shifted stimulus is removed (Behroozmand & Sangtian, 2018;Hawco & Jones, 2010;Keough & Jones, 2009;Scheerer et al., 2016). In sensorimotor adaptation paradigms, adults with typical voices tend to produce a compensatory vocal response in opposition to the perturbation. ...
Article
Purpose This study examined vocal hyperfunction (VH) using voice onset time (VOT). We hypothesized that speakers with VH would produce shorter VOTs, indicating increased laryngeal tension, and more variable VOTs, indicating disordered vocal motor control. Method We enrolled 32 adult women with VH (aged 20–74 years) and 32 age- and sex-matched controls. All were speakers of American English. Participants produced vowel–consonant–vowel combinations that varied by vowel (ɑ/u) and plosive (p/b, t/d, k/g). VOT—measured at the release of the plosive to the initiation of voicing—was averaged over three repetitions of each vowel–consonant–vowel combination. The coefficient of variation (CoV), a measure of VOT variability, was also computed for each combination. Results The mean VOTs were not significantly different between the two groups; however, the CoVs were significantly greater in speakers with VH compared to controls. Voiceless CoV values were moderately correlated with clinical ratings of dysphonia ( r = .58) in speakers with VH. Conclusion Speakers with VH exhibited greater variability in phonemic voicing targets compared to vocally healthy speakers, supporting the hypothesis for disordered vocal motor control in VH. We suggest future work incorporate VOT measures when assessing auditory discrimination and auditory–motor integration deficits in VH.
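The coefficient of variation mentioned in this abstract is the standard deviation divided by the mean, which makes variability comparable across conditions with different average VOTs. The following is a minimal sketch, assuming the sample (n-1) standard deviation; the abstract does not say which estimator was used, and the VOT values and the function name `coefficient_of_variation` are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean (unitless).

    Assumption: the sample (n-1) estimator is used; the source abstract
    does not specify this.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(values) / m


# Hypothetical VOT measurements (ms) for three repetitions of one
# vowel-consonant-vowel combination; not data from the study.
vots_ms = [60.0, 70.0, 80.0]
cov = coefficient_of_variation(vots_ms)
print(f"CoV = {cov:.3f}")
```

Because the measure is a ratio, it stays the same if all VOTs are rescaled (for example, converted from milliseconds to seconds).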
... More recently, the sensorimotor adaptation paradigm has been used to investigate sensorimotor adaptation in children and individuals diagnosed with communication disorders. Evidence of adaptation has been shown in children as young as three; however, the magnitude of the adaptive response is not as great as in adults (Scheerer et al., 2016) and adaptation does not appear to have a reliable effect on their perceptual representations (Shiller et al., 2010). In the realm of communication disorders, the paradigm has ... [Figure 1: A schematic of a typical sensorimotor adaptation paradigm with four phases.] ...
Article
Sensorimotor adaptation experiments are commonly used to examine motor learning behavior and to uncover information about the underlying control mechanisms of many motor behaviors, including speech production. In the speech and voice domains, aspects of the acoustic signal are shifted/perturbed over time via auditory feedback manipulations. In response, speakers alter their production in the opposite direction of the shift so that their perceived production is closer to what they intended. This process relies on a combination of feedback and feedforward control mechanisms that are difficult to disentangle. The current study describes and tests a simple 3-parameter mathematical model that quantifies the relative contribution of feedback and feedforward control mechanisms to sensorimotor adaptation. The model is a simplified version of the DIVA model, an adaptive neural network model of speech motor control. The three fitting parameters of SimpleDIVA are associated with the three key subsystems involved in speech motor control, namely auditory feedback control, somatosensory feedback control, and feedforward control. The model is tested through computer simulations that identify optimal model fits to six existing sensorimotor adaptation datasets. We show its utility in (1) interpreting the results of adaptation experiments involving the first and second formant frequencies as well as fundamental frequency; (2) assessing the effects of masking noise in adaptation paradigms; (3) fitting more than one perturbation dimension simultaneously; (4) examining sensorimotor adaptation at different timepoints in the production signal; and (5) quantitatively predicting responses in one experiment using parameters derived from another experiment. The model simulations produce excellent fits to real data across different types of perturbations and experimental paradigms (mean correlation between data and model fits across all six studies = 0.95 ± 0.02). 
The model parameters provide a mechanistic explanation for the behavioral responses to the adaptation paradigm that is not readily available from the behavioral responses alone. Overall, SimpleDIVA offers new insights into speech and voice motor control and has the potential to inform future directions of speech rehabilitation research in disordered populations. Simulation software, including an easy-to-use graphical user interface, is publicly available to facilitate the use of the model in future studies.
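To make the idea of separating feedback and feedforward contributions concrete, here is a toy trial-by-trial simulation with three gains: one for auditory feedback, one for somatosensory feedback, and one for feedforward updating. It is a sketch loosely inspired by the description above, not the published SimpleDIVA equations or software; the function name `simulate`, the parameter names, the gain values, and the perturbation schedule are all assumptions chosen for illustration.

```python
def simulate(perturbation, target=0.0, a_aud=0.3, a_som=0.1, lr_ff=0.2):
    """Simulate produced output (e.g., in cents re: target) on each trial.

    perturbation: per-trial shift applied to auditory feedback only.
    a_aud, a_som: hypothetical auditory and somatosensory feedback gains.
    lr_ff: hypothetical feedforward learning rate.
    """
    ff = target  # feedforward command starts at the target
    produced = []
    for p in perturbation:
        heard = ff + p  # auditory feedback includes the shift
        felt = ff       # somatosensory feedback is unperturbed
        # Within-trial correction: auditory error pushes against the shift,
        # somatosensory error pulls production back toward the target.
        correction = a_aud * (target - heard) + a_som * (target - felt)
        produced.append(ff + correction)
        # Part of the correction is carried into the next trial's command.
        ff += lr_ff * correction
    return produced


# Hypothetical schedule: 5 baseline trials, 20 trials with feedback
# shifted down 100 cents, then 5 trials with the shift removed.
pert = [0.0] * 5 + [-100.0] * 20 + [0.0] * 5
out = simulate(pert)
print([round(x, 1) for x in out])
```

With these illustrative gains, production rises against the downward shift but compensates only partially, and it remains above baseline for the first trials after the shift is removed, which is the aftereffect that adaptation studies look for.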
... Finally, our results showed that the adaptation magnitudes observed for children and adults who do not stutter were similar to each other; however, children who stutter showed greater adaptation magnitude than that observed in adults who stutter. Our results for children and adults who do not stutter are consistent with several previous studies of sensorimotor adaptation in the speech motor system of normally fluent speakers (MacDonald, Johnson, Forsythe, Plante, & Munhall, 2012; Scheerer, Jacobson, & Jones, 2016; Shiller, Gracco, & Rvachew, 2010; Shiller & Rochon, 2014). These studies have investigated auditory-motor adaptation (a) in response to perturbations in different parameters of speech, and (b) in normally fluent children in different age groups (compared to adults). ...
Article
Previous studies have shown that adults who stutter produce smaller corrective motor responses to compensate for unexpected auditory perturbations in comparison to adults who do not stutter, suggesting that stuttering may be associated with deficits in integration of auditory feedback for online speech monitoring. In this study, we examined whether stuttering is also associated with deficiencies in integrating and using discrepancies between expected and received auditory feedback to adaptively update motor programs for accurate speech production. Using a sensorimotor adaptation paradigm, we measured adaptive speech responses to auditory formant frequency perturbations in adults and children who stutter and their matched nonstuttering controls. We found that the magnitude of the speech adaptive response for children who stutter did not differ from that of fluent children. However, the adaptation magnitude of adults who stutter in response to auditory perturbation was significantly smaller than the adaptation magnitude of adults who do not stutter. Together these results indicate that stuttering is associated with deficits in integrating discrepancies between predicted and received auditory feedback to calibrate the speech production system in adults but not children. This auditory-motor integration deficit thus appears to be a compensatory effect that develops over years of stuttering.
Article
Auditory feedback is an important component of speech motor control, but its precise role in developing speech is less understood. The role of auditory feedback in development was probed by perturbing the speech of children 4-9 years old. The vowel sound /ɛ/ was shifted to /æ/ in real time and presented to participants as their own auditory feedback. Analyses of the resultant formant magnitude changes in the participants' speech indicated that children compensated and adapted by adjusting their formants to oppose the perturbation. Older and younger children responded to perturbation differently in F1 and F2. The compensatory change in F1 was greater for younger children, whereas the increase in F2 was greater for older children. Adaptation aftereffects were observed in both groups. Exploratory directional analyses in the two-dimensional formant space indicated that older children responded more directly and less variably to the perturbation than younger children, shifting their vowels back toward the vowel sound /ɛ/ to oppose the perturbation. Findings support the hypothesis that auditory feedback integration continues to develop between the ages of 4 and 9 years old such that the differences in the adaptive and compensatory responses arise between younger and older children despite receiving the same auditory feedback perturbation.
Article
Various aspects of motherese, also known as infant-directed speech (IDS), have been studied for many years. As it is a widespread phenomenon, it is suspected to play some important roles in infant development. Therefore, our purpose was to provide an update of the evidence accumulated by reviewing all of the empirical or experimental studies that have been published since 1966 on IDS driving factors and impacts. Two databases were screened and 144 relevant studies were retained. General linguistic and prosodic characteristics of IDS were found in a variety of languages, and IDS was not restricted to mothers. IDS varied with factors associated with the caregiver (e.g., cultural, psychological and physiological) and the infant (e.g., reactivity and interactive feedback). IDS promoted infants' affect, attention and language learning. Cognitive aspects of IDS have been widely studied whereas affective ones still need to be developed. However, during interactions, the following two observations were notable: (1) IDS prosody reflects emotional charges and meets infants' preferences, and (2) mother-infant contingency and synchrony are crucial for IDS production and prolongation. Thus, IDS is part of an interactive loop that may play an important role in infants' cognitive and social development.
Article
Speech motor control develops gradually as the acoustics of speech are mapped onto the positions and movements of the articulators. In this event-related potential (ERP) study, children and adults aged 4-30 years produced vocalizations while exposed to frequency-altered feedback. Vocal pitch variability and the latency of vocal responses were found to differ as a function of age. ERP responses indexed by the P1-N1-P2 complex were also modulated as a function of age. P1 amplitudes decreased with age, whereas N1 and P2 amplitudes increased with age. In addition, a correlation between vocal variability and N1 amplitudes was found, suggesting a complex interaction between behavioural and neurological responses to frequency-altered feedback. These results suggest that the neural systems that integrate auditory feedback during vocal motor control undergo robust changes with age and physiological development.
Article
This article presents a theoretical perspective on stuttering based on numerous findings regarding speech and nonspeech neuromotor control in individuals who stutter in combination with recent empirical data and theoretical models from the literature on the neuroscience of motor control. Specifically, this perspective on stuttering relies heavily on recent work regarding feedforward and feedback control schemes; the formation, consolidation, and updating of inverse and forward internal models of the motor systems; and cortical, subcortical, and cerebellar activation patterns during speech and nonspeech motor tasks. Against this background, we propose that stuttering may result when producing speech (a) with unstable or insufficiently activated internal models or (b) with a motor
Article
Background Recent research has addressed the suppression of cortical sensory responses to altered auditory feedback that occurs at utterance onset regarding speech. However, there is reason to assume that the mechanisms underlying sensorimotor processing at mid-utterance are different than those involved in sensorimotor control at utterance onset. The present study attempted to examine the dynamics of event-related potentials (ERPs) to different acoustic versions of auditory feedback at mid-utterance. Methodology/Principal findings Subjects produced a vowel sound while hearing their pitch-shifted voice (100 cents), a sum of their vocalization and pure tones, or a sum of their vocalization and white noise at mid-utterance via headphones. Subjects also passively listened to playback of what they heard during active vocalization. Cortical ERPs were recorded in response to different acoustic versions of feedback changes during both active vocalization and passive listening. The results showed that, relative to passive listening, active vocalization yielded enhanced P2 responses to the 100 cents pitch shifts, whereas suppression effects of P2 responses were observed when voice auditory feedback was distorted by pure tones or white noise. Conclusion/Significance The present findings, for the first time, demonstrate a dynamic modulation of cortical activity as a function of the quality of acoustic feedback at mid-utterance, suggesting that auditory cortical responses can be enhanced or suppressed to distinguish self-produced speech from externally-produced sounds.
Article
Background Auditory feedback is important for accurate control of voice fundamental frequency (F0). The purpose of this study was to address whether task instructions could influence the compensatory responding and sensorimotor adaptation that has been previously found when participants are presented with a series of frequency-altered feedback (FAF) trials. Trained singers and musically untrained participants (nonsingers) were informed that their auditory feedback would be manipulated in pitch while they sang the target vowel /ɑ/. Participants were instructed to either ‘compensate’ for, or ‘ignore’ the changes in auditory feedback. Whole utterance auditory feedback manipulations were either gradually presented (‘ramp’) in -2 cent increments down to -100 cents (1 semitone) or were suddenly (‘constant’) shifted down by 1 semitone. Results Results indicated that singers and nonsingers could not suppress their compensatory responses to FAF, nor could they reduce the sensorimotor adaptation observed during both the ramp and constant FAF trials. Conclusions Compared to previous research, these data suggest that musical training is effective in suppressing compensatory responses only when FAF occurs after vocal onset (500-2500 ms). Moreover, our data suggest that compensation and adaptation are automatic and are influenced little by conscious control.
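The shift sizes in these abstracts are given in cents, where 100 cents equal one semitone and 1200 cents equal one octave. The sketch below shows the standard conversion between cents and frequency ratios and enumerates the shift values of a ramp that moves in -2 cent increments down to -100 cents. The function names and the 220 Hz example voice are assumptions for illustration, and how the study mapped ramp steps onto trials is not stated here, so the list only enumerates the shift sizes.

```python
import math


def shift_hz(f0_hz, cents):
    """Return the frequency obtained by shifting f0_hz by the given cents."""
    return f0_hz * 2.0 ** (cents / 1200.0)


def cents_between(f_hz, ref_hz):
    """Return the interval from ref_hz to f_hz in cents."""
    return 1200.0 * math.log2(f_hz / ref_hz)


# Shift values for a ramp in -2 cent increments down to -100 cents.
ramp_cents = [-2 * k for k in range(1, 51)]  # -2, -4, ..., -100

# Hypothetical 220 Hz voice heard one semitone (100 cents) lower.
heard_hz = shift_hz(220.0, -100.0)
print(f"{len(ramp_cents)} ramp steps; 220 Hz shifted -100 cents = {heard_hz:.2f} Hz")
```

Because cents are logarithmic, the same shift in cents corresponds to a larger change in Hz for a higher-pitched voice, which is one reason pitch-shift studies report perturbations and responses in cents rather than Hz.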
Article
Objective: The present event-related potential (ERP) study examined the developmental mechanisms of auditory-vocal integration in normally developing children. Neurophysiological responses to altered auditory feedback were recorded to determine whether they are affected by age and sex. Method: Forty-two children were pairwise matched for sex and were divided into a group of younger (10-12 years) and a group of older (13-15 years) children. Twenty healthy young adults (20-25 years) also participated in the experiment. ERPs were recorded from the participants who heard their voice pitch feedback unexpectedly shifted -50, -100, or -200 cents during sustained vocalization. Results: P1 amplitudes became smaller as subjects increased in age from childhood to adulthood, and males produced larger N1 amplitudes than females. An age-related decrease in the P1-N1 latencies was also found: latencies were shorter in young adults than in school children. A complex age-by-sex interaction was found for the P2 component, where an age-related increase in P2 amplitudes existed only in girls, and boys produced longer P2 latencies than girls but only in the older children. Conclusions: These findings demonstrate that neurophysiological responses to pitch errors in voice auditory feedback depend on age and sex in normally developing children. Significance: The present study provides evidence that there is a sex-specific development of the neural mechanisms involved in auditory-vocal integration.
Article
In a series of 5 auditory preference experiments, 120 5-month-old infants were presented with Approval and Prohibition vocalizations in infant-directed (ID) and adult-directed (AD) English, and in ID speech in nonsense English and 3 unfamiliar languages, German, Italian, and Japanese. Dependent measures were looking-time to the side of stimulus presentation, and positive and negative facial affect. No consistent differences in looking-time were found. However, infants showed small but significant differences in facial affect in response to ID vocalizations in every language except Japanese. Infants smiled more to Approvals, and when they showed negative affect, it was more likely to occur in response to Prohibitions. Infants did not show differential affect in response to Approvals and Prohibitions in AD speech. The results indicate that young infants can discriminate affective vocal expressions in ID speech in several languages and that ID speech is more effective than AD speech in eliciting infant affect.
Article
Research on the control of visually guided limb movements indicates that the brain learns and continuously updates an internal model that maps the relationship between motor commands and sensory feedback. A growing body of work suggests that an internal model that relates motor commands to sensory feedback also supports vocal control. There is evidence from arm-reaching studies that shows that when provided with a contextual cue, the motor system can acquire multiple internal models, which allows an animal to adapt to different perturbations in diverse contexts. In this study we show that trained singers can rapidly acquire multiple internal models regarding voice fundamental frequency (F0). These models accommodate different perturbations to ongoing auditory feedback. Participants heard three musical notes and reproduced each one in succession. The musical targets could serve as a contextual cue to indicate which direction (up or down) feedback would be altered on each trial; however, participants were not explicitly instructed to use this strategy. When participants were gradually exposed to altered feedback, adaptation was observed immediately following vocal onset. Aftereffects were target specific and did not influence vocal productions on subsequent trials. When target notes were no longer a contextual cue, adaptation occurred during altered feedback trials and evidence for trial-by-trial adaptation was found. These findings indicate that the brain is exceptionally sensitive to the deviations between auditory feedback and the predicted consequence of a motor command during vocalization. Moreover, these results indicate that, with contextual cues, the vocal control system may maintain multiple internal models that are capable of independent modification during different tasks or environments.
Article
The present study was intended to address how the online control of voice fundamental frequency (F0) during vocalization develops from school children to young adults. Nineteen school children (7-12 years old) and twenty-one young adults (19-27 years old) participated in this experiment. They were asked to sustain a vowel sound /u/ while their voice pitch feedback was randomly shifted (+/-50, +/-100, +/-200, and +/-500 cents) and fed back to them instantaneously over headphones. Results showed that school children produced significantly larger but slower compensatory responses to voice pitch feedback perturbations than young adults. Response latencies became longer with the increase in pitch perturbation magnitude, but no systematic changes were found as a function of stimulus direction. In addition, the number of responses "following" the stimulus direction across different stimulus magnitudes for school children was greater than for young adults. These findings demonstrate developmental changes of vocal responses to pitch feedback perturbations during vocalization from school children to young adults, and suggest that vocal responses can serve as an objective index of the maturation of the audio-vocal system.
Article
Vocal sensory-motor adaptation is typically studied by introducing a prolonged change in auditory feedback. While it may be preferable to perform multiple blocks of adaptation within a single experiment, it is possible that a carry-over effect from previous blocks of adaptation may affect the results of subsequent blocks. Speakers were asked to vocalize an /a/ sound and match a target note during ten adaptation blocks. Each block represented a unique combination of target note and shift direction. The adaptation response was found to be similar for all blocks, indicating that there were no carry-over effects from previous blocks of adaptation.
Article
Singing requires accurate control of the fundamental frequency (F0) of the voice. This study examined trained singers' and untrained singers' (nonsingers') sensitivity to subtle manipulations in auditory feedback and the subsequent effect on the mapping between F0 feedback and vocal control. Participants produced the consonant-vowel /ta/ while receiving auditory feedback that was shifted up and down in frequency. Results showed that singers and nonsingers compensated to a similar degree when presented with frequency-altered feedback (FAF); however, singers' F0 values were consistently closer to the intended pitch target. Moreover, singers initiated their compensatory responses when auditory feedback was shifted up or down 6 cents or more, compared to nonsingers who began compensating when feedback was shifted up 26 cents and down 22 cents. Additionally, examination of the first 50 ms of vocalization indicated that participants commenced subsequent vocal utterances, during FAF, near the F0 value on previous shift trials. Interestingly, nonsingers commenced F0 productions below the pitch target and increased their F0 until they matched the note. Thus, singers and nonsingers rely on an internal model to regulate voice F0, but singers' models appear to be more sensitive in response to subtle discrepancies in auditory feedback.
Article
A developmental study of delayed auditory feedback (DAF) indicated that: (1) DAF disrupts the speech of children more than adults, for all delays in feedback; (2) The delay for maximal interference varies with age. The older a subject, the shorter the delay producing maximal interference with his speech; (3) The peak interference delay remains at 0.2 sec, when adults reduce their rate of speech by drawing out speech sounds. This finding suggested that the critical DAF interval is independent of the duration of speech sounds in the returning auditory signal; (4) Slowing down the rate of speech as described above, reduced the amount of stuttering under DAF; (5) However, a subject's maximum rate of speech was significantly correlated with the duration of the delay producing maximal interference with his speech. The slower the subject's maximum rate of speech, the longer the peak interference delay. A correlation of maximum speech rate and frequency of DAF stuttering was also significant. The slower a subject's maximum speech rate, the more he tended to stutter under DAF. Since voluntary prolongation of speech sounds had the opposite effect, decreasing rather than increasing stuttering, it was suggested that: (1) mechanisms determining the maximum speech rate are to some extent different from those governing the prolongation of speech sounds; (2) both the amount of stuttering under DAF and the peak interference delay are related to some as yet unknown factor or set of factors determining the maximum rate of speech, and, (3) this factor is age‐linked since the maximum rate of speech varies inversely with age.
Article
Recent studies have shown that when phonating subjects hear their voice pitch feedback shift upward or downward, they respond with a change in voice fundamental frequency (F0) output. Three experiments were performed to improve our understanding of this response and to explore the effects of different stimulus variables on voice F0 responses to pitch-shift stimuli. In experiment 1, it was found that neither the absolute level of feedback intensity nor the presence of pink masking noise significantly affect magnitude or latency of the voice F0 response. In experiment 2, changes in stimulus magnitude led to no systematic differences in response magnitudes or latencies. However, as stimulus magnitude was increased from 25 to 300 cents, the proportion of responses that changed in the direction opposite that of the stimulus ("opposing" response) decreased. A corresponding increase was observed in the proportion of same direction responses ("following" response). In experiment 3, increases in pitch-shift stimulus durations from 20 to 100 ms led to no differences in the F0 response. Durations between 100 and 500 ms led to longer duration voice F0 responses with greater response magnitude, and suggested the existence of a second F0 response with a longer latency than the first.
Article
Previous findings have shown that subjects respond to an alteration, or shift, of auditory feedback pitch with a change in voice fundamental frequency (F0). When pitch shifts exceeding 500 ms in duration were presented, subjects' averaged responses appeared to consist of both an early and a late component. The latency of the second response was long enough to be produced voluntarily. To test the hypothesis that there are two responses to pitch-shift stimuli and to clarify the role of intention, subjects were instructed to change their voice F0 in the opposite direction of the pitch-shift stimulus, in the same direction, or not to respond at all. In a second group, subjects were tested under the above conditions as well as under instructions to raise voice F0 or to lower F0 as rapidly as possible upon hearing a pitch shift. Results showed that, when given instructions to produce a voluntary response, subjects made both an early vocal response (VR1) and a later vocal response (VR2). The second response, VR2, was almost always made in the instructed direction, whereas VR1 was often made incorrectly. The latency of VR1 was reduced under instructions to respond to feedback pitch shifts by changing voice F0 in the opposite direction, compared with that when told to ignore the pitch shifts. Latency and amplitude measures of VR2 differed under the various experimental conditions. These results demonstrate that there are two responses to pitch-shift stimuli. The first is relatively automatic but may be modulated by instructions to the participant. The second response is probably a voluntary one.
Article
The purpose of this article is to demonstrate that self-produced auditory feedback is sufficient to train a mapping between auditory target space and articulator space under conditions in which the structures of speech production are undergoing considerable developmental restructuring. One challenge for competing theories that propose invariant constriction targets is that it is unclear what teaching signal could specify constriction location and degree so that a mapping between constriction target space and articulator space can be learned. It is predicted that a model trained by auditory feedback will accomplish speech goals, in auditory target space, by continuously learning to use different articulator configurations to adapt to the changing acoustic properties of the vocal tract during development. The Maeda articulatory synthesis part of the DIVA neural network model (Guenther et al., 1998) was modified to reflect the development of the vocal tract by using measurements taken from MR images of children. After training, the model was able to maintain the 11 English vowel targets in auditory planning space, utilizing varying articulator configurations, despite morphological changes that occur during development. The vocal-tract constriction pattern (derived from the vocal-tract area function) as well as the formant values varied during the course of development in correspondence with morphological changes in the structures involved with speech production. Despite changes in the acoustical properties of the vocal tract that occur during the course of development, the model was able to demonstrate motor-equivalent speech production under lip-restriction conditions. The model accomplished this in a self-organizing manner even though there was no prior experience with lip restriction during training.
Article
Several behavioral and brain imaging studies have demonstrated a significant interaction between speech perception and speech production. In this study, auditory cortical responses to speech were examined during self-production and feedback alteration. Magnetic field recordings were obtained from both hemispheres in subjects who spoke while hearing controlled acoustic versions of their speech feedback via earphones. These responses were compared to recordings made while subjects listened to a tape playback of their production. The amplitude of tape playback was adjusted to match the amplitude of self-produced speech. Recordings of evoked responses to both self-produced and tape-recorded speech were obtained free of movement-related artifacts. Responses to self-produced speech were weaker than were responses to tape-recorded speech. Responses to tones were also weaker during speech production, when compared with responses to tones recorded in the presence of speech from tape playback. However, responses evoked by gated noise stimuli did not differ for recordings made during self-produced speech versus recordings made during tape-recorded speech playback. These data suggest that during speech production, the auditory cortex (1) attenuates its sensitivity and (2) modulates its activity as a function of the expected acoustic feedback.
Article
The pitch-shift reflex is a sophisticated system that produces a "compensatory" response in voice F0 that is opposite in direction to a change in voice pitch feedback (pitch-shift stimulus), thus correcting for the discrepancy between the intended voice F0 and the feedback pitch. In order to more fully exploit the pitch-shift reflex as a tool for studying the influence of sensory feedback mechanisms underlying voice control, the optimal characteristics of the pitch-shift stimulus must be understood. The present study was undertaken to assess the effects of altering the duration of the interstimulus interval (ISI) and the number of trials comprising an average on measures of the pitch-shift reflex. Pitch-shift stimuli were presented to vocalizing subjects with ISI of 5.0, 2.5, 1.0, and 0.5 s to determine if an increase in ISI altered response properties. With each ISI, measures of event-related averages of the first 10, 15, 20, or 30 pitch-shift reflex responses were compared to see if increases in the number of responses comprising an event-related average altered response properties. Measures of response latency, peak time, magnitude, and prevalence were obtained for all ISI and average conditions. While quantitative measures were similar across ISI and averaging conditions, we observed more instances of "non-responses" with averages of ten trials as well as at an ISI of 0.5 s. These findings suggest an ISI of 1.0 s and an average consisting of at least 15 trials produce optimal results. Future studies using these stimulus parameters may produce more reliable data due to the fivefold decrease in subject participation time and a concomitant decrease in fatigue, boredom, and inattention.
Article
Like any other surgery requiring anesthesia, cochlear implantation in the first few years of life carries potential risks, which makes it important to assess the potential benefits. This study introduces a new method to assess the effect of age at implantation on cochlear implant outcomes: developmental trajectory analysis (DTA). DTA compares curves representing change in an outcome measure over time (i.e. developmental trajectories) for two groups of children that differ along a potentially important independent variable (e.g. age at intervention). This method was used to compare language development and speech perception outcomes in children who received cochlear implants in the second, third or fourth year of life. Within this range of age at implantation, it was found that implantation before the age of 2 resulted in speech perception and language advantages that were significant both from a statistical and a practical point of view. Additionally, the present results are consistent with the existence of a 'sensitive period' for language development, a gradual decline in language acquisition skills as a function of age.
Article
Little is known about the basic processes underlying the behavior of singing. This experiment was designed to examine differences in the representation of the mapping between fundamental frequency (F0) feedback and the vocal production system in singers and nonsingers. Auditory feedback regarding F0 was shifted down in frequency while participants sang the consonant-vowel /ta/. During the initial frequency-altered trials, singers compensated to a lesser degree than nonsingers, but this difference was reduced with continued exposure to frequency-altered feedback. After brief exposure to frequency altered auditory feedback, both singers and nonsingers suddenly heard their F0 unaltered. When participants received this unaltered feedback, only singers' F0 values were found to be significantly higher than their F0 values produced during baseline and control trials. These aftereffects in singers were replicated when participants sang a different note than the note they produced while hearing altered feedback. Together, these results suggest that singers rely more on internal models than nonsingers to regulate vocal productions rather than real time auditory feedback.
Article
Hearing one's own voice is important for regulating ongoing speech, and for mapping speech sounds onto articulator movements. However, it is currently unknown whether attention mediates changes in the relationship between motor commands and their acoustic output, which are necessary as growth and aging inevitably cause changes to the vocal tract. In this study, participants produced vocalizations while they heard their vocal pitch persistently shifted downward one semi-tone in both single- and dual-task conditions. During the single-task condition, participants vocalized while passively viewing a visual stream. During the dual-task condition, participants vocalized while also monitoring a visual stream for target letters, forcing participants to divide their attention. Participants' vocal pitch was measured across each vocalization, to index the extent to which their ongoing vocalization was modified as a result of the deviant auditory feedback. Smaller compensatory responses were recorded during the dual-task condition, suggesting that divided attention interfered with the use of auditory feedback for the regulation of ongoing vocalizations. Participants' vocal pitch was also measured at the beginning of each vocalization, before auditory feedback was available, to assess the extent to which the deviant auditory feedback was used to modify subsequent speech motor commands. Smaller changes in vocal pitch at vocalization onset were recorded during the dual-task condition, suggesting that divided attention diminished sensory-motor learning. Together the results of this study suggest that attention is required for the speech motor control system to make optimal use of auditory feedback for the regulation and planning of speech motor commands.
Article
Functional neuroimaging has dramatically accelerated our understanding of the neural computations underlying speech. Speaking involves interactions between motor planning and execution areas in the frontal lobe, somatosensory areas in the parietal lobe, and auditory areas in the temporal lobe, along with associated subcortical structures. Cortical activities are modulated by two reentrant loops: a basal ganglia loop involved in action selection and initiation and a cerebellar loop involved in generating the finely tuned motor commands that underlie fluent speech. Motor commands descending from the motor cortex to the cranial nerve nuclei are shaped by both feedforward-based and sensory feedback-based control mechanisms.
Article
Speech production requires the combined effort of a feedback control system driven by sensory feedback, and a feedforward control system driven by internal models. However, the factors dictating the relative weighting of these feedback and feedforward control systems are unclear. In this event-related potential (ERP) study, participants produced vocalizations while exposed to blocks of frequency-altered feedback (FAF) perturbations that were either predictable in magnitude (consistently either 50 or 100 cents), or unpredictable in magnitude (50 and 100-cent perturbations varying randomly within each vocalization). Vocal and P1-N1-P2 ERP responses revealed decreases in the magnitude and trial-to-trial variability of vocal responses, smaller N1-amplitudes, and shorter vocal, P1 and N1 response latencies following predictable FAF perturbation magnitudes. In addition, vocal response magnitudes correlated with N1 amplitudes, as well as vocal response latencies and P2 latencies. This pattern of results suggests that after repeated exposure to predictable FAF perturbations, the contribution of the feedforward control system increases. Examination of the presentation order of the FAF perturbations revealed smaller compensatory responses, P1 and P2 amplitudes, as well as shorter N1 latencies when the block of predictable 100-cent perturbations occurred prior to the block of predictable 50-cent perturbations. These results suggest that exposure to large perturbations modulates responses to subsequent perturbations of equal or smaller size. Similarly, exposure to a 100-cent perturbation, prior to a 50-cent perturbation within a vocalization, decreased the magnitude of vocal and N1 responses, but increased P1 and P2 latencies. Thus, exposure to a single perturbation can affect responses to subsequent perturbations.
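Perturbation sizes in these studies are given in cents, where 100 cents equals one semitone and 1200 cents equals one octave. The following is a minimal sketch of the standard conversion between hertz and cents; the function names and the `compensation_percent` helper are illustrative assumptions and are not taken from any of the studies summarized here.

```python
import math


def hz_to_cents(f0_hz, reference_hz):
    """Express f0_hz in cents relative to reference_hz (1200 cents per octave)."""
    return 1200.0 * math.log2(f0_hz / reference_hz)


def shift_by_cents(f0_hz, cents):
    """Return the frequency obtained by shifting f0_hz by the given number of cents."""
    return f0_hz * 2.0 ** (cents / 1200.0)


def compensation_percent(response_cents, perturbation_cents):
    """Hypothetical helper: vocal response opposing the perturbation,
    as a percentage of the perturbation size."""
    return -100.0 * response_cents / perturbation_cents


if __name__ == "__main__":
    baseline = 220.0  # assumed baseline F0 in Hz, for illustration only
    heard = shift_by_cents(baseline, -100.0)  # feedback shifted down one semitone
    print(round(heard, 2))
    print(compensation_percent(30.0, -100.0))
```

Working in cents rather than hertz makes responses comparable across speakers with different baseline pitch, since a given number of cents is the same frequency ratio at any F0.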
Article
A powerful pitch estimation algorithm called SWIPE has been developed for processing speech and music. SWIPE is shown to outperform existing algorithms on several publicly available speech and musical instrument databases, and a disordered speech database, reducing the gross error rate by 40%, relative to the best competing algorithm. In short, SWIPE estimates the pitch as the fundamental frequency of a sawtooth waveform, whose spectrum best matches the spectrum of the input signal. The short‐time Fourier transform of the sawtooth waveform provides an extension to older frequency‐based, sieve‐type estimation algorithms by providing smooth peaks with decaying amplitudes to correlate with the fundamental frequency (if present) and its harmonics. An improvement on the algorithm is achieved by using only the first and prime harmonics, which significantly reduces subharmonic errors commonly found in other pitch estimation algorithms.
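The effect of restricting a harmonic template to the first and prime harmonics can be illustrated with a toy count. This sketch is not SWIPE and does not use spectra: it only counts how many template harmonics of a candidate pitch coincide with partials of an idealized harmonic tone. All function names and parameter values are assumptions chosen for illustration.

```python
def first_and_prime_harmonics(max_harmonic):
    """Return harmonic number 1 followed by every prime up to max_harmonic."""
    def is_prime(n):
        if n < 2:
            return False
        i = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += 1
        return True

    return [1] + [n for n in range(2, max_harmonic + 1) if is_prime(n)]


def matched_fraction(candidate_f0, true_f0, template, n_true_partials, tol=1e-6):
    """Fraction of the candidate's template harmonics that land on a partial
    of an idealized tone with partials at true_f0 * 1..n_true_partials."""
    partials = [true_f0 * k for k in range(1, n_true_partials + 1)]
    hits = 0
    for h in template:
        freq = candidate_f0 * h
        if any(abs(freq - p) <= tol for p in partials):
            hits += 1
    return hits / len(template)


if __name__ == "__main__":
    true_f0 = 100.0
    subharmonic = true_f0 / 2.0  # the classic octave-below error candidate
    all_harmonics = list(range(1, 21))
    prime_template = first_and_prime_harmonics(20)
    print(matched_fraction(subharmonic, true_f0, all_harmonics, 10))
    print(matched_fraction(subharmonic, true_f0, prime_template, 10))
```

In this toy setting, the subharmonic candidate matches half of a full template but only one entry of the first-and-prime template, while the true pitch still matches every entry of its own template. That is one way to see why the restriction can suppress subharmonic errors.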
Article
Investigated auditory feedback and speech in 22 3-yr-old and 28 4-yr-old nursery school children and in 11 college students. The children told stories about familiar picture books while wearing earphones. In the Lombard procedure, several different levels of masking noise were presented through the earphones, and the S's vocal intensity at each noise level was measured. A significant Lombard effect was obtained for each group; vocal intensity increased as masking level increased. There was no developmental pattern, however. In the sidetone amplification procedure, the S's voice was fed back at several levels of amplification. All groups significantly lowered their vocal intensity as the amplification increased. There was a significant developmental pattern for sidetone amplification, with the older Ss showing greater effects than the 3-yr-olds. Findings indicate that auditory feedback is involved in the regulation of vocal intensity, even in 3-yr-olds. The developmental pattern for intensity control in speech does not support the supposition that auditory feedback initially is very important and then wanes in significance.
Article
Auditory feedback plays an important role in monitoring vocal output and determining when adjustments are necessary. In this study a group of untrained singers participated in a frequency altered feedback experiment to examine if accuracy at matching a note could predict the degree of compensation to auditory feedback that was shifted in frequency. Participants were presented with a target note and instructed to match the note in pitch and duration. Following the onset of the participants' vocalizations, their vocal pitch was shifted down one semi-tone at a random time during their utterance. This altered auditory feedback was instantaneously presented back to them through headphones. Results indicated that note matching accuracy did not correlate with compensation magnitude; however, a significant correlation was found between baseline variability and compensation magnitude. These results suggest that individuals with a more stable baseline fundamental frequency rely more on feedforward control mechanisms than individuals with more variable vocal production. This increased weighting of feedforward control means they are less sensitive to mismatches between their intended vocal production and auditory feedback.
Article
Across several independent studies, infants from a few days to 9 months of age have shown preferences for infant-directed (ID) over adult-directed (AD) speech. Moreover, 4-month-olds have been shown to prefer sine-wave analogs of the fundamental frequency of ID speech, suggesting that exaggerated pitch contours are prepotent stimuli for infants. The possibility of similar preferences by 1-month-olds was examined in a series of experiments, using a fixation-based preference procedure. Results from the first 2 experiments showed that 1-month-olds did not prefer the lower-frequency pitch characteristics of ID speech, even though 1-month-olds were able to discriminate low-pass filtered ID and AD speech. Since low-pass filtering may have distorted the fundamental frequency characteristics of ID speech, 1-month-olds were also tested with sine-wave analogs of the fundamental frequencies of the ID utterances. Infants in this third experiment also showed no preference for ID pitch contours. In the fourth experiment, 1-month-olds preferred a natural recording of ID speech over a version which preserved only its lower frequency prosodic features. From these results, it is argued that, although young infants are similar to older infants in their attraction to ID speech, their preferences depend on a wider range of acoustic features (e.g., spectral structure). It is suggested that exaggerated pitch contours which characterize ID speech may become salient communicative signals for infants through language-rich, interactive experiences with caretakers and increased perceptual acuity over the first months after birth.
Article
A theoretical overview and supporting data are presented about the control of the segmental component of speech production. Findings of "motor-equivalent" trading relations between the contributions of two constrictions to the same acoustic transfer function provide preliminary support for the idea that segmental control is based on acoustic or auditory-perceptual goals. The goals are determined partly by non-linear, quantal relations (called "saturation effects") between motor commands and articulatory movements and between articulation and sound. Since processing times would be too long to allow the use of auditory feedback for closed-loop error correction in achieving acoustic goals, the control mechanism must use a robust "internal model" of the relation between articulation and the sound output that is learned during speech acquisition. Studies of the speech of cochlear implant and bilateral acoustic neuroma patients provide evidence supporting two roles for auditory feedback in adults: maintenance of the internal model, and monitoring the acoustic environment to help assure intelligibility by guiding relatively rapid adjustments in "postural" parameters underlying average sound level, speaking rate and the amount of prosodically-based inflection of F0 and SPL.
Article
Species-specific vocalizations fall into two broad categories: those that emerge during maturation, independent of experience, and those that depend on early life interactions with conspecifics. Human language and the communication systems of a small number of other species, including songbirds, fall into this latter class of vocal learning. Self-monitoring has been assumed to play an important role in the vocal learning of speech and studies demonstrate that perception of your own voice is crucial for both the development and lifelong maintenance of vocalizations in humans and songbirds. Experimental modifications of auditory feedback can also change vocalizations in both humans and songbirds. However, with the exception of large manipulations of timing, no study to date has ever directly examined the use of auditory feedback in speech production under the age of 4. Here we use a real-time formant perturbation task to compare the response of toddlers, children, and adults to altered feedback. Children and adults reacted to this manipulation by changing their vowels in a direction opposite to the perturbation. Surprisingly, toddlers' speech didn't change in response to altered feedback, suggesting that long-held assumptions regarding the role of self-perception in articulatory development need to be reconsidered.
Article
Auditory sensory processing is an important element of the neural mechanisms controlling human vocalization. We evaluated which components of Event Related Potentials (ERP) elicited by the unexpected shift of fundamental frequency in a subject's own voice might correlate with his/her ability to process auditory information. A significant negative correlation between the latency of the N1 component of the ERP and the Montreal Battery of Evaluation of Amusia scores for Melodic organization was found. A possible functional role of neuronal activity underlying the N1 component in voice control mechanisms is discussed.
Article
This paper investigates the hypothesis that stuttering may result in part from impaired readout of feedforward control of speech, which forces persons who stutter (PWS) to produce speech with a motor strategy that is weighted too much toward auditory feedback control. Over-reliance on feedback control leads to production errors which, if they grow large enough, can cause the motor system to "reset" and repeat the current syllable. This hypothesis is investigated using computer simulations of a "neurally impaired" version of the DIVA model, a neural network model of speech acquisition and production. The model's outputs are compared to published acoustic data from PWS' fluent speech, and to combined acoustic and articulatory movement data collected from the dysfluent speech of one PWS. The simulations mimic the errors observed in the PWS subject's speech, as well as the repairs of these errors. Additional simulations were able to account for enhancements of fluency gained by slowed/prolonged speech and masking noise. Together these results support the hypothesis that many dysfluencies in stuttering are due to a bias away from feedforward control and toward feedback control. Educational objectives: The reader will be able to (a) describe the contribution of auditory feedback control and feedforward control to normal and stuttered speech production, (b) summarize the neural modeling approach to speech production and its application to stuttering, and (c) explain how the DIVA model accounts for enhancements of fluency gained by slowed/prolonged speech and masking noise.
Article
A large body of evidence suggests that the motor system maintains a forward model that predicts the sensory outcome of movements. When sensory feedback does not match the predicted consequences, a compensatory response corrects for the motor error and the forward model is updated to prevent future errors. Like other motor behaviours, vocalization relies on sensory feedback for the maintenance of forward models. In this study, we used a frequency altered feedback (FAF) paradigm to study the role of auditory feedback in the control of vocal pitch (F0). We adapted subjects to a one semitone shift and induced a perturbation by briefly removing the altered feedback. This was compared to a control block in which a 1 semitone perturbation was introduced into an unshifted trial, or trials were randomly shifted up 1 semitone, and a perturbation was introduced by removing the feedback alteration. The compensation response to mid-utterance perturbations was identical in all conditions, and was always smaller than the compensation to a shift at utterance onset. These results are explained by a change in the control strategy at utterance onset and mid-utterance. At utterance onset, auditory feedback is compared to feedback predicted by a forward model to ensure the pitch goal is achieved. However, after utterance onset, the control strategy switches and stabilization is maintained by comparing feedback to previous F0 production.
Article
This paper is concerned with methods for analyzing quantitative, non-categorical profile data, e.g., a battery of tests given to individuals in one or more groups. It is assumed that the variables have a multinormal distribution with an arbitrary variance-covariance matrix. Approximate procedures based on classical analysis of variance are presented, including an adjustment to the degrees of freedom resulting in conservative F tests. These can be applied to the case where the variance-covariance matrices differ from group to group. In addition, exact generalized multivariate analysis methods are discussed. Examples are given illustrating both techniques.
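A degrees-of-freedom adjustment of this kind is commonly implemented as the Greenhouse-Geisser epsilon correction. Identifying it with the adjustment described above is an assumption, since the abstract does not name it. The sketch below computes epsilon from a k x k covariance matrix and the resulting adjusted degrees of freedom for a single-group repeated-measures design; the function names and the single-group form are illustrative.

```python
def greenhouse_geisser_epsilon(cov):
    """Epsilon for a symmetric k x k covariance matrix given as a list of lists.

    epsilon = trace(C)^2 / ((k - 1) * sum of squared entries of C),
    where C is the double-centered covariance matrix.
    """
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    col_means = [sum(cov[i][j] for i in range(k)) / k for j in range(k)]
    grand_mean = sum(row_means) / k
    centered = [
        [cov[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centered for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)


def adjusted_df(epsilon, k, n):
    """Adjusted (numerator, denominator) degrees of freedom for k repeated
    measures on n subjects in one group (assumed design)."""
    return epsilon * (k - 1), epsilon * (k - 1) * (n - 1)


def conservative_df(k, n):
    """Degrees of freedom at the lower bound epsilon = 1 / (k - 1)."""
    return adjusted_df(1.0 / (k - 1), k, n)


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(greenhouse_geisser_epsilon(identity))
    print(conservative_df(3, 10))
```

Epsilon ranges from 1/(k - 1) to 1. Using the lower bound gives the most conservative test, because it assumes the largest possible departure from sphericity without estimating the covariance structure.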
Article
The traditional belief that audition plays only a minor role in infant vocal development depends upon evidence that deaf infants produce the same kinds of babbling sounds as hearing infants. Evidence in support of this position has been very limited. A more extensive comparison of vocal development in deaf and hearing infants indicates that the traditional belief is in error. Well-formed syllable production is established in the first 10 months of life by hearing infants but not by deaf infants, indicating that audition plays an important role in vocal development. The difference between babbling in the deaf and hearing is apparent if infant vocal sounds are observed from a metaphonological perspective, a view that takes account of the articulatory/acoustic patterns of speech sounds in all mature spoken languages.
Article
This article describes a neural network model that addresses the acquisition of speaking skills by infants and subsequent motor equivalent production of speech sounds. The model learns two mappings during a babbling phase. A phonetic-to-orosensory mapping specifies a vocal tract target for each speech sound; these targets take the form of convex regions in orosensory coordinates defining the shape of the vocal tract. The babbling process wherein these convex region targets are formed explains how an infant can learn phoneme-specific and language-specific limits on acceptable variability of articulator movements. The model also learns an orosensory-to-articulatory mapping wherein cells coding desired movement directions in orosensory space learn articulator movements that achieve these orosensory movement directions. The resulting mapping provides a natural explanation for the formation of coordinative structures. This mapping also makes efficient use of redundancy in the articulator system, thereby providing the model with motor equivalent capabilities. Simulations verify the model's ability to compensate for constraints or perturbations applied to the articulators automatically and without new learning and to explain contextual variability seen in human speech production.
Article
Infants learn language with remarkable speed. By the end of their second year they speak in sentences with an 'accent' typical of a native speaker. How does an individual acquire a specific language? While acknowledging the biological preparation for language, this review focuses on the effects of early language experience on infants' perceptual and perceptual-motor systems. The data show that by the time infants begin to master the higher levels of language--sound-meaning correspondences, contrastive phonology, and grammatical rules--their perceptual and perceptual-motor systems are already tuned to a specific language. The consequences of this are described in a developmental theory at the phonetic level that holds promise for higher levels of language.
Article
In a series of 5 auditory preference experiments, 120 5-month-old infants were presented with Approval and Prohibition vocalizations in infant-directed (ID) and adult-directed (AD) English, and in ID speech in nonsense English and 3 unfamiliar languages, German, Italian, and Japanese. Dependent measures were looking-time to the side of stimulus presentation, and positive and negative facial affect. No consistent differences in looking-time were found. However, infants showed small but significant differences in facial affect in response to ID vocalizations in every language except Japanese. Infants smiled more to Approvals, and when they showed negative affect, it was more likely to occur in response to Prohibitions. Infants did not show differential affect in response to Approvals and Prohibitions in AD speech. The results indicate that young infants can discriminate affective vocal expressions in ID speech in several languages and that ID speech is more effective than AD speech in eliciting infant affect.
Article
Auditory feedback has been suggested to be important for voice fundamental frequency (F0) control. The present study featured a new technique for testing this hypothesis by which the pitch of a subject's voice was modulated, fed back over earphones, and the resultant change in the emitted voice F0 was measured. The responses of 67 normal, healthy young adults were recorded as they attempted to ignore intermittent upward or downward shifts in pitch feedback while they sustained steady vowel sounds (/a/) or sang musical scales. Ninety-six percent of subjects increased their F0 when the feedback pitch was decreased, and 78% of subjects decreased their F0 when the pitch feedback was increased. Latencies of responses ranged from 104 to 223 ms. Results indicate people normally rely on pitch feedback to control voice F0.
Article
Infants learn language with remarkable speed, but how they do it remains a mystery. New data show that infants use computational strategies to detect the statistical and prosodic patterns in language input, and that this leads to the discovery of phonemes and words. Social interaction with another human being affects speech learning in a way that resembles communicative learning in songbirds. The brain's commitment to the statistical and prosodic patterns that are experienced early in life might help to explain the long-standing puzzle of why infants are better language learners than adults. Successful learning by infants, as well as constraints on that learning, are changing theories of language acquisition.
Article
This paper describes a neural model of speech acquisition and production that accounts for a wide range of acoustic, kinematic, and neuroimaging data concerning the control of speech movements. The model is a neural network whose components correspond to regions of the cerebral cortex and cerebellum, including premotor, motor, auditory, and somatosensory cortical areas. Computer simulations of the model verify its ability to account for compensation to lip and jaw perturbations during speech. Specific anatomical locations of the model's components are estimated, and these estimates are used to simulate fMRI experiments of simple syllable production.
Article
Speech production involves the integration of auditory, somatosensory, and motor information in the brain. This article describes a model of speech motor control in which a feedforward control system, involving premotor and primary motor cortex and the cerebellum, works in concert with auditory and somatosensory feedback control systems that involve both sensory and motor cortical areas. New speech sounds are learned by first storing an auditory target for the sound, then using the auditory feedback control system to control production of the sound in early repetitions. Repeated production of the sound leads to tuning of feedforward commands which eventually supplant the feedback-based control signals. Although parts of the model remain speculative, it accounts for a wide range of kinematic, acoustic, and neuroimaging data collected during speech production and provides a framework for investigating communication disorders that involve malfunction of the cerebral cortex and interconnected subcortical structures. Learning outcomes: Readers will be able to: (1) describe several types of learning that occur in the sensory-motor system during babbling and early speech, (2) identify three neural control subsystems involved in speech production, (3) identify regions of the brain involved in monitoring auditory and somatosensory feedback during speech production, and (4) identify regions of the brain involved in feedforward control of speech.
Article
The neural substrates underlying auditory feedback control of speech were investigated using a combination of functional magnetic resonance imaging (fMRI) and computational modeling. Neural responses were measured while subjects spoke monosyllabic words under two conditions: (i) normal auditory feedback of their speech and (ii) auditory feedback in which the first formant frequency of their speech was unexpectedly shifted in real time. Acoustic measurements showed compensation to the shift within approximately 136 ms of onset. Neuroimaging revealed increased activity in bilateral superior temporal cortex during shifted feedback, indicative of neurons coding mismatches between expected and actual auditory signals, as well as right prefrontal and Rolandic cortical activity. Structural equation modeling revealed increased influence of bilateral auditory cortical areas on right frontal areas during shifted speech, indicating that projections from auditory error cells in posterior superior temporal cortex to motor correction cells in right frontal cortex mediate auditory feedback control of speech.
  • N E Scheerer
N. E. Scheerer et al. / Neuroscience xxx (2015) xxx–xxx 9
Camacho A, Harris JG (2007) Computing pitch of speech and music
Chase RA, Sutton S, First D, Zubin J (1961) A developmental study of
Hawco CS, Jones JA (2009) Control of vocalization at utterance
Hawco CS, Jones JA (2010) Multiple instances of vocal sensorimotor