Article

Neural Coding of Phonemic Fricative Contrast With and Without Hearing Aid


Abstract

To determine whether auditory event-related potentials (ERPs) to a phonemic fricative contrast ("s" and "sh") show significant differences in listening conditions with or without a hearing aid and whether the aided condition significantly alters a listener's ERP responses to the fricative speech sounds. The raw EEG data were collected using a 64-channel system from 10 healthy adult subjects with normal hearing. The fricative stimuli were digitally edited versions of naturally produced syllables, /sa/ and /ʃa/. The evoked responses were derived in unaided and aided conditions by using an alternating block design with a passive listening task. Peak latencies and amplitudes of the P1-N1-P2 components and the N1' and P2' peaks of the acoustic change complex (ACC) were analyzed. The evoked N1 and N1' responses to the fricative sounds significantly differed in the unaided condition. The fricative contrast also elicited distinct N1-P2 responses in the aided condition. While the aided condition increased and delayed the N1 and ACC responses, significant differences in the P1-N1-P2 and ACC components were still observed, which would support fricative contrast perception at the cortical level. Despite significant alterations in the ERP responses by the aided condition, normal-hearing adult listeners showed distinct neural coding patterns for the voiceless fricative contrast, "s" and "sh," with or without a hearing aid.
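The peak measures analyzed above (latencies and amplitudes of the P1-N1-P2 components and the ACC peaks) can be extracted from an averaged waveform by a windowed extremum search. Below is a minimal Python/NumPy sketch; the search windows, the random stand-in waveform, and the helper name find_peak are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np

def find_peak(waveform, times, window, polarity):
    """Return (latency_s, amplitude) of the extreme value inside a window.

    waveform : 1-D averaged ERP for one channel (microvolts)
    times    : time axis in seconds, same length as waveform
    window   : (start_s, end_s) hypothetical search window for the component
    polarity : +1 for positive peaks (P1, P2), -1 for negative peaks (N1)
    """
    mask = (times >= window[0]) & (times <= window[1])
    idx = np.argmax(waveform[mask] * polarity)   # extremum within the window
    return times[mask][idx], waveform[mask][idx]

# Stand-in averaged waveform: 1-s epoch sampled at 500 Hz.
fs = 500
times = np.arange(-0.1, 0.9, 1 / fs)
erp = np.random.default_rng(0).normal(0, 0.5, times.size)

# Hypothetical windows for the onset P1-N1-P2 and the later ACC peaks.
for name, win, pol in [("P1", (0.04, 0.09), +1), ("N1", (0.08, 0.16), -1),
                       ("P2", (0.15, 0.26), +1), ("N1'", (0.28, 0.40), -1),
                       ("P2'", (0.36, 0.50), +1)]:
    lat, amp = find_peak(erp, times, win, pol)
    print(f"{name}: {lat * 1000:.0f} ms, {amp:.2f} uV")
```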


... Previous work in adult listeners with NH indicates ERPs are reliably elicited by the voiceless fricatives /s/ and /ʃ/ in the consonant-initial context with and without hearing aid amplification (Miller and Zhang 2014b). Miller and Zhang (2014b) recorded aided and unaided ERP responses to a /sɑ/-/ʃɑ/ contrast that tightly controlled frication duration, amplitude, and dynamic formant transition cues in the speech sounds. The unaided results showed that differences in spectral frication alone elicited significantly different ERP responses to the initial consonant (Miller and Zhang 2014b). Hearing aid amplification acoustically modified the contrast, but because the /s/-/ʃ/ stimuli were still acoustically distinct, the aided ERP waveforms to the fricative also significantly differed (Miller and Zhang 2014b). ...
Article
Full-text available
Background: Cortical auditory event-related potentials are a potentially useful clinical tool to objectively assess speech outcomes with rehabilitative devices. Whether hearing aids reliably encode the spectrotemporal characteristics of fricative stimuli in different phonological contexts and whether these differences result in distinct neural responses with and without hearing aid amplification remain unclear. Purpose: To determine whether the neural coding of the voiceless fricatives /s/ and /ʃ/ in the syllable-final context reliably differed without hearing aid amplification and whether hearing aid amplification altered neural coding of the fricative contrast. Research Design: A repeated-measures, within-subject design was used to compare the neural coding of a fricative contrast with and without hearing aid amplification. Study Sample: Ten adult listeners with normal hearing participated in the study. Data Collection and Analysis: Cortical auditory event-related potentials were elicited to an /ɑs/–/ɑʃ/ vowel-fricative contrast in unaided and aided listening conditions. Neural responses to the speech contrast were recorded at 64 electrode sites. Peak latencies and amplitudes of the cortical response waveforms to the fricatives were analyzed using repeated-measures analysis of variance. Results: The P2' component of the acoustic change complex significantly differed for the syllable-final fricative contrast with and without hearing aid amplification. Hearing aid amplification differentially altered the neural coding of the contrast across frontal, temporal, and parietal electrode regions. Conclusions: Hearing aid amplification altered the neural coding of syllable-final fricatives. However, the contrast remained acoustically distinct in the aided and unaided conditions, and cortical responses to the fricative significantly differed with and without the hearing aid.
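The repeated-measures analysis of variance on peak measures described above can be set up with two within-subject factors (fricative identity and listening condition). A minimal sketch using statsmodels' AnovaRM; the randomly generated latencies and the column names are stand-ins for illustration, not the study's data.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
rows = []
for subj in range(1, 11):                       # ten listeners
    for fricative in ("s", "sh"):
        for condition in ("unaided", "aided"):
            rows.append({"subject": subj,
                         "fricative": fricative,
                         "condition": condition,
                         "p2_latency": rng.normal(220, 15)})  # stand-in ms values
df = pd.DataFrame(rows)

# Balanced design: one observation per subject and factor-level combination.
res = AnovaRM(df, depvar="p2_latency", subject="subject",
              within=["fricative", "condition"]).fit()
print(res)
```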
... This paradigm contains short blocks of repeated stimulus presentations with equal stimulus ratios and is useful for comparing the neural processing of different acoustic stimuli, including phonemic speech contrasts. It has been shown that N1-P2 responses recorded with this alternating short block paradigm are sensitive to variations in the acoustic features of different speech stimuli that are presented in alternating blocks as well as the effects of different listening conditions on the neural coding of these stimuli (Miller & Zhang, 2014; Zhang et al., 2011). However, while the single block and alternating short block designs allow for an examination of the cortical encoding of acoustic features of different stimuli, they are not appropriate for measuring sensory discrimination abilities. ...
... The average impedance of electrodes was below 5 kOhms. The same recording setup was used in previous ERP studies (Rao, Zhang, & Miller, 2010; Miller & Zhang, 2014). ERP waveform analysis was completed offline in BESA (Version 6.0, MEGIS Software GmbH, Germany) and MATLAB (Version 8.0). ...
... Because electrophysiological measures are non-invasive, have fine temporal precision, and are relatively low cost, they represent potential tools to assess and predict the ability to perceive speech in noise in difficult-to-test clinical populations. For instance, neural measures may be used to assess how hearing aid amplification potentially alters the neural coding of speech sounds at different stages of cortical processing (Miller & Zhang, 2014) and whether these hearing aid-related changes in AERPs are linked to improved speech perception, which may inform future development of effective hearing aid algorithms. In addition, AERPs may be useful in assessing whether individuals who cannot behaviorally respond to auditory stimuli, such as infants or developmentally delayed adults, are processing certain speech cues after hearing aid fitting. ...
... At M1, we predicted longer latencies in P50, N1, and P2 in hearing aid users compared to those with age-appropriate hearing, as has been shown in within-subject designs in younger adults (Korczak et al., 2005; Marynewich et al., 2012; Miller and Zhang, 2014) and in studies comparing CI users to those with normal hearing (Finke et al., 2016). ...
... For fitting the clusters back to the individual data, we calculated the spatial correlation of the clusters with the individual subject data (Brunet et al., 2011; Murray et al., 2008). As dependent parameters, we obtained the mean GFP and the ... (Bertoli et al., 2011; Billings et al., 2007, 2011; Easwar et al., 2015; Korczak et al., 2005; Marynewich et al., 2012; Miller and Zhang, 2014; Tremblay et al., 2006a, 2006b). ...
... In this paper, we examined longitudinal modulations of early sensory-driven and later cognitive-related auditory processing and their modulations by hearing loss and hearing aid treatment in healthy, older adults. Traditionally, studies have investigated the effects of hearing aid amplification at only the initial stages of hearing, namely for the P50, N1, and the P2 in young adults (Billings et al., 2007, 2011; Marynewich et al., 2012; Miller and Zhang, 2014; Tremblay et al., 2006a) or with older adults using cochlear implants (Billings et al., 2007, 2011; Korczak et al., 2005; Marynewich et al., 2012; Miller and Zhang, 2014). In accordance with our hypothesis, there were no group differences in the P50, N1, and P2 peak amplitudes between hearing aid users and normal-hearing listeners. ...
Article
Full-text available
The present study investigates behavioral and electrophysiological auditory and cognitive-related plasticity in three groups of healthy older adults (60–77 years). Group 1 was moderately hearing-impaired, experienced hearing aid users, and fitted with new hearing aids using non-linear frequency compression (NLFC on); Group 2, also moderately hearing-impaired, used the same type of hearing aids but NLFC was switched off during the entire period of study duration (NLFC off); Group 3 represented individuals with age-appropriate hearing (NHO) as controls, who were not different in IQ, gender, or age from Group 1 and 2. At five measurement time points (M1-M5) across three months, a series of active oddball tasks were administered while EEG was recorded. The stimuli comprised syllables consisting of naturally high-pitched fricatives (/sh/, /s/, and /f/), which are hard to distinguish for individuals with presbycusis. By applying a data-driven microstate approach to obtain global field power (GFP) as a measure of processing effort, the modulations of perceptual (P50, N1, P2) and cognitive-related (N2b, P3b) auditory evoked potentials were calculated and subsequently related to behavioral changes (accuracy and reaction time) across time.
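The global field power (GFP) used above as a measure of processing effort is the spatial standard deviation across all electrodes at each time point. A minimal sketch, assuming a channels-by-time array of average-referenced EEG:

```python
import numpy as np

def global_field_power(data):
    """GFP time course: spatial standard deviation across electrodes.

    data : (n_channels, n_times) EEG array in microvolts
    """
    data = data - data.mean(axis=0, keepdims=True)  # enforce average reference
    return data.std(axis=0)

# Stand-in data: 64 channels, 600 time points.
eeg = np.random.default_rng(2).normal(size=(64, 600))
gfp = global_field_power(eeg)
print(gfp.shape, round(float(gfp.max()), 3))
```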
... Note that subtracting the waveform of standard stimuli from that of the deviant one may not completely eliminate the potential influence of N1 on MMN, as the amplitude of N1 elicited by different stimuli varies. Previous studies also found that distinct acoustic properties of segments in a syllable or consonant-vowel transition can elicit P1-N1-P2 responses, which may have an effect on the asymmetric activation of MMN and LDN (Martin and Boothroyd, 1999; Miller and Zhang, 2014). Indeed, N1 has been noted as a component which extracts phonological features (cf. Obleser et al., 2004). Future studies could use alternative measurements to separate the effects of MMN and N1, and investigate the influence of the transition within stimuli or vocalic cue on the ERP components (Schröger and Wolff, 1996; Miller and Zhang, 2014). ...
Article
Full-text available
In the present study, we examine the interactive effect of vowels on Mandarin fricative sibilants using a passive oddball paradigm to determine whether the HEIGHT features of vowels can spread on the surface and influence preceding consonants with unspecified features. The stimuli are two pairs of Mandarin words ([sa] ∼ [ʂa] and [su] ∼ [ʂu]) contrasting in vowel HEIGHT ([LOW] vs. [HIGH]). Each word in the same pair was presented both as standard and deviant, resulting in four conditions (/standard/[deviant]: /sa/[ʂa] ∼ /ʂa/[sa] and /su/[ʂu] ∼ /ʂu/[su]). In line with the Featurally Underspecified Lexicon (FUL) model, asymmetric patterns of processing were found in the [su] ∼ [ʂu] word pair where both the MMN (mismatch negativity) and LDN (late discriminative negativity) components were more negative in /su/[ʂu] (mismatch) than in /ʂu/[su] (no mismatch), suggesting the spreading of the feature [HIGH] from the vowel [u] to [ʂ] on the surface. In the [sa] ∼ [ʂa] pair, however, symmetric negativities (for both MMN and LDN) were observed as there is no conflict between the surface feature [LOW] from [a] to [ʂ] and the underlying specified feature [LOW] of [s]. These results confirm that not all features are fully specified in the mental lexicon: features of vowels can spread on the surface and influence surrounding unspecified segments.
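The MMN and LDN effects reported above are conventionally quantified as mean amplitudes of the deviant-minus-standard difference wave within component latency windows. A minimal sketch; the windows and the random stand-in waveforms are illustrative assumptions, not the study's parameters.

```python
import numpy as np

def mean_amplitude(wave, times, window):
    """Mean amplitude of a waveform inside a latency window (seconds)."""
    mask = (times >= window[0]) & (times <= window[1])
    return wave[mask].mean()

fs = 500
times = np.arange(-0.1, 0.7, 1 / fs)
rng = np.random.default_rng(3)
deviant = rng.normal(0, 0.4, times.size)    # stand-in average ERP to the deviant
standard = rng.normal(0, 0.4, times.size)   # stand-in average ERP to the standard
diff = deviant - standard                   # deviant-minus-standard difference wave

# Hypothetical windows: MMN ~150-250 ms, LDN ~350-550 ms post-onset.
print("MMN mean amplitude:", mean_amplitude(diff, times, (0.15, 0.25)))
print("LDN mean amplitude:", mean_amplitude(diff, times, (0.35, 0.55)))
```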
... In this study, we examine electrophysiological activity of the brain to understand the perception of English dental fricative sounds in two groups of proficient L2 listeners, which has rarely been done. Some previous research (mainly using magnetoencephalography) on fricative perception examined L1 English phonemic contrasts such as /s/ and /ʃ/ (Miller and Zhang, 2014; Lago et al., 2015) as well as responses to Polish fricatives by native and inexperienced non-native listeners (Lipski and Mathiak, 2007). It remains unclear whether experience with typical mispronunciations of L2 speech sounds leads to cross-linguistically distinct memory traces for non-native phonemes. ...
... Moreover, we restricted our analysis to the first N1 elicited by the speech sound, time-locked to its onset. Another approach would be to also analyze the second N1, elicited by the acoustic change complex defined as, and time-locked to, the transition of the consonant to the following vowel (see Lipski and Mathiak, 2007; Miller and Zhang, 2014). The ERP data in both subject groups (see Figure 2) indicate the consistent elicitation of the negative deflection responses to fricatives. ...
Article
Full-text available
The Mismatch Negativity (MMN) response has often been used to measure memory traces for phonological representations and to show effects of long-term native language (L1) experience on neural organization. We know little about whether phonological representations of non-native (L2) phonemes are modulated by experience with distinct non-native accents. We used MMN to examine effects of experience with L2-accented speech on auditory brain responses. Specifically, we tested whether it is long-term experience with language-specific L2 pronunciations or instead acoustic similarity between L2 speech sounds that modulates non-native phoneme perception. We registered MMN responses of Dutch and German proficient L2 speakers of English to the English interdental fricative /θ/ and compared it to its non-native pronunciations /s/ (typical pronunciation of /θ/ for German speakers) and /t/ (typical pronunciation of /θ/ for Dutch speakers). Dutch and German listeners heard the English pseudoword thond and its pronunciation deviants sond and tond. We computed the identity Mismatch Negativity (iMMN) by analyzing the difference in ERPs when the deviants were the frequent versus the infrequent stimulus for the respective group of L2 listeners. For both groups, tond and sond elicited mismatch effects of comparable size. Overall, the results suggest that experience with deviant pronunciations of L2 speech sounds in foreign-accented speech does not alter auditory memory traces. Instead, non-native phoneme perception seems to be modulated by acoustic similarity between speech sounds rather than by experience with typical L2 pronunciation patterns.
... In auditory development, researchers have used ACC to assess infants' speech recognition abilities and their speech perception differences under different stimuli (Small and Werker, 2012; Chen and Small, 2015; McCarthy et al., 2019). ACC can provide some insights into assessing the speech perception ability of hearing aid users and cochlear implant users (Friesen and Tremblay, 2006; Tremblay et al., 2006; Martin, 2007; Miller and Zhang, 2014; Liang et al., 2018; Shetty and Puttabasappa, 2020), and in evaluating the efficacy of assistive listening devices such as hearing aids (Kirby and Brown, 2015). It has also been used in investigations such as cochlear dead region measurement (Kang et al., 2018) and the objective assessment of tinnitus (Han et al., 2017). ...
Article
Full-text available
Acoustic change complex (ACC) is a cortical auditory-evoked potential induced by a change of continuous sound stimulation. This study aimed to explore: (1) whether the change of horizontal sound location can elicit ACC; (2) the relationship between the change of sound location and the amplitude or latency of ACC; (3) the relationship between the behavioral measure of localization, minimum audible angle (MAA), and ACC. A total of 36 normal-hearing adults participated in this study. A 180° horizontal arc-shaped bracket with a 1.2 m radius was set in a sound field where participants sat at the center. MAA was measured in a two-alternative forced-choice setting. The objective electroencephalography recording of ACC was conducted with the location changed at four sets of positions, ±45°, ±15°, ±5°, and ±2°. The test stimulus was a 125–6,000 Hz broadband noise of 1 s at 60 ± 2 dB SPL with a 2 s interval. The N1′–P2′ amplitudes, N1′ latencies, and P2′ latencies of ACC under four positions were evaluated. The influence of electrode sites and the direction of sound position change on ACC waveform was analyzed with analysis of variance. Results suggested that (1) ACC can be elicited successfully by changing the horizontal sound location position. The elicitation rate of ACC increased with the increase of location change. (2) N1′–P2′ amplitude increased and N1′ and P2′ latencies decreased as the change of sound location increased. The effects of test angles on N1′–P2′ amplitude [F(1.91,238.1) = 97.172, p < 0.001], N1′ latency [F(1.78,221.90) = 96.96, p < 0.001], and P2′ latency [F(1.87,233.11) = 79.97, p < 0.001] showed a statistical significance. (3) The direction of sound location change had no significant effect on any of the ACC peak amplitudes or latencies. (4) Sound location discrimination threshold by the ACC test (97.0% elicitation rate at ±5°) was higher than MAA threshold (2.08 ± 0.5°). The current study results show that though the ACC thresholds are higher than the behavioral thresholds on MAA task, ACC can be used as an objective method to evaluate sound localization ability. This article discusses the implications of this research for clinical practice and evaluation of localization skills, especially for children.
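The N1'-P2' amplitude evaluated above is a peak-to-peak measure on the ACC waveform, time-locked to the location change. A minimal sketch with hypothetical search windows and a random stand-in waveform:

```python
import numpy as np

def acc_peak_to_peak(wave, times, n1_win=(0.08, 0.20), p2_win=(0.15, 0.30)):
    """N1'-P2' peak-to-peak amplitude, time-locked to the stimulus change.

    The search windows are hypothetical, relative to the location change.
    """
    n1 = wave[(times >= n1_win[0]) & (times <= n1_win[1])].min()  # N1' trough
    p2 = wave[(times >= p2_win[0]) & (times <= p2_win[1])].max()  # P2' peak
    return p2 - n1

fs = 1000
times = np.arange(-0.1, 0.5, 1 / fs)
acc = np.random.default_rng(4).normal(0, 0.3, times.size)  # stand-in waveform
print(f"N1'-P2' amplitude: {acc_peak_to_peak(acc, times):.2f} uV")
```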
... The electrode sites for analysis were chosen based on scalp maps that showed intense activation in these regions. Similar methods for electrode grouping were used in previous ERP studies [57][58][59][60]. The frontal electrodes included F3, F5, F7, FC3, FC5, FT7 and the corresponding electrodes on the right hemisphere. ...
Article
Full-text available
This electrophysiological study investigated the role of the medial olivocochlear (MOC) efferents in listening in noise. Both ears of eleven normal-hearing adult participants were tested. The physiological tests consisted of transient-evoked otoacoustic emission (TEOAE) inhibition and the measurement of cortical event-related potentials (ERPs). The mismatch negativity (MMN) and P300 responses were obtained in passive and active listening tasks, respectively. Behavioral responses for the word recognition in noise test were also analyzed. Consistent with previous findings, the TEOAE data showed significant inhibition in the presence of contralateral acoustic stimulation. However, performance in the word recognition in noise test was comparable for the two conditions (i.e., without contralateral stimulation and with contralateral stimulation). Peak latencies and peak amplitudes of MMN and P300 did not show changes with contralateral stimulation. Behavioral performance was also maintained in the P300 task. Together, the results show that the peripheral auditory efferent effects captured via otoacoustic emission (OAE) inhibition might not necessarily be reflected in measures of central cortical processing and behavioral performance. As the MOC effects may not play a role in all listening situations in adults, the functional significance of the cochlear effects of the medial olivocochlear efferents and the optimal conditions conducive to corresponding effects in behavioral and cortical responses remain to be elucidated.
... Centro-parietal electrodes included CP3, CP1, CPz, CP2, CP4, P5, P3, P1, Pz, P2, P4, and P6, and parieto-occipital electrodes included PO3, POz, and PO4. This channel grouping procedure has been applied successfully in our previous hearing research studies (Rao et al. 2010; Zhang et al. 2011; Miller & Zhang 2014; Nie et al. 2014). A correlation analysis for change in behavioral measures and change in ERP measures across sessions was also conducted. ...
Article
The objectives of this study were to investigate the effects of hearing aid use and the effectiveness of ReadMyQuips (RMQ), an auditory training program, on speech perception performance and auditory selective attention using electrophysiological measures. RMQ is an audiovisual training program designed to improve speech perception in everyday noisy listening environments. Participants were adults with mild to moderate hearing loss who were first-time hearing aid users. After 4 weeks of hearing aid use, the experimental group completed RMQ training in 4 weeks, and the control group received listening practice on audiobooks during the same period. Cortical late event-related potentials (ERPs) and the Hearing in Noise Test (HINT) were administered at prefitting, pretraining, and post-training to assess effects of hearing aid use and RMQ training. An oddball paradigm allowed tracking of changes in P3a and P3b ERPs to distractors and targets, respectively. Behavioral measures were also obtained while ERPs were recorded from participants. After 4 weeks of hearing aid use but before auditory training, HINT results did not show a statistically significant change, but there was a significant P3a reduction. This reduction in P3a was correlated with improvement in d prime (d') in the selective attention task. Increased P3b amplitudes were also correlated with improvement in d' in the selective attention task. After training, this correlation between P3b and d' remained in the experimental group, but not in the control group. Similarly, HINT testing showed improved speech perception post training only in the experimental group. The criterion calculated in the auditory selective attention task showed a reduction only in the experimental group after training. ERP measures in the auditory selective attention task did not show any changes related to training. Hearing aid use was associated with a decrement in involuntary attention switch to distractors in the auditory selective attention task. RMQ training led to gains in speech perception in noise and improved listener confidence in the auditory selective attention task.
... Some studies have reported CAEP differences in unaided and aided conditions, but they did not equate SNRs (Gravel et al., 1989; Korczak, Kurtzberg, & Stapells, 2005; Miller & Zhang, 2014; Rapin & Graziani, 1967). Therefore, reported amplification effects might have resulted from the change in SNR as well as signal modifications from the hearing aid. ...
Article
Purpose: This study investigated (a) the effect of amplification on cortical auditory evoked potentials (CAEPs) at different signal levels when signal-to-noise ratios (SNRs) were equated between unaided and aided conditions, and (b) the effect of absolute signal level on aided CAEPs when SNR was held constant. Method: CAEPs were recorded from 13 young adults with normal hearing. A 1000-Hz pure tone was presented in unaided and aided conditions with a linear analog hearing aid. Direct audio input was used, allowing recorded hearing aid noise floor to be added to unaided conditions to equate SNRs between conditions. An additional stimulus was created through scaling the noise floor to study the effect of signal level. Results: Amplification resulted in delayed N1 and P2 peak latencies relative to the unaided condition. An effect of absolute signal level (when SNR was constant) was present for aided CAEP area measures, such that larger area measures were found at higher levels. Conclusion: Results of this study further demonstrate that factors in addition to SNR must also be considered before CAEPs can be used clinically to measure aided thresholds.
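Equating SNR between unaided and aided conditions by adding the recorded hearing aid noise floor, as described above, amounts to scaling the noise to a target signal-to-noise ratio before mixing. A minimal sketch under that assumption; the stand-in tone and noise are illustrative, not the study's recordings.

```python
import numpy as np

def rms(x):
    return np.sqrt(np.mean(np.square(x)))

def scale_noise_to_snr(signal, noise, target_snr_db):
    """Scale a recorded noise floor so signal-plus-noise has the target SNR."""
    current_snr = 20 * np.log10(rms(signal) / rms(noise))
    gain_db = current_snr - target_snr_db        # boost or attenuate the noise
    return noise * 10 ** (gain_db / 20)

fs = 16000
t = np.arange(0, 0.5, 1 / fs)
tone = np.sin(2 * np.pi * 1000 * t)              # 1000-Hz pure-tone stimulus
noise_floor = np.random.default_rng(5).normal(0, 0.05, t.size)  # stand-in

scaled = scale_noise_to_snr(tone, noise_floor, target_snr_db=20.0)
unaided_stimulus = tone + scaled                 # unaided condition, matched SNR
print(f"achieved SNR: {20 * np.log10(rms(tone) / rms(scaled)):.1f} dB")
```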
... The average impedance of electrodes was below 5 kOhms. The same recording setup was used in previous ERP studies (Rao et al., 2010; Miller and Zhang, 2014). ERP waveform analysis was completed offline in BESA (Version 6.0, MEGIS Software GmbH, Germany) and MATLAB (Version 8.0). ...
... CAEPs can be used for the tracking of maturation of the auditory system [2,12,13] and the effects of plasticity [12,14,15]. Cortical waveforms in response to changes in stimulus intensity [16], frequency [17], or phase [18] are called acoustic change complexes (ACCs) and can potentially be utilized to investigate discrimination between different speech sounds [19] and localization ability [18]. In addition, CAEPs are useful in the investigation of auditory or temporal processing [20,21], auditory training [22,23], loudness growth and comfortable levels [24], and the effect of aging [25-27]. ...
Article
Full-text available
Introduction Cortical auditory evoked potentials (CAEPs) are influenced by the characteristics of the stimulus, including level and hearing aid gain. Previous studies have measured CAEPs aided and unaided in individuals with normal hearing. There is a significant difference between providing amplification to a normal-hearing or a hearing-impaired person. This study investigates this difference, and the effects of stimulus signal-to-noise ratio (SNR) and audibility on the CAEP amplitude in a population with hearing loss. Methods Twelve normal-hearing participants and twelve participants with a hearing loss participated in this study. Three speech sounds /m/, /g/, and /t/ were presented in the free field. Unaided stimuli were presented at 55, 65, and 75 dB SPL, and aided stimuli at 55 dB SPL with three different gains in steps of 10 dB. CAEPs were recorded and their amplitudes analyzed. Stimulus SNR and audibility were determined. Results No significant effect of stimulus level or hearing aid gain was found in normal-hearing listeners. Conversely, a significant effect was found in hearing-impaired individuals. Audibility of the signal, which in some cases is determined by the signal level relative to threshold and in other cases by the signal-to-noise ratio, is the dominant factor explaining changes in CAEP amplitude. Conclusions CAEPs can potentially be used to assess the effects of hearing aid gain in hearing-impaired users.
Article
Full-text available
Hearing aids are used to improve sound audibility for people with hearing loss, but the ability to make use of the amplified signal, especially in the presence of competing noise, can vary across people. Here we review how neuroscientists, clinicians, and engineers are using various types of physiological information to improve the design and use of hearing aids.
Article
Objectives: The objectives of this study were to measure the effects of level and vowel contrast on the latencies and amplitudes of acoustic change complex (ACC) in the mature auditory system. This was done to establish how the ACC in healthy young adults is affected by these stimulus parameters that could then be used to inform translation of the ACC into a clinical measure for the pediatric population. Another aim was to demonstrate that a normalized amplitude metric, calculated by dividing the ACC amplitude in the vowel contrast condition by the ACC amplitude obtained in a control condition (no vowel change) would demonstrate good sensitivity with respect to perceptual measures of vowel-contrast detection. The premises underlying this research were that: (1) ACC latencies and amplitudes would vary with level, in keeping with principles of an increase in neural synchrony and activity that takes place as a function of increasing stimulus level; (2) ACC latencies and amplitudes would vary with vowel contrast, because cortical auditory evoked potentials are known to be sensitive to the spectro-temporal characteristics of speech. Design: Nineteen adults, 14 of them female, with a mean age of 24.2 years (range 20 to 38 years) participated in this study. All had normal-hearing thresholds. Cortical auditory evoked potentials were obtained from all participants in response to synthesized vowel tokens (/a/, /i/, /o/, /u/), presented in a quasi-steady state fashion at a rate of 2/sec in an oddball stimulus paradigm, with a 25% probability of the deviant stimulus. The ACC was obtained in response to the deviant stimulus. All combinations of vowel tokens were tested at 2 stimulus levels: 40 and 70 dBA. In addition, listeners were tested for their ability to detect the vowel contrasts using behavioral methods. Results: ACC amplitude varied systematically with level, and test condition (control versus contrast) and vowel token, but ACC latency did not. ACC amplitudes were significantly larger when tested at 70 dBA compared with 40 dBA and for contrast trials compared with control trials at both levels. Amplitude ratios (normalized amplitudes) were largest for contrast pairs in which /a/ was the standard token. The amplitude ratio metric at the individual level demonstrated up to 97% sensitivity with respect to perceptual measures of discrimination. Conclusions: The present study establishes the effects of stimulus level and vowel type on the latency and amplitude of the ACC in the young adult auditory system and supports the amplitude ratio as a sensitive metric for cortical acoustic salience of vowel spectral features. Next steps are to evaluate these methods in infants and children with hearing loss with the long-term goal of its translation into a clinical method for estimating speech feature discrimination.
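The normalized amplitude metric described above divides the ACC amplitude in the vowel-contrast condition by the ACC amplitude in the no-change control condition. A minimal sketch; the amplitudes and the detection criterion are hypothetical illustrations, not values from the study.

```python
def amplitude_ratio(contrast_amp_uv, control_amp_uv):
    """Normalized ACC amplitude: contrast-condition amplitude divided by the
    amplitude measured in the no-change control condition."""
    return contrast_amp_uv / control_amp_uv

# Stand-in N1'-P2' amplitudes (microvolts) for one listener.
pairs = {"/a/ -> /i/": (6.2, 1.8), "/o/ -> /u/": (2.4, 1.9)}
for contrast, (amp_contrast, amp_control) in pairs.items():
    r = amplitude_ratio(amp_contrast, amp_control)
    # A ratio well above 1 suggests a cortical response to the vowel change
    # beyond the control response; the 1.5 criterion here is hypothetical.
    verdict = "detected" if r > 1.5 else "uncertain"
    print(f"{contrast}: ratio = {r:.2f} -> {verdict}")
```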
Article
Full-text available
The current study measured neural responses to investigate auditory stream segregation of noise stimuli with or without clear spectral contrast. Sequences of alternating A and B noise bursts were presented to elicit stream segregation in normal-hearing listeners. The successive B bursts in each sequence maintained an equal amount of temporal separation with manipulations introduced on the last stimulus. The last B burst was either delayed for 50% of the sequences or not delayed for the other 50%. The A bursts were jittered in between every two adjacent B bursts. To study the effects of spectral separation on streaming, the A and B bursts were further manipulated by using either bandpass-filtered noises widely spaced in center frequency or broadband noises. Event-related potentials (ERPs) to the last B bursts were analyzed to compare the neural responses to the delay vs. no-delay trials in both passive and attentive listening conditions. In the passive listening condition, a trend for a possible late mismatch negativity (MMN) or late discriminative negativity (LDN) response was observed only when the A and B bursts were spectrally separate, suggesting that spectral separation in the A and B burst sequences could be conducive to stream segregation at the pre-attentive level. In the attentive condition, a P300 response was consistently elicited regardless of whether there was spectral separation between the A and B bursts, indicating the facilitative role of voluntary attention in stream segregation. The results suggest that reliable ERP measures can be used as indirect indicators for auditory stream segregation in conditions of weak spectral contrast. These findings have important implications for cochlear implant (CI) studies – as spectral information available through a CI device or simulation is substantially degraded, it may require more attention to achieve stream segregation.
Article
Full-text available
There is interest in using cortical auditory evoked potentials (CAEPs) to evaluate hearing aid fittings and experience-related plasticity associated with amplification; however, little is known about hearing aid signal processing effects on these responses. The purpose of this study was to determine the effect of clinically relevant hearing aid gain settings, and the resulting in-the-canal signal-to-noise ratios (SNRs), on the latency and amplitude of P1, N1, and P2 waves. DESIGN & SAMPLE: Evoked potentials and in-the-canal acoustic measures were recorded in nine normal-hearing adults in unaided and aided conditions. In the aided condition, a 40-dB signal was delivered to a hearing aid programmed to provide four levels of gain (0, 10, 20, and 30 dB). As a control, unaided stimulus levels were matched to aided condition outputs (i.e. 40, 50, 60, and 70 dB) for comparison purposes. When signal levels are defined in terms of output level, aided CAEPs were surprisingly smaller and delayed relative to unaided CAEPs, probably resulting from increases to noise levels caused by the hearing aid. These results reinforce the notion that hearing aids modify stimulus characteristics such as SNR, which in turn affects the CAEP in a way that does not reliably reflect hearing aid gain.
Article
Full-text available
This paper reviews the literature on the N1 wave of the human auditory evoked potential. It concludes that at least six different cerebral processes can contribute to the negative wave recorded from the scalp with a peak latency between 50 and 150 ms: a component generated in the auditory cortex on the supratemporal plane, a component generated in the association cortex on the lateral aspect of the temporal and parietal cortex, a component generated in the motor and premotor cortices, the mismatch negativity, a temporal component of the processing negativity, and a frontal component of the processing negativity. The first three, which can be considered 'true' N1 components, are controlled by the physical and temporal aspects of the stimulus and by the general state of the subject. The other three components are not necessarily elicited by a stimulus but depend on the conditions in which the stimulus occurs. They often last much longer than the true N1 components that they overlap.
Article
Full-text available
Hearing aid amplification can be used as a model for studying the effects of auditory stimulation on the central auditory system (CAS). We examined the effects of stimulus presentation level on the physiological detection of sound in unaided and aided conditions. P1, N1, P2, and N2 cortical evoked potentials were recorded in sound field from 13 normal-hearing young adults in response to a 1000-Hz tone presented at seven stimulus intensity levels. As expected, peak amplitudes increased and peak latencies decreased with increasing intensity for unaided and aided conditions. However, there was no significant effect of amplification on latencies or amplitudes. Taken together, these results demonstrate that 20 dB of hearing aid gain affects neural responses differently than 20 dB of stimulus intensity change. Hearing aid signal processing is discussed as a possible contributor to these results. This study demonstrates (1) the importance of controlling for stimulus intensity when evoking responses in aided conditions, and (2) the need to better understand the interaction between the hearing aid and the CAS.
Article
Full-text available
Speech scientists have long proposed that formant exaggeration in infant-directed speech plays an important role in language acquisition. This event-related potential (ERP) study investigated neural coding of formant-exaggerated speech in 6-12-month-old infants. Two synthetic /i/ vowels were presented in alternating blocks to test the effects of formant exaggeration. ERP waveform analysis showed significantly enhanced N250 for formant exaggeration, which was more prominent in the right hemisphere than the left. Time-frequency analysis indicated increased neural synchronization for processing formant-exaggerated speech in the delta band at frontal-central-parietal electrode sites as well as in the theta band at frontal-central sites. Minimum norm estimates further revealed a bilateral temporal-parietal-frontal neural network in the infant brain sensitive to formant exaggeration. Collectively, these results provide the first evidence that formant expansion in infant-directed speech enhances neural activities for phonetic encoding and language learning.
Article
Full-text available
The current review constitutes the first comprehensive look at the possibility that the mismatch negativity (MMN, the deflection of the auditory ERP/ERF elicited by stimulus change) might be generated by so-called fresh-afferent neuronal activity. This possibility has been repeatedly ruled out for the past 30 years, with the prevailing theoretical accounts relying on a memory-based explanation instead. We propose that the MMN is, in essence, a latency- and amplitude-modulated expression of the auditory N1 response, generated by fresh-afferent activity of cortical neurons that are under nonuniform levels of adaptation.
Article
Full-text available
The influence of the intensity of passively perceived 1-kHz tones on the latencies and amplitudes of AEP components in the 80- to 200-ms latency range recorded from frontal, central and temporal electrode locations was investigated in healthy adult subjects. At stimulus intensities from 30 to 70 dB SL the latencies of the frontally and centrally recorded N100 and P175 waves decreased, their amplitudes increased. At stimulus intensities from 70 to 90 dB the latencies of N100 and P175 increased, N100 amplitude declined and P175 amplitude increased at a slower rate than between 30 and 70 dB. The latency of the N140 wave recorded at temporal electrode locations decreased markedly with increasing stimulus intensity.
Article
Full-text available
The remarkable linguistic abilities of human neonates are well documented. Young infants can discriminate phonemes even if they are not used in their native language, an ability which regresses during the first year of life. This ability to discriminate is often studied by repeating a stimulus for several minutes until some behavioural response of the infant habituates, and later examining whether the response recovers when the stimulus is changed. This method, however, does not reveal how fast infants can detect phonetic changes, nor what brain mechanisms are involved. We describe here high-density recordings of event-related potentials in three-month-old infants listening to syllables whose first consonants differed in place of articulation. Two processing stages, corresponding to an increasingly refined analysis of the auditory input, were identified and localised to the temporal lobes. A late frontal response to novelty was also observed. The infant brain recognizes a phonetic change in less than 400 ms.
Article
Full-text available
To investigate whether the evoked potential to a complex naturally produced speech syllable could be decomposed to reflect the contributions of the acoustic events contained in the constituent phonemes. Auditory cortical evoked potentials N1 and P2 were obtained in eight adults with normal hearing. Three naturally produced speech stimuli were used: 1) the syllable [sei]; 2) the sibilant [s], extracted from the syllable; 3) the vowel [ei] extracted from the syllable. The isolated sibilant and vowel preserved the same time relationships to the sampling window as they did in the complete syllable. Evoked potentials were collected at Fz, Cz, Pz, A1, and A2, referenced to the nose. In the group mean waveforms, clear responses were observed to both the sibilant and the isolated vowel. Although the response to the [s] was weaker than that to [ei], both had N1 and P2 components with latencies, in relation to sound onset, appropriate to cortical onset potentials. The vowel onset response was preserved in the response to the complete syllable, though with reduced amplitude. This pattern was observable in six of the eight waveforms from individual subjects. It seems likely that the response to [ei] within the complete syllable reflects changes of cortical activation caused by amplitude or spectral change at the transition from consonant to vowel. The change from aperiodic to periodic stimulation may also produce changes in cortical activation that contribute to the observed response. Whatever the mechanism, the important conclusion is that the auditory cortical evoked potential to complex, time-varying speech waveforms can reflect features of the underlying acoustic patterns. Such potentials may have value in the evaluation of speech perception capacity in young hearing-impaired children.
Article
Full-text available
The sequence of neurophysiological processes elicited in the auditory system by a sound is analyzed in search of the stage at which the processes carrying sensory information cross the borderline beyond which they directly underlie sound perception. Neurophysiological data suggest that this transition occurs when the sensory input is mapped onto the physiological basis of sensory memory in the auditory cortex. At this point, the sensory information carried by the stimulus-elicited process corresponds, for the first time, to that contained by the actual sound percept. Before this stage, the sensory stimulus code is fragmentary, lacks the time dimension, cannot enter conscious perception, and is not accessible to top-down processes (voluntary mental operations). On these grounds, 2 distinct stages of auditory sensory processing, prerepresentational and representational, can be distinguished.
Article
Full-text available
Event-related potentials (ERPs) recorded from the human scalp can provide important information about how the human brain normally processes information and about how this processing may go awry in neurological or psychiatric disorders. Scientists using or studying ERPs must strive to overcome the many technical problems that can occur in the recording and analysis of these potentials. The methods and the results of these ERP studies must be published in a way that allows other scientists to understand exactly what was done so that they can, if necessary, replicate the experiments. The data must then be analyzed and presented in a way that allows different studies to be compared readily. This paper presents guidelines for recording ERPs and criteria for publishing the results.
Article
Full-text available
The acoustic change complex (ACC) is a scalp-recorded negative-positive voltage swing elicited by a change during an otherwise steady-state sound. The ACC was obtained from eight adults in response to changes of amplitude and/or spectral envelope at the temporal center of a three-formant synthetic vowel lasting 800 ms. In the absence of spectral change, the group mean waveforms showed a clear ACC to amplitude increments of 2 dB or more and decrements of 3 dB or more. In the presence of a change of second formant frequency (from perceived /u/ to perceived /i/), amplitude increments increased the magnitude of the ACC but amplitude decrements had little or no effect. The fact that the just detectable amplitude change is close to the psychoacoustic limits of the auditory system augurs well for the clinical application of the ACC. The failure to find a condition under which the spectrally elicited ACC is diminished by a small change of amplitude supports the conclusion that the observed ACC to a change of spectral envelope reflects some aspect of cortical frequency coding. Taken together, these findings support the potential value of the ACC as an objective index of auditory discrimination capacity.
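An amplitude-elicited ACC of the kind measured above can be probed with a steady stimulus carrying a level step at its temporal center. A minimal sketch that builds a crude three-sinusoid stand-in for the synthetic vowel (the formant frequencies are assumptions) and applies a +2 dB increment at the midpoint, the smallest increment that reliably elicited an ACC here:

```python
import numpy as np

fs = 16000
t = np.arange(0, 0.8, 1 / fs)                  # 800-ms steady-state stimulus

# Crude stand-in for a three-formant vowel: three summed sinusoids.
vowel = sum(np.sin(2 * np.pi * f * t) for f in (300.0, 1000.0, 2500.0)) / 3

step_gain = 10 ** (2.0 / 20)                   # +2 dB as a linear factor
mid = t.size // 2
stimulus = vowel.copy()
stimulus[mid:] *= step_gain                    # amplitude step at the center

rms = lambda x: np.sqrt(np.mean(x ** 2))
step_db = 20 * np.log10(rms(stimulus[mid:]) / rms(stimulus[:mid]))
print(f"level step at center: {step_db:.1f} dB")
```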
Article
Full-text available
Auditory evoked potential (AEP) correlates of the neural representation of stimuli along a /ga/-/ka/ and a /ba/-/pa/ continuum were examined to determine whether the voice-onset time (VOT)-related change in the N1 onset response from a single to double-peaked component is a reliable indicator of the perception of voiced and voiceless sounds. Behavioral identification results from ten subjects revealed a mean category boundary at a VOT of 46 ms for the /ga/-/ka/ continuum and at a VOT of 27.5 ms for the /ba/-/pa/ continuum. In the same subjects, electrophysiologic recordings revealed that a single N1 component was seen for stimuli with VOTs of 30 ms and less, and two components (N1' and N1) were seen for stimuli with VOTs of 40 ms and more for both continua. That is, the change in N1 morphology (from single to double-peaked) coincided with the change in perception from voiced to voiceless for stimuli from the /ba/-/pa/ continuum, but not for stimuli from the /ga/-/ka/ continuum. The results of this study show that N1 morphology does not reliably predict phonetic identification of stimuli varying in VOT. These findings also suggest that the previously reported appearance of a "double-peak" onset response in aggregate recordings from the auditory cortex does not indicate a cortical correlate of the perception of voicelessness.
Article
Full-text available
This study examined the perceptual-weighting strategies and performance-audibility functions of 11 moderately hearing-impaired (HI) children, 11 age-matched normal-hearing (NH) children, 11 moderately HI adults, and 11 NH adults. The purpose was to (a) determine the perceptual-weighting strategies of HI children relative to the other groups and (b) determine the audibility required by each group to achieve a criterion level of performance. Stimuli were 4 nonsense syllables (see text). The vowel, transition, and fricative segments of each nonsense syllable were identified along the temporal domain, and each segment was amplified randomly within each syllable during presentation. Point-biserial correlation coefficients were calculated using the amplitude variation of each segment and the correct and incorrect responses for the corresponding syllable. Results showed that for /see text/ and /see text/, all four groups heavily weighted the fricative segments during perception, whereas the vowel and transition segments received little or no weight. For /see text/, relatively low weights were given to each segment by all four groups. For /see text/, the NH children and adults weighted the transition segment more so than the vowel and fricative segments, whereas the HI children and adults weighted all three segments equally low. Performance-audibility functions of the fricative segments of /see text/ and /see text/ were constructed for each group. In general, maximum performance for each group was reached at lower audibility levels for /see text/ than for /see text/ and steeper functions were observed for the HI groups relative to the NH groups. A decision theory approach was used to confirm the audibility required by each group to achieve a ≥90% level of performance. Results showed both hearing sensitivity and age effects. The HI listeners required lower levels of audibility than the NH listeners to achieve similar levels of performance. Likewise, the adult listeners required lower levels of audibility than the children, although this difference was more substantial for the NH listeners than for the HI listeners.
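The point-biserial correlation used above relates a continuous predictor (the random amplitude perturbation of a segment) to a binary outcome (correct versus incorrect identification). A minimal sketch with simulated trials; scipy.stats.pointbiserialr performs the computation, and the trial-generation model is an illustrative assumption.

```python
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(6)
n_trials = 200

# Per-trial amplitude perturbation (dB) applied to the fricative segment,
# and whether the listener identified the syllable correctly (1) or not (0).
frication_level_db = rng.uniform(-6, 6, n_trials)
correct = (frication_level_db + rng.normal(0, 4, n_trials) > 0).astype(int)

r, p = pointbiserialr(correct, frication_level_db)
print(f"fricative-segment weight: r = {r:.2f} (p = {p:.3f})")
```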
Article
Full-text available
To determine if naturally produced speech stimuli evoke distinct neural response patterns that can be reliably recorded in individuals. Auditory cortical evoked potentials were obtained from seven normal-hearing young adults in response to four naturally produced speech tokens (/bi/, /pi/, /ʃi/, and /si/). Stimuli were tokens from the standardized UCLA version of the Nonsense Syllable Test (NST) (Dubno & Schaefer, 1992). Using a repeated measures design, subjects were tested and then retested within an 8-day period. Auditory cortical evoked potentials elicited by naturally produced speech sounds were reliably recorded in individuals. Also, naturally produced speech tokens, representing different acoustic cues, evoked distinct neural response patterns. 1) Cortical evoked potentials elicited by naturally produced speech sounds can be reliably recorded in individuals. 2) Naturally produced speech tokens, representing different acoustic cues, evoke distinct neural response patterns. 3) Given the reliability of the response, this work has potential application to the study of neural processing of speech in individuals with communication disorders as well as changes over time after various types of auditory rehabilitation.
Article
Full-text available
Recent studies have shown that the mismatch negativity (MMN), a change-specific component of the event-related potential (ERP), for particular auditory features is degraded in different clinical populations. This suggests that the MMN could, in principle, reflect the whole profile and extent of the central auditory deficit. In the present article, we tested a new MMN paradigm allowing one to obtain MMNs for several auditory attributes in a short time. MMN responses to changes in frequency, intensity, duration, location, and to a silent gap occasionally inserted in the middle of a tone were compared between the traditional 'oddball' paradigm (a single type of auditory change in each sequence) and the new paradigm (two versions) in which all the 5 types of changes appeared within the same sequence. The MMNs obtained in the new paradigm were equal in amplitude to those in the traditional MMN paradigm. We propose a new paradigm that can provide 5 different MMNs in the same time in which usually only one MMN is obtained. The new paradigm enables one to objectively determine the profile of different auditory discrimination abilities within a very short recording time.
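The new paradigm interleaves several deviant types with standards in one sequence. A minimal sketch of such a sequence generator, alternating standards with the five deviant types used above (frequency, intensity, duration, location, gap) so that each type occurs equally often; the no-immediate-repeat rule is an illustrative assumption.

```python
import random

DEVIANTS = ["frequency", "intensity", "duration", "location", "gap"]

def multifeature_sequence(n_pairs, seed=0):
    """Alternate standards with deviants; every deviant type appears equally
    often and no type repeats across a shuffle boundary."""
    rng = random.Random(seed)
    seq, prev, pool = [], None, []
    for _ in range(n_pairs):
        if not pool:
            pool = DEVIANTS[:]
            rng.shuffle(pool)
            if pool[0] == prev:                 # avoid an immediate repeat
                pool[0], pool[-1] = pool[-1], pool[0]
        dev = pool.pop(0)
        seq += ["standard", dev]
        prev = dev
    return seq

print(multifeature_sequence(5))                 # one cycle of all five deviants
```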
Article
Full-text available
This article reviews literature on the characteristics and possible interpretations of the event-related potential (ERP) peaks commonly identified in research. The description of each peak includes typical latencies, cortical distributions, and possible brain sources of observed activity as well as the evoking paradigms and underlying psychological processes. The review is intended to serve as a tutorial for general readers interested in neuropsychological research and as a reference source for researchers using ERP techniques.
Article
Full-text available
Linguistic experience alters an individual's perception of speech. We here provide evidence of the effects of language experience at the neural level from two magnetoencephalography (MEG) studies that compare adult American and Japanese listeners' phonetic processing. The experimental stimuli were American English /ra/ and /la/ syllables, phonemic in English but not in Japanese. In Experiment 1, the control stimuli were /ba/ and /wa/ syllables, phonemic in both languages; in Experiment 2, they were non-speech replicas of /ra/ and /la/. The behavioral and neuromagnetic results showed that Japanese listeners were less sensitive to the phonemic /r-l/ difference than American listeners. Furthermore, processing non-native speech sounds recruited significantly greater brain resources in both hemispheres and required a significantly longer period of brain activation in two regions, the superior temporal area and the inferior parietal area. The control stimuli showed no significant differences except that the duration effect in the superior temporal cortex also applied to the non-speech replicas. We argue that early exposure to a particular language produces a "neural commitment" to the acoustic properties of that language and that this neural commitment interferes with foreign language processing, making it less efficient.
Article
Full-text available
We investigated whether N1 and P2 auditory-evoked responses are modulated by the spectral complexity of musical sounds in pianists and non-musicians. Study participants were presented with three variants of a C4 piano tone equated for temporal envelope but differing in the number of harmonics contained in the stimulus. A fourth tone was a pure tone matched to the fundamental frequency of the piano tones. A simultaneous electroencephalographic/magnetoencephalographic recording was made. P2 amplitude was larger in musicians and increased with spectral complexity preferentially in this group, but N1 did not. The results suggest that P2 reflects the specific features of acoustic stimuli experienced during musical practice and point to functional differences in P2 and N1 that relate to their underlying mechanisms.
Article
The amplitude of frication relative to vowel onset amplitude in the F3 and F5 formant frequency regions was manipulated for the synthetic fricative contrasts /s/-/ʃ/ and /s/-/θ/, respectively. The influence of this relative amplitude manipulation on listeners' perception of place of articulation was tested by (1) varying the duration of frication from 30 to 140 ms, (2) pairing the frication noise with different vowels /i a u/, (3) placing formant transitions in conflict with relative amplitude, and (4) holding relative amplitude constant within a continuum while varying formant transitions and the amplitudes of spectral regions where relative amplitude was not manipulated. To determine if listeners were using absolute spectral cues or relative amplitude comparisons between frication and vowel for fricative identification, the frication and vowel were separated by (1) presenting the frication in isolation, and (2) inserting a gap of silence between the frication and vowel. The results showed that relative amplitude was perceived across vowel context and frication duration, and overrode context-dependent formant transition cues. The findings for temporal separations between the frication and vowel suggest that short-term memory processes may dominate the mediation of the relative-amplitude comparison. However, the overall results indicate that relative amplitude is only a component of spectral prominence, which is comprised of a primary frication spectral peak and a secondary frication/vowel peak comparison.
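The core measurement in this paradigm is the level of the frication relative to the vowel onset within a single formant-frequency region. A minimal sketch of such a band-limited level comparison follows; the filter order, the 2000-3000 Hz stand-in "F3" band, and the toy signals are assumptions for illustration.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def band_level_db(x, fs, lo, hi):
    """RMS level (dB) of x restricted to the band [lo, hi] Hz."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    y = sosfiltfilt(sos, x)
    return 20 * np.log10(np.sqrt(np.mean(y**2)) + 1e-12)

def relative_amplitude_db(frication, vowel_onset, fs, band):
    """Frication level relative to vowel-onset level in one formant band."""
    return band_level_db(frication, fs, *band) - band_level_db(vowel_onset, fs, *band)

# Example with synthetic stand-ins and an assumed 2000-3000 Hz F3 region.
fs = 16000
t = np.arange(int(0.05 * fs)) / fs
frication = np.random.default_rng(0).standard_normal(t.size) * 0.1
vowel = 0.3 * np.sin(2 * np.pi * 2500 * t)
print(relative_amplitude_db(frication, vowel, fs, (2000, 3000)))
```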
Article
The two aims of this study were (a) to determine the perceptual weight given formant transition and relative amplitude information for labeling fricative place of articulation perception and (b) to determine the extent of integration of relative amplitude and formant transition cues. Seven listeners with normal hearing and 7 listeners with sensorineural hearing loss participated. The listeners were asked to label the fricatives of synthetic consonant-vowel stimuli as either /s/ or /∫/. Across the stimuli, 3 cues were varied: (a) The amplitude of the spectral peak in the 2500- Hz range of the frication relative to the adjacent vowel peak amplitude in the same frequency region, (b) the frication duration, which was either 50 or 140 ms, and (c) the second formant transition onset frequency, which was varied from 1200 to 1800 Hz. An analysis of variance model was used to determine weightings for the relative amplitude and transition cues for the different frication duration conditions. A 30-ms gap of silence was inserted between the frication and vocalic portions of the stimuli, with the intent that a temporal separation of frication and transition information might affect how the cues were integrated. The weighting given transition or relative amplitude differed between the listening groups and depended on frication duration. Use of the transition cue was most affected by insertion of the silent gap. Listeners with hearing loss had smaller interaction terms for the cues than listeners with normal hearing, suggesting less integration of cues.
Article
It has sometimes been assumed that the identification of the fricatives of American English in CV syllables depends primarily on the characteristics of the noise (i.e., nonvocalic) portion of the speech sound. A second possibility is that characteristics of the vocalic portion—previously shown to be cues for the perception of other consonants—are important for the fricatives. These alternatives were tested by combining the noise from one spoken fricative-vowel syllable with the voiced portion of another. Results indicate that the important cues for the fricatives /s/ and /∫/ are given by the noise but that the differentiation of /f/ and /θ/ is accomplished primarily on the basis of cues contained in the vocalic part of the syllable. Similar results were obtained for the voiced counterparts of these sounds.
Article
Equipment was assembled and a procedure was developed for the measurement of power spectra of consonants. Detailed power spectra as well as measurements of grosser spectral properties were made on a fairly large sample of English and Russian stops and fricatives. Special criteria were developed for the evaluation of the data obtained. Possibilities of utilizing the data for automatic recognition were considered.
Article
Energy density spectra of gated segments of fricative consonants were measured. The spectral data were used as a basis for developing objective identification criteria which yielded fair results when tested. As a further check gated segments of fricatives were presented for identification to a group of listeners and their responses evaluated in terms of the objective identification criteria.
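Measuring the energy density spectrum of a gated fricative segment amounts to excising the gated portion, windowing it, and estimating its spectrum. A minimal sketch, with an assumed Hann gate and illustrative segment timing:

```python
import numpy as np
from scipy.signal import periodogram

def gated_segment_spectrum(x, fs, t0, dur):
    """Energy density spectrum of a gated (Hann-windowed) segment
    starting at t0 seconds and lasting dur seconds."""
    seg = x[int(t0 * fs): int((t0 + dur) * fs)]
    f, pxx = periodogram(seg, fs=fs, window="hann", scaling="spectrum")
    return f, pxx

fs = 16000
x = np.random.default_rng(1).standard_normal(fs)  # stand-in for a fricative
f, pxx = gated_segment_spectrum(x, fs, t0=0.2, dur=0.05)
print(f[np.argmax(pxx)])  # frequency of the strongest spectral component
```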
Article
The locations of the poles and zeros in the spectra of fricatives were determined by a matching process in which comparison spectra synthesized by electric circuits were matched against the spectra under analysis. Based on these findings, an electrical model was developed consisting of a noise-excited electric circuit characterized by a pole and a zero whose frequency locations could be varied. Stimuli generated by this model, both in isolation and in syllables, were presented to 8 subjects for identification. The results of the listening tests are consistent with the data from the acoustic analyses and with the findings of other investigators.
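The electrical model described, a noise-excited circuit with one movable pole and one movable zero, can be approximated digitally with a second-order filter. The sketch below is a loose digital analogue; the bandwidths and the pole/zero frequencies in the /s/- and /ʃ/-like examples are assumed values, not those reported in the study.

```python
import numpy as np
from scipy.signal import lfilter

def pole_zero_fricative(fs, dur, f_pole, f_zero, bw=200.0, seed=0):
    """Noise-excited filter with one movable pole pair and one movable
    zero pair: a simplified digital sketch of the electrical model."""
    r_p = np.exp(-np.pi * bw / fs)          # pole radius from bandwidth
    r_z = np.exp(-np.pi * bw / fs)          # same bandwidth assumed for the zero
    b = [1, -2 * r_z * np.cos(2 * np.pi * f_zero / fs), r_z**2]   # zeros
    a = [1, -2 * r_p * np.cos(2 * np.pi * f_pole / fs), r_p**2]   # poles
    noise = np.random.default_rng(seed).standard_normal(int(dur * fs))
    return lfilter(b, a, noise)

# An /s/-like stimulus might place the pole high; a /ʃ/-like one, lower (assumed).
s_like = pole_zero_fricative(16000, 0.15, f_pole=6000, f_zero=3000)
sh_like = pole_zero_fricative(16000, 0.15, f_pole=3500, f_zero=1500)
```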
Article
We examined the short- and long-term habituation of auditory event-related potentials (ERPs) elicited by tones, complex tones and digitized speech sounds (vowels and consonant-vowel-consonant syllables). Twelve different stimuli equated in loudness and duration (300 msec) were studied. To examine short-term habituation, stimuli were presented in trains of 6 with interstimulus intervals of 0.5 or 1.0 sec. The first 4 stimuli in a train were identical standards. On 50% of the trains the standard in the 5th position was replaced by a deviant probe stimulus, and on 20% of the trains the standard in the 6th position was replaced by a target, a truncated standard that required a speeded button press response. Short-term habituation (STH) was complete by the third stimulus in the train and resulted in amplitude decrements of 50-75% for the N1 component. STH was partially stimulus specific in that amplitudes were larger following deviant stimuli in the 5th position than following standards. STH of the N1 was more marked for speech sounds than for loudness-matched tones or complex tones at short ISI. In addition, standard and deviant stimuli that differed in phonetic structure showed more cross-habituation than did tones or complex tones that differed in frequency. This pattern of results suggests that STH is a function of the acoustic resemblance of successive stimuli. The long-term habituation (LTH) of the ERP was studied by comparing amplitudes across balanced 5.25-min stimulus blocks over the course of the experiment. Two types of LTH were observed. The N1 showed stimulus-specific LTH in that N1 amplitudes declined during the presentation of a stimulus, but returned to control levels when a different stimulus was presented in the subsequent condition. In contrast, the P3 elicited by the deviant stimuli showed non-specific LTH, being reduced across successive blocks containing different stimuli. P3s elicited by target stimuli remained stable in amplitude.
Article
We review, in a common framework, several recently proposed algorithms for improving the voice quality of text-to-speech synthesis based on the concatenation of acoustic units (Charpentier and Moulines, 1988; Moulines and Charpentier, 1988; Hamon et al., 1989). These algorithms rely on a pitch-synchronous overlap-add (PSOLA) approach for modifying the speech prosody and concatenating speech waveforms. The modifications of the speech signal are performed either in the frequency domain (FD-PSOLA), using the Fast Fourier Transform, or directly in the time domain (TD-PSOLA), depending on the length of the window used in the synthesis process. The frequency domain approach offers great flexibility in modifying the spectral characteristics of the speech signal, while the time domain approach provides very efficient solutions for the real-time implementation of synthesis systems. We also discuss the different kinds of distortions involved in these different algorithms.
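To make the TD-PSOLA idea concrete, the following is a heavily simplified sketch: it assumes a known, constant f0 and uniformly spaced pitch marks, extracts two-period Hann-windowed grains at analysis marks, and overlap-adds them at a compressed (or stretched) synthesis spacing to raise (or lower) the pitch. Real implementations estimate pitch-synchronous marks from the signal and treat unvoiced segments separately.

```python
import numpy as np

def td_psola(x, fs, f0, pitch_factor):
    """Minimal time-domain PSOLA sketch: fixed f0 and uniform pitch marks
    are assumed, which real systems would estimate from the signal."""
    period = int(fs / f0)
    ana_marks = np.arange(period, len(x) - period, period)
    syn_step = int(period / pitch_factor)
    out = np.zeros(len(x) + period)
    win = np.hanning(2 * period)
    pos = period
    while pos < len(x) - period:
        # Nearest analysis mark to the current synthesis position.
        m = ana_marks[np.argmin(np.abs(ana_marks - pos))]
        grain = x[m - period: m + period] * win
        out[pos - period: pos + period] += grain
        pos += syn_step
    return out[: len(x)]

fs = 16000
t = np.arange(fs) / fs
vowel = np.sin(2 * np.pi * 120 * t)                      # toy voiced signal at 120 Hz
raised = td_psola(vowel, fs, f0=120, pitch_factor=1.5)   # roughly 180 Hz output
```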
Article
This study employed behavioral and electrophysiological measures to examine selective listening of concurrent auditory stimuli. Stimuli consisted of four compound sounds, each created by mixing a pure tone with filtered noise bands at a signal-to-noise ratio of +15 dB. The pure tones and filtered noise bands each contained two levels of pitch. Two separate conditions were created; the background stimuli varied randomly or were held constant. In separate blocks, participants were asked to judge the pitch of tones or the pitch of filtered noise in the compound stimuli. Behavioral data consistently showed lower sensitivity and longer response times for classification of filtered noise when compared with classification of tones. However, differential effects were observed in the peak components of auditory event-related potentials (ERPs). Relative to tone classification, the P1 and N1 amplitudes were enhanced during the more difficult noise classification task in both test conditions, but the peak latencies were shorter for P1 and longer for N1 during noise classification. Moreover, a significant interaction between condition and task was seen for the P2. The results suggest that the essential ERP components for the same compound auditory stimuli are modulated by listeners' focus on specific aspects of information in the stimuli.
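Constructing such compound stimuli reduces to scaling a masker against a target to a fixed signal-to-noise ratio before mixing. A minimal sketch follows; the tone frequency, noise band edges, and duration are placeholder values, with only the +15 dB ratio taken from the abstract.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def mix_at_snr(signal, masker, snr_db):
    """Scale masker so the signal-to-masker RMS ratio equals snr_db, then mix."""
    rms = lambda v: np.sqrt(np.mean(v**2))
    masker = masker * (rms(signal) / rms(masker)) / 10 ** (snr_db / 20)
    return signal + masker

fs = 22050
t = np.arange(int(0.3 * fs)) / fs
tone = np.sin(2 * np.pi * 500 * t)                      # assumed tone pitch
sos = butter(4, [1000, 2000], btype="bandpass", fs=fs, output="sos")
noise = sosfiltfilt(sos, np.random.default_rng(2).standard_normal(t.size))
compound = mix_at_snr(tone, noise, snr_db=15)           # +15 dB as in the study
```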
Article
The purpose of this study was to compare four strategies for stimulus presentation in terms of their efficiency when generating a speech-evoked cortical acoustic change complex (ACC) in adults and children. Ten normally hearing adults (aged 22 to 31 yrs) and nine normally hearing children (aged 6 to 9 yrs) served as participants. The ACC was elicited using a 75-dB SPL synthetic vowel containing 1000 Hz changes of second formant frequency, creating a change of perceived vowel between /u/ and /i/. The ACC was recorded from Cz using four stimulus formats. ACC magnitude was expressed as the standard deviation of the voltage waveform within a window believed to span the ACC. Noise magnitude was estimated from the variances at each sampling point in the same window. Efficiency was expressed in terms of the ACC to noise magnitude ratio divided by testing time. ACC magnitude was not significantly different for the two directions of second formant change. Reducing interonset interval from 2 to 1 sec increased efficiency by a factor close to two. Combining data from the two directions of change increased efficiency further, by a factor approximating the square root of 2. Continuous alternating stimulus presentation is more efficient than interrupted stimulus presentation in eliciting the ACC. The benefits of eliminating silent periods and doubling the number of acoustic changes presented in a given time period are not seriously offset by a reduction in root mean square response amplitude, at least in young adults and in children as young as 6 yrs.
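The efficiency metric described can be computed directly from single-trial epochs: ACC magnitude as the standard deviation of the averaged waveform in the ACC window, noise magnitude from the across-trial variances in the same window, and efficiency as their ratio divided by testing time. The sketch below follows that description loosely; the simulated data, window placement, and the specific noise-of-the-average estimator are assumptions.

```python
import numpy as np

def acc_efficiency(epochs, fs, win, test_time_s):
    """epochs: trials x samples array; win: (start_s, end_s) window believed
    to span the ACC. One plausible realization of the abstract's definitions,
    estimating the noise in the average from across-trial variances."""
    i0, i1 = int(win[0] * fs), int(win[1] * fs)
    w = epochs[:, i0:i1]
    acc_mag = np.std(w.mean(axis=0))                      # SD of averaged waveform
    noise_mag = np.sqrt(np.mean(w.var(axis=0)) / len(w))  # noise in the average
    return (acc_mag / noise_mag) / test_time_s

rng = np.random.default_rng(3)
fs, n_trials = 1000, 100
t = np.arange(int(0.8 * fs)) / fs
acc = 2e-6 * np.exp(-((t - 0.4) ** 2) / 0.002)            # toy ACC-like deflection
epochs = acc + 1e-6 * rng.standard_normal((n_trials, t.size))
print(acc_efficiency(epochs, fs, win=(0.3, 0.5), test_time_s=120.0))
```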
Article
Speech scientists have long proposed that formant-exaggerated speech plays an important role in phonetic learning and language acquisition. However, there are very few neurophysiological data on how the infant brain and adult brain respond to formant exaggeration in speech. We employed event-related potentials (ERPs) to investigate neural coding of formant-exaggerated speech sounds. Two synthetic /i/ vowels were modeled after infant-directed speech data and presented in alternating blocks to test the effects of formant exaggeration. The fundamental frequencies of the two sounds were kept identical to avoid interference from exaggerated pitch level and range. For adult subjects, non-speech homologs were also created by using the center frequencies of the formants to additionally test whether the effects were speech-specific. In the infants (6- to 12-month-olds), ERP waveforms showed significantly enhanced N250 and sustained negativity responses for processing formant-exaggerated speech. In adults, enhancement was observed in the N100 component for the speech stimuli but not the homologous non-speech sounds. Collectively, these results provide the first evidence that formant expansion in infant-directed speech enhances neural activities for phonetic encoding, which may facilitate phonetic learning and language acquisition regardless of the age factor [Zhang et al. (2009). Neuroimage 46, 226-240].
Article
To determine the influence of spectrotemporal properties of naturally produced consonant-vowel syllables on speech-evoked auditory event-related potentials (ERPs) for stimuli with very similar or even identical wide-band envelopes. Speech-evoked ERPs may be useful for validating the neural representation of speech. Speech-evoked ERPs were obtained from 10 normal-hearing young adults in response to the syllables /da/ and /ta/. Both monosyllables were obtained from ongoing speech. They have quite similar wide-band envelopes and mainly differ in the spectrotemporal content of the consonant parts. Additionally, derivatives of each stimulus were investigated: (1) the isolated consonant part ("consonant stimulus"), (2) the isolated vowel part ("vowel stimulus"), and (3) a version with removed spectral information but an identical wide-band envelope. Latencies and amplitudes of the N1 and P2 components were determined and analyzed. ERPs in response to the naturally produced /ta/ syllable had significantly shorter N1 and P2 latencies and larger amplitudes than ERPs in response to /da/. Similar differences were observed for the ERPs evoked by the consonant stimuli alone. For the vowel stimuli and the stimuli with removed spectral information, no significant differences were observed. In summary, differences between the ERPs of /da/ and /ta/ corresponded to the distinct spectrotemporal content in the consonant parts of the original consonant-vowel (CV) syllables. The study shows that even small differences in the spectrotemporal features of speech may evoke different ERPs, despite very similar or even identical wide-band envelopes. The results are consistent with a model in which ERPs evoked by short CVs are an onset response to the consonant merged with an acoustic change complex evoked by the vowel part; however, all components appear as one P1-N1-P2 complex. The results may be explained by differences in the narrow-band envelopes of the stimuli. Therefore, this study underlines the limitations of the wide-band envelope in explaining speech-evoked ERPs. Additionally, the results of this study are of special interest for clinical application, since some of the ERP parameter differences, such as the N1 latency, are present not only in the ERPs of each single subject but also in the group mean of all N1 latencies. Thus, the presented ERP measurements in response to CVs might be used to identify potential problems in phoneme differentiation caused by spectrotemporal analysis problems.
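The distinction drawn here between wide-band and narrow-band envelopes can be illustrated with a standard Hilbert-envelope computation, applied either to the full-band signal or to a band-limited version of it. The band edges, filter orders, and smoothing cutoff below are illustrative choices.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def envelope(x, fs, band=None, smooth_hz=50):
    """Hilbert envelope of x; if band is given, the signal is first
    band-limited, yielding a narrow-band rather than wide-band envelope."""
    if band is not None:
        sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
        x = sosfiltfilt(sos, x)
    env = np.abs(hilbert(x))
    sos_lp = butter(4, smooth_hz, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos_lp, env)

fs = 16000
x = np.random.default_rng(4).standard_normal(fs // 2)   # stand-in for a syllable
wide = envelope(x, fs)                                   # wide-band envelope
narrow = envelope(x, fs, band=(2000, 4000))              # one narrow-band envelope
```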
Article
Magnetoencephalographic (MEG) applications in auditory evoked field (AEF) recordings have demonstrated that both tonotopicity and amplitopicity exist in the auditory cortex. The present study was conducted to determine whether previously reported characteristics of the AEF could be identified in multichannel cortical auditory evoked potential N1e (i.e., the electrical correlate of the magnetically recorded N1m) component recordings. Multichannel auditory evoked potentials from 11 young normal adults were collected after monaural tone burst stimuli of 250, 1000, and 4000 Hz. Results demonstrated that N1e amplitudes after stimulation at 250 Hz were significantly larger than those obtained after stimulation at 1000 or 4000 Hz. These frequency-specific differences existed for latency as well. Responses obtained after stimulation at 250 Hz were, on the average, 13 msec longer than those obtained after stimulation at 1000 or 4000 Hz. Also, contralateral latencies were significantly shorter than ipsilateral latencies. Although the significant frequency-specific amplitude results support the findings of previous investigators, the frequency-related latency differences have not been described. An explanation of these differences may exist in the spatial differences in the reception areas for low- and high-frequency tones in the primary auditory cortex.
Article
Magnetic evoked responses were recorded to different speech sounds in healthy humans. (i) Short words consisting of fricative consonant/vowel combinations evoked strong responses at the auditory cortex about 100 ms after the vowel onset. The response is specific to acoustic rather than phonetic aspects of the sounds. (ii) In a categorization task, words elicited a transient response followed by a sustained field (SF). When the subject counted the number of target words, SF was clearly increased. There were no consistent differences between the hemispheres and a similar increase of SF was observed when the subject classified the duration of two tones. (iii) When tone 'probes' were presented randomly to either ear and speech sounds to one ear, the 100-ms response was dampened and delayed bilaterally. The dampening was not specific to speech masking but dependent on the amount of frequency and amplitude transitions in the masker. All these experiments suggest that the auditory system performs a very similar analysis of both speech signals and other sounds. (iv) In a recent study, more closely related to speech perception, visual input from articulatory movements of the speaker was found to affect the activity of the auditory cortex. It seems that MEG studies can be useful in the study of brain mechanisms underlying speech perception in intact humans.
Article
The purpose of this study was to investigate the sufficient perceptual cues used in the recognition of four voiceless fricative consonants [s, f, θ, ʃ] followed by the same vowel [i:] in normal-hearing and hearing-impaired adult listeners. Subjects identified the four CV speech tokens in a closed-set response task across a range of presentation levels. Fricative syllables were either produced by a human speaker in the natural stimulus set, or generated by a computer program in the synthetic stimulus set. By comparing conditions in which the subjects were presented with equivalent degrees of audibility for individual fricatives, it was possible to isolate the factor of lack of audibility from that of loss of suprathreshold discriminability. Results indicate that (a) the friction burst portion may serve as a sufficient cue for correct recognition of voiceless fricatives by normal-hearing subjects, whereas the more intense CV transition portion, though it may not be necessary, can also assist these subjects to distinguish place information, particularly at low presentation levels; (b) hearing-impaired subjects achieved close-to-normal recognition performance when given equivalent degrees of audibility of the frication cue, but they obtained poorer-than-normal performance if only given equivalent degrees of audibility of the transition cue; (c) the difficulty that hearing-impaired subjects have in perceiving fricatives under normal circumstances may be due to two factors: the lack of audibility of the frication cue and the loss of discriminability of the transition cue.
Article
Neuromagnetic responses to different auditory stimuli (noise bursts and short speech stimuli) were mapped over both hemispheres of seven healthy subjects. The results indicate that a particular acoustic feature of speech, vowel onset after voiceless fricative consonants, evokes a prominent response in the human supratemporal auditory cortex. Although the observed response seems to be specific to acoustic rather than phonetic characteristics of the stimuli, it might reflect feature detection essential for further speech processing.
Article
Steps in brain information processing are reflected on the scalp as changes of the electric potential evoked by the stimulus. However, for a given recording point on the scalp, there is no absolute amplitude or phase information of the electric brain potential. This means that the shape of an evoked potential waveform recorded from a given scalp location crucially depends on the location of the chosen reference. Only unbiased results of evoked potential data evaluation can be expected to map successfully onto information-processing models established by other methods, e.g., behavioral measurements. Conventional recordings vs a common reference contain only one of many possible sets of waveshapes. To avoid ambiguous or biased results, the entire evoked potential data set must first be analysed over space, and reference-independent parameters must be extracted. For each time point, the spatial distribution of the potentials is viewed as a field map. The parameter extraction in a direct approach at each time point includes, e.g., locations of field peaks and troughs, voltage and gradient between them, and global electrical field power; or parameters via the first or second spatial derivative of the electric field. In the second step, changes of these reference-independent field measurements are analysed over time. At component latency, defined by maximal global field power or by voltage range, mapped field distributions can be compared using maximal/minimal field value locations or complete maps. Significantly different field configurations establish the activity of non-identical neural generators. Classification of the field configurations (examination of orbits of field extrema over time) leads to the segmentation of series of field maps (multichannel EP data) into short epochs of stationary spatial configurations (i.e., spatially characterized components), with equal consideration of all recording points and without the amplitude criterion. The application of these principles to the following problems is discussed: comparison of evoked potentials between different analysis times, in particular pre-stimulus and post-stimulus electric brain states; zero baseline for measurement; reference electrode; identification of evoked components in time and space. Illustrations of these problems include functional differences of input-analysing sub-systems and the topography of cognition- and speech-related electric brain activity.
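Of the reference-independent parameters listed, global field power is the most commonly computed: the spatial standard deviation of the momentary potential field across all electrodes, which is invariant to the choice of recording reference. A minimal sketch:

```python
import numpy as np

def global_field_power(v):
    """GFP per time point for a channels x samples array: the spatial
    standard deviation of the momentary potential field (reference-free,
    since the per-time-point mean across channels is removed)."""
    return v.std(axis=0)

rng = np.random.default_rng(5)
eeg = rng.standard_normal((64, 500))       # 64 channels, 500 time points
gfp = global_field_power(eeg)
component_latency = int(np.argmax(gfp))    # sample of maximal field power
```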
Article
1) To determine whether the N1-P2 acoustic change complex is elicited by a change of periodicity in the middle of an ongoing stimulus, in the absence of changes of spectral envelope or rms intensity. 2) To compare the N1-P2 acoustic change complex with the mismatch negativity elicited by the same stimuli in terms of amplitude and signal to noise ratio. The signals used in this study were a tonal complex and a band of noise having the same spectral envelope and rms intensity. For elicitation of the acoustic change complex, the signals were concatenated to produce two stimuli that changed in the middle (noise-tone, tone-noise). Two control stimuli were created by concatenating two copies of the noise and two copies of the tone (noise-only, tone-only). The stimuli were presented using an onset-to-onset interstimulus interval of 3 sec. For elicitation of the mismatch negativity, the tonal complex and noise band stimuli were presented using an oddball paradigm (deviant probability = 0.14) with an onset-to-onset interstimulus interval of 600 msec. The stimuli were presented via headphones at 80 dB SPL to 10 adults with normal hearing. Subjects watched a silent video during testing. The responses to the noise-only and tone-only stimuli showed a clear N1-P2 complex to the onset of stimulation followed by a sustained potential that continued until the offset of stimulation. The noise-tone and tone-noise stimuli elicited an additional N1-P2 acoustic change complex in response to the change in periodicity occurring in the middle. The acoustic change complex was larger for the tone-noise stimulus than for the noise-tone stimulus. A clear mismatch negativity was elicited by both the noise band and tonal complex stimuli. In contrast to the acoustic change complex, there was no significant difference in amplitude across the two stimuli. The acoustic change complex was a more sensitive index of peripheral discrimination capacity than the mismatch negativity, primarily because its average amplitude was 2.5 times as large. These findings indicate that both the acoustic change complex and the mismatch negativity are sensitive indexes of the neural processing of changes in periodicity, though the acoustic change complex has an advantage in terms of amplitude. The results support the possible utility of the acoustic change complex as a clinical tool in the assessment of peripheral speech perception capacity.
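The change stimuli in this design are simply two equal-RMS signals butted together, with the controls formed by concatenating a signal with itself. A sketch of that construction is shown below; the tonal-complex partials, noise spectrum, and durations are placeholders, as only the matched spectral envelope and RMS intensity are specified in the abstract.

```python
import numpy as np

def match_rms(x, target_rms):
    """Scale x to a given RMS level."""
    return x * target_rms / np.sqrt(np.mean(x**2))

fs, dur = 22050, 0.4
t = np.arange(int(dur * fs)) / fs
# Tonal complex and noise; spectral details are placeholders, and only the
# equal-RMS concatenation logic mirrors the stimulus construction described.
tone = sum(np.sin(2 * np.pi * f * t) for f in (500, 1000, 1500))
noise = np.random.default_rng(6).standard_normal(t.size)
tone, noise = match_rms(tone, 0.1), match_rms(noise, 0.1)

noise_tone = np.concatenate([noise, tone])   # change stimulus
tone_noise = np.concatenate([tone, noise])   # change stimulus
tone_only = np.concatenate([tone, tone])     # control
noise_only = np.concatenate([noise, noise])  # control
```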
Article
Children's auditory event-related potentials (ERPs) are dominated by the P1 and N2 peaks, while the N1 wave emerges between 3 and 4 years of age. The neural substrates and the behavioral correlates of the protracted N1 maturation, as well as of the 10-year-long predominance of the N2, are unclear. The present study utilized high-resolution electroencephalography to study the maturation of auditory ERPs from age 4 to adulthood and to compare the sources of the N1 and the N2 peaks in 9-year-old children and adults. Harmonic tones consisting of three partials were delivered with short (700 ms) and long (mean of 5 s) stimulus onset asynchrony (SOA), with only the 700-ms SOA used with 4-year-olds. With a short SOA, 4- and 9-year-old children displayed P1 and N2 peaks, whereas adults showed P1, N1, P2, and N2 waves. With a long SOA, 9-year-olds also displayed an N1 peak, which was more frontal in scalp distribution than that in adults, who showed P1, N1, and P2 peaks. After filtering out the slow N2 activity, the N1 wave was also revealed in the short-SOA data in 9-year-old but not in 4-year-old children. In adults and in 9-year-olds, the neural sources of the N2 and N1 mapped onto the superior aspects of the temporal lobes, the sources of the N2 being anterior to those of the N1. The results indicated that children's N1 is composed of differently weighted components than that in adults, and that in both children and adults the N1 and N2 are generated by anatomically distinct generators. A protracted ontogeny of the N1 could be linked with that of auditory sensitivity and orienting, whereas the P1 and N2 peaks are suggested to reflect auditory sensory processes.
Article
To determine if (1) evoked potentials elicited by amplified speech sounds (/si/ and /ʃi/) can be recorded reliably in individuals, (2) amplification alters neural response patterns, and (3) different amplified speech sounds evoke different neural patterns. Cortical evoked potentials were recorded in sound field from seven normal-hearing young adults in response to naturally produced speech tokens /si/ and /ʃi/ from the Nonsense Syllable Test. With the use of a repeated-measures design, subjects were tested and then retested within an 8-day period in both aided and unaided conditions. (1) Speech-evoked cortical potentials can be recorded reliably in individuals in both aided and unaided conditions. (2) Hearing aids that provide a mild high-frequency gain only subtly enhance peak amplitudes relative to unaided cortical recordings. (3) If the consonant-vowel boundary is preserved by the hearing aid, it can also be detected neurally, resulting in different neural response patterns for /si/ and /ʃi/. Speech-evoked cortical potentials can be recorded reliably in individuals during hearing aid use. A better understanding of how amplification (and device settings) affects neural response patterns is still needed.
Article
There has been considerable recent interest in the use of cortical auditory evoked potentials (CAEPs) as an electrophysiological measure of human speech encoding in individuals with normal as well as impaired auditory systems. The development of such electrophysiological measures such as CAEPs is important because they can be used to evaluate the benefits of hearing aids and cochlear implants in infants, young children, and adults that cannot cooperate for behavioral speech discrimination testing. The current study determined whether CAEPs produced by seven different speech sounds, which together cover a broad range of frequencies across the speech spectrum, could be differentiated from each other based on response latency and amplitude measures. CAEPs were recorded from ten adults with normal hearing in response to speech stimuli presented at a conversational level (65 dB SPL) via a loudspeaker. Cortical responses were reliably elicited by each of the speech sounds in all participants. CAEPs produced by speech sounds dominated by high-frequency energy were significantly different in amplitude from CAEPs produced by sounds dominated by lower-frequency energy. Significant effects of stimulus duration were also observed, with shorter duration stimuli producing larger amplitudes and earlier latencies than longer duration stimuli. This research demonstrates that CAEPs can be reliably evoked by sounds that encompass the entire speech frequency range. Further, CAEP latencies and amplitudes may provide an objective indication that spectrally different speech sounds are encoded differently at the cortical level.
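Latency and amplitude measures of the kind analyzed here are typically obtained by searching for a signed extremum within a component-specific time window. A minimal sketch, with assumed windows and a toy waveform:

```python
import numpy as np

def peak_measures(waveform, fs, window_s, polarity=-1):
    """Peak latency (ms) and amplitude within a search window; polarity=-1
    picks a negativity such as N1, polarity=+1 a positivity such as P2."""
    i0, i1 = int(window_s[0] * fs), int(window_s[1] * fs)
    seg = polarity * waveform[i0:i1]
    k = int(np.argmax(seg))
    return (i0 + k) / fs * 1000.0, waveform[i0 + k]

fs = 1000
t = np.arange(500) / fs
# Toy ERP: a negative deflection near 100 ms and a positive one near 200 ms.
erp = -3e-6 * np.exp(-((t - 0.1) ** 2) / 2e-4) + 2e-6 * np.exp(-((t - 0.2) ** 2) / 4e-4)
n1_lat, n1_amp = peak_measures(erp, fs, (0.08, 0.15), polarity=-1)
p2_lat, p2_amp = peak_measures(erp, fs, (0.15, 0.28), polarity=+1)
```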