Article

Abstract

This study examined how speech babble noise differentially affected the auditory P3 responses and the associated neural oscillatory activities for consonant and vowel discrimination in relation to segmental- and sentence-level speech perception in noise. The data were collected from 16 normal-hearing participants in a double-oddball paradigm that contained a consonant (/ba/ to /da/) and vowel (/ba/ to /bu/) change in quiet and noise (speech-babble background at a -3 dB signal-to-noise ratio) conditions. Time-frequency analysis was applied to obtain inter-trial phase coherence (ITPC) and event-related spectral perturbation (ERSP) measures in the delta, theta, and alpha frequency bands for the P3 response. Behavioral measures included percent correct phoneme detection and reaction time as well as percent correct IEEE sentence recognition in quiet and in noise. Linear mixed-effects models were applied to determine possible brain-behavior correlates. A significant noise-induced reduction in P3 amplitude was found, accompanied by significantly longer P3 latency and decreases in ITPC across all frequency bands of interest. There was a differential effect of noise on consonant discrimination and vowel discrimination in both ERP and behavioral measures, such that noise impacted the detection of the consonant change more than the vowel change. The P3 amplitude and some of the ITPC and ERSP measures were significant predictors of speech perception at the segmental and sentence levels across listening conditions and stimuli. These data demonstrate that the P3 response with its associated cortical oscillations represents a potential neurophysiological marker for speech perception in noise.
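The two oscillatory measures named in the abstract can be made concrete with a short sketch. Below is a minimal Python example, using MNE-Python's tfr_array_morlet, of one common way to derive ITPC and ERSP from a complex Morlet wavelet transform; the simulated epochs, frequency range, and baseline window are illustrative assumptions, not the study's actual recording parameters.

```python
import numpy as np
from mne.time_frequency import tfr_array_morlet

# Simulated single-subject epochs (trials x channels x samples);
# all parameter values below are assumptions for illustration only.
sfreq = 250.0
n_trials, n_channels, n_times = 100, 32, 250            # 1-s epochs
rng = np.random.default_rng(0)
epochs = rng.standard_normal((n_trials, n_channels, n_times))

freqs = np.arange(2.0, 13.0)                            # spans delta/theta/alpha
tfr = tfr_array_morlet(epochs, sfreq=sfreq, freqs=freqs,
                       n_cycles=freqs / 2.0, output='complex')
# tfr shape: (n_trials, n_channels, n_freqs, n_times), complex-valued

# ITPC: length of the mean unit phase vector across trials
# (0 = random phase, 1 = perfect phase alignment)
itpc = np.abs(np.mean(tfr / np.abs(tfr), axis=0))

# ERSP: trial-averaged power in dB relative to a pre-stimulus baseline
power = (np.abs(tfr) ** 2).mean(axis=0)
baseline = power[..., :50].mean(axis=-1, keepdims=True)  # first 200 ms
ersp = 10.0 * np.log10(power / baseline)

# Example band summary: theta (4-7 Hz) ITPC per channel and time point
theta_itpc = itpc[:, (freqs >= 4) & (freqs <= 7), :].mean(axis=1)
```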

... A dominant observation in time-frequency analyses of auditory change detection is an increase in theta ITC for the deviant relative to the standard (e.g., Fuentemilla et al., 2008; Hsiao et al., 2009), sometimes also associated with an increase in theta power. These theta modulations have been shown to be associated with perceptual discrimination abilities (Bishop et al., 2011; Jin et al., 2014) and speech intelligibility (Koerner et al., 2017). Additionally, the theta modulations may differ depending on language abilities (Cantiani et al., 2019; Halliday et al., 2014). ...
... The oscillatory correlate underlying the MMN is thought to be a phase reset resulting in increased inter-trial coherence (ITC) in the theta band (Fuentemilla et al., 2008; Hsiao et al., 2009), sometimes also associated with a concurrent increase in theta power. As with the MMN, these theta modulations have been shown to be associated with perceptual discrimination abilities (Bishop et al., 2011; Jin et al., 2014), speech intelligibility (Koerner et al., 2017), and differences in language abilities (Cantiani et al., 2019; Halliday et al., 2014). As these prior studies have been conducted on simpler stimuli such as tones, vowels, or single syllables, we cannot currently form precise hypotheses about modulations of the time-frequency measures by the current stimulus features (phonotactic probability and syllable stress patterns). ...
... ERPs reflect specific sensory and/or cognitive processes [30]. Specific ERPs that may be used to study the perception of signals in noise are the N1, P2, mismatch negativity (MMN) and P300 responses [31-40]. Of particular interest to the current study are the MMN and P300 components. ...
... Changes have been reported in ERPs in the presence of ipsilateral noise compared with a quiet condition [33,34,38,39,77-80]. Generally, amplitudes are reduced and latencies are prolonged for the MMN and P300 when stimuli are presented in ipsilateral noise, which may be activating the efferent system. ...
Article
Full-text available
This electrophysiological study investigated the role of the medial olivocochlear (MOC) efferents in listening in noise. Both ears of eleven normal-hearing adult participants were tested. The physiological tests consisted of transient-evoked otoacoustic emission (TEOAE) inhibition and the measurement of cortical event-related potentials (ERPs). The mismatch negativity (MMN) and P300 responses were obtained in passive and active listening tasks, respectively. Behavioral responses for the word recognition in noise test were also analyzed. Consistent with previous findings, the TEOAE data showed significant inhibition in the presence of contralateral acoustic stimulation. However, performance in the word recognition in noise test was comparable for the two conditions (i.e., without contralateral stimulation and with contralateral stimulation). Peak latencies and peak amplitudes of MMN and P300 did not show changes with contralateral stimulation. Behavioral performance was also maintained in the P300 task. Together, the results show that the peripheral auditory efferent effects captured via otoacoustic emission (OAE) inhibition might not necessarily be reflected in measures of central cortical processing and behavioral performance. As the MOC effects may not play a role in all listening situations in adults, the functional significance of the cochlear effects of the medial olivocochlear efferents and the optimal conditions conducive to corresponding effects in behavioral and cortical responses remain to be elucidated.
... EROs can be obtained at levels ranging from single-cell and focal field potentials in animals to large-scale synchronized activities measured at the human scalp (Moran and Hong, 2011). Power in the theta band is particularly prominent during the processing of tone and speech deviants, with studies showing that neural generation of the MMN is accompanied by theta power modulation and theta phase alignment (Fuentemilla et al., 2008; Hsiao et al., 2009; Bishop and Hardiman, 2010; Ko et al., 2012; Choi et al., 2013; Hermann et al., 2014; Koerner et al., 2016; Corcoran et al., 2018). NMDA receptor antagonists modulate background spontaneous theta oscillations (Hunt and Kasicki, 2013) and reduce both the MMN and the theta response to auditory deviants in rodents. ...
... NMDA receptor antagonists have been found to reduce evoked theta power during MMN generation and other task events in rodent models (Lazarewicz et al., 2009). Interestingly, speech-evoked MMN and theta power in healthy volunteers serve as predictors of behavioral speech perception at the syllable and sentence level and, along with perception accuracy, are reduced during noise stress (Koerner et al., 2016), which also disrupts NMDA receptor signaling (Cui et al., 2012). ...
Article
Full-text available
Background: Previous studies in schizophrenia have consistently shown that deficits in the generation of the auditory mismatch negativity (MMN) – a pre-attentive, event-related potential (ERP) typically elicited by changes to simple sound features – are linked to N-methyl-D-aspartate (NMDA) receptor hypofunction. Concomitant with extensive language dysfunction in schizophrenia, patients also exhibit MMN deficits to changes in speech, but their relationship to NMDA-mediated neurotransmission is not clear. Accordingly, our study aimed to investigate speech MMNs in healthy humans and their underlying electrophysiological mechanisms in response to NMDA antagonist treatment. We also evaluated the relationship between baseline MMN/electrocortical activity and emergent schizophrenia-like symptoms associated with NMDA receptor blockade. Methods: In a sample of 18 healthy volunteers, a multi-feature Finnish language paradigm incorporating changes in syllable, vowel and consonant stimuli was used to assess the acute effects of the NMDA receptor antagonist ketamine and placebo on the MMN. Further, measures of underlying neural activity, including evoked theta power, theta phase locking and source-localized current density in cortical regions of interest, were assessed. Subjective symptoms were assessed with the Clinician Administered Dissociative States Scale (CADSS). Results: Participants exhibited significant ketamine-induced increases in psychosis-like symptoms which, depending on temporal or frontal recording region, co-occurred with reductions in MMN generation in response to syllable frequency/intensity, vowel duration, and across-vowel and consonant deviants. MMN attenuation was associated with decreases in evoked theta power, theta phase locking and diminished current density in auditory and inferior frontal (language-related cortical) regions. Baseline (placebo) MMN and underlying electrophysiological features associated with the processing of changes in syllable intensity correlated with the degree of psychotomimetic response to ketamine. Conclusion: Ketamine-induced impairments in healthy human speech MMNs and their underlying electrocortical mechanisms closely resemble those observed in schizophrenia and support a model of dysfunctional NMDA receptor-mediated neurotransmission underlying language processing deficits in schizophrenia. HIGHLIGHTS:
- Neural effects of NMDA receptor blockade on speech processing were assessed in a ketamine model.
- Ketamine reduced MMN, theta power, theta phase locking factor and regional cortical current density.
- Psychosis-like symptoms induced by ketamine were related to baseline (placebo) neural measures of speech processing.
... A larger P3b amplitude in the Easy condition suggests more attentional resources deployed in this condition (Kok, 2001; Polich, 2007), and is in line with previous findings of P3 amplitude decreases with lower stimulus discriminability (Kok, 2001; Parasuraman and Beatty, 1980), such as in the Difficult condition. This interpretation is in accordance with other studies that showed smaller P3b amplitudes in challenging conditions compared to easier ones during stimulus classification (Caclin et al., 2008), working-memory tasks (Pinal et al., 2014) and phonemic discrimination (Koerner et al., 2017). It is also consistent with behavioural results. ...
Thesis
Our everyday listening environment is a complex acoustic mixture that needs to be processed and filtered in order to access relevant auditory information. Cognitive resources are then required for the selective processing of a particular sound stream and the simultaneous filtering of irrelevant information. The engagement of these cognitive resources to understand an auditory message leads to listening effort, especially in noisy environments. Listening effort has been investigated over the last two decades using a large panel of methods. The work of this thesis aims to bring new insights to the investigation of listening effort, first with the use of pupillometry, then based on the complementarity of different measures (subjective, behavioral and objective). A methodological investigation was first conducted on pupillometry data recorded during a word-in-noise task among older hearing-impaired patients, with and without hearing aids. Several analysis methods were compared, including different normalization techniques, baseline periods, and baseline durations. While the different normalization methods and baseline durations showed similar results, the choice of the baseline period turned out to have a crucial influence on conclusions. Indeed, anticipatory, pre-stimulus cognitive processes, such as attention mobilization, were observed in pupil dilation when the baseline period was the most anterior relative to the stimulus onset. The differences in pupil dilation were observed even at perfect intelligibility, highlighting the relevance of pupillometry as an objective measure of listening effort. The second axis of this work focused on the results of empirical studies in which several measures, including pupillometry, were concurrently used to assess listening effort. Empirical studies were conducted (1) in older hearing-impaired patients using subjective measures of effort and pupillometry during a word-in-noise task, and (2) in normal-hearing young adults using pupillometry and scalp electroencephalography during a discrimination-in-noise task. The lack of correlation between self-assessed task difficulty and pupil responses in hearing-impaired listeners suggests that the two measures address different aspects of effortful listening. Pupil responses allowed for the observation of anticipation processes, even at perfect intelligibility, while subjective measures described the overall perceived effort during the task. In normal-hearing young adults, the modulations of cortical responses observed with electroencephalography were linked to the processing of the stimulation and the inhibition of the irrelevant sound source during discrimination. Pupillary responses, recorded simultaneously, brought information on participants' arousal state during the task. Results of both studies thus suggest that the different measures complement each other, and that their combination can help in understanding the different cognitive processes involved during effortful listening. Overall, this PhD work brings insights on the use and processing of the pupillometric signal to explore listening effort. It also underlines the relevance of pupillometry and its contribution to the study of listening effort among distinct populations. Finally, it shows the complementarity of subjective and objective measures in the assessment of listening effort, supporting the idea that it is a multidimensional construct.
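Since the thesis directly compares normalization techniques and baseline choices, a toy Python sketch may clarify the two corrections most often contrasted in pupillometry (subtractive vs. divisive), with the baseline window as a free parameter; the function name, window values, and simulated data are hypothetical, not the thesis's actual pipeline.

```python
import numpy as np

def baseline_correct(pupil, times, win=(-0.5, 0.0), method="subtract"):
    """pupil: (n_trials, n_times) pupil diameter; times: (n_times,) in s."""
    mask = (times >= win[0]) & (times < win[1])
    base = pupil[:, mask].mean(axis=1, keepdims=True)  # per-trial baseline
    if method == "subtract":
        return pupil - base            # absolute change from baseline
    return (pupil - base) / base       # proportional (divisive) change

# Example: 20 trials sampled at 60 Hz from -1 s to +3 s around word onset
times = np.arange(-1.0, 3.0, 1.0 / 60.0)
pupil = 4.0 + 0.1 * np.random.randn(20, times.size)    # mm, simulated
sub = baseline_correct(pupil, times, win=(-0.5, 0.0), method="subtract")
div = baseline_correct(pupil, times, win=(-1.0, -0.5), method="divide")
```

Shifting the `win` argument earlier or later relative to stimulus onset reproduces the kind of baseline-period comparison the thesis reports as decisive for its conclusions.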
... P3 is thought to be related to updating a mental representation of the incoming stimulus [28]. This component can be observed during stimulus discrimination in various tasks such as oddball tasks [42,43], go/no-go or stop-signal tasks [44-46], or identification tasks [47]. P3 can be affected by stimulus probability and relevancy to the task. ...
Article
Full-text available
Speech discrimination is used by audiologists in diagnosing and determining treatment for patients with hearing loss. Usually, assessing speech discrimination requires subjective responses. Using electroencephalography (EEG), a method based on event-related potentials (ERPs) could provide objective speech discrimination. In this work we proposed a visual-ERP-based method to assess speech discrimination using pictures that represent word meaning. The proposed method was implemented with three strategies, each with a different number of pictures and test sequences. Machine learning was adopted to classify between the task conditions based on features extracted from EEG signals. The results from the proposed method were compared to those of a similar visual-ERP-based method using letters and of a method based on the auditory mismatch negativity (MMN) component. The P3 component and the late positive potential (LPP) were observed in the two visual-ERP-based methods, while the MMN was observed during the MMN-based method. Two of the three strategies of the proposed method, along with the MMN-based method, achieved approximately 80% average classification accuracy with a combination of a support vector machine (SVM) and common spatial patterns (CSP). Potentially, these methods could serve as a pre-screening tool to make speech discrimination assessment more accessible, particularly in areas with a shortage of audiologists.
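A minimal sketch of the CSP-plus-SVM classification step named in this abstract, written with MNE-Python's CSP and scikit-learn; the simulated two-class EEG epochs and every parameter value are assumptions for illustration rather than the authors' actual pipeline.

```python
import numpy as np
from mne.decoding import CSP
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.standard_normal((80, 16, 200))   # (trials, channels, samples)
y = np.repeat([0, 1], 40)                # e.g., target vs. standard trials
X[y == 1, :4] *= 1.5                     # inject a class-dependent pattern

# CSP reduces each epoch to log-variance features of spatially filtered
# components; the SVM then classifies those features.
clf = make_pipeline(CSP(n_components=4, log=True), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, cv=5)
print(f"Mean cross-validated accuracy: {scores.mean():.2f}")
```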
... The P3 occurs in response to rare and relevant events 250-500 ms after the presentation of a stimulus and appears to be associated with stimulus classification and updating in short-term memory. It has been suggested that P3 latency is a function of the time necessary to classify a stimulus on any task-relevant dimension (Kok, 2001; Kutas & Van Petten, 1988), and, like MMN, it has often been reported in studies using an oddball paradigm (Koerner et al., 2017). In a review of P3, Kok (2001) suggests that it is the attention to stimulus processing (that is, "task emphasis") that increases the P3 amplitude, while concurrent working-memory load (that is, "dual-task performance") reduces it. ...
Article
Full-text available
Phonological duration differences in quantity languages can be problematic for second language learners whose native language does not use duration contrastively. Recent studies have found improvement in the processing of non-native vowel duration contrasts with the use of listen-and-repeat training, and the current study explores the efficacy of similar methodology on consonant duration contrasts. Eighteen adult participants underwent two days of listen-and-repeat training with pseudoword stimuli containing either a sibilant or a stop consonant contrast. The results were examined with psychophysiological event-related potentials (mismatch negativity and P3), behavioral discrimination tests, and a production task. The results revealed no training-related effects in the event-related potentials or the production task, but behavioral discrimination performance improved. Furthermore, differences emerged between the processing of the two consonant types. The findings suggest that stop consonants are processed more slowly than sibilants, and the findings are discussed with regard to possible segmentation difficulties.
... Earlier work assessing AEP responses in patients with known lesions to the central auditory pathway has revealed that N1 and P2 responses may be left intact while P3 responses show longer latencies and reduced amplitudes, suggesting that P3 measures may be more sensitive to neurological lesions in the auditory system than earlier waves (Knight et al. 1989; Musiek et al. 1992). Indeed, previous studies have shown P3 measures to be sensitive to a range of auditory processing abilities (Krishnamurti 2001), including significant correlations between P3 indices and central auditory processing tasks such as the Dichotic Digits test (Rocha et al. 2010), the Staggered Spondaic Words test, and some measures of SIN perception (Talarico et al. 2007; Bennett et al. 2012; McCullagh et al. 2012; Koerner et al. 2017). The present study represents an important step towards objective measurement of auditory dysfunction in Veterans exposed to high-intensity blasts. However, interpretation of the results should be tempered due to several study limitations. ...
Article
Objectives: Veterans who have been exposed to high-intensity blast waves frequently report persistent auditory difficulties such as problems with speech-in-noise (SIN) understanding, even when hearing sensitivity remains normal. However, these subjective reports have proven challenging to corroborate objectively. Here, we sought to determine whether use of complex stimuli and challenging signal contrasts in auditory evoked potential (AEP) paradigms, rather than traditional use of simple stimuli and easy signal contrasts, improved the ability of these measures to (1) distinguish between blast-exposed Veterans with auditory complaints and neurologically normal control participants, and (2) predict behavioral measures of SIN perception. Design: A total of 33 adults (aged 19-56 years) took part in this study, including 17 Veterans exposed to high-intensity blast waves within the past 10 years and 16 neurologically normal control participants matched for age and hearing status with the Veteran participants. All participants completed the following test measures: (1) a questionnaire probing perceived hearing abilities; (2) behavioral measures of SIN understanding including the BKB-SIN, the AzBio presented at 0 and +5 dB signal-to-noise ratios (SNRs), and a word-level consonant-vowel-consonant test presented at +5 dB SNR; and (3) electrophysiological tasks involving oddball paradigms in response to simple tones (500 Hz standard, 1000 Hz deviant) and complex speech syllables (/ba/ standard, /da/ deviant) presented in quiet and in four-talker speech babble at an SNR of +5 dB. Results: Blast-exposed Veterans reported significantly greater auditory difficulties compared to control participants. Behavioral performance on tests of SIN perception was generally, but not significantly, poorer in the blast-exposed group. Latencies of P3 responses to tone signals were significantly longer among blast-exposed participants compared to control participants regardless of background condition, though responses to speech signals were similar across groups. For cortical AEPs, no significant interactions were found between group membership and either stimulus type or background. P3 amplitudes measured in response to signals in background babble accounted for 30.9% of the variance in subjective auditory reports. Behavioral SIN performance was best predicted by a combination of N1 and P2 responses to signals in quiet, which accounted for 69.6% and 57.4% of the variance on the AzBio at 0 dB SNR and the BKB-SIN, respectively. Conclusions: Although blast-exposed participants reported far more auditory difficulties compared to controls, use of complex stimuli and challenging signal contrasts in cortical and cognitive AEP measures failed to reveal larger group differences than responses to simple stimuli and easy signal contrasts. Despite this, only P3 responses to signals presented in background babble were predictive of subjective auditory complaints. In contrast, cortical N1 and P2 responses were predictive of behavioral SIN performance but not subjective auditory complaints, and use of challenging background babble generally did not improve performance predictions. These results suggest that challenging stimulus protocols are more likely to tap into perceived auditory deficits, but may not be beneficial for predicting performance on clinical measures of SIN understanding. Finally, these results should be interpreted with caution since blast-exposed participants did not perform significantly poorer on tests of SIN perception.
... Electrophysiological measures, though, can assess how spectral and temporal features of speech are coded in the central auditory system and how neural responses to acoustic changes within and across speech sounds relate to behavioral perception (Easwar et al. 2012; Swink and Stuart 2012). Objective measurements of speech processing independent of cognitive or attentional skill are a potentially attractive tool for clinicians who monitor speech and language outcomes, as the neural coding of speech segments has been shown to be predictive of sentence perception abilities in adult listeners with and without hearing impairment (Koerner et al. 2016; Koerner et al. 2017). It remains unclear whether syllable-final /s/-/ʃ/ fricatives produce different electrophysiological responses in listeners with and without hearing loss when stimuli are behaviorally discriminable. ...
Article
Full-text available
Background: Cortical auditory event-related potentials are a potentially useful clinical tool to objectively assess speech outcomes with rehabilitative devices. Whether hearing aids reliably encode the spectrotemporal characteristics of fricative stimuli in different phonological contexts, and whether these differences result in distinct neural responses with and without hearing aid amplification, remain unclear. Purpose: To determine whether the neural coding of the voiceless fricatives /s/ and /ʃ/ in the syllable-final context reliably differed without hearing aid amplification and whether hearing aid amplification altered neural coding of the fricative contrast. Research Design: A repeated-measures, within-subject design was used to compare the neural coding of a fricative contrast with and without hearing aid amplification. Study Sample: Ten adult listeners with normal hearing participated in the study. Data Collection and Analysis: Cortical auditory event-related potentials were elicited to an /ɑs/–/ɑʃ/ vowel-fricative contrast in unaided and aided listening conditions. Neural responses to the speech contrast were recorded at 64 electrode sites. Peak latencies and amplitudes of the cortical response waveforms to the fricatives were analyzed using repeated-measures analysis of variance. Results: The P2' component of the acoustic change complex significantly differed for the syllable-final fricative contrast with and without hearing aid amplification. Hearing aid amplification differentially altered the neural coding of the contrast across frontal, temporal, and parietal electrode regions. Conclusions: Hearing aid amplification altered the neural coding of syllable-final fricatives. However, the contrast remained acoustically distinct in the aided and unaided conditions, and cortical responses to the fricatives significantly differed with and without the hearing aid.
... When background noise, either competing noise or competing speech, is combined with a speech signal, N1 and P2 peaks recorded from speech sound onset are generally reduced in amplitude and delayed in latency for adults (Billings et al., 2011; Billings et al., 2013; Billings and Grush, 2016; Kaplan-Neeman et al., 2006; Parbery-Clark et al., 2011; Whiting et al., 1998; Zendel et al., 2015) and school-age children (Almeqbel and McMahon, 2015; Anderson et al., 2010; Cunningham et al., 2001; Hassaan, 2015; Hayes et al., 2003; Sharma et al., 2014; Warrier et al., 2004). Consistent with the effects of competing noise on N1 and P2, the P3b peak is generally delayed and reduced in amplitude for adult listeners when competing noise is present (Bennett et al., 2012; Koerner et al., 2017; Whiting et al., 1998). Far less is known about the effect of competing noise on P3b in child listeners. ...
Article
Child listeners have particular difficulty with speech perception when competing speech noise is present; this challenge is often attributed to their immature top-down processing abilities. The purpose of this study was to determine if the effects of competing speech noise on speech-sound processing vary with age. Cortical auditory evoked potentials (CAEPs) were measured during an active speech-syllable discrimination task in 58 normal-hearing participants (age 7–25 years). Speech syllables were presented in quiet and embedded in competing speech noise (4-talker babble, +15 dB signal-to-noise ratio [SNR]). While noise was expected to similarly reduce amplitude and delay latencies of N1 and P2 peaks in all listeners, it was hypothesized that effects of noise on the P3b peak would be inversely related to age due to the maturation of top-down processing abilities throughout childhood. Consistent with previous work, results showed that a +15 dB SNR reduces amplitudes and delays latencies of CAEPs for listeners of all ages, affecting speech-sound processing, delaying stimulus evaluation, and causing a reduction in behavioral speech-sound discrimination. Contrary to expectations, findings suggest that competing speech noise at a +15 dB SNR may have similar effects on various stages of speech-sound processing for listeners of all ages. Future research directions should examine how more difficult listening conditions (poorer SNRs) might affect results across ages.
... It is possible that the absence of parietal P300 responses at pretest (defined as a target amplitude ≥ 0.5 μV above the standard amplitude) for one third of the study sample was due to the study procedures used to collect AEP data. First, work in adult listeners has shown that the presence of babble noise causes significant reductions in P300 amplitude when compared to AEPs elicited in quiet (Bennett, Billings, Molis, & Leek, 2012; Koerner, Zhang, Nelson, Wang, & Zou, 2017). The use of speech embedded in multitalker babble noise could have negatively influenced the presence of the P300 response. ...
Article
Purpose: The purpose of this study was to examine fatigue associated with sustained and effortful speech processing in children with mild to moderately severe hearing loss. Method: We used auditory P300 responses, subjective reports, and behavioral indices (response time, lapses of attention) to measure fatigue resulting from sustained speech-processing demands in 34 children with mild to moderately severe hearing loss (M = 10.03 years, SD = 1.93). Results: Compared to baseline values, children with hearing loss showed increased lapses in attention, longer reaction times, reduced P300 amplitudes, and greater reports of fatigue following the completion of the demanding speech-processing tasks. Conclusions: Similar to children with normal hearing, children with hearing loss demonstrate reductions in attentional processing of speech in noise following sustained speech-processing tasks, a finding consistent with the development of fatigue.
... In contrast to behavioral evidence, ERP evidence may provide a finer examination of sensitivity to consonant and vowel mispronunciations. Previous studies with adults have found sensitivity to both consonant and vowel mispronunciations in auditory processing, both with and without noise [30,31], as well as in visual word recognition, but at different timings and scalp distributions [32-34]. ERP studies with infants have not directly compared consonant and vowel processing, but a series of auditory word recognition studies have examined consonant or vowel processing in older infants. ...
Article
Full-text available
Segmentation skill and the preferential processing of consonants (C-bias) develop during the second half of the first year of life, and it has been proposed that these facilitate language acquisition. We used event-related brain potentials (ERPs) to investigate the neural bases of early word form segmentation and of the early processing of onset consonants, medial vowels, and coda consonants, exploring how differences in these early skills might be related to later language outcomes. Our results with French-learning eight-month-old infants primarily support previous studies that found that the word familiarity effect in segmentation is developing from a positive to a negative polarity at this age. Although as a group infants exhibited an anterior-localized negative effect, inspection of individual results revealed that a majority of infants showed a negative-going response (Negative Responders), while a minority showed a positive-going response (Positive Responders). Furthermore, all infants demonstrated sensitivity to onset consonant mispronunciations, while Negative Responders demonstrated a lack of sensitivity to vowel mispronunciations, a developmental pattern similar to previous literature. Responses to coda consonant mispronunciations revealed neither sensitivity nor lack of sensitivity. We found that infants showing a more mature, negative response to newly segmented words compared to control words (evaluating segmentation skill) and mispronunciations (evaluating phonological processing) at test also had greater growth in word production over the second year of life than infants showing a more positive response. These results establish a relationship between early segmentation skills and phonological processing (not modulated by the type of mispronunciation) and later lexical skills.
... Similarly, Kozou et al. (2005) compared MMNs elicited in silence and in several types of weak background noise at +10 dB SNR, and found no significant effect of any type of noise on behavioral performance and no significant effect of broadband noise on MMN indices. In contrast, MMNs and P300s elicited in loud background noise (e.g., −3 dB SNR in Koerner et al., 2016, 2017; Bennett et al., 2012) are significantly deteriorated by the noise. All these observations are actually consistent with each other. ...
Article
Since sound perception takes place against a background with a certain amount of noise, both speech and non-speech processing involve extraction of target signals and suppression of background noise. Previous works on early processing of speech phonemes largely neglected how background noise is encoded and suppressed. This study aimed to fill in this gap. We adopted an oddball paradigm where speech (vowels) or non-speech stimuli (complex tones) were presented with or without a background of amplitude-modulated noise and analyzed cortical responses related to foreground stimulus processing, including mismatch negativity (MMN), N2b, and P300, as well as neural representations of the background noise, i.e. auditory steady-state response (ASSR). We found that speech deviants elicited later and weaker MMN, later N2b, and later P300 than non-speech ones, but N2b and P300 had similar strength, suggesting more complex processing of certain acoustic features in speech. Only for vowels, background noise enhanced N2b strength relative to silence, suggesting an attention-related speech-specific process to improve perception of foreground targets. In addition, noise suppression in speech contexts, quantified by ASSR amplitude reduction after stimulus onset, was lateralized towards the left hemisphere. The left-lateralized suppression following N2b was associated with the N2b enhancement in noise for speech, indicating that foreground processing may interact with background suppression, particularly during speech processing. Together, our findings indicate that the differences between perception of speech and non-speech sounds involve not only the processing of target information in the foreground but also the suppression of irrelevant aspects in the background.
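The suppression index in this abstract (ASSR amplitude reduction after stimulus onset) rests on measuring spectral amplitude at the background noise's modulation frequency. Below is a toy numpy sketch of that core measurement; the 40-Hz modulation rate, signal parameters, and simulated data are assumptions for illustration, not the study's actual values.

```python
import numpy as np

sfreq, mod_freq, dur = 500.0, 40.0, 2.0          # Hz, Hz, seconds
t = np.arange(0, dur, 1.0 / sfreq)
# Simulated response entrained to 40-Hz amplitude-modulated noise
response = 0.5 * np.sin(2 * np.pi * mod_freq * t) + np.random.randn(t.size)

# Amplitude spectrum, scaled so a pure sinusoid recovers its amplitude
spectrum = np.abs(np.fft.rfft(response)) / t.size * 2.0
freqs = np.fft.rfftfreq(t.size, d=1.0 / sfreq)
assr_amp = spectrum[np.argmin(np.abs(freqs - mod_freq))]
print(f"ASSR amplitude at {mod_freq:.0f} Hz: {assr_amp:.2f} (a.u.)")
```

Comparing this amplitude before vs. after foreground-stimulus onset, per hemisphere, would yield the kind of lateralized suppression measure the abstract describes.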
Article
Purpose: Understanding speech in a background of other people talking is a difficult listening situation for hearing-impaired individuals, and even for those with normal hearing. Speech-on-speech masking is known to contribute to increased perceptual difficulty over nonspeech background noise because of informational masking provided over and above the effects of energetic masking. While informational masking research has identified factors of similarity and uncertainty between target and masker that contribute to reduced behavioral performance in speech background noise, critical gaps in knowledge, including the underlying neural-perceptual processes, remain. By systematically manipulating aspects of acoustic similarity and uncertainty in the same auditory paradigm, the current study examined the time course of these informational masking effects and objectively quantified them at both early and late stages of auditory processing using auditory evoked potentials (AEPs). Method: Thirty participants were included in a cross-sectional repeated-measures design. Target-masker similarity was manipulated by varying the linguistic/phonetic similarity (i.e., language) of the talkers in the background. Specifically, four levels representing hypothesized increasing levels of informational masking were implemented: (1) no masker (quiet); (2) Mandarin; (3) Dutch; and (4) English. Stimulus uncertainty was manipulated by task complexity, specifically presentation of target-to-target interval (TTI) in the auditory evoked paradigm. Participants had to discriminate between English word stimuli (/bæt/ and /pæt/) presented in an oddball paradigm under each masker condition, pressing buttons in response to either the target or standard stimulus. Responses were recorded simultaneously for P1-N1-P2 (standard waveform) and P3 (target waveform). This design allowed for simultaneous recording of multiple AEP peaks, as well as accuracy, reaction time, and d' behavioral discrimination to button press responses. Results: Several trends in AEP components were consistent with effects of increasing linguistic/phonetic similarity and stimulus uncertainty. All babble maskers significantly affected outcomes compared to quiet. In addition, the native-language English masker had the largest effect on outcomes in the AEP paradigm, including reduced P3 amplitude and area, as well as decreased accuracy and d' behavioral discrimination to target word responses. AEP outcomes for the Mandarin and Dutch maskers, however, were not significantly different across any measured component. Latency outcomes for both N1 and P3 also supported an effect of stimulus uncertainty, consistent with increased processing time related to greater task complexity. An unanticipated result was the absence of an interaction between linguistic/phonetic similarity and stimulus uncertainty. Conclusions: Observable effects of both similarity and uncertainty were evidenced at the level of the P3 more than at the earlier N1 level of auditory cortical processing, suggesting that higher-level active auditory processing may be more sensitive to informational masking deficits. The lack of a significant interaction between similarity and uncertainty at either level of processing suggests that these informational masking factors operated independently. Speech babble maskers across languages altered AEP component measures, behavioral detection, and reaction time. Specifically, this occurred when the babble was in the native/same language as the target, while the effects of foreign-language maskers did not differ. The objective results from this study provide a foundation for further investigation of how the linguistic content of target and masker and task difficulty contribute to difficulty understanding speech in noise.
Article
Understanding speech in background noise is difficult for many listeners with and without hearing impairment (HI). This study investigated the effects of HI on speech discrimination and recognition measures as well as speech-evoked cortical N1-P2 and MMN auditory event-related potentials (AERPs) in background noise. We aimed to determine which AERP components can predict the effects of HI on speech perception in noise across adult listeners with and without HI. The data were collected from 18 participants with hearing thresholds ranging from within normal limits to bilateral moderate-to-severe sensorineural hearing loss. Linear mixed-effects models were employed to examine how hearing impairment, age, stimulus type, and SNR listening condition affected neural and behavioral responses and which AERP components were correlated with effects of HI on speech-in-noise perception across participants. Significant effects of age were found on the N1-P2 but not on the MMN, and significant effects of HI were observed on the MMN and behavioral measures. The results suggest that neural responses reflecting later cognitive processing of stimulus discrimination may be more susceptible to the effects of HI on the processing of speech in noise than earlier components that signal the sensory encoding of acoustic stimulus features. Objective AERP responses were also potential neural predictors of speech perception in noise across participants with and without HI, which has implications for the use of AERPs as a potential clinical tool for assessing speech perception in noise. Full text for personal sharing available before December 6, 2018 at this web link: https://authors.elsevier.com/c/1XynD1M5IZOSKX
Article
Full-text available
Neurophysiological studies are often designed to examine relationships between measures from different testing conditions, time points, or analysis techniques within the same group of participants. Appropriate statistical techniques that can take into account repeated measures and multivariate predictor variables are integral and essential to successful data analysis and interpretation. This work implements and compares conventional Pearson correlations and linear mixed-effects (LME) regression models using data from two recently published auditory electrophysiology studies. For the specific research questions in both studies, the Pearson correlation test is inappropriate for determining the strength of association between the behavioral responses for speech-in-noise recognition and the multiple neurophysiological measures, as the neural responses across listening conditions were simply treated as independent measures. In contrast, the LME models allow a systematic approach to incorporate both fixed-effect and random-effect terms to deal with the categorical grouping factor of listening conditions, between-subject baseline differences in the multiple measures, and the correlational structure among the predictor variables. Together, the comparative data demonstrate the advantages as well as the necessity of applying mixed-effects models to properly account for the built-in relationships among the multiple predictor variables, which has important implications for proper statistical modeling and interpretation of human behavior in terms of neural correlates and biomarkers.
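To make the contrast concrete, here is a minimal sketch of the kind of mixed-effects specification the article advocates: listening condition as a fixed effect alongside a neural predictor, with a per-subject random intercept to absorb between-subject baseline differences. It is written with Python's statsmodels rather than the authors' own code, and the variable names and simulated data are hypothetical stand-ins for the neural and behavioral measures.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate a repeated-measures dataset: each subject contributes one
# observation per listening condition (all values are made up).
rng = np.random.default_rng(1)
rows = []
for subj in range(16):
    offset = rng.normal(0, 5)                     # subject baseline shift
    for cond in ("quiet", "noise"):
        itpc = rng.uniform(0.2, 0.8)              # hypothetical neural measure
        accuracy = (60 + 20 * itpc - (10 if cond == "noise" else 0)
                    + offset + rng.normal(0, 3))
        rows.append({"subject": subj, "condition": cond,
                     "itpc": itpc, "accuracy": accuracy})
df = pd.DataFrame(rows)

# Fixed effects of the neural predictor and condition; random intercept
# per subject, rather than pooling conditions as independent samples.
model = smf.mixedlm("accuracy ~ itpc + condition", df, groups=df["subject"])
print(model.fit().summary())
```

A pooled Pearson correlation on the same data would ignore both the condition factor and the repeated measurements per subject, which is exactly the problem the article raises.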
Article
Full-text available
Linear mixed-effects models (LMMs) are increasingly being used for data analysis in cognitive neuroscience and experimental psychology, where within-participant designs are common. The current article provides an introductory review of the use of LMMs for within-participant data analysis and describes a free, simple, graphical user interface (LMMgui). LMMgui uses the package lme4 (Bates et al., 2014a,b) in the statistical environment R (R Core Team). Linear mixed-effects models (LMMs) provide a versatile approach to data analysis and have been shown to be very useful in several branches of neuroscience (Gueorguieva and Krystal, 2004; Kristensen and Hansen, 2004; Quené and van den Bergh, 2004; Baayen et al., 2008; Lazic, 2010; Judd et al., 2012; Aarts et al., 2014). The current article briefly reviews the use of LMMs for within-participant studies typical in experimental psychology, before describing a free graphical user interface (LMMgui; http://www.unifr.ch/neurology/en/lmmgui) to carry out LMM analyses.
Article
Full-text available
The current study measured neural responses to investigate auditory stream segregation of noise stimuli with or without clear spectral contrast. Sequences of alternating A and B noise bursts were presented to elicit stream segregation in normal-hearing listeners. The successive B bursts in each sequence maintained an equal amount of temporal separation with manipulations introduced on the last stimulus. The last B burst was either delayed for 50% of the sequences or not delayed for the other 50%. The A bursts were jittered in between every two adjacent B bursts. To study the effects of spectral separation on streaming, the A and B bursts were further manipulated by using either bandpass-filtered noises widely spaced in center frequency or broadband noises. Event-related potentials (ERPs) to the last B bursts were analyzed to compare the neural responses to the delay vs. no-delay trials in both passive and attentive listening conditions. In the passive listening condition, a trend for a possible late mismatch negativity (MMN) or late discriminative negativity (LDN) response was observed only when the A and B bursts were spectrally separate, suggesting that spectral separation in the A and B burst sequences could be conducive to stream segregation at the pre-attentive level. In the attentive condition, a P300 response was consistently elicited regardless of whether there was spectral separation between the A and B bursts, indicating the facilitative role of voluntary attention in stream segregation. The results suggest that reliable ERP measures can be used as indirect indicators for auditory stream segregation in conditions of weak spectral contrast. These findings have important implications for cochlear implant (CI) studies – as spectral information available through a CI device or simulation is substantially degraded, it may require more attention to achieve stream segregation.
Article
Full-text available
Enhanced alpha power compared with a baseline can reflect states of increased cognitive load, for example, when listening to speech in noise. Can knowledge about "when" to listen (temporal expectations) potentially counteract cognitive load and concomitantly reduce alpha? The current magnetoencephalography (MEG) experiment induced cognitive load using an auditory delayed-matching-to-sample task with 2 syllables S1 and S2 presented in speech-shaped noise. Temporal expectation about the occurrence of S1 was manipulated in 3 different cue conditions: "Neutral" (uninformative about foreperiod), "early-cued" (short foreperiod), and "late-cued" (long foreperiod). Alpha power throughout the trial was highest when the cue was uninformative about the onset time of S1 (neutral) and lowest for the late-cued condition. This alpha-reducing effect of late compared with neutral cues was most evident during memory retention in noise and originated primarily in the right insula. Moreover, individual alpha effects during retention accounted best for observed individual performance differences between late-cued and neutral conditions, indicating a tradeoff between allocation of neural resources and the benefits drawn from temporal cues. Overall, the results indicate that temporal expectations can facilitate the encoding of speech in noise, and concomitantly reduce neural markers of cognitive load.
Article
Full-text available
Alpha-band oscillations are the dominant oscillations in the human brain and recent evidence suggests that they have an inhibitory function. Nonetheless, there is little doubt that alpha-band oscillations also play an active role in information processing. In this article, I suggest that alpha-band oscillations have two roles (inhibition and timing) that are closely linked to two fundamental functions of attention (suppression and selection), which enable controlled knowledge access and semantic orientation (the ability to be consciously oriented in time, space, and context). As such, alpha-band oscillations reflect one of the most basic cognitive processes and can also be shown to play a key role in the coalescence of brain activity in different frequencies.
Article
Full-text available
To investigate the contributions of energetic and informational masking to neural encoding and perception in noise, oddball discrimination and sentence recognition tasks were used. P3 auditory evoked potential, behavioral discrimination, and sentence recognition data were recorded in response to speech and tonal signals presented to nine normal-hearing adults. Stimuli were presented at a signal-to-noise ratio of -3 dB in four background conditions: quiet, continuous noise, intermittent noise, and four-talker babble. Responses to tonal signals were not significantly different for the three maskers. However, responses to speech signals in the four-talker babble resulted in longer P3 latencies, smaller P3 amplitudes, poorer discrimination accuracy, and longer reaction times than in any of the other conditions. Results also demonstrate significant correlations between physiological and behavioral data. As latency of the P3 increased, reaction times also increased and sentence recognition scores decreased. The data confirm a differential effect of masker type on the P3 and behavioral responses and present evidence of interference by an informational masker with speech understanding at the level of the cortex. Results also validate the use of the P3 as a useful measure to demonstrate physiological correlates of informational masking.
Article
Full-text available
The speech signal contains many acoustic properties that may contribute differently to spoken word recognition. Previous studies have demonstrated that the importance of properties present during consonants or vowels is dependent upon the linguistic context (i.e., words versus sentences). The current study investigated three potentially informative acoustic properties that are present during consonants and vowels for monosyllabic words and sentences. Natural variations in fundamental frequency were either flattened or removed. The speech envelope and temporal fine structure were also investigated by limiting the availability of these cues via noisy signal extraction. Thus, this study investigated the contribution of these acoustic properties, present during either consonants or vowels, to overall word and sentence intelligibility. Results demonstrated that all processing conditions displayed better performance for vowel-only sentences. Greater performance with vowel-only sentences remained, despite removing dynamic cues of the fundamental frequency. Word and sentence comparisons suggest that the speech envelope may be at least partially responsible for additional vowel contributions in sentences. Results suggest that speech information transmitted by the envelope is responsible, in part, for greater vowel contributions in sentences, but is not predictive for isolated words.
Article
Full-text available
Speech scientists have long proposed that formant exaggeration in infant-directed speech plays an important role in language acquisition. This event-related potential (ERP) study investigated neural coding of formant-exaggerated speech in 6-12-month-old infants. Two synthetic /i/ vowels were presented in alternating blocks to test the effects of formant exaggeration. ERP waveform analysis showed significantly enhanced N250 for formant exaggeration, which was more prominent in the right hemisphere than the left. Time-frequency analysis indicated increased neural synchronization for processing formant-exaggerated speech in the delta band at frontal-central-parietal electrode sites as well as in the theta band at frontal-central sites. Minimum norm estimates further revealed a bilateral temporal-parietal-frontal neural network in the infant brain sensitive to formant exaggeration. Collectively, these results provide the first evidence that formant expansion in infant-directed speech enhances neural activities for phonetic encoding and language learning.
Article
Full-text available
This study investigated the relative contributions of consonants and vowels to the perceptual intelligibility of monosyllabic consonant-vowel-consonant (CVC) words. A noise replacement paradigm presented CVCs with only consonants or only vowels preserved. Results demonstrated no difference between overall word accuracy in these conditions; however, different error patterns were observed. A significant effect of lexical difficulty was demonstrated for both types of replacement, whereas the noise level used during replacement did not influence results. The contribution of consonant and vowel transitional information present at the consonant-vowel boundary was also explored. The proportion of speech presented, regardless of the segmental condition, overwhelmingly predicted performance. Comparisons were made with previous segment replacement results using sentences [Fogerty and Kewley-Port (2009). J. Acoust. Soc. Am. 126, 847-857]. Results demonstrated that consonants contribute to intelligibility equally in both isolated CVC words and sentences. However, vowel contributions were mediated by context, with greater contributions to intelligibility in sentence contexts. Therefore, it appears that vowels in sentences carry unique speech cues that greatly facilitate intelligibility which are not informative and/or present during isolated word contexts. Consonants appear to provide speech cues that are equally available and informative during sentence and isolated word presentations.
Article
Full-text available
Although research has focused on the perceptual contribution of consonants to spoken syllable or word intelligibility, in sentences vowels have a distinct perceptual advantage over consonants in determining intelligibility [Kewley-Port et al., J. Acoust. Soc. Am. 122, 2365-2375 (2007)]. The current study used a noise replacement paradigm to investigate how perceptual contributions of consonants and vowels are mediated by transitional information at segmental boundaries. The speech signal preserved between replacements is defined as a glimpse window. In the first experiment, glimpse windows contained proportional amounts of transitional boundary information that was either added to consonants or deleted from vowels. Results replicated a two-to-one vowel advantage for intelligibility at the traditional consonant-vowel boundary and suggest that vowel contributions remain robust against proportional deletions of the signal. The second experiment examined the combined effect of random glimpse windows not locked to segments and the distributions of durations measured from the consonant versus vowel glimpses observed in Experiment 1. Results demonstrated that, for random glimpses, the cumulative sentence duration glimpsed was an excellent predictor of performance. Comparisons across experiments confirmed that higher proportions of vowel information within glimpses yielded the highest sentence intelligibility.
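A toy sketch of the noise-replacement logic underlying the glimpse-window paradigm in these two studies: everything outside the preserved windows is overwritten with noise. The function name, segment boundaries, sampling rate, and noise level are placeholder assumptions; the actual studies derive the windows from phonetic segmentation of recorded sentences.

```python
import numpy as np

def noise_replace(signal, sfreq, keep_windows, noise_rms=0.05):
    """Keep signal inside keep_windows (in seconds); replace the rest with noise."""
    out = np.random.randn(signal.size) * noise_rms   # noise replacement
    for start, stop in keep_windows:
        i0, i1 = int(start * sfreq), int(stop * sfreq)
        out[i0:i1] = signal[i0:i1]                   # preserved "glimpse"
    return out

sfreq = 16000
speech = np.random.randn(sfreq)                      # 1-s placeholder signal
vowel_windows = [(0.10, 0.25), (0.40, 0.60)]         # hypothetical vowel spans
vowels_only = noise_replace(speech, sfreq, vowel_windows)
```

Swapping in consonant spans, or windows shifted to include transitional boundary information, reproduces the segmental manipulations these abstracts compare.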
Article
Full-text available
This study provides new evidence of deficient auditory cortical processing of speech in noise in autism spectrum disorders (ASD). Speech-evoked responses (approximately 100-300 ms) in quiet and background noise were evaluated in typically-developing (TD) children and children with ASD. ASD responses showed delayed timing (both conditions) and reduced amplitudes (quiet) compared to TD responses. As expected, TD responses in noise were delayed and reduced compared to quiet responses. However, minimal quiet-to-noise response differences were found in children with ASD, presumably because quiet responses were already severely degraded. Moreover, ASD quiet responses resembled TD noise responses, implying that children with ASD process speech in quiet only as well as TD children do in background noise.
Article
Full-text available
The P300 wave is a positive deflection in the human event-related potential. It is most commonly elicited in an "oddball" paradigm when a subject detects an occasional "target" stimulus in a regular train of standard stimuli. The P300 wave only occurs if the subject is actively engaged in the task of detecting the targets. Its amplitude varies with the improbability of the targets. Its latency varies with the difficulty of discriminating the target stimulus from the standard stimuli. A typical peak latency when a young adult subject makes a simple discrimination is 300 ms. In patients with decreased cognitive ability, the P300 is smaller and later than in age-matched normal subjects. The intracerebral origin of the P300 wave is not known and its role in cognition not clearly understood. The P300 may have multiple intracerebral generators, with the hippocampus and various association areas of the neocortex all contributing to the scalp-recorded potential. The P300 wave may represent the transfer of information to consciousness, a process that involves many different regions of the brain.
Article
Objectives: The objectives of this study were to investigate the effects of hearing aid use and the effectiveness of ReadMyQuips (RMQ), an auditory training program, on speech perception performance and auditory selective attention using electrophysiological measures. RMQ is an audiovisual training program designed to improve speech perception in everyday noisy listening environments. Design: Participants were adults with mild to moderate hearing loss who were first-time hearing aid users. After 4 weeks of hearing aid use, the experimental group completed RMQ training in 4 weeks, and the control group received listening practice on audiobooks during the same period. Cortical late event-related potentials (ERPs) and the Hearing in Noise Test (HINT) were administered at prefitting, pretraining, and post-training to assess effects of hearing aid use and RMQ training. An oddball paradigm allowed tracking of changes in P3a and P3b ERPs to distractors and targets, respectively. Behavioral measures were also obtained while ERPs were recorded from participants. Results: After 4 weeks of hearing aid use but before auditory training, HINT results did not show a statistically significant change, but there was a significant P3a reduction. This reduction in P3a was correlated with improvement in d prime (d′) in the selective attention task. Increased P3b amplitudes were also correlated with improvement in d′ in the selective attention task. After training, this correlation between P3b and d′ remained in the experimental group, but not in the control group. Similarly, HINT testing showed improved speech perception post-training only in the experimental group. The criterion calculated in the auditory selective attention task showed a reduction only in the experimental group after training. ERP measures in the auditory selective attention task did not show any changes related to training. Conclusions: Hearing aid use was associated with a decrement in involuntary attention switch to distractors in the auditory selective attention task. RMQ training led to gains in speech perception in noise and improved listener confidence in the auditory selective attention task.
Article
Objectives: Speech perception in background noise is difficult for many individuals, and there is considerable performance variability across listeners. The combination of physiological and behavioral measures may help to understand sources of this variability for individuals and groups and prove useful clinically with hard-to-test populations. The purpose of this study was threefold: (1) determine the effect of signal-to-noise ratio (SNR) and signal level on cortical auditory evoked potentials (CAEPs) and sentence-level perception in older normal-hearing (ONH) and older hearing-impaired (OHI) individuals, (2) determine the effects of hearing impairment and age on CAEPs and perception, and (3) explore how well CAEPs correlate with and predict speech perception in noise. Design: Two groups of older participants (15 ONH and 15 OHI) were tested using speech-in-noise stimuli to measure CAEPs and sentence-level perception of speech. The syllable /ba/, used to evoke CAEPs, and sentences were presented in speech-spectrum background noise at four signal levels (50, 60, 70, and 80 dB SPL) and up to seven SNRs (-10, -5, 0, 5, 15, 25, and 35 dB). These data were compared between groups to reveal the hearing impairment effect and then combined with previously published data for 15 young normal-hearing individuals to determine the aging effect. Results: Robust effects of SNR were found for perception and CAEPs. Small but significant effects of signal level were found for perception, primarily at poor SNRs and high signal levels, and in some limited instances for CAEPs. Significant effects of age were seen for both CAEPs and perception, while hearing impairment effects were only found with perception measures. CAEPs correlate well with perception and can predict SNR50s to within 2 dB for ONH. However, prediction error is much larger for OHI and varies widely (from 6 to 12 dB) depending on the model that was used for prediction. Conclusions: When background noise is present, SNR dominates both perception-in-noise testing and cortical electrophysiological testing, with smaller and sometimes significant contributions from signal level. A mismatch between behavioral and electrophysiological results was found (hearing impairment effects were primarily only seen for behavioral data), illustrating the possible contributions of higher order cognitive processes on behavior. It is interesting that the hearing impairment effect size was more than five times larger than the aging effect size for CAEPs and perception. Sentence-level perception can be predicted well in normal-hearing individuals; however, additional research is needed to explore improved prediction methods for older individuals with hearing impairment.
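The study compared several prediction models; as a simplified stand-in, the sketch below fits an ordinary least-squares model predicting SNR50 from two hypothetical CAEP measures (all numbers are made-up placeholders for illustration, not data from the study):

```python
# OLS sketch: predict behavioral SNR50 (dB) from CAEP N1 latency/amplitude.
# Every value below is a hypothetical placeholder, not study data.
import numpy as np

n1_latency = np.array([105.0, 112.0, 120.0, 131.0, 142.0])   # ms
n1_amplitude = np.array([-6.1, -5.4, -4.8, -3.9, -3.1])      # microvolts
snr50 = np.array([-4.0, -3.1, -2.0, -0.8, 0.5])              # dB

X = np.column_stack([np.ones_like(snr50), n1_latency, n1_amplitude])
beta, *_ = np.linalg.lstsq(X, snr50, rcond=None)             # fit coefficients
rmse = np.sqrt(np.mean((X @ beta - snr50) ** 2))             # prediction error (dB)
print(f"coefficients: {beta}, RMSE: {rmse:.2f} dB")
```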
Article
Noise, as an unwanted sound, has become one of modern society's environmental conundrums, and many children are exposed to higher noise levels than previously assumed. However, the effects of background noise on central auditory processing of toddlers, who are still acquiring language skills, have so far not been determined. The authors evaluated the effects of background noise on toddlers' speech-sound processing by recording event-related brain potentials. The hypothesis was that background noise modulates neural speech-sound encoding and degrades speech-sound discrimination. Obligatory P1 and N2 responses for standard syllables and the mismatch negativity (MMN) response for five different syllable deviants presented in a linguistic multifeature paradigm were recorded in silent and background noise conditions. The participants were 18 typically developing 22- to 26-month-old monolingual children with healthy ears. The results showed that the P1 amplitude was smaller and the N2 amplitude larger in the noisy conditions compared with the silent conditions. In the noisy condition, the MMN was absent for the intensity and vowel changes and diminished for the consonant, frequency, and vowel duration changes embedded in speech syllables. Furthermore, the frontal MMN component was attenuated in the noisy condition. However, noise had no effect on P1, N2, or MMN latencies. The results from this study suggest multiple effects of background noise on the central auditory processing of toddlers. It modulates the early stages of sound encoding and dampens neural discrimination vital for accurate speech perception. These results imply that speech processing of toddlers, who may spend long periods of daytime in noisy conditions, is vulnerable to background noise. In noisy conditions, toddlers' neural representations of some speech sounds might be weakened. Thus, special attention should be paid to acoustic conditions and background noise levels in children's daily environments, like day-care centers, to ensure a propitious setting for linguistic development. In addition, the evaluation and improvement of daily listening conditions should be an ordinary part of clinical intervention of children with linguistic problems.
Article
This study investigated the effects of decreased audibility produced by high-pass noise masking on cortical event-related potentials (ERPs) N1, N2, and P3 to the speech sounds /ba/ and /da/ presented at 65 and 80 dB SPL. Normal-hearing subjects pressed a button in response to the deviant sound in an oddball paradigm. Broadband masking noise was presented at an intensity sufficient to completely mask the response to the 65-dB SPL speech sounds, and subsequently high-pass filtered at 4000, 2000, 1000, 500, and 250 Hz. With high-pass masking noise, pure-tone behavioral thresholds increased by an average of 38 dB at the high-pass cutoff and by 50 dB one octave above the cutoff frequency. Results show that as the cutoff frequency of the high-pass masker was lowered, ERP latencies to speech sounds increased and amplitudes decreased. The cutoff frequency where these changes first occurred and the rate of the change differed for N1 compared to N2, P3, and the behavioral measures. N1 showed gradual changes as the masker cutoff frequency was lowered. N2, P3, and behavioral measures showed marked changes below a masker cutoff of 2000 Hz. These results indicate that the decreased audibility resulting from the noise masking affects the various ERP components in a differential manner. N1 is related to the presence of audible stimulus energy, being present whether audible stimuli are discriminable or not. In contrast, N2 and P3 were absent when the stimuli were audible but not discriminable (i.e., when the second formant transitions were masked), reflecting stimulus discrimination. These data have implications regarding the effects of decreased audibility on cortical processing of speech sounds and for the study of cortical ERPs in populations with hearing impairment.
Article
Objective: To establish reliable procedures and normative values to quantify brainstem encoding of speech sounds (http://www.communication.northwestern.edu/csd/research/brainvolts). Methods: Auditory brainstem responses to speech syllables presented in quiet and in background noise were obtained from 38 normal children. Brainstem responses consist of transient and sustained, periodic components—much like the speech signal itself. Transient peak responses were analyzed with measures of latency, amplitude, area, and slope. Magnitude of sustained, periodic frequency-following responses was assessed with root mean square, fundamental frequency, and first formant amplitudes; timing was assessed by stimulus-to-response and quiet-to-noise inter-response correlations. Results: Measures of transient and sustained components of the brainstem response to speech syllables were reliably obtained with high test–retest stability and low variability across subjects. All components of the brainstem response were robust in quiet. Background noise disrupted the transient responses whereas the sustained response was more resistant to the deleterious effects of noise. Conclusions: The speech-evoked brainstem response faithfully reflects many acoustic properties of the speech signal. Procedures to quantitatively describe it have been developed. Significance: Accurate and precise manifestation of stimulus timing at the auditory brainstem is a hallmark of the normal perceptual system. The brainstem response to speech sounds provides a mechanism for understanding the neural bases of normal and deficient attention-independent auditory function.
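The quiet-to-noise inter-response correlation used above is essentially a (possibly lag-tolerant) Pearson correlation between the averaged response in quiet and the averaged response in noise. Below is a minimal sketch on simulated waveforms, with an assumed lag range rather than the authors' exact settings:

```python
# Lag-tolerant quiet-to-noise waveform correlation on simulated responses.
# The waveforms, imposed delay, and lag range are illustrative assumptions.
import numpy as np

fs = 10000
t = np.arange(0, 0.04, 1 / fs)                           # 40 ms response window
rng = np.random.default_rng(7)
quiet = np.sin(2 * np.pi * 100 * t) + rng.normal(0, 0.2, t.size)
noise = np.roll(quiet, 5) + rng.normal(0, 0.4, t.size)   # delayed, noisier copy

def best_lagged_r(x, y, max_lag=20):
    """Maximum Pearson r over integer sample lags of y relative to x."""
    return max(np.corrcoef(x[:-lag or None], y[lag:])[0, 1]
               for lag in range(max_lag + 1))

print(f"quiet-to-noise r = {best_lagged_r(quiet, noise):.2f}")
```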
Article
Although brainstem dys-synchrony is a hallmark of children with auditory neuropathy spectrum disorder (ANSD), little is known about how the lack of neural synchrony manifests at more central levels. We used time-frequency single-trial EEG analyses (i.e., inter-trial coherence; ITC), to examine cortical phase synchrony in children with normal hearing (NH), sensorineural hearing loss (SNHL) and ANSD. Single trial time-frequency analyses were performed on cortical auditory evoked responses from 41 NH children, 91 children with ANSD and 50 children with SNHL. The latter two groups included children who received intervention via hearing aids and cochlear implants. ITC measures were compared between groups as a function of hearing loss, intervention type, and cortical maturational status. In children with SNHL, ITC decreased as severity of hearing loss increased. Children with ANSD revealed lower levels of ITC relative to children with NH or SNHL, regardless of intervention. Children with ANSD who received cochlear implants showed significant improvements in ITC with increasing experience with their implants. Cortical phase coherence is significantly reduced as a result of both severe-to-profound SNHL and ANSD. ITC provides a window into the brain oscillations underlying the averaged cortical auditory evoked response. Our results provide a first description of deficits in cortical phase synchrony in children with SNHL and ANSD.
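Inter-trial coherence itself has a compact definition: extract the phase of each single-trial response at a given time-frequency point and take the magnitude of the mean unit phase vector across trials (1 = perfect phase locking, 0 = random phase). A minimal sketch using a hand-rolled Morlet wavelet on simulated epochs, not the authors' pipeline:

```python
# ITC sketch: convolve each trial with a complex Morlet wavelet, keep only
# the phase, and average the unit phase vectors across trials.
# Data and wavelet settings are simulated/assumed.
import numpy as np

def morlet(fs, freq, n_cycles=3):
    """Complex Morlet wavelet at `freq` Hz."""
    sigma_t = n_cycles / (2 * np.pi * freq)
    t = np.arange(-4 * sigma_t, 4 * sigma_t, 1 / fs)
    return np.exp(2j * np.pi * freq * t) * np.exp(-t ** 2 / (2 * sigma_t ** 2))

fs, n_trials, n_samples = 500, 60, 500
rng = np.random.default_rng(1)
trials = rng.normal(size=(n_trials, n_samples))        # simulated EEG epochs
time = np.arange(n_samples) / fs
trials += np.sin(2 * np.pi * 5 * time)                 # phase-locked 5 Hz signal

w = morlet(fs, freq=5.0)
analytic = np.array([np.convolve(tr, w, mode="same") for tr in trials])
itc = np.abs(np.mean(analytic / np.abs(analytic), axis=0))  # per time point
print(f"peak ITC at 5 Hz: {itc.max():.2f}")
```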
Article
Speech perception in background noise is a common challenge across individuals and health conditions (e.g., hearing impairment, aging, etc.). Both behavioral and physiological measures have been used to understand the important factors that contribute to perception-in-noise abilities. The addition of a physiological measure provides additional information about signal-in-noise encoding in the auditory system and may be useful in clarifying some of the variability in perception-in-noise abilities across individuals. Fifteen young normal-hearing individuals were tested using both electrophysiology and behavioral methods as a means to determine (1) the effects of signal-to-noise ratio (SNR) and signal level and (2) how well cortical auditory evoked potentials (CAEPs) can predict perception in noise. Three correlation/regression approaches were used to determine how well CAEPs predicted behavior. Main effects of SNR were found for both electrophysiology and speech perception measures, while signal level effects were found generally only for speech testing. These results demonstrate that when signals are presented in noise, sensitivity to SNR cues obscures any encoding of signal level cues. Electrophysiology and behavioral measures were strongly correlated. The best physiological predictors (e.g., latency, amplitude, and area of CAEP waves) of behavior (SNR at which 50 % of the sentence is understood) were N1 latency and N1 amplitude measures. In addition, behavior was best predicted by the 70-dB signal/5-dB SNR CAEP condition. It will be important in future studies to determine the relationship of electrophysiology and behavior in populations who experience difficulty understanding speech in noise such as those with hearing impairment or age-related deficits.
Article
The perception of vowels heard in noises of various spectra is analyzed by means of stimulus-response matrices. The stimulus vowels were spoken in PB-word lists and in syllable lists in which the vowels were equally probable. The matrices show shifts in vowel confusions depending on how different noise spectra mask the vowel formants. Vowel duration and intensity are measured and related to vowel perception. Vowel guessing is related to past training.
Article
Research has shown that the amplitude and latency of neural responses to passive mismatch negativity (MMN) tasks are affected by noise (Billings et al., 2010). Further studies have revealed that informational masking noise results in decreased P3 amplitude and increased P3 latency, which correlates with decreased discrimination abilities and reaction time (Bennett et al., 2012). This study aims to further investigate neural processing of speech in differing types of noise by attempting to correlate MMN neural responses to consonant and vowel stimuli with results from behavioral sentence recognition tasks. Preliminary behavioral data indicate that noise conditions significantly compromise the perception of consonant change in an oddball discrimination task. Noise appears to have less of an effect on the perception of vowel change. The MMN data are being collected for the detection of consonant change and vowel change in different noise conditions. The results will be examined to address how well the pre-attentive MMN measures at the phonemic level can predict speech intelligibility at the sentence level using the same noise conditions.
Article
The current methodological policy in Psychophysiology stipulates that repeated-measures designs be analyzed using either multivariate analysis of variance (ANOVA) or repeated-measures ANOVA with the Greenhouse–Geisser or Huynh–Feldt correction. Both techniques lead to appropriate type I error probabilities under general assumptions about the variance-covariance matrix of the data. This report introduces mixed-effects models as an alternative procedure for the analysis of repeated-measures data in Psychophysiology. Mixed-effects models have many advantages over the traditional methods: They handle missing data more effectively and are more efficient, parsimonious, and flexible. We described mixed-effects modeling and illustrated its applicability with a simple example.
Article
This paper provides an introduction to mixed-effects models for the analysis of repeated measurement data with subjects and items as crossed random effects. A worked-out example of how to use recent software for mixed-effects modeling is provided. Simulation studies illustrate the advantages offered by mixed-effects analyses compared to traditional analyses based on quasi-F tests, by-subjects analyses, combined by-subjects and by-items analyses, and random regression. Applications and possibilities across a range of domains of inquiry are discussed.
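For orientation, here is a minimal Python sketch of a mixed-effects analysis of repeated measures using statsmodels' MixedLM with a random intercept per subject; fully crossed subject-and-item random effects, as discussed in the paper, are more naturally written in R's lme4 (e.g., y ~ condition + (1|subject) + (1|item)). The data frame below is simulated for illustration.

```python
# Mixed-effects sketch: random intercept per subject via statsmodels MixedLM.
# The design (16 subjects x 4 conditions) and all values are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
subjects = np.repeat(np.arange(16), 4)
condition = np.tile(["quiet_C", "quiet_V", "noise_C", "noise_V"], 16)
noise_penalty = 1.5 * np.array([c.startswith("noise") for c in condition])
amplitude = rng.normal(4.0, 1.0, subjects.size) - noise_penalty

df = pd.DataFrame({"subject": subjects, "condition": condition,
                   "p3_amplitude": amplitude})
model = smf.mixedlm("p3_amplitude ~ condition", df, groups=df["subject"])
print(model.fit().summary())
```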
Article
Sixteen English consonants were spoken over voice communication systems with frequency distortion and with random masking noise. The listeners were forced to guess at every sound and a count was made of all the different errors that resulted when one sound was confused with another. With noise or low‐pass filtering the confusions fall into consistent patterns, but with high‐pass filtering the errors are scattered quite randomly. An articulatory analysis of these 16 consonants provides a system of five articulatory features or “dimensions” that serve to characterize and distinguish the different phonemes: voicing, nasality, affrication, duration, and place of articulation. The data indicate that voicing and nasality are little affected and that place is severely affected by low‐pass and noisy systems. The indications are that the perception of any one of these five features is relatively independent of the perception of the others, so that it is as if five separate, simple channels were involved rather than a single complex channel.
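The analysis above rests on confusion-matrix bookkeeping: tally stimulus-response pairs, then collapse the matrix by an articulatory feature to see how often that feature survives even when the phoneme itself is confused. The sketch below shows that bookkeeping for voicing with toy trial data; it is a simplification of the full information-transmission analysis used in the original work.

```python
# Confusion-matrix sketch: count (stimulus, response) pairs, then check how
# often the voicing feature is preserved. Trials and feature map are toy data.
from collections import Counter

trials = [("b", "b"), ("b", "d"), ("p", "p"), ("p", "b"),
          ("d", "d"), ("t", "t"), ("t", "d"), ("d", "t")]  # (stimulus, response)
voiced = {"b": True, "d": True, "p": False, "t": False}

confusions = Counter(trials)

# A trial "preserves voicing" if stimulus and response share the same voicing
# value, even when the phoneme itself was confused (e.g., /b/ heard as /d/).
preserved = sum(n for (s, r), n in confusions.items() if voiced[s] == voiced[r])
total = sum(confusions.values())
print(f"voicing preserved on {preserved}/{total} trials")
```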
Article
We investigated a neural basis of speech-in-noise perception in older adults. Hearing loss, the third most common chronic condition in older adults, is most often manifested by difficulty understanding speech in background noise. This trouble with understanding speech in noise, which occurs even in individuals who have normal-hearing thresholds, may arise, in part, from age-related declines in central auditory processing of the temporal and spectral components of speech. We hypothesized that older adults with poorer speech-in-noise (SIN) perception demonstrate impairments in the subcortical representation of speech. In all participants (28 adults, age 60-73 yr), average hearing thresholds calculated from 500 to 4000 Hz were ≤ 25 dB HL. The participants were evaluated behaviorally with the Hearing in Noise Test (HINT) and neurophysiologically using speech-evoked auditory brainstem responses recorded in quiet and in background noise. The participants were divided based on their HINT scores into top and bottom performing groups that were matched for audiometric thresholds and intelligence quotient. We compared brainstem responses in the two groups, specifically, the average spectral magnitudes of the neural response and the degree to which background noise affected response morphology. In the quiet condition, the bottom SIN group had reduced neural representation of the fundamental frequency of the speech stimulus and an overall reduction in response magnitude. In the noise condition, the bottom SIN group demonstrated greater disruption in noise, reflecting reduction in neural synchrony. The role of brainstem timing is particularly evident in the strong relationship between SIN perception and quiet-to-noise response correlations. All physiologic measures correlated with SIN perception. Adults in the bottom SIN group differed from the audiometrically matched top SIN group in how speech was neurally encoded. The strength of subcortical encoding of the fundamental frequency appears to be a factor in successful speech-in-noise perception in older adults. Given the limitations of amplification, our results suggest the need for inclusion of auditory training to strengthen central auditory processing in older adults with SIN perception difficulties.
Article
To advance our understanding of the biological basis of speech-in-noise perception, we investigated the effects of background noise on both subcortical- and cortical-evoked responses, and the relationships between them, in normal hearing young adults. The addition of background noise modulated subcortical and cortical response morphology. In noise, subcortical responses were later, smaller in amplitude and demonstrated decreased neural precision in encoding the speech sound. Cortical responses were also delayed by noise, yet the amplitudes of the major peaks (N1, P2) were affected differently, with N1 increasing and P2 decreasing. Relationships between neural measures and speech-in-noise ability were identified, with earlier subcortical responses, higher subcortical response fidelity and greater cortical N1 response magnitude all relating to better speech-in-noise perception. Furthermore, it was only with the addition of background noise that relationships between subcortical and cortical encoding of speech and the behavioral measures of speech in noise emerged. Results illustrate that human brainstem responses and N1 cortical response amplitude reflect coordinated processes with regards to the perception of speech in noise, thereby acting as a functional index of speech-in-noise perception.
Article
Perception-in-noise deficits have been demonstrated across many populations and listening conditions. Many factors contribute to successful perception of auditory stimuli in noise, including neural encoding in the central auditory system. Physiological measures such as cortical auditory-evoked potentials (CAEPs) can provide a view of neural encoding at the level of the cortex that may inform our understanding of listeners' abilities to perceive signals in the presence of background noise. To understand signal-in-noise neural encoding better, we set out to determine the effect of signal type, noise type, and evoking paradigm on the P1-N1-P2 complex. Tones and speech stimuli were presented to nine individuals in quiet and in three background noise types: continuous speech spectrum noise, interrupted speech spectrum noise, and four-talker babble at a signal-to-noise ratio of -3 dB. In separate sessions, CAEPs were evoked by a passive homogenous paradigm (single repeating stimulus) and an active oddball paradigm. The results for the N1 component indicated significant effects of signal type, noise type, and evoking paradigm. Although components P1 and P2 also had significant main effects of these variables, only P2 demonstrated significant interactions among these variables. Signal type, noise type, and evoking paradigm all must be carefully considered when interpreting signal-in-noise evoked potentials. Furthermore, these data confirm the possible usefulness of CAEPs as an aid to understand perception-in-noise deficits.
Article
Children often have difficulty understanding speech in challenging listening environments. In the absence of peripheral hearing loss, these speech perception difficulties may arise from dysfunction at more central levels in the auditory system, including subcortical structures. We examined brainstem encoding of pitch in a speech syllable in 38 school-age children. In children with poor speech-in-noise perception, we find impaired encoding of the fundamental frequency and the second harmonic, two important cues for pitch perception. Pitch, an essential factor in speaker identification, aids the listener in tracking a specific voice from a background of voices. These results suggest that the robustness of subcortical neural encoding of pitch features in time-varying signals is a key factor in determining success with perceiving speech in noise.
Article
The presence of irrelevant auditory information (other talkers, environmental noises) presents a major challenge to listening to speech. The fundamental frequency (F0) of the target speaker is thought to provide an important cue for the extraction of the speaker's voice from background noise, but little is known about the relationship between speech-in-noise (SIN) perceptual ability and neural encoding of the F0. Motivated by recent findings that music and language experience enhance brainstem representation of sound, we examined the hypothesis that brainstem encoding of the F0 is diminished to a greater degree by background noise in people with poorer perceptual abilities in noise. To this end, we measured speech-evoked auditory brainstem responses to /da/ in quiet and two multitalker babble conditions (two-talker and six-talker) in native English-speaking young adults who ranged in their ability to perceive and recall SIN. Listeners who were poorer performers on a standardized SIN measure demonstrated greater susceptibility to the degradative effects of noise on the neural encoding of the F0. Particularly diminished was their phase-locked activity to the fundamental frequency in the portion of the syllable known to be most vulnerable to perceptual disruption (i.e., the formant transition period). Our findings suggest that the subcortical representation of the F0 in noise contributes to the perception of speech in noisy conditions.
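Quantifying F0 encoding of this kind usually reduces to reading the spectral magnitude of the averaged response near the stimulus F0 over a time window of interest. A minimal sketch on a simulated frequency-following response, with an assumed 100 Hz F0 and illustrative window bounds:

```python
# F0-encoding sketch: FFT the response over the formant-transition window and
# read the magnitude near the stimulus F0. Signal and windows are simulated.
import numpy as np

fs = 10000
t = np.arange(0, 0.17, 1 / fs)                       # 170 ms /da/-like epoch
rng = np.random.default_rng(3)
response = np.sin(2 * np.pi * 100 * t) + rng.normal(0, 1, t.size)

# Restrict to the (assumed) formant-transition portion, e.g., 20-60 ms.
seg = response[(t >= 0.02) & (t < 0.06)]
spectrum = np.abs(np.fft.rfft(seg)) / seg.size
freqs = np.fft.rfftfreq(seg.size, 1 / fs)

f0_band = (freqs >= 90) & (freqs <= 110)             # +/-10 Hz around 100 Hz F0
print(f"F0 magnitude: {spectrum[f0_band].mean():.3f}")
```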
Article
To investigate the effects of three articulatory features of speech (i.e., vowel-space contrast, place of articulation of stop consonants, and voiced/voiceless distinctions) on cortical event-related potentials (ERPs) (waves N1, mismatch negativity, N2b, and P3b) and their related behavioral measures of discrimination (d-prime sensitivity and reaction time [RT]) in normal-hearing adults to increase our knowledge regarding how the brain responds to acoustical differences that occur within an articulatory speech feature and across articulatory features of speech. Cortical ERPs were recorded to three sets of consonant-vowel speech stimuli (/bi/ versus /bu/, /ba/ versus /da/, /da/ versus /ta/) presented at 65 and 80 dB peak-to-peak equivalent SPL from 20 normal-hearing adults. All speech stimuli were presented in an oddball paradigm. Cortical ERPs were recorded from 10 individuals in the active-listening condition and another 10 individuals in the passive-listening condition. All listeners were tested at both stimulus intensities. Mean amplitudes for all ERP components were considerably larger for the responses to the vowel contrast in comparison with the responses to the two consonant contrasts. Similarly, the mean mismatch negativity, P3b, and RT latencies were significantly shorter for the responses to the vowel versus consonant contrasts. For the majority of ERP components, only small nonsignificant differences occurred in either the ERP amplitude or the latency response measurements for stimuli within a particular articulatory feature of speech. The larger response amplitudes and earlier latencies for the cortical ERPs to the vowel versus consonant stimuli are likely related, in part, to the large spectral differences present in these speech contrasts. The measurements of response strength (amplitudes and d-prime scores) and response timing (ERP and RT latencies) for the various cortical ERPs suggest that the brain may have an easier task processing the steady state information present in the vowel stimuli in comparison with the rapidly changing formant transitions in the consonant stimuli.
Article
Understanding speech in background noise is challenging for every listener, including those with normal peripheral hearing. This difficulty is attributable in part to the disruptive effects of noise on neural synchrony, resulting in degraded representation of speech at cortical and subcortical levels as reflected by electrophysiological responses. These problems are especially pronounced in clinical populations such as children with learning impairments. Given the established effects of noise on evoked responses, we hypothesized that listening-in-noise problems are associated with degraded processing of timing information at the brainstem level. Participants (66 children; ages, 8-14 years; 22 females) were divided into groups based on their performance on clinical measures of speech-in-noise (SIN) perception and reading. We compared brainstem responses to speech syllables between top and bottom SIN and reading groups in the presence and absence of competing multitalker babble. In the quiet condition, neural response timing was equivalent between groups. In noise, however, the bottom groups exhibited greater neural delays relative to the top groups. Group-specific timing delays occurred exclusively in response to the noise-vulnerable formant transition, not to the more perceptually robust, steady-state portion of the stimulus. These results demonstrate that neural timing is disrupted by background noise and that greater disruptions are associated with the inability to perceive speech in challenging listening conditions.
Article
Recent findings indicate that the electroencephalographic alpha (7-14 Hz) activity is functionally involved in cognitive brain functioning, but the issue of whether and how event-related alpha oscillations may relate to the processes indexed by the P300 component of the event-related brain potentials (ERPs) has not been addressed. The present study assessed the effect of auditory oddball task processing on slow (7-10 Hz) and fast (10-14 Hz) alpha activity from the P300 latency range. ERPs from mentally counted targets (20%) and not counted nontargets (80%) were recorded at Fz, Cz, and Pz in nine subjects. Single-sweep phase-locking, power of phase-locked, and power of non-phase-locked alpha responses during P300 activity were quantified. The results demonstrated that larger and more synchronized phase-locked fast alpha components at anterior (frontal-central) locations, with reduced non-phase-locked slow alpha responses at the parietal site, were produced by targets relative to nontargets. Because the simultaneously recorded P300 and alpha activity manifested a similar sensitivity to the oddball task, event-related alpha appears to be functionally associated with the cognitive processing demands eliciting P300. Also, evidence is provided for the functional involvement of frontally synchronized and enhanced alpha oscillations in task processing.
Article
One key issue for any computational model of visual-word recognition is the choice of an input coding scheme for assigning letter position. Recent research has shown that pseudowords created by transposing two letters are very effective at activating the lexical representation of their base words (e.g., relovution activates REVOLUTION). We report a masked priming lexical decision experiment in which the pseudoword primes were created by transposing/replacing two consonants or two vowels while event-related potentials were recorded. The results showed a modulation of the amplitude at an early window (150-250 ms) and at the N400 component for vowels but not for consonant transpositions. In addition, the peak latencies were faster for transposed than replaced consonants. These results suggest that consonants and vowels play a different role during the process of visual word recognition. We examine the implications for the choice of an input coding scheme in models of visual-word recognition.
Article
The present paper combines a review of event-related potentials (ERPs) with empirical data concerning the question: what are the differences between auditory evoked potentials (EPs) and two types of ERPs with respect to their frequency components? In this study, auditory EPs were elicited by 1500 Hz tones. The first type of ERPs was responses to 3rd attended tones in an omitted stimulus paradigm where every 4th stimulus was omitted. The second type of ERPs was responses to rare 1600 Hz tones in an oddball paradigm. The amplitudes of delta and theta components of EPs and ERPs showed significant differences: in responses to 3rd attended tones there was a significant increase in the theta frequency band (frontal and parietal locations; 0-250 ms). In the delta frequency band there was no significant change. In contrast, a diffuse delta increase occurred in oddball responses and an additional prolongation of theta oscillations was observed (late theta response: 250-500 ms). These results are discussed in the context of ERPs as induced rhythmicities. The intracranial sources of ERPs, their psychological correlates and the role of theta rhythms in the cortico-hippocampal interaction are reviewed. From these results and from the literature a working hypothesis is derived assuming that delta responses are mainly involved in signal matching, decision making and surprise, whereas theta responses are more related to focused attention and signal detection.
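Isolating the delta and theta components of an averaged response, as in this kind of analysis, can be done with zero-phase band-pass filters. A minimal sketch on a simulated waveform (band edges and filter order are conventional assumptions, not the authors' exact settings):

```python
# Band-decomposition sketch: split a simulated averaged ERP into delta and
# theta components with zero-phase Butterworth band-pass filters.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 500
t = np.arange(0, 0.5, 1 / fs)
rng = np.random.default_rng(6)
erp = (np.sin(2 * np.pi * 2 * t) + 0.5 * np.sin(2 * np.pi * 6 * t)
       + rng.normal(0, 0.1, t.size))

def bandpass(x, lo, hi, fs, order=3):
    """Zero-phase band-pass between lo and hi Hz."""
    b, a = butter(order, [lo, hi], btype="band", fs=fs)
    return filtfilt(b, a, x)

delta = bandpass(erp, 0.5, 4, fs)    # delta band (0.5-4 Hz, assumed edges)
theta = bandpass(erp, 4, 8, fs)      # theta band (4-8 Hz, assumed edges)
print(np.abs(delta).max(), np.abs(theta).max())
```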
Article
We confirm that the latency of the P300 component of the human event-related potential is determined by processes involved in stimulus evaluation and categorization and is relatively independent of response selection and execution. Stimulus discriminability and stimulus-response compatibility were manipulated independently in an "additive-factors" design. Choice reaction time and P300 latency were obtained simultaneously for each trial. Although reaction time was affected by both discriminability and stimulus-response compatibility, P300 latency was affected only by stimulus discriminability.
Article
A new measure of event-related brain dynamics, the event-related spectral perturbation (ERSP), is introduced to study event-related dynamics of the EEG spectrum induced by, but not phase-locked to, the onset of the auditory stimuli. The ERSP reveals aspects of event-related brain dynamics not contained in the ERP average of the same response epochs. Twenty-eight subjects participated in daily auditory evoked response experiments during a 4-day study of the effects of 24 h free-field exposure to intermittent trains of 89 dB low frequency tones. During evoked response testing, the same tones were presented through headphones in random order at 5 sec intervals. No significant changes in behavioral thresholds occurred during or after free-field exposure. ERSPs induced by target pips presented in some inter-tone intervals were larger than, but shared common features with, ERSPs induced by the tones, most prominently a ridge of augmented EEG amplitude from 11 to 18 Hz, peaking 1-1.5 sec after stimulus onset. Following 3-11 h of free-field exposure, this feature was significantly smaller in tone-induced ERSPs; target-induced ERSPs were not similarly affected. These results, therefore, document systematic effects of exposure to intermittent tones on EEG brain dynamics even in the absence of changes in auditory thresholds.
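In essence, an ERSP is trial-averaged time-frequency power expressed relative to a pre-stimulus baseline. The sketch below computes a baseline-normalized dB map from simulated epochs using a short-time Fourier transform; the original work used its own spectral estimation, so treat this only as the general recipe:

```python
# ERSP sketch: STFT power per trial, averaged across trials, in dB relative
# to the mean pre-stimulus baseline at each frequency. Data are simulated.
import numpy as np
from scipy.signal import stft

fs, n_trials, n_samples = 250, 40, 750          # 3 s epochs; first 1 s = baseline
rng = np.random.default_rng(4)
trials = rng.normal(size=(n_trials, n_samples))
t = np.arange(n_samples) / fs - 1.0             # time zero = stimulus onset
trials += (t > 0) * np.sin(2 * np.pi * 15 * t)  # add post-stimulus 15 Hz activity

freqs, times, Z = stft(trials, fs=fs, nperseg=125, axis=-1)
power = np.mean(np.abs(Z) ** 2, axis=0)         # trial-averaged power (freq x time)
baseline = power[:, times < 1.0].mean(axis=1, keepdims=True)
ersp_db = 10 * np.log10(power / baseline)       # dB change from baseline
print(ersp_db.shape)                            # (n_freqs, n_times)
```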
Article
Background EEG was recorded from 24 subjects under eyes open/closed conditions for 5 min. The P3(00) event-related brain potential (ERP) was elicited with auditory stimuli when eyes were open/closed in the same subjects. Target stimulus probability was manipulated (0.20, 0.40, 0.60, 0.80) in different blocks under each eyes open/closed condition. Spectral analysis indicated that EEG power between 8 and 12 Hz demonstrated a similar scalp distribution as the P3 component of the ERP for the electrode sites employed. Spectral power and mean frequency were modestly correlated with P3 amplitude and peak latency primarily in the slower EEG bands, with associations observed across probability conditions and often strongest when target stimulus probability was .20. The results suggest that differences between individuals for EEG variation may contribute to P3 component variability, especially at the parietal recording site and under low target stimulus probability conditions when the P3 is largest and most stable.
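The band-power side of such correlations is straightforward to compute; as a minimal sketch, the code below estimates 8-12 Hz power from a simulated eyes-closed recording with Welch's method (duration, sampling rate, and band edges are illustrative):

```python
# Welch estimate of 8-12 Hz EEG power from a simulated 5 min recording.
# Sampling rate, duration, and band edges are illustrative assumptions.
import numpy as np
from scipy.signal import welch

fs = 250
rng = np.random.default_rng(5)
t = np.arange(0, 300, 1 / fs)                          # 5 minutes of signal
eeg = rng.normal(0, 1, t.size) + 2 * np.sin(2 * np.pi * 10 * t)

freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)         # 0.5 Hz resolution
band = (freqs >= 8) & (freqs <= 12)
alpha_power = psd[band].sum() * (freqs[1] - freqs[0])  # integrated band power
print(f"8-12 Hz power: {alpha_power:.2f} uV^2")
```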
Article
Four experiments addressing the role of attention in phonetic perception are reported. The first experiment shows that the relative importance of two cues to the voicing distinction changes when subjects must perform an arithmetic distractor task at the same time as identifying a speech stimulus. The contribution of voice onset time to phonetic labeling decreases when subjects are distracted, while that of F0 onset frequency increases. The second experiment shows a similar pattern for two cues to the distinction between the vowels /i/ (as in "beat") and /I/ (as in "bit"). Under low attention conditions, formant pattern has a smaller effect on phonetic labeling while vowel duration has a larger effect. Together these experiments indicate that careful attention to speech perception is necessary for strong acoustic cues (voice-onset time and formant patterns) to achieve their full impact on phonetic labeling, while weaker acoustic cues (F0 onset frequency and vowel duration) achieve their full impact on phonetic labeling without close attention. Experiment 3 shows that this pattern is obtained when the distractor task places little demand on verbal short-term memory. Experiment 4 provides a data set for testing formal models of the role of attention in speech perception. Attention is shown to influence the signal-to-noise ratio in the phonetic encoding of acoustic cues; the sustained phonetic contribution of weak cues without close attention stems from reduced competition from strong cues. This principle is instantiated in a network model in which the role of attention is to reduce noise in the phonetic encoding of acoustic cues. Implications of this work for understanding speech perception and general theories of the role of attention in perception are discussed.
Article
The stimulus evaluation view on P3 latency holds that P3 latency mainly reflects stimulus-processing time, in contrast to response-processing time. A review of the experimental evidence, however, leads to the conclusion that P3 is not a sensitive tool for separating between stimulus- and response-related processes. Rather, it appears that P3 latency is a sensitive index of any response-time changes when response times in the fast condition are brief, with its sensitivity decreasing when response times in the fast condition get longer. This regularity was confirmed by a detailed analysis of the published evidence from Sternberg's task and was not attributable to speed-accuracy trade-off or to different methods of parametrization. The structures generating the scalp P3b are involved both in stimulus processing and in response selection. Response selection may exert its effect on P3 in one of two ways; either directly, fully delaying P3 latency, or affecting a second P3 component (P-CR) only, thus having an attenuated effect on P3 latency.
Article
EEG was recorded from 120 normal adult subjects who ranged in age from 20 to 80+ years in separate eyes open/closed conditions. The P3(00) event-related brain potential (ERP) was elicited with auditory and visual stimuli in separate conditions in the same subjects. Spectral analysis indicated that overall EEG power decreased as subject age increased. P3 amplitude decreased and peak latency increased for both the auditory and visual stimulus conditions as subject age increased. Few age-related differences were observed for the N1, P2, or N2 components. Spectral power from the delta, theta, and alpha bands correlated positively with P3 amplitude across subject age, but mean band frequency demonstrated only weak associations with P3 latency. No strong relationships were found between EEG and the other ERP component variables. The results suggest that age contributes to EEG power shifts, and that such changes significantly affect age-related variability of the P3 ERP component.