Article

Effects of background noise on inter-trial phase coherence and auditory N1–P2 responses to speech stimuli

... Though conventional ERP waveform analysis can shed light on the event-locked regularities of brain dynamics based on time-domain information averaged across trials, it may underestimate trial-by-trial response variability in the time-frequency domain [22–24]. A line of studies has applied time-frequency analyses to explore the time-locked neural substrates of auditory processing [23,25–28], though these investigations were often conducted with non-emotional stimuli. In these studies, evoked neural synchrony can be evaluated through inter-trial phase coherence (ITPC) in five frequency bands: delta (1–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), beta (12–30 Hz) and gamma (over 30 Hz). ...
... Higher ITPC values suggest better phase alignment of cortical oscillations, while smaller values indicate poorer consistency or larger neural "jittering" across trials [29]. Results suggested that stimulus-evoked phase alignment of EEG oscillations, especially delta, theta and alpha ITPC, forms a crucial basis for the neural generation of auditory ERPs [23,28,30]. By contrast, time-frequency analyses of vocal emotion processing are sparse, with even less attention to the relationship between ERP waveforms and neural oscillations [31,32]. ...
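In concrete terms, ITPC at a given frequency and time point is the length of the mean resultant vector of the single-trial phases: values near 1 indicate tight phase alignment across trials, values near 0 indicate random phase. The following minimal sketch, assuming band-pass filtered single-trial EEG and a Hilbert-transform phase estimate (one common approach, not necessarily the pipeline used in the cited studies), illustrates the computation:

```python
# Minimal ITPC sketch: band-limit each trial, extract instantaneous phase,
# and take the magnitude of the across-trial mean of the unit phasors.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def itpc(trials, fs, band):
    """trials: (n_trials, n_samples) array; band: (low, high) in Hz.
    Returns ITPC as a function of time, in [0, 1]."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, trials, axis=1)           # band-limit each trial
    phase = np.angle(hilbert(filtered, axis=1))         # instantaneous phase
    return np.abs(np.mean(np.exp(1j * phase), axis=0))  # resultant vector length

# Synthetic demo: 100 trials of a 6 Hz (theta) response with small phase jitter
fs, t = 500, np.arange(0, 1, 1 / 500)
rng = np.random.default_rng(0)
trials = np.array([np.cos(2 * np.pi * 6 * t + 0.3 * rng.standard_normal())
                   + 0.5 * rng.standard_normal(t.size) for _ in range(100)])
print(itpc(trials, fs, (4, 8)).mean())  # near 1 = tight phase alignment
```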
... We measured N100, P200, LPC and their associated cortical oscillatory activities to characterize sensory processing of acoustic signals, initial decoding of emotional significance, and early stages of cognitive evaluation. Delta, theta and alpha ITPC were selected for evaluation as these frequency band oscillations could reflect salience detection, emotional significance and attentional modulation [53], and could better predict auditory ERP responses [23,28,30]. We also recorded accuracy and reaction time data from stimulus offset to show emotional speech processing in the decision-making stage. ...
Preprint
Full-text available
How language mediates emotional perception and experience is poorly understood. The present event-related potential (ERP) study examined the explicit and implicit processing of emotional speech to differentiate the relative influences of communication channel, emotion category and task type in the prosodic salience effect. Thirty participants (15 women) were presented with spoken words denoting happiness, sadness and neutrality in either the prosodic or semantic channel. They were asked to judge the emotional content (explicit task) and speakers’ gender (implicit task) of the stimuli. Results indicated that emotional prosody (relative to semantics) triggered larger N100 and P200 amplitudes with greater delta, theta and alpha inter-trial phase coherence (ITPC) values in the corresponding early time windows, and continued to produce larger LPC amplitudes and faster responses during late stages of higher-order cognitive processing. The relative salience of prosody and semantics was modulated by emotion and task, though such modulatory effects varied across different processing stages. The prosodic salience effect was reduced for sadness processing and in the implicit task during early auditory processing and decision-making, but reduced for happiness processing in the explicit task during conscious emotion processing. Additionally, across-trial synchronization of the delta, theta and alpha bands predicted the ERP components, with higher ITPC values significantly associated with stronger N100, P200 and LPC enhancement. These findings reveal the neurocognitive dynamics of emotional speech processing, with prosodic salience tied to stage-dependent emotion- and task-specific effects, which can offer insights for research reconciling language and emotion processing from cross-linguistic/cultural and clinical perspectives.
... N1 is additionally enhanced by increased attention, with larger amplitudes [73–75] and shorter latencies [75] observed with increasing attentional engagement. In the presence of background noise, N1 is attenuated, with decreased amplitude and increased latency as the signal-to-noise ratio falls [76–78]. Thus, N1 is associated with encoding of the physical properties of sound and marks the arrival of potentially important sounds at the auditory cortex. ...
... Change in N1 could be indicative of more synchronized discharge patterns in N1 generator neuron populations of Heschl's gyrus or regions of the superior temporal gyrus. This is supported by evidence that N1 responses to speech in noise are predicted by neural phase locking, as measured by inter-trial phase coherence [77]. Specifically, neural synchrony is positively correlated with the earlier latencies and larger amplitudes of N1 that are observed when background noise is decreased [77]. ...
... The shorter latency observed in the active condition may additionally indicate faster conduction time in these neurons [106]. ...
Article
Full-text available
Perceiving speech in noise (SIN) is important for health and well-being and decreases with age. Musicians show improved speech-in-noise abilities and reduced age-related auditory decline, yet it is unclear whether short-term music engagement has similar effects. In this randomized controlled trial we used a pre-post design to investigate whether a 12-week music intervention in adults aged 50-65 without prior music training and with subjective hearing loss improves well-being, speech-in-noise abilities, and auditory encoding and voluntary attention as indexed by auditory evoked potentials (AEPs) in a syllable-in-noise task, and later AEPs in an oddball task. Age- and gender-matched adults were randomized to a choir or control group. Choir participants sang in a 2-hr ensemble with 1-hr home vocal training weekly; controls listened to a 3-hr playlist weekly, attended concerts, and socialized online with fellow participants. From pre- to post-intervention, no differences between groups were observed on quantitative measures of well-being or behavioral speech-in-noise abilities. In the choir group, but not the control group, changes in the N1 component were observed for the syllable-in-noise task, with increased N1 amplitude in the passive condition and decreased N1 latency in the active condition. During the oddball task, larger N1 amplitudes to the frequent standard stimuli were also observed in the choir but not the control group from pre- to post-intervention. Findings have implications for the potential role of music training to improve sound encoding in individuals who are in the vulnerable age range and at risk of auditory decline.
... While N1 and P2 precisely depict the temporal aspects of neural activity in AV perception, the trial-to-trial coherence of EEG oscillations in response to a stimulus is quantified by inter-trial phase coherence (ITPC). These EEG oscillations measured by ITPC, particularly in low-frequency (<30 Hz) bands, can also shape the generation of evoked potentials such as N1 and P2 (Gruber et al., 2004; Eggermont, 2007; Edwards et al., 2009; Koerner and Zhang, 2015; van Diepen and Mazaheri, 2018). ITPC in low-frequency bands has previously been used together with ERP analyses to study N1 and P2. ...
... ITPC in low-frequency bands has previously been used together with ERP analyses to study N1 and P2. For example, Koerner and Zhang (2015) suggested that early evoked potentials such as N1 and P2 might be dependent on ITPC in delta, theta, and alpha-band activities. Moreover, Kühnis et al. (2014) showed that an increase in musicians' beta activity is accompanied by a reduced N1 amplitude in a passive vowel listening task. ...
... Notably, low-frequency activity is essential in the processing of speech (Howard and Poeppel, 2012; Gisladottir et al., 2018) and music (Doelling and Poeppel, 2015; Doelling et al., 2019). Low-frequency activity also correlates with early ERP components (Gruber et al., 2004; Fuentemilla et al., 2006; Arnal and Giraud, 2012; Kühnis et al., 2014; Koerner and Zhang, 2015). Moreover, previous research on AV perception in speech, not taking musical experience into account, suggested that visual predictory information signaling an upcoming speech sound might reset ongoing frequency activity (Lakatos et al., 2007; Busch and VanRullen, 2010). ...
Article
Full-text available
In audiovisual music perception, visual information from a musical instrument being played is available prior to the onset of the corresponding musical sound and consequently allows a perceiver to form a prediction about the upcoming audio music. This prediction in audiovisual music perception, compared to auditory music perception, leads to lower N1 and P2 amplitudes and latencies. Although previous research suggests that audiovisual experience, such as previous musical experience may enhance this prediction, a remaining question is to what extent musical experience modifies N1 and P2 amplitudes and latencies. Furthermore, corresponding event-related phase modulations quantified as inter-trial phase coherence (ITPC) have not previously been reported for audiovisual music perception. In the current study, audio video recordings of a keyboard key being played were presented to musicians and non-musicians in audio only (AO), video only (VO), and audiovisual (AV) conditions. With predictive movements from playing the keyboard isolated from AV music perception (AV-VO), the current findings demonstrated that, compared to the AO condition, both groups had a similar decrease in N1 amplitude and latency, and P2 amplitude, along with correspondingly lower ITPC values in the delta, theta, and alpha frequency bands. However, while musicians showed lower ITPC values in the beta-band in AV-VO compared to the AO, non-musicians did not show this pattern. Findings indicate that AV perception may be broadly correlated with auditory perception, and differences between musicians and non-musicians further indicate musical experience to be a specific factor influencing AV perception. Predicting an upcoming sound in AV music perception may involve visual predictory processes, as well as beta-band oscillations, which may be influenced by years of musical training. This study highlights possible interconnectivity in AV perception as well as potential modulation with experience.
... While N1 and P2 amplitudes and latencies can provide insights into the neural basis of musical experience and AV modulation based on the time domain, the generation of evoked potentials such as N1 and P2 is also dependent on the superposition of the trial-by-trial phase alignment of low-frequency (<30 Hz) EEG oscillations in response to a stimulus (Gruber et al., 2004; Eggermont, 2007; Edwards et al., 2009; Koerner and Zhang, 2015; van Diepen and Mazaheri, 2018). A combination of ITPC and ERP has previously been used to study early auditory ERP components both in adults (Koerner and Zhang, 2015) and children (Yu et al., 2018), showing that ITPC in delta, theta and alpha might be a predictor of early auditory ERP components such as N1 and P2. ...
... With this basis, in the current study phase-locking neural synchrony will be computed as inter-trial phase coherence (ITPC) to examine the role of each frequency band that coincides with the early auditory ERP components, including delta (1-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), and beta (12-30 Hz) (Edwards et al., 2009). ...
... For example, low-frequency power, such as theta activity, which has been related to syllable encoding of speech (Giraud and Poeppel, 2012; Doelling et al., 2014), is suppressed in response to AV speech. Theta ITPC also significantly correlates with early ERP components (Koerner and Zhang, 2015). In relation to later ERP components, theta oscillatory activity together with delta activity signals further processing of correctly predicted stimuli (Arnal et al., 2011). ...
Article
Full-text available
In audiovisual speech perception, visual information from a talker's face during mouth articulation is available before the onset of the corresponding audio speech, and thereby allows the perceiver to use visual information to predict the upcoming audio. This prediction from phonetically congruent visual information modulates audiovisual speech perception and leads to a decrease in N1 and P2 amplitudes and latencies compared to the perception of audio speech alone. Whether audiovisual experience, such as with musical training, influences this prediction is unclear, but if so, may explain some of the variations observed in previous research. The current study addresses whether audiovisual speech perception is affected by musical training, first assessing N1 and P2 event-related potentials (ERPs) and, in addition, inter-trial phase coherence (ITPC). Musicians and non-musicians were presented with the syllable /ba/ in audio-only (AO), video-only (VO), and audiovisual (AV) conditions. With the predictory effect of mouth movement isolated from the AV speech (AV−VO), results showed that, compared to audio speech, both groups had lower N1 latency and P2 amplitude and latency. Moreover, they also showed lower ITPCs in the delta, theta, and beta bands in audiovisual speech perception. However, musicians showed significant suppression of N1 amplitude and desynchronization in the alpha band in audiovisual speech, not present for non-musicians. Collectively, the current findings indicate that early sensory processing can be modified by musical experience, which in turn can explain some of the variations in previous AV speech perception research.
... One approach to understanding the impact of age-related hearing loss is to obtain non-invasive electrophysiological measures to determine how the timing and magnitude of the objective neural responses to speech along the auditory pathway may account for some of the behavioral variability across individuals in noise. Previous studies have well established that the presence of background noise can impact auditory event-related potentials (AERPs) to speech as well as nonspeech stimuli (Bidelman et al., 2014; Billings et al., 2009; Koerner and Zhang, 2015; Kozou et al., 2005; Maamor and Billings, 2017; Muller-Gass et al., 2001; Whiting et al., 1998). Furthermore, noise-induced changes in AERPs have been shown to be correlated with changes in the ability to perceive speech in background noise (Anderson et al., 2013b; Anderson et al., 2011; Bennett et al., 2012; Billings et al., 2013; Koerner et al., 2016; Song et al., 2011). ...
... Trial-by-trial phase locking associated with the N1-P2 responses was calculated in the delta (0.5–4 Hz), theta (4–8 Hz), and alpha (8–12 Hz) frequency bands using the inter-trial phase coherence (ITPC) measure from the EEGLAB software. Previous studies have shown that the trial-by-trial synchronization of neural activity in the delta, theta, and alpha frequency bands reflects auditory processing and the generation of the N1-P2 response (Edwards et al., 2009; Koerner and Zhang, 2015). Inter-trial phase coherence estimates the EEG trial-by-trial mean normalized phase as a function of time and frequency. ...
... It has been shown that modulation of theta power is linked with cognitive memory processes and likely contributes to the generation of the MMN response during auditory processing (Fuentemilla et al., 2008; Hsiao et al., 2009; Ko et al., 2012; Koerner et al., 2016). Each spectral calculation used a modified short-term Fourier transform (STFT) with Hanning window tapering as implemented in EEGLAB (Koerner and Zhang, 2015), which is recommended for the analysis of low-frequency activity. The modified STFT method used overlapping sliding windows that are adapted to the target frequency bins to overcome limitations due to the use of fixed windows in conventional analysis. ...
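The frequency-adaptive windowing described above can be sketched as follows: each target frequency gets its own Hann-tapered window, longer at lower frequencies, and ITPC at that frequency is the magnitude of the across-trial mean of the unit phasors from the corresponding DFT coefficient. This is a simplified illustration rather than the EEGLAB implementation; the 3-cycle window length is an assumed parameter.

```python
# Frequency-adaptive STFT-based ITPC sketch (window length scales with 1/f).
import numpy as np

def adaptive_itpc(trials, fs, freqs, times, n_cycles=3):
    """trials: (n_trials, n_samples); freqs in Hz; times in s (window onsets).
    Assumes every window fits inside the epoch."""
    out = np.zeros((len(freqs), len(times)))
    for i, f in enumerate(freqs):
        win_len = int(round(n_cycles * fs / f))      # longer window at low f
        taper = np.hanning(win_len)
        carrier = np.exp(-2j * np.pi * f * np.arange(win_len) / fs)
        for j, t0 in enumerate(times):
            start = int(round(t0 * fs))
            seg = trials[:, start:start + win_len] * (taper * carrier)
            coef = seg.sum(axis=1)                   # complex DFT coefficient at f
            out[i, j] = np.abs(np.mean(coef / np.abs(coef)))  # ITPC in [0, 1]
    return out
```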
Article
Understanding speech in background noise is difficult for many listeners with and without hearing impairment (HI). This study investigated the effects of HI on speech discrimination and recognition measures as well as speech-evoked cortical N1-P2 and MMN auditory event-related potentials (AERPs) in background noise. We aimed to determine which AERP components can predict the effects of HI on speech perception in noise across adult listeners with and without HI. The data were collected from 18 participants with hearing thresholds ranging from within normal limits to bilateral moderate-to-severe sensorineural hearing loss. Linear mixed effects models were employed to examine how hearing impairment, age, stimulus type, and SNR listening condition affected neural and behavioral responses and what AERP components were correlated with effects of HI on speech-in-noise perception across participants. Significant effects of age were found on the N1-P2 but not on MMN, and significant effects of HI were observed on the MMN and behavioral measures. The results suggest that neural responses reflecting later cognitive processing of stimulus discrimination may be more susceptible to the effects of HI on the processing of speech in noise than earlier components that signal the sensory encoding of acoustic stimulus features. Objective AERP responses were also potential neural predictors of speech perception in noise across participants with and without HI, which has implications for the use of AERPs as a potential clinical tool for assessing speech perception in noise. Full text for personal sharing available before December 6, 2018 at this web link: https://authors.elsevier.com/c/1XynD1M5IZOSKX
... In particular, infants' theta activity has been shown to be modulated by linguistic experience (Radicevic et al., 2008) and phonetic salience (Zhang et al., 2011). In adults, theta ITPC significantly predicts auditory N1-P2 amplitude to CV (consonant-vowel) syllables (Koerner et al., 2015). Furthermore, theta activities are believed to be responsible for syllable-level speech encoding, which is crucial for successful speech comprehension (Morillon et al., 2010; Giraud et al., 2012; Peelle et al., 2013; Doelling et al., 2014). ...
... Consistent with our second hypothesis, theta ITPC value was a significant predictor of AEP amplitude for both P1 and N2 in both subject groups. These patterns are consistent with findings from normal adults (Koerner & Zhang, 2015). Additionally, the P1 hypersensitivity in the children with autism was associated with heightened theta synchrony, while their hyposensitivity in the N2 was associated with reduced theta synchrony. ...
... Spectral power in the theta band was also computed for both the pre-stimulus baseline and the response portion of the epochs using the spectopo function in EEGLAB, based on Welch's power spectral density estimate (oversampling ×8). Similar time-frequency analysis procedures were used in published studies (Koerner et al., 2015; Koerner et al., 2016). The number of trials for analysis in the autism group was 332 (range 165-466) for the pure tone condition and 300 (range 172-383) for the word condition. ...
Article
Objective: This autism study investigated how inter-trial phase coherence (ITPC) drives abnormalities in auditory evoked potential (AEP) responses for speech and nonspeech stimuli. Methods: Auditory P1-N2 responses and ITPCs in the theta band (4~7 Hz) for pure tones and words were assessed with EEG data from 15 school-age children with autism and 16 age-matched typically developing (TD) controls. Results: The autism group showed enhanced P1 and reduced N2 for both speech and nonspeech stimuli in comparison with the TD group. Group differences were also found with enhanced theta ITPC for P1 followed by ITPC reduction for N2 in the autism group. The ITPC values were significant predictors of P1 and N2 amplitudes in both groups. Conclusions: Abnormal trial-to-trial phase synchrony plays an important role in AEP atypicalities in children with autism. ITPC-driven enhancement as well as attenuation in different AEP components may coexist, depending on the stage of information processing. Significance: It is necessary to examine the time course of auditory evoked potentials and the corresponding inter-trial coherence of neural oscillatory activities to better understand hyper- and hypo-sensitive responses in autism, which has important implications for sensory-based treatment. Web link: https://www.sciencedirect.com/science/article/pii/S1388245718309003
... Speech communication often takes place in the presence of background noise, which can be difficult for hard-of-hearing listeners as well as many listeners with normal hearing. In recent years, there has been a surge of interest in investigating noise-induced modulatory effects on cortical/subcortical responses to examine the neural networks and brain mechanisms supporting higher-level cognitive and linguistic skills (Billings, McMillan, Penman, & Gille, 2013; Du, Buchsbaum, Grady, & Alain, 2014; Koerner & Zhang, 2015; Mesgarani, David, Fritz, & Shamma, 2014; Wong, Uppunda, Ajith, Parrish, & Dhar, 2008). Cortical auditory event-related potentials (AERPs) are one representative method of measuring the neural coding of speech sounds in various listening conditions. ...
... In addition to the conventional ERP latency and amplitude measures, a recent trend in neurophysiological studies is the development of sophisticated time-frequency analyses to examine the role of various neural oscillation frequency bands of the EEG signal in the generation of AERP waveforms. These cortical oscillations are thought to modulate neural excitability and timing, which enables information exchange between cortical processes that are responsible for sensory and cognitive events (Klimesch et al., 2007; Koerner & Zhang, 2015; Luck, 2014; Makeig, Debener, Onton, & Delorme, 2004; Sauseng et al., 2007; Zhang et al., 2011). In particular, several studies have revealed the contribution of the theta frequency band (4-8 Hz) in driving the neuronal generation of the MMN in frontal and temporal areas (Choi et al., 2013; Fuentemilla et al., 2007; Hsiao, Wu, Ho, & Lin, 2009; Ko et al., 2012). ...
... The background noise used in this study was a four-talker speech babble noise adopted from the Quick Speech-in-Noise Test (Quick-SIN) (Niquette, Gudmundsen, & Killion, 2001). All of the CV syllables and the noise stimuli were resampled at 44.1 kHz and were normalized to create a −3 dB SNR using Sony SoundForge 9.0 (Sony Creative Software, USA) (Koerner & Zhang, 2015). ...
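For readers unfamiliar with SNR normalization, the scaling step reduces to adjusting the noise gain so that the speech-to-noise RMS ratio equals the target. A minimal sketch follows, assuming single-channel signals at a common sampling rate; the cited study performed this step in Sony SoundForge rather than in code:

```python
# RMS-based scaling of a noise signal to achieve a target SNR against speech.
import numpy as np

def scale_noise_to_snr(speech, noise, target_snr_db):
    """Return noise scaled so that 20*log10(rms(speech)/rms(noise)) equals target."""
    rms = lambda x: np.sqrt(np.mean(np.square(x)))
    gain = rms(speech) / (rms(noise) * 10 ** (target_snr_db / 20))
    return noise * gain

# e.g., mix at -3 dB SNR: mixture = speech + scale_noise_to_snr(speech, noise, -3)
```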
... Measures of brain electrical activity have been important in investigating mechanisms that allow listeners to extract target signals from interfering background noise for successful speech communication. Previous auditory event-related potential (AERP) studies have demonstrated the detrimental effects of background noise on the timing and strength of neural responses to speech and non-speech stimuli (Billings et al., 2011; Koerner and Zhang, 2015; Parbery-Clark et al., 2011; Russo et al., 2009). Furthermore, the noise-induced changes in different AERP components have been shown to predict behavioral measures of perceptual and cognitive abilities (Anderson et al., 2011; Anderson et al., 2010b; Billings et al., 2013; Billings et al., 2015; Koerner et al., 2016; Song et al., 2011). ...
... In addition to conventional analysis of the latency and amplitude of AERP components, researchers have also begun to use time-frequency analysis techniques to determine how experimental stimulus and task factors impact induced and evoked cortical oscillations within the ongoing EEG signal. The oscillations are thought to play a key role in enabling sensory and cognitive processing across and within cortical networks (Başar et al., 1999; Klimesch et al., 2007; Koerner and Zhang, 2015; Makeig et al., 2004; Sauseng et al., 2007; Zhang et al., 2011). Specifically, oscillations in the delta (1–4 Hz), theta (4–8 Hz), and alpha (8–12 Hz) frequency bands have been found to be associated with the cortical P3 response, which may represent underlying cognitive demands related to different processes of signal processing and attentional engagement (Demiralp et al., 2001; Polich, 1994, 1995; Kolev et al., 1997; Polich, 1997; Spencer and Polich, 1999; Yordanova and Kolev, 1998). ...
... Three consonant-vowel (CV) syllables, /ba/, /da/, and /bu/, were synthesized using a 10 kHz sampling rate in the HLsyn software program (Sensimetrics Corporation, USA) (Koerner and Zhang, 2015). Each syllable was 170 ms in duration with a steady fundamental frequency of 100 Hz and a steady F4 of 3300 Hz. ...
Article
This study examined how speech babble noise differentially affected the auditory P3 responses and the associated neural oscillatory activities for consonant and vowel discrimination in relation to segmental- and sentence-level speech perception in noise. The data were collected from 16 normal-hearing participants in a double-oddball paradigm that contained a consonant (/ba/ to /da/) and vowel (/ba/ to /bu/) change in quiet and noise (speech-babble background at a -3 dB signal-to-noise ratio) conditions. Time-frequency analysis was applied to obtain inter-trial phase coherence (ITPC) and event-related spectral perturbation (ERSP) measures in delta, theta, and alpha frequency bands for the P3 response. Behavioral measures included percent correct phoneme detection and reaction time as well as percent correct IEEE sentence recognition in quiet and in noise. Linear mixed-effects models were applied to determine possible brain-behavior correlates. A significant noise-induced reduction in P3 amplitude was found, accompanied by significantly longer P3 latency and decreases in ITPC across all frequency bands of interest. There was a differential effect of noise on consonant discrimination and vowel discrimination in both ERP and behavioral measures, such that noise impacted the detection of the consonant change more than the vowel change. The P3 amplitude and some of the ITPC and ERSP measures were significant predictors of speech perception at segmental- and sentence-levels across listening conditions and stimuli. These data demonstrate that the P3 response with its associated cortical oscillations represents a potential neurophysiological marker for speech perception in noise.
... The current report of a side-by-side comparison was propelled by the successive publication of two recent studies from our lab that respectively used conventional Pearson correlations and the more sophisticated linear mixed-effects regression models. In particular, our first study investigated whether noise-induced trial-by-trial changes in cortical oscillatory rhythms in the ongoing auditory electroencephalography (EEG) signal could account for the basic evoked response components in the averaged event-related potential (ERP) waveforms for speech stimuli in quiet and noisy listening conditions [54]. When the first study was submitted, we were not aware of the importance and relevance of the LME approach to the analysis of our data set. ...
... Even though the paper went through two rounds of revisions, the two anonymous peer reviewers did not raise any concerns about the use of Pearson correlation in our analysis. Our second study further examined whether the noise-induced changes in trial-by-trial neural phase locking, as measured by inter-trial phase coherence (ITPC) and spectral EEG power, could predict averaged mismatch negativity (MMN) responses for detecting a consonant change and a vowel change, and whether the cortical MMN response itself could predict speech perception in noise at both the syllable and sentence levels [54]. In the publication process of the second study, reviewers questioned the validity of the Pearson correlation analysis for the multiple measures for the same speech stimuli from the same group of subjects, which led to a major revision adopting the LME regression analysis. ...
... Koerner and Zhang [54] aimed to determine whether noise-induced changes in trial-by-trial neural synchrony in delta (0.5-4 Hz), theta (4-8 Hz), and alpha (8-12 Hz) frequency bands in response to the syllable /bu/ in quiet and in speech babble background noise at a −3 dB SNR (signal-to-noise ratio) were predictive of variation in the N1-P2 ERPs across participants. ...
Article
Full-text available
Neurophysiological studies are often designed to examine relationships between measures from different testing conditions, time points, or analysis techniques within the same group of participants. Appropriate statistical techniques that can take into account repeated measures and multivariate predictor variables are integral and essential to successful data analysis and interpretation. This work implements and compares conventional Pearson correlations and linear mixed-effects (LME) regression models using data from two recently published auditory electrophysiology studies. For the specific research questions in both studies, the Pearson correlation test is inappropriate for determining strengths between the behavioral responses for speech-in-noise recognition and the multiple neurophysiological measures as the neural responses across listening conditions were simply treated as independent measures. In contrast, the LME models allow a systematic approach to incorporate both fixed-effect and random-effect terms to deal with the categorical grouping factor of listening conditions, between-subject baseline differences in the multiple measures, and the correlational structure among the predictor variables. Together, the comparative data demonstrate the advantages as well as the necessity to apply mixed-effects models to properly account for the built-in relationships among the multiple predictor variables, which has important implications for proper statistical modeling and interpretation of human behavior in terms of neural correlates and biomarkers.
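The statistical contrast at issue can be made concrete in a few lines. The sketch below uses a synthetic long-format table (one row per subject and listening condition; all variable names and values are illustrative, not from the published datasets) to show why a pooled Pearson correlation ignores the repeated-measures structure that the LME model handles explicitly:

```python
# Pearson correlation vs. linear mixed-effects model on repeated measures.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import pearsonr

# Synthetic data: 16 subjects x 2 listening conditions, with per-subject offsets
rng = np.random.default_rng(1)
subjects = np.repeat(np.arange(16), 2)
condition = np.tile(["quiet", "noise"], 16)
itpc = rng.uniform(0.2, 0.8, 32) - (condition == "noise") * 0.15
score = 60 + 30 * itpc + rng.normal(0, 5, 32) + np.repeat(rng.normal(0, 5, 16), 2)
df = pd.DataFrame({"subject": subjects, "condition": condition,
                   "itpc": itpc, "score": score})

# Pearson treats the repeated measures as independent observations (inappropriate)
r, p = pearsonr(df["itpc"], df["score"])

# LME: fixed effects for ITPC and condition, random intercept per subject
model = smf.mixedlm("score ~ itpc + condition", df, groups=df["subject"]).fit()
print(f"Pearson r = {r:.2f}; LME ITPC coefficient = {model.params['itpc']:.2f}")
```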
... Theta (4–8 Hz) activity has been associated with processing the temporal and spectral attributes of spoken sentences [48,49]. Theta and alpha (8–14 Hz) bands are also sensitive to the rise time of the acoustic stimulus onset [50,51]. We hypothesized that ...
... The PLF data indicate the involvement of distinct neural oscillations for tracking the rising vs. falling intensity modulation direction within the acoustic stimuli. In the present context with a passive listening condition, the dominant PLF for the ON response to the damped sounds was mediated by stronger alpha and theta activity, reflecting new information coding for the abrupt onset [50,51]. As the onset of damped sounds is prone to capture attention/arousal/alerting reactions, there could be differences in involuntary attention to the arrival of the damped vs. ramped sounds mediated by alpha activity, which is known to be influenced by attention [84]. ...
... While it is appealing to interpret the phase-locking factor as a measure of oscillatory activity on a phase-resetting account, caution is necessary here as we cannot rule out the traditional additive model for evoked responses [51]. Our PLF data as reported cannot provide conclusive evidence to cleanly separate what might be due to phase-resetting of ongoing oscillations from what might be due to an additive evoked response (possibly non-oscillatory) for stimulus coding. ...
Article
Full-text available
This magnetoencephalography (MEG) study investigated evoked ON and OFF responses to ramped and damped sounds in normal-hearing human adults. Two pairs of stimuli that differed in spectral complexity were used in a passive listening task; each pair contained identical acoustical properties except for the intensity envelope. Behavioral duration judgment was conducted in separate sessions, which replicated the perceptual bias in favour of the ramped sounds and the effect of spectral complexity on perceived duration asymmetry. MEG results showed similar cortical sites for the ON and OFF responses. There was a dominant ON response with stronger phase-locking factor (PLF) in the alpha (8–14 Hz) and theta (4–8 Hz) bands for the damped sounds. In contrast, the OFF response for sounds with rising intensity was associated with stronger PLF in the gamma band (30–70 Hz). Exploratory correlation analysis showed that the OFF response in the left auditory cortex was a good predictor of the perceived temporal asymmetry for the spectrally simpler pair. The results indicate distinct asymmetry in ON and OFF responses and neural oscillation patterns associated with the dynamic intensity changes, which provides important preliminary data for future studies to examine how the auditory system develops such an asymmetry as a function of age and learning experience and whether the absence of asymmetry or abnormal ON and OFF responses can be taken as a biomarker for certain neurological conditions associated with auditory processing deficits.
... Speech communication often takes place in the presence of background noise, which can be difficult for hard-of-hearing listeners as well as many listeners with normal hearing. In recent years, there has been a surge of interest in investigating noise-induced modulatory effects on cortical/subcortical responses to examine the neural networks and brain mechanisms supporting higher-level cognitive and linguistic skills (Anderson et al., 2010a; Billings et al., 2013; Du et al., 2014; Koerner and Zhang, 2015; Mesgarani et al., 2014; Vaden et al., 2015; Wong et al., 2008). Cortical auditory event-related potentials (AERPs) are one representative method of measuring the neural coding of speech sounds in various listening conditions. ...
... In addition to the conventional ERP latency and amplitude measures, a recent trend in neurophysiological studies is the development of sophisticated time-frequency analyses to examine the role of various neural oscillation frequency bands of the EEG signal in the generation of AERP waveforms. These cortical oscillations are thought to modulate neural excitability and timing, which enables information exchange between cortical processes that are responsible for sensory and cognitive events (Başar et al., 1999; Klimesch et al., 2007; Koerner and Zhang, 2015; Makeig et al., 2004; Sauseng et al., 2007; Zhang et al., 2011). In particular, several studies have revealed the contribution of the theta frequency band (4–8 Hz) in driving the neuronal generation of the MMN in frontal and temporal areas (Bishop and Hardiman, 2010; Choi et al., 2013; Fuentemilla et al., 2008; Hsiao et al., 2009; Ko et al., 2012). ...
... The consonant-vowel (CV) syllables, /ba/, /da/, and /bu/, were synthesized with the HLsyn software program (Sensimetrics Corporation, USA) using a 10 kHz sampling rate (Koerner and Zhang, 2015). All the syllables were 170 ms in duration with a steady fundamental frequency of 100 Hz and a steady F4 at 3300 Hz. ...
... The ERP responses to the first syllable in a sequence (Fig. 4A) consisted of negative and positive deflections peaking at 142 ms and 242 ms after syllable onset, with central and frontocentral topography, respectively, followed by a parietal positive deflection peaking at 388 ms after syllable onset (for the low presentation rate at electrode Cz). These responses are consistent in phase, latency and topography with the obligatory auditory N1-P2 complex and the attention-orienting N1 response (Alcaini et al., 1994; Budd et al., 1998; Giard et al., 1994; Koerner and Zhang, 2015), followed by the P3a attention-orienting response (Escera et al., 1998, 2000; Polich, 2007; Polich et al., 1997). The ERPs evoked by the first syllable in a sequence were of similar amplitude at all syllable presentation rates. ...
... stimulus rates (Büchel et al., 1998; Linden et al., 1999; Binder et al., 2009). The grand average ERPs to a single syllable revealed a sequence of negative-positive deflections at latencies and topographies consistent with the obligatory auditory N1-P2 complex and orienting N1 (Alcaini et al., 1994; Budd et al., 1998; Giard et al., 1994; Koerner and Zhang, 2015), followed by the P3a (Polich et al., 1997; Polich, 2007; Escera et al., 1998, 2000). A decline in the amplitude of the N1-P2 complex for the last versus first syllable in a train at high presentation rates was seen in the syllable-normalized global field power and in the primary and secondary fMRI and EEG subcomponents resulting from jICA. ...
Article
Background Meaningful integration of functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) requires knowing whether these measurements reflect the activity of the same neural sources, i.e., estimating the degree of coupling and decoupling between the neuroimaging modalities. New method This paper proposes a method to quantify the coupling and decoupling of fMRI and EEG signals based on the mixing matrix produced by joint independent component analysis (jICA). The method is termed fMRI/EEG-jICA. Results fMRI and EEG acquired during a syllable detection task with variable syllable presentation rates (0.25-3 Hz) were separated with jICA into two spatiotemporally distinct components: a primary component that increased nonlinearly in amplitude with syllable presentation rate, putatively reflecting an obligatory auditory response, and a secondary component that declined nonlinearly with syllable presentation rate, putatively reflecting an auditory attention orienting response. The two EEG subcomponents were of similar amplitude, but the secondary fMRI subcomponent was tenfold smaller than the primary one. Comparison to existing method fMRI multiple regression analysis yielded a map more consistent with the primary than the secondary fMRI subcomponent of jICA, as determined by a greater area under the curve (0.5 versus 0.38) in a sensitivity and specificity analysis of spatial overlap. Conclusion fMRI/EEG-jICA revealed spatiotemporally distinct brain networks with greater sensitivity than fMRI multiple regression analysis, demonstrating how this method can be used for leveraging EEG signals to inform the detection and functional characterization of fMRI signals. fMRI/EEG-jICA may be useful for studying neurovascular coupling at a macro-level, e.g., in neurovascular disorders.
... When the experimental group was extended to include older individuals, it was determined that the electrophysiological predictors of behavioral performance differed between individuals with typical hearing and those with hearing impairment (Billings, Penman, McMillan, & Ellis, 2015). Likewise, using a simple speech stimulus (/bu/) in quiet and in babble, Koerner and Zhang (2015) found that noise increased the latency and decreased the amplitude of N1. They additionally determined that these changes may, in part, be attributable to changes in cortical neural synchrony that are induced by noise (Koerner & Zhang, 2015). ...
... Responses such as N1 and MMN, however, do not offer any information regarding whether the speech signal has been understood. ...
Article
Purpose: Speech-in-noise testing relies on a number of factors beyond the auditory system, such as cognitive function, compliance, and motor function. It may be possible to avoid these limitations by using electroencephalography. The present study explored this possibility using the N400. Method: Eleven adults with typical hearing heard high-constraint sentences with congruent and incongruent terminal words in the presence of speech-shaped noise. Participants ignored all auditory stimulation and watched a video. The signal-to-noise ratio (SNR) was varied around each participant's behavioral threshold during electroencephalography recording. Speech was also heard in quiet. Results: The amplitude of the N400 effect exhibited a nonlinear relationship with SNR. In the presence of background noise, amplitude decreased from high (+4 dB) to low (+1 dB) SNR but increased dramatically at threshold before decreasing again at subthreshold SNR (-2 dB). Conclusions: The SNR of speech in noise modulates the amplitude of the N400 effect to semantic anomalies in a nonlinear fashion. These results are the first to demonstrate modulation of the passively evoked N400 by SNR in speech-shaped noise and represent a first step toward the end goal of developing an N400-based physiological metric for speech-in-noise testing.
... A relatively new area of analysis in ERP research is the application of time-frequency analysis to examine the degree of trial-by-trial coherence in cortical rhythms that may give rise to the salient components in the averaged ERP waveforms (Koerner & Zhang, 2015; Luck, 2014). ...
... The inter-trial coherence measure is an estimate of mean normalized phase across trials, which can range from 0 (indicating random phase, or a complete lack of synchronization) to 1 (indicating perfect phase synchrony across trials). The inter-trial coherence data (also referred to as phase-locking values) were averaged across the frequencies within the range of each frequency band (Koerner & Zhang, 2015). The peak phase-locking values corresponding to the N400 and late positive response components in their respective windows were identified for each frequency band for each listening condition on an individual basis. ...
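Operationally, the band-averaging and peak-picking steps described above amount to masking a time-frequency ITPC matrix by band and by component window. A minimal sketch, assuming an itpc array from a prior time-frequency decomposition; the band edges and the 300-500 ms N400 window shown are illustrative assumptions:

```python
# Band-averaged ITPC and peak phase-locking within a component time window.
import numpy as np

def peak_band_itpc(itpc, freqs, times, band, window):
    """itpc: (n_freqs, n_times); freqs in Hz; times in s.
    Returns the peak band-averaged ITPC inside the window."""
    fmask = (freqs >= band[0]) & (freqs < band[1])
    tmask = (times >= window[0]) & (times <= window[1])
    band_itpc = itpc[fmask].mean(axis=0)   # average across band frequencies
    return band_itpc[tmask].max()          # peak phase-locking in the window

# e.g., theta peak in an assumed 300-500 ms N400 window:
# peak_band_itpc(itpc, freqs, times, band=(4, 8), window=(0.3, 0.5))
```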
... Phase coherence (or inter-trial phase coherence) shows the phase consistency of ASSRs across epochs (17,62). It also reflects the phase-locking capability of a neural generator to the acoustic stimulus and varies between 0 and 1 (45,63). To calculate the phase coherence, the time series of each ROI ...
Article
Full-text available
People with age-related hearing loss suffer from speech understanding difficulties, even after correcting for differences in hearing audibility. These problems are not only attributed to deficits in audibility but are also associated with changes in central temporal processing. The goal of this study is to obtain an understanding of potential alterations in temporal envelope processing for middle-aged and older persons with and without hearing impairment. The time series of activity of subcortical and cortical neural generators was reconstructed using a minimum-norm imaging technique. This novel technique allows for reconstructing a wide range of neural generators with minimal prior assumptions regarding the number and location of the generators. The results indicated that the response strength and phase coherence of middle-aged participants with hearing impairment (HI) were larger than for normal-hearing (NH) ones. In contrast, for the older participants, a significantly smaller response strength and phase coherence were observed in the participants with HI than the NH ones for most modulation frequencies. Hemispheric asymmetry in the response strength was also altered in middle-aged and older participants with hearing impairment and showed asymmetry toward the right hemisphere. Our brain source analyses show that age-related hearing loss is accompanied by changes in the temporal envelope processing, although the nature of these changes varies with age.
... For example, attention to speech sounds has been associated with a left-lateralization of the M100 (the neuromagnetic equivalent of the N1; Parviainen et al., 2005; Poeppel et al., 1996). Furthermore, the N1-P2 amplitude can reflect phonetic cues that are relevant to speech perception, such as amplitude rise time and formant transitions (Carpenter & Shahin, 2013), and the response is delayed and reduced for syllables presented in background noise compared to quiet (Koerner & Zhang, 2015). The amplitude of the P2 component specifically has been implicated as a marker of perceptual discriminability of speech sounds (Sheehan et al., 2005) and categorical perception of phonemes (Bidelman et al., 2013, 2020) and may serve an additional function in cohort reduction during visual gating of written words (Bles et al., 2007). ...
... Phase coherence shows how stable the difference between the phases of two signals is, and it can describe the phase-locking capability of a neural source (Koerner and Zhang 2015). We will measure the phase coherence between the first two DSS components. ...
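Between-signal phase coherence of this kind is commonly computed as a phase-locking value (PLV) over epochs: extract the instantaneous phase of each narrow-band signal, take the phase difference, and average the unit phasors. A minimal sketch under those assumptions (not necessarily the exact estimator used in the study):

```python
# Phase-locking value between two component time series across epochs.
import numpy as np
from scipy.signal import hilbert

def phase_coherence(x_epochs, y_epochs):
    """x_epochs, y_epochs: (n_epochs, n_samples) narrow-band signals.
    Returns PLV over epochs as a function of time, in [0, 1]."""
    dphi = (np.angle(hilbert(x_epochs, axis=1))
            - np.angle(hilbert(y_epochs, axis=1)))   # phase difference per epoch
    return np.abs(np.mean(np.exp(1j * dphi), axis=0))  # stability of the difference
```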
Article
Different studies have suggested that language and developmental disorders such as dyslexia are associated with a disturbance of auditory entrainment and of the functional hemispheric asymmetries during speech processing. These disorders typically result from an issue in the phonological component of language that causes problems to represent and manipulate the phonological structure of words at the syllable and/or phoneme level. We used Auditory Steady‐State Responses (ASSRs) in EEG recordings to investigate the brain activation and hemisphere asymmetry of theta, alpha, beta and low‐gamma range oscillations in typical readers and readers with dyslexia. The aim was to analyse whether the group differences found in previous electrode-level studies were caused by a different source activation pattern or, conversely, were an effect that could be found on the active brain sources. We could not find differences in the brain locations of the main active brain sources. However, we observed differences in the extracted waveforms. The group average of the first DSS component of all signal‐to‐noise ratios of ASSR at source level was higher than the group averages at the electrode level. These analyses indicated lower alpha synchronisation in adolescents with dyslexia and the possibility of compensatory mechanisms in the theta, beta and low‐gamma frequency bands. The main brain auditory sources were located in cortical regions around the auditory cortex. Thus, the differences observed in auditory EEG experiments would, according to our findings, have their origin in the intrinsic oscillatory mechanisms of the brain cortical sources related to speech perception.
... N1 is elicited by an acoustic change in the auditory environment and may be functionally related to the readout of sensory registration and the update of an auditory memory before perception [8]. The N1 amplitude depends on the physical stimulus parameters, and background noise typically diminishes its amplitude [9]. However, low-level background noise paradoxically yields an increase in the N1 amplitude when both the stimuli and the noise are presented binaurally, as shown in MEG [10–12] and EEG studies [13–15]. ...
Article
Full-text available
The presence of binaural low-level background noise has been shown to enhance the transient evoked N1 response at about 100 ms after sound onset. This increase in N1 amplitude is thought to reflect noise-mediated efferent feedback facilitation from the auditory cortex to lower auditory centers. To test this hypothesis, we recorded auditory-evoked fields using magnetoencephalography while participants were presented with binaural harmonic complex tones embedded in binaural or monaural background noise at signal-to-noise ratios of 25 dB (low noise) or 5 dB (higher noise). Half of the stimuli contained a gap in the middle of the sound. The source activities were measured in bilateral auditory cortices. The onset and gap N1 response increased with low binaural noise, but high binaural and low monaural noise did not affect the N1 amplitudes. P1 and P2 onset and gap responses were consistently attenuated by background noise, and noise level and binaural/monaural presentation showed distinct effects. Moreover, the evoked gamma synchronization was also reduced by background noise, and it showed a lateralized reduction for monaural noise. The effects of noise on the N1 amplitude follow a bell-shaped characteristic that could reflect an optimal representation of acoustic information for transient events embedded in noise.
... In general, larger ITPC values mean higher phase consistency across trials, and smaller values index lower consistency or larger neural 'jittering'. The reduction of theta-band power and ITPC in schizophrenia indicates increased neural 'jittering' and unstable neural firing in the auditory cortex (Koerner & Zhang, 2015); all of this may lead to an imprecise representation of sounds, inaccurate calculation of interaural correlation, and even a diminished ability to selectively attend to a target object in a complex auditory scene (Anderson et al., 2012; Füllgrabe et al., 2020; Luo et al., 2017, 2020; Shinn-Cunningham & Best, 2008). Moreover, abnormal trial-to-trial phase synchrony plays an important role in auditory event-related potentials in people with schizophrenia. ...
Article
Full-text available
Detection of transient changes in interaural correlation is based on the temporal precision of the central representations of acoustic signals. Whether schizophrenia impairs the temporal precision in the interaural correlation process is not clear. In both participants with schizophrenia and matched healthy-control participants, this study examined the detection of a break in interaural correlation (BIC, a change in interaural correlation from 1 to 0 and back to 1), including the longest interaural delay at which a BIC was just audible, representing the temporal extent of the primitive auditory memory (PAM). Moreover, BIC-induced EEGs and the relationships between the early binaural psychoacoustic processing and higher cognitive functions, which were assessed by the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS), were examined. The results showed that compared to healthy controls, participants with schizophrenia exhibited poorer BIC detection, PAM, and RBANS score. Both the BIC-detection accuracy and the PAM extent were correlated with the RBANS score. Moreover, participants with schizophrenia showed weaker BIC-induced N1-P2 amplitude which was correlated with both theta-band power and inter-trial phase coherence. These results suggested that schizophrenia impairs the temporal precision of the central representations of acoustic signals, affecting both interaural correlation processing and higher-order cognitions.
... The maximum delta/theta/gamma phase-locking value within the designated time windows of the different AEP components was identified individually under each experimental condition for statistical analysis. Similar TF analysis approaches were adopted in previous studies (Koerner et al., 2016; Koerner & Zhang, 2015; Yu et al., 2018). ...
Article
Full-text available
The presence of vowel exaggeration in infant‐directed speech (IDS) may adapt to the age‐appropriate demands in speech and language acquisition. Previous studies have provided behavioral evidence of atypical auditory processing towards IDS in children with autism spectrum disorders (ASD), while the underlying neurophysiological mechanisms remain unknown. This event‐related potential (ERP) study investigated the neural coding of formant‐exaggerated speech and nonspeech in 24 4‐ to 11‐year‐old children with ASD and 24 typically‐developing (TD) peers. The EEG data were recorded using an alternating block design, in which each stimulus type (exaggerated/non‐exaggerated sound) was presented with equal probability. ERP waveform analysis revealed an enhanced P1 for vowel formant exaggeration in the TD group but not in the ASD group. This speech‐specific atypical processing in ASD was not found for the nonspeech stimuli which showed similar P1 enhancement in both ASD and TD groups. Moreover, the time‐frequency analysis indicated that children with ASD showed differences in neural synchronization in the delta‐theta bands for processing acoustic formant changes embedded in nonspeech. Collectively, the results add substantiating neurophysiological evidence (i.e., a lack of neural enhancement effect of vowel exaggeration) for atypical auditory processing of IDS in children with ASD, which may exert a negative effect on phonetic encoding and language learning. Lay summary Atypical responses to motherese might act as a potential early marker of risk for children with ASD. This study investigated the neural responses to such socially relevant stimuli in the ASD brain, and the results suggested a lack of neural enhancement responding to the motherese even in individuals without intellectual disability.
... Phase coherence (or inter-trial phase coherence) shows the similarity in the phase of ASSRs across epochs (Picton et al., 2001; Luo and Poeppel, 2007). It also reflects the phase-locking capability of a neural source to the acoustic stimulus and varies between 0 and 1 (Koerner and Zhang, 2015; Farahani et al., 2019). For each ROI, the phase coherence was calculated based on the time series of the representative dipoles of that ROI. ...
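For steady-state responses, this per-ROI phase coherence can be estimated directly from the FFT bin at the modulation frequency: one unit phasor per epoch, averaged across epochs. A minimal sketch under that assumption (the exact estimator and windowing in the cited studies may differ):

```python
# ASSR phase coherence: consistency of the FFT phase at the modulation rate.
import numpy as np

def assr_phase_coherence(epochs, fs, mod_freq):
    """epochs: (n_epochs, n_samples) ROI time series. Returns a value in [0, 1]."""
    spectra = np.fft.rfft(epochs, axis=1)
    freqs = np.fft.rfftfreq(epochs.shape[1], d=1 / fs)
    k = np.argmin(np.abs(freqs - mod_freq))          # bin nearest the mod rate
    phasors = spectra[:, k] / np.abs(spectra[:, k])  # unit phasor per epoch
    return np.abs(phasors.mean())                    # 1 = identical phase everywhere
```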
Article
Full-text available
Speech understanding problems are highly prevalent in the aging population, even when hearing sensitivity is clinically normal. These difficulties are attributed to changes in central temporal processing with age and can potentially be captured by age-related changes in neural generators. The aim of this study is to investigate age-related changes in a wide range of neural generators during temporal processing in middle-aged and older persons with normal audiometric thresholds. A minimum-norm imaging technique is employed to reconstruct cortical and subcortical neural generators of temporal processing for different acoustic modulations. The results indicate that for relatively slow modulations (<50 Hz), the response strength of neural sources is higher in older adults than in younger ones, while the phase-locking does not change. For faster modulations (80 Hz), both the response strength and the phase-locking of neural sources are reduced in older adults compared to younger ones. These age-related changes in temporal envelope processing of slow and fast acoustic modulations are possibly due to loss of functional inhibition, which is accompanied by aging. Both cortical (primary and non-primary) and subcortical neural generators demonstrate similar age-related changes in response strength and phase-locking. Hemispheric asymmetry is also altered in older adults compared to younger ones. Alterations depend on the modulation frequency and side of stimulation. The current findings at source level could have important implications for the understanding of age-related changes in auditory temporal processing and for developing advanced rehabilitation strategies to address speech understanding difficulties in the aging population.
... We applied source analysis to scalp-recorded electrical brain activity recorded in older adults while they were presented with clear or noise-degraded speech. ERPs were expected to differ for noise-degraded compared to clear speech due to a reduction of neural synchrony (Koerner and Zhang, 2015) and more widespread engagement of neural resources in challenging acoustics (Brette, 2012; Kim et al., 2012), including the right hemisphere (Bidelman and Howell, 2016). We also anticipated more dramatic group differences in noise since older adults with mild hearing loss are most challenged in degraded listening conditions (Tremblay et al., 2003). ...
Article
Full-text available
Speech perception in noisy environments depends on complex interactions between sensory and cognitive systems. In older adults, such interactions may be affected, especially in those individuals who have more severe age-related hearing loss. Using a data-driven approach, we assessed the temporal (when in time) and spatial (where in the brain) characteristics of cortical speech-evoked responses that distinguish older adults with or without mild hearing loss. We performed source analyses to estimate cortical surface signals from the EEG recordings during a phoneme discrimination task conducted under clear and noise-degraded conditions. We computed source-level ERPs (i.e., mean activation within each ROI) from each of the 68 ROIs of the Desikan-Killiany (DK) atlas, averaged over 100 randomly chosen trials (sampled without replacement) to form feature vectors. We adopted a multivariate feature selection method called stability selection to choose features that are consistent over a range of model parameters. We used a parameter-optimized support vector machine (SVM) as a classifier to investigate the time course and brain regions that segregate groups and speech clarity. For clear speech perception, whole-brain data revealed a classification accuracy of 81.50% [area under the curve (AUC) 80.73%; F1-score 82.00%], distinguishing groups within ∼60 ms after speech onset (i.e., as early as the P1 wave). We observed lower accuracy of 78.12% [AUC 77.64%; F1-score 78.00%] and delayed classification performance when speech was embedded in noise, with group segregation at 80 ms. Separate analyses using left (LH) and right hemisphere (RH) regions showed that LH speech activity was better at distinguishing hearing groups than activity measured in the RH. Moreover, stability selection analysis identified 12 brain regions (among 1428 total spatiotemporal features from 68 regions) where source activity segregated groups with >80% accuracy (clear speech), whereas 16 regions were critical for noise-degraded speech to achieve a comparable level of group segregation (78.7% accuracy). Our results identify critical time courses and brain regions that distinguish mild hearing loss from normal hearing in older adults and confirm a larger number of active areas, particularly in the RH, when processing noise-degraded speech information.
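For readers unfamiliar with this kind of classification approach, here is a minimal sketch of a parameter-optimized SVM with nested cross-validation in scikit-learn; the placeholder data, feature dimensions, and hyperparameter grid are illustrative assumptions, not the authors' exact pipeline (which also included stability selection):

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import GridSearchCV, cross_val_score

    # X: (n_observations, n_features) spatiotemporal feature vectors,
    # e.g., mean source activation per ROI and time window;
    # y: group labels (0 = normal hearing, 1 = mild hearing loss).
    rng = np.random.default_rng(0)
    X = rng.standard_normal((80, 1428))    # placeholder feature matrix
    y = rng.integers(0, 2, size=80)        # placeholder labels

    # Standardize features, then tune SVM hyperparameters by grid search.
    svm = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
    grid = GridSearchCV(svm,
                        {'svc__C': [0.1, 1, 10],
                         'svc__gamma': ['scale', 0.001]},
                        cv=5, scoring='roc_auc')
    # An outer cross-validation loop gives an unbiased AUC estimate.
    scores = cross_val_score(grid, X, y, cv=5, scoring='roc_auc')
    print(f"mean AUC: {scores.mean():.3f}")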
... ERPs reflect specific sensory and/or cognitive processes [30]. Specific ERPs that may be used to study the perception of signals in noise are the N1, P2, mismatch negativity (MMN) and P300 responses [31][32][33][34][35][36][37][38][39][40]. Of particular interest to the current study are the MMN and P300 components. ...
Article
Full-text available
This electrophysiological study investigated the role of the medial olivocochlear (MOC) efferents in listening in noise. Both ears of eleven normal-hearing adult participants were tested. The physiological tests consisted of transient-evoked otoacoustic emission (TEOAE) inhibition and the measurement of cortical event-related potentials (ERPs). The mismatch negativity (MMN) and P300 responses were obtained in passive and active listening tasks, respectively. Behavioral responses for the word recognition in noise test were also analyzed. Consistent with previous findings, the TEOAE data showed significant inhibition in the presence of contralateral acoustic stimulation. However, performance in the word recognition in noise test was comparable for the two conditions (i.e., without contralateral stimulation and with contralateral stimulation). Peak latencies and peak amplitudes of MMN and P300 did not show changes with contralateral stimulation. Behavioral performance was also maintained in the P300 task. Together, the results show that the peripheral auditory efferent effects captured via otoacoustic emission (OAE) inhibition might not necessarily be reflected in measures of central cortical processing and behavioral performance. As the MOC effects may not play a role in all listening situations in adults, the functional significance of the cochlear effects of the medial olivocochlear efferents and the optimal conditions conducive to corresponding effects in behavioral and cortical responses remain to be elucidated.
... For example, the talker normalization experiments by Zhang et al. (2013) employed a jittered silent interval of 300-500 ms before the last word. Further, the presence of background noise reduces the amplitude of the N1 and P2 in response to speech (Koerner & Zhang, 2015), suggesting that acoustic continuity (as was the case in our stimuli) may ablate these ERPs. Finally, the relative similarity of the words said by the two different talkers may have reduced the N1 and P2 to undetectable levels when compared to the environmental sounds experiment, in which the sentence endings are conceptually quite distinct from the preceding speech. ...
Article
Adjusting to the vocal characteristics of a new talker is important for speech recognition. Previous research has indicated that adjusting to talker differences is an active cognitive process that depends on attention and working memory (WM). These studies have not examined how talker variability affects perception and neural responses in fluent speech. Here we use source analysis from high-density EEG to show that perceiving fluent speech in which the talker changes recruits early involvement of parietal and temporal cortical areas, suggesting functional involvement of WM and attention in talker normalization. We extend these findings to acoustic source change in general by examining understanding environmental sounds in spoken sentence context. Though there may be differences in cortical recruitment to processing demands for non-speech sounds versus a changing talker, the underlying mechanisms are similar, supporting the view that shared cognitive-general mechanisms assist both talker normalization and speech-to-nonspeech transitions.
... Components of cAEPs (such as the N1-P2 complex) have been reported to be good indicators for assessing HA effects both in quiet and in background noise (Tremblay and Miller, 2014; Kuruvilla-Mathew et al., 2015). While low-frequency (delta, theta, and alpha) PLVs can predict both amplitudes and latencies of N1 and P2 in cAEPs in noisy backgrounds (Koerner and Zhang, 2015), it is plausible that PLV can also serve as a marker that helps to assess the effects of HA fitting on cortical encoding during SPIN perception. Bellier et al. (2015) demonstrated that magnitudes of speech-evoked FFRENV and FFRTFS can be modulated by amending HA settings which generate different auditory stimulations. ...
Preprint
Auditory phase-locked responses are affected by aging, and it has been proposed that this increases the challenges experienced during speech perception in noise (SPiN). However, the proposal lacks direct support. This issue was addressed by measuring speech-evoked phase-locked responses at subcortical (frequency-following responses, FFRs) and cortical (theta-band phase-locking, θ-PLV) levels, and studying the relationship between the phase-locked responses and SPiN (word report accuracies for spoken sentences in noise) in adults across a wide age range (19-75 years old). It was found that: (1) FFR magnitudes declined with age after hearing loss was controlled for; (2) θ-PLV increased with age, suggesting cortical hyperexcitability in audition; (3) SPiN correlated positively with FFR magnitudes obtained in quiet and with θ-PLV obtained in noise, suggesting that the impacts of age (smaller FFR magnitudes and greater θ-PLV) on SPiN differ at subcortical and cortical levels. The current study thus provides evidence for different mechanisms at subcortical and cortical levels by which age affects speech-evoked phase-locked activities and SPiN.
... Although we used dipoles in the auditory cortex for both ASSR and Change-N1m, it is known that subcortical or frontal regions contribute to ASSR and N1 as well, which might affect the results. Furthermore, in order to validate the present results, we need to adopt other methods, such as time-frequency analysis that addresses intertrial power and neural phase-locking underlying evoked responses [34], as well as ASSR [17]. ...
Article
Full-text available
The auditory steady-state response (ASSR) elicited by a periodic sound stimulus is a neural oscillation recorded by magnetoencephalography (MEG), which is phase-locked to the repeated sound stimuli. This ASSR phase deviates after an abrupt change in a feature of the periodic sound stimulus and then returns to its steady-state value. An abrupt change also elicits an MEG component peaking at approximately 100–180 ms (called “Change-N1m”). We investigated whether both the ASSR phase deviation and the Change-N1m were affected by the magnitude of change in sound pressure. The ASSR and Change-N1m to 40 Hz click-trains (1000 ms duration, 70 dB), with and without an abrupt change (±5, ±10, or ±15 dB), were recorded in ten healthy subjects. We used the source strength waveforms obtained by a two-dipole model for measurement of the ASSR phase deviation and Change-N1m values (peak amplitude and latency). As the magnitude of change increased, Change-N1m increased in amplitude and decreased in latency. Similarly, the ASSR phase deviation depended on the magnitude of sound-pressure change. Thus, we suspect that both the Change-N1m and the ASSR phase deviation reflect the sensitivity of the brain's neural change-detection system.
... Components of cAEPs (such as the N1-P2 complex) have been reported to be good indicators for assessing HA effects both in quiet and in background noise (Tremblay and Miller, 2014; Kuruvilla-Mathew, Purdy and Welch, 2015). While low-frequency (delta, theta and alpha) PLVs can predict both amplitudes and latencies of N1 and P2 in cAEPs in noisy backgrounds (Koerner and Zhang, 2015), it is plausible that PLV can also serve as a marker that helps to assess the effects of HA fitting on cortical encoding during SPIN perception. Bellier et al. (2015) demonstrated that magnitudes of speech-evoked FFRENV and FFRTFS can be modulated by amending HA settings which generate different auditory stimulations. ...
Article
Full-text available
Speech-in-noise (SPIN) perception involves neural encoding of temporal acoustic cues. Cues include temporal fine structure (TFS) and envelopes that modulate at syllable (Slow-rate ENV) and fundamental frequency (F0-rate ENV) rates. Here the relationship between speech-evoked neural responses to these cues and SPIN perception was investigated in older adults. Theta-band phase-locking values (PLV) that reflect cortical sensitivity to Slow-rate ENV and peripheral/brainstem frequency-following responses phase-locked to F0-rate ENV (FFRENV_F0) and TFS (FFRTFS) were measured from scalp-EEG responses to a repeated speech syllable in steady-state speech-shaped (SpN) and 16-speaker babble (BbN) noises. The results showed that: 1) SPIN performance and PLV were significantly higher under SpN than BbN, implying that differential cortical encoding may serve as the neural mechanism of SPIN performance that varies as a function of noise type; 2) PLV and FFRTFS at resolved harmonics were significantly related to good SPIN performance, supporting the importance of phase-locked neural encoding of Slow-rate ENV and TFS of resolved harmonics during SPIN perception; 3) FFRENV_F0 was not associated with SPIN performance until audiometric threshold was controlled for, indicating that hearing loss should be carefully controlled when studying the role of neural encoding of F0-rate ENV. Implications are drawn with respect to fitting auditory prostheses.
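The point about controlling for audiometric threshold can be illustrated with a simple partial correlation. This Python sketch (synthetic data; the variable names are hypothetical stand-ins) regresses the covariate out of both variables before correlating the residuals:

    import numpy as np
    from scipy.stats import pearsonr

    def partial_corr(x, y, z):
        """Correlation of x and y after linearly regressing out z."""
        rx = x - np.polyval(np.polyfit(z, x, 1), z)
        ry = y - np.polyval(np.polyfit(z, y, 1), z)
        return pearsonr(rx, ry)

    # Synthetic stand-ins: pure-tone average (dB HL), FFR magnitude,
    # and SPIN accuracy, all partly driven by hearing threshold.
    rng = np.random.default_rng(1)
    pta = rng.normal(20, 10, 40)
    ffr = 1.0 - 0.02 * pta + rng.normal(0, 0.10, 40)
    spin = 0.9 - 0.01 * pta + rng.normal(0, 0.05, 40)
    r, p = partial_corr(ffr, spin, pta)   # association net of threshold

An association that survives this residualization cannot be attributed to hearing threshold alone, which is the logic behind point 3 above.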
... The results of intertrial phase coherence analyses in other experiments have been interpreted as reflecting underlying neural synchrony (Nash-Kille and Sharma, 2014), though the mechanisms that generate this synchrony are not yet fully understood. Interestingly, a recent study by Koerner and Zhang (2015) showed that the inter-trial phase coherence is specifically related to the generation of the N1/P2 complex (the N1 and the P2 components of the ERP taken together), correlating with amplitude and latency measures, suggesting that our characterizations of averaged activity and phase angle in fronto-central regions may be interrelated. ...
Article
Cortical alpha oscillations (8–13 Hz) appear to play a role in suppressing distractions when just one sensory modality is being attended, but do they also contribute when attention is distributed over multiple sensory modalities? For an answer, we examined cortical oscillations in human subjects who were dividing attention between auditory and visual sequences. In Experiment 1, subjects performed an oddball task with auditory, visual, or simultaneous audiovisual sequences in separate blocks, while the electroencephalogram was recorded using high-density scalp electrodes. Alpha oscillations were present continuously over posterior regions while subjects were attending to auditory sequences. This supports the idea that the brain suppresses processing of visual input in order to advantage auditory processing. During a divided-attention audiovisual condition, an oddball (a rare, unusual stimulus) occurred in either the auditory or the visual domain, requiring that attention be divided between the two modalities. Fronto-central theta band (4–7 Hz) activity was strongest in this audiovisual condition, when subjects monitored auditory and visual sequences simultaneously. Theta oscillations have been associated with both attention and with short-term memory. Experiment 2 sought to distinguish these possible roles of fronto-central theta activity during multisensory divided attention. Using a modified version of the oddball task from Experiment 1, Experiment 2 showed that differences in theta power among conditions were independent of short-term memory load. Ruling out theta's association with short-term memory, we conclude that fronto-central theta activity is likely a marker of multisensory divided attention.
... In addition to the conventional ERP latency and amplitude measures, we applied time-frequency analysis to examine trial-by-trial consistency of neural oscillations in selected frequency bands of interest that could drive the MMR activity for speech and nonspeech discrimination. The cortical oscillations are thought to reflect the net excitatory and inhibitory neuronal activities that mediate sensory and cognitive events [39][40][41][42][43][44][45] . Time-frequency analysis on a trial-by-trial basis allows a more detailed examination of what oscillatory activities contribute to or do not contribute to the observed ERP responses that are averaged across trials. ...
Article
Full-text available
Recent studies reveal that tonal language speakers with autism have enhanced neural sensitivity to pitch changes in nonspeech stimuli but not to lexical tone contrasts in their native language. The present ERP study investigated whether the distinct pitch processing pattern for speech and nonspeech stimuli in autism was due to a speech-specific deficit in categorical perception of lexical tones. A passive oddball paradigm was adopted to examine Mismatch Responses (MMRs) to equivalent pitch deviations representing within-category and between-category differences in speech and nonspeech contexts in two groups of Chinese children (16 in the autism group and 15 in the control group). To further examine group-level differences in the MMRs to categorical perception of speech/nonspeech stimuli or lack thereof, neural oscillatory activities at the single-trial level were further calculated with the inter-trial phase coherence (ITPC) measure for the theta and beta frequency bands. The MMR and ITPC data from the children with autism showed evidence for a lack of categorical perception in the lexical tone condition. In view of the important role of lexical tones in acquiring a tonal language, the results point to the necessity of early intervention for individuals with autism who show such a speech-specific categorical perception deficit.
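As a sketch of the kind of single-trial time-frequency analysis described in the excerpt above, the following Python function (assumed parameters; not the authors' code) convolves each epoch with a complex Morlet wavelet and returns both ITPC and mean power at one frequency, separating phase consistency from power:

    import numpy as np

    def morlet_tf(epochs, sfreq, freq, n_cycles=7):
        """Single-trial Morlet decomposition at one frequency.
        epochs: (n_trials, n_samples); epochs are assumed to be
        longer than the wavelet. Returns (itpc, mean_power)."""
        # Complex Morlet wavelet, normalized to unit energy.
        sigma = n_cycles / (2 * np.pi * freq)
        t = np.arange(-3.5 * sigma, 3.5 * sigma, 1.0 / sfreq)
        wavelet = (np.exp(2j * np.pi * freq * t)
                   * np.exp(-t**2 / (2 * sigma**2)))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))
        # Convolve every trial, keeping the complex-valued output.
        analytic = np.array([np.convolve(tr, wavelet, mode='same')
                             for tr in epochs])
        itpc = np.abs(np.mean(analytic / np.abs(analytic), axis=0))
        power = np.mean(np.abs(analytic) ** 2, axis=0)
        return itpc, power

High ITPC with unchanged power indicates trial-to-trial phase alignment rather than added signal energy, which is what distinguishes the two contributions to an averaged ERP.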
Article
During speech production, auditory regions operate in concert with the anterior dorsal stream to facilitate online error detection. As the dorsal stream is also known to activate in speech perception, the purpose of the current study was to probe the role of auditory regions in error detection during auditory discrimination tasks as stimuli are encoded and maintained in working memory. A priori assumptions are that sensory mismatch (i.e., error) occurs during the discrimination of Different (mismatched) but not Same (matched) syllable pairs. Independent component analysis was applied to raw EEG data recorded from 42 participants to identify bilateral auditory alpha rhythms, which were decomposed across time and frequency to reveal robust patterns of event-related synchronization (ERS; inhibition) and desynchronization (ERD; processing) over the time course of discrimination events. Results were characterized by bilateral peri-stimulus alpha ERD transitioning to alpha ERS in the late trial epoch, with ERD interpreted as evidence of working memory encoding via Analysis by Synthesis and ERS considered evidence of speech-induced suppression arising during covert articulatory rehearsal to facilitate working memory maintenance. The transition from ERD to ERS occurred later in the left hemisphere in Different trials than in Same trials, with ERD and ERS temporally overlapping during the early post-stimulus window. Results were interpreted to suggest that the sensory mismatch (i.e., error) arising from the comparison of the first and second syllable elicits further processing in the left hemisphere to support working memory encoding and maintenance. Results are consistent with auditory contributions to error detection during both encoding and maintenance stages of working memory, with encoding-stage error detection associated with stimulus concordance and maintenance-stage error detection associated with task-specific retention demands.
Article
Objective: The purpose of this study was to determine the better-ear listening effect on spatial separation with the N1-P2 complex. Methods: Twenty individuals with normal hearing participated in this study. The speech stimulus /ba/ was presented in front of the participant (0°). Continuous speech noise (5 dB signal-to-noise ratio) was presented either in front of the participant (0°), on the left side (-90°), or on the right side (+90°). The N1-P2 complex was recorded in quiet and in three noisy conditions. Results: There was a marked effect of noise direction on N1 and P2 latencies. When the noise was spatially separated from the stimulus, N1 and P2 latencies increased relative to when the noise was co-located with the stimulus. There was no statistically significant difference in N1-P2 amplitudes between the stimulus-only and co-located conditions. N1-P2 amplitude increased when the noise came from the sides, relative to the stimulus-only and co-located conditions. Conclusion: These findings demonstrate that latency shifts in the N1-P2 complex reflect cortical mechanisms of spatial separation in better-ear listening.
Article
Long-term language and music experience enhances neural representation of temporal attributes of pitch in the brainstem and auditory cortex in favorable listening conditions. Herein we examine whether brainstem and cortical pitch mechanisms—shaped by long-term language experience—maintain this advantage in the presence of reverberation-induced degradation in pitch representation. Brainstem frequency following responses (FFR) and cortical pitch responses (CPR) were recorded concurrently from Chinese and English-speaking natives, using a Mandarin word exhibiting a high rising pitch (/yi²/). Stimuli were presented diotically in quiet (Dry), and in the presence of Slight, Mild, and Moderate reverberation conditions. Regardless of language group, the amplitude of both brainstem FFR (F0) and cortical CPR (Na–Pb) responses decreased with increases in reverberation. Response amplitude for Chinese, however, was larger than English in all reverberant conditions. The Chinese group also exhibited a robust rightward asymmetry at temporal electrode sites (T8 > T7) across stimulus conditions. Regardless of language group, direct comparison of brainstem and cortical responses revealed similar magnitude of change in response amplitude with increasing reverberation. These findings suggest that experience-dependent brainstem and cortical pitch mechanisms provide an enhanced and stable neural representation of pitch-relevant information that is maintained even in the presence of reverberation. Relatively greater degradative effects of reverberation on brainstem (FFR) compared to cortical (Na–Pb) responses suggest relatively stronger top-down influences on CPRs.
Article
Everyday speech perception is challenged by external acoustic interferences that hinder verbal communication. Here, we directly compared how different levels of the auditory system (brainstem vs. cortex) code speech and how their neural representations are affected by two acoustic stressors: noise and reverberation. We recorded multichannel (64 ch) brainstem frequency-following responses (FFRs) and cortical event-related potentials (ERPs) simultaneously in normal-hearing individuals to speech sounds presented in mild and moderate levels of noise and reverberation. We matched signal-to-noise and direct-to-reverberant ratios to equate the severity between classes of interference. Electrode recordings were parsed into source waveforms to assess the relative contribution of region-specific brain areas [i.e., brainstem (BS), primary auditory cortex (A1), inferior frontal gyrus (IFG)]. Results showed that reverberation was less detrimental to (and in some cases facilitated) the neural encoding of speech compared to additive noise. Inter-regional correlations revealed associations between BS and A1 responses, suggesting subcortical speech representations influence higher auditory-cortical areas. Functional connectivity analyses further showed that directed signaling toward A1 in both feedforward cortico-collicular (BS→A1) and feedback cortico-cortical (IFG→A1) pathways was a strong predictor of degraded speech perception and differentiated "good" vs. "poor" perceivers. Our findings demonstrate a functional interplay within the brain's speech network that depends on the form and severity of acoustic interference. We infer that in addition to the quality of neural representations within individual brain regions, listeners' success at the "cocktail party" is modulated based on how information is transferred among subcortical and cortical hubs of the auditory-linguistic network.
Chapter
Full-text available
This study examined the costs of simultaneously monitoring two frequency regions. Listeners detected low- and high-frequency tones in a 2I4AFC procedure. On every trial, each signal was presented in either the first or second interval independently. Comparison of thresholds in single- and dual-signal conditions provided an estimate of the costs. Thresholds were obtained in quiet, in notched-filtered noise, and in randomized multitone maskers. No cost was found in quiet, whereas large costs were found for the masked conditions, especially for the multitone masker. These results suggest that costs of dividing attention in frequency depend on both signal and non-signal channels.
Article
Full-text available
The event-related potential (ERP) approach has provided a wealth of fine-grained information about the time course and the neural basis of cognitive processing events. However, in the 1980s and 1990s, an increasing number of researchers began to realize that an ERP only represents a certain part of the event-related electroencephalographic (EEG) signal. This chapter focuses on another aspect of event-related EEG activity: oscillatory EEG activity. There exists a meaningful relationship between oscillatory neuronal dynamics, on the one hand, and a wide range of cognitive processes, on the other hand. Given that the analysis of oscillatory dynamics extracts information from the EEG/magnetoencephalographic (EEG/MEG) signal that is largely lost with the traditional time-locked averaging of single trials used in the ERP approach, studying the dynamic oscillatory patterns in the EEG/MEG is at least a useful addition to the traditional ERP approach.
Article
Full-text available
To investigate the contributions of energetic and informational masking to neural encoding and perception in noise, using oddball discrimination and sentence recognition tasks. P3 auditory evoked potential, behavioral discrimination, and sentence recognition data were recorded in response to speech and tonal signals presented to nine normal-hearing adults. Stimuli were presented at a signal-to-noise ratio of -3 dB in four background conditions: quiet, continuous noise, intermittent noise, and four-talker babble. Responses to tonal signals were not significantly different for the three maskers. However, responses to speech signals in the four-talker babble resulted in longer P3 latencies, smaller P3 amplitudes, poorer discrimination accuracy, and longer reaction times than in any of the other conditions. Results also demonstrate significant correlations between physiological and behavioral data. As latency of the P3 increased, reaction times also increased and sentence recognition scores decreased. The data confirm a differential effect of masker type on the P3 and behavioral responses and present evidence of interference by an informational masker with speech understanding at the level of the cortex. Results also validate the use of the P3 as a useful measure to demonstrate physiological correlates of informational masking.
Article
Full-text available
Neuronal oscillations are ubiquitous in the brain and may contribute to cognition in several ways: for example, by segregating information and organizing spike timing. Recent data show that delta, theta and gamma oscillations are specifically engaged by the multi-timescale, quasi-rhythmic properties of speech and can track its dynamics. We argue that they are foundational in speech and language processing, 'packaging' incoming information into units of the appropriate temporal granularity. Such stimulus-brain alignment arguably results from auditory and motor tuning throughout the evolution of speech and language and constitutes a natural model system allowing auditory research to make a unique contribution to the issue of how neural oscillatory activity affects human cognition.
Article
Full-text available
There is considerable uncertainty about the time-course of central auditory maturation. On some indices, children appear to have adult-like competence by school age, whereas for other measures development follows a protracted course. We studied auditory development using auditory event-related potentials (ERPs) elicited by tones in 105 children on two occasions two years apart. Just over half of the children were 7 years initially and 9 years at follow-up, whereas the remainder were 9 years initially and 11 years at follow-up. We used conventional analysis of peaks in the auditory ERP, independent component analysis, and time-frequency analysis. We demonstrated maturational changes in the auditory ERP between 7 and 11 years, both using conventional peak measurements, and time-frequency analysis. The developmental trajectory was different for temporal vs. fronto-central electrode sites. Temporal electrode sites showed strong lateralisation of responses and no increase of low-frequency phase-resetting with age, whereas responses recorded from fronto-central electrode sites were not lateralised and showed progressive change with age. Fronto-central vs. temporal electrode sites also mapped onto independent components with differently oriented dipole sources in auditory cortex. A global measure of waveform shape proved to be the most effective method for distinguishing age bands. The results supported the idea that different cortical regions mature at different rates. The ICC (intraclass correlation) measure is proposed as the best measure of 'auditory ERP age'.
Article
Full-text available
Speech scientists have long proposed that formant exaggeration in infant-directed speech plays an important role in language acquisition. This event-related potential (ERP) study investigated neural coding of formant-exaggerated speech in 6-12-month-old infants. Two synthetic /i/ vowels were presented in alternating blocks to test the effects of formant exaggeration. ERP waveform analysis showed significantly enhanced N250 for formant exaggeration, which was more prominent in the right hemisphere than the left. Time-frequency analysis indicated increased neural synchronization for processing formant-exaggerated speech in the delta band at frontal-central-parietal electrode sites as well as in the theta band at frontal-central sites. Minimum norm estimates further revealed a bilateral temporal-parietal-frontal neural network in the infant brain sensitive to formant exaggeration. Collectively, these results provide the first evidence that formant expansion in infant-directed speech enhances neural activities for phonetic encoding and language learning.
Article
Full-text available
We recorded the electrocorticogram directly from the exposed cortical surface of awake neurosurgical patients during the presentation of auditory syllable stimuli. All patients were unanesthetized as part of a language-mapping procedure for subsequent left-hemisphere tumor resection. Time-frequency analyses showed significant high-gamma (γhigh: 70-160 Hz) responses from the left superior temporal gyrus, but no reliable response from the left inferior frontal gyrus. Alpha suppression (α: 7-14 Hz) and event-related potential responses exhibited a more widespread topography. Across electrodes, the alpha suppression from 200 to 450 ms correlated with the preceding (50-200 ms) γhigh increase. The results are discussed in terms of the different physiological origins of these electrocortical signals.
Article
Full-text available
To systematically investigate in normal-hearing listeners the effects of decreased audibility produced by broadband noise masking on the cortical event-related potentials (ERPs) N1, N2, and P3 to the speech sounds /ba/ and /da/. Ten normal-hearing adult listeners actively (button-press response) discriminated the speech sounds /ba/ and /da/ presented in quiet (no masking) or with broadband masking noise (BBN), using an ERP oddball paradigm. The BBN was presented at 50, 60, and 70 dB SPL when speech sounds were presented at 65 dB ppe SPL and at 60, 70, and 80 dB SPL when speech sounds were presented at 80 dB ppe SPL. On average, the 50, 60, 70, and 80 dB SPL BBN maskers produced behavioral threshold elevations of 18, 25, 35, and 48 dB (average for 250 to 4000 Hz), respectively. The BBN maskers produced significant decreases (relative to the quiet condition) in ERP amplitudes and behavioral discriminability. These decreases did not occur, however, until the noise masker intensity (in dB SPL) was equal to or greater than the speech stimulus intensity (in dB ppe SPL), that is, until speech-to-noise ratios (SNRs) were ≤ 0 dB. N1 remained present even after N2, P3, and behavioral discriminability were absent. In contrast to amplitudes, ERP and behavioral latencies showed significant changes at higher (better) SNRs. Significant latency increases occurred when the noise maskers were within 10 to 20 dB of the stimuli (i.e., SNR ≤ 20 dB). The effects of masking were greater for responses to /da/ compared with /ba/. Latency increases occurred with less masking for N1 than for P3 or behavioral reaction time, with N2 falling in between. These results indicate that decreased audibility as a result of masking affects the various ERP peaks in a differential manner and that latencies are more sensitive indicators of these masking effects than are amplitudes.
Article
Full-text available
We have developed a toolbox and graphic user interface, EEGLAB, running under the cross-platform MATLAB environment (The Mathworks, Inc.) for processing collections of single-trial and/or averaged EEG data of any number of channels. Available functions include EEG data, channel and event information importing, data visualization (scrolling, scalp map and dipole model plotting, plus multi-trial ERP-image plots), preprocessing (including artifact rejection, filtering, epoch selection, and averaging), independent component analysis (ICA) and time/frequency decompositions including channel and component cross-coherence supported by bootstrap statistical methods based on data resampling. EEGLAB functions are organized into three layers. Top-layer functions allow users to interact with the data through the graphic interface without needing to use MATLAB syntax. Menu options allow users to tune the behavior of EEGLAB to available memory. Middle-layer functions allow users to customize data processing using command history and interactive 'pop' functions. Experienced MATLAB users can use EEGLAB data structures and stand-alone signal processing functions to write custom and/or batch analysis scripts. Extensive function help and tutorial information are included. A 'plug-in' facility allows easy incorporation of new EEG modules into the main menu. EEGLAB is freely available (http://www.sccn.ucsd.edu/eeglab/) under the GNU public license for noncommercial use and open source development, together with sample data, user tutorial and extensive documentation.
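One of the toolbox's characteristic views is the multi-trial ERP-image plot mentioned above: single trials stacked as image rows, sorted by some variable of interest. A rough Python/matplotlib analogue (synthetic placeholder data and a hypothetical sorting variable, not the EEGLAB implementation) looks like this:

    import numpy as np
    import matplotlib.pyplot as plt

    # epochs: (n_trials, n_samples) single-trial EEG at one channel.
    rng = np.random.default_rng(2)
    epochs = rng.standard_normal((200, 500))
    sort_key = rng.uniform(size=200)       # e.g., reaction times

    # Sort trials by the behavioral variable and render them as an image.
    order = np.argsort(sort_key)
    plt.imshow(epochs[order], aspect='auto', origin='lower', cmap='RdBu_r')
    plt.xlabel('Time (samples)')
    plt.ylabel('Trials (sorted)')
    plt.title('ERP-image-style single-trial plot')
    plt.show()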
Article
Full-text available
This article provides a new, more comprehensive view of event-related brain dynamics founded on an information-based approach to modeling electroencephalographic (EEG) dynamics. Most EEG research focuses either on peaks 'evoked' in average event-related potentials (ERPs) or on changes 'induced' in the EEG power spectrum by experimental events. Although these measures are nearly complementary, they do not fully model the event-related dynamics in the data, and cannot isolate the signals of the contributing cortical areas. We propose that many ERPs and other EEG features are better viewed as time/frequency perturbations of underlying field potential processes. The new approach combines independent component analysis (ICA), time/frequency analysis, and trial-by-trial visualization that measures EEG source dynamics without requiring an explicit head model.
Article
Full-text available
Clocks tick, bridges and skyscrapers vibrate, neuronal networks oscillate. Are neuronal oscillations an inevitable by-product, similar to bridge vibrations, or an essential part of the brain's design? Mammalian cortical neurons form behavior-dependent oscillating networks of various sizes, which span five orders of magnitude in frequency. These oscillations are phylogenetically preserved, suggesting that they are functionally relevant. Recent findings indicate that network oscillations bias input selection, temporally link neurons into assemblies, and facilitate synaptic plasticity, mechanisms that cooperatively support temporal representation and long-term consolidation of information.
Article
Full-text available
This study was designed to characterize the effect of background noise on the identification of syllables using behavioral and electrophysiological measures. Twenty normal-hearing adults (18-30 years) performed an identification task in a two-alternative forced-choice paradigm. Stimuli consisted of naturally produced syllables [da] and [ga] embedded in white noise. The noise was initiated 1000 ms before the onset of the speech stimuli in order to separate the auditory event-related potential (AERP) response to noise onset from that to the speech. Syllables were presented in quiet and in five SNRs: +15, +3, 0, -3, and -6 dB. Results show that (1) performance accuracy, d', and reaction time were affected by the noise, more so for reaction time; (2) both N1 and P3 latencies were prolonged as noise levels increased, more so for P3; (3) [ga] was better identified than [da] in all noise conditions; and (4) P3 latency was longer for [da] than for [ga] for SNR 0 through -6 dB, while N1 latency was longer for [ga] than for [da] in most listening conditions. In conclusion, the unique stimuli structure utilized in this study demonstrated the effects of noise on speech recognition at both the physical and the perceptual processing levels.
Book
How does the brain code and process incoming information, how does it recognize a certain object, how does a certain Gestalt come into our awareness? One of the key issues to conscious realization of an object, of a Gestalt, is the attention devoted to the corresponding sensory input which evokes the neural pattern underlying the Gestalt. This requires that the attention be devoted to one set of objects at a time. However, the attention may be switched quickly between different objects or ongoing input processes. It is to be expected that such mechanisms are reflected in the neural dynamics: neurons or neuronal assemblies which pertain to one object may fire, possibly in rapid bursts at a time. Such firing bursts may enhance the synaptic strength in the corresponding cell assembly and thereby form the substrate of short-term memory. However, we may well become aware of two different objects at a time. How can we ensure that the firing patterns which may relate to, say, a certain type of movement (columns in V5) or to a color (V4) of one object do not become mixed with those of another object? Such a blend may only happen if the presentation times become very short (below 20-30 ms). One possibility is that neurons pertaining to one cell assembly fire synchronously. Then different cell assemblies firing at different rates may code different information.
Article
To determine whether auditory event-related potentials (ERPs) to a phonemic fricative contrast ("s" and "sh") show significant differences in listening conditions with or without a hearing aid and whether the aided condition significantly alters a listener's ERP responses to the fricative speech sounds. The raw EEG data were collected using a 64-channel system from 10 healthy adult subjects with normal hearing. The fricative stimuli were digitally edited versions of naturally produced syllables, /sa/ and /ʃa/. The evoked responses were derived in unaided and aided conditions by using an alternating block design with a passive listening task. Peak latencies and amplitudes of the P1-N1-P2 components and the N1' and P2' peaks of the acoustic change complex (ACC) were analyzed. The evoked N1 and N1' responses to the fricative sounds significantly differed in the unaided condition. The fricative contrast also elicited distinct N1-P2 responses in the aided condition. While the aided condition increased and delayed the N1 and ACC responses, significant differences in the P1-N1-P2 and ACC components were still observed, which would support fricative contrast perception at the cortical level. Despite significant alterations in the ERP responses by the aided condition, normal-hearing adult listeners showed distinct neural coding patterns for the voiceless fricative contrast, "s" and "sh," with or without a hearing aid.
Article
Although brainstem dys-synchrony is a hallmark of children with auditory neuropathy spectrum disorder (ANSD), little is known about how the lack of neural synchrony manifests at more central levels. We used time-frequency single-trial EEG analyses (i.e., inter-trial coherence; ITC) to examine cortical phase synchrony in children with normal hearing (NH), sensorineural hearing loss (SNHL) and ANSD. Single-trial time-frequency analyses were performed on cortical auditory evoked responses from 41 NH children, 91 children with ANSD and 50 children with SNHL. The latter two groups included children who received intervention via hearing aids and cochlear implants. ITC measures were compared between groups as a function of hearing loss, intervention type, and cortical maturational status. In children with SNHL, ITC decreased as severity of hearing loss increased. Children with ANSD revealed lower levels of ITC relative to children with NH or SNHL, regardless of intervention. Children with ANSD who received cochlear implants showed significant improvements in ITC with increasing experience with their implants. Cortical phase coherence is significantly reduced as a result of both severe-to-profound SNHL and ANSD. ITC provides a window into the brain oscillations underlying the averaged cortical auditory evoked response. Our results provide a first description of deficits in cortical phase synchrony in children with SNHL and ANSD.
Article
Research has shown that the amplitude and latency of neural responses to passive mismatch negativity (MMN) tasks are affected by noise (Billings et al., 2010). Further studies have revealed that informational masking noise results in decreased P3 amplitude and increased P3 latency, which correlates with decreased discrimination abilities and reaction time (Bennett et al., 2012). This study aims to further investigate neural processing of speech in differing types of noise by attempting to correlate MMN neural responses to consonant and vowel stimuli with results from behavioral sentence recognition tasks. Preliminary behavioral data indicate that noise conditions significantly compromise the perception of consonant change in an oddball discrimination task. Noise appears to have less of an effect on the perception of vowel change. The MMN data are being collected for the detection of consonant change and vowel change in different noise conditions. The results will be examined to address how well the pre-attentive MMN measures at the phonemic level can predict speech intelligibility at the sentence level using the same noise conditions.
Article
Perception-in-noise deficits have been demonstrated across many populations and listening conditions. Many factors contribute to successful perception of auditory stimuli in noise, including neural encoding in the central auditory system. Physiological measures such as cortical auditory-evoked potentials (CAEPs) can provide a view of neural encoding at the level of the cortex that may inform our understanding of listeners' abilities to perceive signals in the presence of background noise. To understand signal-in-noise neural encoding better, we set out to determine the effect of signal type, noise type, and evoking paradigm on the P1-N1-P2 complex. Tones and speech stimuli were presented to nine individuals in quiet and in three background noise types: continuous speech spectrum noise, interrupted speech spectrum noise, and four-talker babble at a signal-to-noise ratio of -3 dB. In separate sessions, CAEPs were evoked by a passive homogenous paradigm (single repeating stimulus) and an active oddball paradigm. The results for the N1 component indicated significant effects of signal type, noise type, and evoking paradigm. Although components P1 and P2 also had significant main effects of these variables, only P2 demonstrated significant interactions among these variables. Signal type, noise type, and evoking paradigm all must be carefully considered when interpreting signal-in-noise evoked potentials. Furthermore, these data confirm the possible usefulness of CAEPs as an aid to understand perception-in-noise deficits.
Article
This study employed behavioral and electrophysiological measures to examine selective listening of concurrent auditory stimuli. Stimuli consisted of four compound sounds, each created by mixing a pure tone with filtered noise bands at a signal-to-noise ratio of +15 dB. The pure tones and filtered noise bands each contained two levels of pitch. Two separate conditions were created; the background stimuli varied randomly or were held constant. In separate blocks, participants were asked to judge the pitch of tones or the pitch of filtered noise in the compound stimuli. Behavioral data consistently showed lower sensitivity and longer response times for classification of filtered noise when compared with classification of tones. However, differential effects were observed in the peak components of auditory event-related potentials (ERPs). Relative to tone classification, the P1 and N1 amplitudes were enhanced during the more difficult noise classification task in both test conditions, but the peak latencies were shorter for P1 and longer for N1 during noise classification. Moreover, a significant interaction between condition and task was seen for the P2. The results suggest that the essential ERP components for the same compound auditory stimuli are modulated by listeners' focus on specific aspects of information in the stimuli.
Article
Neuroelectric oscillations reflect rhythmic shifting of neuronal ensembles between high and low excitability states. In natural settings, important stimuli often occur in rhythmic streams, and when oscillations entrain to an input rhythm their high excitability phases coincide with events in the stream, effectively amplifying neuronal input responses. When operating in a 'rhythmic mode', attention can use these differential excitability states as a mechanism of selection by simply enforcing oscillatory entrainment to a task-relevant input stream. When there is no low-frequency rhythm that oscillations can entrain to, attention operates in a 'continuous mode', characterized by extended increase in gamma synchrony. We review the evidence for early sensory selection by oscillatory phase-amplitude modulations, its mechanisms and its perceptual and behavioral consequences.
Article
This study investigated the effects of decreased audibility produced by high-pass noise masking on cortical event-related potentials (ERPs) N1, N2, and P3 to the speech sounds /ba/ and /da/ presented at 65 and 80 dB SPL. Normal-hearing subjects pressed a button in response to the deviant sound in an oddball paradigm. Broadband masking noise was presented at an intensity sufficient to completely mask the response to the 65-dB SPL speech sounds, and subsequently high-pass filtered at 4000, 2000, 1000, 500, and 250 Hz. With high-pass masking noise, pure-tone behavioral thresholds increased by an average of 38 dB at the high-pass cutoff and by 50 dB one octave above the cutoff frequency. Results show that as the cutoff frequency of the high-pass masker was lowered, ERP latencies to speech sounds increased and amplitudes decreased. The cutoff frequency where these changes first occurred and the rate of the change differed for N1 compared to N2, P3, and the behavioral measures. N1 showed gradual changes as the masker cutoff frequency was lowered. N2, P3, and behavioral measures showed marked changes below a masker cutoff of 2000 Hz. These results indicate that the decreased audibility resulting from the noise masking affects the various ERP components in a differential manner. N1 is related to the presence of audible stimulus energy, being present whether audible stimuli are discriminable or not. In contrast, N2 and P3 were absent when the stimuli were audible but not discriminable (i.e., when the second formant transitions were masked), reflecting stimulus discrimination. These data have implications regarding the effects of decreased audibility on cortical processing of speech sounds and for the study of cortical ERPs in populations with hearing impairment.
Article
On the basis of a systems-theoretical approach, it was hypothesized that event-related potentials (ERPs) are superpositions of stimulus-evoked and time-locked EEG rhythms reflecting resonance properties of the brain (Başar, 1980). This approach led to frequency analysis of ERPs as a way of analyzing evoked rhythms. The present article outlines the basic features of ERP frequency analysis in comparison to ERP wavelet analysis, a recently introduced method of time-frequency analysis. Both methods were used in an investigation of the functional correlates of evoked rhythms where auditory and visual ERPs were recorded from the cat brain. Intracranial electrodes were located in the primary auditory cortex and in the primary visual cortex thus permitting "cross-modality" experiments. Responses to adequate stimulation (e.g., visual ERP recorded from the visual cortex) were characterized by high-amplitude alpha (8-16 Hz) responses which were not observed for inadequate stimulation. This result is interpreted as a hint at a special role of alpha responses in primary sensory processing. The results of frequency analysis and of wavelet analysis were quite similar, with possible advantages of wavelet methods for single-trial analysis. The results of frequency analysis as performed earlier were thus confirmed by wavelet analysis. This supports the view that ERP frequency components correspond to evoked rhythms with a distinct biological significance.
Article
This study investigated the effects of decreased audibility produced by high-pass noise masking on the cortical event-related potentials (ERPs) N1 and mismatch negativity (MMN) to the speech sounds /ba/ and /da/, presented at 65 dB SPL. ERPs were recorded while normal listeners (N = 11) ignored the stimuli and read a book. Broadband masking noise was simultaneously presented at an intensity sufficient to mask the response to the speech sounds, and subsequently high-pass filtered. The conditions were QUIET (no noise); high-pass cutoff frequencies of 4000, 2000, 1000, 500, and 250 Hz; and broadband noise. Behavioral measures of discrimination of the speech sounds (d' and reaction time) were obtained separately from the ERPs for each listener and condition. As the cutoff frequency of the high-pass masker was lowered, ERP latencies increased and amplitudes decreased. The cutoff frequency where changes first occurred differed for N1 and MMN. N1 showed small systematic changes across frequency beginning with the 4000-Hz high-pass noise. MMN and behavioral measures showed large changes that occurred at approximately 1000 Hz. These results indicate that decreased audibility, resulting from the masking, affects N1 and the MMN in a differential manner. N1 reflects the presence of audible stimulus energy, being present in all conditions where stimuli were audible, whether or not they were discriminable. The MMN is present only for those conditions where stimuli were behaviorally discriminable. These studies of cortical ERPs in high-pass noise provide insight into the changes in brain processes and behavioral performance that occur when audibility is reduced, as in hearing loss.
Article
Electrophysiological and hemodynamic responses of the brain allow investigation of the neural origins of human attention. We review attention-related brain responses from auditory and visual tasks employing oddball and novelty paradigms. Dipole localization and intracranial recordings, as well as functional magnetic resonance imaging, reveal multiple areas involved in generating and modulating attentional brain responses. In addition, the influence of lesions of circumscribed areas of the human cortex on attentional mechanisms is reviewed. While it is obvious that damaged brain tissue no longer functions properly, it has also been shown that functions of non-lesioned brain areas are impaired due to loss of modulatory influence of the lesioned area. Both early (P1 and N1) and late (P3) event-related potentials are modulated by excitatory and inhibitory mechanisms. Oscillatory EEG correlates of attention in the alpha and gamma frequency range also show attentional modulation.
Article
This study investigated the effects of decreased audibility in low-frequency spectral regions, produced by low-pass noise masking, on cortical event-related potentials (ERPs) to the speech sounds /ba/ and /da/. The speech sounds were presented to normal-hearing adults (N = 10) at 65- and 80-dB peak-to-peak equivalent SPL while they were engaged in an active condition (pressing a button to deviant sounds) and a passive condition (ignoring the stimuli and reading a book). Broadband masking noise was simultaneously presented at an intensity sufficient to mask the response to the 65-dB speech sounds and subsequently low-pass filtered. The conditions were quiet (no masking), low-pass noise cutoff frequencies of 250, 500, 1000, 2000, and 4000 Hz, and broadband noise. As the cutoff frequency of the low-pass noise masker was raised, ERP latencies increased and amplitudes decreased. The low-pass noise affected N1 differently than the other ERP or behavioral measures, particularly for responses to 80-dB speech stimuli. N1 showed a smaller decrease in amplitude and a smaller increase in latency compared with the other measures. Further, the cutoff frequency where changes first occurred was different for N1. For 80-dB stimuli, N1 amplitudes showed significant changes when the low-pass noise masker cutoff was raised to 4000 Hz. In contrast, d', MMN, N2, and P3 amplitudes did not change significantly until the low-pass noise masker was raised to 2000 Hz. N1 latencies showed significant changes when the low-pass noise masker was raised to 1000 Hz, whereas RT, MMN, N2, and P3 latencies did not change significantly until the low-pass noise masker was raised to 2000 Hz. No significant differences in response amplitudes were seen across the hemispheres (electrode sites C3M versus C4M) in quiet or in masking noise. These results indicate that decreased audibility, resulting from the masking, affects N1 in a differential manner compared with MMN, N2, P3, and behavioral measures. N1 indexes the presence of audible stimulus energy, being present when speech sounds are audible, whether or not they are discriminable. MMN indexes stimulus discrimination at a pre-attentive level. It was present only when behavioral measures indicated the ability to differentiate the speech sounds. N2 and P3 also were present only when the speech sounds were behaviorally discriminated. N2 and P3 index stimulus discrimination at a conscious level. These cortical ERP in low-pass noise studies provide insight into the changes in brain processes and behavioral performance that occur when audibility is reduced, such as with low-frequency hearing loss.
Article
Nowadays, the mechanisms involved in the genesis of event-related potentials (ERPs) are a matter of debate among neuroscientists. Specifically, the debate lies in whether ERPs arise due to the contribution of a fixed-polarity and fixed-latency superimposed neuronal activity to background electroencephalographic oscillations (evoked model) and/or due to a partial phase synchronization of the ongoing EEG (oscillatory model). The participation of the two mechanisms can be explored by the spectral power modulation and phase coherence of scalp EEG rhythms, respectively. However, an important limitation underlies their measurement: the fact that an added neural activity will be relatively phase-locked to stimulus, thus enhancing both spectral power and phase synchrony measures and making the contribution of each mechanism less clear-cut. This would not be relevant in the case that an increase in phase concentration was not accompanied by any concurrent spectral power modulation, thus opening the way to an oscillatory-based explanation. We computed event-related spectral power modulations and phase coherence to an auditory repeated-stimulus presentation paradigm with tone intensity far from threshold (90 dB SPL), in which N1 decreases its amplitude (N1 gating) as an attenuation brain process. Our data indicate that evoked and oscillatory activity could contribute together to the non-attenuated N1, while N1 to repeated stimuli could be explained by partial phase concentration of scalp EEG activity without concurrent power increase. Therefore, our results show that both increased spectral power and partial phase resetting contribute differentially to different ERPs. Moreover, they show that certain ERPs could arise through reorganization of the phase of ongoing scalp EEG activity only.
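The dissociation at issue can be made explicit with standard time-frequency definitions (the notation below is assumed for illustration, not quoted from the article). Writing X_k(f,t) for the complex spectral estimate of trial k out of N:

    P_{\mathrm{evoked}}(f,t) = \Bigl|\frac{1}{N}\sum_{k=1}^{N} X_k(f,t)\Bigr|^{2}, \qquad
    P_{\mathrm{total}}(f,t) = \frac{1}{N}\sum_{k=1}^{N} \bigl|X_k(f,t)\bigr|^{2}, \qquad
    \mathrm{ITPC}(f,t) = \Bigl|\frac{1}{N}\sum_{k=1}^{N} \frac{X_k(f,t)}{\bigl|X_k(f,t)\bigr|}\Bigr|.

A pure phase reset of ongoing activity raises ITPC (and hence evoked power) without increasing total power, whereas an additive fixed-latency, fixed-polarity component raises both; this is why an increase in phase concentration with no concurrent spectral power modulation points toward the oscillatory account.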
Article
Cortical responses, recorded by electroencephalography and magnetoencephalography, can be characterized in the time domain, to study event-related potentials/fields, or in the time-frequency domain, to study oscillatory activity. In the literature, there is a common conception that evoked, induced, and ongoing oscillations reflect different neuronal processes and mechanisms. In this work, we consider the relationship between the mechanisms generating neuronal transients and how they are expressed in terms of evoked and induced power. This relationship is addressed using a neuronally realistic model of interacting neuronal subpopulations. Neuronal transients were generated by changing neuronal input (a dynamic mechanism) or by perturbing the system's coupling parameters (a structural mechanism) to produce induced responses. By applying conventional time-frequency analyses, we show that, in contradistinction to common conceptions, induced and evoked oscillations are perhaps more related than previously reported. Specifically, structural mechanisms normally associated with induced responses can be expressed in evoked power. Conversely, dynamic mechanisms posited for evoked responses can induce responses if there is variation in neuronal input. We conclude that it may be better to consider evoked responses as the result of mixed dynamic and structural effects. We introduce adjusted power to complement induced power; adjusted power is unaffected by trial-to-trial variations in input and can be attributed to structural perturbations without ambiguity.
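The conventional decomposition this article re-examines can be sketched briefly: "evoked" power is taken from the trial average (the phase-locked part), and "induced" power is the non-phase-locked remainder of the total. The article's adjusted-power measure is not reproduced here; this is only the standard decomposition, under illustrative assumptions about the data layout.

    import numpy as np

    def evoked_induced_power(trials):
        # trials: (n_trials, n_samples); returns per-frequency power spectra.
        # Total power: average over trials of each trial's power spectrum.
        total = np.mean(np.abs(np.fft.rfft(trials, axis=1)) ** 2, axis=0)
        # Evoked power: spectrum of the trial average (phase-locked activity).
        evoked = np.abs(np.fft.rfft(trials.mean(axis=0))) ** 2
        # Induced power: the non-phase-locked residual.
        induced = total - evoked
        return evoked, induced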
Article
The traditional view holds that event-related potentials (ERPs) reflect fixed-latency, fixed-polarity evoked responses that appear superimposed on the 'background EEG'. The validity of the evoked model has been questioned by studies arguing that ERPs are generated, at least in part, by a reset of ongoing oscillations. But a proof of phase reset that is distinct from the 'artificial' influence of evoked components on EEG phase has proven difficult for a variety of methodological reasons. We argue that a theoretical analysis of the assumptions, and an empirical evaluation of the predictions, of the evoked and oscillatory ERP models offers a promising way to shed new light on the mechanisms generating ERPs, one that goes well beyond attempts to prove phase reset. Research on EEG oscillations documents that oscillations are task relevant and share a common operating principle: the control of the timing of neural activity. Both findings suggest that phase reorganization of task-relevant oscillations is a theoretical necessity. We further argue, and show evidence, that (i) task-relevant oscillations exhibit a characteristic interactive relationship between pre- and poststimulus power in the theta and alpha frequency range, such that small prestimulus power is related to large poststimulus power and vice versa, and that (ii) ERP (interpeak) latencies and (iii) ERP amplitudes reflect frequency characteristics of alpha and theta oscillations. We emphasize that central assumptions of the evoked model cannot be substantiated and conclude that the event-related phase reorganization (ERPR) model offers a new way toward an integrative interpretation of ongoing and event-related EEG phenomena.
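The two generative models debated here are easy to contrast in a toy simulation: an additive fixed-latency transient versus a stimulus-triggered reset of an ongoing oscillation's phase. Both produce a deflection in the trial average, which is why averaged waveforms alone cannot separate them. This is a minimal sketch; all parameters below are illustrative assumptions, not values from the literature discussed.

    import numpy as np

    rng = np.random.default_rng(0)
    fs, n_trials, n = 250, 200, 250          # 1-s epochs, stimulus at sample 125
    t = np.arange(n) / fs
    phases = rng.uniform(0, 2 * np.pi, n_trials)[:, None]
    ongoing = np.sin(2 * np.pi * 10.0 * t + phases)   # 10-Hz ongoing activity

    # Evoked model: ongoing activity plus a phase-locked transient at 0.6 s.
    evoked_trials = ongoing + np.exp(-((t - 0.6) ** 2) / (2 * 0.02 ** 2))

    # Oscillatory model: from the stimulus onward, every trial's phase is reset.
    reset_trials = ongoing.copy()
    reset_trials[:, 125:] = np.sin(2 * np.pi * 10.0 * t[125:])

    # Either way, averaging across trials yields an "ERP"-like deflection.
    print(evoked_trials.mean(axis=0).max(), reset_trials.mean(axis=0).max())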
QuickSIN Speech-in-Noise Test Version 1.3. Etymotic Research
  • P Niquette
  • G Gudmundsen
  • M Killion
Niquette, P., Gudmundsen, G., Killion, M., 2001. QuickSIN Speech-in-Noise Test Version 1.3. Etymotic Research, Elk Grove Village, IL.
The Oxford Handbook of Event-Related Potential Components
  • M Bastiaansen
  • A Mazaheri
  • O Jensen
Bastiaansen, M., Mazaheri, A., Jensen, O., 2012. Beyond ERPs: oscillatory neuronal dynamics. In: Luck, S.J., Kappenman, E.S. (Eds.), The Oxford Handbook of Event-Related Potential Components. Oxford University Press, New York, pp. 31–51.
Informational masking
  • G Kidd Jr
  • C R Mason
  • V M Richards
  • F J Gallun
  • N I Durlach
Kidd Jr., G., Mason, C.R., Richards, V.M., Gallun, F.J., Durlach, N.I., 2007. Informational masking. In: Yost, W.A., Popper, A.N., Fay, R.R. (Eds.), Springer Handbook of Auditory Research: Auditory Perception of Sound Sources, vol. 29. Springer, New York, NY, pp. 143–189.