Article

Neural Correlates of Phonetic Learning in Postlingually Deafened Cochlear Implant Listeners


Abstract

Objective: The present training study aimed to examine the fine-scale behavioral and neural correlates of phonetic learning in adult postlingually deafened cochlear implant (CI) listeners. The study investigated whether high variability identification training improved phonetic categorization of the /ba/-/da/ and /wa/-/ja/ speech contrasts and whether any training-related improvements in phonetic perception were correlated with neural markers associated with phonetic learning. It was hypothesized that training would sharpen phonetic boundaries for the speech contrasts and that changes in behavioral sensitivity would be associated with enhanced mismatch negativity (MMN) responses to stimuli that cross a phonetic boundary relative to MMN responses evoked using stimuli from the same phonetic category. Design: A computer-based training program was developed that featured multitalker variability and adaptive listening. The program was designed to help CI listeners attend to the important second formant transition cue that categorizes the /ba/-/da/ and /wa/-/ja/ contrasts. Nine adult CI listeners completed the training, and four additional CI listeners who did not undergo training were included to assess effects of procedural learning. Behavioral pre- and posttests consisted of identification and discrimination of the synthetic /ba/-/da/ and /wa/-/ja/ speech continua. The electrophysiologic MMN response elicited by an across-phoneme-category pair and a within-phoneme-category pair that differed by an acoustically equivalent amount was also derived at pre- and posttest intervals for each speech contrast. Results: Training significantly enhanced behavioral sensitivity across the phonetic boundary and significantly altered labeling of the stimuli along the /ba/-/da/ continuum. 
While training only slightly altered identification and discrimination of the /wa/-/ja/ continuum, trained CI listeners categorized the /wa/-/ja/ contrast more efficiently than the /ba/-/da/ contrast across pre- and posttest sessions. Consistent with the behavioral results, pre-post EEG measures showed that the MMN amplitude to the across-phoneme-category pair significantly increased with training for both the /ba/-/da/ and /wa/-/ja/ contrasts, but the MMN was unchanged with training for the corresponding within-phoneme-category pairs. Significant brain-behavior correlations were observed between changes in the MMN amplitude evoked by across-category phoneme stimuli and changes in the slope of identification functions for the trained listeners for both speech contrasts. Conclusions: The brain and behavior data of the present study provide evidence that substantial neural plasticity for phonetic learning in adult postlingually deafened CI listeners can be induced by high variability identification training. These findings have potential clinical implications related to the aural rehabilitation process following receipt of a CI device.
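The reported brain-behavior link rests on two quantities: the slope of each listener's identification function and the training-related change in MMN amplitude. A minimal sketch of that analysis follows; the function names and all data values are hypothetical illustrations, not the authors' code or results.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr

def logistic(x, x0, k):
    """Two-parameter logistic psychometric function: x0 = boundary, k = slope."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

def identification_slope(steps, prop_da):
    """Fit the identification function along a continuum and return its slope;
    a steeper slope indicates a sharper phonetic boundary."""
    (_, k), _ = curve_fit(logistic, steps, prop_da, p0=[np.mean(steps), 1.0])
    return k

steps = np.arange(1, 8)                                       # 7-step /ba/-/da/ continuum
pre   = np.array([0.10, 0.15, 0.30, 0.50, 0.70, 0.85, 0.90])  # hypothetical pretest
post  = np.array([0.02, 0.05, 0.15, 0.50, 0.85, 0.95, 0.98])  # hypothetical posttest

slope_change = identification_slope(steps, post) - identification_slope(steps, pre)
assert slope_change > 0  # training sharpened the boundary

# Brain-behavior correlation across listeners (hypothetical change scores).
d_slope = np.array([0.4, 0.9, 0.2, 1.1, 0.6, 0.8, 0.3, 1.0, 0.5])
d_mmn   = np.array([0.5, 1.2, 0.1, 1.3, 0.7, 0.9, 0.4, 1.1, 0.6])  # |delta MMN| in uV
r, p = pearsonr(d_slope, d_mmn)
```

With per-listener change scores in hand, a simple Pearson correlation of this kind is one common way to test whether larger MMN enhancements accompany steeper identification slopes.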


... To match the duration characteristics across stimulus types, the stimulus was digitally edited to 170 ms (Sony Sound Forge 9.0, Sony Creative Software) using temporal stretching and shrinking via the Pitch Synchronous Overlap-Add technique (Moulines & Charpentier, 1990). The edited consonant duration was 60 ms, and the edited vowel duration was 110 ms (Miller & Zhang, 2014; Miller et al., 2016b; Sharma et al., 2000). ...
... The results of this study suggest synthetic speech and nonspeech were gated similarly. We have used the synthetic /bɑ/ token in previous studies, and it is perceived as speech (Miller et al., 2016a, 2016b). In contrast, sinewave speech is typically not reported as sounding like speech unless listeners are instructed to engage in a speech mode of perception (Dehaene-Lambertz et al., 2005). ...
Article
Purpose Auditory sensory gating is a neural measure of inhibition and is typically measured with a click or tonal stimulus. This electrophysiological study examined if stimulus characteristics and the use of speech stimuli affected auditory sensory gating indices. Method Auditory event-related potentials were elicited using natural speech, synthetic speech, and nonspeech stimuli in a traditional auditory gating paradigm in 15 adult listeners with normal hearing. Cortical responses were recorded at 64 electrode sites, and peak amplitudes and latencies to the different stimuli were extracted. Individual data were analyzed using repeated-measures analysis of variance. Results Significant gating of P1–N1–P2 peaks was observed for all stimulus types. N1–P2 cortical responses were affected by stimulus type, with significantly less neural inhibition of the P2 response observed for natural speech compared to nonspeech and synthetic speech. Conclusions Auditory sensory gating responses can be measured using speech and nonspeech stimuli in listeners with normal hearing. The results of the study indicate the amount of gating and neural inhibition observed is affected by the spectrotemporal characteristics of the stimuli used to evoke the neural responses.
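Sensory gating in a paired-stimulus paradigm of this kind is commonly quantified as the ratio of the peak response to the second stimulus over the first, with ratios below 1 indicating neural inhibition. A rough numpy sketch follows; the function names, latency window, and synthetic waveforms are illustrative assumptions, not taken from the study.

```python
import numpy as np

def peak_amplitude(erp, times, window):
    """Largest positive deflection within a latency window (e.g., for the P2 peak)."""
    mask = (times >= window[0]) & (times <= window[1])
    return erp[mask].max()

def gating_ratio(erp_s1, erp_s2, times, window=(0.150, 0.250)):
    """Amplitude ratio S2/S1: values < 1 indicate gating (suppression of the
    repeated stimulus), values near 1 indicate little inhibition."""
    return peak_amplitude(erp_s2, times, window) / peak_amplitude(erp_s1, times, window)

# Synthetic example: the response to S2 is attenuated relative to S1.
times = np.linspace(0.0, 0.5, 501)
p2_wave = lambda amp: amp * np.exp(-((times - 0.2) ** 2) / (2 * 0.02 ** 2))
ratio = gating_ratio(p2_wave(4.0), p2_wave(2.0), times)
assert ratio < 1.0  # gating observed
```

The same ratio can be computed per stimulus type (natural speech, synthetic speech, nonspeech) to compare how much inhibition each evokes, as the abstract describes.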
... 10th International Conference on Speech Prosody 2020, 25-28 May 2020, Tokyo, Japan. For CI individuals, the efficacy of high variability identification training has been evaluated in postlingually deafened adults [16]. In that study, a pretest-intervention-posttest design was implemented with four two-hour sessions of phonetic identification training over a period of two weeks. ...
... The results demonstrated that the multi-talker phonetic identification training promoted the CP of speech contrasts of /ba/-/da/ and /wa/-/ja/. Moreover, that study also investigated the neural correlates underpinning the improvement of phonetic categorization in terms of enhanced mismatch negativity (MMN) responses to sound stimuli across different phonetic categories relative to stimuli within the same category [16]. ...
... Children with CIs from the training group completed five sessions of phonetic identification training over the course of three weeks in the rehabilitation center. A computer-based training program was developed (cf. Miller et al., 2016a, 2016b). The training program included a 4AFC paradigm for sound-picture matching, the procedure of which was similar to that in the lexical tone identification task. ...
Article
Full-text available
Purpose Lexical tone perception is known to be persistently difficult for individuals with cochlear implants (CIs). The purpose of this study was to evaluate the efficacy of high-variability phonetic training (HVPT) in improving Mandarin tone perception for native-speaking children with CIs. Method A total of 28 Mandarin-speaking pediatric CI recipients participated in the study. Half of the children with CIs received a five-session HVPT within a period of 3 weeks. Identification and discrimination of lexical tones produced by familiar talkers (used during training) and novel talkers (not used during training) were measured before, immediately after, and 10 weeks after training termination. The other half untrained children served as control for the identical pre- and posttests. Results Lexical tone perception significantly improved in both trained identification task and untrained discrimination task for the trainees. There was also a significant effect in transfer of learning to perceiving tones produced by novel talkers. Moreover, training-induced gains were retained for up to 10 weeks after training. By comparison, no significant pre–post changes were observed in the control group. Conclusion The results provide the first systematical assessment for the efficacy of the HVPT protocol for Mandarin-speaking pediatric CI users with congenital hearing loss, which supports the clinical utility of intensive short-term HVPT in these children's rehabilitative regimens.
... The remarkable success of the CI is largely attributable to the proper use of signal transduction technology to harness central auditory plasticity of the implantees [2]. While CI users can significantly benefit from the neuroplasticity driven by the CI-empowered learning experience in developing their auditory, linguistic, and cognitive skills [3][4][5][6][7][8][9], pitch perception poses a unique challenge for these individuals. A vast body of literature demonstrated that CI recipients show deficits in pitch-related perceptual tasks, including voice emotion perception [10][11][12], speech prosody recognition [13][14][15], music appreciation [16][17][18], and lexical tone perception [19][20][21]. ...
Article
Full-text available
Pitch perception is known to be difficult for individuals with cochlear implant (CI), and adding a hearing aid (HA) in the non-implanted ear is potentially beneficial. The current study aimed to investigate the bimodal benefit for lexical tone recognition in Mandarin-speaking preschoolers using a CI and an HA in opposite ears. The child participants were required to complete tone identification in quiet and in noise with CI + HA in comparison with CI alone. While the bimodal listeners showed confusion between Tone 2 and Tone 3 in recognition, the additional acoustic information from the contralateral HA alleviated confusion between these two tones in quiet. Moreover, significant improvement was demonstrated in the CI + HA condition over the CI alone condition in noise. The bimodal benefit for individual subjects could be predicted by the low-frequency hearing threshold of the non-implanted ear and the duration of bimodal use. The findings support the clinical practice to fit a contralateral HA in the non-implanted ear for the potential benefit in Mandarin tone recognition in CI children. The limitations call for further studies on auditory plasticity on an individual basis to gain insights on the contributing factors to the bimodal benefit or its absence.
... Interestingly, training behavioral capacities for categorization of syllables seems to optimize the amplitude of the MMN [80]. Our results suggest that increasing the amplitude of the F1-evoked MMN could lead to enhanced speech recognition. ...
Article
Cochlear implants restore hearing in deaf individuals, but speech perception remains challenging. Poor discrimination of spectral components is thought to account for limitations of speech recognition in cochlear implant users. We investigated how combined variations of spectral components along two orthogonal dimensions can maximize neural discrimination between two vowels, as measured by mismatch negativity. Adult cochlear implant users and matched normal-hearing listeners underwent electroencephalographic event-related potentials recordings in an optimum-1 oddball paradigm. A standard /a/ vowel was delivered in an acoustic free field along with stimuli having a deviant fundamental frequency (+3 and +6 semitones), a deviant first formant making it a /i/ vowel or combined deviant fundamental frequency and first formant (+3 and +6 semitones /i/ vowels). Speech recognition was assessed with a word repetition task. An analysis of variance between both amplitude and latency of mismatch negativity elicited by each deviant vowel was performed. The strength of correlations between these parameters of mismatch negativity and speech recognition as well as participants’ age was assessed. Amplitude of mismatch negativity was weaker in cochlear implant users but was maximized by variations of vowels’ first formant. Latency of mismatch negativity was later in cochlear implant users and was particularly extended by variations of the fundamental frequency. Speech recognition correlated with parameters of mismatch negativity elicited by the specific variation of the first formant. This nonlinear effect of acoustic parameters on neural discrimination of vowels has implications for implant processor programming and aural rehabilitation.
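The MMN amplitude and latency measures analyzed here are conventionally extracted from the deviant-minus-standard difference wave: the amplitude is the most negative deflection within an analysis window, and the latency is its time point. The sketch below is a generic illustration with simulated waveforms; the window and function names are my own assumptions.

```python
import numpy as np

def mmn_metrics(standard_erp, deviant_erp, times, window=(0.100, 0.250)):
    """Compute MMN amplitude and latency from the difference wave
    (deviant - standard) within the analysis window."""
    diff = deviant_erp - standard_erp
    mask = (times >= window[0]) & (times <= window[1])
    idx = np.argmin(diff[mask])          # most negative point = MMN peak
    return diff[mask][idx], times[mask][idx]

# Simulated responses: a standard with no deviance response and a deviant
# eliciting a negativity peaking near 180 ms (amplitudes in microvolts).
times = np.linspace(0.0, 0.4, 401)
standard = np.zeros_like(times)
deviant = -3.0 * np.exp(-((times - 0.18) ** 2) / (2 * 0.02 ** 2))

amp, lat = mmn_metrics(standard, deviant, times)
assert amp < 0 and 0.10 < lat < 0.25
```

Comparing these two scalars across deviant types (fundamental frequency, first formant, or combined) is what allows the kind of amplitude and latency ANOVA the abstract reports.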
... Auditory training programs are designed to exploit brain plasticity in order to improve speech perception in complex listening situations. Brain imaging tools can be useful in tracking these neurophysiological changes induced by perceptual learning, including measures of neural activation, oscillations, and functional connectivity patterns in the neural substrate dedicated to speech processing (Miller, Zhang, & Nelson, 2016; Rao et al., 2017; Song, Skoe, Banai, & Kraus, 2012; Yu et al., 2017; Zhang & Wang, 2010). Additionally, electrophysiological measures may provide useful information for the development of effective auditory training strategies, as they could track improvements in sensory or cognitive processes underlying speech perception in background noise. ...
Poster
Full-text available
Here we aimed to utilize spectral rippled noise stimuli in combination with a mismatch negativity (MMN) paradigm to develop a neural metric of spectral resolution in CI users.
Article
Full-text available
Purpose: This study implemented a pretest-intervention-posttest design to examine whether multiple-talker identification training enhanced phonetic perception of the /ba/-/da/ and /wa/-/ja/ contrasts in adult postlingually deafened cochlear implant (CI) listeners. Method: Nine CI recipients completed eight hours of identification training using a custom-designed training package. Perception of speech produced by familiar talkers (talkers used during training) and unfamiliar talkers (talkers not used during training) was measured before and after training. Five additional untrained CI recipients completed identical pre- and posttests over the same time course as the trainees to control for procedural learning effects. Results: Perception of the speech contrasts produced by the familiar talkers significantly improved for the trained CI listeners, and effects of perceptual learning transferred to unfamiliar talkers. Such training-induced significant changes were not observed in the control group. Conclusion: The data provide initial evidence for the efficacy of the multiple-talker identification training paradigm for postlingually deafened CI users. This pattern of results is consistent with enhanced phonemic categorization of the trained speech sounds.
Article
Full-text available
Background: Auditory training involves active listening to auditory stimuli and aims to improve performance in auditory tasks. As such, auditory training is a potential intervention for the management of people with hearing loss. Objective: This systematic review (PROSPERO 2011: CRD42011001406) evaluated the published evidence-base for the efficacy of individual computer-based auditory training to improve speech intelligibility, cognition and communication abilities in adults with hearing loss, with or without hearing aids or cochlear implants. Methods: A systematic search of eight databases and key journals identified 229 articles published since 1996, 13 of which met the inclusion criteria. Data were independently extracted and reviewed by the two authors. Study quality was assessed using ten pre-defined scientific and intervention-specific measures. Results: Auditory training resulted in improved performance for trained tasks in 9/10 articles that reported on-task outcomes. Although significant generalisation of learning was shown to untrained measures of speech intelligibility (11/13 articles), cognition (1/1 articles) and self-reported hearing abilities (1/2 articles), improvements were small and not robust. Where reported, compliance with computer-based auditory training was high, and retention of learning was shown at posttraining follow-ups. Published evidence was of very-low to moderate study quality. Conclusions: Our findings demonstrate that published evidence for the efficacy of individual computer-based auditory training for adults with hearing loss is not robust and therefore cannot be reliably used to guide intervention at this time. We identify a need for high-quality evidence to further examine the efficacy of computer-based auditory training for people with hearing loss.
Article
Full-text available
This article provides a brief overview of the advantages of two-ear hearing in children and discusses the limitations, from a psychophysical and a technical perspective, which may constrain the ability of cochlear implant users to gain these benefits. The latest outcomes for children using bilateral cochlear implants are discussed, which suggest that results are more favorable for children who receive both devices before the age of 3.5 to 4 years. The available studies that have investigated electrophysiological responses for children receiving bilateral implants are discussed. These also support the notion that optimum development of binaural auditory skills may be more difficult after the age of 3.5 to 4 years. Studies that investigate the alternative for some children of using a hearing aid on the opposite ear to the cochlear implant are briefly discussed. These indicate that advantages for speech perception in noise and localization can be obtained consistently for children with significant residual hearing in the nonimplanted ear. The article concludes with an attempt to bring the available scientific evidence into the practical clinical context with suggestions that may assist clinicians in making recommendations for families considering bilateral cochlear implantation. Although the evidence remains limited at this time, it is reasonable to suggest that bilateral cochlear implantation can provide improved auditory skills over a single implant for children with severe and profound bilateral hearing loss. The available data suggest that the benefit may be maximized by introducing both implants as early as possible, at least before 3.5 to 4 years of age.
Article
Full-text available
Purpose Several acoustic cues specify any single phonemic contrast. Nonetheless, adult, native speakers of a language share weighting strategies, showing preferential attention to some properties over others. Cochlear implant (CI) signal processing disrupts the salience of some cues: In general, amplitude structure remains readily available, but spectral structure less so. This study asked how well speech recognition is supported if CI users shift attention to salient cues not weighted strongly by native speakers. Method Twenty adults with CIs participated. The /bɑ/-/wɑ/ contrast was used because spectral and amplitude structure varies in correlated fashion for this contrast. Adults with normal hearing weight the spectral cue strongly but the amplitude cue negligibly. Three measurements were made: labeling decisions, spectral and amplitude discrimination, and word recognition. Results Outcomes varied across listeners: Some weighted the spectral cue strongly, some weighted the amplitude cue, and some weighted neither. Spectral discrimination predicted spectral weighting. Spectral weighting explained the most variance in word recognition. Age of onset of hearing loss predicted spectral weighting but not unique variance in word recognition. Conclusion The weighting strategies of listeners with normal hearing likely support speech recognition best, so efforts in implant design, fitting, and training should focus on developing those strategies.
Article
Full-text available
Learning electrically stimulated speech patterns can be a new and difficult experience for many cochlear implant users. In the present study, ten cochlear implant patients participated in an auditory training program using speech stimuli. Training was conducted at home using a personal computer for 1 hour per day, 5 days per week, for a period of 1 month or longer. Results showed a significant improvement in all patients' speech perception performance. These results suggest that moderate auditory training using a computer-based auditory rehabilitation tool can be an effective approach for improving the speech perception performance of cochlear implant patients.
Article
Full-text available
Deals with certain misunderstandings on which H. L. Lane (see 39:5) based his criticism of data that bear on a motor theory of speech perception. Lane criticized experiments that had demonstrated contrasting tendencies toward "categorical" perception of stop consonants and "continuous" perception of vowels and nonspeech sounds. He also undertook to demonstrate that categorical perception of nonspeech sounds can be produced by the ordinary procedures of discrimination training, and so to refute the claim that such perception is an interesting characteristic of the speech mode. It is shown that contrary to Lane's claim, discrimination training is not sufficient to produce categorical perception. (34 ref.)
Article
Full-text available
Neural plasticity in speech acquisition and learning is concerned with the timeline trajectory and the mechanisms of experience-driven changes in the neural circuits that support or disrupt linguistic function. In this selective review, we discuss the role of phonetic learning in language acquisition, the "critical period" of learning, the agents of neural plasticity, and the distinctiveness of linguistic systems in the brain. In particular, we argue for the necessity to look at brain-behavior connections using modern brain imaging techniques, seek explanations based on measures of neural sensitivity, neural efficiency, neural specificity and neural connectivity at the cortical level, and point out some key factors that may facilitate or limit second language learning. We conclude by highlighting the theoretical and practical issues for future studies and suggest ways to optimize language learning and treatment.
Article
Full-text available
In this article, we present a summary of recent research linking speech perception in infancy to later language development, as well as a new empirical study examining that linkage. Infant phonetic discrimination is initially language universal, but a decline in phonetic discrimination occurs for nonnative phonemes by the end of the 1st year. Exploiting this transition in phonetic perception between 6 and 12 months of age, we tested the hypothesis that the decline in nonnative phonetic discrimination is associated with native-language phonetic learning. Using a standard behavioral measure of speech discrimination in infants at 7 months and measures of their language abilities at 14, 18, 24, and 30 months, we show (a) a negative correlation between infants' early native and nonnative phonetic discrimination skills and (b) that native- and nonnative-phonetic discrimination skills at 7 months differentially predict future language ability. Better native-language discrimination at 7 months predicts accelerated later language abilities, whereas better nonnative-language discrimination at 7 months predicts reduced later language abilities. The discussion focuses on (a) the theoretical connection between speech perception and language development and (b) the implications of these findings for the putative "critical period" for phonetic learning. Work in my laboratory has recently been focused on two fundamental questions and their theoretical intersect. The first is the role that infant speech perception plays in the acquisition of language. The second is whether early speech perception can reveal the mechanism underlying the putative "critical period" in language acquisition.
Article
Full-text available
This study investigates whether L2 learners can be trained to make better use of phonetic information from visual cues in their perception of a novel phonemic contrast. It also evaluates the impact of audiovisual perceptual training on the learners’ pronunciation of a novel contrast. The use of visual cues for speech perception was evaluated for two English phonemic contrasts: the /v/–/b/–/p/ labial/labiodental contrast and /l/–/r/ contrast. In the first study, 39 Japanese learners of English were tested on their perception of the /v/–/b/–/p/ distinction in audio, visual and audiovisual modalities, and then undertook ten sessions of either auditory (‘A training’) or audiovisual (‘AV training’) perceptual training before being tested again. AV training was more effective than A training in improving the perception of the labial/labiodental contrast. In a second study, 62 Japanese learners of English were tested on their perception of the /l/–/r/ contrast in audio, visual and audiovisual modalities, and then undertook ten sessions of perceptual training with either auditory stimuli (‘A training’), natural audiovisual stimuli (‘AV Natural training’) or audiovisual stimuli with a synthetic face synchronized to natural speech (‘AV Synthetic training’). Perception of the /l/–/r/ contrast improved in all groups but learners trained audiovisually did not improve more than those trained auditorily. Auditory perception improved most for ‘A training’ learners and performance in the lipreading alone condition improved most for ‘natural AV training’ learners. The learners’ pronunciation of /l/–/r/ improved significantly following perceptual training, and a greater improvement was obtained for the ‘AV Natural training’ group. This study shows that sensitivity to visual cues for non-native phonemic contrasts can be enhanced via audiovisual perceptual training. AV training is more effective than A training when the visual cues to the phonemic contrast are sufficiently salient. 
Seeing the facial gestures of the talker also leads to a greater improvement in pronunciation, even for contrasts with relatively low visual salience.
Article
Full-text available
The mismatch negativity (MMN) event-related potential is a non-task related neurophysiologic index of auditory discrimination. The MMN was elicited in eight cochlear implant recipients by the synthesized speech stimulus pair /da/ and /ta/. The response was remarkably similar to the MMN measured in normal-hearing individuals to the same stimuli. The results suggest that the central auditory system can process certain aspects of speech consistently, independent of whether the stimuli are processed through a normal cochlea or mediated by a cochlear prosthesis. The MMN shows promise as a measure for the objective evaluation of cochlear-implant function, and for the study of central neurophysiological processes underlying speech perception.
Article
Full-text available
Although some cochlear implant (CI) listeners can show good word recognition accuracy, it is not clear how they perceive and use the various acoustic cues that contribute to phonetic perceptions. In this study, the use of acoustic cues was assessed for normal-hearing (NH) listeners in optimal and spectrally degraded conditions, and also for CI listeners. Two experiments tested the tense/lax vowel contrast (varying in formant structure, vowel-inherent spectral change, and vowel duration) and the word-final fricative voicing contrast (varying in F1 transition, vowel duration, consonant duration, and consonant voicing). Identification results were modeled using mixed-effects logistic regression. These experiments suggested that under spectrally-degraded conditions, NH listeners decrease their use of formant cues and increase their use of durational cues. Compared to NH listeners, CI listeners showed decreased use of spectral cues like formant structure and formant change and consonant voicing, and showed greater use of durational cues (especially for the fricative contrast). The results suggest that although NH and CI listeners may show similar accuracy on basic tests of word, phoneme or feature recognition, they may be using different perceptual strategies in the process.
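The cue-weighting analysis described here models identification responses as a function of the acoustic cues, with coefficient magnitudes indexing how strongly each cue is weighted. The study used mixed-effects logistic regression; the sketch below is a deliberately simplified fixed-effects version (plain gradient-descent logistic regression on simulated responses), with all names and data my own illustrations.

```python
import numpy as np

def fit_cue_weights(X, y, lr=0.5, iters=2000):
    """Plain logistic regression via gradient ascent on the log-likelihood;
    returns the cue coefficients (intercept dropped)."""
    X = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w += lr * X.T @ (y - p) / len(y)
    return w[1:]

rng = np.random.default_rng(0)
n = 400
spectral = rng.normal(size=n)   # standardized formant-structure cue
duration = rng.normal(size=n)   # standardized duration cue

# Simulated listener relying mainly on duration (a CI-like strategy per the abstract).
p_true = 1.0 / (1.0 + np.exp(-(0.3 * spectral + 2.0 * duration)))
y = (rng.random(n) < p_true).astype(float)

w_spec, w_dur = fit_cue_weights(np.column_stack([spectral, duration]), y)
assert abs(w_dur) > abs(w_spec)  # recovered weighting favors duration
```

A mixed-effects version would add per-listener random slopes on each cue, which is what lets the study characterize individual weighting strategies rather than a single group average.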
Article
Full-text available
While auditory training in quiet has been shown to improve cochlear implant (CI) users' speech understanding in quiet, it is unclear whether training in noise will benefit speech understanding in noise. The present study investigated whether auditory training could improve CI users' speech recognition in noise and whether training with familiar stimuli in an easy listening task (closed-set digit recognition) would improve recognition of unfamiliar stimuli in a more difficult task (open-set sentence recognition). CI users' speech understanding in noise was assessed before, during, and after auditory training with a closed-set recognition task (digits identification) in speech babble. Before training was begun, recognition of digits, Hearing in Noise Test (HINT) sentences, and IEEE sentences presented in steady speech-shaped noise or multitalker speech babble was repeatedly measured to establish a stable estimate of baseline performance. After completing baseline measures, participants trained at home on their personal computers using custom software for approximately 30 mins/day, 5 days/wk, for 4 wks, for a total of 10 hrs of training. Participants were trained only to identify random sequences of three digits presented in speech babble, using a closed-set task. During training, the signal-to-noise ratio was adjusted according to subject performance; auditory and visual feedback was provided. Recognition of digits, HINT sentences, and IEEE sentences in steady noise and speech babble was remeasured after the second and fourth week of training. Training was stopped after the fourth week, and subjects returned to the laboratory 1 mo later for follow-up testing to see whether any training benefits had been retained. Mean results showed that the digit training in babble significantly improved digit recognition in babble (which was trained) and in steady noise (which was not trained). 
The training benefit generalized to improved HINT and IEEE sentence recognition in both types of noise. Training benefits were largely retained in follow-up measures made 1 mo after training was stopped. The results demonstrated that auditory training in noise significantly improved CI users' speech performance in noise, and that training with simple stimuli using an easy closed-set listening task improved performance with difficult stimuli and a difficult open-set listening task.
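The performance-contingent SNR adjustment described above can be sketched as a simple adaptive staircase. The study does not specify its tracking rule, so the 2-down/1-up rule, step size, and simulated listener below are illustrative assumptions only.

```python
import random

def run_staircase(trial_fn, start_snr=10.0, step=2.0, n_trials=40):
    """2-down/1-up adaptive staircase: two consecutive correct responses make
    the task harder (lower SNR), one error makes it easier (higher SNR).
    This rule converges near the 70.7%-correct point."""
    snr, correct_run, track = start_snr, 0, []
    for _ in range(n_trials):
        track.append(snr)
        if trial_fn(snr):          # True if the digit triplet was identified
            correct_run += 1
            if correct_run == 2:
                snr -= step
                correct_run = 0
        else:
            snr += step
            correct_run = 0
    return track

random.seed(1)
# Hypothetical listener whose accuracy improves with SNR (capped at 98%).
listener = lambda snr: random.random() < min(0.98, 0.5 + 0.04 * snr)
track = run_staircase(listener)
assert len(track) == 40 and track[0] == 10.0
```

Tracking the SNR trajectory per session is also a convenient way to visualize training progress, since the converged SNR should drop as digit recognition in babble improves.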
Article
Full-text available
The present study used magnetoencephalography (MEG) to examine perceptual learning of American English /r/ and /l/ categories by Japanese adults who had limited English exposure. A training software program was developed based on the principles of infant phonetic learning, featuring systematic acoustic exaggeration, multi-talker variability, visible articulation, and adaptive listening. The program was designed to help Japanese listeners utilize an acoustic dimension relevant for phonemic categorization of /r-l/ in English. Although training did not produce a native-like phonetic boundary along the /r-l/ synthetic continuum in the second language learners, success was seen in highly significant identification improvement over twelve training sessions and transfer of learning to novel stimuli. Consistent with behavioral results, pre-post MEG measures showed not only enhanced neural sensitivity to the /r-l/ distinction in the left-hemisphere mismatch field (MMF) response but also bilateral decreases in equivalent current dipole (ECD) cluster and duration measures for stimulus coding in the inferior parietal region. The learning-induced increases in neural sensitivity and efficiency were also found in distributed source analysis using Minimum Current Estimates (MCE). Furthermore, the pre-post changes exhibited significant brain-behavior correlations between speech discrimination scores and MMF amplitudes as well as between the behavioral scores and ECD measures of neural efficiency. Together, the data provide corroborating evidence that substantial neural plasticity for second-language learning in adulthood can be induced with adaptive and enriched linguistic exposure. Like the MMF, the ECD cluster and duration measures are sensitive neural markers of phonetic learning.
Article
Full-text available
In this report we review the vowel and consonant recognition ability of patients who use a multichannel cochlear implant and who achieve relatively good word identification scores. The results suggest that vowel recognition is accomplished by good resolution of the frequency of the first formant (F1) combined with poor resolution of the frequency of the second formant (F2). The results also suggest that consonant recognition is accomplished (1) by using information from the amplitude envelope, including periodicity/aperiodicity, as cues to manner and voicing, (2) by using F1 as an aid to the identification of manner and voicing, and (3) by using information from cochlear place of stimulation to provide a very crude indication of the shape of the frequency spectrum above 1 kHz.
Article
Full-text available
We derive a new self-organizing learning algorithm that maximizes the information transferred in a network of nonlinear units. The algorithm does not assume any knowledge of the input distributions, and is defined here for the zero-noise limit. Under these conditions, information maximization has extra properties not found in the linear case (Linsker 1989). The nonlinearities in the transfer function are able to pick up higher-order moments of the input distributions and perform something akin to true redundancy reduction between units in the output representation. This enables the network to separate statistically independent components in the inputs: a higher-order generalization of principal components analysis. We apply the network to the source separation (or cocktail party) problem, successfully separating unknown mixtures of up to 10 speakers. We also show that a variant on the network architecture is able to perform blind deconvolution (cancellation of unknown echoes and reverberation in a speech signal). Finally, we derive dependencies of information transfer on time delays. We suggest that information maximization provides a unifying framework for problems in "blind" signal processing.
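The information-maximization algorithm summarized above can be sketched through its natural-gradient update rule. This is a didactic toy version (logistic nonlinearity, fixed learning rate, no whitening step), not the authors' implementation:

```python
import numpy as np

def infomax_ica(X, lr=0.05, iters=500):
    """Natural-gradient infomax ICA sketch (after Bell & Sejnowski, 1995,
    with Amari's natural gradient).  X: (n_components, n_samples),
    zero-mean.  Returns an unmixing matrix W so that W @ X approximates
    the independent sources."""
    n, m = X.shape
    W = np.eye(n)
    for _ in range(iters):
        U = W @ X                         # current component estimates
        Y = 1.0 / (1.0 + np.exp(-U))      # logistic nonlinearity
        # dW proportional to (I + (1 - 2y) u^T) W, averaged over samples
        W += lr * (np.eye(n) + (1.0 - 2.0 * Y) @ U.T / m) @ W
    return W

# Toy demo: unmix two super-Gaussian (Laplacian) sources.
rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 2000))
A = np.array([[1.0, 0.6], [0.4, 1.0]])    # assumed mixing matrix
X = A @ S
W = infomax_ica(X - X.mean(axis=1, keepdims=True))
```

The logistic nonlinearity matches super-Gaussian sources such as speech; other source distributions call for other nonlinearities.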
Article
Full-text available
There is considerable debate about whether the early processing of sounds depends on whether they form part of speech. Proponents of such speech specificity postulate the existence of language-dependent memory traces, which are activated in the processing of speech but not when equally complex, acoustic non-speech stimuli are processed. Here we report the existence of these traces in the human brain. We presented to Finnish subjects the Finnish phoneme prototype /e/ as the frequent stimulus, and other Finnish phoneme prototypes or a non-prototype (the Estonian prototype /õ/) as the infrequent stimulus. We found that the brain's automatic change-detection response, reflected electrically as the mismatch negativity (MMN), was enhanced when the infrequent, deviant stimulus was a prototype (the Finnish /ö/) relative to when it was a non-prototype (the Estonian /õ/). These phonemic traces, revealed by MMN, are language-specific, as /õ/ caused enhancement of MMN in Estonians. Whole-head magnetic recordings located the source of this native-language, phoneme-related response enhancement, and thus the language-specific memory traces, in the auditory cortex of the left hemisphere.
Article
Full-text available
Factors leading to variability in auditory-visual (AV) speech recognition include the subject's ability to extract auditory (A) and visual (V) signal-related cues, the integration of A and V cues, and the use of phonological, syntactic, and semantic context. In this study, measures of A, V, and AV recognition of medial consonants in isolated nonsense syllables and of words in sentences were obtained in a group of 29 hearing-impaired subjects. The test materials were presented in a background of speech-shaped noise at 0-dB signal-to-noise ratio. Most subjects achieved substantial AV benefit for both sets of materials relative to A-alone recognition performance. However, there was considerable variability in AV speech recognition both in terms of the overall recognition score achieved and in the amount of audiovisual gain. To account for this variability, consonant confusions were analyzed in terms of phonetic features to determine the degree of redundancy between A and V sources of information. In addition, a measure of integration ability was derived for each subject using recently developed models of AV integration. The results indicated that (1) AV feature reception was determined primarily by visual place cues and auditory voicing + manner cues, (2) the ability to integrate A and V consonant cues varied significantly across subjects, with better integrators achieving more AV benefit, and (3) significant intra-modality correlations were found between consonant measures and sentence measures, with AV consonant scores accounting for approximately 54% of the variability observed for AV sentence recognition. Integration modeling results suggested that speechreading and AV integration training could be useful for some individuals, potentially providing as much as 26% improvement in AV consonant recognition.
Article
Full-text available
Here we report that training-associated changes in neural activity can precede behavioral learning. This finding suggests that speech-sound learning occurs at a pre-attentive level which can be measured neurophysiologically (in the absence of a behavioral response) to assess the efficacy of training. Children with biologically based perceptual learning deficits as well as people who wear cochlear implants or hearing aids undergo various forms of auditory training. The effectiveness of auditory training can be difficult to assess using behavioral methods because these populations are communicatively impaired and may have attention and/or cognitive deficits. Based on our findings, if neurophysiological changes are seen during auditory training, then the training method is effectively altering the neural representation of the speech/sounds and changes in behavior are likely to follow.
Article
Full-text available
At the forefront of debates on language are new data demonstrating infants' early acquisition of information about their native language. The data show that infants perceptually "map" critical aspects of ambient language in the first year of life before they can speak. Statistical properties of speech are picked up through exposure to ambient language. Moreover, linguistic experience alters infants' perception of speech, warping perception in the service of language. Infants' strategies are unexpected and unpredicted by historical views. A new theoretical position has emerged, and six postulates of this position are described.
Article
Monolingual speakers of Japanese were trained to identify English /r/ and /l/ using Logan et al.'s [J. Acoust. Soc. Am. 89, 874-886 (1991)] high-variability training procedure. Subjects' performance improved from the pretest to the post-test and during the 3 weeks of training. Performance during training varied as a function of talker and phonetic environment. Generalization accuracy to new words depended on the voice of the talker producing the /r/-/l/ contrast: Subjects were significantly more accurate when new words were produced by a familiar talker than when new words were produced by an unfamiliar talker. This difference could not be attributed to differences in intelligibility of the stimuli. Three and six months after the conclusion of training, subjects returned to the laboratory and were given the post-test and tests of generalization again. Performance was surprisingly good on each test after 3 months without any further training: Accuracy decreased only 2% from the post-test given at the end of training to the post-test given 3 months later. Similarly, no significant decrease in accuracy was observed for the tests of generalization. After 6 months without training, subjects' accuracy was still 4.5% above pretest levels. Performance on the tests of generalization did not decrease and significant differences were still observed between talkers. The present results suggest that the high-variability training paradigm encourages a long-term modification of listeners' phonetic perception. Changes in perception are brought about by shifts in selective attention to the acoustic cues that signal phonetic contrasts. These modifications in attention appear to be retained over time, despite the fact that listeners are not exposed to the /r/-/l/ contrast in their native language environment.
Article
In this study, spectral properties of speech sounds were used to test functional spectral resolution in people who use cochlear implants (CIs). Specifically, perception of the /ba/-/da/ contrast was tested using two spectral cues: Formant transitions (a fine-resolution cue) and spectral tilt (a coarse-resolution cue). Higher weighting of the formant cues was used as an index of better spectral cue perception. Participants included 19 CI listeners and 10 listeners with normal hearing (NH), for whom spectral resolution was explicitly controlled using a noise vocoder with variable carrier filter widths to simulate electrical current spread. Perceptual weighting of the two cues was modeled with mixed-effects logistic regression, and was found to systematically vary with spectral resolution. The use of formant cues was greatest for NH listeners for unprocessed speech, and declined in the two vocoded conditions. Compared to NH listeners, CI listeners relied less on formant transitions, and more on spectral tilt. Cue-weighting results showed moderately good correspondence with word recognition scores. The current approach to testing functional spectral resolution uses auditory cues that are known to be important for speech categorization, and can thus potentially serve as the basis upon which CI processing strategies and innovations are tested.
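Cue weighting of the kind described here is commonly estimated as logistic-regression coefficients on the two normalized cues, with larger coefficients indicating heavier perceptual reliance on that cue. A minimal fixed-effects sketch (the study used mixed-effects logistic regression; the cue values, simulated listener, and weights below are synthetic):

```python
import numpy as np

def fit_cue_weights(formant, tilt, resp, lr=0.5, iters=2000):
    """Logistic regression of binary /da/ responses on two normalized
    cues via batch gradient ascent.  Returns [intercept, formant weight,
    spectral-tilt weight]; a fixed-effects stand-in for the paper's
    mixed-effects model."""
    X = np.column_stack([np.ones_like(formant), formant, tilt])
    w = np.zeros(3)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w += lr * X.T @ (resp - p) / len(resp)   # average log-likelihood gradient
    return w

# Synthetic listener who relies mostly on the formant-transition cue.
rng = np.random.default_rng(1)
formant = rng.uniform(-1, 1, 500)
tilt = rng.uniform(-1, 1, 500)
p_da = 1.0 / (1.0 + np.exp(-(3.0 * formant + 0.5 * tilt)))
resp = (rng.uniform(size=500) < p_da).astype(float)
w = fit_cue_weights(formant, tilt, resp)
```

For this simulated listener the fitted formant coefficient dominates the tilt coefficient, mirroring the normal-hearing pattern the abstract describes.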
Book
Detection Theory is an introduction to one of the most important tools for analysis of data where choices must be made and performance is not perfect. Originally developed for evaluation of electronic detection, detection theory was adopted by psychologists as a way to understand sensory decision making, then embraced by students of human memory. It has since been utilized in areas as diverse as animal behavior and X-ray diagnosis. This book covers the basic principles of detection theory, with separate initial chapters on measuring detection and evaluating decision criteria. Other features include: complete tools for application, including flowcharts, tables, pointers, and software; student-friendly language; complete coverage of the content area, including both one-dimensional and multidimensional models; separate, systematic coverage of sensitivity and response bias measurement; integrated treatment of threshold and nonparametric approaches; an organized, tutorial-level introduction to multidimensional detection theory; popular discrimination paradigms presented as applications of multidimensional detection theory; a new chapter on ideal observers; and an updated chapter on adaptive threshold measurement. This up-to-date summary of signal detection theory is both a self-contained reference work for users and a readable text for graduate students and other researchers learning the material either in courses or on their own.
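The book's central quantities, sensitivity (d') and decision criterion (c), follow directly from hit and false-alarm rates via the inverse-normal transform. A short sketch using the standard formulas, with a conventional 1/(2N) correction so that perfect rates stay finite:

```python
from statistics import NormalDist

def dprime(hits, misses, false_alarms, correct_rejections):
    """Yes/no detection: d' = z(H) - z(F); criterion c = -(z(H) + z(F)) / 2.
    Rates of exactly 0 or 1 are nudged by 1/(2N), a common correction."""
    n_signal = hits + misses
    n_noise = false_alarms + correct_rejections
    h = min(max(hits / n_signal, 1 / (2 * n_signal)), 1 - 1 / (2 * n_signal))
    f = min(max(false_alarms / n_noise, 1 / (2 * n_noise)), 1 - 1 / (2 * n_noise))
    z = NormalDist().inv_cdf
    return z(h) - z(f), -0.5 * (z(h) + z(f))

# 90% hits, 20% false alarms -> d' of about 2.12, slightly liberal criterion.
d, c = dprime(hits=45, misses=5, false_alarms=10, correct_rejections=40)
```

Negative c indicates a bias toward "yes" responses; c = 0 is the unbiased criterion.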
Article
Purpose: To determine if short-term computerized speech-in-noise training can produce significant improvements in speech-in-noise perception by cochlear implant (CI) recipients on standardized audiologic testing measures. Method: Five adult postlingually deafened CI recipients participated in 4 speech-in-noise training sessions using the Seeing and Hearing Speech program (Sensimetrics; Malden, MA). Each participant completed lessons concentrating on consonant and vowel recognition at word, phrase, and sentence levels. Speech-in-noise abilities were assessed using the QuickSIN (Killion, Niquette, Gudmundsen, Revit, & Banerjee, 2004) and the Hearing in Noise Test (HINT; Nilsson, Soli, & Sullivan, 1994). Results: All listeners significantly improved key word identification on the HINT after training, albeit only at the most favorable signal-to-noise ratio (SNR). Listeners also showed a significant reduction in the degree of SNR loss on the QuickSIN after training. Conclusion: Short-term speech-in-noise training may improve speech-in-noise perception in postlingually deafened adult CI recipients.
Article
Eye movements, eye blinks, cardiac signals, muscle noise, and line noise present serious problems for electroencephalographic (EEG) interpretation and analysis when rejecting contaminated EEG segments results in an unacceptable data loss. Many methods have been proposed to remove artifacts from EEG recordings, especially those arising from eye movements and blinks. Often regression in the time or frequency domain is performed on parallel EEG and electrooculographic (EOG) recordings to derive parameters characterizing the appearance and spread of EOG artifacts in the EEG channels. Because EEG and ocular activity mix bidirectionally, regressing out eye artifacts inevitably involves subtracting relevant EEG signals from each record as well. Regression methods become even more problematic when a good regressing channel is not available for each artifact source, as in the case of muscle artifacts. Use of principal component analysis (PCA) has been proposed to remove eye artifacts from multichannel EEG. However, PCA cannot completely separate eye artifacts from brain signals, especially when they have comparable amplitudes. Here, we propose a new and generally applicable method for removing a wide variety of artifacts from EEG records based on blind source separation by independent component analysis (ICA). Our results on EEG data collected from normal and autistic subjects show that ICA can effectively detect, separate, and remove contamination from a wide variety of artifactual sources in EEG records with results comparing favorably with those obtained using regression and PCA methods. ICA can also be used to analyze blink-related brain activity.
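Once an ICA unmixing matrix has been estimated, artifact removal amounts to zeroing the artifactual components and projecting the remainder back into channel space. A minimal sketch of that reconstruction step (the identity unmixing matrix in the demo is a placeholder, not a fitted decomposition):

```python
import numpy as np

def remove_components(eeg, W, artifact_idx):
    """Remove artifact independent components from multichannel EEG.
    eeg: (channels, samples); W: square ICA unmixing matrix;
    artifact_idx: components judged to be eye/muscle/line-noise artifacts."""
    sources = W @ eeg                      # channels -> independent components
    sources[list(artifact_idx), :] = 0.0   # zero out the artifact components
    return np.linalg.inv(W) @ sources      # project the rest back to channels

# Toy check: with an identity unmixing matrix, dropping component 0
# simply zeros channel 0 and leaves channel 1 untouched.
eeg = np.arange(6.0).reshape(2, 3)
clean = remove_components(eeg, np.eye(2), [0])
```

In practice W comes from an ICA fit (e.g., infomax) and artifact components are identified by their scalp topographies and time courses.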
Article
To test the effect of linguistic experience on the perception of a cue that is known to be effective in distinguishing between [r] and [l] in English, 21 Japanese and 39 American adults were tested on discrimination of a set of synthetic speech-like stimuli. The 13 "speech" stimuli in this set varied in the initial stationary frequency of the third formant (F3) and its subsequent transition into the vowel over a range sufficient to produce the perception of [ra] and [la] for American subjects and to produce [ra] (which is not in phonemic contrast to [la]) for Japanese subjects. Discrimination tests of a comparable set of stimuli consisting of the isolated F3 components provided a "nonspeech" control. For Americans, the discrimination of the speech stimuli was nearly categorical, i.e., comparison pairs which were identified as different phonemes were discriminated with high accuracy, while pairs which were identified as the same phoneme were discriminated relatively poorly. In comparison, discrimination of speech stimuli by Japanese subjects was only slightly better than chance for all comparison pairs. Performance on nonspeech stimuli, however, was virtually identical for Japanese and American subjects; both groups showed highly accurate discrimination of all comparison pairs. These results suggest that the effect of linguistic experience is specific to perception in the "speech mode."
Article
Articles contained in this monograph describe the communication performance of 112 teenagers who received multichannel cochlear implants between the ages of 2 and 5 years. Children were first tested during the elementary school years when they were 8 or 9 years of age. They also were tested as adolescents when they were between 15 and 18 years old. Characteristics of the population are described including their modes of communication and educational environments. Child, family and educational variables that will be explored in the following articles as possible predictors of successful outcomes are introduced.
Article
The present study investigated the neurophysiological correlates of categorical perception of Chinese lexical tones in Mandarin Chinese. Relative to standard stimuli, both within- and across-category deviants elicited mismatch negativity (MMN) in bilateral frontal-central recording sites. The MMN elicited in the right sites was marginally larger than in the left sites, which reflects the role of the right hemisphere in acoustic processing. At the same time, relative to within-category deviants, the across-category deviants elicited larger MMN in the left recording sites, reflecting the long-term phonemic traces of lexical tones. These results provide strong neurophysiological evidence in support of categorical perception of lexical tones in Chinese. More important, they demonstrate that acoustic and phonological information is processed in parallel within the MMN time window for the perception of lexical tones. Finally, homologous nonspeech stimuli elicited similar MMN patterns, indicating that lexical tone knowledge influences the perception of nonspeech signals.
Article
This study employed behavioral and electrophysiological measures to examine selective listening of concurrent auditory stimuli. Stimuli consisted of four compound sounds, each created by mixing a pure tone with filtered noise bands at a signal-to-noise ratio of +15 dB. The pure tones and filtered noise bands each contained two levels of pitch. Two separate conditions were created; the background stimuli varied randomly or were held constant. In separate blocks, participants were asked to judge the pitch of tones or the pitch of filtered noise in the compound stimuli. Behavioral data consistently showed lower sensitivity and longer response times for classification of filtered noise when compared with classification of tones. However, differential effects were observed in the peak components of auditory event-related potentials (ERPs). Relative to tone classification, the P1 and N1 amplitudes were enhanced during the more difficult noise classification task in both test conditions, but the peak latencies were shorter for P1 and longer for N1 during noise classification. Moreover, a significant interaction between condition and task was seen for the P2. The results suggest that the essential ERP components for the same compound auditory stimuli are modulated by listeners' focus on specific aspects of information in the stimuli.
Article
Cochlear implantation is effective at restoring partial hearing to profoundly deaf adults, but not all patients receive equal benefit. The present study evaluated the effectiveness of a computer-based self-administered training package that was designed to improve speech perception among adults who had used cochlear implants for more than three years. Eleven adults were asked to complete an hour of auditory training each day, five days a week, for a period of three weeks. Two training tasks were included, one based around discriminating isolated words, and the other around discriminating words in sentences. Compliance with the protocol was good, with eight out of eleven participants completing approximately 15 hours of training, as instructed. A significant improvement of eight percentage points was found on a test of consonant discrimination, but there were no significant improvements on sentence tests or on a test of vowel discrimination. Self-reported benefits were variable and generally small. Further research is needed to establish whether auditory training is particularly effective for identifiable sub-groups of cochlear-implant users.
Article
A variety of studies have demonstrated that organizing stimuli into categories can affect the way the stimuli are perceived. We explore the influence of categories on perception through one such phenomenon, the perceptual magnet effect, in which discriminability between vowels is reduced near prototypical vowel sounds. We present a Bayesian model to explain why this reduced discriminability might occur: It arises as a consequence of optimally solving the statistical problem of perception in noise. In the optimal solution to this problem, listeners' perception is biased toward phonetic category means because they use knowledge of these categories to guide their inferences about speakers' target productions. Simulations show that model predictions closely correspond to previously published human data, and novel experimental results provide evidence for the predicted link between perceptual warping and noise. The model unifies several previous accounts of the perceptual magnet effect and provides a framework for exploring categorical effects in other domains.
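In the single-category case, the model's optimal percept is the posterior mean of the talker's target production, which shrinks the stimulus toward the category mean by an amount that grows with the noise variance. A numerical sketch (the formant-like units and variances below are illustrative, not fitted values):

```python
def perceived(S, mu_c, var_c, var_noise):
    """Posterior mean of the talker's target given stimulus S, a single
    Gaussian category (mu_c, var_c), and Gaussian transmission noise
    var_noise: a weighted average of the stimulus and the category mean.
    This is the one-category case of the Bayesian magnet-effect model."""
    return (var_c * S + var_noise * mu_c) / (var_c + var_noise)

# A stimulus 100 units above the category mean is pulled toward it,
# and pulled harder when the noise variance is larger.
low_noise = perceived(600.0, mu_c=500.0, var_c=100.0, var_noise=25.0)
high_noise = perceived(600.0, mu_c=500.0, var_c=100.0, var_noise=100.0)
```

With var_noise = 25 the percept lands at 580; quadrupling the noise variance pulls it to 550, reproducing the predicted link between perceptual warping and noise.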
Article
The influence of a talker's face (e.g., articulatory gestures) and voice, vocalic context, and word position were investigated in the training of Japanese and Korean English as a second language learners to identify American English /r/ and /l/. In the pretest-posttest design, an identification paradigm assessed the effects of 3 weeks of training using multiple natural exemplars on videotape. Word position, adjacent vowel, and training type (auditory-visual [AV] vs. auditory only; multiple vs. single talker for Koreans) were independent variables. Findings revealed significant effects of training type (greater improvement with AV), talker, word position, and vowel. Identification accuracy generalized successfully to novel stimuli and a new talker. Transfer to significant production improvement was also noted. These findings are compatible with episodic models for the encoding of speech in memory.
Article
The discriminability of bilabial stop consonants differing in VOT (the Abramson-Lisker bilabial series) was measured in a same-different task, an oddity task, and a dual-response discrimination-identification task. Subjects showed excellent within-category discrimination in all three tasks after a moderate amount of training in a same-different task with a fixed standard and with feedback. In addition, discrimination performance continuously improved with increasing stimulus difference for both intra- and intercategory comparisons. Also, subjects were able to alter their identification responses so that well-defined category boundaries fell at arbitrary values determined by the experimenter. These results are not compatible with a strict interpretation of the categorical perception of stop consonants.
Article
The mismatch negativity (MMN) is an automatic cortical evoked potential that signifies the brain's detection of acoustic change. In other words, the MMN reflects the neurophysiologic processes that underlie auditory discrimination. As such, the MMN provides an objective tool for evaluating central auditory mechanisms involved in speech perception. We are using the MMN to study the central auditory processes that encode acoustic changes important for speech perception in 1) normal-hearing adults and children, 2) individuals with impaired auditory systems (including persons with learning disabilities, attention deficit disorders, cochlear implants), and 3) an animal model. Specifically, we have demonstrated that the MMN provides information about the central processing of fine acoustic differences, the neuroanatomic pathways that encode acoustic change, central auditory processing in the presence of peripheral hearing deficits, and central auditory system plasticity. In addition, we have considered methodological challenges associated with measuring the MMN in individual subjects. Several methodological issues--including appropriate stimuli, stimulus presentation variables, the recording protocol and environment, and validation of the MMN in individuals--are discussed.
Article
An auditory event-related brain potential called mismatch negativity (MMN) was measured to study the perception of vowel pitch and formant frequency. In the MMN paradigm, deviant vowels differed from the standards either in F0 or F2 with equal relative steps. Pure tones of corresponding frequencies were used as control stimuli. The results indicate that the changes in F0 or F2 of vowels significantly affected the MMN amplitudes. The only variable significantly affecting the MMN latencies was sex, which, however, did not have any effect on the amplitudes of the MMN. As expected, the MMN amplitudes increased with an increase in the acoustical difference between the standards and the deviants in all cases. On average, the amplitudes were lower for the vowels than for the pure tones of equal loudness. However, in vowels, minor frequency changes in F0 produced higher MMN amplitudes than similar relative changes in F2. It was also noted that even the smallest and phonetically irrelevant change in F2 was detected by the MMN process. Overall, the results demonstrate that the MMN can be measured separately for F0 and F2 of vowels, although the MMN responses show large interindividual differences.
Article
Two experiments were carried out to extend Logan et al.'s recent study [J. S. Logan, S. E. Lively, and D. B. Pisoni, J. Acoust. Soc. Am. 89, 874-886 (1991)] on training Japanese listeners to identify English /r/ and /l/. Subjects in experiment 1 were trained in an identification task with multiple talkers who produced English words containing the /r/-/l/ contrast in initial singleton, initial consonant clusters, and intervocalic positions. Moderate, but significant, increases in accuracy and decreases in response latency were observed between pretest and posttest and during training sessions. Subjects also generalized to new words produced by a familiar talker and novel words produced by an unfamiliar talker. In experiment 2, a new group of subjects was trained with tokens from a single talker who produced words containing the /r/-/l/ contrast in five phonetic environments. Although subjects improved during training and showed increases in pretest-posttest performance, they failed to generalize to tokens produced by a new talker. The results of the present experiments suggest that variability plays an important role in perceptual learning and robust category formation. During training, listeners develop talker-specific, context-dependent representations for new phonetic categories by selectively shifting attention toward the contrastive dimensions of the non-native phonetic categories. Phonotactic constraints in the native language, similarity of the new contrast to distinctions in the native language, and the distinctiveness of contrastive cues all appear to mediate category acquisition.
Article
Behavioral perceptual abilities and neurophysiologic changes observed after listening training can generalize to other stimuli not used in the training paradigm, thereby demonstrating behavioral "transfer of learning" and plasticity in underlying physiologic processes. Nine normal-hearing monolingual English-speaking adults were trained to identify a prevoiced labial stop sound (one that is not used phonemically in the English language). After training, the subjects were asked to discriminate and identify a prevoiced alveolar stop. Mismatch negativity cortical evoked responses (MMN) were recorded to both labial and alveolar stimuli before and after training. Behavioral performance and MMNs also were evaluated in an age-matched control group that did not receive training. Listening training improved the experimental group's ability to discriminate and identify an unfamiliar VOT contrast. That enhanced ability transferred from one place of articulation (labial) to another (alveolar). The behavioral training effects were reflected in the MMN, which showed an increase in duration and area when elicited by the training stimuli as well as a decrease in onset latency when elicited by the transfer stimuli. Interestingly, changes in the MMN were largest over the left hemisphere. The results demonstrate that training can generalize to listening situations beyond those used in training sessions, and that the preattentive central neurophysiologic processes underlying perceptual learning are altered through auditory training.
Article
The aim was to determine whether the ability to use place-coded vowel formant information could be improved after training in a group of congenitally deafened patients, who showed limited speech perception ability after cochlear implant use ranging from 1 yr 8 mo to 6 yr 11 mo. A further aim was to investigate the relationship between electrode position difference limens and vowel recognition. Three children, one adolescent, and one young adult were assessed with synthesized versions of the words /hid, head, had, hud, hod, hood/ containing three formants and with a natural version of these words, as well as with a 12-alternative, closed-set task containing monosyllabic words. The change in performance during a nontraining period was compared to the change in performance after 10 training sessions. After training, two children showed significant gains on a number of tests and improvements were consistent with their electrode discrimination ability. Difference limens ranged from one to three electrodes for these patients as well as for two other patients who showed minimal to no improvements. The minimal gains shown by the final patient could be partly explained by poorer apical electrode position difference limen. Significant gains in vowel perception occurred post-training on several assessments for two of the children. This suggests the need for children to continue to have aural rehabilitation for a substantial period after implantation. Minimal improvements, however, occurred for the remaining patients. With the exception of one patient, their poorer performance was not associated with poorer electrode discrimination.
Article
Mismatch negativity (MMN) was measured in normal-hearing young adult women and men to determine the effect of gender on this auditory evoked potential (AEP). In the experimental condition, recordings were obtained for 1000-Hz tone bursts presented at 75 dB nHL (standard stimuli) and 60 dB nHL (deviant stimuli). AEPs also were obtained in a control condition in which all stimuli were presented at 60 dB nHL; however, 15 percent of these responses were averaged to represent a response analogous to the experimental deviant response. The MMN was derived by subtracting the analogous control waveform from the experimental deviant waveform. Measures of peak-to-peak amplitude, peak latency, and area-under-the-curve were obtained for each derived waveform. Analysis of these data indicated no significant gender differences in peak latency of the MMN response. However, peak-to-peak amplitude and area-under-the-curve were significantly larger for women than for men.
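The derivation described here (a deviant-minus-control difference waveform, followed by peak latency, peak-to-peak amplitude, and area measures) can be sketched in a few lines. The function name, synthetic waveforms, and sampling parameters below are illustrative assumptions, not the study's actual analysis pipeline:

```python
import numpy as np

def derive_mmn(deviant_erp, control_erp, fs):
    """Derive an MMN difference waveform and basic measures.

    deviant_erp, control_erp: averaged ERP waveforms in microvolts,
    time-locked to stimulus onset; fs: sampling rate in Hz.
    """
    mmn = deviant_erp - control_erp            # deviant minus control analogue
    t_ms = np.arange(len(mmn)) / fs * 1000.0   # time axis in milliseconds
    peak_idx = np.argmin(mmn)                  # MMN is a negative deflection
    measures = {
        "peak_latency_ms": t_ms[peak_idx],
        "peak_to_peak_uv": mmn.max() - mmn.min(),
        # rectangular-rule area of the negative deflection, in uV*ms
        "area_uv_ms": np.clip(-mmn, 0.0, None).sum() * (1000.0 / fs),
    }
    return mmn, measures

# Illustrative synthetic waveforms: a 200 ms epoch sampled at 1 kHz,
# with a 2 uV negativity centered near 150 ms in the deviant response.
fs = 1000
t = np.arange(200) / fs
deviant = -2.0 * np.exp(-((t - 0.15) ** 2) / (2 * 0.02 ** 2))
control = np.zeros_like(t)
mmn, m = derive_mmn(deviant, control, fs)
print(round(m["peak_latency_ms"]))  # negativity peaks near 150 ms
```

The subtraction mirrors the control-analogue design above: both waveforms come from physically identical 60 dB nHL stimuli, so anything surviving the difference reflects deviance detection rather than stimulus acoustics.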
Article
The present study examined the electrophysiological responses that native English speakers display during a passive oddball task when they are presented with different types of syllabic contrasts, namely a labial /ba/-dental /d̪a/ contrast, a Hindi dental /d̪a/-retroflex /ɖa/ contrast, and a within-category contrast (two /ba/ tokens). The analyses of the event-related potentials obtained showed that subjects pre-attentively perceive the differences in all experimental conditions, despite not showing such detection behaviourally in the Hindi and within-category conditions. These results support the notion that there is no permanent loss of the initial perceptual abilities that humans have as infants, but that there is an important neural reorganisation which allows the system to overcome the differences detected and only be aware of contrasts that are relevant in the language that will become the subjects' native tongue. We also report order asymmetries in the ERP responses and suggest that the percepts, and not only the physical attributes of the stimuli, have to be considered for the evaluation of the responses obtained.
Article
This article presents an account of how early language experience can impede the acquisition of non-native phonemes during adulthood. The hypothesis is that early language experience alters relatively low-level perceptual processing, and that these changes interfere with the formation and adaptability of higher-level linguistic representations. Supporting data are presented from an experiment that tested the perception of English /r/ and /l/ by Japanese, German, and American adults. The underlying perceptual spaces for these phonemes were mapped using multidimensional scaling and compared to native-language categorization judgments. The results demonstrate that Japanese adults are most sensitive to an acoustic cue, F2, that is irrelevant to the English /r/-/l/ categorization. German adults, in contrast, have relatively high sensitivity to more critical acoustic cues. The results show how language-specific perceptual processing can alter the relative salience of within- and between-category acoustic variation, and thereby interfere with second language acquisition.
Article
This study examined whether cochlear implant users must perceive differences along phonetic continua in the same way as do normal-hearing listeners (i.e., sharp identification functions, poor within-category sensitivity, high between-category sensitivity) in order to recognize speech accurately. Adult postlingually deafened cochlear implant users, who were heterogeneous in terms of their implants and processing strategies, were tested on two phonetic perception tasks using a synthetic /da/-/ta/ continuum (phoneme identification and discrimination) and two speech recognition tasks using natural recordings from ten talkers (open-set word recognition and forced-choice /d/-/t/ recognition). Cochlear implant users tended to have identification boundaries and sensitivity peaks at voice onset times (VOT) that were longer than found for normal-hearing individuals. Sensitivity peak locations were significantly correlated with individual differences in cochlear implant performance; individuals who had a /d/-/t/ sensitivity peak near normal-hearing peak locations were most accurate at recognizing natural recordings of words and syllables. However, speech recognition was not strongly related to identification boundary locations or to overall levels of discrimination performance. The results suggest that perceptual sensitivity affects speech recognition accuracy, but that many cochlear implant users are able to accurately recognize speech without having typical normal-hearing patterns of phonetic perception.
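One common way to quantify the identification boundaries this study relates to speech recognition is to locate the 50% crossover of the labeling function along the VOT continuum. The routine and example data below are illustrative assumptions, not the study's actual analysis:

```python
def category_boundary(vot_ms, prop_t):
    """Locate a /d/-/t/ identification boundary as the VOT at which the
    proportion of /t/ responses first crosses 50%, by linear interpolation
    between adjacent continuum steps.

    vot_ms: ascending continuum step values (voice onset time, ms).
    prop_t: proportion of /t/ responses at each step.
    """
    for i in range(len(vot_ms) - 1):
        lo, hi = prop_t[i], prop_t[i + 1]
        if lo < 0.5 <= hi:
            frac = (0.5 - lo) / (hi - lo)
            return vot_ms[i] + frac * (vot_ms[i + 1] - vot_ms[i])
    return None  # labeling function never crosses 50%

# Illustrative labeling data along a synthetic /da/-/ta/ continuum
vot = [0, 10, 20, 30, 40, 50, 60]
p_t = [0.02, 0.05, 0.10, 0.45, 0.90, 0.97, 1.00]
boundary = category_boundary(vot, p_t)
print(round(boundary, 1))  # crossover falls between the 30 and 40 ms steps
```

A steeper rise around the crossover corresponds to the "sharp identification function" pattern described above; a shallow, gradual rise leaves the boundary poorly defined even when it can be computed.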
Article
Speech understanding with cochlear implants has improved steadily over the last 25 years, and the success of implants has provided a powerful tool for understanding speech recognition in general. Comparing speech recognition in normal-hearing listeners and in cochlear-implant listeners has revealed many important lessons about the types of information necessary for good speech recognition--and some of the lessons are surprising. This paper presents a summary of speech perception research over the last 25 years with cochlear-implant and normal-hearing listeners. As long as the speech is audible, even relatively severe amplitude distortion has only a mild effect on intelligibility. Temporal cues appear to be useful for speech intelligibility only up to about 20 Hz. Whereas temporal information above 20 Hz may contribute to improved quality, it contributes little to speech understanding. In contrast, the quantity and quality of spectral information appear to be critical for speech understanding. As few as four spectral "channels" of information can produce good speech understanding, but more channels are required for difficult listening situations. Speech understanding is sensitive to the placement of spectral information along the cochlea. In prosthetic devices, in which the spectral information can be delivered to any cochlear location, it is critical to present spectral information to the normal acoustic tonotopic location for that information. If there is a shift or distortion of 2 to 3 mm between frequency and cochlear place, speech recognition is decreased dramatically.
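The spectral-channel findings summarized above come from noise-vocoder experiments, in which speech is reduced to a small number of envelope-modulated noise bands. A minimal FFT-based sketch of that manipulation, with illustrative band edges and envelope smoothing rather than the original processing chain, might look like:

```python
import numpy as np

def noise_vocode(signal, fs, n_channels=4, fmin=100.0, fmax=4000.0):
    """Crude FFT-based noise vocoder: split the spectrum into log-spaced
    bands, extract each band's amplitude envelope, and use it to modulate
    band-limited noise, discarding fine spectral detail within each band.
    """
    rng = np.random.default_rng(0)
    edges = np.geomspace(fmin, fmax, n_channels + 1)   # log-spaced band edges
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    spec = np.fft.rfft(signal)
    noise_spec = np.fft.rfft(rng.standard_normal(len(signal)))
    smooth = max(1, int(0.01 * fs))                    # ~10 ms envelope window
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = (freqs >= lo) & (freqs < hi)
        band_sig = np.fft.irfft(np.where(band, spec, 0), len(signal))
        env = np.convolve(np.abs(band_sig), np.ones(smooth) / smooth, "same")
        band_noise = np.fft.irfft(np.where(band, noise_spec, 0), len(signal))
        out += env * band_noise                        # envelope-modulated noise
    return out

# Illustrative input: a 100 ms, 440 Hz tone sampled at 16 kHz
fs = 16000
t = np.arange(int(0.1 * fs)) / fs
voc = noise_vocode(np.sin(2 * np.pi * 440.0 * t), fs)
print(voc.shape)  # output has the same length as the input
```

Varying `n_channels` in such a simulation is what underlies the claim that roughly four channels suffice for good understanding in quiet, while harder listening conditions demand more.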