Speech recognition with varying numbers and types of competing talkers by normal-hearing, cochlear-implant, and implant simulation subjects

Hearing and Speech Laboratory, University of California, Irvine, 364 Med Surge II, Room 315, Irvine, California 92697, USA.
The Journal of the Acoustical Society of America (Impact Factor: 1.5). 02/2008; 123(1):450-61. DOI: 10.1121/1.2805617
Source: PubMed


Cochlear-implant users perform far below normal-hearing subjects in background noise. Speech recognition with varying numbers of competing female, male, and child talkers was evaluated in normal-hearing subjects, cochlear-implant users, and normal-hearing subjects utilizing an eight-channel sine-carrier cochlear-implant simulation. Target sentences were spoken by a male. Normal-hearing subjects obtained considerably better speech reception thresholds than cochlear-implant subjects; the largest discrepancy was 24 dB with a female masker. Evaluation of one implant subject with normal hearing in the contralateral ear suggested that this difference is not caused by age-related disparities between the subject groups. Normal-hearing subjects showed a significant advantage with fewer competing talkers, obtaining release from masking with up to three talker maskers. Cochlear-implant and simulation subjects showed little such effect, although there was a substantial difference between the implant and simulation results with talker maskers. All three groups benefited from a voice pitch difference between target and masker, with the female talker providing significantly less masking than the male. Child talkers produced more masking than expected, given their fundamental frequency, syllabic rate, and temporal modulation characteristics. Neither a simulation nor testing in steady-state noise predicts the difficulties cochlear-implant users experience in real-life noisy situations.

    • "This guarantees that, on average, noise will affect speech equally for every frequency band and also that it will closely resemble everyday environmental noises such as the babble resulting from several voices heard simultaneously. Multitalker babble is often cited as the environmental noise most frequently encountered by listeners (Cullington & Zeng, 2008; R. H. Wilson, Abrams, & Pillion, 2003). It can thus be said to have more ecological validity than steady-state noises such as white noise."
    ABSTRACT: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Forty participants listened to French imperative sentences (commands for moving objects) in a multi-talker babble background for which intensity was experimentally controlled. Participants were instructed to transcribe what they heard and obey the commands in an interactive environment set up for this purpose. The former test provided intelligibility scores and the latter provided comprehension ones. Collected data reveal a globally weak correlation between intelligibility and comprehension scores (r = .35, p < .001). The discrepancy tends to grow as a function of noise level increase. An analysis of standard deviations shows that variability in comprehension scores increases linearly with noise level whereas higher variability in intelligibility scores is found for moderate noise level conditions. These results support the hypothesis that intelligibility scores are poor predictors of listeners' comprehension in real communication situations. Intelligibility and comprehension scores appear to provide different insights, the first measure being centered on speech signal transfer and the second on communicative performance. Both theoretical and practical implications for the use of speech intelligibility tests as indicators of speakers' performances are discussed.
    Full-text · Article · Jun 2015 · Journal of Speech Language and Hearing Research
    • "Additionally, the negative correlation coefficients indicated that speech recognition scores decreased with higher RMS noise levels and higher crest factors. These results were consistent with previous reports that cochlear implant listeners generally had more difficulty understanding speech in environments with high masker levels (Skinner et al, 1994; Chung et al, 2006; Ricketts et al, 2006) and in noise with high temporal fluctuations (Kwon and Turner, 2001; Nelson et al, 2003; Qin and Oxenham, 2003; Nelson and Jin, 2004; Stickney et al, 2004; Cullington and Zeng, 2008; Luo et al, 2008). These behavioral patterns were also consistent with cochlear implant stimulation patterns shown in electrodograms where higher noise levels and higher temporal fluctuations result in less speechlike stimulation patterns."
    ABSTRACT: Wind noise can be a nuisance or a debilitating masker for cochlear implant users in outdoor environments. Previous studies indicated that wind noise at the microphone/hearing aid output had high levels of low-frequency energy and the amount of noise generated is related to the microphone directionality. Currently, cochlear implants only offer either directional microphones or omnidirectional microphones for users at-large. As all cochlear implants utilize pre-emphasis filters to reduce low-frequency energy before the signal is encoded, effective wind noise reduction algorithms for hearing aids might not be applicable for cochlear implants. The purposes of this study were to investigate the effect of microphone directionality on speech recognition and perceived sound quality of cochlear implant users in wind noise and to derive effective wind noise reduction strategies for cochlear implants. A repeated-measure design was used to examine the effects of spectral and temporal masking created by wind noise recorded through directional and omnidirectional microphones and the effects of pre-emphasis filters on cochlear implant performance. A digital hearing aid was programmed to have linear amplification and relatively flat in-situ frequency responses for the directional and omnidirectional modes. The hearing aid output was then recorded from 0 to 360° at flow velocities of 4.5 and 13.5 m/sec in a quiet wind tunnel. Study Sample: Sixteen postlingually deafened adult cochlear implant listeners who reported to be able to communicate on the phone with friends and family without text messages participated in the study. Intervention: Cochlear implant users listened to speech in wind noise recorded at locations that the directional and omnidirectional microphones yielded the lowest noise levels. Cochlear implant listeners repeated the sentences and rated the sound quality of the testing materials. Spectral and temporal characteristics of flow noise, as well as speech and/or noise characteristics before and after the pre-emphasis filter, were analyzed. Correlation coefficients between speech recognition scores and crest factors of wind noise before and after pre-emphasis filtering were also calculated. Listeners obtained higher scores using the omnidirectional than the directional microphone mode at 13.5 m/sec, but they obtained similar speech recognition scores for the two microphone modes at 4.5 m/sec. Higher correlation coefficients were obtained between speech recognition scores and crest factors of wind noise after pre-emphasis filtering rather than before filtering. Cochlear implant users would benefit from both directional and omnidirectional microphones to reduce far-field background noise and near-field wind noise. Automatic microphone switching algorithms can be more effective if the incoming signal were analyzed after pre-emphasis filters for microphone switching decisions.
    Full-text · Article · Oct 2011 · Journal of the American Academy of Audiology
    • "Culling et al. (2004) found that their data with triplets of speech maskers were predicted fairly well by the Bronkhorst model. However, it is possible that this result is specific to three-masker configurations as the number of maskers can be an important factor in speech-on-speech masking (Brungart et al., 2001; Hawley et al., 2004; Cullington and Zeng, 2008). Moreover, there was limited variation in the degree of asymmetry in the masker configurations tested by Culling et al."
    ABSTRACT: A mathematical formula for estimating spatial release from masking (SRM) in a cocktail party environment would be useful as a simpler alternative to computationally intensive algorithms and may enhance understanding of underlying mechanisms. The experiment presented herein was designed to provide a strong test of a model that divides SRM into contributions of asymmetry and angular separation [Bronkhorst (2000). Acustica 86, 117-128] and to examine whether that model can be extended to include speech maskers. Across masker types the contribution to SRM of angular separation of maskers from the target was found to grow at a diminishing rate as angular separation increased within the frontal hemifield, contrary to predictions of the model. Speech maskers differed from noise maskers in the overall magnitude of SRM and in the contribution of angular separation (both greater for speech). These results were used to develop a modified model that achieved good fits to data for noise maskers (ρ=0.93) and for speech maskers (ρ=0.94) while using the same functions to describe separation and asymmetry components of SRM for both masker types. These findings suggest that this approach can be used to accurately model SRM for speech maskers in addition to primarily "energetic" noise maskers.
    Full-text · Article · Sep 2011 · The Journal of the Acoustical Society of America