Speech recognition with varying numbers and types of competing talkers by normal-hearing, cochlear-implant, and implant simulation subjects

Hearing and Speech Laboratory, University of California, Irvine, 364 Med Surge II, Room 315, Irvine, California 92697, USA.
The Journal of the Acoustical Society of America (Impact Factor: 1.56). 02/2008; 123(1):450-61. DOI: 10.1121/1.2805617
Source: PubMed

ABSTRACT Cochlear-implant users perform far below normal-hearing subjects in background noise. Speech recognition with varying numbers of competing female, male, and child talkers was evaluated in normal-hearing subjects, cochlear-implant users, and normal-hearing subjects utilizing an eight-channel sine-carrier cochlear-implant simulation. Target sentences were spoken by a male. Normal-hearing subjects obtained considerably better speech reception thresholds than cochlear-implant subjects; the largest discrepancy was 24 dB with a female masker. Evaluation of one implant subject with normal hearing in the contralateral ear suggested that this difference is not caused by age-related disparities between the subject groups. Normal-hearing subjects showed a significant advantage with fewer competing talkers, obtaining release from masking with up to three talker maskers. Cochlear-implant and simulation subjects showed little such effect, although there was a substantial difference between the implant and simulation results with talker maskers. All three groups benefited from a voice pitch difference between target and masker, with the female talker providing significantly less masking than the male. Child talkers produced more masking than expected, given their fundamental frequency, syllabic rate, and temporal modulation characteristics. Neither a simulation nor testing in steady-state noise predicts the difficulties cochlear-implant users experience in real-life noisy situations.

    • "Additionally, the negative correlation coefficients indicated that speech recognition scores decreased with higher RMS noise levels and higher crest factors. These results were consistent with previous reports that cochlear implant listeners generally had more difficulty understanding speech in environments with high masker levels (Skinner et al, 1994; Chung et al, 2006; Ricketts et al, 2006) and in noise with high temporal fluctuations (Kwon and Turner, 2001; Nelson et al, 2003; Qin and Oxenham, 2003; Nelson and Jin, 2004; Stickney et al, 2004; Cullington and Zeng, 2008; Luo et al, 2008). These behavioral patterns were also consistent with cochlear implant stimulation patterns shown in electrodograms where higher noise levels and higher temporal fluctuations result in less speechlike stimulation patterns. "
    ABSTRACT: Wind noise can be a nuisance or a debilitating masker for cochlear implant users in outdoor environments. Previous studies indicated that wind noise at the microphone/hearing aid output had high levels of low-frequency energy and the amount of noise generated is related to the microphone directionality. Currently, cochlear implants only offer either directional microphones or omnidirectional microphones for users at large. As all cochlear implants utilize pre-emphasis filters to reduce low-frequency energy before the signal is encoded, effective wind noise reduction algorithms for hearing aids might not be applicable for cochlear implants. The purposes of this study were to investigate the effect of microphone directionality on speech recognition and perceived sound quality of cochlear implant users in wind noise and to derive effective wind noise reduction strategies for cochlear implants. A repeated-measure design was used to examine the effects of spectral and temporal masking created by wind noise recorded through directional and omnidirectional microphones and the effects of pre-emphasis filters on cochlear implant performance. A digital hearing aid was programmed to have linear amplification and relatively flat in-situ frequency responses for the directional and omnidirectional modes. The hearing aid output was then recorded from 0 to 360° at flow velocities of 4.5 and 13.5 m/sec in a quiet wind tunnel. Study Sample: Sixteen postlingually deafened adult cochlear implant listeners who reported being able to communicate on the phone with friends and family without text messages participated in the study. Intervention: Cochlear implant users listened to speech in wind noise recorded at the locations where the directional and omnidirectional microphones yielded the lowest noise levels. Cochlear implant listeners repeated the sentences and rated the sound quality of the testing materials. Spectral and temporal characteristics of flow noise, as well as speech and/or noise characteristics before and after the pre-emphasis filter, were analyzed. Correlation coefficients between speech recognition scores and crest factors of wind noise before and after pre-emphasis filtering were also calculated. Listeners obtained higher scores using the omnidirectional than the directional microphone mode at 13.5 m/sec, but they obtained similar speech recognition scores for the two microphone modes at 4.5 m/sec. Higher correlation coefficients were obtained between speech recognition scores and crest factors of wind noise after pre-emphasis filtering rather than before filtering. Cochlear implant users would benefit from both directional and omnidirectional microphones to reduce far-field background noise and near-field wind noise. Automatic microphone switching algorithms can be more effective if the incoming signal is analyzed after pre-emphasis filters for microphone switching decisions.
    Journal of the American Academy of Audiology 10/2011; 22(9):586-600. DOI:10.3766/jaaa.22.9.4 · 1.59 Impact Factor
    • "Culling et al. (2004) found that their data with triplets of speech maskers were predicted fairly well by the Bronkhorst model. However, it is possible that this result is specific to three-masker configurations as the number of maskers can be an important factor in speech-on-speech masking (Brungart et al., 2001; Hawley et al., 2004; Cullington and Zeng, 2008). Moreover, there was limited variation in the degree of asymmetry in the masker configurations tested by Culling et al. "
    ABSTRACT: A mathematical formula for estimating spatial release from masking (SRM) in a cocktail party environment would be useful as a simpler alternative to computationally intensive algorithms and may enhance understanding of underlying mechanisms. The experiment presented herein was designed to provide a strong test of a model that divides SRM into contributions of asymmetry and angular separation [Bronkhorst (2000). Acustica 86, 117-128] and to examine whether that model can be extended to include speech maskers. Across masker types the contribution to SRM of angular separation of maskers from the target was found to grow at a diminishing rate as angular separation increased within the frontal hemifield, contrary to predictions of the model. Speech maskers differed from noise maskers in the overall magnitude of SRM and in the contribution of angular separation (both greater for speech). These results were used to develop a modified model that achieved good fits to data for noise maskers (ρ=0.93) and for speech maskers (ρ=0.94) while using the same functions to describe separation and asymmetry components of SRM for both masker types. These findings suggest that this approach can be used to accurately model SRM for speech maskers in addition to primarily "energetic" noise maskers.
    The Journal of the Acoustical Society of America 09/2011; 130(3):1463-74. DOI:10.1121/1.3613928 · 1.56 Impact Factor
    • "Cochlear implants (CIs) have helped many patients to re-gain the ability to understand speech. However, speech understanding in background noise or reverberation presents a major challenge for individuals with CIs, whose speech reception thresholds are often 10 dB and sometimes even as much as 24 dB higher than those of normal-hearing listeners (e.g., Schön et al., 2002; Cullington and Zeng, 2008). This inability to cope with background sounds is, among other factors, linked to misrepresentation of information in the temporal fine structure (TFS). "
    ABSTRACT: The precedence effect (PE) describes the ability to localize a direct, leading sound correctly when its delayed copy (lag) is present, though not separately audible. The relative contribution of binaural cues in the temporal fine structure (TFS) of lead-lag signals was compared to that of interaural level differences (ILDs) and interaural time differences (ITDs) carried in the envelope. In a localization dominance paradigm participants indicated the spatial location of lead-lag stimuli processed with a binaural noise-band vocoder whose noise carriers introduced random TFS. The PE appeared for noise bursts of 10 ms duration, indicating dominance of envelope information. However, for three test words the PE often failed even at short lead-lag delays, producing two images, one toward the lead and one toward the lag. When interaural correlation in the carrier was increased, the images appeared more centered, but often remained split. Although previous studies suggest dominance of TFS cues, no image is lateralized in accord with the ITD in the TFS. An interpretation in the context of auditory scene analysis is proposed: By replacing the TFS with that of noise the auditory system loses the ability to fuse lead and lag into one object, and thus to show the PE.
    The Journal of the Acoustical Society of America 03/2011; 129(3):1509-21. DOI:10.1121/1.3531836 · 1.56 Impact Factor