Speech Recognition in Fluctuating and Continuous Maskers: Effects of Hearing Loss and Presentation Level

Walter Reed National Military Medical Center, Washington, D.C., United States
Journal of Speech Language and Hearing Research (Impact Factor: 2.07). 05/2004; 47(2):245-56. DOI: 10.1044/1092-4388(2004/020)
Source: PubMed


Listeners with normal-hearing sensitivity recognize speech more accurately in the presence of fluctuating background sounds, such as a single competing voice, than in unmodulated noise at the same overall level. These performance differences are greatly reduced in listeners with hearing impairment, who generally receive little benefit from fluctuations in masker envelopes. If this lack of benefit is entirely due to elevated quiet thresholds and the resulting inaudibility of low-amplitude portions of signal + masker, then listeners with hearing impairment should derive increasing benefit from masker fluctuations as presentation levels increase. Listeners with normal-hearing (NH) sensitivity and listeners with hearing impairment (HI) were tested for sentence recognition at moderate and high presentation levels in competing speech-shaped noise, in competing speech by a single talker, and in competing time-reversed speech by the same talker. NH listeners showed more accurate recognition at moderate than at high presentation levels and better performance in fluctuating maskers than in unmodulated noise. For these listeners, modulated versus unmodulated performance differences tended to decrease at high presentation levels. Listeners with HI, as a group, showed performance that was more similar across maskers and presentation levels. Considered individually, only 2 out of 6 listeners with HI showed better overall performance and increasing benefit from masker fluctuations as presentation level increased. These results suggest that audibility alone does not completely account for the group differences in performance with fluctuating maskers; suprathreshold processing differences between groups also appear to play an important role. Competing speech frequently provided more effective masking than time-reversed speech containing temporal fluctuations of equal magnitude. 
This finding is consistent with "informational" masking resulting from competitive processing of words and phrases within the speech masker that would not occur for time-reversed sentences.

    • "The optimal rate of modulation has been shown to depend on the type of speech material and the number of possible response alternatives (Buss et al., 2009). In addition to studies that have found modulation rate to be an important parameter (Miller and Licklider, 1950; Buss et al., 2009), the amount of masking release incurred by introducing masker amplitude modulation (AM) is larger for deeper masker modulation depth (Gnansia et al., 2008), and for more intense maskers (Summers and Molis, 2004; George et al., 2006). Whereas most studies of masker fluctuation have evaluated envelope fluctuations that are coherent across frequency, naturally occurring maskers often contain spectrotemporally complex fluctuations."
    ABSTRACT: Howard-Jones and Rosen [(1993). J. Acoust. Soc. Am. 93, 2915-2922] investigated the ability to integrate glimpses of speech that are separated in time and frequency using a "checkerboard" masker, with asynchronous amplitude modulation (AM) across frequency. Asynchronous glimpsing was demonstrated only for spectrally wide frequency bands. It is possible that the reduced evidence of spectro-temporal integration with narrower bands was due to spread of masking at the periphery. The present study tested this hypothesis with a dichotic condition, in which the even- and odd-numbered bands of the target speech and asynchronous AM masker were presented to opposite ears, minimizing the deleterious effects of masking spread. For closed-set consonant recognition, thresholds were 5.1-8.5 dB better for dichotic than for monotic asynchronous AM conditions. Results were similar for closed-set word recognition, but for open-set word recognition the benefit of dichotic presentation was more modest and level dependent, consistent with the effects of spread of masking being level dependent. There was greater evidence of asynchronous glimpsing in the open-set than closed-set tasks. Presenting stimuli dichotically supported asynchronous glimpsing with narrower frequency bands than previously shown, though the magnitude of glimpsing was reduced for narrower bandwidths even in some dichotic conditions.
    The Journal of the Acoustical Society of America 08/2012; 132(2):1152-64. DOI:10.1121/1.4730976 · 1.50 Impact Factor
    • "SRM was measured for a speech target in the presence of two symmetrically placed speech maskers that were either highly intelligible and confusable with the target (high in IM) or time-reversed and unintelligible, and thus less confusable with the target (lower in IM). The comparison of these two kinds of maskers has the advantage that they have very similar spectrotemporal structures and thus generate approximately equivalent amounts of EM (e.g., Dirks and Bower, 1969; Freyman et al., 2001; Brungart and Simpson, 2002; Summers and Molis, 2004; Rhebergen et al., 2005; Brungart and Simpson, 2007; Hornsby and Ricketts, 2007; Kidd et al., 2010). Experiment 2 was conducted to examine whether the interaction between IM and hearing status could be replicated in NH listeners when the signal quality was varied. "
    ABSTRACT: This study tested the hypothesis that the reduction in spatial release from masking (SRM) resulting from sensorineural hearing loss in competing speech mixtures is influenced by the characteristics of the interfering speech. A frontal speech target was presented simultaneously with two intelligible or two time-reversed (unintelligible) speech maskers that were either colocated with the target or were symmetrically separated from the target in the horizontal plane. The difference in SRM between listeners with hearing impairment and listeners with normal hearing was substantially larger for the forward maskers (deficit of 5.8 dB) than for the reversed maskers (deficit of 1.6 dB). This was driven by the fact that all listeners, regardless of hearing abilities, performed similarly (and poorly) in the colocated condition with intelligible maskers. The same conditions were then tested in listeners with normal hearing using headphone stimuli that were degraded by noise vocoding. Reducing the number of available spectral channels systematically reduced the measured SRM, and again, more so for forward (reduction of 3.8 dB) than for reversed speech maskers (reduction of 1.8 dB). The results suggest that non-spatial factors can strongly influence both the magnitude of SRM and the apparent deficit in SRM for listeners with impaired hearing.
    The Journal of the Acoustical Society of America 04/2012; 131(4):3103-10. DOI:10.1121/1.3693656 · 1.50 Impact Factor
    • "In part, the reduction of the FMB for HI listeners can be explained by limitations in audibility that prevent them from detecting some parts of the target signal that are revealed by dips in the masker (Takahashi and Bacon, 1992; Summers and Molis, 2004). Other suprathreshold psychoacoustic deficits might also limit the FMB for HI listeners. "
    ABSTRACT: Normal-hearing listeners receive less benefit from momentary dips in the level of a fluctuating masker for speech processed to degrade spectral detail or temporal fine structure (TFS) than for unprocessed speech. This has been interpreted as evidence that the magnitude of the fluctuating-masker benefit (FMB) reflects the ability to resolve spectral detail and TFS. However, the FMB for degraded speech is typically measured at a higher signal-to-noise ratio (SNR) to yield performance similar to normal speech for the baseline (stationary-noise) condition. Because the FMB decreases with increasing SNR, this SNR difference might account for the reduction in FMB for degraded speech. In this study, the FMB for unprocessed and processed (TFS-removed or spectrally smeared) speech was measured in a paradigm that adjusts word-set size, rather than SNR, to equate stationary-noise performance across processing conditions. Compared at the same SNR and percent-correct level (but with different set sizes), processed and unprocessed stimuli yielded a similar FMB for four different fluctuating maskers (speech-modulated noise, one opposite-gender interfering talker, two same-gender interfering talkers, and 16-Hz interrupted noise). These results suggest that, for these maskers, spectral or TFS distortions do not directly impair the ability to benefit from momentary dips in masker level.
    The Journal of the Acoustical Society of America 07/2011; 130(1):473-88. DOI:10.1121/1.3589440 · 1.50 Impact Factor