Frequency selectivity, temporal fine-structure (TFS) processing, and speech reception were assessed for six normal-hearing (NH) listeners, ten sensorineurally hearing-impaired (HI) listeners with similar high-frequency losses, and two listeners with an obscure dysfunction (OD). TFS processing was investigated at low frequencies in regions of normal hearing, through measurements of binaural masked detection, tone lateralization, and monaural frequency modulation (FM) detection. Lateralization and FM detection thresholds were measured in quiet and in background noise. Speech reception thresholds were obtained for full-spectrum and lowpass-filtered sentences with different interferers. Both the HI listeners and the OD listeners showed poorer performance than the NH listeners in terms of frequency selectivity, TFS processing, and speech reception. While a correlation was observed between the monaural and binaural TFS-processing deficits in the HI listeners, no relation was found between TFS processing and frequency selectivity. The effect of noise on TFS processing was not larger for the HI listeners than for the NH listeners. Finally, TFS-processing performance was correlated with speech reception in a two-talker background and lateralized noise, but not in amplitude-modulated noise. The results provide constraints for future models of impaired auditory signal processing.
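The monaural FM-detection measure mentioned above is based on detecting slow frequency modulation imposed on a low-frequency tone. The sketch below shows, under assumed parameter values (carrier frequency, modulation rate and depth, ramp duration) that are not those of the study, how such a modulated probe and its unmodulated reference can be generated.

```python
# Minimal sketch of an FM-detection stimulus pair; all parameters are assumed
# illustrative values, not those used in the study.
import numpy as np

fs = 44100            # sampling rate (Hz)
dur = 0.5             # tone duration (s)
fc = 750.0            # carrier frequency (Hz), in a low-frequency region
fm_rate = 2.0         # modulation rate (Hz)
fm_depth = 5.0        # peak frequency excursion (Hz); the quantity tracked

t = np.arange(int(fs * dur)) / fs

# Instantaneous phase of a sinusoidally frequency-modulated tone:
# phi(t) = 2*pi*fc*t + (fm_depth / fm_rate) * sin(2*pi*fm_rate*t)
phase = 2 * np.pi * fc * t + (fm_depth / fm_rate) * np.sin(2 * np.pi * fm_rate * t)
fm_tone = np.sin(phase)

# Unmodulated reference tone for the other interval of a forced-choice trial.
pure_tone = np.sin(2 * np.pi * fc * t)

# Raised-cosine onset/offset ramps to avoid audible transients.
ramp = int(0.02 * fs)
window = np.ones_like(t)
window[:ramp] = 0.5 * (1 - np.cos(np.pi * np.arange(ramp) / ramp))
window[-ramp:] = window[:ramp][::-1]
fm_tone *= window
pure_tone *= window
```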
... However, FS measurements involve masked thresholds, and therefore do not require a very low ambient noise level for testing. While FS generally becomes poorer in people with sensorineural hearing loss (SNHL) and audiogram thresholds > 30 dB HL (Glasberg and Moore 1986; Laroche et al. 1992; Shen, Kern, and Richards 2019), it can also be abnormal in individuals with normal thresholds but poor speech-in-noise ability (Badri, Siegel, and Wright 2011; Strelcyk and Dau 2009) or subclinical noise-induced hearing damage (Bergman et al. 1992; Laroche et al. 1992). ...
... This permits affordable and portable FS testing that can offer at least two benefits. Firstly, it can be used to test NH listeners with difficulties in day-to-day listening in noisy environments (Badri, Siegel, and Wright 2011; Strelcyk and Dau 2009). A deterioration in FS can be a cause of poor speech-in-noise intelligibility, as the broadening of cochlear tuning leads to the smearing of speech signals in noise despite normal audibility (Leek and Summers 1996). ...
Objective:
The ear's spectral resolution or frequency selectivity (FS) is a fundamental aspect of hearing but is not routinely measured in clinical practice. This study evaluated a simplified FS testing procedure for clinical use by replacing the time-consuming two-interval forced-choice (2IFC) method with the method of limits (MOL), implemented with custom-made software and consumer-grade equipment (a procedural sketch contrasting the two methods follows this abstract).
Design and study sample:
Study 1 compared the FS measure obtained with the MOL and 2IFC procedures at two centre frequencies (CFs; 1 and 4 kHz) in 21 normal-hearing listeners. Study 2 determined the FS measure using MOL at five CFs (0.5-8 kHz) in 32 normal-hearing listeners and nine listeners with sensorineural hearing loss and compared it with their thresholds in quiet.
Results:
FS measurements with the MOL and 2IFC methods were highly correlated and had statistically comparable intra-subject test-retest reliability. FS measures determined with MOL were reduced in the hearing-impaired listeners compared to the normal-hearing listeners at the CFs corresponding to their hearing loss. Linear regression analysis showed a significant relationship between FS deterioration and quiet threshold loss (p < 0.0001, R² = 0.56).
Conclusions:
The simplified and affordable FS testing method can be used alongside audiometry to provide additional information about cochlear function.
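As a purely hypothetical illustration of the procedural difference examined in Study 1, the sketch below pits a 2IFC adaptive staircase against an ascending method-of-limits sweep on a simulated listener; the psychometric function, step sizes, and stopping rules are assumptions chosen for brevity, not the study's implementation.

```python
# Hypothetical sketch: 2IFC adaptive staircase vs. method-of-limits (MOL) sweep.
# The simulated listener and all parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
true_threshold = 45.0   # dB; the level both procedures try to estimate

def listener_detects(level_db):
    """Logistic psychometric function with its 50% point at the true threshold."""
    p = 1.0 / (1.0 + np.exp(-(level_db - true_threshold) / 2.0))
    return rng.random() < p

# --- 2IFC with a 1-up/2-down rule (converges near the 70.7%-correct point) ---
level, step = 60.0, 4.0
direction, correct_in_row, reversals = -1, 0, []
while len(reversals) < 8:
    # In a 2IFC trial the listener is correct if the signal is detected,
    # or by guessing (50%) when it is not.
    correct = listener_detects(level) or rng.random() < 0.5
    if correct:
        correct_in_row += 1
        if correct_in_row == 2:          # two correct in a row -> level down
            correct_in_row = 0
            if direction == +1:
                reversals.append(level)  # an upward run just ended: a reversal
            direction, level = -1, level - step
    else:                                 # one incorrect -> level up
        correct_in_row = 0
        if direction == -1:
            reversals.append(level)
        direction, level = +1, level + step
staircase_estimate = np.mean(reversals[-4:])

# --- Method of limits: ascend in fixed steps, stop at the first detection ---
mol_estimate = next(lv for lv in np.arange(20.0, 80.0, 2.0) if listener_detects(lv))

print(f"2IFC staircase estimate: {staircase_estimate:.1f} dB")
print(f"MOL ascending estimate:  {mol_estimate:.1f} dB")
```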
... The DTT and the SSQ questionnaire assess subjects' ability to hear speech sounds in noise, and any changes in cochlear FS could affect speech-in-noise intelligibility [26,27]. However, further comparisons of the FS measures at 1 and 4 kHz against subjects' DTT ... [Fig. 1 caption: classification of subjects' hearing loss (a-c) based on pure-tone audiometry (PTA) thresholds of the test ear, measured with a clinical audiometer in 5-dB steps.] ...
Purpose
Pure-tone audiometry (PTA) is the gold standard for screening and diagnosis of hearing loss but is not always accessible. This study evaluated a simplified cochlear frequency selectivity (FS) measure as an alternative option to screen for early frequency-specific sensorineural hearing loss (SNHL).
Methods
FS measures at 1 and 4 kHz center frequencies were obtained using custom-made software in normal-hearing (NH), slight-SNHL, and mild-to-moderate-SNHL subjects. For comparison, subjects were also assessed with the Malay Digit Triplet Test (DTT) and the shortened Malay Speech, Spatial and Qualities of Hearing Scale (SSQ) questionnaire.
Results
Compared to the DTT and SSQ, the FS measure at 4 kHz was able to distinguish NH subjects from slight- and mild-to-moderate-SNHL subjects, and was strongly correlated with their thresholds in quiet, determined separately with a 1-dB step size at the same test frequency. Further analysis with receiver operating characteristic (ROC) curves indicated an area under the curve (AUC) of 0.77 and 0.83 for the FS measure at 4 kHz when the PTA thresholds of NH subjects were taken as ≤ 15 dB HL and ≤ 20 dB HL, respectively. At the optimal FS cut-off point for 4 kHz, the FS measure had 77.8% sensitivity and 86.7% specificity for detecting a 20 dB HL hearing loss (a small worked example of this type of analysis follows the abstract).
Conclusion
The FS measure was superior to the DTT and the SSQ questionnaire in detecting early frequency-specific threshold shifts in SNHL subjects, particularly at 4 kHz. This method could be used for screening subjects at risk of noise-induced hearing loss.
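For readers less familiar with the ROC analysis summarized above, the sketch below computes an AUC and the sensitivity/specificity pair at an optimal cut-off (Youden's index) from a scalar screening score; the two groups and all score values are synthetic and purely illustrative, not the study's data.

```python
# Sketch: ROC curve, AUC, and an optimal cut-off (Youden's index) for a scalar
# screening score. The "FS scores" below are synthetic and purely illustrative.
import numpy as np

rng = np.random.default_rng(1)
# Assumed convention: a higher score means poorer frequency selectivity.
scores_nh = rng.normal(1.0, 0.5, 30)    # normal-hearing group
scores_hl = rng.normal(2.0, 0.6, 18)    # hearing-loss group (e.g., PTA > 20 dB HL)
scores = np.concatenate([scores_nh, scores_hl])
labels = np.concatenate([np.zeros(30), np.ones(18)])   # 1 = hearing loss

# Sweep every candidate cut-off and compute sensitivity and specificity.
cutoffs = np.sort(np.unique(scores))
sens = np.array([np.mean(scores[labels == 1] >= c) for c in cutoffs])
spec = np.array([np.mean(scores[labels == 0] < c) for c in cutoffs])

# AUC by the trapezoidal rule over the ROC curve (FPR = 1 - specificity).
fpr = 1.0 - spec
order = np.argsort(fpr)
fpr_s, tpr_s = fpr[order], sens[order]
auc = np.sum(np.diff(fpr_s) * (tpr_s[1:] + tpr_s[:-1]) / 2.0)

# Optimal cut-off by Youden's index J = sensitivity + specificity - 1.
best = np.argmax(sens + spec - 1.0)
print(f"AUC = {auc:.2f}")
print(f"optimal cut-off = {cutoffs[best]:.2f}: "
      f"sensitivity = {100 * sens[best]:.1f}%, specificity = {100 * spec[best]:.1f}%")
```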
... Furthermore, on subjective ratings, listeners with SNHI are known to experience more serious localization difficulties that increase with the degree of HI (Noble et al., 1997; Glyde et al., 2013). Deficits in a number of peripheral and central processes, including reduced audibility (Brimijoin and Akeroyd, 2016), impaired frequency selectivity (Strelcyk and Dau, 2009), poor temporal resolution, altered auditory filter shapes (Baker and Rosen, 2002; Bernstein and Oxenham, 2006), and increased spectral and temporal masking (Le Goff et al., 2013), can be regarded as factors contributing to impaired spatial processing in SNHI listeners. ...
Purpose
The present study aimed to quantify the effects of spatial training using virtual sources on a battery of spatial acuity measures in listeners with sensorineural hearing impairment (SNHI).
Methods
An intervention-based time-series comparison design involving 82 participants divided into three groups was adopted. Group I (n = 27, SNHI, spatially trained) and group II (n = 25, SNHI, untrained) consisted of SNHI listeners, while group III (n = 30) consisted of listeners with normal hearing (NH). The study was conducted in three phases. In the pre-training phase, all participants underwent a comprehensive assessment of their spatial processing abilities using a battery of tests, including spatial acuity in free-field and closed-field scenarios, tests of binaural processing abilities (interaural time difference [ITD] and interaural level difference [ILD] thresholds), and subjective ratings. Spatial acuity in the free field was assessed with a loudspeaker-based localization test, whereas the closed-field source identification test used virtual stimuli delivered through headphones. The ITD and ILD thresholds were obtained using a MATLAB psychoacoustic toolbox, and participant ratings on the spatial subsection of the Speech, Spatial and Qualities of Hearing Scale questionnaire in Kannada served as the subjective ratings. Group I listeners underwent virtual auditory spatial training (VAST) following the pre-training assessments. All tests were re-administered to the group I listeners halfway through training (mid-training evaluation phase) and after training completion (post-training evaluation phase), whereas group II underwent the same tests at the same time intervals without any training.
Results and discussion
Statistical analysis showed a main effect of group in all tests at the pre-training evaluation phase, with post hoc comparisons revealing equivalent spatial performance in the two SNHI groups (groups I and II). The effect of VAST in group I was evident in all the tests, with the localization test showing the highest predictive power for capturing VAST-related changes in Fisher discriminant analysis (FDA). In contrast, group II demonstrated no changes in spatial acuity across the measurement timeline. FDA revealed increased misclassification of NH listeners as SNHI-trained at the post-training evaluation compared to the pre-training evaluation, as the spatial performance of the SNHI-trained group improved with VAST in the post-training phase.
Conclusion
The study demonstrated positive outcomes of spatial training using VAST in listeners with SNHI. The utility of this training program could be extended to other clinical populations with spatial auditory processing deficits, such as listeners with auditory neuropathy spectrum disorder, cochlear implant users, and listeners with central auditory processing disorders.
... Benefits from both spatial separation and voice gender difference cues can be reduced in listeners with hearing loss. These difficulties have previously been attributed to the effects on masking release performance of various factors (alone or in combination): (1) a reduction in monaural temporal fine-structure sensitivity (Strelcyk & Dau 2009; Neher et al. 2011; Summers et al. 2013), (2) a reduction in spectral or temporal modulation sensitivity (Bernstein et al. 2013), (3) a reduction in audiometric absolute thresholds and aging (Neher et al. 2011; Glyde et al. 2013; Besser et al. 2015; Srinivasan et al. 2016; Jakien et al. 2017), and (4) a reduction in higher-order processing such as cognitive and linguistic abilities (Besser et al. 2015). To our knowledge, this is the first study to demonstrate an important role of another factor, abnormally broad binaural pitch fusion in CI users (as compared with NH listeners), in reduced binaural benefits for speech perception in multi-talker listening environments. ...
Objectives:
Some cochlear implant (CI) users are fitted with a CI in each ear ("bilateral"), while others have a CI in one ear and a hearing aid in the other ("bimodal"). Presently, evaluation of the benefits of bilateral or bimodal CI fitting does not take into account the integration of frequency information across the ears. This study tests the hypothesis that CI listeners, especially bimodal CI users, with a more precise integration of frequency information across ears ("sharp binaural pitch fusion") will derive greater benefit from voice gender differences in a multi-talker listening environment.
Design:
Twelve bimodal CI users and twelve bilateral CI users participated. First, binaural pitch fusion ranges were measured using the simultaneous, dichotic presentation of reference and comparison stimuli (electric pulse trains for CI ears and acoustic tones for hearing-aid ears) in opposite ears, with the reference stimuli fixed and the comparison stimuli varied in frequency/electrode to find the range perceived as a single sound. Direct electrical stimulation was used in the implanted ears through the research interface, which allowed selective stimulation of one electrode at a time, and acoustic stimulation was delivered to the non-implanted ears through headphones. Second, speech-on-speech masking performance was measured to estimate the masking release provided by a voice gender difference between target and maskers (VGRM). The VGRM was calculated as the difference between the speech recognition thresholds for target sounds in the presence of same-gender and different-gender maskers (a small worked example follows this abstract).
Results:
Voice gender differences between target and masker talkers improved speech recognition performance for the bimodal CI group, but not the bilateral CI group. The bimodal CI users who benefited the most from voice gender differences were those who had the narrowest range of acoustic frequencies that fused into a single sound with stimulation from a single electrode from the CI in the opposite ear. There was no similar voice gender difference benefit of narrow binaural fusion range for the bilateral CI users.
Conclusions:
The findings suggest that broad binaural fusion reduces the acoustical information available for differentiating individual talkers in bimodal CI users, but not for bilateral CI users. In addition, for bimodal CI users with narrow binaural fusion who benefit from voice gender differences, bilateral implantation could lead to a loss of that benefit and impair their ability to selectively attend to one talker in the presence of multiple competing talkers. The results suggest that binaural pitch fusion, along with an assessment of residual hearing and other factors, could be important for assessing bimodal and bilateral CI users.
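As a small worked example of the VGRM metric defined in the Design section above, the snippet below computes the masking release from two hypothetical speech recognition thresholds; the SRT values are invented for illustration.

```python
# Worked example: voice-gender-difference masking release (VGRM).
# The SRT values below are hypothetical and chosen only for illustration.
srt_same_gender = -2.0   # dB target-to-masker ratio with same-gender maskers
srt_diff_gender = -8.0   # dB target-to-masker ratio with different-gender maskers

# A lower (more negative) SRT means better performance, so the release from
# masking provided by the gender difference is the same-gender SRT minus the
# different-gender SRT.
vgrm = srt_same_gender - srt_diff_gender
print(f"VGRM = {vgrm:.1f} dB")   # 6.0 dB of masking release in this example
```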
... Perceptual hypersensitivity following noise-induced high-frequency SNHL: In humans, a steeply sloping high-frequency hearing loss is a telltale signature of SNHL (Allen and Eddins, 2010; Hannula et al., 2011). We reviewed 132,504 case records from visitors to the audiology clinic at our institution and determined that 23% of pure-tone audiograms fit the description of high-frequency SNHL (Figure 1A), underscoring that it is a common clinical condition, often associated with tinnitus, abnormal loudness growth, and poor speech intelligibility in noise (Horwitz et al., 2002; Lewis et al., 2020; Moore et al., 1999; Oxenham and Bacon, 2003; Strelcyk and Dau, 2009). To model this hearing-loss profile in genetically tractable laboratory mice, we induced SNHL through exposure to narrow-band high-frequency noise (16-32 kHz) at 103 dB SPL for 2 hr. ...
Neurons in sensory cortex exhibit a remarkable capacity to maintain stable firing rates despite large fluctuations in afferent activity levels. However, sudden peripheral deafferentation in adulthood can trigger an excessive, non-homeostatic cortical compensatory response that may underlie perceptual disorders including sensory hypersensitivity, phantom limb pain, and tinnitus. Here, we show that mice with noise-induced damage of the high-frequency cochlear base were behaviorally hypersensitive to spared mid-frequency tones and to direct optogenetic stimulation of auditory thalamocortical neurons. Chronic 2-photon calcium imaging from auditory cortex (ACtx) pyramidal neurons (PyrNs) revealed an initial stage of spatially diffuse hyperactivity, hyper-correlation, and auditory hyperresponsivity that consolidated around deafferented map regions three or more days after acoustic trauma. Deafferented PyrN ensembles also displayed hypersensitive decoding of spared mid-frequency tones that mirrored behavioral hypersensitivity, suggesting that non-homeostatic regulation of cortical sound intensity coding following sensorineural loss may be an underlying source of auditory hypersensitivity. Excess cortical response gain after acoustic trauma was expressed heterogeneously among individual PyrNs, yet 40% of this variability could be accounted for by each cell's baseline response properties prior to acoustic trauma. PyrNs with initially high spontaneous activity and gradual monotonic intensity growth functions were more likely to exhibit non-homeostatic excess gain after acoustic trauma. This suggests that while cortical gain changes are triggered by reduced bottom-up afferent input, their subsequent stabilization is also shaped by their local circuit milieu, where indicators of reduced inhibition can presage pathological hyperactivity following sensorineural hearing loss.
The perception of amplitude modulations (AMs), which is characterized by a frequency-selective process in the modulation domain, is considered critical for speech intelligibility. Previous studies have provided evidence of an age-related decline in AM frequency selectivity, as well as a notable sharpening of AM tuning associated with hearing loss, possibly due to a perceptual advantage resulting from peripheral compression loss. This study aimed to examine whether speech intelligibility in noisy environments would support the following ideas: i) age-related declines in AM tuning might lead to poorer speech intelligibility, and ii) sharper AM tuning associated with hearing loss would not result in improved speech intelligibility. Young (n=10, 22-28 years) and older listeners with normal hearing (n=9, 57-77 years) as well as older listeners with hearing impairment (n=9, 64-77 years) were included in the investigation. All participants had previously taken part in studies on AM frequency selectivity. Speech intelligibility was tested in various listening conditions, including stationary, fluctuating, and competing-speech maskers. Consistent with the hypothesis, the results revealed an age-related increase in speech reception thresholds, with an additional negative impact of hearing loss. These findings motivate further exploration of the relationship between AM frequency selectivity and speech intelligibility in noisy environments.
Modulations in both amplitude and frequency are prevalent in natural sounds and are critical in defining their properties. Humans are exquisitely sensitive to frequency modulation (FM) at the slow modulation rates and low carrier frequencies that are common in speech and music. This enhanced sensitivity to slow-rate and low-frequency FM has been widely believed to reflect precise, stimulus-driven phase locking to temporal fine structure in the auditory nerve. At faster modulation rates and/or higher carrier frequencies, FM is instead thought to be coded by coarser frequency-to-place mapping, where FM is converted to amplitude modulation (AM) via cochlear filtering. Here we show that patterns of human FM perception that have classically been explained by limits in peripheral temporal coding are instead better accounted for by constraints in the central processing of fundamental frequency (F0) or pitch. We measured FM detection in male and female humans using harmonic complex tones with an F0 within the range of musical pitch, but with resolved harmonic components that were all above the putative limits of temporal phase locking (> 8 kHz). Listeners were more sensitive to slow than fast FM rates, even though all components were beyond the limits of phase locking. In contrast, AM sensitivity remained better at faster than slower rates, regardless of carrier frequency. These findings demonstrate that classic trends in human FM sensitivity, previously attributed to auditory-nerve phase locking, may instead reflect the constraints of a unitary code that operates at a more central level of processing.
SIGNIFICANCE STATEMENT:
Natural sounds involve dynamic frequency and amplitude fluctuations. Humans are particularly sensitive to frequency modulation (FM) at slow rates and low carrier frequencies, which are prevalent in speech and music. This sensitivity has been ascribed to encoding of stimulus temporal fine structure (TFS) via phase-locked auditory-nerve activity. To test this long-standing theory, we measured FM sensitivity using complex tones with a low fundamental frequency (F0) but only high-frequency harmonics, beyond the limits of phase locking. Dissociating the F0 from high-frequency TFS showed that FM sensitivity is limited not by peripheral encoding of TFS, but rather by central processing of F0, or pitch. The results suggest a unitary code for FM detection limited by more central constraints.
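To make the stimulus design described above concrete, the sketch below generates a harmonic complex whose F0 lies in the musical pitch range but whose components all fall above 8 kHz, with coherent sinusoidal FM applied by modulating the F0 itself; all parameter values are illustrative assumptions.

```python
# Sketch: harmonic complex with a low F0 but only high-frequency (>8 kHz)
# components, carrying coherent FM. All parameter values are illustrative.
import numpy as np

fs = 48000
t = np.arange(int(1.0 * fs)) / fs

f0 = 400.0                 # fundamental in the musical pitch range (not itself present)
harmonics = range(21, 30)  # components at 8.4-11.6 kHz, above phase-locking limits
fm_rate = 1.0              # slow modulation rate (Hz)
fm_depth = 0.06 * f0       # peak F0 excursion (Hz)

# Modulate the F0 itself and derive each component from the same running phase,
# so the FM is coherent across all harmonics.
f0_inst = f0 + fm_depth * np.sin(2 * np.pi * fm_rate * t)
phase0 = 2 * np.pi * np.cumsum(f0_inst) / fs

complex_tone = sum(np.sin(n * phase0) for n in harmonics)
complex_tone /= np.max(np.abs(complex_tone))

# In such experiments a background noise is typically added to mask potential
# low-frequency distortion products; it is omitted here for brevity.
```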
Objective
The aim of this study was to investigate whether consumer-grade mobile audio equipment can be reliably used as a platform for the notched-noise test, including when the test is conducted outside the laboratory.
Design
Two studies were conducted: Study 1 was a notched-noise masking experiment with three different setups: in a psychoacoustic test booth with a standard laboratory PC; in a psychoacoustic test booth with a mobile device; and in a quiet office room with a mobile device. Study 2 employed the same task as Study 1, but compared circumaural headphones to insert earphones.
Study sample
Nine and ten young, normal-hearing participants completed studies 1 and 2, respectively.
Results
The test-retest accuracy of the notched-noise test on the mobile implementation did not differ from that for the laboratory setup. A possible effect of the earphone design was identified in Study 1, which was corroborated by Study 2, where test-retest variability was smallest when comparing results from experiments conducted using identical acoustic transducers.
Conclusions
Results and test-retest repeatability comparable to standard laboratory settings for the notched-noise test can be obtained with mobile equipment outside the laboratory.
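For readers unfamiliar with the notched-noise paradigm used in both studies, the sketch below generates a masker with a spectral notch centred on a probe tone; the probe frequency, notch width, band edges, and probe level are illustrative assumptions rather than the parameters of this study.

```python
# Sketch: notched-noise masker centred on a probe tone, built by spectral shaping.
# Probe frequency, notch width, band edges, and probe level are assumptions.
import numpy as np

fs = 44100
n = int(0.5 * fs)           # 500-ms stimulus
fc = 1000.0                 # probe (signal) frequency, Hz
notch = 0.2                 # notch half-width on each side of fc, relative to fc
outer = 0.8                 # outer edge of each noise band, relative to fc

rng = np.random.default_rng(0)
freqs = np.fft.rfftfreq(n, 1.0 / fs)
spectrum = rng.normal(size=freqs.size) + 1j * rng.normal(size=freqs.size)

# Keep only two noise bands flanking the probe, leaving a spectral notch around fc:
# [fc*(1-outer), fc*(1-notch)] below and [fc*(1+notch), fc*(1+outer)] above it.
lower = (freqs >= fc * (1 - outer)) & (freqs <= fc * (1 - notch))
upper = (freqs >= fc * (1 + notch)) & (freqs <= fc * (1 + outer))
spectrum[~(lower | upper)] = 0.0

masker = np.fft.irfft(spectrum, n)
masker /= np.sqrt(np.mean(masker ** 2))               # normalise to unit RMS

probe = np.sin(2 * np.pi * fc * np.arange(n) / fs)
stimulus = masker + 10 ** (-10 / 20) * probe          # probe attenuated by 10 dB
```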
The ability of hearing-impaired listeners to detect spectro-temporal modulation (STM) has been shown to correlate with individual listeners' speech reception performance. However, the STM detection tests used in previous studies were overly challenging, especially for elderly listeners with moderate-to-severe hearing loss. Furthermore, the speech tests considered as a reference were not optimized to yield ecologically valid outcomes that represent real-life speech reception deficits. The present study investigated an STM detection measurement paradigm with individualized audibility compensation, focusing on its clinical viability and relevance as a real-life supra-threshold speech intelligibility predictor. STM thresholds were measured in 13 elderly hearing-impaired native Danish listeners using four previously established (noise-carrier based) and two novel complex-tone carrier based STM stimulus variants. Speech reception thresholds (SRTs) were measured (i) in a realistic spatial speech-on-speech setup and (ii) using co-located stationary noise, both with individualized amplification. In contrast with previous related studies, the proposed measurement paradigm yielded robust STM thresholds for all listeners and conditions. The STM thresholds were positively correlated with the SRTs, whereby significant correlations were found for the realistic speech-test condition but not for the stationary-noise condition. Three STM stimulus variants (one noise-carrier based and two complex-tone based) yielded significant predictions of SRTs, accounting for up to 53% of the SRT variance. The results of the study could form the basis for a clinically viable STM test for quantifying supra-threshold speech reception deficits in aided hearing-impaired listeners.
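The STM stimuli referred to above impose a moving spectro-temporal ripple on a carrier. The sketch below builds a simple version on a multi-tone, noise-like carrier; the ripple density, rate, depth, and band edges are assumptions for illustration and do not reproduce the study's noise-carrier or complex-tone variants.

```python
# Sketch: spectro-temporal modulation (a moving ripple) imposed on a multi-tone,
# noise-like carrier. Ripple density, rate, depth, and band edges are assumptions.
import numpy as np

fs = 44100
t = np.arange(int(1.0 * fs)) / fs

f_lo, f_hi = 354.0, 5656.0       # four-octave carrier band (assumed)
n_comp = 200                     # random-phase components, log-spaced in frequency
omega = 2.0                      # spectral ripple density, cycles per octave
rate = 4.0                       # temporal modulation rate, Hz
depth = 0.5                      # modulation depth (0..1); the quantity tracked

rng = np.random.default_rng(0)
freqs = f_lo * 2.0 ** np.linspace(0.0, np.log2(f_hi / f_lo), n_comp)
octaves = np.log2(freqs / f_lo)  # each component's position in octaves above f_lo
phases = rng.uniform(0, 2 * np.pi, n_comp)

stimulus = np.zeros_like(t)
for f, x, ph in zip(freqs, octaves, phases):
    # Each component's envelope follows the moving spectro-temporal ripple.
    envelope = 1.0 + depth * np.sin(2 * np.pi * (rate * t + omega * x))
    stimulus += envelope * np.sin(2 * np.pi * f * t + ph)
stimulus /= np.max(np.abs(stimulus))
```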
Speech intelligibility models can provide insights into the auditory processes involved in human speech perception and communication. One successful approach to modelling speech intelligibility has been based on the analysis of the amplitude modulations present in speech as well as in competing interferers. This review covers speech intelligibility models that include a modulation-frequency selective processing stage, i.e., a modulation filterbank, as part of their front end. The speech-based envelope power spectrum model [sEPSM; Jørgensen and Dau (2011). J. Acoust. Soc. Am. 130(3), 1475-1487], several variants of the sEPSM including modifications with respect to temporal resolution, spectro-temporal processing, and binaural processing, as well as the speech-based computational auditory signal processing and perception model [sCASP; Relaño-Iborra et al. (2019). J. Acoust. Soc. Am. 146(5), 3306-3317], which is based on an established auditory signal detection and masking model, are discussed. The key processing stages of these models for the prediction of speech intelligibility across a variety of acoustic conditions are addressed in relation to competing modeling approaches. The strengths and weaknesses of the modulation-based analysis are outlined and perspectives are presented, particularly in connection with the challenge of predicting the consequences of individual hearing loss on speech intelligibility.
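As a much-simplified illustration of the modulation-filterbank idea behind the sEPSM family of models, the sketch below extracts a temporal envelope, filters it into octave-spaced modulation bands, and computes an envelope-power SNR per band. It omits the peripheral (audio-frequency) filterbank, the across-channel combination, and the ideal-observer back end of the published models, and the toy "speech" signal is just 4-Hz amplitude-modulated noise.

```python
# Much-simplified sketch of the envelope-power SNR idea behind the sEPSM family:
# extract the temporal envelope, split it with a modulation filterbank, and
# compare the envelope power of noisy "speech" with that of the noise alone.
# The peripheral filterbank, across-channel combination, and ideal-observer
# stage of the published models are omitted.
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

fs = 16000
t = np.arange(int(2.0 * fs)) / fs
rng = np.random.default_rng(0)

# Toy signals: "speech" as 4-Hz amplitude-modulated noise plus a stationary noise.
speech = (1.0 + 0.8 * np.sin(2 * np.pi * 4.0 * t)) * rng.normal(size=t.size)
noise = rng.normal(size=t.size)
mixture = speech + noise

def ac_envelope(x):
    """Hilbert envelope with its DC component removed."""
    env = np.abs(hilbert(x))
    return env - np.mean(env)

def envelope_power(x, fc_mod):
    """Envelope power in an octave-wide modulation band centred at fc_mod (Hz)."""
    sos = butter(2, [fc_mod / np.sqrt(2), fc_mod * np.sqrt(2)],
                 btype='bandpass', fs=fs, output='sos')
    band = sosfiltfilt(sos, ac_envelope(x))
    return np.mean(band ** 2)

for fc_mod in [1, 2, 4, 8, 16, 32]:      # modulation-filter centre frequencies (Hz)
    p_mix, p_noise = envelope_power(mixture, fc_mod), envelope_power(noise, fc_mod)
    snr_env = max(p_mix - p_noise, 1e-10) / p_noise
    print(f"{fc_mod:>2} Hz modulation band: SNRenv = {10 * np.log10(snr_env):5.1f} dB")
```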
Although Helmholtz, on the basis of experiments with 8-component harmonic complexes of fundamental frequencies near 119 and 238 Hz, claimed to “have never experienced the slightest difference in the quality of tone” with changes in relative phase among the components (Helmholtz, 1954), more recent studies have modified his conclusions (e.g., Mathes and Miller, 1947; Goldstein, 1967). It is now apparent that the primary determinant of the perceptibility of a given phase change is the frequency spacing between the sound’s constituent sinusoidal components. When relative phase changes are made in components that are “close enough” together, they are perceptible; when they are made to widely spaced components, they are not. Phase sensitivity is thus understood to reflect the failure of frequency resolution — only when a sound’s constituent sinusoids interact (i.e., lie sufficiently within a single critical band, or auditory filter) will a phase change be detectable. (For a discussion of other factors, see Rosen, 1986).
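The critical-band argument above can be illustrated with a small sketch: when a complex's components are close enough to interact within a single band-pass "auditory filter", scrambling their phases changes the filtered waveform's crest factor, whereas for widely spaced components the filter output is essentially a single sinusoid and phase changes have little effect. The filter, component spacings, and metric below are illustrative assumptions, not a model of any particular auditory filter.

```python
# Sketch of the critical-band argument: component phase matters only where
# components interact within one filter. A Butterworth band-pass of roughly
# critical-band width stands in for an auditory filter (an assumption).
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 44100
t = np.arange(int(0.5 * fs)) / fs
rng = np.random.default_rng(0)
sos = butter(4, [935.0, 1065.0], btype='bandpass', fs=fs, output='sos')
edge = int(0.05 * fs)                      # discard 50-ms edges (filter transients)

def crest_factor_in_band(freqs, phases):
    """Peak/RMS of a multi-tone complex at the output of the 1-kHz 'filter'."""
    x = sum(np.cos(2 * np.pi * f * t + p) for f, p in zip(freqs, phases))
    y = sosfiltfilt(sos, x)[edge:-edge]
    return np.max(np.abs(y)) / np.sqrt(np.mean(y ** 2))

close = np.arange(950.0, 1051.0, 25.0)     # five components, all inside the band
wide = np.arange(600.0, 1401.0, 200.0)     # five components, only 1000 Hz inside

for name, freqs in [("close spacing", close), ("wide spacing", wide)]:
    aligned = crest_factor_in_band(freqs, np.zeros(len(freqs)))
    scrambled = crest_factor_in_band(freqs, rng.uniform(0, 2 * np.pi, len(freqs)))
    # Close spacing: phase scrambling typically lowers the crest factor.
    # Wide spacing: a single tone dominates the output, so it barely changes.
    print(f"{name}: crest factor {aligned:.2f} (cosine phase) "
          f"vs {scrambled:.2f} (random phase)")
```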
While this review discusses many aspects of mammalian auditory nerve function, it is not exhaustive in either breadth or depth of coverage. In particular, the review deals only briefly with the effects of stimulation of the olivocochlear efferent system (Wiederhold 1986; Guinan 1988) and does not cover such important subjects as developmental changes, speech encoding, the mode of origin of the compound action potential and its application in human studies (see Chapter 6 by Kraus and McGee), and investigations of auditory nerve function using psychophysical stimulus paradigms. Such exclusions are necessitated by space considerations and, most importantly, by the reviewer’s lack of expertise in these areas. However, a number of recent reviews provide coverage in these excluded topics (Kiang 1984; Sachs 1984; Abbas 1986; Javel 1986; Pickles 1986, 1988; Harrison 1988a; Javel et al. 1988; Patuzzi and Robertson 1988; Sachs, Winslow, and Blackburn 1988; Smith 1988; Kitzes 1990).
Computational models in this chapter are defined to include models that lead to explicit, quantitative predictions for the phenomena that are being modeled. They may be posed purely in terms of the information that is available for the task, in which case the computed predictions are evaluated using information-theoretical or other statistical communication theory techniques, or they may be posed in terms of mechanisms or algorithms. Both types of computational models are included in this chapter. We do not include models that have been suggested but not evaluated or models which are not sufficiently explicit to allow precise predictions.
A different general philosophy, to be called Full Randomness (FR), for the analysis of random effects models is presented, involving a notion of reducing or preferably eliminating fixed effects, at least formally. For example, under FR applied to a repeated measures model, even the number of repetitions would be modeled as random. It is argued that in many applications such quantities really are random, and that recognizing this enables the construction of much richer, more probing analyses. Methodology for this approach will be developed here, and suggestions will be made for the broader use of the approach. It is argued that even in settings in which some factors are fixed by the experimental design, FR still "gives the right answers." In addition, computational advantages to such methods will be shown.
Measures of monaural temporal processing and binaural sensitivity were obtained from 12 young (mean age = 26.1 years) and 12 elderly (mean age = 70.9 years) adults with clinically normal hearing (pure-tone thresholds ⩽ 20 dB HL from 250 to 6000 Hz). Monaural temporal processing was measured by gap detection thresholds. Binaural sensitivity was measured by interaural time difference (ITD) thresholds. Gap and ITD thresholds were obtained at three sound levels (4, 8, or 16 dB above individual threshold). Subjects were also tested on two measures of speech perception: a masking level difference (MLD) task, and a syllable identification/discrimination task that included phonemes varying in voice onset time (VOT). Elderly listeners displayed poorer monaural temporal analysis (higher gap detection thresholds) and poorer binaural processing (higher ITD thresholds) at all sound levels. There were significant interactions between age and sound level, indicating that the age difference was larger at lower stimulus levels. Gap detection performance was found to correlate significantly with performance on the ITD task for young, but not elderly, adult listeners. Elderly listeners also performed more poorly than younger listeners on both speech measures; however, there was no significant correlation between psychoacoustic and speech measures of temporal processing. Findings suggest that age-related factors other than peripheral hearing loss contribute to the temporal processing deficits of elderly listeners.
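As a concrete reference for the two psychoacoustic measures described above, the sketch below generates a gap-detection stimulus (two noise markers separated by a silent gap) and a binaural noise carrying an interaural time difference; the durations, gap length, and ITD value are illustrative assumptions, and whole-sample delays limit the ITD resolution as noted in the comments.

```python
# Sketch of the two stimulus types behind the measures above: a noise burst
# containing a silent gap (gap detection) and a binaural noise carrying an
# interaural time difference (ITD). Durations and the ITD value are assumptions.
import numpy as np

fs = 48000
rng = np.random.default_rng(0)

# --- Gap-detection stimulus: two noise markers separated by a silent gap ---
gap_ms = 5.0                                       # gap duration under test
marker = rng.normal(size=int(0.25 * fs))           # 250-ms noise markers
gap = np.zeros(int(gap_ms * 1e-3 * fs))
gap_stimulus = np.concatenate([marker, gap, marker])

# --- ITD stimulus: the same noise to both ears, with one ear delayed ---
itd_us = 100.0                                     # interaural delay, microseconds
delay = int(round(itd_us * 1e-6 * fs))             # whole-sample delay; ~20.8-us
                                                   # resolution at 48 kHz (finer ITDs
                                                   # need fractional-delay filters)
noise = rng.normal(size=int(0.5 * fs))
left = np.concatenate([noise, np.zeros(delay)])
right = np.concatenate([np.zeros(delay), noise])   # right ear lags, so the image
binaural = np.stack([left, right], axis=1)         # lateralises toward the left
```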