ABSTRACT: Researchers of auditory stream segregation have largely taken a bottom-up view on the link between physical stimulus parameters and the perceptual organization of sequences of ABAB sounds. However, in the majority of studies, researchers have relied on the reported decisions of the subjects regarding which of the predefined percepts (e.g., one stream or two streams) predominated when subjects listened to more or less ambiguous streaming sequences. When searching for neural mechanisms of stream segregation, it should be kept in mind that such decision processes may contribute to brain activation, as also suggested by recent human imaging data. The present study proposes that the uncertainty of a subject in making a decision about the perceptual organization of ambiguous streaming sequences may be reflected in the time required to make an initial decision. To this end, subjects had to decide on their current percept while listening to ABAB auditory streaming sequences. Each sequence had a duration of 30 s and was composed of A and B harmonic tone complexes differing in fundamental frequency (ΔF). Sequences with seven different ΔF were tested. We found that the initial decision time varied non-monotonically with ΔF and that it was significantly correlated with the degree of perceptual ambiguity defined from the proportions of time the subjects reported a one-stream or a two-stream percept subsequent to the first decision. This strong relation of the proposed measures of decision uncertainty and perceptual ambiguity should be taken into account when searching for neural correlates of auditory stream segregation.
Frontiers in Neuroscience 08/2015; 9:266. DOI:10.3389/fnins.2015.00266 · 3.66 Impact Factor
ABSTRACT: All acoustic information from the periphery is encoded in the timing and rates of spikes in the population of spiral ganglion neurons projecting to the central auditory system. Considerable progress has been made in characterizing the physiological properties of type-I and type-II primary auditory afferents and understanding the basic properties of type-I afferents in response to sounds. Here, we review some of these properties, with emphasis placed on issues such as the stochastic nature of spike timing during spontaneous and driven activity, frequency tuning curves, spike-rate-versus-level functions, dynamic-range and spike-rate adaptation, and phase locking to stimulus fine structure and temporal envelope. We also review effects of acoustic trauma on some of these response properties.
Cell and Tissue Research 04/2015; 361(1). DOI:10.1007/s00441-015-2177-9 · 3.57 Impact Factor
ABSTRACT: In mammalian auditory systems, the spiking characteristics of each primary afferent (type I auditory-nerve fiber; ANF) are mainly determined by a single ribbon synapse in a single receptor cell (inner hair cell; IHC). ANF spike trains therefore provide a window into the operation of these synapses and cells. It was demonstrated previously (Heil et al., 2007) that the distribution of interspike intervals (ISIs) of cat ANFs during spontaneous activity can be modeled as resulting from refractoriness operating on a non-Poisson stochastic point process of excitation (transmitter release events from the IHC). Here, we investigate nonrenewal properties of these cat-ANF spontaneous spike trains, manifest as negative serial ISI correlations and reduced spike-count variability over short timescales. A previously discussed excitatory process, the constrained failure of events from a homogeneous Poisson point process, can account for these properties, but does not offer a parsimonious explanation for certain trends in the data. We then investigate a three-parameter model of vesicle-pool depletion and replenishment and find that it accounts for all experimental observations, including the ISI distributions, with only the release probability varying between spike trains. The maximum number of units (single vesicles or groups of simultaneously released vesicles) in the readily releasable pool and their replenishment time constant can be assumed to be constant (∼4 and 13.5 ms, respectively). We suggest that the organization of the IHC ribbon synapses not only enables sustained release of neurotransmitter but also imposes temporal regularity on the release process, particularly when operating at high rates.
The Journal of Neuroscience : The Official Journal of the Society for Neuroscience 11/2014; 34(45):15097-15109. DOI:10.1523/JNEUROSCI.0903-14.2014 · 6.34 Impact Factor
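The serial ISI correlation named in the abstract above can be illustrated with a short sketch. This is not the authors' analysis code: the function name, the `lag` parameter, and the example intervals are assumptions made here for illustration. It computes the Pearson correlation between each interspike interval and the one that follows it, so a negative value means that short intervals tend to be followed by long ones and vice versa.

```python
import math


def serial_isi_correlation(isis, lag=1):
    """Pearson correlation between each interval and the interval `lag` later.

    `isis` is a list of interspike intervals (any consistent time unit).
    """
    if lag < 1 or len(isis) <= lag + 1:
        raise ValueError("need more intervals than lag + 1")
    x = isis[:-lag]          # intervals i
    y = isis[lag:]           # intervals i + lag
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("intervals are constant; correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical, strictly alternating intervals (ms): the extreme nonrenewal case.
alternating = [5.0, 10.0, 5.0, 10.0, 5.0, 10.0]
r_alternating = serial_isi_correlation(alternating)
```

For a renewal process the expected lag-1 correlation is zero, so values reliably below zero across many spike trains are what the abstract refers to as a nonrenewal property.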
ABSTRACT: Absolute auditory threshold decreases with increasing sound duration, a phenomenon explainable by the assumptions that the sound evokes neural events whose probabilities of occurrence are proportional to the sound's amplitude raised to an exponent of about 3 and that a constant number of events are required for threshold (Heil and Neubauer, Proc Natl Acad Sci USA 100:6151-6156, 2003). Based on this probabilistic model and on the assumption of perfect binaural summation, an equation is derived here that provides an explicit expression of the binaural threshold as a function of the two monaural thresholds, irrespective of whether they are equal or unequal, and of the exponent in the model. For exponents >0, the predicted binaural advantage is largest when the two monaural thresholds are equal and decreases towards zero as the monaural threshold difference increases. This equation is tested and the exponent derived by comparing binaural thresholds with those predicted on the basis of the two monaural thresholds for different values of the exponent. The thresholds, measured in a large sample of human subjects with equal and unequal monaural thresholds and for stimuli with different temporal envelopes, are compatible only with an exponent close to 3. An exponent of 3 predicts a binaural advantage of 2 dB when the two ears are equally sensitive. Thus, listening with two (equally sensitive) ears rather than one has the same effect on absolute threshold as doubling duration. The data suggest that perfect binaural summation occurs at threshold and that peripheral neural signals are governed by an exponent close to 3. They might also shed new light on mechanisms underlying binaural summation of loudness.
Journal of the Association for Research in Otolaryngology 01/2014; 15(2). DOI:10.1007/s10162-013-0432-x · 2.60 Impact Factor
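The abstract above does not print the derived equation, so the following is only a sketch of one form consistent with its stated assumptions. It assumes that the event rate in each ear is proportional to (amplitude / monaural threshold amplitude) raised to the exponent, that the two rates simply add (perfect summation), and that threshold is reached at the same criterion as monaurally. The function name and the example levels are hypothetical.

```python
import math


def binaural_threshold_db(left_db, right_db, exponent=3.0):
    """Predicted binaural threshold (dB) from two monaural thresholds (dB).

    Assumption (not taken verbatim from the paper): event rates in the two
    ears add, each rate growing with amplitude raised to `exponent`.
    """
    k = exponent
    # Convert each monaural threshold to its contribution at a common level.
    total = 10.0 ** (-k * left_db / 20.0) + 10.0 ** (-k * right_db / 20.0)
    return -(20.0 / k) * math.log10(total)


# Equal ears: the advantage is (20 / k) * log10(2), about 2 dB for k = 3.
equal_advantage_db = 10.0 - binaural_threshold_db(10.0, 10.0)

# Very unequal ears: the binaural threshold approaches the better ear's threshold.
unequal_advantage_db = 0.0 - binaural_threshold_db(0.0, 40.0)
```

Under these assumptions the advantage for equally sensitive ears matches the 2 dB quoted in the abstract, and it shrinks towards zero as the monaural thresholds diverge.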
ABSTRACT: Detection thresholds for auditory stimuli, specified in terms of their amplitude or level, depend on the stimulus temporal envelope and decrease with increasing stimulus duration. The neural mechanisms underlying these fundamental across-species observations are not fully understood. Here, we present a "continuous look" model, according to which the stimulus gives rise to stochastic neural detection events whose probability of occurrence is proportional to the 3rd power of the low-pass filtered, time-varying stimulus amplitude. Threshold is reached when a criterion number of events have occurred (probability summation). No long-term integration is required. We apply the model to an extensive set of thresholds measured in humans for tones of different envelopes and durations and find it to fit well. Subtle differences at long durations may be due to limited attention resources. We confirm the probabilistic nature of the detection events by analyses of simple reaction times and verify the exponent of 3 by validating model predictions for binaural thresholds from monaural thresholds. The exponent originates in the auditory periphery, possibly in the intrinsic Ca²⁺ cooperativity of the Ca²⁺ sensor involved in exocytosis from inner hair cells. It results in growth of the spike rate of auditory-nerve fibers (ANFs) with the 3rd power of the stimulus amplitude before saturating (Heil et al., J Neurosci 31:15424-15437, 2011), rather than with its square (i.e., with stimulus intensity), as is commonly assumed. Our work therefore suggests a link between detection thresholds and a key biochemical reaction in the receptor cells.
Advances in Experimental Medicine and Biology 05/2013; 787:21-29. DOI:10.1007/978-1-4614-1590-9_3 · 1.96 Impact Factor
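The threshold rule described in the abstract above can be sketched numerically. This is a simplified illustration, not the published model: it omits the low-pass filter, folds the proportionality constant into a hypothetical `criterion`, and uses rectangular envelopes with peak 1. The expected number of events is taken as the time integral of the cubed amplitude, and threshold is the amplitude at which that integral reaches the criterion.

```python
import math


def threshold_db(envelope, dt, criterion=1.0, exponent=3.0):
    """Threshold level (dB re an arbitrary unit) for a sampled envelope.

    `envelope` holds the envelope shape (peak 1.0) sampled every `dt` seconds.
    Expected events = amplitude**exponent * integral(envelope**exponent) dt.
    """
    integral = sum(e ** exponent for e in envelope) * dt
    amplitude = (criterion / integral) ** (1.0 / exponent)
    return 20.0 * math.log10(amplitude)


def rectangular(duration, dt):
    """Rectangular envelope of the given duration (seconds)."""
    return [1.0] * round(duration / dt)


dt = 0.001  # 1-ms sampling step (assumed)
# Threshold change when a 100-ms rectangular tone is doubled to 200 ms.
drop_db = threshold_db(rectangular(0.1, dt), dt) - threshold_db(rectangular(0.2, dt), dt)
```

With an exponent of 3, doubling the duration of a rectangular envelope lowers the threshold by (20/3)·log10(2) dB, roughly 2 dB, without any long-term integrator in the sketch.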
ABSTRACT: Low-frequency oscillations in the electroencephalogram (EEG) are thought to reflect periodic excitability changes of large neural networks. Consistent with this notion, detection probability of near-threshold somatosensory, visual, and auditory targets has been reported to co-vary with the phase of oscillations in the EEG. In audition, entrainment of δ-oscillations to the periodic occurrence of sounds has been suggested to function as a mechanism of attentional selection. Here, we examine in humans whether the detection of brief near-threshold sounds in quiet depends on the phase of EEG oscillations. When stimuli were presented at irregular intervals, we did not find a systematic relationship between detection probability and phase. When stimuli were presented at regular intervals (2-s), reaction times were significantly shorter and we observed phase entrainment of EEG oscillations corresponding to the frequency of stimulus presentation (0.5 Hz), revealing an adjustment of the system to the regular stimulation. The amplitude of the entrained oscillation was higher for hits than for misses, suggesting a link between entrainment and stimulus detection. However, detection was independent of phase at frequencies ≥1 Hz. Furthermore, we show that when the data are analyzed using acausal, though common, algorithms, an apparent "entrainment" of the δ-phase to presented stimuli emerges and detection probability appears to depend on δ-phase, similar to reports in the literature. We show that these effects are artifacts from phase distortion at stimulus onset by contamination with the event-related potential, which differs markedly for hits and misses. This highlights the need to carefully deal with this common problem, since otherwise it might bias and mislead this exciting field of research.
Frontiers in Psychology 05/2013; 4:262. DOI:10.3389/fpsyg.2013.00262 · 2.80 Impact Factor
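The point about acausal algorithms in the abstract above can be illustrated with a toy example. The sketch below is not the authors' analysis: a moving average stands in for the filters used in EEG phase analysis, and the step-shaped "response" and window widths are assumptions. It shows that a symmetric (acausal) smoother lets post-stimulus activity leak into pre-stimulus samples, whereas a causal one cannot.

```python
def trailing_average(signal, width):
    """Causal smoother: each output uses the current and earlier samples only."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def centered_average(signal, half_width):
    """Acausal smoother: each output also uses samples that come later in time."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half_width): i + half_width + 1]
        out.append(sum(window) / len(window))
    return out


onset = 50                                  # sample index of stimulus onset (assumed)
signal = [0.0] * onset + [1.0] * 50         # silent baseline, then an evoked step
causal = trailing_average(signal, 11)
acausal = centered_average(signal, 5)

# Pre-stimulus output: exactly zero for the causal smoother, nonzero for the acausal one.
pre_causal = causal[onset - 1]
pre_acausal = acausal[onset - 1]
```

Any pre-stimulus phase estimate taken from the acausal output is therefore partly shaped by the evoked response, which is the kind of contamination the abstract warns about.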
ABSTRACT: Grand means of time-varying signals (waveforms) across subjects in magnetoencephalography (MEG) and electroencephalography (EEG) are commonly computed as arithmetic averages and compared between conditions, for example, by subtraction. However, the prerequisite for these operations, homogeneity of the variance of the waveforms in time, and for most common parametric statistical tests also between conditions, is rarely met. We suggest that the heteroscedasticity observed instead results because waveforms may differ by factors and additive terms and follow a mixed model. We propose to apply the asinh-transformation to stabilize the variance in such cases. We demonstrate the homogeneous variance and the normal distributions of data achieved by this transformation using simulated waveforms, and we apply it to real MEG data and show its benefits. The asinh-transformation is thus an essential and useful processing step prior to computing and comparing grand mean waveforms in MEG and EEG.
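A minimal sketch of why the asinh transformation suits data that differ by factors: for large magnitudes asinh(x) behaves like sign(x)·ln(2|x|), so a common scaling factor turns into an additive offset, while near zero it stays approximately linear and keeps the sign. The `scale` parameter and the sample values below are assumptions for illustration; the abstract does not specify how the data are scaled before the transformation.

```python
import math


def asinh_transform(waveform, scale=1.0):
    """Apply asinh to each sample after dividing by a (hypothetical) scale."""
    return [math.asinh(v / scale) for v in waveform]


# Hypothetical large-amplitude samples from one subject, and the same samples
# from a second subject whose responses are three times larger.
base = [50.0, -80.0, 120.0]
scaled = [3.0 * v for v in base]

# After the transformation the factor of 3 appears as an offset of about
# ln(3) in magnitude, with the sign following the sample.
shift = [b - a for a, b in zip(asinh_transform(base), asinh_transform(scaled))]
```

Because the multiplicative difference becomes a nearly constant offset, the spread across subjects no longer grows with the size of the response, which is the variance-stabilizing effect the abstract describes.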
ABSTRACT: Our study estimates detection thresholds for tones of different durations and frequencies in Great Tits (Parus major) with operant procedures. We employ signals covering the duration and frequency range of communication signals of this species (40-1,010 ms; 2, 4, 6.3 kHz), and we measure threshold level-duration (TLD) functions (relating threshold level to signal duration) in silence as well as under behaviorally relevant environmental noise conditions (urban noise, woodland noise). Detection thresholds decreased with increasing signal duration. Thresholds at any given duration were a function of signal frequency and were elevated in background noise, but the shape of Great Tit TLD functions was independent of signal frequency and background condition. To enable comparisons of our Great Tit data to those from other species, TLD functions were first fitted with a traditional leaky-integrator model. We then applied a probabilistic model to interpret the trade-off between signal amplitude and duration at threshold. Great Tit TLD functions exhibit features that are similar across species. The current results, however, cannot explain why Great Tits in noisy urban environments produce shorter song elements or faster songs than those in quieter woodland environments, as detection thresholds are lower for longer elements also under noisy conditions.
ABSTRACT: Detection thresholds for pairs or multiple copies of sounds are better than those for a single sound, an observation commonly interpreted as indicating temporal integration by the auditory system. Detection thresholds for pairs of brief tones depend on the delay between the tones (if short) and on frequency, suggesting frequency-dependent temporal overlap of auditory-filter responses elicited by the two successive stimuli (Krumbholz and Wiegrebe, 1998). The model presented by Krumbholz and Wiegrebe did not account for all aspects of their data, despite its complexity. This study shows that a simple probabilistic model based on Neubauer and Heil (2008) predicts the increase in threshold for short temporal delays as well as the asymptotic behaviour towards longer delays. The model entails (i) a 4th-order gammatone filter with a brief impulse response and thus broad bandwidth (shorter and broader than those of a filter normally assumed), (ii) the formation of stochastic 'spikes' or 'events' whose probability of occurrence is proportional to the filter output (half-wave rectified fine-structure or amplitude envelope), raised to a power of 3, and (iii) probability summation. The same model with the same front-end filter also predicts thresholds for pairs of clicks presented in band-reject noise, measured by Hall and Lummis (1973). The model accurately predicts the magnitudes and the decay of the alternating increase and decrease of thresholds as the delay between the clicks varies, the small effects of click polarity, and the dependence of thresholds for pairs of clicks with unequal intensities on their temporal order. Finally, we show that this model also correctly predicts the decrease in threshold with increasing number of temporally separated brief sounds, reported in several studies. While the latter data do not constrain the characteristics of the front-end filter, they do confirm the exponent of 3 in the model. Our paper stresses the viability of the model and raises the possibility that the bandwidths of filters estimated with psychophysical techniques may depend more strongly on the experimental paradigms and stimuli than hitherto thought.
Hearing Research 12/2012; 296. DOI:10.1016/j.heares.2012.12.002 · 2.97 Impact Factor
ABSTRACT: The build-up of auditory stream segregation refers to the notion that sequences of alternating A and B sounds initially tend to be heard as a single stream, but with time appear to split into separate streams. The central assumption in the analysis of this phenomenon is that streaming sequences are perceived as one stream at the beginning by default. In the present study, we test the validity of this assumption and document its impact on the apparent build-up phenomenon. Human listeners were presented with ABAB sequences, where A and B were harmonic tone complexes of seven different fundamental frequency separations (Δf) ranging from 2 to 14 semitones. Subjects had to indicate, as promptly as possible, their initial percept of the sequences, as either "one stream" or "two streams," and any changes thereof during the sequences. We found that subjects did not generally indicate a one-stream percept at the beginning of streaming sequences. Instead, the first perceptual decision depended on Δf, with the probability of a one-stream percept decreasing, and that of a two-stream percept increasing, with increasing Δf. Furthermore, subjects required some time to make and report a decision on their perceptual organization. Taking this time into account, the resulting time courses of two-stream probabilities differ markedly from those suggested by the conventional analysis. A build-up-like increase in two-stream probability was found only for the Δf of six semitones. At the other Δf conditions no or only minor increases in two-stream probability occurred. These results shed new light on the build-up of stream segregation and its possible neural correlates.
Frontiers in Psychology 10/2012; 3:461. DOI:10.3389/fpsyg.2012.00461 · 2.80 Impact Factor
ABSTRACT: The amplitudes of the most prominent component of auditory evoked magnetic fields and electrical potentials, the M100 and N100, recorded from the human scalp depend on the duration of the stimulus onset interval (SOI). Here, we show, using magnetoencephalography, that the SOI dependence of the M100 amplitude strongly depends upon whether stimuli with different SOIs are presented in a conventional block design or in a random manner. This differential dependence reveals that the M100 is affected not only by the stimulus evoking it and by its predecessor, but by a longer-term history of stimulation. We provide a parsimonious model that accounts for our findings with both designs in a quantitative manner. It assumes a transient, temporally asymmetric reduction in the excitability of a fraction of potentially excitable neurons. A rather stereotyped response function may therefore underlie the stimulation-history effects in the human auditory cortex.
ABSTRACT: Acoustic information is conveyed to the brain by the spike patterns in auditory-nerve fibers (ANFs). In mammals, each ANF is excited via a single ribbon synapse in a single inner hair cell (IHC), and the spike patterns therefore also provide valuable information about those intriguing synapses. Here we reexamine and model a key property of ANFs, the dependence of their spike rates on the sound pressure level of acoustic stimuli (rate-level functions). We build upon the seminal model of Sachs and Abbas (1974), which provides good fits to experimental data but has limited utility for defining physiological mechanisms. We present an improved, physiologically plausible model according to which the spike rate follows a Hill equation and spontaneous activity and its experimentally observed tight correlation with ANF sensitivity are emergent properties. We apply it to 156 cat ANF rate-level functions using frequencies where the mechanics are linear and find that a single Hill coefficient of 3 can account for the population of functions. We also demonstrate a tight correspondence between ANF rate-level functions and the Ca²⁺ dependence of exocytosis from IHCs, and derive estimates of the effective intracellular Ca²⁺ concentrations at the individual active zones of IHCs. We argue that the Hill coefficient might reflect the intrinsic, biochemical Ca²⁺ cooperativity of the Ca²⁺ sensor involved in exocytosis from the IHC. The model also links ANF properties with properties of psychophysical absolute thresholds.
The Journal of Neuroscience : The Official Journal of the Society for Neuroscience 10/2011; 31(43):15424-37. DOI:10.1523/JNEUROSCI.1638-11.2011 · 6.34 Impact Factor
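For readers unfamiliar with the Hill equation mentioned in the abstract above, here is a generic sketch with a coefficient of 3. It is only the textbook Hill function: the abstract does not give the full model, in particular how spontaneous activity emerges, so that part is not represented. The parameter names and values are hypothetical and in arbitrary units.

```python
def hill_rate(amplitude, max_rate, half_amplitude, coefficient=3.0):
    """Generic Hill function: rate as a saturating power function of amplitude.

    `half_amplitude` is the amplitude at which the rate is half of `max_rate`.
    """
    a = amplitude ** coefficient
    return max_rate * a / (half_amplitude ** coefficient + a)


# Hypothetical fiber: maximum rate 200 spikes/s, half-maximal at amplitude 1.0.
half_max = hill_rate(1.0, 200.0, 1.0)

# Far below saturation, doubling the amplitude multiplies the rate by about 2**3.
low_ratio = hill_rate(0.02, 200.0, 1.0) / hill_rate(0.01, 200.0, 1.0)
```

The near-eightfold growth per doubling of amplitude at low levels is the third-power behaviour that the abstract links to psychophysical absolute thresholds.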
ABSTRACT: MEG and EEG studies of event-related responses often involve comparisons of grand averages, requiring homogeneity of the variances. Here, we examine the possibility, implied by the nature of neural sources and the measuring principles involved, that the M100 component of auditory-evoked magnetic fields of different subjects, hemispheres, to different stimuli, and at different sensors differs by scaling factors. Such a multiplicative model predicts a linear increase in the standard deviation with the mean, and thus would have important implications for averaging and comparing such data. Our analyses, at the sensor and the source level, clearly show that the multiplicative model applies. We therefore propose geometric, rather than arithmetic, averaging of the M100 component across subjects and suggest a novel and superior normalization procedure. Our results question the justification of the common practice of subtracting arithmetic grand averages.
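The difference between geometric and arithmetic averaging under a multiplicative model can be sketched as follows. The amplitudes are invented for illustration, and the sketch covers only positive magnitudes; it does not reproduce the normalization procedure proposed in the paper.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed via logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical M100 peak magnitudes of three subjects who differ by scaling factors.
condition_a = [10.0, 20.0, 40.0]
# Second condition: every subject's response is 1.5 times larger.
condition_b = [1.5 * v for v in condition_a]

gm_a = geometric_mean(condition_a)
am_a = arithmetic_mean(condition_a)
# The ratio of geometric means recovers the common factor between conditions.
gm_ratio = geometric_mean(condition_b) / gm_a
```

When subjects differ by factors, the arithmetic mean is pulled towards the subjects with the largest responses, whereas the geometric mean weights all subjects equally on a logarithmic scale.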
ABSTRACT: Several recent studies of mature auditory and vestibular hair cells (HCs), and of visual and olfactory receptor cells, have observed nearly linear dependencies of the rate of neurotransmitter release events, or related measures, on the magnitude of Ca²⁺-entry into the cell. These relationships contrast with the highly supralinear, third to fourth power, Ca²⁺-dependencies observed in most preparations, from neuromuscular junctions to central synapses, and also in HCs from immature and various mutant animals. They also contrast with the intrinsic, biochemical, Ca²⁺-cooperativity of the ubiquitous Ca²⁺-sensors involved in fast exocytosis (synaptotagmins I and II). Here, we propose that the quasi-linear dependencies result from measuring the sum of several supralinear, but saturating, dependencies with different sensitivities at individual active zones of the same cell. We show that published experimental data can be accurately accounted for by this summation model, without the need to assume altered Ca²⁺-cooperativity or nanodomain control of release. We provide support for the proposal that the best power is 3, and we discuss the large body of evidence for our summation model. Overall, our idea provides a parsimonious and attractive reconciliation of the seemingly discrepant experimental findings in different preparations.
Frontiers in Synaptic Neuroscience 11/2010; 2:148. DOI:10.3389/fnsyn.2010.00148
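The summation idea in the abstract above can be illustrated with a toy calculation. The sketch sums several saturating third-power (Hill-type) functions with different half-saturation points and compares the apparent log-log slope of the sum with that of a single function. The functional form, the half-saturation values, and the ranges are assumptions chosen for illustration, not parameters from the paper.

```python
import math


def hill(c, half_point, power=3.0):
    """Saturating supralinear dependence of release on Ca2+ entry at one active zone."""
    return c ** power / (half_point ** power + c ** power)


def summed_release(c, half_points, power=3.0):
    """Whole-cell release: the sum over active zones with different sensitivities."""
    return sum(hill(c, k, power) for k in half_points)


def loglog_slope(f, lo, hi):
    """Average slope of f between lo and hi on double-logarithmic axes."""
    return (math.log(f(hi)) - math.log(f(lo))) / (math.log(hi) - math.log(lo))


# Hypothetical active zones whose sensitivities span two orders of magnitude.
half_points = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]

# A single zone far below saturation shows the full third-power dependence.
single_slope = loglog_slope(lambda c: hill(c, 100.0), 1.0, 2.0)
# The sum over zones, measured across the middle of the range, is much shallower.
summed_slope = loglog_slope(lambda c: summed_release(c, half_points), 3.0, 30.0)
```

How shallow the summed curve appears depends on how the sensitivities are distributed, so the sketch shows only the qualitative effect: summing saturating supralinear components lowers the apparent power without changing the power at any individual zone.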
ABSTRACT: All acoustic information relayed to the central nervous system is encoded in the spiking patterns of auditory-nerve (AN) fibres. Here we re-examine and model the dependence of the spike rates of AN fibres on the amplitude of tonal stimuli, building upon the seminal study of Sachs and Abbas (1974). These authors modelled the spike rate vs. sound amplitude functions of AN fibres as the result of the interaction of a 'mechanical stage', describing basilar membrane displacement as a function of sound amplitude, with a 'transducer stage', converting displacement into AN fibre spike rate. The latter stage was modelled as a saturating power function, and spontaneous rate was assumed to simply add to the sound-driven rate. However, the 'transducer stage' of the model – though widely used – has several limitations. Here, we present a physiologically plausible modification of this stage. With this modification, spontaneous activity and its tight correlation with AN fibre sensitivity are emergent properties of the model. Furthermore, we show that for frequencies well below characteristic frequency (CF), where the mechanics are linear, the power which best accounts for all 154 measured cat AN fibre rate-level functions is 3, independent of spontaneous rate or CF. Since this power is the same as that obtained from analysis of absolute thresholds at the perceptual level (Heil and Neubauer, 2003; Neubauer and Heil, 2004), our model also unites AN fibre properties with psychophysics.