
Effects of Stimulation Rate, Mode and Level on Modulation Detection by Cochlear Implant Users


In cochlear implant (CI) patients, temporal processing is often poorest at low listening levels, making perception difficult for low-amplitude temporal cues that are important for consonant recognition and/or speech perception in noise. It remains unclear how speech processor parameters such as stimulation rate and stimulation mode may affect temporal processing, especially at low listening levels. The present study investigated the effects of these parameters on modulation detection by six CI users. Modulation detection thresholds (MDTs) were measured as functions of stimulation rate, mode, and level. Results show that for all stimulation rate and mode conditions, modulation sensitivity was poorest at quiet listening levels, consistent with results from previous studies. MDTs were better with the lower stimulation rate, especially for quiet-to-medium listening levels. Stimulation mode had no significant effect on MDTs. These results suggest that, although high stimulation rates may better encode temporal information and widen the electrode dynamic range, CI patients may not be able to access these enhanced temporal cues, especially at the lower portions of the dynamic range. Lower stimulation rates may provide better recognition of weak acoustic envelope information.
JOHN J. GALVIN III AND QIAN-JIE FU
Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, CA 90057, USA
Received: 25 April 2005; Accepted: 6 June 2005; Online publication: 2 August 2005
Keywords: modulation detection threshold, stimulation rate, stimulation mode, stimulation level, temporal cues
INTRODUCTION
When spectral information is degraded and/or distorted by cochlear implant (CI) speech processing, temporal envelope cues contribute greatly to speech recognition (Shannon et al. 1995; Turner et al. 1995; van Tasell et al. 1987, 1992). For CI patients and normal-hearing (NH) subjects listening to acoustic CI simulations, slowly varying temporal components (<20 Hz) provide the most useful phonetic information (Drullman et al. 1994a, b; Fu and Shannon 2000; Shannon et al. 1995; van Tasell et al. 1987, 1992). Recent studies have also shown that higher-frequency periodicity cues (50–500 Hz) contribute to perception of suprasegmental information, such as voice gender recognition (Fu et al. 2004) and tone recognition for tonal languages (Fu et al. 1998; Fu and Zeng 2000; Xu et al. 2002).
In CI speech processing, acoustic amplitude envelopes are encoded by modulating the electric current delivered to implanted electrodes. The transmission of acoustic envelope cues is primarily limited by the stimulation rate of the implant device; the small electrode dynamic range (DR) and limited amplitude resolution can further reduce the saliency of temporal cues. Presumably, higher stimulation rates can provide better coding of periodicity cues. Besides providing better temporal sampling, higher stimulation rates may increase the stochastic response properties of the activated neurons (Rubinstein et al. 1999; Wilson et al. 1997a, b), thereby reducing the unnatural phase-locking activity of neural firing patterns. Much recent development of the CI has been toward increasing the stimulation rate to provide better temporal coding, lower stimulation thresholds, and wider dynamic ranges. However, in terms of CI patient performance, no clear advantage has been consistently shown with high stimulation rates (Brill et al. 1997, 1998a, b; Friesen et al. 2005; Fu and Shannon 2000; Holden et al. 2002; Lawson et al. 1996; Loizou et al. 2000; Skinner 2003; Vandali et al. 2000). These studies revealed high intersubject variability; although some patients displayed better recognition with increased stimulation rates, some were not affected by changes in rate and others performed best at one particular rate.
Correspondence to: John J. Galvin III, Department of Auditory Implants and Perception, House Ear Institute, 2100 West Third Street, Los Angeles, CA 90057, USA. email: jgalvin@hei.org
JARO (Journal of the Association for Research in Otolaryngology) 6: 269–279 (2005)
DOI: 10.1007/s10162-005-0007-6
Although stimulation rate may limit the transmission of acoustic envelope cues, CI patients’ modulation sensitivity limits the reception of envelope information. Temporal modulation detection has been widely studied in CI and NH listeners (e.g., Bacon and Viemeister 1985; Burns and Viemeister 1981; Donaldson and Viemeister 2000; Formby 1986; Forrest and Green 1987; Fu 2002; Shannon 1992). Significant differences have been noted between CI and NH listeners’ modulation sensitivity. For example, CI listeners are generally less sensitive to higher modulation frequencies (>150 Hz) than NH listeners. Also, modulation detection thresholds (MDTs) improve strongly with stimulus level in CI listeners, whereas modulation sensitivity is nearly independent of stimulus level for NH listeners, except at the lowest listening levels. Many studies have also evaluated the relationship between CI patients’ modulation sensitivity and speech performance (e.g., Blamey et al. 1992; Cazals et al. 1991, 1994; Fu 2002; Muchnik et al. 1993). Cazals et al. (1994) reported that CI users’ speech performance was highly correlated with the cutoff frequency of patients’ temporal modulation transfer function (TMTF; MDTs as a function of modulation frequency). Fu (2002) showed strong individual differences in patients’ MDTs as a function of loudness level. Results also showed a highly significant correlation between mean MDTs (averaged over subjects’ dynamic range) and phoneme recognition scores. A similar correlation was recently observed in patients with the auditory brainstem implant (Shannon and Colletti 2005).
The inconclusive data from previous studies suggest that, at the very least, not all CI patients may benefit from high stimulation rates. Although higher rates may transmit more temporal information, it is unclear whether CI patients can effectively access these cues, especially at lower listening levels where higher-rate envelopes may be coded (e.g., consonants). The present study examined the effects of stimulation rate, mode, and level on CI patients’ MDTs for a 20-Hz sinusoidally amplitude modulated pulse train. Relatively high and low stimulation rates were studied to examine the effects of temporal coding and the size of the dynamic range on MDTs. Stimulation mode refers to the electrode configuration used for electrical stimulation. A bipolar stimulation mode refers to stimulation between an active and return intracochlear electrode, whereas a monopolar stimulation mode refers to stimulation between an active intracochlear electrode and an extracochlear return electrode. A narrow stimulation mode (e.g., BP + 1) refers to a small cochlear distance between the stimulated electrodes, whereas a wide stimulation mode (e.g., BP + 13 or monopolar) refers to a large cochlear distance between the stimulated electrodes. Relatively wide and narrow stimulation modes were studied to examine the effects of spread of excitation and the subsequent shift in dynamic range on MDTs. Stimulation levels spanning the entire dynamic range were studied to examine the effects of loudness level on MDTs, across the stimulation rate and mode conditions.
METHODS
Subjects
Two Nucleus-24 (N24) and four Nucleus-22 (N22)
users participated in the experiment. All CI subjects
were postlingually deafened and all had more than
two years’ experience with their implant device.
Relevant subject details are shown in Table 1.
TABLE 1
Relevant information for CI users who participated in the experiment
Subject  Age  Etiology       Prosthesis  CI experience (years)  Vowel recognition (% correct)  Consonant recognition (% correct)
S1       62   Hereditary     Nucleus 24  2                      83                             85
S2       72   Unknown        Nucleus 24  5                      68                             65
S3       45   Trauma         Nucleus 22  13                     92                             81
S4       61   Unknown        Nucleus 22  15                     73                             72
S5       61   Hereditary     Nucleus 22  14                     78                             75
S6       65   Noise-induced  Nucleus 22  9                      86                             67
Stimuli
For all subjects, MDTs were measured with two carrier stimulation rates: 250 pulses per second (pps) and 2000 pps. Although other rates between these experimental rates might also influence MDTs, these rates were chosen because they generally correspond with the rates typically used in speech processors (250 pps for N22 patients fitted with the SPEAK strategy, often nearly 2000 pps for N24 patients fitted with the ACE strategy). All subjects were tested with a relatively narrow electrode configuration (BP + 3). Four subjects were also tested using wider electrode configurations to see whether the spread of excitation would influence modulation sensitivity; monopolar stimulation was tested in two N24 users and BP + 13 stimulation was tested in two N22 users. Table 2 shows the electrode configurations, electrode numbers, and stimulation rates tested for each subject. All stimuli were biphasic pulse trains; the duration was 300 ms, the pulse phase duration was 100 µs, and the interphase gap was 45 µs. Stimuli were delivered via a custom research interface developed at House Ear Institute (Wygonski and Robert 2002).
For the modulated stimuli, amplitude modulation was applied to the carrier pulse train using the following equation: $f(t)\,[1 + m \sin(2\pi f_m t)]$, where $f(t)$ is the unmodulated pulse train, $m$ is the modulation index (modulation depth), and $f_m$ is the modulation frequency. The modulation depth was adaptively varied during the modulation detection test. The modulation frequency was fixed at 20 Hz.
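As an illustration of the modulation equation above, the following minimal sketch computes per-pulse amplitudes for a sinusoidally amplitude-modulated pulse train at the two carrier rates. The function name `sam_pulse_amplitudes`, the base amplitude of 100 (arbitrary linear units), and the example depth m = 0.1 are assumptions made for illustration only; the source does not state in which units the modulation was applied or how the research interface generated the stimuli.

```python
import math

def sam_pulse_amplitudes(base_amp, rate_pps, dur_s, m, fm_hz=20.0):
    """Per-pulse amplitudes base_amp * (1 + m * sin(2*pi*fm*t)).

    base_amp : amplitude of the unmodulated carrier (arbitrary linear units,
               an assumption; the paper does not specify the unit used)
    rate_pps : carrier stimulation rate in pulses per second
    dur_s    : stimulus duration in seconds
    m        : modulation index (0..1)
    fm_hz    : modulation frequency in Hz (20 Hz in the study)
    """
    n_pulses = round(rate_pps * dur_s)
    return [
        base_amp * (1.0 + m * math.sin(2.0 * math.pi * fm_hz * k / rate_pps))
        for k in range(n_pulses)
    ]

# 300-ms stimuli at the two carrier rates used in the study (m is hypothetical)
low = sam_pulse_amplitudes(100.0, 250, 0.3, m=0.1)
high = sam_pulse_amplitudes(100.0, 2000, 0.3, m=0.1)
print(len(low), len(high))
```

Under these assumptions the 250 pps carrier samples each 20-Hz modulation cycle with 12.5 pulses and the 2000 pps carrier with 100 pulses, which is the sense in which the higher rate provides finer temporal sampling of the envelope.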
Dynamic range estimation and loudness balancing
Before beginning the modulation detection experiment, the dynamic range (DR) was estimated for all experimental electrodes. Absolute detection thresholds for unmodulated pulse trains were measured using an adaptive three-alternative forced-choice (3AFC) procedure (3-down/1-up). During the test, the stimulus was randomly presented in one of the three intervals. The amplitude of the stimulus was adjusted according to subject response. The final 8 of 12 reversals for each run were averaged to obtain the threshold. Three to six test runs were conducted for each electrode pair, and the means from all test runs were averaged to obtain the mean threshold for each electrode pair. Maximum acceptable loudness (MAL) levels were obtained by using a method of limits; MAL was defined as the loudest sound that the subject would be willing to listen to for an extended period of time (e.g., during a psychophysical test). The experimenter slowly and incrementally raised the stimulation level until the subject reported that MAL was achieved. Three to six test runs were conducted for each electrode pair, and the means from all test runs were averaged to obtain the mean MAL. For each electrode pair, the estimated DR was obtained by subtracting the mean threshold from the mean MAL. Mean threshold, mean MAL, and estimated DR are shown for individual subjects and experimental electrodes in Table 2. Threshold and MAL levels are reported in dB re: 1 µA.
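The DR arithmetic described above can be sketched as follows. The helper names are hypothetical; the conversion assumes the usual amplitude convention for a dB scale referenced to 1 µA (level in dB = 20 log10 of the current in µA), and the example values are those listed in Table 2 for subject S1, electrode pair (13, 17), at 250 pps.

```python
def db_to_microamps(level_db):
    """Convert a level in dB re: 1 uA to current in uA (20*log10 convention)."""
    return 10.0 ** (level_db / 20.0)

def dynamic_range_db(threshold_db, mal_db):
    """Estimated DR: mean MAL minus mean threshold, both in dB re: 1 uA."""
    return mal_db - threshold_db

# Values from Table 2: S1, electrode pair (13, 17), 250 pps
thr_db, mal_db = 44.24, 51.64
dr_db = dynamic_range_db(thr_db, mal_db)
thr_ua = db_to_microamps(thr_db)
mal_ua = db_to_microamps(mal_db)
print(round(dr_db, 2), round(thr_ua, 1), round(mal_ua, 1))
```

Because the DR is a difference of levels in dB, it corresponds to a ratio of currents rather than a difference in µA.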
To equate loudness levels across experimental conditions, presentation levels were loudness-balanced to reference levels of a reference electrode pair. Note that only unmodulated pulse trains were loudness-balanced. For all subjects (except S2), electrode pair (9, 13) served as the reference electrode pair. Electrode pair (14, 18) served as the reference electrode pair for S2 because of the limited electrode insertion depth. For the reference electrode pair, the stimulation rate was 1000 pps, the duration was 300 ms, the pulse phase duration was 100 µs, and the interphase gap was 45 µs. The DR for the reference electrode pair was estimated using the methods described above; six threshold and six MAL measurements were used to estimate the DR. Eight reference levels were linearly distributed across the DR (in µA), corresponding to 5%, 15%, 25%, 35%, 45%, 55%, 65%, and 75% of the reference electrode pair’s DR. Stimulation levels for the experimental stimulation rate and mode conditions were loudness-balanced to these reference levels using a 2AFC, double-staircase procedure (Jesteadt 1980; Zeng and Turner 1991). For each reference level, the amplitude of the experimental electrode was adjusted according to subject response. For each experimental electrode pair, the means of four to six loudness-balanced amplitudes for each reference level were used as presentation levels for the modulation detection experiments. Thus, across the stimulation rate and mode conditions, stimulation levels were loudness-balanced for listening levels ranging from very soft to comfortably loud.
TABLE 2
Mean thresholds, mean MALs and estimated DRs for individual subjects and experimental electrodes
Subject  Electrode  Rate (pps)  Thresh (dB)  MAL (dB)  DR (dB)
S1       (13, 17)   250         44.24        51.64     7.40
         (13, 17)   2000        35.74        51.40     15.67
         (17)       250         36.36        46.50     10.14
         (17)       2000        28.04        46.23     18.19
S2       (18, 22)   250         49.44        56.68     7.24
         (18, 22)   2000        42.86        55.71     12.85
         (22)       250         38.37        45.74     7.37
         (22)       2000        30.74        45.04     14.30
S3       (14, 18)   250         48.51        59.10     10.59
         (14, 18)   2000        40.43        58.95     18.52
         (4, 18)    250         39.80        50.98     11.18
         (4, 18)    2000        33.15        49.09     15.95
S4       (14, 18)   250         48.82        54.19     5.37
         (14, 18)   2000        41.99        53.29     11.31
         (4, 18)    250         42.43        48.60     6.17
         (4, 18)    2000        35.17        45.21     10.05
S5       (14, 18)   250         50.44        55.78     5.34
         (14, 18)   2000        42.32        54.36     12.05
S6       (14, 18)   250         46.82        53.46     6.64
         (14, 18)   2000        38.94        51.30     12.36
Threshold and MAL values are in dB re: 1 µA. DR was calculated as the difference between MAL and threshold in dB.
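The placement of the eight reference levels can be sketched as below, assuming that "linearly distributed across the DR (in µA)" means that each level sits at the stated percentage of the distance from threshold to MAL on a linear current scale. The function name and the reference-electrode threshold and MAL used here (40 and 50 dB re: 1 µA) are hypothetical; the source does not report the reference electrode's values.

```python
REFERENCE_PERCENTS = (5, 15, 25, 35, 45, 55, 65, 75)

def reference_levels_ua(threshold_db, mal_db, percents=REFERENCE_PERCENTS):
    """Reference levels in uA, spaced linearly in current between threshold and MAL.

    threshold_db, mal_db are in dB re: 1 uA (20*log10 convention assumed).
    """
    thr_ua = 10.0 ** (threshold_db / 20.0)
    mal_ua = 10.0 ** (mal_db / 20.0)
    return [thr_ua + (p / 100.0) * (mal_ua - thr_ua) for p in percents]

# Hypothetical reference electrode: threshold 40 dB, MAL 50 dB re: 1 uA
levels = reference_levels_ua(40.0, 50.0)
for pct, ua in zip(REFERENCE_PERCENTS, levels):
    print(f"{pct:2d}% DR -> {ua:6.1f} uA")
```

The experimental electrodes would then be loudness-balanced to each of these levels; that adaptive step depends on subject responses and is not modeled here.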
Modulation detection
MDTs were measured for each stimulation rate and mode condition as a function of loudness level. An adaptive 3AFC procedure was used in which the modulation depth was varied according to subject response (3-down/1-up). Two intervals (randomly selected) contained unmodulated pulse trains, whereas the third interval (randomly selected) contained the modulated pulse train. The test electrode, stimulation rate, and stimulation level were fixed during each test run; only the modulation depth was varied from trial to trial. The final 8 of 12 reversals for each run were averaged to obtain the MDT; three to six test runs were conducted for each experimental electrode pair at each listening level. The modulation thresholds (in percent) were converted to log scale (20 log m) to allow for easier comparison across test conditions and listening levels.
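The adaptive rule and the threshold computation described above can be sketched as follows. Several details are assumptions rather than the authors' procedure: the source does not give the step size or starting depth, so a fixed 2-dB step starting at 0 dB (m = 1) is used; the track is run directly in dB (20 log m); and the simulated listener is a deterministic stand-in that answers correctly whenever the depth is at or above a hypothetical true threshold of -20 dB, whereas a real 3AFC listener would also be correct by chance on a third of the trials.

```python
import math

def depth_to_db(m):
    """Convert a modulation index m (0 < m <= 1) to dB: 20*log10(m)."""
    return 20.0 * math.log10(m)

def run_staircase(respond, start_db=0.0, step_db=2.0,
                  n_reversals=12, n_average=8, max_trials=10000):
    """3-down/1-up track on modulation depth (dB); returns mean of final reversals.

    respond(depth_db) -> True if the listener picked the modulated interval.
    """
    depth_db = start_db
    correct_in_row = 0
    last_direction = 0          # -1: depth decreased, +1: depth increased
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(depth_db):
            correct_in_row += 1
            if correct_in_row < 3:
                continue        # need three consecutive correct to step down
            correct_in_row = 0
            direction = -1
        else:
            correct_in_row = 0
            direction = +1      # a single error makes the task easier
        if last_direction != 0 and direction != last_direction:
            reversals.append(depth_db)
        last_direction = direction
        depth_db = min(0.0, depth_db + direction * step_db)  # m cannot exceed 1
    else:
        raise RuntimeError("staircase did not reach the required reversals")
    return sum(reversals[-n_average:]) / n_average

# Deterministic stand-in listener with a hypothetical true threshold of -20 dB
mdt_db = run_staircase(lambda depth_db: depth_db >= -20.0)
print(mdt_db)
```

With this idealized listener the track simply oscillates between the last detectable and first undetectable depths; with a probabilistic listener, a 3-down/1-up rule is generally taken to converge near the 79.4% correct point of the psychometric function.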
RESULTS
Table 2 shows the thresholds, MALs, and DRs for each experimental electrode. For all subjects, the DR was significantly larger with the 2000 pps stimulation rate than with the 250 pps rate (paired Student’s t test, p < 0.001). The increased DR with the high rate was largely a result of lower detection thresholds; threshold levels were significantly lower with the high rate (p < 0.001); MALs were slightly but significantly lower with the higher rate (p = 0.005). For N24 users S1 and S2, monopolar stimulation also resulted in slightly, but not significantly, larger DRs relative to bipolar stimulation, for both stimulation rates (p = 0.065); threshold (p = 0.003) and MAL (p = 0.016) levels were significantly lower with monopolar stimulation. For N22 users S3 and S4, stimulation mode did not significantly affect the size of the DR (p = 0.501); threshold (p < 0.001) and MAL (p = 0.003) levels were significantly lower with BP + 13 stimulation, effectively shifting the DR to lower current levels. In general, stimulation rate seemed to have the greatest effect on DRs, primarily because of the lower detection thresholds with the 2000 pps rate. Changes in stimulation mode largely resulted in shifted DRs, with small increases in DRs sometimes associated with monopolar or wide bipolar electrode configurations.
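The rate effect on DR can be checked roughly against Table 2 with a paired t statistic. This is only a sketch under a stated assumption: the source does not say which measurements entered its paired t test, so the example uses the six BP + 3 electrode pairs (one per subject) and is not claimed to reproduce the published analysis. The function name `paired_t` is hypothetical.

```python
import math
import statistics

# DR (dB) from Table 2, BP + 3 electrode pairs only, subjects S1..S6
dr_250 = [7.40, 7.24, 10.59, 5.37, 5.34, 6.64]
dr_2000 = [15.67, 12.85, 18.52, 11.31, 12.05, 12.36]

def paired_t(a, b):
    """Return (mean difference b - a, paired t statistic) for matched samples."""
    diffs = [y - x for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err

mean_diff, t_stat = paired_t(dr_250, dr_2000)
print(f"mean DR increase = {mean_diff:.2f} dB, t({len(dr_250) - 1}) = {t_stat:.2f}")
```

For this subset the t statistic lies well above 6.869, the two-tailed critical value for p = 0.001 with 5 degrees of freedom, which is in line with the reported p < 0.001.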
Figure 1 shows individual CI subjects’ modulation detection functions for high and low carrier stimulation rates, as a function of listening level, for a fixed stimulation mode (BP + 3). Each panel shows MDTs for an individual subject with the high and low carrier rates. The x-axis shows the reference loudness level (in percent DR of the reference electrode used in the loudness-balancing procedure); thus, at each reference level, the MDTs for the different carrier rates were for equally loud listening levels. The y-axis shows the MDT in log scale (20 log m); note that lower values represent better modulation sensitivity. The filled circles represent MDTs with the 2000 pps carrier and the open circles represent MDTs with the 250 pps carrier. The dashed and solid lines show sigmoidal fits to the MDT data for the 250 and 2000 pps carriers, respectively.
There were significant differences in the MDT functions across subjects and stimulation rates. For all subjects, MDTs were most strongly affected by the loudness level. In general, MDTs improved as the loudness level was increased, for both stimulation rates. Also, for most subjects, MDTs were better with the 250 pps carrier than with the 2000 pps carrier, especially at the lower presentation levels. A two-way repeated-measures ANOVA showed that MDTs were significantly affected by the stimulation rate ($F_{1,35} = 30.3$, p = 0.003) and level ($F_{7,35} = 35.3$, p < 0.001). There was a significant interaction between stimulation rate and level ($F_{7,35} = 30.6$, p = 0.019), reflected by the better MDTs with the 250 pps carrier at the lower listening levels.
Figure 2 shows the difference in performance between the two carrier rates for the BP + 3 stimulation mode. The x-axis shows the reference loudness level (as in Fig. 1). The y-axis shows the difference in MDTs between the two carrier rates, in dB. For each subject, at each reference level, the MDT with the 2000 pps carrier was subtracted from the MDT with the 250 pps carrier; negative values indicate better MDTs with the 250 pps carrier. Individual subject data are shown by the different symbols. Filled symbols indicate a significant difference in MDTs between the carrier rates (t test, p < 0.05), whereas open symbols indicate no significant difference in MDTs between the carrier rates (t test, p > 0.05). The solid line represents the mean shift in MDTs between the carrier rates, across subjects. As seen in the figure, for most subjects, MDTs were significantly better with the 250 pps carrier, especially at the lower presentation levels. A similar pattern was observed with the monopolar and BP + 13 stimulation modes.
For widely spaced electrode configurations (monopolar or BP + 13), stimulation rate and level had a similar effect on CI users’ modulation sensitivity. A two-way repeated-measures ANOVA showed that MDTs were significantly affected by the stimulation rate ($F_{1,21} = 13.7$, p = 0.034) and level ($F_{7,21} = 38.9$, p < 0.001). There was a significant interaction between stimulation rate and level ($F_{7,21} = 4.27$, p = 0.004). Thus, the effects of stimulation rate and level were similar for both relatively narrow (BP + 3) and broad (monopolar, BP + 13) stimulation modes. Figure 3 shows the shift in MDTs (in dB) between the two stimulation mode conditions, for a fixed carrier rate (250 pps). Similar to Figure 2, the x-axis shows the reference loudness level; the y-axis shows the difference in MDTs between the two stimulation mode conditions, in dB. For each subject, the MDT with BP + 3 stimulation was subtracted from the MDT with either monopolar or BP + 13 stimulation (depending on the subject) at each reference level. Individual subject data are shown by the different symbols. Filled symbols indicate a significant difference in MDTs between the stimulation modes (t test, p < 0.05), whereas open symbols indicate no significant difference in MDTs between the stimulation modes (p > 0.05). The solid line represents the mean shift in MDTs between the stimulation modes, across subjects.
FIG. 1. Modulation detection thresholds (MDTs) for individual CI subjects. Each panel shows individual subject results. The x-axis shows the reference loudness level, in percent DR of the reference electrode. The y-axis shows the MDTs in log scale; note that the scale is optimized for each subject’s range of MDTs. The filled circles represent MDTs with the 2000 pps carrier and the open circles represent MDTs with the 250 pps carrier; the error bars represent 1 standard deviation. The solid and dashed lines represent sigmoid fits to the MDT data for the 2000 and 250 pps carriers, respectively.
FIG. 2. Shift in MDTs between stimulation rates, for BP + 3 stimulation. For each subject, at each loudness-balanced reference level, the MDT with the 2000 pps carrier was subtracted from the MDT with the 250 pps carrier. The x-axis shows the reference loudness level, in percent dynamic range of the reference electrode. The y-axis shows the difference in MDTs between the carrier rates (in dB). The reference line at 0 dB represents no difference in MDTs between the carrier rates. The thick solid line shows the mean performance shift, across subjects. Individual subject data are represented by the different symbols. The filled symbols represent a statistically significant difference in MDTs between the carrier rates (paired t test: p < 0.05); the open symbols represent differences in MDTs that were not statistically significant between the carrier rates (p > 0.05).
FIG. 3. Shift in MDTs between stimulation modes, for 250 pps stimulation rate. For subjects S1 and S2, the MDT with the BP + 3 configuration was subtracted from the MDT with the monopolar configuration. For subjects S3 and S4, the MDT with the BP + 3 configuration was subtracted from the MDT with the BP + 13 configuration. The x-axis shows the reference loudness level, in percent dynamic range of the reference electrode. The y-axis shows the difference in MDTs between the electrode configurations (in dB). The reference line at 0 dB represents no difference in MDTs between the electrode configurations. The thick solid line shows the mean performance shift, across subjects. Individual subject data are represented by the different symbols. The filled symbols represent a statistically significant difference in MDTs between the electrode configurations (t test: p < 0.05); the open symbols represent differences in MDTs that were not statistically significant between the electrode configurations (p > 0.05).
For most subjects, there was no significant difference in MDTs between the stimulation mode conditions at most presentation levels. For the 250 pps stimulation rate, a two-way repeated-measures ANOVA showed that MDTs were significantly affected by stimulation level ($F_{7,21} = 28.8$, p < 0.001); however, stimulation mode had no effect ($F_{1,21} = 0.006$, p = 0.94) and there was no significant interaction between stimulation level and mode ($F_{7,21} = 0.57$, p = 0.77). Similarly, for the 2000 pps stimulation rate, a two-way repeated-measures ANOVA showed that MDTs were significantly affected by stimulation level ($F_{7,21} = 26.9$, p < 0.001); however, stimulation mode had no effect ($F_{7,21} = 0.263$, p = 0.64) and there was no significant interaction between stimulation level and mode ($F_{7,21} = 0.32$, p = 0.935).
To find more global effects of stimulation mode and rate on CI subjects’ modulation sensitivity, mean MDTs were calculated across all reference levels for each subject, for each configuration/rate condition. Figure 4 shows mean MDTs (across all stimulation levels) for individual subjects. The differently shaded bars represent the different stimulation rate/mode conditions. For the four subjects who completed all stimulation rate/mode conditions, a two-way repeated-measures ANOVA showed that mean MDTs were significantly affected by stimulation rate ($F_{1,3} = 19.4$, p = 0.022); again, stimulation mode had no effect ($F_{1,3} = 0.144$, p = 0.730) and there was no significant interaction between stimulation rate and mode ($F_{1,3} = 0.221$, p = 0.670). Although individual subjects may have differed in terms of mean modulation sensitivity across the DR, mean modulation sensitivity was consistently better with the lower stimulation rate.
FIG. 4. Mean MDTs (across entire DR). The x-axis shows individual subjects. The y-axis shows the mean MDT in dB; for each subject and condition, MDTs were averaged across all reference levels. The bars represent different electrode configuration and carrier rate conditions.
FIG. 5. Mean MDTs (across the range of reference levels) as a function of dynamic range (measured between the minimum and maximum reference levels). Individual subject data are represented by the different symbols; the different stimulation mode/rate conditions are represented by the different fill shades. The solid line represents the linear regression for all data.
Figure 5 shows subjects’ mean MDTs (calculated across all reference levels, as in Figure 4) as a function of the DR (between the minimum and maximum loudness-balanced reference levels). Individual subject data are shown by different symbols and the different test conditions are shown by the fill shades. A nonsignificant linear regression across all subject data showed no correlation between the DR and modulation sensitivity ($r^2 = 0.11$).
DISCUSSION
Results from the present study demonstrated that carrier stimulation rate strongly affected CI subjects’ modulation sensitivity, especially at softer listening levels. This result is somewhat contradictory to the findings of Wojtczak and Viemeister (1999), who found that MDTs were similar for sinusoidal carriers with rates between 250 and 8000 Hz in NH listeners. Differences in experimental design and subject population between the present study and the Wojtczak and Viemeister study may account for several differences in results. In NH listeners, the location and rate of neural firing cannot be easily isolated. Thus, in the Wojtczak and Viemeister study, the 250 and 8000 Hz carriers would have stimulated different tonotopic locations; stimulation at either rate would have produced stochastic, primary neural response patterns at each of these locations. In CI patients, the effect of stimulation rate can be evaluated independent of the tonotopic location. Thus, in the present study, the two experimental carrier stimulation rates were evaluated for the same electrode location, allowing the temporal processing effects between the two rates to be compared for (presumably) the same neural population. Differences between the stimulation levels used in the two studies may have also contributed to differences in results; MDTs were most comparable between the carrier rates at the louder listening levels and most different at the softer levels. The stimulation rates used in the present study also produced very different DRs, which may have contributed to differences in MDTs at different loudness levels. Finally, different modulation frequencies were evaluated (4 Hz in the Wojtczak and Viemeister study, 20 Hz in the present study), which may have involved different degrees of temporal processing between the two studies.
The strong effect of stimulation rate on MDTs may be a result of the limits of temporal processing or because of the large differences in DR between the carrier rates. Neural firing has been shown to synchronize with rate cycles of up to ~1000 Hz in human electric hearing (Wilson et al. 1997a, b). At higher stimulation rates, low levels of neural noise due to refractoriness and/or discharge of neural membranes may introduce a jitter to the neural response patterns. Higher stimulation rates and/or conditioning pulse trains have been employed in contemporary speech processors with the aim of desynchronizing neural firing patterns and thereby restoring some of the spontaneous firing patterns observed in NH listeners (Wilson et al. 1997a, b; Rubinstein et al. 1999; Litvak et al. 2001). Although discharge rate patterns in auditory nerves may indicate some improvement in temporal coding in animals and models, such improved coding has not been explicitly and/or consistently shown to improve speech recognition in CI users. In studying the cortical physiology of the guinea pig, Middlebrooks (2004) measured the effects of stimulation rate on channel interaction and theorized that high rates might induce temporal integration due to depolarization of the cochlear neural membrane, which would contribute in part to the steeply declining thresholds observed with rates above 1000 pps. Such a depolarization might also influence temporal processing on a single electrode, in which temporal smearing, due to depolarization/polarization time constants, might weaken the salience of the amplitude envelope. Recently, Middlebrooks (2005) reported that MDTs in guinea pigs were poorer at higher carrier rates; in that study, MDTs were measured for incremental stimulation levels above threshold. It is possible that the relative loudness between the carrier rates was not comparable and that the differences in MDTs were a result of the differences in loudness. However, in the present study, MDTs were compared across carrier rates at loudness-balanced listening levels, producing similar results to Middlebrooks’ recent findings.
Changes in stimulation mode did not strongly
affect CI subjects’ modulation sensitivity. Although
the spread of excitation will certainly contribute
to loudness and, to some extent, place-pitch per-
276 GALVIN AND FU: Modulation Detection in CI Users
ception, loudness seems to contribute most strong-
ly to many psychophysical measures (e.g., intensity
discrimination, modulation detection, stimulation
rate discrimination, etc.). The effect of stimulation
mode on MDTs was evaluated in the present ex-
periment to test whether representation of the en-
velope would be enhanced by a larger spatial
spread of excitation or by the change/shift in
DR. Chatterjee (2003) showed that channel interac-
tion increased when synchronized envelopes were
presented to tonotopically remote electrodes. Modu-
lation detection interference also increases when
modulated maskers are presented to tonotopically
remote electrode locations (Chatterjee and Oba
2005). These data suggest that amplitude
envelope cues might be more salient when pre-
sented to a wide region of the cochlea (e.g., BP +
13 and monopolar stimulation modes used in the
present study) rather than a restricted region (e.g.,
BP + 3 in the present study). Morris and Pfingst (2000)
found that, when stimulation levels were equated for
loudness, stimulation mode did not affect stimulation
rate discrimination. Similarly, Franck et al. (2003)
found that speech recognition was not affected by
stimulation mode, but rather by stimulation level,
and that the spread of excitation (bipolar vs. mo-
nopolar stimulation) did not seem to influence CI
speech performance. Thus, it is not surprising that
stimulation mode, while influencing the spread of
excitation and thereby shifting the electrode DR,
did not affect CI subjects’ MDTs in the present
study.
Differences in stimulation level most strongly
affected CI subjects’ modulation sensitivity. Although
the stimulation mode did not affect MDTs and
stimulation rate contributed to a 20-dB difference
(at most) in MDTs, the difference in MDTs between
the loudest and softest stimulation levels was between
15 and 40 dB, across CI subjects. A significant in-
teraction was found between stimulation rate and
level, but not between stimulation level and mode.
Note that the DR differed significantly between the
stimulation rate conditions (but not between
stimulation mode conditions). This interaction
suggests that MDTs were affected differently by
stimulation rate at different loudness levels. The
interaction may have been a result of differences
in loudness growth between the two rates and/or
differences in intensity difference limens
(DLs) between rates at equally loud presentation
levels. The strong level dependence of modulation
sensitivity requires that other processor parameters
(e.g., stimulation rate, stimulation mode, and
modulation frequency) be evaluated over a wide
range of listening levels, as in the present study,
to best ascertain their effects.
Because stimulation rate most strongly affected
the DR, the DR might
be expected to be a limiting factor in CI patients’
modulation sensitivity. However, Figure 5 shows no
correlation between mean MDTs (averaged across
the reference listening levels) and DRs (measured
between the minimum and maximum reference levels).
The distribution of data shows a trend in which the
smaller DRs (due to the lower carrier rates) produced
lower MDTs. Although subject S3’s data
(diamond symbols) agree with this trend, S3’s DRs
were larger and MDTs lower than those of most other
subjects. It should be noted that S3 was implanted for
hearing loss resulting from trauma; all other subjects
were postlingually deafened with progressive hearing
losses. When S3’s data are excluded, the
correlation becomes much stronger (from $r^2 = 0.11$ to
$r^2 = 0.81$). The data suggest that gains in DR as a
result of higher stimulation rates are offset by
increased MDTs. A similar observation was made for
the relation between DR and intensity DLs (Kreft
et al. 2004). One implication of these results is
that gains in DR do not provide better amplitude
resolution and therefore do not provide for better
envelope processing.
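As a rough illustration of how excluding a single subject can change a correlation of the kind described above, the following Python sketch computes the squared Pearson correlation with and without one subject. The subject labels and the (DR, MDT) values are hypothetical placeholders, not the data plotted in Figure 5, and the helper names `r_squared` and `r2_excluding` are assumptions introduced here for illustration only.

```python
from math import fsum


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has zero variance")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical rows: (subject, DR in dB, mean MDT in dB). NOT data from the study.
rows = [
    ("A", 10.0, -30.0),
    ("A", 20.0, -22.0),
    ("B", 12.0, -28.0),
    ("B", 24.0, -18.0),
    ("X", 30.0, -35.0),  # hypothetical outlier: large DRs but low MDTs
    ("X", 40.0, -30.0),
]


def r2_excluding(rows, excluded=()):
    """r^2 between DR and MDT after dropping the listed subjects."""
    kept = [(dr, mdt) for subject, dr, mdt in rows if subject not in excluded]
    return r_squared([dr for dr, _ in kept], [mdt for _, mdt in kept])


r2_all = r2_excluding(rows)
r2_without_x = r2_excluding(rows, excluded={"X"})
print(f"r^2 (all subjects): {r2_all:.2f}")
print(f"r^2 (outlier excluded): {r2_without_x:.2f}")
```

With these made-up values the outlier subject follows the same within-subject trend as the others but sits apart from the group, so it masks the across-subject relationship until it is removed.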
The presumed advantages for high-rate stimulation
seem to be offset by increased channel interaction
(Middlebrooks 2004), poorer intensity resolution
(Kreft et al. 2004), and the poorer MDTs shown
in the present study. Although high stimulation
rates offer more flexibility in CI speech processing
schemes, improved temporal sampling and increased
DRs may not be the most effective use of the
technology. If more spectral channels can be added
(whether by increasing the number of implanted
electrodes or by improving the “effective” number of
spectral channels by other means), the increased
cumulative rate might be better distributed among
these additional channels, rather than increasing the
single-channel stimulation rate.
The strong carrier rate effects and level depen-
dence of modulation sensitivity suggest that weak
envelope cues are most affected by speech processor
parameters. Much speech perception, especially for
clear speech under quiet listening conditions, may
not require the full extent of CI patients’ modulation
sensitivity, especially if there is adequate spectral
resolution. However, for speech recognition tasks
requiring the processing of low-amplitude envelopes
(e.g., consonant recognition, speech in noise, etc.),
modulation sensitivity may play a greater role. The
mixed effects of high-rate processors on CI users’
speech recognition may be a result of variability in
the performance measures used in these studies. If
high-rate speech processors are thought to encode
and transmit more temporal information, they must
be tested using tasks that are sensitive to temporal
cues. Simple sentence recognition tests under quiet
listening conditions may not be sensitive to these
temporal cues. Thus, future work comparing the
effects of stimulation rate on CI patient performance
must include speech tests that are sensitive to the
temporal cues provided by each rate condition.
SUMMARY AND CONCLUSION
Modulation detection thresholds (MDTs) for a 20-Hz
sinusoidally amplitude modulated pulse train were
measured in six cochlear implant users, as functions
of the stimulation rate, mode, and level. Major
findings include:
1. MDTs were sensitive to stimulation rate, especially
at lower stimulation levels. For equally loud stim-
ulation levels, MDTs were generally better with
the 250 pps carrier than with the 2000 pps carrier.
2. MDTs were not sensitive to the stimulation mode.
For equally loud stimulation levels, there was no
significant difference between BP + 3, BP + 13,
and monopolar stimulation modes.
3. MDTs were most sensitive to stimulation level. In
most implant subjects, modulation thresholds
worsened as the stimulation level was decreased,
consistent with previous studies.
4. Modulation sensitivity seems to be related to
intensity resolution (which in turn is related to
the dynamic range). Because intensity resolution
is fairly constant across manipulations of factors
that influence dynamic range (e.g., stimulation
mode, rate) within a given CI patient, it is not
surprising that MDTs were no better with the high
stimulation rate (which produced a much wider
dynamic range than did the lower stimulation
rate). The fact that MDTs were significantly worse
with high stimulation rates may be because of
other temporal processing deficits associated with
high stimulation rates.
5. Although high stimulation rates may provide
better temporal sampling of the acoustic enve-
lope, CI listeners may be unable to access these
additional temporal cues and may ultimately
receive less temporal information than that pro-
vided by lower stimulation rates.
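The stimulus summarized above can be sketched as follows. This is a minimal illustration, assuming sinusoidal modulation of per-pulse amplitude about a fixed reference and the common convention of expressing modulation depth m in dB as 20 log10(m); the function names, the 300-ms duration, and the unit reference amplitude are hypothetical and are not taken from the study.

```python
import math


def sam_pulse_amplitudes(carrier_pps, mod_hz, depth, duration_s, ref_amp=1.0):
    """Per-pulse amplitudes of a sinusoidally amplitude-modulated pulse train.

    Pulse k occurs at t = k / carrier_pps and has amplitude
    ref_amp * (1 + depth * sin(2*pi*mod_hz*t)), with 0 <= depth <= 1.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must lie between 0 and 1")
    n_pulses = int(round(carrier_pps * duration_s))
    return [
        ref_amp * (1.0 + depth * math.sin(2.0 * math.pi * mod_hz * k / carrier_pps))
        for k in range(n_pulses)
    ]


def depth_to_db(depth):
    """Modulation depth in dB re 100% modulation (assumed convention)."""
    return 20.0 * math.log10(depth)


def db_to_depth(db):
    """Inverse of depth_to_db."""
    return 10.0 ** (db / 20.0)


# The two carrier rates and the 20-Hz modulation frequency come from the text;
# the 10% depth and 0.3-s duration are hypothetical.
low_rate = sam_pulse_amplitudes(250, 20.0, 0.10, 0.3)
high_rate = sam_pulse_amplitudes(2000, 20.0, 0.10, 0.3)
print(len(low_rate), len(high_rate))   # pulses per stimulus at each rate
print(depth_to_db(0.10))               # depth of 10% expressed in dB
```

Under this convention a more negative threshold in dB corresponds to a smaller detectable modulation depth, that is, to better modulation sensitivity. The higher carrier rate samples each 20-Hz modulation cycle with more pulses (100 versus 12.5 per cycle), which is the "better temporal sampling" referred to in the text.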
ACKNOWLEDGMENTS
The authors would like to thank all the CI patients who
graciously participated in these experiments. We would also
like to thank Dr. Monita Chatterjee for general guidance
and Dr. Chris Turner and another anonymous reviewer for
their insightful comments. This work was supported by
NIDCD R01-004993.
REFERENCES
BACON SP, VIEMEISTER NF. Temporal modulation transfer functions
in normal-hearing and hearing-impaired listeners. Audiology
24:117–134, 1985.
BLAMEY PJ, PYMAN BC, GORDON M, CLARK GM, BROWN AM, DOWELL
RC, HOLLOW RD. Factors predicting postoperative sentence
scores in postlinguistically deaf adult cochlear implant patients.
Ann. Otol. Rhinol. Laryngol. 101:342–348, 1992.
BRILL SM, GSTÖTTNER W, HELMS J, ILBERG CV, BAUMGARTNER W,
MÜLLER J, KIEFER J. Optimization of channel number and
stimulation rate for the fast continuous interleaved sam-
pling strategy in the COMBI 40+. Am. J. Otol. 18:S104–
S106, 1997.
BRILL SM, HOCHMAIR I, HOCHMAIR ES. The importance of stimulation
rate in pulsatile stimulation strategies in cochlear implants.
Presented at the XXIV International Congress of Audiology,
Buenos Aires, 1998a.
BRILL SM, SCHATZER R, NOPP P, HOCHMAIR I, HOCHMAIR ES. JCIS:
CIS with temporally jittering stimulation pulses: effect of jit-
tering amplitude and stimulation rate on speech understand-
ing. Presented at the 4th European Symposium on Paediatric
Cochlear Implantation, Hertogenbosch, The Netherlands,
1998b.
BURNS EM, VIEMEISTER NF. Played-again SAM: further observations
on the pitch of amplitude-modulated noise. J. Acoust. Soc.
Am. 70:1655–1660, 1981.
CAZALS Y, PELIZZONE M, KASPER A, MONTANDON P. Indication of a
relation between speech perception and temporal resolution
for cochlear implantees. Ann. Otol. Rhinol. Laryngol. 100:893–
895, 1991.
CAZALS Y, PELIZZONE M, SAUDAN O, BOEX C. Low-pass filtering in
amplitude modulation detection associated with vowel and
consonant identification in subjects with cochlear implants.
J. Acoust. Soc. Am. 96:2048–2054, 1994.
CHATTERJEE M. Modulation masking in cochlear implant listeners:
envelope versus tonotopic components. J. Acoust. Soc.
Am. 113:2042–2053, 2003.
CHATTERJEE M, OBA SI. Across- and within-channel envelope
interactions in cochlear implant listeners. J. Assoc. Res. Otolaryngol.
5:360–375, 2005.
DONALDSON GS, VIEMEISTER NF. Intensity discrimination and
detection of amplitude modulation in electric hearing. J. Acoust.
Soc. Am. 108:760–763, 2000.
DRULLMAN R, FESTEN JM, PLOMP R. Effect of temporal envelope
smearing on speech reception. J. Acoust. Soc. Am. 95:1053–
1064, 1994a.
DRULLMAN R, FESTEN JM, PLOMP R. Effect of reducing slow temporal
modulations on speech reception. J. Acoust. Soc. Am. 95:
2670–2680, 1994b.
FORMBY C. Modulation detection by patients with eighth-nerve
tumors. J. Speech Hear. Res. 29:413–419, 1986.
FORREST TG, GREEN DM. Detection of partially filled gaps in noise
and the temporal modulation transfer function. J. Acoust. Soc.
Am. 82:1933–1943, 1987.
FRANCK KH, XU L, PFINGST BE. Effects of stimulus level on speech
perception with cochlear prostheses. J. Assoc. Res. Otolaryngol.
4:49–59, 2003.
FRIESEN LM, SHANNON RV, CRUZ RJ. Effects of stimulation rate on
speech recognition with cochlear implants. Audiol. Neuro-otol.
10:169–184, 2005.
FU QJ. Temporal processing and speech recognition in cochlear
implant users. NeuroReport 13:1635–1640, 2002.
FU QJ, SHANNON RV. Effects of stimulation rate on phoneme
recognition in cochlear implant users. J. Acoust. Soc. Am.
107:589–597, 2000.
FU QJ, ZENG FG. Effects of envelope cues on Mandarin Chinese
tone recognition. Asia-Pac. J. Speech Lang. Hear. 5:45–57, 2000.
FU QJ, ZENG FG, SHANNON RV, SOLI SD. Importance of tonal
envelope cues in Chinese speech recognition. J. Acoust. Soc.
Am. 104:505–510, 1998.
FU QJ, CHINCHILLA S, GALVIN JJ. The role of spectral and temporal
cues in voice gender discrimination by normal-hearing listeners
and cochlear implant users. J. Assoc. Res. Otolaryngol. 254–260,
2004.
HOLDEN LK, SKINNER MW, HOLDEN TA, DEMOREST ME. Effects of
stimulation rate with the Nucleus 24 ACE speech coding
strategy. Ear Hear. 23:463–476, 2002.
JESTEADT W. An adaptive procedure for subjective judgments.
Percept. Psychophys. 28:85–88, 1980.
KREFT HA, DONALDSON GS, NELSON DA. Effects of pulse rate and
electrode array design on intensity discrimination in cochlear
implant users. J. Acoust. Soc. Am. 116:2258–2268, 2004.
LAWSON DT, WILSON BS, ZERBI M, FINLEY CC. Speech processors for
auditory prostheses. Third Quarterly Progress Report, NIH
Contract N01-DC-5-2103, 1996.
LITVAK L, DELGUTTE B, EDDINGTON D. Auditory nerve fiber responses
to electric stimulation: modulated and unmodulated pulse
trains. J. Acoust. Soc. Am. 110:368–379, 2001.
LOIZOU PC, POROY O, DORMAN MF. The effect of parametric
variations of cochlear implant processors on speech under-
standing. J. Acoust. Soc. Am. 108:790–802, 2000.
MIDDLEBROOKS JC. Effects of cochlear-implant pulse rate and
inter-channel timing on channel interactions and thresholds. J.
Acoust. Soc. Am. 116:452–468, 2004.
MIDDLEBROOKS JC. Transmission of temporal information from a
cochlear implant to the auditory cortex. Abstracts of Associa-
tion for Research in Otolaryngology 28th Midwinter Meeting,
February 2005, Volume 28, 91, 2005.
MORRIS DJ, PFINGST BE. Effects of electrode configuration and
stimulus level on rate and level discrimination with cochlear
implants. J. Assoc. Res. Otolaryngol. 1:211–223, 2000.
MUCHNIK C, TAITELBAUM R, TENE S, HILDESHEIMER M. Auditory
temporal resolution and open speech recognition in cochlear
implant recipients. Scand. Audiol. 23:105–109, 1993.
RUBINSTEIN JT, WILSON BS, FINLEY CC, ABBAS PJ. Pseudospontaneous
activity: stochastic independence of auditory nerve fibers with
electrical stimulation. Hear. Res. 127:108–118, 1999.
SHANNON RV. Temporal modulation transfer functions in patients
with cochlear implants. J. Acoust. Soc. Am. 91:2156–2164, 1992.
SHANNON RV, COLLETTI V. Evidence from auditory brainstem
implants of a modulation-specific auditory pathway that is critical
for speech recognition. Abstracts of Association for Research
in Otolaryngology 28th Midwinter Meeting, February 2005,
Volume 28, 183, 2005.
SHANNON RV, ZENG FG, WYGONSKI J, KAMATH V, EKELID M. Speech
recognition with primarily temporal cues. Science 270:303–304,
1995.
SKINNER MW. Optimizing cochlear implant speech performance.
Ann. Otol. Rhinol. Laryngol. Suppl. 191:4–13, 2003.
TURNER CW, SOUZA PE, FORGET LN. Use of temporal envelope cues
in speech recognition by normal and hearing-impaired listen-
ers. J. Acoust. Soc. Am. 97:2568–2576, 1995.
VAN TASELL DJ, SOLI SD, KIRBY VM, WIDIN GP. Speech waveform
envelope cues for consonant recognition. J. Acoust. Soc. Am.
82:1152–1161, 1987.
VAN TASELL DJ, GREENFIELD DG, LOGEMANN JJ, NELSON DA. Temporal
cues for consonant recognition: training, talker generalization,
and use in evaluation of cochlear implants. J. Acoust. Soc. Am.
92:1247–1257, 1992.
VANDALI AE, WHITFORD LA, PLANT KL, CLARK GM. Speech perception
as a function of electrical stimulation rate: using the Nucleus 24
cochlear implant system. Ear Hear. 21:608–624, 2000.
WILSON BS, FINLEY CC, LAWSON D, ZERBI M. Temporal
representations with cochlear implants. Am. J. Otol. 18:S30–S34, 1997a.
WILSON B, FINLEY C, ZERBI M, LAWSON D, VAN DEN HONERT C. Speech
processors for auditory prostheses. NIH Project N01-DC-5-2103,
Seventh Quarterly Progress Report, Neural Prosthesis Program,
National Institutes of Health, Bethesda, MD, 1997b.
WOJTCZAK M, VIEMEISTER NF. Intensity discrimination and detection
of amplitude modulation. J. Acoust. Soc. Am. 106:1917–1924,
1999.
WYGONSKI J, ROBERT M. HEI Nucleus Research Interface (HEINRI)
Specification. Internal Materials, 2002.
XU L, TSAI Y, PFINGST BE. Features of stimulation affecting tonal-
speech perception: implications for cochlear prostheses.
J. Acoust. Soc. Am. 112:247–258, 2002.
ZENG FG, TURNER CW. Binaural loudness matches in unilaterally
impaired listeners. Q. J. Exp. Psychol. 43:565–583, 1991.
... AM detection ability in CI users varies as a function of signal-related factors, including presentation level, electrical stimulation/carrier rate, and electrode location. Generally, AM detection ability tends to worsen with decreasing presentation level (Galvin & Fu, 2005;Pfingst et al., 2007), increasing carrier rate (Galvin & Fu, 2005Pfingst et al., 2007), and decreasing electrical dynamic ranges (DRs; Pfingst et al., 2007). The magnitude of these effects varies greatly between listeners, suggesting that listener-related factors including biological variables may impact performance. ...
... AM detection ability in CI users varies as a function of signal-related factors, including presentation level, electrical stimulation/carrier rate, and electrode location. Generally, AM detection ability tends to worsen with decreasing presentation level (Galvin & Fu, 2005;Pfingst et al., 2007), increasing carrier rate (Galvin & Fu, 2005Pfingst et al., 2007), and decreasing electrical dynamic ranges (DRs; Pfingst et al., 2007). The magnitude of these effects varies greatly between listeners, suggesting that listener-related factors including biological variables may impact performance. ...
... Although AM detection has been studied extensively in CI users (e.g., Chatterjee & Oba, 2005;Chatterjee & Oberzut, 2011;Chatterjee & Robert, 2001;Fraser & McKay, 2012;Galvin & Fu, 2005) and has been shown to correlate with speech recognition ability (Cazals et al., 1994;Fu, 2002;Garadat et al., 2012), very little research has focused on the impact of listener-related factors on AM detection in CI users. Previous psychophysical studies with CIs users also tend to have relatively small sample sizes and limited consideration of participants' chronological age, age at onset of hearing loss, and estimation of the degree of neural survival. ...
Article
Full-text available
Although cochlear implants (CIs) are a viable treatment option for severe hearing loss in adults of any age, older adults may be at a disadvantage compared with younger adults. CIs deliver signals that contain limited spectral information, requiring CI users to attend to the temporal information within the signal to recognize speech. Older adults are susceptible to acquiring auditory temporal processing deficits, presenting a potential age-related limitation for recognizing speech signals delivered by CIs. The goal of this study was to measure auditory temporal processing ability via amplitude-modulation (AM) detection as a function of age in CI users. The contribution of the electrode-to-neural interface, in addition to age, was estimated using electrically evoked compound action potential (ECAP) amplitude growth functions. Within each participant, two electrodes were selected: one with the steepest ECAP slope and one with the shallowest ECAP slope, in order to represent electrodes with varied estimates of the electrode-to-neural interface. Single-electrode AM detection thresholds were measured using direct stimulation at these two electrode locations. Results revealed that AM detection ability significantly declined as a function of chronological age. ECAP slope did not significantly impact AM detection, but ECAP slope decreased (became shallower) with increasing age, suggesting that factors influencing the electrode-to-neural interface change with age. Results demonstrated a significant negative impact of chronological age on auditory temporal processing. The locus of the age-related limitation (peripheral vs. central origin), however, is difficult to evaluate because the peripheral influence (ECAPs) was correlated with the central factor (age).
... A number of such factors have been identified. Modulation sensitivity improves with stimulation level within the dynamic range [7,14,15], which motivated the site-rehabilitation strategies where T levels were adjusted [13]. Modulation sensitivity also improves by lowering the carrier rate. ...
... Modulation sensitivity also improves by lowering the carrier rate. The effect has been consistently shown in a handful of studies [14][15][16][17][18]. The benefit of low-rate stimulation was attributed partly to the fast loudness growth within a smaller dynamic range. ...
Article
Full-text available
Temporal modulation sensitivity has been studied extensively for Cochlear Implant (CI) users due to its strong correlation to speech recognition outcomes. Previous studies reported that temporal modulation detection thresholds (MDTs) vary across the tonotopic axis and attributed this variation to patchy neural survival. However, correlates of neural health identified in animal models depend on electrode position in humans. Nonetheless, the relationship between MDT and electrode location has not been explored. We tested 13 ears for the effect of distance on modulation sensitivity, specifically targeting the question of whether electrodes closer to the modiolus are universally beneficial. Participants in this study were postlingually deafened and users of Cochlear Nucleus CIs. The distance of each electrode from the medial wall (MW) of the cochlea and mid-modiolar axis (MMA) was measured from scans obtained using computerized tomography (CT) imaging. The distance measures were correlated with slopes of spatial tuning curves measured on selected electrodes to investigate if electrode position accounts, at least in part, for the width of neural excitation. In accordance with previous findings, electrode position explained 24% of the variance in slopes of the spatial tuning curves. All functioning electrodes were also measured for MDTs. Five ears showed a positive correlation between MDTs and at least one distance measure across the array; 6 ears showed negative correlations and the remaining two ears showed no relationship. The ears showing positive MDT-distance correlations, thus benefiting from electrodes being close to the neural elements, were those who performed better on the two speech recognition measures, i.e., speech reception thresholds (SRTs) and recognition of the AzBio sentences. These results could suggest that ears able to take advantage of the proximal placement of electrodes are likely to have better speech recognition outcomes. 
Previous histological studies of humans demonstrated that speech recognition is correlated with spiral ganglion cell counts. Alternatively, ears with good speech recognition outcomes may have good overall neural health, which is a precondition for close electrodes to produce spatially confined neural excitation patterns that facilitate modulation sensitivity. These findings suggest that the methods to reduce channel interaction, e.g., perimodiolar electrode array or current focusing, may only be beneficial for a subgroup of CI users. Additionally, it suggests that estimating neural survival preoperatively is important for choosing the most appropriate electrode array type (perimodiolar vs. lateral wall) for optimal implant function.
... A 3D volume conduction model and an active ne model were used, and it was concluded that the differences between the spatial e models of the various multiples cannot be simulated in a model containing linearly neurons with the same morphology at equal volumes [102]. Similarly, another s ported narrower spatial activity for focused stimulation with the bipolar or tripo than the monopolar stimulus [103]. The success of the CI electrodes in stimulatio can be determined by tonotopic activity. ...
... A 3D volume conduction model and an active nerve fiber model were used, and it was concluded that the differences between the spatial excitation models of the various multiples cannot be simulated in a model containing linearly aligned neurons with the same morphology at equal volumes [102]. Similarly, another study reported narrower spatial activity for focused stimulation with the bipolar or tripolar mode than the monopolar stimulus [103]. The success of the CI electrodes in stimulation modes can be determined by tonotopic activity. ...
Article
Full-text available
Cochlear implants are neural implant devices that aim to restore hearing in patients with severe sensorineural hearing impairment. Here, the main goal is to successfully place the electrode array in the cochlea to stimulate the auditory nerves through bypassing damaged hair cells. Several electrode and electrode array parameters affect the success of this technique, but, undoubtedly, the most important one is related to electrodes, which are used for nerve stimulation. In this paper, we provide a comprehensive resource on the electrodes currently being used in cochlear implant devices. Electrode materials, shape, and the effect of spacing between electrodes on the stimulation, stiffness, and flexibility of electrode-carrying arrays are discussed. The use of sensors and the electrical, mechanical, and electrochemical properties of electrode arrays are examined. A large library of preferred electrodes is reviewed, and recent progress in electrode design parameters is analyzed. Finally, the limitations and challenges of the current technology are discussed along with a proposal of future directions in the field.
... Specifically for cochlear implant (CI) users, high speech recognition scores can be achieved in quiet conditions; however, a large decline in performance occurs in the presence of noise, reverberation, and/or multiple speakers (Blamey et al., 1984;Fredelake and Hohmann, 2012;Hazrati and Loizou, 2012b;Payton et al., 1994). This reduction of performance can be attributed to a number of different factors related to CI signal processing such as poor spectral resolution and the prevalence of envelope-based strategies (Croghan et al., 2017;Dorman et al., 1997;Galvin and Fu, 2005;McKay et al., 1992;McKay and McDermott, 1999;Rubinstein, 2004;Wilson et al., 1991;Wilson and Dorman, 2008;Xu and Zheng, 2007;Zeng et al., 2008). Traditional pre-processing techniques like noise suppression and speech enhancement have been utilized to improve recognition (Bhattacharya and Zeng, 2007;Loizou, 2012a, 2013;Lyzenga et al., 2002;Nogueira et al., 2016;Zheng et al., 2017) but are considerably different than a solution where the speech signal is boosted above the noise floor. ...
... the speech-in-noise task. It has been previously shown that CI listeners are able to recover speech intelligibility in noise for spectrally degraded speech when temporal cues are intact, or if the stimulation rate is sufficient enough to illicit higher spectrotemporal modulation depths (Fu, 2002;Fu and Nogaki, 2005;Galvin et al., 2014;Galvin and Fu, 2005;Shannon et al., 1995;Won et al., 2015;Zheng et al., 2017). This could suggest that CI users may utilize different perceptual cues that are not as salient for NH listeners. ...
Article
Natural compensation of speech production in challenging listening environments is referred to as the Lombard effect (LE). The resulting acoustic differences between neutral and Lombard speech have been shown to provide intelligibility benefits for normal hearing (NH) and cochlear implant (CI) listeners alike. Motivated by this outcome, three LE perturbation approaches consisting of pitch, duration, formant, intensity, and spectral contour modifications were designed specifically for CI listeners to combat speech-in-noise performance deficits. Experiment 1 analyzed the effects of loudness, quality, and distortion of approaches on speech intelligibility with and without formantshifting. Significant improvements of þ9.4% were observed in CI listeners without the formant-shifting approach at þ5 dB signal-to-noise ratio (SNR) large-crowd-noise (LCN) when loudness was controlled, however, performance was found to be significantly lower for NH listeners. Experiment 2 evaluated the non-formant-shifting approach with additional spectral contour and high pass filtering to reduce spectral smearing and decrease distortion observed in Experiment 1. This resulted in significant intelligibility benefits of þ30.2% for NH and þ21.2% for CI listeners at 0 and þ5 dB SNR LCN, respectively. These results suggest that LE perturbation may be useful as front-end speech modification approaches to improve intelligibility for CI users in noise. VC 2022 Acoustical Society of America. https://doi.org/10.1121/10.0009377
... ) as in Racey and Dillon (2017). The current results also show no correlation for T-levels and a highly significant correlation for M-levels [r = 0.84, (Galvin and Fu, 2005) of the stimuli used here and in the clinical fitting, as well as to potential variations of T-level in time (Theelen-van den Hoek et al., 2014). However, a significant correlation between the T-levels in the pretests and the hearing thresholds extracted from the loudness functions was found (r = 0.75). ...
Article
Full-text available
Recent studies on loudness perception of binaural broadband signals in hearing impaired listeners found large individual differences, suggesting the use of such signals in hearing aid fitting. Likewise, clinical cochlear implant (CI) fitting with narrowband/single-electrode signals might cause suboptimal loudness perception in bilateral and bimodal CI listeners. Here spectral and binaural loudness summation in normal hearing (NH) listeners, bilateral CI (biCI) users, and unilateral CI (uCI) users with normal hearing in the unaided ear was investigated to assess the relevance of binaural/bilateral fitting in CI users. To compare the three groups, categorical loudness scaling was performed for an equal categorical loudness noise (ECLN) consisting of the sum of six spectrally separated third-octave noises at equal loudness. The acoustical ECLN procedure was adapted to an equivalent procedure in the electrical domain using direct stimulation. To ensure the same broadband loudness in binaural measurements with simultaneous electrical and acoustical stimulation, a modified binaural ECLN was introduced and cross validated with self-adjusted loudness in a loudness balancing experiment. Results showed a higher (spectral) loudness summation of the six equally loud narrowband signals in the ECLN in CI compared to NH. Binaural loudness summation was found for all three listener groups (NH, uCI, and biCI). No increased binaural loudness summation could be found for the current uCI and biCI listeners compared to the NH group. In uCI loudness balancing between narrowband signals and single electrodes did not automatically result in a balanced loudness perception across ears, emphasizing the importance of binaural/bilateral fitting.
Article
Objective: To evaluate whether a 500 pulses per second per channel (pps/ch) rate would provide non-inferior hearing performance compared to the 900 pps/ch rate in the Advanced Combination Encoder (ACE™) sound coding strategy. Design: A repeated measures single-subject design was employed, wherein each subject served as their own control. All except one subject used 900 pps/ch at enrolment. After three weeks of using the alternative rate program, both programs were loaded into the sound processor for two more weeks of take-home use. Subjective performance, preference, words in quiet, sentences in babble, music quality, and fundamental frequency (F0) discrimination were assessed using a balanced design. Study sample: Data from 18 subjects were analysed, with complete datasets available for 17 subjects. Results: Non-inferior performance on all clinical measures was shown for the lower rate program. Subjects' preference ratings were comparable for the programs, with 53% reporting no difference overall. When a preference was expressed, the 900 pps/ch condition was preferred more often. Conclusion: Reducing the stimulation rate from 900 pps/ch to 500 pps/ch did not compromise the hearing outcomes evaluated in this study. A lower pulse rate in future cochlear implants could reduce power consumption, allowing for smaller batteries and processors.
Article
The auditory brainstem implant (ABI) is an auditory neuroprosthesis that provides hearing by electrically stimulating the cochlear nucleus (CN) of the brainstem. Our previous study (McInturff et al., 2022) showed that single-pulse stimulation of the dorsal (D)CN subdivision with low levels of current evokes responses that have early latencies, different than the late response patterns observed from stimulation of the ventral (V)CN. How these differing responses encode more complex stimuli, such as pulse trains and amplitude modulated (AM) pulses, has not been explored. Here, we compare responses to pulse train stimulation of the DCN and VCN, and show that VCN responses, measured in the inferior colliculus (IC), have less adaption, higher synchrony, and higher cross-correlation. However, with high-level DCN stimulation, responses become like those to VCN stimulation, supporting our earlier hypothesis that current spreads from electrodes on the DCN to excite neurons located in the VCN. To AM pulses, stimulation of the VCN elicits responses with larger vector strengths and gain values especially in the high-CF portion of the IC. Additional analysis using neural measures of modulation thresholds indicate that these measures are lowest for VCN. Human ABI users with low modulation thresholds, who score best on comprehension tests, may thus have electrode arrays that stimulate the VCN. Overall, the results show that the VCN has superior response characteristics and suggest that it should be the preferred target for ABI electrode arrays in humans.
Article
Computational models are useful tools to investigate scientific questions that would be complicated to address using an experimental approach. In the context of cochlear implants (CIs), being able to simulate the neural activity evoked by these devices could help in understanding their limitations in providing natural hearing. Here, we present a computational modelling framework to quantify the transmission of information from sound to spikes in the auditory nerve of a CI user. The framework includes a model to simulate the electrical current waveform sensed by each auditory nerve fiber (electrode-neuron interface), followed by a model to simulate the timing at which a nerve fiber spikes in response to a current waveform (auditory nerve fiber model). Information theory is then applied to determine the amount of information transmitted from a suitable reference signal (e.g., the acoustic stimulus) to a simulated population of auditory nerve fibers. As a use case example, the framework is applied to simulate published data on modulation detection by CI users obtained using direct stimulation via a single electrode. Current spread as well as the number of fibers were varied independently to illustrate the framework capabilities. Simulations reasonably matched experimental data and suggested that the encoded modulation information is proportional to the total neural response. They also suggested that amplitude modulation is well encoded in the auditory nerve for modulation rates up to 1000 Hz and that the variability in modulation sensitivity across CI users is partly because different CI users use different references for detecting modulation.
Article
Objectives: Postimplantation facial nerve stimulation is a common side-effect of intracochlear electrical stimulation. Facial nerve stimulation occurs when electric current intended to stimulate the auditory nerve spreads beyond the cochlea to excite the nearby facial nerve, causing involuntary facial muscle contractions. Facial nerve stimulation can often be resolved through adjustments in speech processor fitting but, in some instances, these measures exhibit limited benefit or may have a detrimental effect on speech perception. In this study, apical reference stimulation mode was investigated as a potential intervention to facial nerve stimulation. Apical reference stimulation is a bipolar stimulation strategy in which the most apical electrode is used as the reference electrode for stimulation on all the other intracochlear electrodes. Design: A person-specific model of the human cochlea, facial nerve and electrode array, coupled with a neural model, was used to predict excitation of auditory and facial nerve fibers. These predictions were used to evaluate the effectiveness in reducing facial nerve stimulation using apical reference stimulation. Predictions were confirmed in psychoacoustic tests by determining auditory comfort and threshold levels for the apical reference stimulation mode while capturing electromyography data in two participants. Results: Models predicted a favorable outcome for apical reference stimulation, as facial nerve fiber thresholds were higher and auditory thresholds were lower, in direct comparison to conventional monopolar stimulation. Psychophysical tests also illustrated decreased auditory thresholds and increased dynamic range during apical reference stimulation. Furthermore, apical reference stimulation resulted in lower electromyography energy levels, compared to conventional monopolar stimulation, which suggests a reduction in facial nerve stimulation. Subjective feedback corroborated that apical reference stimulation alleviated facial nerve stimulation. Conclusion: Apical reference stimulation may be a viable strategy to alleviate facial nerve stimulation considering the improvements in dynamic range and auditory thresholds, complemented with a reduction in facial nerve stimulation symptoms.
Article
The ability to process rapid modulations in the spectro-temporal structure of sounds is critical for speech comprehension. For users of cochlear implants (CIs), spectral cues in speech are conveyed by differential stimulation of electrode contacts along the cochlea, and temporal cues in terms of the amplitude of stimulating electrical pulses, which track the amplitude-modulated (AM'ed) envelope of speech sounds. Whilst survival of inner-ear neurons and spread of electrical current are known factors that limit the representation of speech information in CI listeners, limitations in the neural representation of dynamic spectro-temporal cues common to speech are also likely to play a role. We assessed the ability of CI listeners to process spectro-temporal cues varying at rates typically present in human speech. Employing an auditory change complex (ACC) paradigm, and a slow (0.5 Hz) alternating rate between stimulating electrodes, or different AM frequencies, to evoke a transient cortical ACC, we demonstrate that CI listeners, like normal-hearing listeners, are sensitive to transitions in the spectral and temporal domains. However, CI listeners showed impaired cortical responses when either spectral or temporal cues were alternated at faster, speech-like (6-7 Hz) rates. Specifically, auditory change following responses, which were reliably obtained in normal-hearing listeners, were small or absent in CI users, indicating that cortical adaptation to alternating cues at speech-like rates is stronger under electrical stimulation. In CI listeners, temporal processing was also influenced by the polarity (behaviorally) and the rate of presentation (both neurally and behaviorally) of electrical pulses. Limitations in the ability to process dynamic spectro-temporal cues will likely impact speech comprehension in CI users.
Article
For tonal languages such as Mandarin Chinese, tone recognition is important for understanding the meaning of words, phrases or sentences. While fundamental frequency carries the most distinctive information for tone recognition, waveform temporal envelope cues can also produce a high level of tone recognition. This study attempts to identify what types of temporal envelope cues contribute to tone recognition and whether these temporal envelope cues are dependent on speakers and vowel contexts. Several signal-correlated-noise stimuli were generated to separate the contribution of three major temporal envelope cues – duration, amplitude contour, and periodicity – to tone recognition. Perceptual results show that the duration cue contributed mostly to discrimination of Tone-3, the amplitude cue contributed mostly to Tone-3 and Tone-4 discrimination, and the periodicity cue contributed to recognition of all tones. However, tone recognition based on temporal envelope cues was highly variable across speakers and vowel contexts. Acoustic analysis of these temporal envelope cues revealed that this variability in tone recognition is directly related to the acoustic variability between the amplitude contour and fundamental frequency contour.
Article
This study investigated the effect of pulsatile stimulation rate on medial vowel and consonant recognition in cochlear implant listeners. Experiment 1 measured phoneme recognition as a function of stimulation rate in six Nucleus-22 cochlear implant listeners using an experimental four-channel continuous interleaved sampler (CIS) speech processing strategy. Results showed that all stimulation rates from 150 to 500 pulses/s/electrode produced equally good performance, while stimulation rates lower than 150 pulses/s/electrode produced significantly poorer performance. Experiment 2 measured phoneme recognition by implant listeners and normal-hearing listeners as a function of the low-pass cutoff frequency for envelope information. Results from both acoustic and electric hearing showed no significant difference in performance for all cutoff frequencies higher than 20 Hz. Both vowel and consonant scores dropped significantly when the cutoff frequency was reduced from 20 Hz to 2 Hz. The results of these two experiments suggest that temporal envelope information can be conveyed by relatively low stimulation rates. The pattern of results for both electrical and acoustic hearing is consistent with a simple model of temporal integration with an equivalent rectangular duration (ERD) of the temporal integrator of about 7 ms.
Conference Paper
Results of studies performed in our laboratory suggest that cochlear implant recipients understand speech best if the following speech processor parameters are individually chosen for each person: minimum and maximum stimulation levels on each electrode in the speech processor program (MAP), stimulation rate, and speech coding strategy. If these and related parameters are chosen to make soft sounds (from approximately 100 to 6,000 Hz) audible at as close to 20 dB hearing level as possible and loud sounds not too loud, recipients have the opportunity to hear speech in everyday life situations that are of key importance to children who are learning language and to all recipients in terms of ease of communication.
Article
This study investigated the cues for consonant recognition that are available in the time-intensity envelope of speech. Twelve normal-hearing subjects listened to three sets of spectrally identical noise stimuli created by multiplying noise with the speech envelopes of 19 /aCa/ natural-speech nonsense syllables. The speech envelope for each of the three noise conditions was derived using a different low-pass filter cutoff (20, 200, and 2000 Hz). Average consonant identification performance was above chance for the three noise conditions and improved significantly with the increase in envelope bandwidth from 20 to 200 Hz. SINDSCAL multidimensional scaling analysis of the consonant confusions data identified three speech envelope features that divided the 19 consonants into four envelope feature groups ("envemes"). The enveme groups in combination with visually distinctive speech feature groupings ("visemes") can distinguish most of the 19 consonants. These results suggest that near-perfect consonant identification performance could be attained by subjects who receive only enveme and viseme information and no spectral information.
Article
The pitchlike sensation elicited by sinusoidally amplitude-modulated (SAM) noise remains a controversial phenomenon. The controversy centers on two major points: (1) whether this sensation is "really" pitch rather than, e.g., roughness or intermittency, and (2) the possibility that any pitch sensation is mediated by short-term spectral information rather than temporal information, thus nullifying an interesting aspect of the phenomenon. Three experiments employing SAM wideband noise, SAM wideband noise bandpass-filtered after modulation, and a SAM 10 kHz pure tone were performed: (1) open-set melody identification, (2) melodic dictation, and (3) musical-interval adjustment. These experiments extend our earlier study [Burns and Viemeister, J. Acoust. Soc. Am. 60, 863 (1976)]. The results provide further evidence that SAM noise can, at suitable modulation frequencies, elicit a sensation of pitch (as defined by the ability to carry melodic information), and that this pitch represents a purely temporal phenomenon.
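As a rough illustration of the stimulus type named in this abstract, the sketch below generates sinusoidally amplitude-modulated Gaussian noise and converts a modulation index to decibels relative to 100% modulation, a convention often used when reporting modulation detection thresholds. The function names, sampling rate, and parameter values are assumptions chosen for illustration and do not come from the cited study.

```python
import math
import random


def sam_envelope(t, mod_freq_hz, mod_index):
    """Envelope 1 + m*sin(2*pi*fm*t) of a sinusoidally amplitude-modulated signal."""
    return 1.0 + mod_index * math.sin(2.0 * math.pi * mod_freq_hz * t)


def modulation_depth_db(mod_index):
    """Modulation depth in dB re 100% modulation: 20*log10(m)."""
    if mod_index <= 0.0:
        raise ValueError("mod_index must be positive")
    return 20.0 * math.log10(mod_index)


def sam_noise(duration_s, sample_rate_hz, mod_freq_hz, mod_index, seed=0):
    """Gaussian noise carrier multiplied by a sinusoidal envelope (illustrative only)."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        carrier = rng.gauss(0.0, 1.0)
        samples.append(carrier * sam_envelope(t, mod_freq_hz, mod_index))
    return samples


if __name__ == "__main__":
    # Hypothetical parameters: 50 ms of noise, 100 Hz modulation, 50% depth.
    stimulus = sam_noise(0.05, 16000, 100.0, 0.5)
    print(len(stimulus), "samples;", round(modulation_depth_db(0.5), 2), "dB re 100%")
```

A smaller (more negative) value from `modulation_depth_db` corresponds to a shallower modulation, so lower thresholds in dB indicate better modulation sensitivity.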
Article
This study investigated the interactive effect of both electrode configurations and frequency regions assigned to electrodes on speech recognition in three subjects implanted with the Nucleus-22 multichannel cochlear implant. A 4-channel processor with a continuous interleaved sampling speech processing strategy was implemented through a custom interface. The temporal envelopes from four broad frequency bands were used to modulate 500 pps, 100 µs/phase pulse trains, which were then delivered to four electrode pairs. Ten different frequency allocations and three sets of 4-electrode configurations were used. Each frequency allocation represented different insertion depths of the four electrodes based on Greenwood's place-to-frequency function. Preliminary results showed that the vowel score was highly dependent on the frequency allocations in all electrode configurations. Subjects with different insertion depths had similar recognition patterns as a function of the frequency allocations, indicating a possible accommodation to the pattern of speech information presented through their normal 20-electrode SPEAK processor. A similar recognition pattern as a function of the frequency allocations was observed within subjects for the electrode configurations which had the same return electrode, indicating that pattern recognition of speech sounds in electric hearing depends primarily on the electrically evoked discharge patterns at the return electrode.
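The place-to-frequency mapping mentioned in this abstract can be sketched as follows. The constants are the commonly cited human values from Greenwood (1990), with position expressed as a fraction of cochlear length measured from the apex; the abstract itself does not state which constants or electrode positions were used, so the function names and example positions are assumptions for illustration only.

```python
import math

# Commonly cited human constants for Greenwood's function (assumed here,
# not taken from the abstract): F = A * (10**(a*x) - k)
A_HZ = 165.4
SLOPE = 2.1
K = 0.88


def place_to_frequency(x):
    """Characteristic frequency in Hz at fractional distance x from the apex (0 to 1)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 (apex) and 1 (base)")
    return A_HZ * (10.0 ** (SLOPE * x) - K)


def frequency_to_place(freq_hz):
    """Inverse mapping: fractional distance from the apex for a frequency in Hz."""
    return math.log10(freq_hz / A_HZ + K) / SLOPE


if __name__ == "__main__":
    # Hypothetical electrode positions, expressed as fractions of cochlear length.
    for x in (0.25, 0.5, 0.75):
        print(f"x = {x:.2f} -> {place_to_frequency(x):.0f} Hz")
```

Shifting all positions toward the base (larger `x`) models a shallower insertion and raises every assigned frequency, which is the kind of allocation change the abstract describes.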
Article
Limited consonant phonemic information can be conveyed by the temporal characteristics of speech. In the two experiments reported here, the effects of practice and of multiple talkers on identification of temporal consonant information were evaluated. Naturally produced /aCa/ disyllables were used to create "temporal-only" stimuli having instantaneous amplitudes identical to the natural speech stimuli, but flat spectra. Practice improved normal-hearing subjects' identification of temporal-only stimuli from a single talker over that reported earlier for a different group of unpracticed subjects [J. Acoust. Soc. Am. 82, 1152-1161 (1987)]. When the number of talkers was increased to six, however, performance was poorer than that observed for one talker, demonstrating that subjects had been able to learn the individual stimulus items derived from the speech of the single talker. Even after practice, subjects varied greatly in their abilities to extract temporal information related to consonant voicing and manner. Identification of consonant place was uniformly poor in the multiple-talker situation, indicating that for these stimuli consonant place is cued via spectral information. Comparison of consonant identification by users of multi-channel cochlear implants showed that the implant users' identification of temporal consonant information was largely within the range predicted from the normal data. In the instances where the implant users were performing especially well, they were identifying consonant place information at levels well beyond those predicted by the normal-subject data. Comparison of implant-user performance with the temporal-only data reported here can help determine whether the speech information available to the implant user consists of entirely temporal cues, or is augmented by spectral cues.
Article
A sample of 64 postlinguistically profoundly to totally deaf adult cochlear implant patients were tested without lipreading by means of the Central Institute for the Deaf (CID) sentence test 3 months postoperatively. Preoperative promontory stimulation results (thresholds, gap detection, and frequency discrimination), age, duration of profound deafness, cause of deafness, lipreading ability, postoperative intracochlear thresholds and dynamic ranges for electrical stimulation, depth of insertion of the electrode array into the scala tympani, and number of electrodes in use were considered as possible factors that might be related to the postoperative sentence scores. A multiple regression analysis with stepwise inclusion of independent variables indicated that good gap detection and frequency discrimination during preoperative promontory testing, larger numbers of electrodes in use, and greater dynamic ranges for intracochlear electrical stimulation were associated with better CID scores. The CID scores tended to decrease with longer periods of profound deafness.