Selective and divided attention modulates auditory–vocal integration in the processing of pitch feedback errors

Ying Liu,1,* Huijing Hu,1,2,* Jeffery A. Jones,3 Zhiqiang Guo,4 Weifeng Li,1 Xi Chen,1 Peng Liu1 and Hanjun Liu1,4

1 Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510080, China
2 Guangdong Provincial Work Injury Rehabilitation Center, Guangzhou, China
3 Psychology Department and Laurier Centre for Cognitive Neuroscience, Wilfrid Laurier University, Waterloo, ON, Canada
4 Department of Biomedical Engineering, School of Engineering, Sun Yat-sen University, Guangzhou, China

Correspondence: Hanjun Liu, Department of Rehabilitation Medicine, as above. E-mail: lhanjun@mail.sysu.edu.cn
* These two authors contributed equally to this study.
Received 28 November 2014, revised 28 April 2015, accepted 11 May 2015

European Journal of Neuroscience, pp. 1–10, 2015. doi:10.1111/ejn.12949

Keywords: auditory feedback, auditory–motor integration, divided attention, selective attention, speech motor control
Abstract
Speakers rapidly adjust their ongoing vocal productions to compensate for errors they hear in their auditory feedback. It is currently unclear what role attention plays in these vocal compensations. This event-related potential (ERP) study examined the influence of selective and divided attention on the vocal and cortical responses to pitch errors heard in auditory feedback regarding ongoing vocalisations. During the production of a sustained vowel, participants briefly heard their vocal pitch shifted up two semitones while they actively attended to auditory or visual events (selective attention), or both auditory and visual events (divided attention), or were not told to attend to either modality (control condition). The behavioral results showed that attending to the pitch perturbations elicited larger vocal compensations than attending to the visual stimuli. Moreover, ERPs were likewise sensitive to the attentional manipulations: P2 responses to pitch perturbations were larger when participants attended to the auditory stimuli compared to when they attended to the visual stimuli, and compared to when they were not explicitly told to attend to either the visual or auditory stimuli. By contrast, dividing attention between the auditory and visual modalities caused suppressed P2 responses relative to all the other conditions and caused enhanced N1 responses relative to the control condition. These findings provide strong evidence for the influence of attention on the mechanisms underlying the auditory–vocal integration in the processing of pitch feedback errors. In addition, selective attention and divided attention appear to modulate the neurobehavioral processing of pitch feedback errors in different ways.
Introduction
Auditory feedback plays an important role in the acquisition and maintenance of speech motor skills (Houde & Jordan, 1998; Hickok et al., 2011). By exposing speakers to altered auditory feedback (AAF) regarding their ongoing speech production, researchers have demonstrated that speakers adjust their vocal output to compensate for feedback perturbations in fundamental frequency (F0), vocal intensity and formant frequency (Jones & Munhall, 2005; Bauer et al., 2006; Liu et al., 2010; Cai et al., 2011; Scheerer et al., 2013). Researchers have also shown that activity in the auditory cortex is suppressed during active vocalisation relative to passive listening when the auditory feedback matches the actual vocal output (Houde et al., 2002; Heinks-Maldonado et al., 2005; Flinker et al., 2010; Behroozmand & Larson, 2011; Wang et al., 2014). By contrast, when there is a mismatch between auditory feedback and vocal output, vocalisation elicits larger neural responses than during passive listening (Eliades & Wang, 2008; Behroozmand et al., 2009; Chang et al., 2013; Chen et al., 2013). This enhanced cortical activity is thought to reflect a mechanism that serves to compensate for errors perceived during vocalisation (Chang et al., 2013).
Although recent research has greatly advanced our understanding of the integration of sensory and motor information during speech, we know little about the role, if any, that attention plays. Outside of the laboratory environment, speakers are typically faced with the task of simultaneously processing auditory feedback in conjunction with other sensory information (e.g. visual, somatosensory, etc.). If auditory feedback processing requires some degree of attention, and given that attention is considered a limited resource (Näätänen, 1992), then the processing of perceived motor errors may be modulated by the attentional load incurred by processing multimodal stimuli.
The cognitive and neural processes that underlie attention have been the subject of intense study for many years. For example, studies of unimodal selective attention have demonstrated enhanced brain activity in the auditory cortex in response to attended vs. unattended stimuli (Woldorff et al., 1993; Pugh et al., 1996; Ahveninen et al., 2011). Studies of bimodal (e.g. auditory and visual) selective attention have likewise shown more brain activity in the auditory
cortex when participants attend to auditory stimuli while ignoring visual stimuli, and more activity in the visual cortex when participants attend to visual stimuli while ignoring auditory stimuli (Woodruff et al., 1996; Johnson & Zatorre, 2005, 2006). In addition to this enhancement in the attended modality, cross-modal inhibition has been demonstrated in the form of decreases in activity observed in the "ignored" sensory cortex (Shomstein & Yantis, 2004; Johnson & Zatorre, 2005, 2006).
Although attention is typically manifest as a selective focus on the processing of one aspect of the environment, attention can also be divided when we must perform two (or more) tasks simultaneously. Not surprisingly, the complexity of dividing attention across stimuli, or between modalities, involves activity in additional regions of the brain. For example, in a study of divided visual attention, discrimination between the shape, color and speed of a visual stimulus activated the anterior cingulate and the prefrontal cortex (PFC) in the right hemisphere (Corbetta et al., 1991). In addition, when attention has to be divided between auditory and visual stimuli, researchers have found significant decreases in the level of activity in both the auditory and visual cortices compared to when attention is focused on either modality alone (Klingberg, 1998; Loose et al., 2003; Johnson & Zatorre, 2006). Event-related potential (ERP) studies have similarly shown that divided attention causes processing delays, which are reflected as increased latencies of the P300 component (Hohnsbein et al., 1991).
The research on cross-modal attention thus indicates that attention modulates activity in the sensory cortices. As speech motor control relies on sensory feedback from audition, somatosensation and kinesthesis, whether or not the sensorimotor integration that is critical to speech motor control can be shaped by attention is an important question. Accurate estimation of the dynamic state of the vocal articulators relies on the detection of errors in voice auditory feedback. In addition, considerable evidence has demonstrated that the central auditory processing of speech sounds is highly dependent on attention. For example, larger ERPs (Hink & Hillyard, 1976; Stevens et al., 2006) or enhanced brain activity (Ahveninen et al., 2006; Sabri et al., 2008) in the auditory cortex is observed in response to attended vs. unattended speech sounds. Moreover, disruption of the lip representation induced by transcranial magnetic stimulation results in an increased left-hemisphere P50m response to attended speech sounds (Möttönen et al., 2014), suggesting that the early interactions between auditory and motor cortices depend on attention. Taken together, these studies suggest that attentional mechanisms may influence the auditory–motor processing of vocal pitch errors.
Evidence that attention modulates auditory–motor integration comes from two recent studies. In an ERP study performed by Tumber et al. (2014), participants were exposed to pitch perturbations during vocalisation while they either passively viewed a rapid serial visual presentation (RSVP) of letters, or actively identified target stimuli in the RSVP of letters. The results showed that actively attending to the RSVP elicited a decrease in the magnitude of vocal compensation for pitch perturbations relative to passively viewing the RSVP, whereas the P1-N1-P2 complex elicited by pitch perturbations was not modulated by attention. This finding provides behavioral evidence for the influence of attention on auditory–motor integration in voice control.
In another study, conducted by Hu et al. (2015), participants were exposed to pitch perturbations during vocalisation while they either attended to a visual stimulus or attended to the pitch perturbations in a low (counting the number of perturbations) or high (counting the type of perturbations) attentional load condition. The results revealed no systematic change in the vocal compensations for the pitch perturbations, irrespective of whether the perturbations were attended or not. However, larger P2 responses were observed when participants attended to the pitch perturbations in the low-load attentional condition relative to P2 responses observed in the high-load attentional condition or those observed when the pitch perturbations were unattended. The authors concluded that attentional load modulates auditory cortical responses to perceived vocal pitch errors.
Although the above-cited studies provide support for the hypothesis that there is a role for attention in auditory–motor integration during speech production, certain shortcomings in these two studies limit our understanding. For example, Tumber et al. (2014) investigated whether taxing central attentional processes affects auditory–motor integration by manipulating the visual attention load, but did not manipulate the attention their participants paid to the pitch perturbations. Thus, the question of whether dividing attention between visual and auditory stimuli affects the auditory–motor processing of pitch perturbations remains unanswered. Hu et al. (2015) found a modulatory effect of attention on the cortical processing of vocal pitch feedback errors as a function of load level, but ERP responses for these attentional manipulations were not compared to a control condition in which participants would passively listen to the pitch perturbations and view the visual stimuli. Thus, Hu et al. (2015) were unable to determine whether the observed differences in the cortical responses between attentional conditions reflected enhancement in the attended modality, inhibition in the unattended modality or both. Therefore, whether and how attention influences auditory–motor integration in voice control remains poorly understood.
In the present study, we directly compared the influence of selective attention and divided attention on auditory–motor integration during voice control. Participants sustained a vowel phonation while they heard their voice pitch-shifted (i.e., auditory stimuli) and simultaneously saw flashing lights (i.e., visual stimuli). The participants' task was to attend to either the auditory or the visual stimuli (selective attention), to attend to both modalities (divided attention), or to passively observe both the auditory and visual stimuli (control condition). Vocal and cortical (N1-P2 complex) responses to the pitch perturbations in these experimental conditions were measured and compared. We hypothesised that larger vocal and cortical responses would occur when participants attended to the pitch perturbations as compared to when participants were asked to ignore them. Moreover, we predicted that divided attention would elicit smaller vocal and ERP responses to the pitch perturbations than in the selective attention condition.
Materials and methods
Subjects
Thirty-three students from Sun Yat-sen University of China participated in the experiment. Three participants were excluded from the final sample because of technical problems (N = 2) and poor data quality (N = 1). Therefore, data from 30 subjects (seven males and 23 females) were analysed. All participants were right-handed, native Mandarin speakers with a mean age of 23 ± 3 (SD) years. No participant reported a history of speech, hearing, language or neurological disorders. All subjects passed a hearing screening at 25 dB hearing level (HL) for octave intervals of 500–4000 Hz and had normal or corrected-to-normal vision. Informed consent was obtained from all participants. All the procedures, including subject recruitment and data acquisition, were approved by the Institutional Review Board of The First Affiliated Hospital at Sun Yat-sen University of China, and were in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki).
Apparatus
Participants were seated in a sound-attenuated booth throughout the experiment. Prior to data recording, the experimental system was calibrated to ensure that the intensity of the voice feedback that participants heard was 10 dB (sound pressure level, SPL) higher than their actual voice output, to partially mask the airborne and bone-conducted feedback (Behroozmand et al., 2009). The voice signals were transduced by a dynamic microphone (model DM2200; Takstar Inc., Huizhou, China) and amplified by a MOTU Ultralite Mk3 FireWire audio interface (Cambridge, MA, USA). The amplified signals were then pitch-shifted by an Eventide Eclipse Harmonizer (Little Ferry, NJ, USA), which was controlled by a custom program created with Max/MSP (v.6.0 by Cycling '74, San Francisco, CA, USA). This program also generated transistor–transistor logic (TTL) pulses that marked the onset and offset of the pitch perturbations. The TTL pulses were also sent to the electroencephalograph (EEG) recording system via a FireWire cable. Finally, the pitch-shifted voices were amplified by an ICON NeoAmp headphone amplifier (Middleton, WI, USA) and fed back to participants through insert earphones (ER1-14A, Etymotic Research Inc., Elk Grove Village, IL, USA). The original voice, the pitch-shifted feedback and the TTL pulses were digitised at 10 kHz by a PowerLab A/D converter (model ML880, AD Instruments, Castle Hill, Australia) and recorded using LabChart software (v.7.0, AD Instruments).
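For orientation, the two calibration quantities mentioned above translate into simple numerical factors: a +200-cent pitch shift corresponds to a frequency ratio of 2^(200/1200), and a +10 dB level increase corresponds to a linear amplitude gain of 10^(10/20). The short Python sketch below is illustrative arithmetic only; it is not part of the Max/MSP program used in the study.

```python
import math

def cents_to_ratio(cents: float) -> float:
    """Frequency ratio corresponding to a pitch shift in cents (100 cents = 1 semitone)."""
    return 2.0 ** (cents / 1200.0)

def db_to_amplitude_gain(db: float) -> float:
    """Linear amplitude gain corresponding to a level increase in dB SPL."""
    return 10.0 ** (db / 20.0)

print(cents_to_ratio(200.0))        # ~1.122, i.e. the feedback F0 is raised by ~12.2%
print(db_to_amplitude_gain(10.0))   # ~3.16x amplitude, the +10 dB masking gain
```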
Visual and auditory stimuli
Visual stimuli
Two circles representing the blue and red indicator lights were generated by Max/MSP software and displayed on the computer screen. Participants were instructed to vocalise when the blue indicator light was turned on and to terminate their vocalisations when the blue indicator light was turned off; during each vocalisation they heard their voice auditory feedback unexpectedly shifted upwards (see Auditory stimuli). The red indicator light began to flash 500 ms after the blue indicator light prompted participants to vocalise. During each vocalisation, the red indicator light flashed 1–7 times with variable inter-stimulus intervals (ISIs) ranging from 400 to 1600 ms (400, 600, 800, 1000, 1200, 1400 and 1600 ms). The duration of the red light was fixed at 200 ms.
Auditory stimuli
The pitch-shift stimulus (PSS) that participants heard was +200 cents (100 cents equals 1 semitone) with a fixed duration of 200 ms. The number of PSS presented to subjects ranged from one to four per vocalisation. The first PSS was presented 500–1000 ms after the vocal onset, and the succeeding stimuli occurred with an ISI of 700–900 ms. The onsets of the visual and auditory stimuli were asynchronous.
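A per-trial stimulus schedule is fully determined by the timing parameters above. The following Python sketch shows one way such a schedule could be generated; the function names, the assumed ~5-s vocalisation length, and the interpretation of the ISIs (onset-to-onset for the flashes, offset-to-onset for the PSS) are illustrative assumptions rather than details taken from the original Max/MSP implementation.

```python
import random

VOCALIZATION_MS = 5000          # ~5-s vocalisation (assumption; see Procedure)
FLASH_ISIS_MS = [400, 600, 800, 1000, 1200, 1400, 1600]
FLASH_DURATION_MS = 200
PSS_DURATION_MS = 200

def flash_schedule(first_onset_ms=500):
    """Red-light flash onsets: first flash 500 ms after the vocalisation cue,
    then up to 1-7 flashes per vocalisation with ISIs drawn from the
    400-1600 ms set (ISI treated here as onset-to-onset, an assumption)."""
    n_flashes = random.randint(1, 7)
    onsets, t = [first_onset_ms], first_onset_ms
    for _ in range(n_flashes - 1):
        t += random.choice(FLASH_ISIS_MS)
        onsets.append(t)
    return [t for t in onsets if t + FLASH_DURATION_MS <= VOCALIZATION_MS]

def pss_schedule():
    """Pitch-shift stimulus onsets: 1-4 PSS per vocalisation, the first one
    500-1000 ms after vocal onset, later ones separated by a 700-900 ms ISI
    (ISI treated here as offset-to-onset, an assumption)."""
    n_pss = random.randint(1, 4)
    onsets = [random.uniform(500, 1000)]
    for _ in range(n_pss - 1):
        onsets.append(onsets[-1] + PSS_DURATION_MS + random.uniform(700, 900))
    return [t for t in onsets if t + PSS_DURATION_MS <= VOCALIZATION_MS]

print(flash_schedule(), pss_schedule())
```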
Procedure
The present study included four conditions: (i) attend to the PSS but ignore the red indicator light (Auditory Attention); (ii) attend to the red indicator light but ignore the PSS (Visual Attention); (iii) attend to both the PSS and the red indicator light (Bimodal Attention); and (iv) passively listen to the PSS and view the lights (Bimodal Passive). Across all conditions, subjects vocalised the vowel sound /u/ for ~5 s at a comfortable pitch when cued by the blue indicator light, and saw the red indicator light flashing. At the end of each vocalisation, subjects were required to take a break of 2–3 s prior to initiating the next vocalisation. Production of ~40 consecutive vocalisations constituted one block, which resulted in ~100 trials (i.e. PSS) per condition. As a control condition, the Bimodal Passive condition was always the first block, while participants were unaware of the attention-related tasks that were to follow. The order of the other three conditions was counterbalanced across all subjects.
Across all but the Bimodal Passive condition, an immediate recall test was performed after each vocalisation to ensure participants attended to or ignored the stimuli as instructed. Subjects reported the number of PSS that they heard in the Auditory Attention condition, the number of red indicator light flashes that they saw in the Visual Attention condition, or both in the Bimodal Attention condition. The percentage of correctly remembered stimuli across the three conditions was calculated based on these data.
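The immediate recall test reduces to comparing, for each vocalisation, the reported count with the number of stimuli actually presented. A minimal scoring sketch follows; the exact scoring scheme (a per-vocalisation match, as assumed here) is not spelled out in the text, and the names and toy values are illustrative.

```python
def recall_accuracy(presented_counts, reported_counts):
    """Percentage of vocalisations on which the reported stimulus count matched
    the number of stimuli actually presented (one plausible scoring scheme)."""
    correct = sum(p == r for p, r in zip(presented_counts, reported_counts))
    return 100.0 * correct / len(presented_counts)

# e.g. four vocalisations from an Auditory Attention block (toy data):
print(recall_accuracy([2, 3, 1, 4], [2, 3, 2, 4]))   # -> 75.0
```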
EEG data acquisition and analyses
Participants wore a 64-electrode Geodesic Sensor Net (Electrical Geodesics Inc., Eugene, OR, USA) on their scalp, and EEG signals were amplified by a Net Amps 300 amplifier (Electrical Geodesics Inc.) and recorded using NetStation software (v.4.5; Electrical Geodesics Inc.). EEG signals from all channels were referenced to the vertex (Cz) during the recording and digitised at a sampling frequency of 1 kHz. The impedances of individual sensors were maintained below 50 kΩ throughout the recording (Ferree et al., 2001).
After data acquisition, NetStation software was used for off-line analyses of the EEG signals. Data from all channels were band-pass-filtered at 1–20 Hz and then segmented into epochs ranging from 200 ms before to 500 ms after the onset of the PSS. Segmented trials contaminated by excessive muscular activity, eye blinks or eye movements were assessed using the Artifact Detection toolbox in NetStation and eliminated from further analyses. Additional visual inspection of individual trials was performed to ensure that artifacts were rejected appropriately. Individual electrodes containing artifacts in >20% of the segments were excluded from further analyses. All channels were then re-referenced to the average of the electrodes on each mastoid, and artifact-free epochs were averaged and baseline-corrected to generate an overall response for each condition. The amplitudes and latencies of the N1 and P2 components across conditions were measured as the negative and positive peaks in the time windows of 80–180 and 160–280 ms after the onset of the PSS, respectively, and submitted to statistical analyses.
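The offline ERP pipeline described above (1–20 Hz band-pass, −200 to +500 ms epochs around PSS onset, artifact rejection, mastoid re-referencing, and N1/P2 peak picking in the 80–180 and 160–280 ms windows) can be expressed compactly in code. The sketch below uses MNE-Python purely as an illustrative stand-in for NetStation; the file name, event code, rejection threshold and mastoid channel labels are assumptions, not details from the study.

```python
import mne

# "raw.fif" and the PSS event code are hypothetical placeholders.
raw = mne.io.read_raw_fif("raw.fif", preload=True)
raw.filter(l_freq=1.0, h_freq=20.0)                    # 1-20 Hz band-pass

events = mne.find_events(raw)                          # TTL pulses mark PSS onsets
epochs = mne.Epochs(raw, events, event_id={"pss": 1},
                    tmin=-0.2, tmax=0.5,               # -200 to +500 ms around PSS onset
                    baseline=(-0.2, 0.0),
                    reject=dict(eeg=100e-6),           # crude amplitude criterion (assumption)
                    preload=True)
epochs.set_eeg_reference(["M1", "M2"])                 # average of the two mastoids (labels assumed)

evoked = epochs.average()
_, n1_lat, n1_amp = evoked.get_peak(tmin=0.08, tmax=0.18, mode="neg",
                                    return_amplitude=True)   # N1: 80-180 ms
_, p2_lat, p2_amp = evoked.get_peak(tmin=0.16, tmax=0.28, mode="pos",
                                    return_amplitude=True)   # P2: 160-280 ms
print(n1_amp, n1_lat, p2_amp, p2_lat)
```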
Vocal data analysis
Voice F0 contours were extracted for each vowel production using Praat (Boersma, 2001). F0 values in Hz were converted to the cents scale using the formula cents = 100 × [12 × log2(F0/reference)], where the reference denotes the arbitrary reference note 195.997 Hz (G4). Each trial was segmented into epochs ranging from 200 ms before to 700 ms after the onset of the PSS. Each trial was visually inspected to ensure that trials with vocal interruptions or signal processing errors were excluded from further analyses. Artifact-free trials for each of the four conditions were then averaged for each participant to generate an overall vocal response. Vocal response latencies in ms were measured as the time when the F0 trajectory exceeded 2 SDs above or below the pre-stimulus mean following the perturbation onset. Vocal response magnitudes in cents were calculated by subtracting the pre-stimulus mean from the peak value of the voice F0 contour following the response onset. The peak time of the vocal response in ms was determined at the point of greatest deviation from the value at the onset of the stimulus.
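For concreteness, the cents conversion and the three vocal response measures described above can be written out as follows. This is an illustrative re-implementation in Python (the study used Praat for F0 extraction); the array and function names are assumptions, and the magnitude and peak-time measures are computed relative to the pre-stimulus mean, which is one reasonable reading of the description rather than the authors' own code.

```python
import numpy as np

REFERENCE_HZ = 195.997      # arbitrary reference note stated in the text
FS_HZ = 10_000              # voice signals digitised at 10 kHz

def hz_to_cents(f0_hz):
    """cents = 100 x [12 x log2(F0 / reference)]."""
    return 100.0 * 12.0 * np.log2(f0_hz / REFERENCE_HZ)

def vocal_response_measures(f0_hz, pss_onset_idx, pre_ms=200, post_ms=700):
    """Latency, magnitude and peak time of the vocal response for one averaged
    F0 contour sampled at FS_HZ; pss_onset_idx is the sample index of PSS onset."""
    pre = int(pre_ms * FS_HZ / 1000)
    post = int(post_ms * FS_HZ / 1000)
    cents = hz_to_cents(np.asarray(f0_hz[pss_onset_idx - pre:pss_onset_idx + post]))
    baseline = cents[:pre]
    mu, sd = baseline.mean(), baseline.std()

    post_stim = cents[pre:]
    # Latency: first post-onset sample deviating by more than 2 SD from the pre-stimulus mean.
    beyond = np.flatnonzero(np.abs(post_stim - mu) > 2 * sd)
    latency_ms = 1000 * beyond[0] / FS_HZ if beyond.size else np.nan

    # Magnitude: peak deviation of the post-onset contour from the pre-stimulus mean (cents).
    peak_idx = int(np.argmax(np.abs(post_stim - mu)))
    magnitude_cents = post_stim[peak_idx] - mu
    peak_time_ms = 1000 * peak_idx / FS_HZ
    return latency_ms, magnitude_cents, peak_time_ms
```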
Statistical analyses
The vocal and neurophysiological response data were subjected to repeated-measures analyses of variance (RM-ANOVAs) in SPSS (v.16.0). Specifically, the magnitudes, latencies and peak times of the vocal responses were analysed using one-way RM-ANOVAs with attention condition as the single factor (i.e. Auditory Attention, Visual Attention, Bimodal Attention and Bimodal Passive). The amplitudes and latencies of the N1-P2 complex, extracted from 10 electrodes (FC1, FC2, FCz, FC3, FC4, C1, C2, Cz, C3, C4), were analysed using three-way RM-ANOVAs (attention condition, anteriority and laterality). Frontal (FC1, FC2, FCz, FC3, FC4) and central (C1, C2, Cz, C3, C4) electrodes were used for the anteriority factor, while lateral left (FC3, C3), medial left (FC1, C1), midline (FCz, Cz), medial right (FC2, C2) and lateral right (FC4, C4) electrodes were used for the laterality factor. Responses from frontal and central electrodes were chosen for statistical analyses because neurophysiological responses to the PSS are most pronounced in these two areas (Hawco et al., 2009; Chen et al., 2012). Subsidiary RM-ANOVAs were calculated if higher-order interaction effects reached significance. Probability values were adjusted using the Greenhouse–Geisser correction when the assumption of sphericity was violated.
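The repeated-measures designs above can also be reproduced outside SPSS. The sketch below uses statsmodels' AnovaRM on a hypothetical long-format table; the column names and synthetic values are placeholders, and AnovaRM does not apply the Greenhouse–Geisser correction reported in the paper, which would need to be handled separately.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
conditions = ["AuditoryAttention", "VisualAttention", "BimodalAttention", "BimodalPassive"]

# Hypothetical long-format table: one averaged vocal response magnitude (cents)
# per subject x attention condition; real values would come from the vocal analysis.
rows = [{"subject": s, "condition": c, "magnitude": rng.normal(11, 2)}
        for s in range(1, 31) for c in conditions]
df = pd.DataFrame(rows)

# One-way repeated-measures ANOVA with attention condition as the within-subject factor.
print(AnovaRM(df, depvar="magnitude", subject="subject", within=["condition"]).fit())

# The N1/P2 measures would be analysed analogously, with three within-subject
# factors (attention condition, anteriority, laterality) in the "within" list.
```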
Results
Behavioral performance
Participants' accuracy at counting and recalling the number of PSS in the Auditory Attention condition, the number of red light flashes in the Visual Attention condition, and both of these auditory and visual stimuli in the Bimodal Attention condition, was calculated as a percentage of correct responses for each of these conditions. Participants' accuracy at reporting the number of PSS in the Bimodal Attention condition (79 ± 2%) (mean ± SEM throughout unless otherwise indicated) was significantly lower than that in the Auditory Attention condition (98 ± 1%; t(29) = 8.470, P < 0.001). Similarly, participants' accuracy at reporting the number of red light flashes in the Bimodal Attention condition (80 ± 3%) was significantly lower than that in the Visual Attention condition (95 ± 1%; t(29) = 6.293, P < 0.001).
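The paired comparisons reported here (e.g. t(29) = 8.470) can be reproduced from the 30 per-subject accuracy scores with a paired-samples t-test. A minimal sketch, with randomly generated placeholder accuracies standing in for the real recall data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical per-subject recall accuracies (%) for 30 subjects; real values
# would come from the recall test, not from a random generator.
auditory_attention_acc = rng.normal(98, 5, size=30).clip(0, 100)
bimodal_attention_acc = rng.normal(79, 10, size=30).clip(0, 100)

t_stat, p_value = stats.ttest_rel(auditory_attention_acc, bimodal_attention_acc)
print(f"t(29) = {t_stat:.3f}, P = {p_value:.3g}")
```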
Vocal responses
Figure 1 shows the grand-averaged voice F0 contours and the T-bar plots of the absolute magnitudes of vocal response to the PSS across the four attention conditions. A one-way RM-ANOVA of the response magnitude revealed a significant main effect of attention condition (F(3,87) = 2.863, P = 0.041). Post hoc Bonferroni comparison tests revealed that the Auditory Attention condition (12.5 ± 0.8 cents) elicited significantly larger vocal responses than the Visual Attention condition (10.3 ± 0.8 cents; P = 0.026). Vocal responses during the Bimodal Attention condition (12.2 ± 1.1 cents) were larger than during the Visual Attention condition (11.4 ± 0.9 cents), but this difference failed to reach significance (P = 0.203). Vocal response magnitudes for the Bimodal Passive condition were not significantly different from the responses elicited in the other three conditions (P > 0.5). In addition, vocal response latencies did not differ as a function of the attention condition (F(3,87) = 0.486, P = 0.693; Auditory Attention, 86 ± 7 ms; Visual Attention, 86 ± 9 ms; Bimodal Attention, 86 ± 7 ms; Bimodal Passive, 91 ± 8 ms). In addition, no significant main effect of attention condition was found for the peak times of vocal responses (F(3,87) = 0.697, P = 0.556; Auditory Attention, 352 ± 34 ms; Visual Attention, 348 ± 37 ms; Bimodal Attention, 324 ± 23 ms; Bimodal Passive, 326 ± 25 ms).
Neurophysiological responses
Figure 2 shows the grand-averaged ERP waveforms in response to the PSS for the Auditory Attention (red lines), Visual Attention (blue lines), Bimodal Attention (black lines) and Bimodal Passive (green lines) conditions. Figure 3 shows the topographical distributions of the N1 and P2 components across the four attention conditions. As can be seen in Figs 2 and 3, the Auditory Attention condition elicited the largest P2 responses and the Bimodal Attention condition elicited the smallest P2 responses, while P2 responses were similar in the Visual Attention and Bimodal Passive conditions. In addition, the N1 response also appeared to be affected by attention, as reflected by the larger (i.e. more negative) responses in the Bimodal Attention condition than in the Bimodal Passive condition.
Fig. 1. (A) Grand-averaged voice F0 contours and (B) T-bar plots of the magnitudes of vocal responses to pitch perturbations across the four attention conditions. The thick solid line, the sparse dashed line, the dense dashed line and the thin solid line represent the vocal responses in the Auditory Attention, Visual Attention, Bimodal Attention and Bimodal Passive conditions, respectively. *P < 0.05 (P = 0.026) for the size of the vocal responses between the Auditory Attention and Visual Attention conditions. Error bars represent SEM.
The three-way RM-ANOVA revealed a significant main effect of attention condition (F(3,87) = 5.205, P = 0.006) on N1 amplitudes. Post hoc Bonferroni comparison tests revealed that N1 amplitudes were larger in the Bimodal Attention condition (-2.89 ± 0.10 μV) than in the Bimodal Passive condition (-2.13 ± 0.08 μV; P < 0.001; see Fig. 4). Frontal electrodes (-2.63 ± 0.07 μV) recorded significantly smaller N1 amplitudes than central electrodes (-2.35 ± 0.06 μV; F(1,29) = 16.695, P < 0.001). The main effect of laterality also reached significance (F(4,116) = 8.607, P < 0.001), which was primarily driven by larger N1 amplitudes at the medial electrodes (-2.74 ± 0.11 μV) than at the lateral electrodes (left lateral, -2.13 ± 0.09 μV; right lateral, -2.32 ± 0.10 μV). No significant difference, however, was found between left and right electrodes (P > 0.05).
Analysis of the N1 latencies revealed a significant main effect of laterality (F(4,116) = 4.041, P < 0.032). Midline electrodes (127 ± 1 ms) recorded significantly shorter N1 latencies than both the right lateral (131 ± 1 ms; P < 0.002) and the right medial electrodes (130 ± 1 ms; P < 0.013). There was no systematic change in N1 latencies as a function of attention condition (F(3,87) = 1.466, P = 0.229; Auditory Attention, 127 ± 1 ms; Visual Attention, 128 ± 1 ms; Bimodal Attention, 129 ± 1 ms; Bimodal Passive, 131 ± 1 ms) or anteriority (F(1,29) = 2.148, P = 0.154; frontal, 129 ± 1 ms; central, 128 ± 1 ms).
Fig. 2. Grand-averaged ERP waveforms in response to pitch perturbations of +200 cents across the four attention conditions. The red, blue, black and green solid lines denote the cortical responses in the Auditory Attention, Visual Attention, Bimodal Attention and Bimodal Passive conditions, respectively. ERP responses are averages over electrodes FC1, FCz, FC2, C1, Cz and C2, across all subjects.
Fig. 3. Topographical distributions of the N1 (top) and P2 (bottom) responses to pitch perturbations of +200 cents across the four attention conditions. From left to right are shown the responses in the Auditory Attention, Visual Attention, Bimodal Attention and Bimodal Passive conditions, respectively.
For P2 amplitudes, there was a significant main effect of attention condition (F(3,87) = 23.819, P < 0.001). Post hoc Bonferroni comparison tests showed that the Auditory Attention condition (3.94 ± 0.11 μV) elicited larger P2 amplitudes (i.e., more positive) than the other three attention conditions (P < 0.003), and the Bimodal Attention condition (2.55 ± 0.08 μV) elicited smaller P2 amplitudes than the other three attention conditions (P < 0.01; see Fig. 4). No significant difference was found between the Visual Attention condition (3.11 ± 0.08 μV) and the Bimodal Passive condition (3.12 ± 0.10 μV; P > 0.05).
In addition to the main effect of attention, main effects of anteriority (F(1,29) = 68.498, P < 0.001) and laterality (F(4,116) = 49.634, P < 0.001) also reached significance. The main effect of anteriority was driven by P2 amplitudes that were larger at the frontal electrodes (3.50 ± 0.07 μV) than at the central electrodes (2.86 ± 0.06 μV; P < 0.001). The main effect of laterality was primarily caused by larger P2 amplitudes at midline electrodes (3.92 ± 0.11 μV) relative to lateral and medial electrodes (P < 0.02), and larger P2 amplitudes at medial electrodes (3.46 ± 0.11 μV) than at the lateral electrodes (2.53 ± 0.09 μV; P < 0.01). There was no significant difference between the left and right electrodes (P > 0.05).
A main effect of laterality (F(4,116) = 5.363, P = 0.006) was significant for P2 latencies. P2 latencies observed for right lateral electrodes (227 ± 1 ms) were longer than latencies observed for left medial electrodes (222 ± 1 ms; P = 0.002), right medial electrodes (224 ± 1 ms; P = 0.009) and midline electrodes (223 ± 1 ms; P = 0.011). Main effects of attention (F(3,87) = 1.191, P = 0.318; Auditory Attention, 222 ± 1 ms; Visual Attention, 225 ± 1 ms; Bimodal Attention, 222 ± 1 ms; Bimodal Passive, 226 ± 1 ms) and anteriority (F(1,29) = 0.019, P = 0.890; frontal, 224 ± 1 ms; central, 224 ± 1 ms) failed to reach significance.
Discussion
The present study examined the influence of selective and divided attention on the auditory–motor processing of feedback errors during vocal pitch regulation. The behavioral results revealed an effect of selective attention, as larger vocal compensations for pitch perturbations were elicited when participants attended to the auditory stimuli compared to when they attended to the visual stimuli. This pattern extended to the cortical responses; larger P2 responses to pitch perturbations were elicited when participants attended to the auditory stimuli as compared to when they attended to the visual stimuli or passively observed the auditory and visual stimuli. Moreover, when participants divided their attention between auditory and visual stimuli, P2 responses were significantly smaller than when participants attended to the auditory or visual stimuli, or passively observed both. In addition, divided attention elicited more negative N1 responses than the control condition. These findings support our hypothesis that auditory–motor integration in voice control is modulated by attention. Moreover, selective attention and divided attention appear to modulate the processing of auditory feedback during vocal pitch regulation in different ways.
Attention-dependent modulation of the vocal response
It is generally thought that vocal compensation for feedback perturbations is involuntary in nature and cannot be controlled consciously. For example, participants produce compensatory vocal responses to pitch perturbations even when told to ignore them (Burnett et al., 1998; Hain et al., 2000; Zarate & Zatorre, 2008). Keough et al. (2013) further reported that vocal responses observed when vocally-trained and -untrained participants were instructed to ignore the PSS did not differ from those observed when they were instructed to compensate for the PSS. Although these studies suggest that vocal compensation for feedback perturbations is independent of attentional control, they included no explicit measures of the degree of attentional engagement. In the present study, we employed a post-presentation recall test that provided confirmation that visual and/or auditory attention was engaged. Indeed, our results revealed larger vocal responses to the attended pitch perturbations than when they were not attended. Similarly, Tumber et al. (2014) reported that actively attending to an RSVP elicited smaller vocal compensation for pitch perturbations than did passively viewing the RSVP. Note that this modulation of vocal compensation occurred only when participants were asked to passively view the RSVP before performing the vocal and RSVP tasks simultaneously, because the participants were unable to ignore the RSVP after previously performing a visual task (Scheerer & Jones, 2014). Taken together, these findings provide evidence that selective attention modulates vocal compensation for pitch perturbations.
Our finding, however, is inconsistent with another study by Hu et al. (2015) that showed no effect of attention on the level of vocal compensations for pitch perturbations. It is noteworthy that only one size of pitch perturbation (+200 cents) was used in the present study, while pitch perturbations of +100, +200 and +500 cents were randomly presented in the Hu et al. (2015) study. In addition, the same auditory and visual stimuli were used across the four attention conditions in the present study, while the visual stimuli (i.e., the red indicator light) were not presented when participants were asked to attend to the pitch perturbations in the Hu et al. (2015) study. Therefore, the inconsistency in the effect of selective attention on vocal pitch monitoring between these two studies may be attributed to the differences in the presentation of the auditory and visual stimuli across the attention conditions.

Fig. 4. T-bar plots of the grand-averaged amplitudes of the N1 (left) and P2 (right) responses to pitch perturbations of +200 cents across 10 electrodes. *P < 0.05 for differences in cortical responses between conditions. Error bars represent SEM.
As expected, participants performed more poorly during the Bimodal Attention condition than the Auditory Attention condition in terms of their recall of the number of PSS they heard during each utterance. This finding is in line with other cross-modal studies that have shown behavioral deficits when participants divided their attention between modalities as compared to selectively attending to one modality and ignoring the other (Bonnel & Hafter, 1998; Jolicoeur, 1999). One might therefore hypothesise that the Bimodal Attention condition should elicit smaller vocal compensatory responses to pitch perturbations than the Auditory Attention condition. However, the vocal responses did not differ across these two conditions, which suggests that divided attention does not affect the vocal responses. Nonetheless, selectively attending to or ignoring the auditory modality did affect the vocal responses. Moreover, the vocal responses did not differ across the Visual Attention, Bimodal Attention and Bimodal Passive conditions. In addition, vocal responses were similar in the Auditory Attention and Bimodal Passive conditions. Together, this pattern of results suggests that the attention effect on vocal compensation for feedback perturbations may be dependent on the degree of attentional engagement. In other words, only when pitch perturbations were fully attended vs. completely ignored did attention have an impact on the vocal compensation.
Selective attention effect on the neurophysiological responses
The present study found larger P2 responses were elicited when participants attended to the pitch perturbations (i.e., Auditory Attention condition) than when they ignored the pitch perturbations (i.e., Visual Attention condition) or were not asked to attend to the auditory stimuli or the visual stimuli (i.e., Bimodal Passive condition), indicating that selective attention modulates cortical processing of auditory feedback during vocal pitch regulation. This aspect of our finding is consistent with one recent study that showed larger P2 responses to attended vs. unattended pitch perturbations in voice auditory feedback (Hu et al., 2015). Similarly, previous research on auditory perception showed enhanced brain activity in the auditory cortex (Grady et al., 1997; Petkov et al., 2004), larger P2 responses (Picton & Hillyard, 1974; Woldorff & Hillyard, 1991; Neelon et al., 2006) and mismatch negativity (Näätänen et al., 1993; Alain & Woods, 1997; Woldorff et al., 1998; Sussman et al., 2007) elicited by attended vs. unattended sounds. Our results are also consistent with the effects of selective attention on bimodal sensory processing (e.g., auditory and visual) reported in other neuroimaging studies that showed more activity in the auditory cortex while participants attended to an auditory stimulus than when the same stimulus was ignored (Kawashima et al., 1999; Loose et al., 2003; Johnson & Zatorre, 2005, 2006).
There are theories that account for the enhanced P2 responses to attended vs. unattended pitch perturbations. One theory is the gain-based theory of selective attention (Hillyard et al., 1998), which posits that paying attention to one modality while ignoring another increases the gain of neurons that are sensitive to the attended modality, which results in decreased perceptual thresholds for that modality. In the context of this theory, then, attending to the pitch perturbations while ignoring the flashing lights in the Auditory Attention condition may lead to an increased gain for neurons involved in auditory–motor integration that facilitates the detection/correction of pitch errors in voice auditory feedback, resulting in enhanced P2 responses to attended vs. unattended pitch perturbations.
An alternative explanation for the enhanced P2 responses we observed may be that cortical responses to the unattended pitch perturbations decreased. There is evidence that suggests that modality-specific selective attention results primarily from decreased processing in the unattended modality rather than increased processing in the attended modality (Mozolic et al., 2008). Thus, the selective attention effect observed in the present study may have resulted from inhibited activity in the sensory cortices that subserve the processing of pitch perturbations, thereby freeing central attentional resources for the processing of flashing lights in the Visual Attention condition.
There is also considerable evidence that both cross-modal facilitation and inhibition underlie attentional modulation of the sensory cortices, such that attention to one modality leads not only to increased activity in the sensory cortical area involved in the processing of that modality but also to suppressed activity in regions associated with other modalities (Loose et al., 2003; Johnson & Zatorre, 2005, 2006; Mozolic et al., 2008). This leaves open the possibility that changes in cortical responses may result not only from enhanced responses to the attended pitch perturbations but also from decreased responses to the unattended pitch perturbations.
As the above interpretations involve the comparison of the auditory and visual selective attention conditions with each other, rather than with the control condition, it is unclear whether changes in the cortical responses reflect only enhancement in the attended modality, inhibition in the unattended modality, or both. However, our results revealed that cortical responses to pitch perturbations in the Visual Attention condition did not differ from those observed when participants passively observed the bimodal stimuli (i.e., Bimodal Passive condition). Similarly, Tumber et al. (2014) reported that the P1-N1-P2 complex to pitch perturbations did not change when participants actively attended to the RSVP as compared to when they passively viewed the RSVP. In other words, increased attention to the visual stimuli, relative to the control condition, did not appear to suppress the cortical processing of auditory stimuli during the online control of vocal production. This finding does not support the hypothesis that the processing of attended information leads to the inhibition of processing of unattended information. Thus, the observed effect of selective attention may be primarily due to increased gain in neurons that are sensitive to the detection and/or correction of feedback perturbations during vocal production.
Divided attention effect on the neurophysiological response
The present study found smaller P2 responses in the Bimodal Attention condition than in the Auditory Attention condition. This finding suggests that divided attention leads to decreased auditory cortical processing of vocal pitch perturbations relative to selective attention. These results are in line with previous bimodal studies that showed decreased activity in the sensory cortex (i.e., auditory or visual) during divided attention relative to the levels of activity observed during selective attention (Loose et al., 2003; Johnson & Zatorre, 2006). This pattern of results fits the classical theory of limited attentional resources (Näätänen, 1992), according to which fewer neural resources are allocated for the processing of auditory stimuli when attention is divided between auditory and visual stimuli than when attention is selectively focused on auditory stimuli. However, another possible mechanism responsible for this divided attention effect is
that the auditory and visual systems reciprocally inhibit the process-
References
Behroozmand, R., Karvelis, L., Liu, H. & Larson, C.R. (2009) Vocalization-induced enhancement of the auditory cortex responsiveness during voice F0 feedback perturbation. Clin. Neurophysiol., 120, 1303–1312.
Behroozmand, R., Ibrahim, N., Korzyukov, O., Robin, D.A. & Larson, C.R. (2014) Left-hemisphere activation is associated with enhanced vocal pitch error detection in musicians with absolute pitch. Brain Cognition, 84, 97–108.
Boersma, P. (2001) Praat, a system for doing phonetics by computer. Glot Int., 5, 341–345.
Bonnel, A.M. & Hafter, E.R. (1998) Divided attention between simultaneous auditory and visual signals. Percept. Psychophys., 60, 179–190.
Burnett, T.A., Freedland, M.B., Larson, C.R. & Hain, T.C. (1998) Voice F0 responses to manipulations in pitch feedback. J. Acoust. Soc. Am., 103, 3153–3161.
Cai, S., Ghosh, S.S., Guenther, F.H. & Perkell, J.S. (2011) Focal manipulations of formant trajectories reveal a role of auditory feedback in the online control of both within-syllable and between-syllable speech timing. J. Neurosci., 31, 16483–16490.
Chang, E.F., Niziolek, C.A., Knight, R.T., Nagarajan, S.S. & Houde, J.F. (2013) Human cortical sensorimotor network underlying feedback control of vocal pitch. Proc. Natl. Acad. Sci. USA, 110, 2653–2658.
Chen, Z., Liu, P., Wang, E.Q., Larson, C.R., Huang, D. & Liu, H. (2012) ERP correlates of language-specific processing of auditory pitch feedback during self-vocalization. Brain Lang., 121, 25–34.
Chen, Z., Jones, J.A., Liu, P., Li, W., Huang, D. & Liu, H. (2013) Dynamics of vocalization-induced modulation of auditory cortical activity at mid-utterance. PLoS One, 8, e60039.
Corbetta, M., Miezin, F.M., Dobmeyer, S., Shulman, G.L. & Petersen, S.E. (1991) Selective and divided attention during visual discriminations of shape, color, and speed: functional anatomy by positron emission tomography. J. Neurosci., 11, 2383–2402.
Curtis, C.E. & D'Esposito, M. (2004) The effects of prefrontal lesions on working memory performance and theory. Cogn. Affect. Behav. Ne., 4, 528–539.
Eliades, S.J. & Wang, X. (2008) Neural substrates of vocalization feedback monitoring in primate auditory cortex. Nature, 453, 1102–1106.
Ferree, T.C., Luu, P., Russell, G.S. & Tucker, D.M. (2001) Scalp electrode impedance, infection risk, and EEG data quality. Clin. Neurophysiol., 112, 536–544.
Flinker, A., Chang, E.F., Kirsch, H.E., Barbaro, N.M., Crone, N.E. & Knight, R.T. (2010) Single-trial speech suppression of auditory cortex activity in humans. J. Neurosci., 30, 16643–16650.
Grady, C.L., Van Meter, J.W., Maisog, J.M., Pietrini, P., Krasuski, J. & Rauschecker, J.P. (1997) Attention-related modulation of activity in primary and secondary auditory cortex. NeuroReport, 8, 2511–2516.
Hain, T.C., Burnett, T.A., Kiran, S., Larson, C.R., Singh, S. & Kenney, M.K. (2000) Instructing subjects to make a voluntary response reveals the presence of two components to the audio-vocal reflex. Exp. Brain Res., 130, 133–141.
Hawco, C.S., Jones, J.A., Ferretti, T.R. & Keough, D. (2009) ERP correlates of online monitoring of auditory feedback during vocalization. Psychophysiology, 46, 1216–1225.
Heinks-Maldonado, T.H., Mathalon, D.H., Gray, M. & Ford, J.M. (2005) Fine-tuning of auditory cortex during speech production. Psychophysiology, 42, 180–190.
Hickok, G., Houde, J.F. & Rong, F. (2011) Sensorimotor integration in speech processing: computational basis and neural organization. Neuron, 69, 407–422.
Hillyard, S.A., Vogel, E.K. & Luck, S.J. (1998) Sensory gain control (amplification) as a mechanism of selective attention: electrophysiological and neuroimaging evidence. Philos. T. Roy. Soc. B., 353, 1257–1270.
Hink, R.F. & Hillyard, S.A. (1976) Auditory evoked potentials during selective listening to dichotic speech messages. Percept. Psychophys., 20, 236–242.
Hohnsbein, J., Falkenstein, M., Hoormann, J. & Blanke, L. (1991) Effects of crossmodal divided attention on late ERP components. I. Simple and choice reaction tasks. Electroen. Clin. Neuro., 78, 438–446.
Houde, J.F. & Jordan, M.I. (1998) Sensorimotor adaptation in speech production. Science, 279, 1213–1216.
Houde, J.F., Nagarajan, S.S., Sekihara, K. & Merzenich, M.M. (2002) Modulation of the auditory cortex during speech: an MEG study. J. Cognitive Neurosci., 14, 1125–1138.
Hu, H., Liu, Y., Guo, Z., Li, W., Liu, P., Chen, S. & Liu, H. (2015) Attention modulates cortical processing of pitch feedback errors in voice control. Sci. Rep., 5, 7812.
Johnson, J.A. & Zatorre, R.J. (2005) Attention to simultaneous unrelated auditory and visual events: behavioral and neural correlates. Cereb. Cortex, 15, 1609–1620.
Johnson, J.A. & Zatorre, R.J. (2006) Neural substrates for dividing and focusing attention between simultaneous auditory and visual events. NeuroImage, 31, 1673–1681.
Johnson, J.A., Strafella, A.P. & Zatorre, R.J. (2007) The role of the dorsolateral prefrontal cortex in bimodal divided attention: two transcranial magnetic stimulation studies. J. Cognitive Neurosci., 19, 907–920.
Jolicoeur, P. (1999) Restricted attentional capacity between sensory modalities. Psychon. B. Rev., 6, 87–92.
Jones, J.A. & Munhall, K.G. (2005) Remapping auditory-motor representations in voice production. Curr. Biol., 15, 1768–1772.
Kawashima, R., Imaizumi, S., Mori, K., Okada, K., Goto, R., Kiritani, S., Ogawa, A. & Fukuda, H. (1999) Selective visual and auditory attention toward utterances - a PET study. NeuroImage, 10, 209–215.
Keough, D., Hawco, C. & Jones, J.A. (2013) Auditory-motor adaptation to frequency-altered auditory feedback occurs when participants ignore feedback. BMC Neurosci., 14, 25.
Klingberg, T. (1998) Concurrent performance of two working memory tasks: potential mechanisms of interference. Cereb. Cortex, 8, 593–601.
Knight, R.T., Scabini, D. & Woods, D.L. (1989) Prefrontal cortex gating of auditory transmission in humans. Brain Res., 504, 338–342.
Liu, H., Wang, E.Q., Chen, Z., Liu, P., Larson, C.R. & Huang, D. (2010) Effect of tonal native language on voice fundamental frequency responses to pitch feedback perturbations during vocalization. J. Acoust. Soc. Am., 128, 3739–3746.
Loose, R., Kaufmann, C., Auer, D.P. & Lange, K.W. (2003) Human prefrontal and sensory cortical activity during divided attention tasks. Hum. Brain Mapp., 18, 249–259.
Möttönen, R., van de Ven, G.M. & Watkins, K.E. (2014) Attention fine-tunes auditory-motor processing of speech sounds. J. Neurosci., 34, 4064–4069.
Mozolic, J.L., Joyner, D., Hugenschmidt, C.E., Peiffer, A.M., Kraft, R.A., Maldjian, J.A. & Laurienti, P.J. (2008) Cross-modal deactivations during modality-specific selective attention. BMC Neurol., 8, 35.
Näätänen, R. (1992) Attention and Brain Function. Psychology Press, Hillsdale, NJ.
Näätänen, R., Paavilainen, P., Tiitinen, H., Jiang, D. & Alho, K. (1993) Attention and mismatch negativity. Psychophysiology, 30, 436–450.
Neelon, M.F., Williams, J. & Garell, P.C. (2006) The effects of auditory attention measured from human electrocorticograms. Clin. Neurophysiol., 117, 504–521.
Petkov, C.I., Kang, X., Alho, K., Bertrand, O., Yund, E.W. & Woods, D.L. (2004) Attentional modulation of human auditory cortex. Nat. Neurosci., 7, 658–663.
Petrides, M. (2000) The role of the mid-dorsolateral prefrontal cortex in working memory. Exp. Brain Res., 133, 44–54.
Picton, T.W. & Hillyard, S.A. (1974) Human auditory evoked potentials. II. Effects of attention. Electroen. Clin. Neuro., 36, 191–199.
Pugh, K.R., Shaywitz, B.A., Shaywitz, S.E., Fulbright, R.K., Byrd, D., Skudlarski, P., Shankweiler, D.P., Katz, L., Constable, R.T., Fletcher, J., Lacadie, C., Marchione, K. & Gore, J.C. (1996) Auditory selective attention: an fMRI investigation. NeuroImage, 4, 159–173.
Sabri, M., Binder, J.R., Desai, R., Medler, D.A., Leitl, M.D. & Liebenthal, E. (2008) Attentional and linguistic interactions in speech perception. NeuroImage, 39, 1444–1456.
Scheerer, N.E. & Jones, J.A. (2014) The predictability of frequency-altered auditory feedback changes the weighting of feedback and feedforward input for speech motor control. Eur. J. Neurosci., 40, 3793–3806.
Scheerer, N.E., Liu, H. & Jones, J.A. (2013) The developmental trajectory of vocal and ERP responses to frequency altered auditory feedback. Eur. J. Neurosci., 38, 3189–3200.
Shomstein, S. & Yantis, S. (2004) Control of attention shifts between vision and audition in human cortex. J. Neurosci., 24, 10702–10706.
Stevens, C., Sanders, L. & Neville, H. (2006) Neurophysiological evidence for selective auditory attention deficits in children with specific language impairment. Brain Res., 1111, 143–152.
Sussman, E.S., Horvath, J., Winkler, I. & Orr, M. (2007) The role of attention in the formation of auditory streams. Percept. Psychophys., 69, 136–152.
Tumber, A.K., Scheerer, N.E. & Jones, J.A. (2014) Attentional demands influence vocal compensations to pitch errors heard in auditory feedback. PLoS One, 9, e109968.
Wang, J., Mathalon, D.H., Roach, B.J., Reilly, J., Keedy, S.K., Sweeney, J.A. & Ford, J.M. (2014) Action planning and predictive coding when speaking. NeuroImage, 91, 91–98.
Woldorff, M.G. & Hillyard, S.A. (1991) Modulation of early auditory processing during selective listening to rapidly presented tones. Electroen. Clin. Neuro., 79, 170–191.
Woldorff, M.G., Gallen, C.C., Hampson, S.A., Hillyard, S.A., Pantev, C., Sobel, D. & Bloom, F.E. (1993) Modulation of early sensory processing in human auditory cortex during auditory selective attention. Proc. Natl. Acad. Sci. USA, 90, 8722–8726.
Woldorff, M.G., Hillyard, S.A., Gallen, C.C., Hampson, S.R. & Bloom, F.E. (1998) Magnetoencephalographic recordings demonstrate attentional modulation of mismatch-related neural activity in human auditory cortex. Psychophysiology, 35, 283–292.
Woodruff, P.W., Benson, R.R., Bandettini, P.A., Kwong, K.K., Howard, R.J., Talavage, T., Belliveau, J. & Rosen, B.R. (1996) Modulation of auditory and visual cortex by selective attention is modality-dependent. NeuroReport, 7, 1909–1913.
Yamaguchi, S. & Knight, R.T. (1990) Gating of somatosensory input by human prefrontal cortex. Brain Res., 521, 281–288.
Zarate, J.M. & Zatorre, R.J. (2008) Experience-dependent neural substrates involved in vocal pitch regulation during singing. NeuroImage, 40, 1871–1887.