Training of Working Memory Impacts Neural Processing of Vocal Pitch Regulation

Scientific Reports | 5:16562 | DOI: 10.1038/srep16562
www.nature.com/scientificreports
Weifeng Li1,*, Zhiqiang Guo2,*, Jeffery A. Jones3, Xiyan Huang1, Xi Chen1, Peng Liu1,
Shaozhen Chen1 & Hanjun Liu1,2
Working memory training can improve the performance of tasks that were not trained. Whether auditory-motor integration for voice control can benefit from working memory training, however, remains unclear. The present event-related potential (ERP) study examined the impact of working memory training on the auditory-motor processing of vocal pitch. Trained participants underwent adaptive working memory training using a digit span backwards paradigm, while control participants did not receive any training. Before and after training, both trained and control participants were exposed to frequency-altered auditory feedback while producing vocalizations. After training, trained participants exhibited significantly decreased N1 amplitudes and increased P2 amplitudes in response to pitch errors in voice auditory feedback. In addition, there was a significant positive correlation between the degree of improvement in working memory capacity and the post-pre difference in P2 amplitudes. Training-related changes in the vocal compensation, however, were not observed. There was no systematic change in either vocal or cortical responses for control participants. These findings provide evidence that working memory training impacts the cortical processing of feedback errors in vocal pitch regulation. This enhanced cortical processing may be the result of increased neural efficiency in the detection of pitch errors between the intended and actual feedback.
Working memory refers to the neural process by which information is stored and manipulated over a brief period of time1. Multiple lines of evidence have demonstrated strong associations between working memory and complex cognitive tasks such as fluid reasoning2,3, reading comprehension4,5, and attentional control6,7. Brain damage due to events such as stroke and traumatic brain injury, as well as developmental and psychiatric disorders such as intellectual development disorder and schizophrenia, can cause impairments in working memory that affect quality of life8,9. Given the importance of working memory in facilitating complex cognition, recent years have seen a surge in the development of training programs aimed at not only enhancing working memory capacity, but also producing effects that generalize to untrained tasks.
Previous research has shown that working memory capacity can be improved by training. For example, following computerized working memory training, children with attention-deficit/hyperactivity disorder and patients who have recently suffered a stroke showed a significant improvement in their working memory capacity10,11. Moreover, working memory training can lead to beneficial effects in tasks that were not trained. For example, in a study by Jaeggi et al.2, participants were trained on a dual n-back task where they simultaneously heard a series of single letters while they saw a square sequentially placed at different positions on a screen. Their task was to determine whether each of these stimuli was presented n items back in the series. After this training, participants exhibited improved performance on an untrained task that involved non-verbal reasoning. Similarly, Dahlin et al.12 reported that, after participating in five weeks of adaptive training that required participants to update single items (letters, digits, colors, and spatial locations) in their working memory, young adults significantly improved their performance on a non-trained 3-back working memory task.
1Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510080, P. R. China. 2Department of Biomedical Engineering, School of Engineering, Sun Yat-sen University, Guangzhou, China, 510006. 3Psychology Department and Laurier Centre for Cognitive Neuroscience, Wilfrid Laurier University, Waterloo, Ontario, N2L 3C5, Canada. *These authors contributed equally to this work. Correspondence and requests for materials should be addressed to H.L. (email: lhanjun@mail.sysu.edu.cn)
Received: 02 March 2015
Accepted: 15 October 2015
Published: 10 November 2015
In addition to its influence on the above-mentioned cognitive functions, working memory has been found to be involved in sensorimotor integration in speech processing. In the phonological loop system of Baddeley and Hitch’s working memory model1, verbal information is processed through the interaction between the passive phonological storage and the active articulatory rehearsal. It has been suggested that sensorimotor processes may assist with the representation and manipulation of information, and auditory working memory acts to translate the auditory information into a rehearsable sensorimotor code13. Previous research has established a neural network underlying auditory verbal working memory, which includes the dorsolateral prefrontal cortex, premotor cortex, superior temporal gyrus, and area Spt (Sylvian-parietal-temporal)14,15; activity in these structures was also observed for the auditory-motor processing of feedback errors during vocal pitch regulation16–18. In particular, area Spt has been suggested to act as an auditory-motor interface for working memory because of its involvement in the temporary storage of verbal information during working memory tasks19, and the mapping of perceived speech sounds onto articulatory representations20. It is thus proposed that the internal representation and manipulation of speech sounds stored in working memory rely on auditory-motor integration13.
There are theories that postulate that auditory-motor integration in speech processing relies on a subtractive comparison of the expected auditory feedback (corollary discharge), based on a copy of the motor command, with the actual vocal output (re-afference)21,22. In the case of vocal pitch regulation, errors heard in auditory feedback may be detected by comparing the auditory re-afference with a sensory memory trace of the vocal target, with perceived discrepancies eliciting a corrective motor command to regulate the vocal production23. Working memory functions to integrate incoming information with stored information in existing memory24, and it has been suggested that information related to the speech motor command and sensory re-afference can be stored in short-term memory within a feedback circuit and recalled when needed to adjust the motor activity25,26. On the other hand, activation of the brain’s error detection system is affected by characteristics of the working memory system27, and the error detection mechanisms can benefit from training of working memory, as evidenced by increased amplitudes of the error-related negativity (ERN) using electroencephalography (EEG) for participants who received a training of auditory, visual, and cross-modality working memory skills using the CogniFit Personal Coach Program28. Moreover, there is evidence that intensive training of auditory working memory can make the storage, access, updating, and rehearsal of auditory information more efficient29. Thus, training of working memory may increase neural efficiency in the detection of feedback errors during the online monitoring of self-produced vocalization, facilitating the cortical mechanisms of auditory-motor integration for voice control.
There is also evidence that training individuals to recognize speech sounds involves a variety of cognitive abilities such as working memory and acoustic discrimination, and that this perceptual training can drive changes to speech motor control. For example, in one recent speech perceptual learning study, Mandarin speakers were trained to associate pictures with words that were spoken with pitch patterns that resembled five Thai tones. Participants who received the training produced significantly smaller N1 responses and larger P2 responses to pitch errors in their voice auditory feedback, while control participants who did not undergo the training did not30. Given that improving perceptual abilities can directly impact working memory31, it is also possible that changes in sensorimotor integration in speech processing caused by perceptual training are related to improvements in working memory capabilities. To date, no study has directly investigated the interactions between working memory training and plasticity in auditory-motor integration.
Since the training of auditory working memory can lead to increased neural efficiency in the storage and manipulation of auditory information, it is conceivable that auditory working memory training may also lead to beneficial effects for voice control by facilitating the detection of vocal pitch errors in auditory feedback. To test this possibility, the present event-related potential (ERP) study recruited one trained group who participated in an adaptive training of auditory working memory using a digit span backwards (DSB) paradigm, and one control group who did not receive any training. At the pre-training and post-training sessions, both trained and control participants were instructed to sustain a vowel phonation while hearing their voice auditory feedback unexpectedly pitch-shifted. Behavioral and cortical responses to pitch perturbations were measured to examine the influence of working memory training on the auditory-motor control of vocal production. Because previous research has shown beneficial effects of working memory training on untrained tasks2,12, as well as changes in N1 and P2 responses to vocal pitch perturbations following speech perceptual learning30, we predicted that this adaptive DSB training would result in not only improvements in working memory capacity, but also improved auditory-motor processing of vocal pitch. Specifically, we expected larger vocal responses, smaller N1 responses (i.e., less negative), and larger P2 responses (i.e., more positive) to pitch feedback perturbations for the participants who underwent the training.
Methods
Subjects. Thirty-three right-handed and native-Mandarin speaking participants, who were students at Sun Yat-sen University in China, were recruited and randomly assigned to the trained or control group. Eighteen participants (8 male and 10 female; mean age = 22 years) were assigned to the trained group and underwent 10 days of working memory training. Fifteen participants (7 male and 8 female; mean age = 21 years) who did not undergo the training were assigned to the control group. The two groups did not differ in terms of age, gender, or education. Participants had no history of language, hearing, or neurological disorders. Participants were included in the experiments only if they had normal hearing (≤ 25 dB hearing level [HL] pure-tone thresholds from 250–4000 Hz). Written informed consent was obtained from all participants, and the research protocol was approved by the Institutional Review Board of The First Affiliated Hospital at Sun Yat-sen University of China in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki). The study was carried out in accordance with the approved guidelines.
Working memory training. The training scheme was designed to provide participants with practice in auditory working memory. Working memory training was carried out in a sound-attenuated booth. Training was implemented with E-Prime software (Psychological Testing Services, Pittsburgh, Pennsylvania) using a DSB paradigm, during which participants heard a string of digits and then were asked to type the digits in the reverse of the order in which they were presented (e.g., if the participant heard “1 2 3 4”, then the correct response was “4 3 2 1”). Participants were seated in front of a personal computer and heard the sounds through Sennheiser headphones (HMD 250). Listeners were allowed to choose a preferred loudness level for training stimuli at the outset of the experiment, and this level was maintained throughout training.
The duration of the experiment was 12 days for each participant. Both trained and control participants completed the pre- and post-test on the first and twelfth day, and the trained participants received training on the second through eleventh day (i.e. training days 1–10). The training paradigm was adaptive such that the length of digit span was adjusted based on the listener’s performance. On training day 1, the training session started with 2 digits. Each block consisted of 7 lists of digits, and 10 blocks were completed during each training day. Subjects had to reach 70% accuracy on a given span before the length of the spans was increased. Accuracy of lower than 50% caused the span lengths to decrease on the subsequent trial. Participants began each new training day with the span length they had attained on the last trial of the previous day.
Participants were asked to remember the digits one through nine, which were spoken by a native Mandarin speaker and recorded with a sampling frequency of 44100 Hz. The digit stimuli were then RMS matched to a 70 dB sound pressure level (SPL) pure tone at 1 kHz. Pilot testing revealed that listeners plateaued in performance when all training was presented in quiet, but not when noise was introduced. We therefore presented the digits in noise with varying signal-to-noise ratio (SNR) levels. Steady-state noise matching the long-term average spectrum of the recordings was created using MATLAB and mixed with the target recordings in Adobe Audition to create stimuli with SNRs of −5 and −10 dB. In addition, a set of 27 non-speech sounds was obtained from online databases, including twelve animal sounds (e.g., bird song, dog bark), two mechanical sounds (e.g., clock ticking, gun shot), five non-speech human sounds (e.g., cough, laugh), and eight musical sounds (e.g., violin, flute). These sounds were cropped to 1 s, RMS matched in amplitude to the calibration tone, and then mixed with all target digits at −5 and −10 dB SNR, resulting in a total of 468 stimulus combinations.
Figure 1. Schematic illustration showing the experimental setup. Trained participants received adaptive training using a digit span backwards paradigm over 10 consecutive days. The digits were presented in quiet, steady-state noise (−5 and −10 dB), and non-speech noise (−5 and −10 dB) conditions. Control participants did not receive training. Both groups participated in the altered auditory feedback (AAF) experiment before and after training, during which they produced vowel sounds while hearing their voice pitch feedback unexpectedly shifted through headphones.
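The stimulus paragraph above describes mixing digits with noise at a fixed SNR after RMS matching. The study did this in MATLAB and Adobe Audition; the following Python sketch only illustrates the underlying arithmetic, with hypothetical function names and synthetic signals standing in for the real recordings. It assumes the SNR is defined on RMS amplitude, i.e. SNR (dB) = 20 · log10(RMS_speech / RMS_noise).

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale the noise to the target SNR and add it to the speech.

    Returns (mixture, scaled_noise). The noise must be at least as long
    as the speech; extra noise samples are discarded.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    gain = rms(speech) / (rms(noise) * 10 ** (target_snr_db / 20))
    scaled = [n * gain for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled)]
    return mixture, scaled


def measured_snr_db(speech, noise):
    """SNR in dB computed from the RMS of the two signals."""
    return 20 * math.log10(rms(speech) / rms(noise))


# Synthetic stand-ins for a digit recording and a noise token.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 5 * i / 1000) for i in range(1000)]
noise = [rng.uniform(-1, 1) for _ in range(1000)]
mixture, scaled_noise = mix_at_snr(speech, noise, target_snr_db=-5.0)
achieved = measured_snr_db(speech, scaled_noise)
```

A negative target SNR makes the noise more intense than the speech, so lowering the SNR makes the listening condition harder.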
The training paradigm was implemented over 10 consecutive days (see Fig. 1). On training days one and two, the digits were presented in a quiet condition (i.e. without noise). On training days three and four, each digit was presented in steady-state noise at −5 dB SNR. On training days five and six, each digit was presented in non-speech distractors that were randomly chosen and presented at −5 dB SNR. Training days seven through ten replicated training days three through six but at −10 dB SNR.
Behavioral and neurophysiological assessment. The vocal production experiment was carried out using the altered auditory feedback (AAF) paradigm. Both trained and control participants were instructed to produce a steady vowel sound /u/ for about 5–6 s at their habitual and comfortable pitch and loudness level, during which their voice pitch feedback was unexpectedly shifted +50 or +200 cents (100 cents = one semitone) for 200 ms, and fed back to them instantaneously. The first pitch-shifted stimulus was presented with a delay of 500–1000 ms after vocal onset, and the succeeding stimuli occurred with an inter-stimulus interval of 700–900 ms. There were five pitch-shift stimuli per vocalization and participants produced 40 consecutive vocalizations. In total, 200 trials were collected including 100 trials for +50 cents and 100 trials for +200 cents.
The vocal production experiment was conducted in a sound-attenuated booth. Prior to the testing, the experimental system was calibrated to ensure that the intensity of voice feedback heard by the participants was 10 dB SPL higher than that of the subjects’ voice output to partially mask any air-borne and bone-conducted feedback32. The voice signals were transduced by a dynamic microphone (DM2200, Takstar Inc.) and amplified by a MOTU Ultralite Mk3 Firewire audio interface. The amplified signals were pitch-shifted through an Eventide Eclipse Harmonizer that was controlled by a custom-developed MIDI software program (Max/MSP, v.5.0 by Cycling 74) running on a Macintosh computer. This program was also used to manipulate parameters including the magnitude, direction, and duration of the pitch shifts and generate transistor-transistor logic (TTL) control pulses to signal the onset and offset of the pitch shifts. Finally, the pitch-shifted voices were amplified by an ICON NeoAmp headphone amplifier and fed back to the participants through insert earphones (ER1-14 A, Etymotic Research Inc.). The voice, feedback, and TTL control pulses were digitized at 10 kHz by a PowerLab A/D converter (ML880, AD Instruments), recorded using LabChart software (v.7.0 by AD Instruments), and saved onto another Macintosh computer.
EEG signals were recorded using a 64-electrode Geodesic Sensor Net, amplified by a Net Amps 300 amplifier (Electrical Geodesics Inc.), digitized at a sampling frequency of 1 kHz, and saved onto a Mac Pro computer. A DIN synch cable was used to send the TTL control pulses generated by Max/MSP to the EEG recording system for the measurement of stimulus-evoked potentials. During the online recording, EEG signals across all channels were referenced to the vertex (Cz). Impedances of individual sensors were adjusted and maintained below 50 kΩ33.
Data analyses. Vocal Responses. The measurement of vocal responses to pitch-shifted voice auditory feedback was implemented using a custom-developed IGOR PRO program (v.6.0, Wavemetrics Inc.) using event-related averaging techniques34. Voice F0 contours in Hertz were extracted from voice signals using Praat35, and converted to the cent scale using the formula: cents = 100 × (12 × log2(F0/reference)) [reference = 195.997 Hz (G4)]. Voice F0 contours were then segmented into epochs from 200 ms before to 700 ms after the stimulus onset. All individual trials were visually inspected such that those trials containing vocal interruption or signal processing errors were excluded from analyses. Finally, artifact-free trials for each condition were averaged to generate an overall response. Acceptable responses exceeded a value of two standard deviations (SDs) of the pre-stimulus mean beginning at least 60 ms after the onset of the pitch shift and lasting at least 50 ms. The response magnitude was measured by subtracting the peak value of voice contour following the response onset from the pre-stimulus mean (−200 to 0 ms). The response latency was determined as the time of voice F0 departure from the pre-stimulus mean by more than 2 SDs.
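The cent conversion and the response criterion described above can be illustrated with a short Python sketch. It is not the authors' IGOR PRO program: the function names, the 10 ms sampling step, and the synthetic contour are all assumptions made for illustration. The baseline is taken as the samples before stimulus onset (time < 0 ms).

```python
import math
import statistics

REFERENCE_HZ = 195.997  # reference frequency given in the text


def hz_to_cents(f0_hz, reference_hz=REFERENCE_HZ):
    """Convert F0 in Hz to cents: 100 * (12 * log2(F0 / reference))."""
    return 100 * (12 * math.log2(f0_hz / reference_hz))


def response_onset(cents, times_ms, n_sd=2.0, min_latency_ms=60,
                   min_duration_ms=50):
    """Return the latency (ms) of the first sustained departure, or None.

    A departure counts when the contour differs from the pre-stimulus mean
    by more than n_sd baseline SDs, starts no earlier than min_latency_ms,
    and stays beyond the threshold for at least min_duration_ms.
    """
    baseline = [c for c, t in zip(cents, times_ms) if t < 0]
    mean = statistics.mean(baseline)
    threshold = n_sd * statistics.stdev(baseline)
    run_start = None
    for c, t in zip(cents, times_ms):
        if t >= min_latency_ms and abs(c - mean) > threshold:
            if run_start is None:
                run_start = t
            if t - run_start >= min_duration_ms:
                return run_start
        else:
            run_start = None
    return None


# Synthetic averaged contour sampled every 10 ms (hypothetical data):
# small baseline jitter, then a 15-cent downward response from 100 to 300 ms.
times_ms = list(range(-200, 700, 10))
cents = []
for i, t in enumerate(times_ms):
    if t < 0:
        cents.append(1.0 if i % 2 == 0 else -1.0)
    elif 100 <= t < 300:
        cents.append(-15.0)
    else:
        cents.append(0.0)

latency_ms = response_onset(cents, times_ms)
octave_cents = hz_to_cents(2 * REFERENCE_HZ)
```

Because 100 × 12 = 1200, the formula assigns 1200 cents to one octave above the reference, which matches the statement that 100 cents equal one semitone.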
Evoked Potentials. The EEG signals were analyzed off-line using NetStation software (v.4.5, Electrical Geodesics Inc.). All channels were digitally band-pass filtered with cut-off frequencies of 1 to 20 Hz. Individual trials were segmented into epochs with a window of −200 ms to +500 ms relative to the onset of the pitch shift. Segmented trials contaminated by excessive muscular activity, eye blinks, or eye movements were assessed using the Artifact Detection Toolbox in NetStation and excluded from further analyses. An additional visual inspection was also performed on individual trials to ensure that artifacts were being adequately rejected. Individual electrodes were excluded from further analyses if they contained artifacts in more than 20% of the segmented trials. Finally, artifact-free segments were averaged, re-referenced to the average of the electrodes on each mastoid, and baseline-corrected across all conditions. Given that cortical responses to pitch-shifted voice auditory feedback are most pronounced in the N1-P2 complex36,37, the amplitudes and latencies of N1 and P2 were measured as the negative and positive peaks in the time windows of 80–180 ms and 160–280 ms after the onset of the pitch shift and submitted to statistical analyses.
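The peak-picking step at the end of this paragraph can be sketched as follows. The study used NetStation for this; the Python below is only a minimal illustration with hypothetical function names, and the waveform is a synthetic sum of two Gaussian bumps rather than real ERP data. It assumes N1 is the most negative sample in 80–180 ms and P2 the most positive sample in 160–280 ms of an averaged, baseline-corrected waveform sampled at 1 kHz.

```python
import math


def peak_in_window(erp_uv, times_ms, start_ms, end_ms, polarity):
    """Return (amplitude, latency_ms) of the extreme sample in a window.

    polarity is "negative" for the most negative sample and "positive"
    for the most positive one. Window bounds are inclusive.
    """
    window = [(v, t) for v, t in zip(erp_uv, times_ms)
              if start_ms <= t <= end_ms]
    pick = min if polarity == "negative" else max
    return pick(window, key=lambda pair: pair[0])


def measure_n1_p2(erp_uv, times_ms):
    """Measure N1 (80-180 ms) and P2 (160-280 ms) peaks."""
    return {
        "N1": peak_in_window(erp_uv, times_ms, 80, 180, "negative"),
        "P2": peak_in_window(erp_uv, times_ms, 160, 280, "positive"),
    }


# Synthetic averaged waveform at 1 kHz, -200 to +500 ms (hypothetical):
# a negative deflection centred at 120 ms and a positive one at 220 ms.
times_ms = list(range(-200, 501))
erp_uv = [
    -2.0 * math.exp(-((t - 120) / 20) ** 2)
    + 3.0 * math.exp(-((t - 220) / 25) ** 2)
    for t in times_ms
]
peaks = measure_n1_p2(erp_uv, times_ms)
n1_amplitude, n1_latency = peaks["N1"]
p2_amplitude, p2_latency = peaks["P2"]
```

The two windows overlap between 160 and 180 ms, so the polarity constraint is what keeps the N1 and P2 measurements distinct.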
The DSB scores and the magnitudes and latencies of vocal and neurophysiological responses were subjected to repeated-measures mixed analyses of variance (ANOVAs) using SPSS (v.16.0). Testing session (pre- vs. post-training), stimulus magnitude (+50 and +200 cents), and electrode site (FC1, FC3, FCz, FC2, FC4, C1, C3, Cz, C2, C4, P1, P3, Pz, P2, P4) were chosen as within-subject factors, while group (trained vs. control group) was chosen as a between-subject factor. Appropriate subsidiary RM-ANOVAs were calculated if higher-order interactions reached significance. The Greenhouse-Geisser correction was used to adjust probability values for multiple degrees of freedom when violations of the sphericity assumption occurred.
Results
DSB measure. Fig. 2 shows participants’ performance indexed by the DSB scores throughout training. As can be seen, the DSB score was below 10 (ranging from 7.4 to 9.2) on training days one through three. On training days four through ten, the DSB scores increased to as high as 11.7. A one-way ANOVA revealed a significant main effect of testing day (F(9, 153) = 22.288, p < 0.001), and post-hoc Bonferroni comparisons showed that the DSB scores on training days one through three were significantly lower than those on training days seven through ten (p < 0.03). In summary, the significant increase in the DSB measure indicates improvements in working memory capacity following training.
Vocal response. One two-way mixed ANOVA conducted on the vocal response magnitude revealed no significant main effects of testing session (F(1, 31) = 0.045, p = 0.834) (pre-training: 14.7 ± 6.3 cents; post-training: 13.6 ± 5.8 cents), stimulus magnitude (F(1, 31) = 0.963, p = 0.334) (+50 cents: 14.0 ± 5.6 cents; +200 cents: 14.2 ± 6.4 cents), or group (F(1, 31) = 1.055, p = 0.312) (trained: 13.6 ± 5.6 cents; control: 14.7 ± 6.5 cents). Similarly, response latencies did not differ as a function of testing session (F(1, 31) = 0.425, p = 0.519) (pre-training: 77 ± 24 ms; post-training: 75 ± 18 ms), stimulus magnitude (F(1, 31) = 0.234, p = 0.632) (+50 cents: 75 ± 19 ms; +200 cents: 77 ± 23 ms), or group (F(1, 31) = 0.054, p = 0.817) (trained: 76 ± 21 ms; control: 77 ± 21 ms). In addition, no significant interactions between these factors were observed for response magnitude or latency (p > 0.05).
ERP findings. Figs 3 and 4 show the grand-averaged ERP waveforms and topographical distributions of N1 and P2 amplitudes in response to pitch shifts of +50 and +200 cents across all control and trained participants before (blue solid lines) and after (red solid lines) working memory training. As can be seen, the cortical responses to pitch shifts produced by the control participants during the pre-training session did not differ from the responses produced during the post-training session. By contrast, following training N1 amplitudes decreased and P2 amplitudes increased in response to both the +50 and +200 cents pitch shifts heard by the trained participants, and these changes were primarily observed in the frontocentral electrodes.
One four-way mixed ANOVA conducted on the N1 amplitudes revealed significant main effects of testing session (F(1, 31) = 4.846, p = 0.035) and electrode site (F(14, 434) = 13.265, p < 0.001), whereas the main effects of stimulus magnitude (F(1, 31) = 3.209, p = 0.083) and group (F(1, 31) = 1.134, p = 0.295) failed to reach significance. There was also a significant testing session × electrode site × group interaction (F(14, 434) = 2.705, p = 0.038). Follow-up three-way mixed ANOVAs conducted on the data from the trained group revealed a significant decrease in N1 amplitudes (less negative) following training (F(1, 17) = 5.039, p = 0.038), whereas there was no systematic change in N1 amplitudes as a function of testing session for the control group (F(1, 14) = 0.650, p = 0.434) (see Fig. 5A).
Figure 2. The mean scores of the digit span backwards task across the 10 training days. The error bars represent the standard deviations of the mean scores.
As for N1 latency, pitch shifts of +50 cents elicited significantly longer N1 latencies than pitch shifts of +200 cents (143 ± 25 vs. 122 ± 20 ms) (F(1, 20) = 81.329, p < 0.001). There was also a significant main effect of electrode site (F(14, 434) = 3.785, p = 0.008), which was primarily caused by longer N1 latencies at electrode C4 as compared to Pz (p = 0.010) and P2 (p = 0.004). The main effects of group (F(1, 31) = 1.014, p = 0.322) and testing session (F(1, 31) = 1.486, p = 0.232), however, failed to reach significance. No significant interactive effects between these factors were found either (p > 0.05).
One four-way mixed ANOVA conducted on the P2 amplitudes revealed a significant main effect of stimulus magnitude (F(1, 31) = 20.102, p < 0.001), indicating that pitch shifts of +50 cents elicited significantly smaller P2 amplitudes than pitch shifts of +200 cents. There was also a significant main effect of electrode site (F(14, 434) = 73.727, p < 0.001), which was primarily driven by larger P2 amplitudes at frontocentral electrodes than parietal electrodes (p < 0.05) (see Figs 3 and 4). The main effect of testing session (F(1, 31) = 15.256, p < 0.001) reached significance, whereas there was no main effect of group (F(1, 31) = 2.928, p = 0.097). A significant interaction, however, was found between testing session and group (F(1, 31) = 4.917, p = 0.034). A follow-up three-way mixed ANOVA conducted on the data from the trained group revealed a significant main effect of testing session (F(1, 17) = 17.828, p = 0.001), indicating that P2 amplitudes significantly increased following training (see Fig. 5B). By contrast, P2 amplitudes did not differ as a function of testing session (F(1, 14) = 1.614, p = 0.225) for the control group.
Figure 3. Grand-averaged ERP waveforms (FCz, Cz, Pz) and topographical distributions of N1 and P2 amplitudes in response to pitch shifts of +50 cents across all control (A) and trained participants (B) before (blue solid lines) and after (red solid lines) training.
Figure 4. Grand-averaged ERP waveforms (FCz, Cz, Pz) and topographical distributions of N1 and P2 amplitudes in response to pitch shifts of +200 cents across all control (A) and trained participants (B) before (blue solid lines) and after (red solid lines) training.
Regarding P2 latency, there was a significant main effect of stimulus magnitude (F(1, 31) = 97.498, p < 0.001) as reflected by significantly shorter P2 latencies elicited by pitch shifts of +200 cents, as compared to pitch shifts of +50 cents (225 ± 24 vs. 252 ± 25 ms). However, the main effects of testing session (F(1, 31) = 3.774, p = 0.061), electrode site (F(14, 434) = 1.817, p = 0.148), and group (F(1, 31) = 0.037, p = 0.849) failed to reach significance. There also were no significant interactions between these factors (p > 0.05).
In addition, regression analyses were performed to examine the relationship between working memory training and cortical responses to pitch-shifted voice auditory feedback. The percentage change of DSB scores (i.e. the post-pre difference in DSB scores divided by the pre-training DSB scores) is plotted against the post-pre difference for the mean N1 and P2 responses in Fig. 6. The results showed a significant positive correlation between the post-pre difference for the mean P2 responses and the percentage change in DSB scores (r = 0.655, p = 0.004), indicating that the degree of improvement in working memory capacity was predictive of the training-related enhancement of cortical responses to pitch feedback perturbations. The post-pre difference for the mean N1 responses, however, was not significantly correlated with the percentage change in DSB scores (r = 0.162, p = 0.536).
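The two quantities entering this analysis, the percentage change in DSB score and the Pearson correlation with the post-pre ERP difference, can be sketched as below. The analysis in the study was run in SPSS; this Python version is only illustrative. The function names are hypothetical and the four data points are invented placeholders, not values from the study.

```python
import math


def percent_change(pre, post):
    """Post-pre difference divided by the pre-training score, in percent."""
    return (post - pre) / pre * 100


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented placeholder data for four hypothetical participants.
dsb_pre = [8, 9, 7, 10]
dsb_post = [10, 10, 9, 11]
p2_pre_uv = [2.0, 2.5, 1.8, 2.2]
p2_post_uv = [3.2, 2.9, 3.3, 2.5]

dsb_change = [percent_change(a, b) for a, b in zip(dsb_pre, dsb_post)]
p2_diff = [post - pre for pre, post in zip(p2_pre_uv, p2_post_uv)]
r = pearson_r(dsb_change, p2_diff)
```

Expressing the DSB gain relative to the pre-training score, rather than as a raw difference, keeps participants with different starting spans on a comparable scale.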
Figure 5. T-bar plots (means and standard errors) of N1 (A) and P2 (B) amplitude as a function of testing session and group. The black and the blank bars denote the cortical responses at the pre-training and the post-training session, respectively. The asterisks indicate significant differences between conditions.
Figure 6. The percentage change of DSB scores is plotted against the post-pre difference for the mean N1 (A) and P2 (B) responses to pitch feedback perturbations. There was a significant positive correlation between the post-pre difference for the mean P2 responses and the percentage change in DSB scores (r = 0.655, p = 0.004), whereas the post-pre difference for the mean N1 responses was not significantly correlated with the percentage change in DSB scores (r = 0.162, p = 0.536).
Discussion
The present study investigated whether working memory training can lead to improved processing of feedback errors during vocal pitch regulation. Before and after an adaptive DSB training procedure, participants’ vocal and cortical responses to errors detected in their auditory feedback were measured. As hypothesized, in addition to significant improvements in participants’ working memory capacity (i.e. the DSB score), we observed an impact of working memory training on the cortical processing of vocal pitch errors. The trained participants’ N1 responses to pitch errors in voice auditory feedback decreased, whereas the control participants’ N1 responses in the first and second AAF sessions were not significantly different. Conversely, P2 responses increased after working memory training, but remained the same across the vocal production tasks for control participants. Moreover, the percentage change in DSB scores was significantly correlated with the post-pre difference for the mean P2 responses to pitch feedback perturbations, indicating that improvement in working memory capacity was predictive of training-related enhancement of cortical responses to voice feedback errors. Neither the trained participants nor the control participants, however, showed a systematic change in their vocal compensation magnitude or latency across the pre- and post-training sessions. Taken together, these findings demonstrate that training-related improvements in working memory capacities generalize to other tasks, and modify the cortical processing of mismatches between intended and actual auditory feedback.
Training-related eects on the N1-P2 complex. A primary nding in this study is that the train-
ing of working memory led to decreased N1 responses when participants heard pitch-shied auditory
feedback. is nding is in line with other studies that showed decreased N1 amplitudes in response
to pitch feedback perturbations aer speech perceptual learning30, and decreased brain activity in the
frontoparietal regions aer working memory training29,38,39. e N1 component is generally thought to
reect pre-attentive detection of a mismatch between incoming auditory stimuli and the memory trace
of previous sensory input into the auditory system40. We hypothesize that increased neural eciency41
accounts for the decreased N1 responses to pitch errors in voice auditory feedback that we observed.
is hypothesis is supported by work showing increased eciency in the processing of auditory infor-
mation following training of auditory working memory as reected by decreases in brain activation. For
example, in a recent study an adaptive n-back training with tonal sequences led to not only improved
performance on an auditory 2-back task, but also decreased activation in the right inferior frontal gyrus
and posterior parietal regions29. us, following intensive training of auditory working memory, fewer
neural resources may be required for encoding the same level of auditory information such that the
integration of incoming information with information stored in memory becomes more ecient. In the
case of voice control then, the detection and comparison of pitch errors in voice auditory feedback may
become more ecient, and this increased eciency may be reected by the signicantly decreased N1
responses observed aer training in the present study.
In addition to the changes in N1 responses, enhancement of P2 responses to pitch-shifted voice
auditory feedback was also observed for trained participants. Similarly in a previous study, P2 responses
to pitch feedback perturbations were significantly increased after a sound-to-word learning of lexical
tones by associating speech stimuli with pictures of objects30. The present finding is also in line with
other findings that showed increased theta power42 and ERN28 using EEG, and increased blood
oxygenation level dependent (BOLD) responses measured with fMRI43–45 following working memory training.
Since P2 responses to pitch feedback perturbations are associated with relatively later stages of
cortical processing, it has been suggested that the P2 component may reflect functional mechanisms that
demand higher-level cognitive processes of auditory-motor integration46. In particular, there is evidence
that P2 receives contributions from the planum temporale47, and one function of this region is to
support sensorimotor integration in speech processing22. For example, the posterior region of the planum
temporale, area Spt, has been found to function as an interface between auditory and motor
representations20,48. Taken together, increased P2 responses to pitch feedback errors may index the training-induced
enhancement of the coordinated neural activity for the interaction between the auditory and motor
systems in voice control.
Interestingly, the post-pre dierence in the mean P2 responses was positively correlated with the per-
centage change in DSB scores, suggesting a relationship between the degree of improvement in working
memory capacity and training-related enhancement of cortical responses to pitch feedback perturba-
tions. is nding provides further evidence that working memory in the phonological loop is depend-
ent on the operation of an auditory-motor interface system in the posterior planum temporale19. Also
this correlation lends support to the idea that the P2 component reects higher-level cognitive processing
of feedback errors during the auditory-motor integration for voice control.
Since it has been previously shown that P2 responses become larger when participants attend to
pitch feedback perturbations as compared to when they do not49,50, and that attentional control can be
improved by working memory training51, one may argue that the training-related changes we observed in
the N1-P2 complex were not the result of improved working memory capacities per se, but the result of
an increased capacity for attentional control. That is, after working memory training, our trained
participants may have paid more attention to their voice auditory feedback, and this increased attention caused
the training-related changes in the N1-P2 complex. However, in addition to enhancing P2 responses,
attending to pitch feedback perturbations has been shown to increase N1 amplitudes and increase the
size of vocal compensations, as compared to when they are not attended49,52. In the present study, we
observed decreased N1 responses and vocal compensation remained unchanged after training. Moreover,
the correlation observed between the improvements in working memory capacity and the enhanced P2
responses to pitch perturbations suggests that the training gains in auditory-motor integration for voice
control are most likely due to increased working memory capacity.
The lack of training eect on the vocal compensation. Unlike the N1-P2 complex, the vocal
compensations for the pitch errors heard in voice auditory feedback were not aected by the working
memory training. Similarly, Chen et al.30 did not nd a systematic change in the vocal responses to pitch
feedback perturbations for participants who underwent training to associate unfamiliar speech sounds
with words. ese results are also in line with studies on visuomotor adaptation, which showed that
improving working memory capacity using dual n-back training does not alter the rates of visuomotor
adaptation53, although individual dierences in spatial working memory capacity are predictive of the
rate of early visuomotor adaptation54. In terms of the present study, there are a number of potential
explanations for the lack of transfer to vocal compensation responses. For example, it is possible that
working memory training may only benet the detection of pitch errors in auditory feedback. It may also
be that if participants were presented with feedback perturbations that were much smaller in magnitude
than those presented in the present study, and near the threshold for detection, dierences between a
trained and untrained group of participants would emerge. In addition, there are several other factors
that contribute to the underlying mechanisms engaged in the modulation of vocal compensation for
voice feedback errors, such as stimulus features55,56 and task demands57,58. Future studies that manipu-
late these factors could provide a more thorough characterization of transfer eect of working memory
training to auditory-motor integration.
Limitation. One limitation of the present study was the use of a passive control group that was used
to control for pre-/post-test effects, but did not receive any form of training between these tests. Thus,
confounding factors such as motivational and psychological effects, which have been shown to
influence the effectiveness of working memory training59, were not controlled using the present experimental
design. Therefore, it is possible that transfer effects we observed are not solely a result of training, but
may in part be related to different levels in motivation and/or practice. Positive transfer effects of
working memory training will be more convincing when compared against an active control group with a
cognitively engaging alternative intervention. Another limitation is the inclusion of a perception-in-noise
component, which makes it difficult to determine how much of the transfer effect was a result of the
working memory training gains or an improved ability to discriminate the acoustic stimuli in noise. In
future studies, the present training paradigm should be compared to an adaptive DSB training without
noise to determine the contributions of perception in noise vs. working memory capacity to the transfer
effects we observed.
Conclusions
Overall, we found neurophysiological evidence that training of auditory working memory impacts the
auditory-motor processing of vocal pitch regulation at the cortical level. This includes decreased N1 and
increased P2 responses to pitch errors in voice auditory feedback following training. Our results extend
previous findings regarding the transfer of training gains in working memory, demonstrating that the
cortical processing of vocal pitch regulation can benefit from working memory training.
References
1. Baddeley, A. Woring memory: looing bac and looing forward. Nat. ev. Neurosci. 4, 829–39 (2003).
2. Jaeggi, S. M., Buschuehl, M., Jonides, J. & Perrig, W. J. Improving uid intelligence with training on woring memory. Proc.
Natl. Acad. Sci. USA. 105, 6829–33 (2008).
3. Borella, E., Carretti, B. & Mammarella, I. C. Do woring memory and susceptibility to interference predict individual dierences
in uid intelligence? Eur. J. Cogn. Psychol. 18, 51–69 (2006).
4. DeDe, G., Caplan, D., emtes, . & Waters, G. e relationship between age, verbal woring memory, and language
comprehension. Psychol. Aging 19, 601–16 (2004).
5. De Beni, ., Borella, E. & Carretti, B. eading comprehension in aging: the role of woring memory and metacomprehension.
Neuropsychol. Dev. Cogn. B Aging Neuropsychol. Cogn. 14, 189–212 (2007).
6. Bherer, L. et al. Testing the limits of cognitive plasticity in older adults: application to attentional control. Acta Psychol. (Amst)
123, 261–78 (2006).
7. ramer, A. F., Hahn, S. & Gopher, D. Tas coordination and aging: explorations of executive control processes in the tas
switching paradigm. Acta Psychol. (Amst) 101, 339–78 (1999).
8. obertson, I. H. & Murre, J. M. ehabilitation of brain damage: brain plasticity and principles of guided recovery. Psychol. Bull.
125, 544–75 (1999).
9. Goldman-aic, P. S. Woring memory dysfunction in schizophrenia. J. Neuropsychiatry Clin. Neurosci. 6, 348–57 (1994).
10. lingberg, T. et al. Computerized training of woring memory in children with ADHD—a randomized, controlled trial. J. Am.
Acad. Child. Psy. 44, 177–86 (2005).
11. Westerberg, H. et al. Computerized woring memory training aer stroe—a pilot study. Brain Inj. 21, 21–9 (2007).
12. Dahlin, E., Nyberg, L., Bacman, L. & Neely, A. S. Plasticity of executive functioning in young and older adults: immediate
training gains, transfer, and long-term maintenance. Psychol. Aging 23, 720–30 (2008).
13. Schulze, . & oelsch, S. Woring memory for speech and music. Ann. N. Y. Acad. Sci. 1252, 229–36 (2012).
14. Buchsbaum, B. ., Olsen, . ., och, P. & Berman, . F. Human dorsal and ventral auditory streams subserve rehearsal-based
and echoic processes during verbal woring memory. Neuron 48, 687–97 (2005).
15. oelsch, S. et al. Functional architecture of verbal and tonal woring memory: an FMI study. Hum. Brain Mapp. 30, 859–73
(2009).
16. Behroozmand, . et al. Sensory-motor networs involved in speech production and motor control: an fMI study. Neuroimage
109, 418–28 (2015).
17. Parinson, A. L. et al. Understanding the neural mechanisms involved in sensory control of voice production. Neuroimage 61,
314–322 (2012).
18. Zarate, J. M. & Zatorre, . J. Experience-dependent neural substrates involved in vocal pitch regulation during singing.
Neuroimage 40, 1871–87 (2008).
19. Buchsbaum, B. . & D’Esposito, M. e search for the phonological store: from loop to convolution. J. Cogn. Neurosci. 20, 762–78
(2008).
20. Hico, G., Buchsbaum, B., Humphries, C. & Muuler, T. Auditory-motor interaction revealed by fMI: speech, music, and
woring memory in area Spt. J. Cogn. Neurosci. 15, 673–82 (2003).
21. Heins-Maldonado, T. H., Mathalon, D. H., Gray, M. & Ford, J. M. Fine-tuning of auditory cortex during speech production.
Psychophysiology 42, 180–190 (2005).
22. Hico, G., Houde, J. F. & ong, F. Sensorimotor integration in speech processing: computational basis and neural organization.
Neuron 69, 407–22 (2011).
23. Hawco, C. S. & Jones, J. A. Control of vocalization at utterance onset and mid-utterance: Dierent mechanisms for dierent goals.
Brain es. 1276, 131–139 (2009).
24. Brumbac, C. ., Low, . A., Gratton, G. & Fabiani, M. Putting things into perspective: individual dierences in woring-
memory span and the integration of information. Exp. psychol. 52, 21–30 (2005).
25. Troyer, T. W. & Doupe, A. J. An associational model of birdsong sensorimotor learning I. Eerence copy and the learning of song
syllables. J. Neurophysiol. 84, 1204–23 (2000).
26. Gonzalez, C. C. & Bure, M. . e brain uses eerence copy information to optimise spatial memory. Exp. Brain es. 224,
189–97 (2013).
27. Hochman, E. Y. & Meiran, N. Central interference in error processing. Mem. Cognit. 33, 635–43 (2005).
28. Horowitz-raus, T. & Breznitz, Z. Can the error detection mechanism benet from training the woring memory? A comparison
between dyslexics and controls—an EP study. PLoS ONE 4, e7141 (2009).
29. Schneiders, J. A. et al. e impact of auditory woring memory training on the fronto-parietal woring memory networ. Front.
Hum. Neurosci. 6, 173 (2012).
30. Chen, Z. et al. Transfer eect of speech-sound learning on auditory-motor processing of perceived vocal pitch errors. Sci. ep.
5, 13134 (2015).
31. Berry, A. S. et al. e inuence of perceptual training on woring memory in older adults. PLoS ONE 5, e11537 (2010).
32. Behroozmand, ., arvelis, L., Liu, H. & Larson, C. . Vocalization-induced enhancement of the auditory cortex responsiveness
during voice F0 feedbac perturbation. Clin. Neurophysiol. 120, 1303–1312 (2009).
33. Ferree, T. C., Luu, P., ussell, G. S. & Tucer, D. M. Scalp electrode impedance, infection ris, and EEG data quality. Clin.
Neurophysiol. 112, 536–44 (2001).
34. Li, W. et al. Neurophysiological evidence of dierential mechanisms involved in producing opposing and following responses to
altered auditory feedbac. Clin. Neurophysiol. 124, 2161–2171 (2013).
35. Boersma, P. Praat, a system for doing phonetics by computer. Glot International 5, 341–345 (2001).
36. Behroozmand, ., Liu, H. & Larson, C. . Time-dependent neural processing of auditory feedbac during voice pitch error
detection. J. Cogn. Neurosci. 23, 1205–1217 (2011).
37. Chen, Z. et al. EP correlates of language-specic processing of auditory pitch feedbac during self-vocalization. Brain Lang.
121, 25–34 (2012).
38. Sayala, S., Sala, J. B. & Courtney, S. M. Increased neural eciency with repeated performance of a woring memory tas is
information-type dependent. Cereb. Cortex. 16, 609–17 (2006).
39. Dahlin, E., Neely, A. S., Larsson, A., Bacman, L. & Nyberg, L. Transfer of learning aer updating training mediated by the
striatum. Science 320, 1510–2 (2008).
40. Näätänen, . & Picton, T. e N1 wave of the human electric and magnetic response to sound: a review and an analysis of the
component structure. Psychophysiology 24, 375–425 (1987).
41. elly, C., Foxe, J. J. & Garavan, H. Patterns of normal human brain plasticity aer practice and their implications for
neurorehabilitation. Arch. Phys. Med. ehab. 87, S20–9 (2006).
42. Langer, N., von Bastian, C. C., Wirz, H., Oberauer, . & Jance, L. e eects of woring memory training on functional brain
networ eciency. Cortex 49, 2424–38 (2013).
43. Olesen, P. J., Westerberg, H. & lingberg, T. Increased prefrontal and parietal activity aer training of woring memory. Nat.
Neurosci. 7, 75–9 (2004).
44. Wexler, B. E., Anderson, M., Fulbright, . . & Gore, J. C. Preliminary evidence of improved verbal woring memory performance
and normalization of tas-related frontal lobe activation in schizophrenia following cognitive exercises. Am. J. Psychiat. 157,
1694–7 (2000).
45. Moore, C. D., Cohen, M. X. & anganath, C. Neural mechanisms of expert sills in visual woring memory. J. Neurosci. 26,
11187–96 (2006).
46. Behroozmand, ., Ibrahim, N., orzyuov, O., obin, D. A. & Larson, C. . Le-hemisphere activation is associated with
enhanced vocal pitch error detection in musicians with absolute pitch. Brain Cogn. 84, 97–108 (2014).
47. Godey, B., Schwartz, D., de Graaf, J. B., Chauvel, P. & Liegeois-Chauvel, C. Neuromagnetic source localization of auditory evoed
elds and intracerebral evoed potentials: a comparison of data in the same patients. Clin. Neurophysiol. 112, 1850–9 (2001).
48. Hico, G. & Poeppel, D. e cortical organization of speech processing. Nat. ev. Neurosci. 8, 393–402 (2007).
49. Liu, Y. et al. Selective and divided attention modulates auditory-vocal integration in the processing of pitch feedbac errors. Eur.
J. Neurosci. 42, 1895–1904 (2015).
50. Hu, H. et al. Attention modulates cortical processing of pitch feedbac errors in voice control. Sci. ep. 5, 7812 (2015).
51. Salminen, T., Strobach, T. & Schubert, T. On the impacts of woring memory training on executive functioning. Front. Hum.
Neurosci. 6, 166 (2012).
52. Tumber, A. ., Scheerer, N. E. & Jones, J. A. Attentional Demands Inuence Vocal Compensations to Pitch Errors Heard in
Auditory Feedbac. PLoS ONE 9, e109968 (2014).
53. Anguera, J. A. et al. e eects of woring memory resource depletion and training on sensorimotor adaptation. Behav. Brain.
es. 228, 107–15 (2012).
54. Anguera, J. A., euter-Lorenz, P. A., Willingham, D. T. & Seidler, . D. Contributions of spatial woring memory to visuomotor
learning. J. Cogn. Neurosci. 22, 1917–30 (2010).
55. Liu, H. & Larson, C. . Eects of perturbation magnitude and voice F0 level on the pitch-shi reex. J. Acoust. Soc. Am. 122,
3671–3677 (2007).
56. Larson, C. ., Sun, J. & Hain, T. C. Eects of simultaneous perturbations of voice pitch and loudness feedbac on voice F0 and
amplitude control. J. Acoust. Soc. Am. 121, 2862–2872 (2007).
57. Nate, U., Donath, T. M. & alveram, . T. Control of voice fundamental frequency in speaing versus singing. J. Acoust. Soc.
Am. 113, 1587–1593 (2003).
58. Liu, H., Xu, Y. & Larson, C. . Attenuation of vocal responses to pitch perturbations during Mandarin speech. J. Acoust. Soc.
Am. 125, 2299–2306 (2009).
59. von Bastian, C. C. & Oberauer, . Eects and mechanisms of woring memory training: a review. Psychol. es. 78, 803–20
(2014).
Acknowledgements
The authors would like to thank Drs. Patrick C. M. Wong and Erin M. Ingvalson for their help with data
collection. This study was funded by grants from National Natural Science Foundation of China (Nos.
31070990, 31371135, 81301675 and 81472154), Guangdong Natural Science Funds for Distinguished
Young Scholar (No. S2013050014470), and the Fundamental Research Funds for the Central Universities
(No. 13ykzd05).
Author Contributions
W.L. and H.L. designed the project; W.L., Z.G. and X.H. performed the experiment and analyzed the
data; W.L., J.J., X.C., P.L., S.C. and H.L. interpreted the results; W.L., J.J. and H.L. wrote the manuscript;
all authors read and approved the final manuscript.
Additional Information
Competing nancial interests: e authors declare no competing nancial interests.
How to cite this article: Li, W. et al. Training of Working Memory Impacts Neural Processing of Vocal
Pitch Regulation. Sci. Rep. 5, 16562; doi: 10.1038/srep16562 (2015).
This work is licensed under a Creative Commons Attribution 4.0 International License. The
images or other third party material in this article are included in the article’s Creative
Commons license, unless indicated otherwise in the credit line; if the material is not included under the
Creative Commons license, users will need to obtain permission from the license holder to reproduce
the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
... This suggests that when less attention was available for the pitch-shift task, individuals were less able to utilize the auditory feedback to change speech production. Additionally, a study by Li and colleagues [23] found that working memory training modified brain activation during a pitch-shift task. In this study participants trained on an adaptive backwards digit span task for 12 days, and brain activity in response to auditory stimuli (event related potentials or ERPs) were measured before and after training during a standard pitch-shift task. ...
... Specifically, both the N1 and P2 peaks occurred earlier post-training compared to pre-training, and the P2 peak magnitude was enhanced post-training compared to pre-training. These results are consistent with the findings of Li and colleagues [23] who report an increase in P2 magnitude post-training. While Li and colleagues [23] found a decrease in the N1 amplitude, we did not find changes to the N1 following training in our study, potentially due to differences in the training task. ...
... These results are consistent with the findings of Li and colleagues [23] who report an increase in P2 magnitude post-training. While Li and colleagues [23] found a decrease in the N1 amplitude, we did not find changes to the N1 following training in our study, potentially due to differences in the training task. Other research has shown a N1 suppression (vocalization compared to listening) for pitch-shifts that occur at voice onset cite but a P2 enhancement for pitch shifts that occur mid-vocalization [24,25]. ...
Article
Full-text available
The pitch perturbation technique is a validated technique that has been used for over 30 years to understand how people control their voice. This technique involves altering a person’s voice pitch in real-time while they produce a vowel (commonly, a prolonged /a/ sound). Although post-task changes in the voice have been observed in several studies (e.g., a change in mean f o across the duration of the experiment), the potential for using the pitch perturbation technique as a training tool for voice pitch regulation and/or modification has not been explored. The present study examined changes in event related potentials (ERPs) and voice pitch in three groups of subjects due to altered voice auditory feedback following a brief, four-day training period. Participants in the opposing group were trained to change their voice f o in the opposite direction of a pitch perturbation stimulus. Participants in the following group were trained to change their voice f o in the same direction as the pitch perturbation stimulus. Participants in the non-varying group did not voluntarily change their pitch, but instead were asked to hold their voice constant when they heard pitch perturbations. Results showed that all three types of training affected the ERPs and the voice pitch-shift response from pre-training to post-training (i.e., “hold your voice pitch steady” task; an indicator of voice pitch regulation). Across all training tasks, the N1 and P2 components of the ERPs occurred earlier, and the P2 component of the ERPs occurred with larger amplitude post-training. The voice responses also occurred earlier but with a smaller amplitude following training. These results demonstrate that participation in pitch-shifted auditory feedback tasks even for brief periods of time can modulate the automatic tendency to compensate for alterations in voice pitch feedback and has therapeutic potential.
... For example, larger N1 and P2 responses were elicited by larger size of pitch perturbations (Behroozmand et al., 2009;Liu et al., 2011b;Scheerer et al., 2013). Training-induced decrease of N1 responses and/or increase of P2 responses to pitch perturbations were also found when healthy participants underwent speech perceptual learning or working memory training Li et al., 2015;Guo et al., 2017). Previous findings have shown enhanced vocal and/or P2 responses to pitch perturbations in individuals with PD (Liu et al., 2012;Chen et al., 2013;Huang et al., 2016;Mollaei et al., 2016). ...
... It is noteworthy that LSVT LOUD did not lead to systematic changes of N1 responses to pitch perturbations. This is in contrast with other studies that have shown decreased N1 responses to pitch perturbations in healthy participants following speech perceptual learning and auditory working memory training Li et al., 2015) and individuals with PD following external auditory cueing (Huang et al., 2019), reflecting increased efficiency in the neural encoding of pitch information in auditory feedback (Zatorre et al., 2012). Although we cannot provide specific explanations for the absence of N1 modulation following LSVT LOUD, it may be related to the differences in the training protocol. ...
... Although we cannot provide specific explanations for the absence of N1 modulation following LSVT LOUD, it may be related to the differences in the training protocol. In previous studies Li et al., 2015;Huang et al., 2019), the participants were required to learn to perceive different lexical tones, remember digits with varying signal-noise-ration (SNR) levels, or vocalize to match specific pitch target, which demands extensive involvement of auditory-related regions that contributed to the generation of N1 responses. In contrast, LSVT LOUD is specifically designed to improve vocal SPL for speech tasks that has been found to accompanied with changes in cortical activity of motor and premotor regions as well as the DLPFC (Liotti et al., 2003;Narayana et al., 2010), which may not lead to the modulatory effects on the N1 responses. ...
Article
Full-text available
Individuals with Parkinson's disease (PD) are impaired in auditory-vocal integration, characterized by abnormal compensatory responses to auditory feedback errors during self-monitoring of vocal production. The present study examined whether auditory feedback control of vocal pitch production in PD can benefit from Lee Silverman voice treatment (LSVT R LOUD), a high effort, intensive speech treatment for hypokinetic dysarthria in PD. Before and immediately after LSVT LOUD, 12 individuals with PD were instructed to produce sustained vowel sounds while hearing their voice unexpectedly pitch-shifted by −200 cents. Their vocal responses and event-related potentials (ERPs) to pitch perturbations were measured to assess the treatment outcomes. A group of 12 healthy controls were one-to-one pair matched by age, sex, and language. Individuals with PD exhibited abnormally enhanced vocal and ERP P2 responses to pitch perturbations relative to healthy controls. Successful treatment with LSVT LOUD, however, led to significantly smaller and faster vocal compensations that were accompanied by significantly larger P2 responses. Moreover, improved vocal loudness during passage reading was significantly correlated with reduced vocal compensations for pitch perturbations. These preliminary findings provide the first neurobehavioral evidence for beneficial effects of LSVT LOUD on impaired auditory-vocal integration associated with PD, which may be related to improved laryngeal motor functions and a top-down modulation of the speech motor network by LSVT LOUD.
... N1 amplitude decreases with more distributed attention, that is, when perceiving more diverse stimuli, the N1 component decreases (e. g., Frtusova et al., 2013;Han et al., 2013). P2 amplitude is sensitive to the level of attention (e.g., Han et al., 2013) and correlates positively with performance in recall (Dunn et al., 1998) and auditory feedback tasks (Li et al., 2015). Amplitude of the posterior N2 component increases with lower stimulus predictability (Folstein and Van Petten, 2008); this is relevant for tone-monitoring because the relative frequency of each syllable decreases as load increases. ...
... These load-independent PHS effects are in line with the hypothesis that imagery techniques involved in PHS facilitate mental practice. This hypothesis is supported by observations that (1) WM training in auditory feedback tasks increased P2 amplitude (Li et al., 2015), and (2) that P3 amplitude is larger in individuals with higher WM in both n-back (Dong et al., 2015;Evans et al., 2011) and other memory tasks (Dunn et al., 1998;Shiran and Breznitz, 2011). Intriguingly, a similar increase of P3 amplitude by PHSs had been observed in a previous study (Zahedi et al., 2019), aiming to inhibit access to word meanings in a Stroop task. ...
Article
Updating is an essential executive function (EF), responsible for storing, retrieving, and substituting information in working memory (WM). Here we investigated whether posthypnotic suggestions (PHS) given to high-hypnotizable participants can enhance updating in WM and measured neural correlates of the observed effects by recording event-related brain potentials (ERP). In a tone-monitoring task different syllables were presented in random order, requiring a response to every fourth presentation of a given syllable. Experiment 1 (n = 19) established the relationship between performance and several ERP components across updating load (different numbers of syllables). In Experiment 2 (n = 18), a no-hypnosis (NH) and a hypnosis-plus-PHS session were administrated in counterbalanced order. Task instructions, presented at the beginning of the sessions, emphasized a cognitive strategy, demanding imagination of visual counters, a strategy that was also emphasized during PHS. PHS additionally contained suggestions stimulating cognitive simulation of the task, where participants were advised to apply the suggested strategy. Relative to the NH session, PHS enhanced WM performance with medium to large effect sizes. In ERPs, PHS increased the P2 and P3 components, indicating the proactive recruitment of control-related attention and updating-related cognitive control processes, respectively. PHS also reduced updating load effects in the posterior recognition component, suggesting diminished demands on WM buffers. These ERP findings suggest that PHS enhanced updating in WM by strengthening proactive control, which may have diminished the necessity for reactive control. Hence, the present results suggest that our PHS had worked like mental practice helping participants to develop an efficient and context-dependent trigger-action contingency. 
Consequentially, the present study provides a new framework for employing PHSs, which may be used as a basis for developing new training regimes for modifying WM or other EFs.
... As such, an abnormally enhanced behavioral pitch perturbation response in AD patients may either result from abnormal sensorimotor processing mechanisms within the speech-motor control network itself, or as a result of failed regulatory mechanisms across cognitive systems. Evidence for interconnections between speech-motor control network and other cognitive systems come from studies of healthy participants where auditory attention and auditory working memory have been shown to modulate the behavioral and neural correlates of speech motor control [16][17][18][19][20] . Moreover, in our previous behavioral experiment we found that the degree of enhanced pitch perturbation response was significantly correlated with executive dysfunction in AD patients 5 . ...
... Both models further revealed that there was no significant main effect of group identity towards this association (F = 1.96; P = 0.18). ...
Article
Accurate integration of sensory inputs and motor commands is essential to achieve successful behavioral goals. A robust model of sensorimotor integration is the pitch perturbation response, in which speakers respond rapidly to shifts of the pitch in their auditory feedback. In a previous study, we demonstrated abnormal sensorimotor integration in patients with Alzheimer’s disease (AD) with an abnormally enhanced behavioral response to pitch perturbation. Here we examine the neural correlates of the abnormal pitch perturbation response in AD patients, using magnetoencephalographic imaging. The participants phonated the vowel /α/ while a real-time signal processor briefly perturbed the pitch (100 cents, 400 ms) of their auditory feedback. We examined the high-gamma band (65–150 Hz) responses during this task. AD patients showed significantly reduced left prefrontal activity during the early phase of perturbation and increased right middle temporal activity during the later phase of perturbation, compared to controls. Activity in these brain regions significantly correlated with the behavioral response. These results demonstrate that impaired prefrontal modulation of speech-motor-control network and additional recruitment of right temporal regions are significant mediators of aberrant sensorimotor integration in patients with AD. The abnormal neural integration mechanisms signify the contribution of cortical network dysfunction to cognitive and behavioral deficits in AD.
... Overall, the contour of the ERP waveforms recorded in the present study replicates a large number of ERP results reported in PSP studies (Behroozmand et al., 2009; Behroozmand et al., 2016; Chen et al., 2013; Korzyukov et al., 2012a; Korzyukov et al., 2012b; Korzyukov et al., 2015; Li et al., 2015; Liu et al., 2011; Liu et al., 2018; Scheerer and Jones, 2014; Scheerer and Jones, 2018). In our previous PSP ERP study of expectation, the difference between predictable and unpredictable stimuli was statistically tested across 6 electrodes at latency segments peaking around 140-145 ms where the ERP difference was most prominent (Korzyukov et al., 2012b). ...
Article
Predictive processing across hierarchically organized time scales is one of the fundamental principles of neural computations in the cerebral cortex. We hypothesize that the relatively complex aggregation of auditory and vocal brain systems that use auditory feedback for reflexive control of vocalizations can be an object for predictive processing. We used repetitive patterns of perturbations in auditory feedback during vocalizations to elicit implicit expectations that were violated by a surprising direction of perturbations in one of the experimental conditions. Our results provide empirical support for the idea that formation of expectancy for integrated auditory-vocal brain systems, within the time range of seconds, resulted in two sequential neuronal processes. The first process reflects monitoring and error detection in prediction about perturbations in auditory feedback during vocalizations within the time range of seconds. The second neuronal process can be attributed to the optimization of brain predictions for sensory contingencies during vocalizations at separable and distinct timescales.
Article
Clinical studies have shown the efficacy of transcranial magnetic stimulation in treating movement disorders in patients with spinocerebellar ataxia (SCA). However, whether similar effects occur for their speech motor disorders remains largely unknown. The present event-related potential study investigated whether and how abnormalities in auditory-vocal integration associated with SCA can be modulated by neuronavigated continuous theta burst stimulation (c-TBS) over the right cerebellum. After receiving active or sham cerebellar c-TBS, 19 patients with SCA were instructed to produce sustained vowels while hearing their voice unexpectedly pitch-shifted by ±200 cents. Behaviorally, active cerebellar c-TBS led to smaller magnitudes of vocal compensations for pitch perturbations than sham stimulation. Parallel modulatory effects were also observed at the cortical level, as reflected by increased P1 and P2 responses but decreased N1 responses elicited by active cerebellar c-TBS. Moreover, smaller magnitudes of vocal compensations were predicted by larger amplitudes of cortical P1 and P2 responses. These findings provide the first neurobehavioral evidence that c-TBS over the right cerebellum produces modulatory effects on abnormal auditory-motor integration for vocal pitch regulation in patients with SCA, offering a starting point for the treatment of speech motor disorders associated with SCA with cerebellar c-TBS.
Article
When people hear unexpected perturbations in auditory feedback, they produce rapid compensatory adjustments of their vocal behavior. Recent evidence has shown enhanced vocal compensations and cortical event-related potentials (ERPs) in response to attended pitch feedback perturbations, suggesting that this reflex-like behavior is influenced by selective attention. Less is known, however, about auditory-motor integration for voice control during divided attention. The present cross-modal study investigated the behavioral and ERP correlates of auditory feedback control of vocal pitch production during divided attention. During the production of sustained vowels, 32 young adults were instructed to simultaneously attend to both pitch feedback perturbations they heard and flashing red lights they saw. The presentation rate of the visual stimuli was varied to produce a low, intermediate, and high attentional load. The behavioral results showed that the low-load condition elicited significantly smaller vocal compensations for pitch perturbations than the intermediate-load and high-load conditions. The cortical processing of vocal pitch feedback was also modulated as a function of divided attention. When compared to the low-load and intermediate-load conditions, the high-load condition elicited significantly larger N1 responses and smaller P2 responses to pitch perturbations. These findings provide the first neurobehavioral evidence that divided attention can modulate auditory feedback control of vocal pitch production.
Article
Although a growing body of research has focused on the cortical sensorimotor mechanisms that support auditory feedback control of speech production, much less is known about the subcortical contributions to this control process. The present study examined whether subregional anatomy of subcortical structures assessed by statistical shape analysis is associated with vocal compensations and cortical event-related potentials in response to vocal pitch errors. The results revealed significant negative correlations between the magnitudes of vocal compensations and subregional shape of the right thalamus, between the latencies of vocal compensations and subregional shape of the left caudate and pallidum, and between the latencies of cortical N1 responses and subregional shape of the left putamen. These associations indicate that smaller local volumes of the basal ganglia and thalamus are predictive of slower and larger neurobehavioral responses to vocal pitch errors. Furthermore, increased local volumes of the left hippocampus and right amygdala were predictive of larger vocal compensations, suggesting that there is an interplay between the memory-related subcortical structures and auditory-vocal integration. These results, for the first time, provide evidence for differential associations of subregional morphology of the basal ganglia, thalamus, hippocampus, and amygdala with neurobehavioral processing of vocal pitch errors, suggesting that subregional shape measures of subcortical structures can predict behavioral outcome of auditory-vocal integration and associated neural features.
Article
Although working memory (WM) is considered as an emergent property of the speech perception and production systems, the role of WM in sensorimotor integration during speech processing is largely unknown. We conducted two event-related potential experiments with female and male young adults to investigate the contribution of WM to the neurobehavioural processing of altered auditory feedback during vocal production. A delayed match-to-sample task that required participants to indicate whether the pitch feedback perturbations they heard during vocalizations in test and sample sequences matched, elicited significantly larger vocal compensations, larger N1 responses in the left middle and superior temporal gyrus, and smaller P2 responses in the left middle and superior temporal gyrus, inferior parietal lobule, somatosensory cortex, right inferior frontal gyrus and insula as compared to a control task that did not require memory retention of the sequence of pitch perturbations. On the other hand, participants who underwent extensive auditory WM training produced suppressed vocal compensations that were correlated with improved auditory WM capacity, and enhanced P2 responses in the left middle frontal gyrus, inferior parietal lobule, right inferior frontal gyrus and insula that were predicted by pre-training auditory WM capacity. These findings indicate that WM can enhance the perception of voice auditory feedback errors while inhibiting compensatory vocal behaviour to prevent voice control from being excessively influenced by auditory feedback. This study provides the first evidence that auditory-motor integration for voice control can be modulated by top-down influences arising from WM, rather than modulated exclusively by bottom-up and automatic processes.
Article
Speech perception and production are intimately linked. There is evidence that speech motor learning results in changes to auditory processing of speech. Whether speech motor control benefits from perceptual learning in speech, however, remains unclear. This event-related potential study investigated whether speech-sound learning can modulate the processing of feedback errors during vocal pitch regulation. Mandarin speakers were trained to perceive five Thai lexical tones while learning to associate pictures with spoken words over 5 days. Before and after training, participants produced sustained vowel sounds while they heard their vocal pitch feedback unexpectedly perturbed. As compared to the pre-training session, the magnitude of vocal compensation significantly decreased for the control group, but remained consistent for the trained group at the post-training session. However, the trained group had smaller and faster N1 responses to pitch perturbations and exhibited enhanced P2 responses that correlated significantly with their learning performance. These findings indicate that the cortical processing of vocal pitch regulation can be shaped by learning new speech-sound associations, suggesting that perceptual learning in speech can produce transfer effects that facilitate the neural mechanisms underlying the online monitoring of auditory feedback regarding vocal production.
Article
Speakers rapidly adjust their ongoing vocal productions to compensate for errors they hear in their auditory feedback. It is currently unclear what role attention plays in these vocal compensations. This event-related potential (ERP) study examined the influence of selective and divided attention on the vocal and cortical responses to pitch errors heard in auditory feedback regarding ongoing vocalizations. During the production of a sustained vowel, participants briefly heard their vocal pitch shifted up 2 semitones while they actively attended to auditory or visual events (selective attention), or both auditory and visual events (divided attention), or were not told to attend to either modality (control condition). The behavioral results showed that attending to the pitch perturbations elicited larger vocal compensations than attending to the visual stimuli. Moreover, ERPs were likewise sensitive to the attentional manipulations: P2 responses to pitch perturbations were larger when participants attended to the auditory stimuli compared to when they attended to the visual stimuli, and compared to when they were not explicitly told to attend to either the visual or auditory stimuli. By contrast, dividing attention between the auditory and visual modalities caused suppressed P2 responses relative to all the other conditions, and enhanced N1 responses relative to the control condition. These findings provide strong evidence for the influence of attention on the mechanisms underlying the auditory-vocal integration in the processing of pitch feedback errors. As well, selective attention and divided attention appear to modulate the neurobehavioral processing of pitch feedback errors in different ways.
Article
Considerable evidence has shown that unexpected alterations in auditory feedback elicit fast compensatory adjustments in vocal production. Although generally thought to be involuntary in nature, whether these adjustments can be influenced by attention remains unknown. The present event-related potential (ERP) study aimed to examine whether neurobehavioral processing of auditory-vocal integration can be affected by attention. While sustaining a vowel phonation and hearing pitch-shifted feedback, participants were required to either ignore the pitch perturbations, or attend to them with low (counting the number of perturbations) or high attentional load (counting the type of perturbations). Behavioral results revealed no systematic change of vocal response to pitch perturbations irrespective of whether they were attended or not. At the level of cortex, there was an enhancement of P2 response to attended pitch perturbations in the low-load condition as compared to when they were ignored. In the high-load condition, however, P2 response did not differ from that in the ignored condition. These findings provide the first neurophysiological evidence that auditory-motor integration in voice control can be modulated as a function of attention at the level of cortex. Furthermore, this modulatory effect does not lead to a general enhancement but is subject to attentional load.
Article
Auditory feedback is required to maintain fluent speech. At present, it is unclear how attention modulates auditory feedback processing during ongoing speech. In this event-related potential (ERP) study, participants vocalized /a/ while they heard their vocal pitch suddenly shifted downward by ½ semitone in both single and dual-task conditions. During the single-task condition, participants passively viewed a visual stream for cues to start and stop vocalizing. In the dual-task condition, participants vocalized while they identified target stimuli in a visual stream of letters. The presentation rate of the visual stimuli was manipulated in the dual-task condition in order to produce a low, intermediate, and high attentional load. Visual target identification accuracy was lowest in the high attentional load condition, indicating that attentional load was successfully manipulated. Results further showed that participants who were exposed to the single-task condition, prior to the dual-task condition, produced larger vocal compensations during the single-task condition. Thus, when participants' attention was divided, less attention was available for the monitoring of their auditory feedback, resulting in smaller compensatory vocal responses. However, P1-N1-P2 ERP responses were not affected by divided attention, suggesting that the effect of attentional load was not on the auditory processing of pitch-altered feedback, but instead it interfered with the integration of auditory and motor information, or motor control itself.
Article
Can cognitive abilities such as reasoning be improved through working memory training? This question is still highly controversial, with prior studies providing contradictory findings. The lack of theory-driven, systematic approaches and (occasionally serious) methodological shortcomings complicates this debate even more. This review suggests two general mechanisms mediating transfer effects that are (or are not) observed after working memory training: enhanced working memory capacity, enabling people to hold more items in working memory than before training, or enhanced efficiency using the working memory capacity available (e.g., using chunking strategies to remember more items correctly). We then highlight multiple factors that could influence these mechanisms of transfer and thus the success of training interventions. These factors include (1) the nature of the training regime (i.e., intensity, duration, and adaptivity of the training tasks) and, with it, the magnitude of improvements during training, and (2) individual differences in age, cognitive abilities, biological factors, and motivational and personality factors. Finally, we summarize the findings revealed by existing training studies for each of these factors, and thereby present a roadmap for accumulating further empirical evidence regarding the efficacy of working memory training in a systematic way.
Article
The focus of Part III has been on understanding how the reality of human energy regulation is far more complex and far more dynamic than the static, open-loop model of the simplistic energy balance equation, in part because the two limbs of energy balance are not independent, but instead are physiologically linked, interacting in complex ways and often compensating for each other in the face of interventions.272 Part III has also addressed how nonlinear feedback processes underlie many of the interactions and mutual independences in human energy regulation; how some of these interactions induce positive feedback effects, while others are negative; and how feedback processes operate at multiple overlapping levels (as homeostatic processes at the physiological level, between the physiological and the behavioral, and between people and their external environment).
Article
Speaking is one of the most complex motor behaviors developed to facilitate human communication. The underlying neural mechanisms of speech involve sensory-motor interactions that incorporate feedback information for online monitoring and control of produced speech sounds. In the present study, we adopted an auditory feedback pitch perturbation paradigm and combined it with functional magnetic resonance imaging (fMRI) recordings in order to identify brain areas involved in speech production and motor control. Subjects underwent fMRI scanning while they produced a steady vowel sound /a/ (speaking) or listened to the playback of their own vowel production (playback). During each condition, the auditory feedback from vowel production was either normal (no perturbation) or perturbed by an upward (+600 cents) pitch shift stimulus randomly. Analysis of BOLD responses during speaking (with and without shift) vs. rest revealed activation of a complex network including bilateral superior temporal gyrus (STG), Heschl's gyrus, precentral gyrus, supplementary motor area (SMA), Rolandic operculum, postcentral gyrus and right inferior frontal gyrus (IFG). Performance correlation analysis showed that the subjects produced compensatory vocal responses that significantly correlated with BOLD response increases in bilateral STG and left precentral gyrus. However, during playback, the activation network was limited to cortical auditory areas including bilateral STG and Heschl's gyrus. Moreover, the contrast between speaking vs. playback highlighted a distinct functional network that included bilateral precentral gyrus, SMA, IFG, postcentral gyrus and insula. These findings suggest that speech motor control involves feedback error detection in sensory (e.g. auditory) cortices that subsequently activate motor-related areas for the adjustment of speech parameters during speaking.
Article
In the current study we examined the relationship between working memory capacity, inhibition/susceptibility to interference and fluid intelligence, measured by the Raven's Progressive Matrices (PM38), comparing groups of young (aged 18–35), young-old (aged 65–74), and old-old (aged 75–86) participants. Groups were administered two working memory tasks tapping into different mechanisms involved in working memory. The ability to control for irrelevant information was measured both considering memory errors (intrusion errors) in a working memory task and an index of susceptibility to interference obtained with a variant of the Brown-Peterson task. Regression analyses showed that the classical working memory measure was the most potent predictor of the Raven's score. Susceptibility to interference and intrusion errors contributed, but to a lesser extent, to the explained variance in Raven's scores. These results confirm that working memory shares cognitive aspects with the fluid intelligence measure considered, whereas the role of inhibition in Raven's scores is still in need of better evidence.