Individual differences in premotor and motor recruitment during speech perception

Neuropsychologia 50(7):1380–1392, April 2012
DOI: 10.1016/j.neuropsychologia.2012.02.023 · Source: PubMed
Abstract
Although activity in premotor and motor cortices is commonly observed in neuroimaging studies of spoken language processing, the degree to which this activity is an obligatory part of everyday speech comprehension remains unclear. We hypothesised that rather than being a unitary phenomenon, the neural response to speech perception in motor regions would differ across listeners as a function of individual cognitive ability. To examine this possibility, we used functional magnetic resonance imaging (fMRI) to investigate the neural processes supporting speech perception by comparing active listening to pseudowords with matched tasks that involved reading aloud or repetition, all compared to acoustically matched control stimuli and matched baseline tasks. At a whole-brain level there was no evidence for recruitment of regions in premotor or motor cortex during speech perception. A focused region of interest analysis similarly failed to identify significant effects, although a subset of regions approached significance, with notable variability across participants. We then used performance on a battery of behavioural tests that assessed meta-phonological and verbal short-term memory abilities to investigate the reasons for this variability, and found that individual differences in particular in low phonotactic probability pseudoword repetition predicted participants' neural activation within regions in premotor and motor cortices during speech perception. We conclude that normal listeners vary in the degree to which they recruit premotor and motor cortex as a function of short-term memory ability. This is consistent with a resource-allocation approach in which recruitment of the dorsal speech processing pathway depends on both individual abilities and specific task demands.


Gayaneh Szenkovits*, Jonathan E. Peelle¹, Dennis Norris, Matthew H. Davis

Medical Research Council, Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, England, UK

* Corresponding author. Tel.: +44 1223 273 631; fax: +44 1223 359 062. E-mail address: gayaneh.szenkovits@mrc-cbu.cam.ac.uk (G. Szenkovits).
¹ Present address: Department of Neurology, University of Pennsylvania, Philadelphia, USA.
Article history:
Received 29 June 2011
Received in revised form 13 December 2011
Accepted 25 February 2012
Available online 12 April 2012
Keywords: Speech perception; Dual stream model; Premotor cortex; Motor cortex; Individual differences; fMRI
© 2012 Elsevier Ltd. All rights reserved.
1. Introduction
A prevailing question in cognitive science is the degree to which the cognitive and neural systems engaged in a particular task are consistent across individuals. In most domains there is an assumption that these systems are relatively uniform, which enables the construction of generalisable neuroanatomically constrained models of cognitive processes in both health and disease. However, there is an increasing awareness of the role that individual differences in cognitive ability play in this process: because the availability of cognitive and neural resources varies across individuals, the particular instantiation of neural support for a given task will also vary (Seghier & Price, 2009). In the current study we examine how these individual differences manifest in speech processing, with a particular focus on the role of the motor system in speech perception. Although studies of speech perception generally reveal some involvement of premotor and motor regions (Devlin & Aydelott, 2009), there remains disagreement about whether this activity is an obligatory part of speech processing (D'Ausilio et al., 2009; Meister, Wilson, Deblieck, & Wu, 2007; Wilson, Saygin, & Sereno, 2004) or might instead reflect other associated processes (Hickok, 2010; Hickok et al., 2008; Scott & Wise, 2004; Scott, McGettigan, & Eisner, 2009). We hypothesised that inter-individual variability in cognitive ability may be one factor that contributes to the seemingly inconsistent pattern of results present in the literature. In the current study, we used a set of pseudoword processing tasks to investigate the role of motor areas in speech perception and, most importantly, whether the level of activity could be related to individual differences in behavioural measures.
Early models of spoken language processing suggested that sensory representations of speech interface with at least two systems: an articulatory motor system and a conceptual semantic system (e.g. Lichtheim, 1885/2006). This idea remains at the heart of contemporary neurocognitive models of speech processing (Hickok & Poeppel, 2007; Scott & Johnsrude, 2003). According to these dual-stream accounts, acoustic processing (in Heschl's gyrus and the superior temporal gyrus) is followed by at least two diverging processing streams. The ventral stream projects towards left anterior and/or posterior inferior temporal regions and serves as a sound-to-meaning interface, mapping sound representations of speech onto conceptual representations. The dorsal stream projects towards the left posterior temporo-parietal junction, left supramarginal gyrus, premotor and motor cortices, and left inferior frontal gyrus, and serves as an auditory–motor interface, mapping speech sounds onto articulatory/motor representations. According to a traditional interpretation of this dual-stream account, tasks selectively involving speech comprehension (e.g., listening to meaningful speech) are proposed to be primarily processed in the ventral stream, while tasks linking speech perception and production (e.g., repeating back heard speech) are thought to primarily engage the dorsal stream (Hickok & Poeppel, 2004, 2007; Saur et al., 2008; Scott and Johnsrude, 2003).
In recent years, however, a number of functional MRI studies have challenged this traditional view by reporting dorsal pathway involvement in speech perception tasks, even when no production component was required (Osnes, Hugdahl, & Specht, 2011; Pulvermüller et al., 2006; Wilson & Iacoboni, 2006; Wilson et al., 2004). These studies focused in particular on regions of left premotor and motor cortex that are the output of the dorsal speech pathway. For example, an influential study by Wilson et al. (2004) reported neural activation in the vicinity of the precentral gyrus (and premotor cortex) during passive listening to repeated consonant-vowel pseudowords compared to rest, whereas listening to a non-speech sound (a bell) did not trigger neural responses in the same regions. More direct evidence for premotor and motor cortex recruitment during speech perception comes from studies using Transcranial Magnetic Stimulation (TMS) to either facilitate (D'Ausilio et al., 2009; Fadiga et al., 2002; Watkins, Strafella, & Paus, 2003) or temporarily disrupt (Meister et al., 2007; Möttönen & Watkins, 2009) processing in motor regions. Furthermore, TMS has also been shown to affect perceptual discrimination of speech sounds in an articulator-specific manner (D'Ausilio et al., 2009; Möttönen & Watkins, 2009). Taken together, these findings have led some to conclude that regions of the dorsal pathway (namely, premotor and motor cortices) are critical for speech perception, and that when listening to speech both dorsal and ventral processing streams are necessarily recruited.
Despite these provocative findings, however, the degree to which premotor and motor processing is a necessary component of speech perception is still unclear, and recent reviews of the literature have called for caution when interpreting the above results (Hickok, 2008, 2009; Hickok and Poeppel, 2007; Lotto, Hickok, & Holt, 2009; Scott et al., 2009). One reason for caution is that the above findings are not always replicated; for example, in a repetitive TMS study of premotor regions, Sato, Tremblay, and Gracco (2009) reported slowed responses only for phoneme discrimination (which requires phonemic segmentation), with no effect on phoneme identification or syllable discrimination. In addition, several methodological points are consistently highlighted. First, the critical contrasts generally fail to show activation at a corrected level of significance. Second, functional imaging studies reporting motor activation have not made comparisons with a well-matched non-speech condition, and in studies using better controlled stimuli there is little evidence for motor involvement (Obleser, Wise, Dresner, & Scott, 2007; Rodd, Davis, & Johnsrude, 2005; Rodd, Longe, Randall, & Tyler, 2010; Scott, Blank, Rosen, & Wise, 2000).
Perhaps most importantly, many studies that have found motor activation during speech perception have used phoneme identification or discrimination tasks (e.g. Möttönen & Watkins, 2009; Sato et al., 2009; Yuen, Davis, Brysbaert, & Rastle, 2009), which require attending to sublexical elements of speech (such as phonemes). These complex meta-phonological tasks (see Morais, Bertelson, Cary, & Alegria, 1986; Morais, Cary, Alegria, & Bertelson, 1979; Morais & Kolinsky, 1994) involve multiple processes (e.g. segmentation of speech into its constituent elements, decision making and/or categorisation), which may in turn depend on speech production or verbal short-term memory processes. For example, behavioural studies have found that articulatory suppression slows rhyme judgements but not homophony judgements of written words (Besner, 1987; Besner, Davies, & Daniels, 1981; Brown, 1987; Richardson, 1987), implying that subvocal articulatory processes may be involved in tasks involving manipulation of sublexical representations. More generally, it is well established that verbal short-term memory is supported by subvocal rehearsal processes (Baddeley & Hitch, 1974; Baddeley, Gathercole, Papagno, & Degli, 1998) and that these are mediated by the dorsal auditory pathway (Buchsbaum, Olsen, Koch, & Berman, 2005). Based on these findings, Hickok and colleagues suggest that the recruitment of subvocal rehearsal processes could explain much of the evidence for motor involvement in speech perception (Buchsbaum, Hickok, & Humphries, 2001; Hickok & Buchsbaum, 2003; Hickok, Buchsbaum, Humphries, & Muftuler, 2003).
Finally, consistent with this task-based explanation for premotor and motor activity, other studies have shown motor involvement in challenging listening conditions such as listening in noise, or when using degraded or phonemically ambiguous stimuli (D'Ausilio et al., 2009; Davis & Johnsrude, 2003; Dufor, Serniclaes, Sprenger-Charolles, & Demonet, 2009; Osnes et al., 2011). Such challenging listening situations may similarly recruit short-term memory processes, and perhaps rely on a form of analysis-by-synthesis which recruits motor regions (Davis & Johnsrude, 2007; Skipper, van Wassenhove, Nusbaum, & Small, 2007; van Wassenhove, Grant, & Poeppel, 2005). Along the same lines, Callan, Callan, Gamez, Sato, and Kawato (2010) reported that increased accuracy of phoneme identification in noise was associated with increased activation in the ventral part of the premotor cortex. Taken together, these findings might suggest that the function of the motor system in speech perception is contingent upon perceptual ambiguity (Callan et al., 2010; Sato et al., 2009). While these additional task-related cognitive processes are likely to recruit dorsal networks, they go beyond the brain regions associated with natural speech perception (Hickok & Poeppel, 2000; Osnes et al., 2011; Scott et al., 2009).
In this paper, we return to the question of whether premotor and motor cortices are necessarily activated during speech perception. To minimise ventral stream engagement and lexical/semantic effects, we use phonotactically legal pseudowords; thus, we operationally define speech perception as encompassing prelexical phonological processing. We avoid complex meta-phonological tasks and use a simple one-back identity judgement with no requirement for either overt speech production or phonological segmentation, and with minimal short-term memory load. We compare the neural activation required for this task to that in a one-back identity judgement involving complex non-speech stimuli, acoustically well matched to the speech stimuli on both spectral and amplitude characteristics. We also included two additional production tasks known to rely on the dorsal pathway (and which we are therefore confident will show activation in premotor and motor cortices): reading aloud, and repeating heard pseudowords. These production tasks allow us to localise dorsal pathway regions in our cohort of participants and verify the efficacy of our general paradigm independently of our speech perception task.
Most important for the current study is our approach of explicitly examining individual differences in the degree of premotor and motor activation during speech perception. As discussed above, speech perception is often tested with phoneme identification or discrimination tasks. These tasks arguably involve segmentation, short-term memory, and subvocal rehearsal, all likely supported by the dorsal stream. Moreover, all studies to date have conducted group analyses in which mean activation differences across subjects are compared to the null hypothesis of zero activation for speech compared to non-speech perception. We therefore aim to examine, for the first time, whether participants show linked variability between behaviour (as measured by tasks engaging the dorsal stream) and neural measures of speech perception. To do so, we will first characterise participants' behaviour separately on tasks that are suggested to rely on the dorsal stream: namely, phonological awareness and verbal short-term memory tasks. We will then relate the observed variability in performance to neural activation during the speech perception task, which (as described above) is largely independent of segmental phonological awareness and short-term memory. Our prediction is that neural activity in the dorsal stream will reflect individual differences in cognitive ability.
2. Materials and methods

2.1. Participants
Twenty-one healthy, right-handed native speakers of British English participated in the study (9 men, average age 26.8 years, SD = 3.1, and 12 women, average age 22.8 years, SD = 7.9). They were recruited through the MRC Cognition and Brain Sciences Unit volunteer panel, and received £10 per hour for their participation. None of the participants reported any history of neurological, speech, or hearing disorder. All showed normal MRI structural scans. One participant was excluded from the analyses due to excessive head motion; the fMRI analyses reported here are based on the remaining 20 participants. All participants were fully briefed and provided written informed consent. Ethical approval was granted by the Cambridge Psychology Research Ethics Committee.
2.2. Stimuli and experimental design

2.2.1. Behavioural tasks to predict neural activity
We used a series of behavioural tests conducted outside the scanner to characterise participants' behaviour on complex speech processing and verbal short-term memory tasks frequently used in the literature, with the goal of linking individual abilities in these domains to neural activation in premotor and motor regions. Specifically, we assessed meta-phonological skills (the ability to consciously manipulate and evaluate speech segments) and verbal short-term memory capacity using the following tasks.
Spoonerisms task. We used the spoonerisms task to assess participants' meta-phonological ability. The spoonerisms task consisted of 40 pairs of spoken disyllabic English nouns with matching stress patterns. For half of the trials, participants were instructed to swap the initial sound of each word (e.g. 'chemist leader'), and for the other half to swap the final sound (e.g. 'fetish scalpel'). In both conditions participants were asked to maintain the original order of the words. Responses were recorded, and percent correct responses were averaged over the initial- and final-swap trials.
Auditory and visual digit span tasks. The auditory digit span task is a computerised version of the WAIS-III subtest (Wechsler, 1998), which we used to help assess participants' verbal short-term memory ability. Sequences of digits were presented aurally at a rate of 1 item per second. The task consisted of forward and backward repetitions. Sequence length was increased from two digits per sequence to nine digits per sequence in the forward condition and from two to eight digits in the backward condition. Participants were presented with two trials per length; the test finished when participants failed on both trials. The sum of all correct responses (sequences repeated correctly) provided participants' scores. The visual digit span task mirrored the auditory one, with the difference that the digits were presented in the middle of the screen for 500 ms, one by one, at a rate of 1 item every 500 ms. The same scoring was used as for the auditory digit task.
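To make the stopping and scoring rule above concrete, here is a minimal Python sketch; the function name and input format are ours for illustration, not anything used in the study.

```python
def digit_span_score(trials_by_length):
    """Score the digit span procedure described above: two trials per
    sequence length, stop when both trials at a length fail, and count
    every correctly repeated sequence. `trials_by_length` is an ordered
    list of (trial1_correct, trial2_correct) booleans, shortest length first."""
    score = 0
    for trial1_correct, trial2_correct in trials_by_length:
        score += int(trial1_correct) + int(trial2_correct)
        if not (trial1_correct or trial2_correct):
            break  # both trials at this length failed: the test ends here
    return score

# Example: perfect at length 2, one error at length 3, both wrong at length 4
# digit_span_score([(True, True), (True, False), (False, False)])  -> 3
```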
Pseudoword repetition task. As an additional measure of verbal short-term memory we used a pseudoword repetition task. In this task, participants had to repeat sequences of between two and six monosyllabic pseudowords. Items in the sequence were presented quickly, with an inter-syllable interval of 100 ms. Since pseudowords lack long-term memory representations, pseudoword span should predominantly be driven by the capacity of the phonological store and the efficiency of articulatory rehearsal (Baddeley, 1998; Baddeley & Hitch, 1974), therefore providing a more accurate measure of verbal short-term memory capacity than digit or word span tasks (Cowan, 2001; Gathercole & Baddeley, 1990). Note that although pseudowords themselves have no long-term memory representations, memory for pseudowords is nevertheless sensitive to linguistic knowledge about the phonological structure of a given language (i.e., phonotactic probability). Previous studies have demonstrated that pseudowords composed of commonly co-occurring segments (high phonotactic probability) are repeated faster and more accurately than pseudowords composed of less common segments (low phonotactic probability) (Edwards, Beckman, & Munson, 2004; Vitevitch & Luce, 2005), and are also remembered better (Gathercole, Frankish, Pickering, & Peaker, 1999). In order to investigate the extent to which variability in short-term memory performance is influenced by language-specific constraints, we included both high and low phonotactic probability pseudowords.

We selected 52 consonant–vowel–consonant (CVC) pseudowords from Gathercole et al. (1999). To assess the effects of information load on memory performance, and to maximise variability on the task, sequence length was parametrically manipulated. Sequence length increased over successive presentation blocks, beginning with sequences of two pseudowords and ending with sequences of six pseudowords. Each test consisted of 5 blocks of 12 trials (a total of 60 trials). At the end of each sequence participants heard an auditory cue to begin verbal recall. Participants were instructed to repeat each sequence in the correct order after the cue. In addition to online coding, spoken responses were also recorded. Participants were allowed to have a short break between the blocks. High and low phonotactic probability pseudoword repetitions were administered separately. Each test lasted approximately 15 min. The order of the tests, as well as the order of the experimental session (before or after fMRI), was counterbalanced across participants.
All stimuli were recorded by a native female speaker of Southern British English at a 44,100 Hz sampling rate, and were edited offline using Adobe Audition (Adobe System Corporation, San Jose, CA). Experiments were programmed and run using E-Prime® (Psychology Software Tools, Inc., Pittsburgh, PA) and DMDX (Forster & Forster, 2003) software.
2.2.2. Imaging tasks
In order to investigate the role of premotor and motor regions in speech perception, we used three tasks. Two of the tasks (reading and repetition) required speech production, which should involve obligatory premotor and motor cortex activation. The third task, a perception task using the same materials, was our critical test condition. In order to minimise lexical and semantic effects and to tap networks underlying phonological input and output systems, all stimuli in the imaging experiments were CVC pseudowords. A total of 360 pseudowords were created and were recorded by two speakers (one male and one female, both native speakers of British English) at a 44,100 Hz sampling rate.
For the non-speech baseline control conditions, the pseudoword recordings were passed through a single-channel pulse-train vocoder (Deeks & Carlyon, 2004) implemented using Praat software (www.praat.org). This procedure generated a buzzy sound (a pulse train) filtered to have the same long-term spectrum and amplitude envelope as the original pseudowords, and thus well matched for relevant acoustic properties, including the presence of pitch, harmonic spectral structure and a slowly fluctuating amplitude envelope. For each pseudoword, three control stimuli were constructed with an F0 of 100, 150 or 200 Hz, introducing pitch variability in addition to the intrinsic variability between different pseudowords in their amplitude envelope.
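The published control stimuli were generated in Praat following Deeks and Carlyon (2004). As a rough illustration of the idea only, the Python sketch below builds a fixed-F0 pulse train whose amplitude follows the envelope of a recorded pseudoword; it omits the long-term spectral matching step of the real procedure, and the file name, cutoff and normalisation choices are placeholders of ours.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert
from scipy.io import wavfile

def pulse_train_control(wav_path, f0=100.0, env_cutoff=30.0):
    """Approximate single-channel pulse-train control stimulus: a buzz at a
    fixed F0 whose amplitude follows the envelope of the original recording.
    (The actual stimuli were made in Praat and also matched the long-term
    spectrum, which this sketch does not attempt.)"""
    sr, x = wavfile.read(wav_path)
    x = x.astype(float)
    if x.ndim > 1:                       # mix to mono if the file is stereo
        x = x.mean(axis=1)
    # amplitude envelope: magnitude of the analytic signal, low-pass filtered
    env = np.abs(hilbert(x))
    b, a = butter(4, env_cutoff / (sr / 2), btype="low")
    env = filtfilt(b, a, env)
    # pulse train at the requested fundamental frequency
    pulses = np.zeros(len(x))
    pulses[::int(round(sr / f0))] = 1.0
    buzz = pulses * env
    buzz /= np.max(np.abs(buzz)) + 1e-12  # normalise to avoid clipping
    return sr, buzz

# e.g. one control stimulus per F0 used in the study (100, 150, 200 Hz):
# sr, buzz = pulse_train_control("pseudoword.wav", f0=150.0)
# wavfile.write("buzz_150Hz.wav", sr, (buzz * 32767).astype(np.int16))
```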
All tasks had the same timing characteristics and used the same blocked design, in which test and control (or baseline) stimuli alternated in 12.6 s blocks, as recommended for tasks involving overt speech production (Soltysik & Hyde, 2006, 2008). In addition, a silent inter-block interval of 2 s was included. Blocks consisted of 6 stimuli presented with a 2.1 s stimulus onset asynchrony. Scanning runs consisted of 40 blocks (20 test and 20 control blocks) and took approximately 10 min to complete.
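These numbers are internally consistent; as a quick check of the timing arithmetic (our own illustration, not part of the paper):

```python
# Block and run timing implied by the design described above.
soa = 2.1                                  # stimulus onset asynchrony in seconds
stimuli_per_block = 6
block_len = stimuli_per_block * soa        # 12.6 s, matching the stated block length
inter_block = 2.0                          # silent gap between blocks
blocks_per_run = 40                        # 20 test + 20 control
run_len = blocks_per_run * (block_len + inter_block)
print(block_len, run_len / 60)             # 12.6 s blocks, ~9.7 min per run
```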
The specific tasks were as follows.
Pseudoword reading vs. visual–motor baseline. In the reading task participants had to read aloud short monosyllabic pseudowords. These were printed on the screen in black Times New Roman font at 36-point size. In the control condition participants were presented with unpronounceable consonant strings (e.g. xtqs), and had to say 'yes' to acknowledge them. These consonant strings were matched in length to the average length of the pseudowords (4 letters). To make the consonant strings more salient, they were displayed on the screen in blue.
Pseudoword repetition vs. auditory–motor baseline. The repetition task was designed to engage both speech perception and production. Participants listened to a series of monosyllabic pseudowords and were instructed to repeat each one back immediately. In the control task, participants heard matched non-speech buzzes (as used as a baseline in the speech perception task) and had to say 'yes' after each buzz.
Speech perception vs. auditory baseline. In the speech perception task, participants listened to short monosyllabic pseudowords and were instructed to press a button with their left hand when they detected two successive presentations of the same syllable (one-back task). Only 10% of the stimuli were repeated. In order to prevent participants from relying on low-level acoustic information, the presentation of the auditory pseudowords alternated between a male and a female voice. Hence, judgements of repetition depend on abstract phonological comparisons but do not require division of pseudowords into segments or other meta-phonological abilities. In the control condition, participants listened to the non-speech buzzes and again had to detect immediate repetitions (i.e. two successive stimuli with the same pitch and amplitude envelope) with a left-hand button press.
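The logic of the one-back judgement can be stated in a few lines; the sketch below (illustrative only, with hypothetical field names) makes explicit that a target is defined by the syllable matching the previous trial while the alternating voice is ignored.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    syllable: str   # e.g. "dap"
    voice: str      # "male" or "female" (alternates across trials)

def one_back_targets(trials):
    """Indices where a button press is expected: the syllable repeats the
    immediately preceding one, regardless of which voice produced it."""
    return [i for i in range(1, len(trials))
            if trials[i].syllable == trials[i - 1].syllable]

# Example: the repeat is spoken by a different voice but still counts as a target
# one_back_targets([Trial("dap", "male"), Trial("dap", "female"),
#                   Trial("tob", "male")])  -> [1]
```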
Pseudowords, consonant strings and buzzes were pseudo-randomly distributed and counterbalanced across the three tasks and participants, such that no stimulus was presented twice across tasks. All control tasks were well matched for stimulus and response characteristics. Participants' spoken responses were recorded with a FOMRI MRI-safe noise-cancelling microphone (Opto-Acoustics Ltd., Or-Yehuda, Israel) for offline analysis.
Prior to scanning, participants took part in a short practice session outside of the scanner, during which they were familiarised with the tasks. The practice session included four blocks of each task.
Table 1
Descriptive statistics of the behavioural experiments for 18 participants. CR = correct responses.

| Descriptive statistics | N | Minimum | Maximum | Mean | Std. deviation | Variance |
|---|---|---|---|---|---|---|
| Participants | 18 | | | | | |
| Age | 18 | 19 | 35 | 23.5 | 4.13 | 17.08 |
| Men | 7 | 19 | 35 | 24.7 | 5.3 | 28.2 |
| Women | 11 | 19 | 29 | 22.7 | 3.2 | 10.4 |
| High phonotactic probability pseudowords | | | | | | |
| Length 2 | 17 | 87.5 | 100 | 98.28 | 3.914 | 15.32 |
| Length 3 | 17 | 83.33 | 100 | 94.28 | 5.51 | 30.35 |
| Length 4 | 17 | 50 | 95.83 | 75 | 12.84 | 164.93 |
| Length 5 | 17 | 25 | 73.33 | 51.08 | 12.15 | 147.55 |
| Length 6 | 17 | 18.06 | 59.72 | 38.97 | 11.12 | 123.57 |
| Low phonotactic probability pseudowords | | | | | | |
| Length 2 | 17 | 87.5 | 100 | 98.04 | 3.33 | 11.11 |
| Length 3 | 17 | 77.78 | 100 | 88.89 | 6.87 | 47.26 |
| Length 4 | 17 | 43.75 | 91.67 | 65.93 | 15.41 | 237.6 |
| Length 5 | 17 | 16.67 | 66.67 | 41.18 | 15.84 | 250.96 |
| Length 6 | 17 | 13.89 | 40.28 | 25.57 | 8.63 | 74.52 |
| Average of all high phonotactic probability pseudowords (% CR sequence) | 17 | 57.5 | 85.36 | 71.52 | 6.64 | 44.12 |
| Average of all low phonotactic probability pseudowords (% CR sequence) | 17 | 51.81 | 74.06 | 63.92 | 7.56 | 57.23 |
| Auditory digit | 18 | 14 | 29 | 19.67 | 3.97 | 15.76 |
| Visual digit | 18 | 12 | 27 | 18.055 | 3.9 | 15.23 |
| Spoonerisms first phoneme swap (% CR) | 17 | 35 | 100 | 70.29 | 18.32 | 3.4 |
| Spoonerisms last phoneme swap (% CR) | 17 | 20 | 95 | 61.18 | 20.58 | 4.2 |
| Spoonerisms first and last phoneme average (% CR) | 17 | 40 | 98 | 65.74 | 17.16 | 2.9 |
2.3. Image acquisition and preprocessing
The imaging data were acquired with a 3T Siemens Tim Trio MRI system with a 12-channel head coil. Stimuli were presented over high-quality electrostatic headphones built into ear defenders (NordicNeuroLab, Bergen, Norway). Participants were instructed to stay as still as possible during the scan and to avoid excessive head movement while speaking. We acquired 312 echo planar imaging (EPI) volumes in each of the three 10-minute sessions. Each volume consisted of 32 slices of 3 mm thickness with a 0.75 mm inter-slice gap, TR = 2000 ms, TA = 2000 ms, field of view 19.2 × 19.2 cm, acquisition matrix 64 × 64, echo time 30 ms, flip angle 78°, and in-plane resolution of 3 × 3 mm. The acquisition was transverse oblique, angled to avoid the eyes and to achieve whole-brain coverage including the cerebellum. In a few cases the very top of the parietal lobe was not covered; this did not affect coverage of motor cortex. High-resolution 1 × 1 × 1 mm MPRAGE anatomical images were collected for anatomical localisation and coregistration.
SPM5 was used for image preprocessing and data analysis (Wellcome Trust Centre for Neuroimaging, London, UK). After discarding 7 initial scans from each session to allow for T2 equilibrium, images for each participant were corrected for motion by spatial realignment to the first image in the series, using a least-squares approach with 6 rigid-body parameters (Friston et al., 1995). Following realignment, the images were corrected for differences in slice acquisition time and coregistered with the structural image (Ashburner & Friston, 1997), which was then segmented and normalised (using affine and smoothly nonlinear transformations) to a brain template in Montreal Neurological Institute (MNI) space (Ashburner & Friston, 2005). The resulting normalisation parameters were applied to all the coregistered EPIs. Finally, the EPI images were smoothed with a 10 mm full-width at half-maximum isotropic Gaussian kernel.
Data were first analysed separately for each participant, using a separate general linear model for each session (perception, reading and repetition). Low-frequency noise was removed with a 128 s high-pass filter. Individual stimuli and button presses were separately modelled using delta functions convolved with the canonical hemodynamic response function to create the regressors used in the model. The 6 motion parameters obtained during realignment were also included in the model as additional regressors of no interest. Trials with a button press were modelled out of the analysis. In the perception run, we only analysed trials that resulted in a correct response; in the localiser runs (reading and repetition), all trials were included.
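For readers unfamiliar with this step, the sketch below shows what "delta functions convolved with the canonical hemodynamic response function" amounts to in a few lines of numpy. The actual models were estimated in SPM5; the double-gamma parameters here are just common defaults and are not taken from the paper.

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(tr, duration=32.0):
    """Double-gamma canonical HRF (response peaking around 6 s, undershoot
    around 16 s), sampled once per TR and normalised to unit sum."""
    t = np.arange(0.0, duration, tr)
    hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
    return hrf / hrf.sum()

def stimulus_regressor(onsets_s, n_scans, tr):
    """Build one regressor: a delta (stick) function at each stimulus onset,
    convolved with the canonical HRF and truncated to the scan series."""
    sticks = np.zeros(n_scans)
    for onset in onsets_s:
        idx = int(round(onset / tr))
        if idx < n_scans:
            sticks[idx] = 1.0
    return np.convolve(sticks, canonical_hrf(tr))[:n_scans]

# Example: one block of 6 pseudowords at a 2.1 s SOA, TR = 2 s
# onsets = [30.0 + i * 2.1 for i in range(6)]
# regressor = stimulus_regressor(onsets, n_scans=305, tr=2.0)
```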
Contrasts of parameter estimates from the least mean-square fit of these single-subject analyses were then entered into second-level random-effects analyses (one-sample t-tests). Contrasts of interest were: pseudowords vs. control buzzes in the listening and repetition tasks, and pseudowords vs. control consonant strings in the reading task. In the speech perception task, only trials without a button press were included in the analysis. Unless otherwise specified, results are reported at a whole-brain corrected level of significance of pFWE < .05 (voxelwise). Family-wise error correction was achieved using Random Field Theory as implemented in SPM (Friston, Frith, Liddle, & Frackowiak, 1991). To ensure that critical results are not omitted, results for the perception task are also reported at whole-brain uncorrected p < .001 and qFDR < .05 levels of significance (see Table 2 and Fig. 2). Data for the region of interest (ROI) analyses were extracted using MarsBar (http://marsbar.sourceforge.net).
3. Results

3.1. Behavioural tasks
Descriptive statistics for the behavioural tasks are presented in Table 1. Out of the 20 participants, 18 completed the behavioural tasks. Due to a technical problem, one participant did not complete the pseudoword repetition and auditory digit span tasks, and another did not complete the spoonerisms task. In total, complete behavioural datasets were acquired for 16 participants. For the pseudoword repetition task, analyses were run on the percentage of words correctly repeated. For the digit span tasks, analyses were run on the number of correctly recalled sequences.
Pseudoword repetition accuracy is shown in Fig. 1. To assess the effects of phonotactic probability and sequence length on repetition accuracy, we conducted a repeated-measures ANOVA with Phonotactic probability (high vs. low) and Length (5 levels) as within-subject variables. This analysis revealed significant main effects of both Phonotactic probability, F(1,16) = 25.14, p < .001, and Length, F(1,16) = 299.21, p < .001. The interaction between Phonotactic probability and Length was also significant, F(1,16) = 5.59, p < .01. These results replicate previous findings (Gathercole et al., 1999; Vitevitch & Luce, 2005) indicating that memory for high phonotactic probability pseudowords is better than memory for low phonotactic probability pseudowords, and better for short sequences than for long ones. Post hoc comparisons with paired-sample t-tests revealed that the interaction was driven by length 2, the only sequence length at which performance did not differentiate between high and low phonotactic probability pseudowords (t(16) = 0.187, p = .85, two-tailed). For all other lengths, participants performed better with high phonotactic probability pseudowords (length 3: t(16) = 2.716, p = .015; length 4: t(16) = 3.03, p = .008; length 5: t(16) = 3.3, p = .004; length 6: t(16) = 5.212, p < .001, all two-tailed).
Fig. 1. Performance on high (solid line) and low (dashed line) phonotactic probability pseudoword repetitions as a function of sequence length. Error bars indicate the standard error of the mean after between-subject variability has been removed, appropriate for repeated-measures comparisons (Loftus & Masson, 1994).
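One common way to compute error bars of this kind is to remove each participant's overall mean before taking the standard error, as in the sketch below. Loftus and Masson's original procedure derives the error term from the repeated-measures ANOVA, so treat this as an approximation in the same spirit; the function and variable names are ours.

```python
import numpy as np

def within_subject_sem(data):
    """Standard error per condition after removing between-subject
    variability: each participant's scores are re-centred on the grand
    mean, preserving condition differences. `data` is subjects x conditions."""
    data = np.asarray(data, dtype=float)
    grand_mean = data.mean()
    normalised = data - data.mean(axis=1, keepdims=True) + grand_mean
    return normalised.std(axis=0, ddof=1) / np.sqrt(data.shape[0])

# Example shape for Fig. 1: 17 participants x 5 sequence lengths
# sems = within_subject_sem(repetition_accuracy)
```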
In the spoonerisms task, the average accuracy (±SD) was 70% (±17) for the initial sound swap and 61% (±19) for the final sound swap. Overall accuracy was 65.7% (±16). In the auditory digit span task the average score was 19.1 (±3.21); in the visual digit span task it was 18.05 (±3.79).
3.2. Imaging results
In order to make sure participants did not fall asleep and performed the tasks appropriately, behavioural performance was monitored online. This confirmed that participants followed task instructions in the localiser tasks (reading and repetition), and produced pseudowords and 'yes' answers as appropriate. Only behavioural data from the speech perception task (the experiment of interest) were analysed further. The average detection accuracy was 91.2% (SD = 13.1%) for pseudowords and 90.4% (SD = 13.3%) for the buzzes, indicating that participants performed the speech perception and control tasks reliably.
3.2.1. Whole-brain analysis
We first examined brain areas that showed significant activation during the reading, repetition, and perception tasks compared to their corresponding control conditions, shown in Fig. 2 and Table 2. Results were generally consistent with the findings of previous studies on speech perception and production (see Price, 2010; Price et al., 1996). For pseudoword reading, at the pFWE < .05 statistical threshold, we observed extensive activation of bilateral motor and premotor cortices, left inferior frontal gyrus (LIFG), left inferior temporal gyrus (LIT), the right cerebellum and the supplementary motor area (SMA). In addition, a region in the superior temporal gyrus (STG) was activated, presumably reflecting a response to participants' own speech (Hashimoto & Sakai, 2003; Zheng, Munhall, & Johnsrude, 2010). The pseudoword repetition task revealed activation that overlapped with networks for both speech perception and production, namely bilateral premotor and motor cortices, left middle temporal gyrus (MTG), as well as LIFG and SMA. In addition, the left putamen was also found to be active. Speech perception at the pFWE < .05 statistical threshold showed a much more restricted pattern of activation, encompassing portions of the left inferior and middle temporal gyri. Because we wanted to ensure we were not missing sub-threshold effects for this critical contrast, we repeated this contrast at a