ORIGINAL RESEARCH ARTICLE
published: 24 February 2014
doi: 10.3389/fnhum.2014.00096
Impaired extraction of speech rhythm from temporal
modulation patterns in speech in developmental dyslexia
Victoria Leong* and Usha Goswami
Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Cambridge, UK
Edited by:
Pierluigi Zoccolotti, Sapienza
University of Rome, Italy
Reviewed by:
Fumiko Hoeft, University of
California, San Francisco, USA
Roeland Hankock, University of
California, San Francisco, USA
(in collaboration with Fumiko Hoeft)
Jenny Thomson, University of
Sheffield, UK
Jarmo Hämäläinen, University of
Jyväskylä, Finland
*Correspondence:
Victoria Leong, Department of
Psychology, Centre for
Neuroscience in Education,
University of Cambridge, Downing
Street, Cambridge CB2 3EB, UK
e-mail: vvec2@cam.ac.uk
Dyslexia is associated with impaired neural representation of the sound structure of
words (phonology). The “phonological deficit” in dyslexia may arise in part from impaired
speech rhythm perception, thought to depend on neural oscillatory phase-locking to
slow amplitude modulation (AM) patterns in the speech envelope. Speech contains AM
patterns at multiple temporal rates, and these different AM rates are associated with
phonological units of different grain sizes, e.g., related to stress, syllables or phonemes.
Here, we assess the ability of adults with dyslexia to use speech AMs to identify rhythm
patterns (RPs). We study 3 important temporal rates: “Stress” (∼2 Hz), “Syllable” (∼4 Hz)
and “Sub-beat” (reduced syllables, ∼14 Hz). 21 dyslexics and 21 controls listened to
nursery rhyme sentences that had been tone-vocoded using either single AM rates from
the speech envelope (Stress only, Syllable only, Sub-beat only) or pairs of AM rates
(Stress + Syllable, Syllable + Sub-beat). They were asked to use the acoustic rhythm
of the stimulus to identify the original nursery rhyme sentence. The data showed that
dyslexics were significantly poorer at detecting rhythm compared to controls when they
had to utilize multi-rate temporal information from pairs of AMs (Stress + Syllable or
Syllable + Sub-beat). These data suggest that dyslexia is associated with a reduced
ability to utilize AMs <20 Hz for rhythm recognition. This perceptual deficit in utilizing
AM patterns in speech could be underpinned by less efficient neuronal phase alignment
and cross-frequency neuronal oscillatory synchronization in dyslexia. Dyslexics’ perceptual
difficulties in capturing the full spectro-temporal complexity of speech over multiple
timescales could contribute to the development of impaired phonological representations
for words, the cognitive hallmark of dyslexia across languages.
Keywords: amplitude modulation, envelope, speech rhythm, dyslexia, oscillations
INTRODUCTION
SPEECH RHYTHM AND PHONOLOGICAL AWARENESS IN DYSLEXIA
Dyslexia is characterized across languages by difficulties in
phonological processing (e.g., Snowling, 2000; Ziegler and
Goswami, 2005). Phonological processing encompasses the
encoding and representation of speech at a range of grain sizes,
both segmental (i.e., phoneme) and supra-segmental (e.g., rime,
syllable and stress). As simple decoding (word reading) requires
the acquisition of phonology-orthography correspondences at
different grain sizes (segmental for alphabetic languages, syllabic
for some character-based scripts), this cognitive “phonological
deficit” affects reading acquisition in dyslexia across languages.
While an impairment in segmental processing in dyslexia has long
been noted (e.g., Tallal and Piercy, 1974; Snowling, 1981), supra-
segmental sensitivity has only recently been a focus of study, and
then mainly in English (e.g., Wood and Terrell, 1998; Goswami
et al., 2002, 2010). This is surprising, as children’s phonological
sensitivity to supra-segmental features of speech develops early
in all languages, well before the onset of formal literacy instruc-
tion. Indeed, EEG studies reveal sensitivity to the dominant stress
patterns in the native language within the first months of life
(Friederici et al., 2007; Ragó et al., 2014).
For English-learning infants, this early sensitivity toward dom-
inant syllable stress patterns such as the “Strong-weak” (S-w)
trochaic motif has been shown to be important for word learn-
ing (Jusczyk et al., 1993; Echols et al., 1997). By the age of 7.5
months, English-learning infants are capable of using the trochaic
stress pattern as a template for segmenting words from con-
tinuous speech (Jusczyk et al., 1999). During early childhood,
pre-literate children across languages already exhibit an awareness
for rime and syllable units in speech. Pre-readers are able to iden-
tify pairs of words that rhyme (e.g., “mat” rhymes with “hat” but
not with “cut”), and to clap out the number of constituent sylla-
bles in a word (Bradley and Bryant, 1983; Treiman and Zukowski,
1991; Ziegler and Goswami, 2005). In fact, children’s phonolog-
ical awareness of rhyme, syllables and stress predicts their later
success in learning to read (Bradley and Bryant, 1983; de Bree
et al., 2006; Whalley and Hansen, 2006).
Sensitivity to supra-segmental features of speech, particularly
speech rhythm and syllable stress, also appears to be impaired
in children and adults with developmental dyslexia (e.g., Wood
and Terrell, 1998; Kitzen, 2001; Goswami et al., 2010; Holliman
et al., 2010, 2012; Leong et al., 2011; Mundy and Carroll, 2012).
Acoustically, prosodic rhythm and stress in the speech signal are
Frontiers in Human Neuroscience www.frontiersin.org February 2014 | Volume 8 | Article 96 |1
HUMAN NEUROSCIENCE
Leong and Goswami Impaired perception of temporal modulation in dyslexia
cued by a combination of amplitude, duration and frequency
changes (Hirst, 2006). The amplitude-based cues to rhythm
are contained within the slow-varying “amplitude envelope” of
speech (Plomp, 1983; Howell, 1984, 1988a,b; Greenberg et al.,
2003; Tilsen and Johnson, 2008; Leong, 2012; Tilsen and Arvaniti,
2013). These slowly-varying amplitude patterns also cue the
location of the rhythmic “perceptual (P)-center” or moment of
occurrence of a sound (Allen, 1972; Morton et al., 1976; Scott,
1993, 1998; Villing, 2010). The P-center forms the basis for the
deliberate rhythmic timing of speech and for synchronization of
speech between speakers (Cummins and Port, 1998; Cummins,
2003). The P-center is related perceptually to a particular rhyth-
mic marker within the speech amplitude envelope: the envelope
onset rise time. Perceptual sensitivity to rise time is impaired
in children and adults with dyslexia in a range of languages
(Goswami et al., 2002; Hämäläinen et al., 2005, 2009; Surányi
et al., 2009; Poelmans et al., 2011; Goswami et al., 2011a; see
Goswami, 2011, for a recent summary). The rise time or “attack”
time of a sound refers to the rate at which its amplitude increases
during its initial onset, and is closely related to its P-center and
rhythmic “beat strength.” For example, a trumpet note with a
fast rise time and early P-center will typically be perceived as
having a stronger beat than a bowed violin note with a slower
rise time and later P-center (Gordon, 1987). In speech, envelope
onset rise times distinguish between stressed and unstressed sylla-
bles (Leong et al., 2011; Goswami and Leong, 2013), and provide
phonetic cues to voice onset time and manner of articulation,
for example aiding in phonetic distinctions such as between /b/
and /w/ (Goswami et al., 2011b). Dyslexics’ difficulties in perceiving
amplitude envelope rise times across languages have led
to the theoretical suggestion that a deficit in neural rhythmic
entrainment to amplitude modulation (AM) patterns in speech
could underlie the phonological deficit in developmental dyslexia
(Goswami, 2011; “temporal sampling theory”).
NEURONAL OSCILLATORY ENTRAINMENT IN DYSLEXIA
The speech amplitude envelope contains a spectrum of AM at
different temporal rates, with certain key rates of AM associated
with characteristic timescales of speech information. For exam-
ple, the envelope is dominated by modulations that occur at
around 3–5 Hz, corresponding to the average duration of the syl-
lable (Greenberg et al., 2003; Greenberg, 2006). AMs at a slower
rate of ∼2 Hz are associated with inter-stress intervals in speech,
which have an average duration of 493 ms (Dauer, 1983). Toward
the other end of the modulation spectrum, faster modulations
immediately above the ‘classic’ syllable rate of 3–5 Hz correspond
to more quickly-uttered unstressed syllables (∼10 Hz, Greenberg
et al., 2003). Faster modulations up to 50 Hz are thought to
provide phonemic cues to manner of articulation, voicing, and
vowel identity (Rosen, 1992). Although the amplitude envelope
has been the focus of many speech intelligibility studies (e.g.,
Drullman et al., 1994a,b; Shannon et al., 1995), the spectral fine
structure also makes an important contribution to speech intelli-
gibility, particularly under adverse listening conditions (Qin and
Oxenham, 2003; Xu et al., 2005; Obleser et al., 2012).
Recently, Poeppel and colleagues have proposed a neural
account of speech processing based on multi-time resolution of
the modulation patterns in the speech envelope (multi-time reso-
lution models, e.g., Poeppel, 2003; Giraud and Poeppel, 2012). In
multi-time resolution models, the brain is thought to track speech
information at different timescales using neuronal oscillations at
different frequencies. These neuronal oscillations entrain (“phase-
lock”) to speech modulation patterns on equivalent timescales,
so that peaks and troughs in oscillatory activity align with peaks
and troughs in modulations in the signal. According to Giraud
and Poeppel (2012), neuronal oscillatory activity in the Theta
band (3–7 Hz) tracks syllable patterns in speech, while slower
oscillatory activity in the Delta band (1–3 Hz) tracks phrasal
and intonational patterns, such as stress intervals. Fast oscilla-
tory activity in the Gamma band (25–80 Hz) is thought to track
quickly-varying phonetic information, such as formant transi-
tions and voice-onset times, which have timescales in the order
of tens of milliseconds. This convergence between characteristic
timescales in speech and the dominant neuronal oscillatory bands
in auditory cortex has been used to argue that oscillatory entrain-
ment (“phase locking”) may be an important neural mechanism
for parsing the speech signal into appropriately-sized linguistic
units for further lexical processing (Ghitza and Greenberg, 2009;
Schroeder and Lakatos, 2009; Giraud and Poeppel, 2012; Zion
Golumbic et al., 2012).
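As a rough illustration of the correspondence between speech timescales and oscillatory bands described above, the following sketch converts a duration into a modulation rate and looks up the band it would fall in. The band edges are those quoted in the text for Giraud and Poeppel (2012); the function names and the half-open handling of the shared 3 Hz boundary are assumptions made for this example only.

```python
# Band edges (Hz) as quoted in the text; the dictionary name is illustrative.
BANDS_HZ = {"Delta": (1.0, 3.0), "Theta": (3.0, 7.0), "Gamma": (25.0, 80.0)}


def rate_hz(duration_ms):
    """Convert an interval duration in milliseconds to a rate in Hz."""
    return 1000.0 / duration_ms


def oscillatory_band(rate):
    """Return the band whose half-open range [lo, hi) contains the rate.

    Assumption: 3 Hz is assigned to Theta, since the quoted Delta and
    Theta ranges share that edge. Rates outside all ranges return None.
    """
    for name, (lo, hi) in BANDS_HZ.items():
        if lo <= rate < hi:
            return name
    return None


# Average inter-stress interval of 493 ms (Dauer, 1983), as cited above.
stress_rate = rate_hz(493)
print(round(stress_rate, 2), oscillatory_band(stress_rate))  # 2.03 Delta
print(oscillatory_band(4.0))   # Theta
print(oscillatory_band(14.0))  # None (between the quoted Theta and Gamma ranges)
```

Under these assumptions the ∼14 Hz rate falls between the quoted Theta and Gamma ranges, which is why the lookup returns no band for it.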
In line with dyslexics’ difficulties in rise time perception,
which are particularly evident for slower rise times (Richardson
et al., 2004; Stefanics et al., 2011), Goswami (2011) proposed
a “temporal sampling” framework to explain why the devel-
opment of accurate phonological representation of speech is
impaired across languages in developmental dyslexia. The tem-
poral sampling framework proposed that impaired phonological
representation in dyslexia could arise in part from impaired oscil-
latory entrainment to slow AMs (<10 Hz) that carry stress and
syllable patterning in speech (i.e., involving delta and theta oscil-
lations, see Goswami, 2011; Power et al., 2012, 2013; Soltész et al.,
2013). As neuronal oscillations in the cortex exhibit hierarchi-
cal nesting across slow and fast timescales (e.g., theta-gamma
phase-amplitude coupling; Lakatos et al., 2005), an impairment
in slow oscillatory activity (e.g., delta, stressed syllable rate; theta,
syllable rate) could also have consequences for speech encod-
ing at faster timescales, such as the Gamma or other phonetic
rate timescales. Indeed, recent studies using non-speech stimuli
have indicated that the hemispheric lateralization of Gamma-rate
oscillations (∼30 Hz) may be altered in dyslexia (Lehongre et al.,
2011, 2013).
AM PERCEPTION IN DYSLEXIA
Consistent with Goswami’s (2011) proposal, several AM percep-
tion studies based on non-speech stimuli and psychoacoustic
modulation thresholds indicate that dyslexics show poor AM
sensitivity below 10 Hz (e.g., Lorenzi et al., 2000; Amitay et al.,
2002; Rocheron et al., 2002; although note that Poelmans et al.,
2012 observed no deficit at 4 Hz). Studies reporting on modula-
tion thresholds for faster AM rates vary in whether they report
dyslexic deficits. For example, while McAnally and Stein (1997),
Witton et al. (1998), and Menell et al. (1999) all observed deficits
in dyslexics’ AM detection at ∼20 Hz, Hämäläinen et al. (2009)
failed to find a deficit at the same rate. Meanwhile, while no
dyslexic deficit at 80 Hz was reported by Hari et al. (1999), a
study by Poelmans et al. (2012) found atypical laterality effects
in EEG for 20 Hz AM speech-weighted noise, and a study by
Lehongre et al. (2011) found atypical laterality effects in MEG
for 35 Hz AM white noise. Similarly mixed results have been
observed for dyslexics’ perception of very slow “stress rate” AMs.
While an early study by Witton et al. (1998) found that the per-
ception of 2 Hz AMs was unimpaired in dyslexia, subsequent
studies by Stuart et al. (2006) and Hämäläinen et al. (2012) have
reported significant group differences in AM sensitivity at the
1 Hz and 2 Hz rates respectively. From the non-speech studies,
it is currently unclear whether dyslexics have a general deficit
in AM perception that affects all modulation rates, or whether
their deficit is specific to the AM rates <10 Hz that are identified
in temporal sampling theory (Goswami, 2011). It is also possi-
ble that a single auditory anomaly, impaired phonemic sampling
in left auditory cortex, accounts for the impaired phonological
processing found in dyslexia (Lehongre et al., 2011).
While AM studies are important for studying phase-locking,
their implications for real-life speech perception are limited
because the AM patterns used in these studies are artificial sinu-
soids and not real speech AMs. Real-speech AMs differ from
artificial sinusoids in several important ways. First, unlike sinu-
soids, speech AMs are not perfectly periodically regular, but
contain phase-advancements or delays that reduce their tempo-
ral predictability. Secondly, real-speech AMs differ in patterning
at different acoustic frequencies. These temporal differences in
modulation patterning across different “spectral channels” are
crucial for speech intelligibility (e.g., Shannon et al., 1995).
Finally, in real speech, AM patterns at all timescales (e.g., stress,
syllable and phoneme) are concurrently transmitted to the lis-
tener, unlike artificial AM studies in which only one AM rate is
presented at a time. During real-life speech processing, listeners
probably extract speech information using combinations of AMs
at different rates. For example, we have recently reported that lis-
teners detect prosodic RPs by computing the phase relationship
between two concurrent rates of speech AM: the “Stress” rate
(∼2 Hz) and the “Syllable” rate (∼4 Hz; see Leong, 2012). This
proposal is summarized in Figure 1. Dyslexics’ ability to use such
AM combinations in real speech has, to our knowledge, not been
tested.
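The phase-relationship idea can be sketched in a few lines. The caption of Figure 1 describes a prominence index (PI) that is near 1 when a Syllable AM peak coincides with the oscillatory peak of the Stress AM and near 0 when it coincides with the trough. The exact transform is not given here, so the mapping PI = (1 + cos φ)/2 used below is an assumption chosen only because it has those two end points; the function names and the 0.5 threshold are likewise illustrative.

```python
import math


def prominence_index(stress_phase):
    """Map a Stress AM phase in radians (0 = oscillatory peak) onto [0, 1].

    Assumed form: PI = (1 + cos(phase)) / 2, so a peak gives 1 and a
    trough (phase = pi) gives 0.
    """
    return 0.5 * (1.0 + math.cos(stress_phase))


def stress_pattern(stress_phases, threshold=0.5):
    """Label each syllable 'S' (strong) or 'w' (weak) from its PI."""
    return "".join(
        "S" if prominence_index(phase) >= threshold else "w"
        for phase in stress_phases
    )


# Idealized case: with a 2 Hz Stress AM and a 4 Hz Syllable AM, successive
# Syllable AM peaks fall alternately on the Stress AM peak and trough.
trochaic_phases = [0.0 if i % 2 == 0 else math.pi for i in range(8)]
iambic_phases = [math.pi if i % 2 == 0 else 0.0 for i in range(8)]

print(stress_pattern(trochaic_phases))  # SwSwSwSw
print(stress_pattern(iambic_phases))    # wSwSwSwS
```

In real speech the phases would be estimated from the extracted AMs rather than set by hand, and they would not sit exactly on the peak or trough.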
One obvious difficulty is that the complexity of the speech sig-
nal makes the extraction of specific features like cross-frequency
AM phase alignment at pre-determined rates very difficult.
Accordingly, studies using “vocoded” (envelope-only) real speech
are useful. In vocoder studies, the speech signal is split into dif-
ferent frequency channels (e.g., typically 2, 4, 8 or 16 channels),
the envelopes from each channel are used to modulate noise or
tone carriers, and are then recombined. The resulting speech
sounds like a harsh whisper, and is initially difficult to recog-
nize. Speech vocoder studies with dyslexic children consistently
suggest that their ability to use envelope cues for speech percep-
tion is impaired (e.g., Lorenzi et al., 2000; Johnson et al., 2011;
Nittrouer and Lowenstein, 2013). For example, Lorenzi et al.
(2000) used 4-channel noise-vocoded VCV syllables (e.g., /aCa/)
as stimuli, and found that both typically-developing and dyslexic
11-year-old children performed more poorly than adults when
using envelope cues (<500 Hz) for speech intelligibility. However,
while the speech recognition performance of control children
improved significantly over the course of five training sessions
during the experiment, the performance of dyslexic children did
not improve with training. Johnson et al. (2011) and Nittrouer
and Lowenstein (2013) found more direct evidence for impaired
speech envelope perception in dyslexia. In their study using 4-
and 8-channel semantically-unpredictable noise-vocoded mono-
syllabic sentences (e.g., “dumb shoes will sing”), Johnson et al.
(2011) found that 10–11 year-old children with reading diffi-
culties showed significantly poorer word recognition of vocoded
speech than control children, for both 4- and 8-channel stim-
uli. Similarly, Nittrouer and Lowenstein (2013) used 4-channel
noise-vocoded sentences and found that there were consistent
differences in speech perception performance between typically-
developing and dyslexic children, for both age groups tested (8–9
years and 10–11 years).
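For readers unfamiliar with vocoding, the generic procedure described above (split the signal into frequency channels, extract each channel envelope, modulate a carrier, recombine) might be sketched as follows. This is a minimal illustration and not the processing used in any of the cited studies: the brick-wall FFT filters, the rectify-and-low-pass envelope, the tone carriers at each channel's geometric centre, and the channel edges are all assumptions chosen for brevity.

```python
import numpy as np


def fft_bandpass(x, fs, lo, hi):
    """Brick-wall band-pass: keep FFT bins with lo <= f < hi (a simplification)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def channel_envelope(band, fs, cutoff_hz=500.0):
    """Envelope by full-wave rectification followed by low-pass filtering."""
    smoothed = fft_bandpass(np.abs(band), fs, 0.0, cutoff_hz)
    return np.maximum(smoothed, 0.0)  # clip small negative ripples


def tone_vocode(x, fs, edges_hz, cutoff_hz=500.0):
    """Sum of sine carriers, each modulated by one channel's envelope."""
    t = np.arange(len(x)) / fs
    out = np.zeros(len(x))
    for lo, hi in zip(edges_hz[:-1], edges_hz[1:]):
        env = channel_envelope(fft_bandpass(x, fs, lo, hi), fs, cutoff_hz)
        carrier = np.sin(2.0 * np.pi * np.sqrt(lo * hi) * t)
        out += env * carrier
    return out


# Synthetic input: a 1200 Hz tone amplitude-modulated at 4 Hz.
fs = 16000
t = np.arange(fs) / fs
x = (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1200 * t)
edges = [100.0, 400.0, 1000.0, 2400.0, 6000.0]  # hypothetical 4-channel split
y = tone_vocode(x, fs, edges)
print(y.shape)
```

Replacing the sine carriers with band-limited noise would give a noise vocoder of the kind used in the child studies discussed above.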
In each of these studies, the vocoded stimulus typically con-
tained a very wide range of envelope AM rates rather than a
single AM rate (e.g., the envelope was low-pass filtered under
500 Hz). Thus, a complication of these experiments is that a
deficit in perceiving speech modulations at a specific rate (e.g.,
4 Hz) would be masked if the dyslexic children were able to
extract redundant speech information at other modulation rates
(e.g., 20 Hz) to compensate for a slow AM deficit (see Drullman,
2006). Conversely, if a difference in performance is observed
(as was the case in these studies), it is not clear whether this
is caused by a general deficit in AM processing that affects all
modulation rates, a specific deficit at certain AM rates (e.g., per-
taining to stress, syllable or phoneme-rate information), or a
deficit in combining AM information across different temporal
rates. Therefore, to assess speech AM perception in dyslexia more
closely, a combination of the two approaches (from AM studies
and vocoding studies) is needed. Ideally, the stimuli should be
created from the envelopes of real speech, but AMs at specific
modulation rates (or combinations of modulation rates) should
be systematically isolated from these real envelopes. Here, we
present one such study.
EXPERIMENTAL RATIONALE AND HYPOTHESES
Given the prior literature on the relationship between rhythmic
awareness and reading (e.g., Thomson et al., 2006; Thomson
and Goswami, 2008; Goswami and Leong, 2013; Tierney and
Kraus, 2013), we were specifically interested in assessing dyslex-
ics’ ability to use different AM rates in speech for rhythm per-
ception (rather than speech intelligibility per se). Accordingly,
we devised a rhythm perception task using rhythmic sentences
(nursery rhymes) that had been tone-vocoded using different
AM rates. For normal adult listeners, speech rhythm percep-
tion relies on sensitivity to the phase-relationship between 2
key AM rates (stress ∼2 Hz and syllable ∼4 Hz; Leong, 2012).
Furthermore, in prior work on rhythmic entrainment, we have
shown that children and adults with dyslexia show “tapping to the
beat” impairments at 2 Hz (Thomson et al., 2006; Thomson and
Goswami, 2008), while when tapping to speech rhythms adults
with dyslexia show impairment at the syllable rate (∼4 Hz; Leong
and Goswami, 2014). Accordingly, here we presented dyslexic and
FIGURE 1 | Computation of strong-weak (s-w) syllable stress patterns
using the phase-relationship between “Stress”- and “Syllable”-rate
amplitude modulations (AMs) in the speech envelope, illustrated
with the trochaic (s-w) nursery rhyme sentence “Mary Mary quite
contrary.” Left, (A) the original waveform of the speech signal is shown
at the top, with the whole-band amplitude envelope superimposed as a
bold line. The envelope is band-pass filtered at three different rates to
produce a Stress AM (∼2 Hz), a Syllable AM (∼4 Hz) and a Sub-beat AM
(∼14 Hz) respectively. Right, (B) to compute the syllable stress pattern of
the sentence, the oscillatory phase series of the Stress AM and the
Syllable AM are extracted. Here, AM phase values are projected onto a
cosine function for ease of visualization. Note that the 8 Syllable AM
cycles correspond to the 8 spoken syllables in the sentence. The
concurrent Stress AM phase at Syllable AM peaks (indicated with vertical
dotted lines) is transformed into a prominence index (PI), shown in the
bar graph at the top. Syllable AM peaks that occur near the oscillatory
peak of the Stress AM achieve PI values of ∼1, while Syllable AM peaks
that occur near the oscillatory trough of the Stress AM achieve PI values
of ∼0. Here, syllables with a high PI (near 1) are considered “strong”
while syllables with a low PI (near 0) are considered “weak.” Note that
this Stress-Syllable AM phase relationship accurately reflects the trochaic
syllable stress pattern of the sentence.
control adult listeners with tone-vocoded (envelope-only) sen-
tences that contained only a narrow range of AM rates under
20 Hz. In order that the modulation patterns in our stimuli would
be realistically speech-like, these modulation bands did not con-
tain only a single AM rate (i.e., a “4 Hz” sinusoid). Rather each
AM band contained a narrow range of AM rates centered around
a target rate (e.g., 2.3–7 Hz, centered around 4 Hz), each of which
we refer to in shorthand by the center rate (e.g., here as “∼4 Hz”
or “Syllable-rate AMs”).
Our dependent variable was the accuracy of speech rhythm
perception. We created stimuli that contained modulations from
either a single narrow AM band (i.e., Stress only ∼2 Hz, Syllable
only ∼4 Hz, Sub-beat only ∼14 Hz), or from paired combinations
of AM bands (Stress + Syllable and Syllable + Sub-beat). On
the basis of the temporal sampling framework (Goswami, 2011),
we predicted no dyslexic impairment at the sub-beat band rate
of ∼14 Hz (included as a control frequency band), but significant
impairment at both rates <10 Hz (Syllable and Stress rates). On
the basis of our prior data on rhythmic entrainment to speech
rhythms (Leong and Goswami, 2014), we also predicted that
dyslexics would have difficulty in combining speech information
across different temporal modulation rates. As Leong’s modeling
work (Leong, 2012) has shown that rhythm perception depends
critically on the Stress + Syllable AM combination, it may be that
particular dyslexic difficulty is found for this combination.
Note that in this experiment we used the ‘Sub-beat’ rate
(∼14 Hz) as a control AM band, not the “phoneme rate”
(∼30 Hz) that is the theoretical focus of AM work by Lehongre
et al. (2011, 2013). Our decision was motivated by the classic
psychophysical studies of Drullman et al. (1994a,b). These
studies indicated that AM rates up to 16 Hz are the most
important for speech intelligibility, and that the inclusion of
faster AM rates above 16 Hz results in little improvement to
intelligibility. Furthermore, in a rhythmic context, we noticed
that unstressed syllables are often compressed to a “sub-beat”
length in order to fit within the standard “beat” length of one
ordinary syllable. For example, in the nursery rhyme sentence
“Humpty Dumpty sat on the wall,” the syllables “sat” and “on”
are compressed together, or reduced, to fit the space of one
regular syllable like “Hum.” Consequently, the overall trochaic
rhythm of the sentence is not disrupted. Thus, the “Sub-beat”
rate (∼14 Hz) is likely to correspond to speech modulations
that are important for intelligibility, but which contribute little
toward the overall rhythmic patterning of “Strong” and “weak”
beats in a sentence, making this an ideal control modulation
band. As the cited “phoneme” rate (∼30 Hz) commonly refers
to the timescale of formant transition patterns in speech (e.g.,
Giraud and Poeppel, 2012), we plan to examine this rate in
the context of frequency modulation (FM) perception in future
studies.
METHODS
PARTICIPANTS
Twenty-one adults (9 M, 12 F) with developmental dyslexia and
26 control adults (7 M, 19 F) participated in the study. All dyslexic
participants had received a formal diagnosis of developmental
dyslexia and also showed significant reading and phonological
deficits according to our own test battery. All participants had no
other diagnosed auditory or learning difficulties, spoke English
as a first language, and were aged under 40 years. As shown in
Table 1, dyslexic and control participants were matched on IQ
[2 subscales of the Wechsler Abbreviated Scale of Intelligence
(WASI), Wechsler, 1999: A non-verbal subscale (Block Design)
and a verbal subscale (Vocabulary)]. However, there was a signif-
icant age difference between dyslexic and control groups, where
controls were slightly older on average [dyslexic mean age = 22.9
years; control mean age = 25.5 years; F(1,45) = 5.66, p < 0.05].
To account for this age difference, all our subsequent statistical
analyses include age as a covariate. As this statistical solution is
impartial, we felt that it would be preferable to manually exclud-
ing certain participants on the basis of their age, which would
entail subjectivity as to how many and which participants to
exclude.
Table 1 | Group performance on standardized ability, literacy and
phonological tests.
Task                              Dyslexic (SE)   Controls (SE)   F(1,45)
Age                               22.9 (0.6)      25.5 (0.8)      5.66*
IQ                                129.6 (1.0)     129.8 (1.5)     0.01
- Non-Verbal IQ T score           70.6 (0.7)      70.7 (0.8)      0.01
- Verbal IQ T score               62.0 (1.0)      62.0 (1.5)      0.00
Auditory STM score (out of 16)    10.3 (0.4)      13.0 (0.4)      22.91***
Reading standard score            110.8 (1.4)     115.8 (1.0)     8.81**
Spelling standard score           104.7 (1.5)     117.0 (1.2)     43.68***
Phonology score (out of 30)       26.1 (0.4)      28.5 (0.3)      22.13***
*p < 0.05; **p < 0.01; ***p < 0.001.
Consistent with their diagnosis, dyslexics performed signifi-
cantly more poorly than controls in standardized tests for lit-
eracy [Wide Range Achievement Test (WRAT-III), Reading and
Spelling scales, Wilkinson, 1993] and phonological awareness
[Phonological Assessment Battery (PhAB), Spoonerisms task,
Fredrickson et al., 1997; Wechsler Adult Intelligence Scale-Revised
(WAIS-R) forward digit span subtest, Wechsler, 1981]. Thus,
despite the relatively high IQ of both groups (reflecting the fact
that these were high-performing students at a world-class uni-
versity), dyslexic participants still lagged behind their peers in
their reading, spelling and phonological awareness skills. Both
control and dyslexic participants also took part in other stud-
ies on rhythm perception and production (see also Leong and
Goswami, 2014). Ethical approval for the study was obtained
from the Cambridge Psychology Research Ethics Committee, and
all participants were given a modest payment for taking part in
the experiments.
MATERIALS
In line with our focus on rhythm, children’s nursery rhymes were
used as stimuli because these are a form of naturally-occurring,
rhythmically-rich speech material, whose rhythm patterns (RPs)
should be familiar to and easily identified by listeners. Four duple-
meter nursery rhymes were used for the experiment, taking the
first line of each nursery rhyme (8 syllables). The sentences fell
into either of two RPs, as shown in Table 2. Two sentences had
a “S-w” or trochaic pattern. These were “MA-ry MA-ry QUITE
con-TRA-ry” and “SIM-ple SI-mon MET a PIE-man” (stressed
syllables in CAPS). The other two sentences had a “w-S” or
iambic pattern. These were “as I was GO-ing TO st IVES” and
“the QUEEN of HEARTS she MADE some TARTS.” We chose to
use trochaic and iambic patterns because these are the dominant
prosodic motifs found in children’s nursery rhymes (Gueron,
1974), and were easily understood by our participants. A total
of 4 sentences (2 per RP) were used to encourage participants
to attend to the global “S-w” or “w-S” rhythm patterning that
was common between the 2 exemplars of each pattern. Using two
exemplars also prevented reliance on minor non-rhythmic vari-
ations (e.g., total stimulus length) to perform the task. We did
not use more than 4 sentences as this would have unnecessarily
increased the difficulty of the task (which was already high in dif-
ficulty). Each sentence was ∼2 s in length (Mary: 2.01 s; Simon:
2.12 s; St Ives: 2.37 s; Queen: 2.31 s). The nursery rhymes were
spoken by a female native speaker of British English who was
articulating in time to a 4 Hz (syllable rate) metronome beat. The
speaker was instructed to produce the RP of each nursery rhyme
Table 2 | List of nursery rhyme sentences and their rhythm pattern.
Rhythm pattern (S, Strong; w, weak)   Nursery rhyme sentence (CAPS, Strong syllable)
SwSwSwSw (trochaic)                   “MA-ry MA-ry QUITE con-TRA-ry”
                                      “SIM-ple SI-mon MET a PIE-man”
wSwSwSwS (iambic)                     “as I was GO-ing TO st IVES”
                                      “the QUEEN of HEARTS she MADE some TARTS”
as clearly as possible. Utterances were digitally recorded using a
TASCAM digital recorder (44.1 kHz, 24-bit), and the metronome
was not audible in the final recording.
RHYTHM PERCEPTION TASK
In each trial, participants heard one of four tone-vocoded nursery
rhyme sentences. They were asked to indicate the target sen-
tence (one of four) by selecting an appropriate response button.
Participants were told to base their judgment on the RP of the
stimulus. Given that the vocoded sentences had a clear rhythm
but were unintelligible (see Section Signal Processing Steps for
Tone Vocoding), we did not expect participants’ sentence identification
to exceed 50% in accuracy (i.e., we expected accurate
discrimination between trochaic and iambic sentences, but not
between the 2 sentences sharing the same RP). All participants were first
given 20 practice trials, during which they heard the four sen-
tences as originally spoken, without any vocoding. This enabled
participants to learn the RP of each sentence, and to become
familiar with the response button mapping. Subsequently, par-
ticipants performed the task with tone-vocoded stimuli only. The
tone-vocoded stimuli retained the temporal pattern of each nurs-
ery rhyme sentence, but were completely unintelligible. Cartoon
icons representing the four response options were displayed on
the computer screen throughout the experiment to help to reduce
the memory load of the task. Auditory stimuli were presented
diotically using Sennheiser HD580 headphones at 70 dB SPL. The
experimental task was programmed in Presentation and delivered
using a Lenovo ThinkPad Edge laptop.
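The 50% ceiling mentioned above follows from simple arithmetic, sketched here under an idealized assumption that is ours rather than a model fitted in the study: a listener recovers the rhythm pattern with some probability and then guesses uniformly between the two sentences that share the chosen pattern. The function name and parameters are illustrative.

```python
def expected_accuracy(p_rhythm_correct, n_exemplars=2):
    """Expected sentence-identification accuracy under the guessing assumption.

    A correct sentence response requires the correct rhythm pattern
    (probability p_rhythm_correct) and a lucky guess among the
    n_exemplars sentences that share that pattern.
    """
    return p_rhythm_correct / n_exemplars


print(expected_accuracy(1.0))  # 0.5  (perfect rhythm discrimination)
print(expected_accuracy(0.5))  # 0.25 (chance level with four sentences)
```

On this view, accuracy between 25% and 50% reflects partial discrimination of trochaic from iambic patterns.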
Signal processing steps for tone vocoding
AM bands were extracted from the amplitude envelope of the
speech signal of each nursery rhyme sentence using two differ-
ent methods. In the first method, the amplitude envelope was
extracted using the Hilbert transform. This Hilbert envelope was
then passed through a modulation filterbank (MFB) of band-pass
filters, which effectively isolated speech AMs corresponding to
the (1) “Stress” rate (0.8–2.3 Hz), (2) “Syllable” rate (2.3–7 Hz),
and (3) “Sub-beat” (7–20 Hz) rate. Please see Stone and Moore
(2003) for details of the spectral filterbank design, which was
adapted to be used as a MFB here. It is possible that artificial
modulations may be introduced into the stimuli by the MFB
method, since band-pass filters can introduce modulations near
the center-frequency of the filter, through “ringing.” Therefore,
a second AM-hierarchy extraction method was also used. This
was Probabilistic Amplitude Demodulation (PAD; Turner and
Sahani, 2011), and did not involve the Hilbert transform or fil-
tering. Rather, the PAD method estimates the signal envelope
using a model-based approach in which the signal is assumed
to comprise the product of a positive slow envelope and a fast
carrier. Bayesian statistical inference is used to invert the model,
thereby identifying the envelope which best matches the data and
the a priori assumptions (i.e., a positive-valued envelope whose
mean is constant over time). This envelope extraction protocol
can be run recursively at different timescales, yielding AMs at
the same modulation rates as those derived from MFB filtering
(Turner and Sahani, 2007; Turner, 2010). All participants heard
both MFB-derived and PAD-derived vocoded stimuli in the same
experiment. It was reasoned that if participants produced the
same pattern of results with two methods of AM extraction that
operate using very different sets of principles, the observed effects
were likely to have arisen from real features in speech rather than
filtering artifacts.
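The first (Hilbert plus band-pass) route can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: it uses a naive DFT and brick-wall frequency masks in place of the Stone and Moore-style filterbank, and the synthetic test signal, sampling rate and function names (`dft`, `hilbert_envelope`, `band_limit`) are assumptions introduced here for demonstration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short demo signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def hilbert_envelope(signal):
    """Envelope as the magnitude of the analytic signal (negative frequencies zeroed)."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue  # leave the Nyquist bin unchanged
        if k < (n + 1) // 2:
            spectrum[k] *= 2  # double positive frequencies
        else:
            spectrum[k] = 0   # remove negative frequencies
    return [abs(v) for v in dft(spectrum, inverse=True)]


def band_limit(envelope, fs, lo_hz, hi_hz):
    """Brick-wall band-pass (an assumed stand-in for one modulation filterbank channel)."""
    n = len(envelope)
    spectrum = dft(envelope)
    for k in range(n):
        freq = abs(k if k <= n // 2 else k - n) * fs / n
        if not (lo_hz <= freq < hi_hz):
            spectrum[k] = 0
    return [v.real for v in dft(spectrum, inverse=True)]


# Synthetic "speech-like" signal: a 24 Hz carrier whose envelope contains
# a 2 Hz (Stress-rate) and a 4 Hz (Syllable-rate) modulation.
fs, n = 64, 256
t = [i / fs for i in range(n)]
true_env = [1 + 0.4 * math.cos(2 * math.pi * 2 * ti) + 0.3 * math.cos(2 * math.pi * 4 * ti)
            for ti in t]
signal = [e * math.cos(2 * math.pi * 24 * ti) for e, ti in zip(true_env, t)]

env = hilbert_envelope(signal)
stress_am = band_limit(env, fs, 0.8, 2.3)    # "Stress" band from the text
syllable_am = band_limit(env, fs, 2.3, 7.0)  # "Syllable" band from the text
```

Because the synthetic envelope contains exactly one component in each band, `stress_am` recovers the 2 Hz modulation and `syllable_am` the 4 Hz modulation. A real filterbank would have finite roll-off and could show the "ringing" discussed above.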
The MFB- and PAD-derived AMs were used to modulate a
500 Hz sine-tone carrier in a single-channel vocoder. A multi-
channel vocoder was not used to ensure that the sentences would
be completely unintelligible. As the dependent variable in the
experiment was how well participants could identify each sen-
tence on the basis of its AM RP, all other cues to sentence identity
needed to be removed. Therefore, the phonetic fine structure of the
signal was intentionally discarded. In addition, the AMs derived
from the amplitude envelope were used to modulate the sine-tone
carrier, rather than being combined back with the fine struc-
ture of the signal. To create single-AM band stimuli (e.g., Stress
only), the appropriate AM band was extracted and combined with
the 500 Hz sine-tone carrier. A 30 ms-ramped pedestal at chan-
nel RMS power was added prior to combining with the carrier.
To create double-AM band stimuli (e.g., Stress +Syllable), the
two AM bands were first combined via addition (for MFB) or
multiplication (for PAD) before combining with the carrier. All
stimuli were equalized to 70 dB. These signal processing steps are
illustrated in Figure 2.
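The combination and carrier-modulation steps described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the 30 ms-ramped pedestal and the 70 dB level equalization are omitted, and the AM bands are assumed to be equal-length lists of samples.

```python
import math


def combine_bands(am_bands, method):
    """Combine AM bands sample by sample: addition for MFB, multiplication for PAD."""
    if method not in ("MFB", "PAD"):
        raise ValueError("method must be 'MFB' or 'PAD'")
    combined = []
    for samples in zip(*am_bands):
        combined.append(sum(samples) if method == "MFB" else math.prod(samples))
    return combined


def tone_vocode(am_bands, method, fs, carrier_hz=500.0):
    """Single-channel tone vocoder: modulate a sine-tone carrier with the combined AM."""
    modulator = combine_bands(am_bands, method)
    return [m * math.sin(2 * math.pi * carrier_hz * i / fs)
            for i, m in enumerate(modulator)]


# Toy example with constant "AM bands" so the arithmetic is easy to follow.
fs = 8000
stress = [1.0] * 16
syllable = [0.5] * 16
single_band = tone_vocode([stress], "MFB", fs)            # e.g., Stress only
double_mfb = tone_vocode([stress, syllable], "MFB", fs)   # bands added
double_pad = tone_vocode([stress, syllable], "PAD", fs)   # bands multiplied
```

A single-band stimulus is simply the one-element case of the same function, which mirrors the description of single and double AM band stimuli in the text.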
The resulting tone-vocoded sentences had clear temporal patterns
ranging from “Morse-code” to flutter, but were otherwise
completely unintelligible (see Audios 1–5 in Supplementary
Material). Figure 3 illustrates the different types of AM-vocoded
stimuli used in the experiment, contrasting trochaic (“Mary
Mary”) and iambic (“the Queen of Hearts”) sentences.
Design
As explained in Section Experimental Rationale and Hypotheses,
five different AM bands or band combinations were used for
vocoding. This generated 3 types of single AM band stimuli
(Stress only; Syllable only; Sub-beat only) and 2 types of paired
AM band stimuli (Stress +Syllable; Syllable +Sub-beat). For
each AM combination, each of the 4 nursery rhyme sentences was
presented 10 times (5 MFB and 5 PAD stimuli) in a fully random-
ized order, giving 40 trials per AM type and 200 trials in total
for the entire experiment. Participants were scored in terms of
their sentence identification accuracy for each AM type (Accuracy
scores), and their ability to discriminate more generally between
trochaic and iambic RPs (RP scores). We had previously found
that control participants showed no difference in listening accu-
racy for MFB and PAD stimuli (Leong, 2012). In our preliminary
analysis of the current data, we likewise found that there was no
difference in performance for PAD as compared to MFB stimuli
[F(1,44)=2.74, p=0.11]. Therefore, to simplify further analy-
sis, the scores for the two types of stimuli in each condition were
averaged into a single mean score for each participant.
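The trial structure described in this section can be checked with a short sketch. The labels and the function name `build_trials` are illustrative assumptions; the original experiment was programmed in Presentation, not Python.

```python
import itertools
import random

AM_TYPES = ["Stress", "Syllable", "Sub-beat", "Stress+Syllable", "Syllable+Sub-beat"]
SENTENCES = ["Mary", "Simon", "Ives", "Queen"]  # short labels for the 4 nursery rhymes
METHODS = ["MFB", "PAD"]
REPS_PER_METHOD = 5  # 5 MFB + 5 PAD presentations of each sentence per AM type


def build_trials(seed=None):
    """Return the full trial list in a fully randomized order."""
    trials = [
        {"am_type": am, "sentence": sentence, "method": method}
        for am, sentence, method in itertools.product(AM_TYPES, SENTENCES, METHODS)
        for _ in range(REPS_PER_METHOD)
    ]
    random.Random(seed).shuffle(trials)
    return trials


trials = build_trials(seed=1)
```

This yields 5 AM types × 4 sentences × 10 presentations = 200 trials, with 40 trials per AM type, matching the counts given above.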
RESULTS
SENTENCE IDENTIFICATION ACCURACY
Figure 4 shows the mean Accuracy scores achieved by the control
and dyslexic groups for each AM type. To check for floor effects
in performance (which could obscure group differences), we
FIGURE 2 | Illustration of the signal processing steps involved in
tone-vocoding for the nursery rhyme sentence “Mary Mary quite
contrary.” (A) The original speech signal with its wholeband amplitude
envelope overlaid in bold. (B) The Stress AM, Syllable AM and Sub-beat AMs
are extracted from the envelope using either the MFB or PAD method. Single
and double AM band vocoded stimuli are then generated by combining the
AMs with a 500 Hz sine tone. To generate single AM band stimuli (bottom
left), each single AM band is multiplied individually with the sine tone. To
generate double band AM stimuli (bottom right), the two AMs are first
combined via addition (MFB) or multiplication (PAD) before multiplication with
the sine tone. The resulting double band vocoded stimulus contains temporal
patterning at two main rates (i.e., second-order modulation).
assessed whether participants’ scores for each AM type were
significantly above the level of chance (25%). Accordingly, sep-
arate one-sample t-tests were conducted for control and dyslexic
groups against the test value of 0.25. As this necessitated 10 t-tests
in total, Holm’s sequential Bonferroni correction was applied
to the p-value threshold for significance (Holm, 1979). Holm’s
sequential Bonferroni correction entails a smaller reduction in
statistical power than the standard Bonferroni correction, and
is a widely-used alternative for controlling for Type 1 family-
wise error (Rice, 1989; Perneger, 1998). In the Holm-Bonferroni
method, the threshold for significance is computed as 0.05/(10 −
[rank of uncorrected p-value] + 1). Therefore, for the smallest
(rank 1) p-value, the Holm Bonferroni-corrected threshold
for significance was 0.05/(10 − 1 + 1) = 0.005, whereas for the
largest (rank 10) p-value, the threshold for significance was
0.05/(10 − 10 + 1) = 0.05. The results of the t-tests indicated
that both controls and dyslexics performed significantly above
chance for all 5 AM types. Accordingly, we investigated whether
there were group differences across the 5 AM types.
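The Holm-Bonferroni thresholds described above can be sketched as follows. The function names are hypothetical, and the step-down stopping rule shown is the standard Holm (1979) procedure rather than a detail reported in this article.

```python
def holm_thresholds(m, alpha=0.05):
    """Threshold for each rank (1 = smallest p-value): alpha / (m - rank + 1)."""
    return [alpha / (m - rank + 1) for rank in range(1, m + 1)]


def holm_reject(p_values, alpha=0.05):
    """Step-down test: stop at the first ranked p-value that exceeds its threshold."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] > alpha / (m - rank + 1):
            break
        reject[idx] = True
    return reject


thresholds_10 = holm_thresholds(10)  # the 10 t-tests on Accuracy scores
```

For 10 tests, the first threshold is 0.05/10 = 0.005 and the last is 0.05/1 = 0.05, as stated in the text.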
Two repeated measures ANCOVA analyses were conducted.
In the first analysis, we compared group performance for the
3 single AM bands (Stress only, Syllable only, Sub-beat only).
Single AM band (3 levels) was entered into the ANCOVA as the
within-subjects factor, and Group (2 levels) was entered as the
between subjects factor. Age was entered as a covariate factor.
The results of the first ANCOVA showed no significant main
effect of Group [F(1,44)=0.14, p=0.71], and no interaction
between single AM band and Group [F(2,88)=0.37, p=0.69].
This suggests that controls and dyslexics were performing equally
well in their use of single AM-band information for rhythm
perception.
In the second RM ANCOVA analysis, we investigated group
differences in the ability to combine information across more
than one AM band. The second ANCOVA entered double-AM
band (2 levels, Stress +Syllable, Syllable +Sub-beat) as the
within-subjects factor, and Group (2 levels) as the between sub-
jects factor. Age was again entered as a covariate factor. This
second ANCOVA showed a significant main effect of Group
[F(1,44)=4.51, p<0.05], but the interaction between AM band
and Group did not approach significance [F(1,44)=0.19, p=
0.66]. Therefore, our dyslexic participants were worse at com-
bining AM information across different rates, as they were
significantly less accurate than control participants. For com-
bined AM bands, the dyslexic participants were significantly
poorer at combining the Syllable-rate AM with other AMs at
the Stress rate or the Sub-beat rate.
RHYTHM PATTERN DISCRIMINATION
Next, we wanted to ascertain whether participants were able to
use these speech AMs to discriminate between the two major
FIGURE 3 | Comparison of the 5 types of AM tone-vocoded stimuli
produced for trochaic (S-w) and iambic (w-S) nursery rhyme sentences.
Stimuli corresponding to the trochaic sentence “Mary Mary” are shown in
the left column. Stimuli corresponding to the iambic sentence “the Queen of
Hearts” are shown in the right column. Top row: Original acoustic waveform
of each sentence in black, with whole-band amplitude envelope overlaid in
red. Rows (A–E) Stress AM, Syllable AM, Sub-beat AM, Stress +Syllable AM
and Syllable +Sub-beat AM stimuli respectively.
RPs that characterized the 4 nursery rhyme sentences [i.e.,
trochaic (“S-w”) vs. iambic (“w-S”)]. Accordingly, we re-scored
participants’ responses according to whether they had correctly
identified the RP of each sentence as trochaic or iambic,
disregarding whether they had identified the actual sentence cor-
rectly (i.e., for the stimulus sentence “Mary Mary,” responses of
“Mary Mary” and “Simple Simon” were both scored as the cor-
rect RP, as both were trochaic responses). The resulting mean
RP scores for iambic sentences (Ives, Queen) and trochaic sentences
(Mary, Simon) are shown in Figure 5. To check for floor
effects in performance (which could obscure group differences),
we assessed whether participants’ scores for each AM type were
significantly above the level of chance (50%). Accordingly, sepa-
rate one-sample t-tests were conducted for control and dyslexic
groups against the test value of 0.5. As this necessitated 20 t-
tests in total, Holm’s sequential Bonferroni correction was applied
to the p-value threshold for significance (Holm, 1979). For the
smallest (rank 1) p-value, the Holm Bonferroni-corrected threshold
for significance was 0.05/(20 − 1 + 1) = 0.0025, whereas for
the largest (rank 20) p-value, the threshold for significance was
0.05/(20 − 20 + 1) = 0.05.
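The re-scoring rule can be made concrete with a short sketch. The sentence labels and function names are illustrative assumptions; only the trochaic/iambic grouping is taken from the text.

```python
RHYTHM_PATTERN = {
    "Mary": "trochaic",
    "Simon": "trochaic",
    "Ives": "iambic",
    "Queen": "iambic",
}


def score_trial(stimulus, response):
    """Return (Accuracy, RP) scores for one trial as booleans."""
    accuracy = stimulus == response
    rp = RHYTHM_PATTERN[stimulus] == RHYTHM_PATTERN[response]
    return accuracy, rp


def mean_scores(trials):
    """Mean Accuracy and RP scores over (stimulus, response) pairs."""
    scored = [score_trial(stimulus, response) for stimulus, response in trials]
    n = len(scored)
    return sum(a for a, _ in scored) / n, sum(r for _, r in scored) / n


# A "Simon" response to the "Mary" stimulus is wrong for Accuracy but right for RP.
example = [("Mary", "Mary"), ("Mary", "Simon"), ("Mary", "Queen"), ("Ives", "Queen")]
mean_accuracy, mean_rp = mean_scores(example)
```

Under this scheme chance is 25% for Accuracy (four response options) but 50% for RP (two rhythm categories), which is why the RP scores are tested against 0.5.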
As shown in Figure 5 (∗), controls and dyslexics always
performed significantly above chance when making a binary
discrimination of the rhythm of trochaic (T) sentences (with the
exception of controls in the Sub-beat AM condition). By con-
trast, for iambic (I) sentences, dyslexics never performed above
chance in binary rhythm discrimination, whereas controls per-
formed significantly above chance when listening to Stress-only,
and Stress +Syllable AM types. Given the presence of clear floor
effects for binary rhythm discrimination of iambic sentences, we
were unfortunately unable to draw further conclusions regard-
ing group differences for these sentence types (as both controls
and dyslexics were performing at chance in many conditions).
However, both groups had performed significantly above chance
for trochaic sentences when listening to Stress only AMs, Syllable
only AMs, Stress +Syllable AMs and Syllable +Sub-beat AMs.
Accordingly, we performed repeated measures ANCOVAs on these
RP scores for trochaic sentences only.
In the first ANCOVA analysis, we compared group performance
for the 2 single AM bands only, taking single AM band
(2 levels) as the within-subjects factor, Group (2 levels) as the
between subjects factor, and Age as the covariate. Consistent with
the previous Accuracy analysis, there was no significant main
effect of Group [F(1,44)=0.16, p=0.69], and no interaction
between single AM band and Group [F(1,44)=0.11, p=0.75].
This suggests that controls and dyslexics did not differ in their
ability to use Stress only and Syllable only AM band information
to make trochaic-iambic distinctions. We then analyzed double-
AM band performance in a similar fashion. This time double-AM
band (2 levels, Stress +Syllable, Syllable +Sub-beat) was the
within-subjects factor, Group (2 levels) was the between subjects
factor, and Age was the covariate. Unlike the Accuracy analy-
sis, the ANCOVA showed no significant main effect of Group
[F(1,44)=1.90, p=0.17]. There was also no interaction between
FIGURE 4 | Group mean Accuracy scores for each AM band and band
combination. Error bars indicate standard error.
double-AM band and Group [F(1,44)=0.17, p=0.68]. Hence
dyslexic participants appeared to recognize trochaic RPs based on
pairs of AMs as well as controls.
These results should be interpreted with caution, however.
Firstly, only performance for trochaic sentences could be analyzed
meaningfully (meaning that half the total dataset could not be
analyzed). Secondly, the RP scores computed here reflect partici-
pants’ rhythm discrimination indirectly rather than directly. The
RP scores measure the perceptual confusability of sentences (i.e.,
how participants make guesses when they are unsure of the cor-
rect sentence identity). Perceptual confusability will depend in
large part on the global RPs of the stimuli, but will also include
other factors like total duration and perceptual grouping effects,
as well as participants’ own cognitive strategies. Nevertheless, the
data show that perceptual confusability was maximal for trochaic
sentences, for both groups.
CORRELATIONS BETWEEN AM PERCEPTION, PHONOLOGY, AND
LITERACY
By hypothesis, a perceptual deficit in using AM patterns to
discriminate rhythmic sentences should be related to both
phonological awareness and reading skills in our partici-
pants. Accordingly, we investigated the relationship between
participants’ sentence identification Accuracy for each AM band
or combination, and their performance on memory, reading
and phonological tasks. Table 3 shows the partial correlation
matrix between accuracy of performance in the rhythm percep-
tion task (by AM type) and participants’ memory, reading, and
FIGURE 5 | Group mean Rhythm Pattern scores for each AM band and band combination, shown separately for iambic (“I”: Ives & Queen) and
trochaic (“T”: Mary & Simon) sentences. Error bars indicate standard error. (∗) AM bands where performance was above chance (50%) for each group.
Table 3 | Pearson’s r partial correlation values between accuracy of performance in rhythm perception (by AM type), and general ability,
literacy and phonology measures. Partial correlations control for Age and IQ.
                AM Combination
Measure/Group   Stress only   Syllable only   Sub-beat only   Stress + Syllable   Syllable + Sub-beat
AUDITORY STM
  All            0.05          0.11            0.13            0.35*               0.17
  Con           −0.07          0.06            0.09           −0.09               −0.34
  Dys            0.07          0.31            0.26            0.52*               0.55*
READING
  All           −0.14          0.02            0.21            0.13                0.21
  Con           −0.07          0.13            0.38$          −0.21                0.09
  Dys           −0.32          0.03            0.14            0.17                0.18
SPELLING
  All           −0.17          0.12            0.25^           0.17                0.32*
  Con           −0.15          0.12            0.28           −0.09                0.11
  Dys           −0.38         −0.06            0.48*           0.05                0.27
PHONOLOGY
  All            0.13          0.30*           0.18            0.40**              0.11
  Con            0.04          0.27            0.22           −0.12               −0.16
  Dys            0.17          0.42&           0.21            0.52*               0.07
For each measure, correlations over both groups are shown in the first row (All), correlations for controls only in the second row (Con), and
correlations for dyslexics only in the third row (Dys). Age and IQ are controlled in all the correlations.
*p < 0.05; **p < 0.01; $p = 0.07; &p = 0.074; ^p = 0.096.
phonological ability, with age and IQ controlled. Correlations
were performed with both groups combined, as well as separately.
As shown in Table 3, there were several significant relationships
between AM performance, literacy and phonology. Taking the
group as a whole, the conceptually important Stress +Syllable
speech AMs were significantly related to phonological awareness
(r=0.40, p<0.01), as well as to auditory short-term memory
(digit span, r=0.35, p<0.05). Performance with the Syllable +
Sub-beat level was also significantly associated with spelling per-
formance, which was not predicted (r=0.32, p<0.05). When
considering the dyslexic group alone, the table shows that dyslex-
ics’ phonological awareness was significantly related to their sensi-
tivity to Stress +Syllable speech AMs (r=0.52, p<0.01), while
the relationship between Syllable AM performance and phono-
logical awareness approached significance (r=0.42, p=0.074).
Further, spelling skills were significantly related to Sub-beat AM
sensitivity (r=0.48, p<0.05). Dyslexics also showed a signif-
icant relationship between their auditory short-term memory
skills and their performance in the two combined AM condi-
tions (r=0.52, p<0.05 for Stress +Syllable; r=0.55, p<0.05
for Syllable +Sub-beat). This may indicate that dyslexics’ abil-
ity to use multiple patterns of temporal information to recognize
speech rhythm in our experimental paradigm was constrained by
their lower short-term memory capacity in comparison to con-
trols. When considered as a group, controls showed no significant
relationships between performance in the AM RP recognition
task, phonology and reading, although there was a trend toward
a correlation between Sub-beat AM sensitivity and spelling (r=
0.38, p=0.07). Overall, therefore, the partial correlations show
that the perceptual deficit in using AM patterns to detect speech
rhythm was related to phonological awareness for the dyslexic
participants only.
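For readers unfamiliar with partial correlations, the following sketch shows one standard way to compute a Pearson correlation between two measures while controlling for covariates such as age and IQ. It uses the textbook recursive formula and is not necessarily the procedure used by the statistics package in this study. The function names and the toy data are assumptions.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation of x and y with the covariates partialled out (recursive formula)."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[-1], covariates[:-1]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y share a component e once age and IQ are removed.
age = [1, 2, 3, 4, 5, 6, 7, 8]
iq = [2, 1, 4, 3, 6, 5, 8, 7]
e = [1, -1, 0, 2, -2, 0, 1, -1]
x = [a + ei for a, ei in zip(age, e)]
y = [q + 2 * ei for q, ei in zip(iq, e)]
r_partial = partial_corr(x, y, [age, iq])
```

In this constructed example the parts of `x` and `y` left over after removing age and IQ are proportional, so the partial correlation is 1 even though the raw correlation is lower.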
DISCUSSION AND CONCLUSION
Here, we tested the hypothesis that perceptual difficulties in pro-
cessing the AM patterns in speech that yield speech rhythm
are associated with the development of impaired phonological
representations for words by dyslexic individuals. The devel-
opment of impaired phonological representations of speech is
the cognitive hallmark of dyslexia across languages (Snowling,
2000; Ziegler and Goswami, 2005; Goswami, 2011). We tested
the sensitivity of adults with dyslexia to AM patterning yield-
ing speech rhythm for several different AM bands and band
combinations below 20 Hz that are present within the ampli-
tude envelope of speech. We found that dyslexic participants
performed significantly more poorly than control adults when
they were required to combine Syllable-rate AMs with AMs at
other rates (Stress + Syllable or Syllable + Sub-beat). However,
the dyslexic participants performed on par with controls when
asked to utilize the temporal information at a single AM rate
only (Stress only, Syllable only, or Sub-beat only). Accordingly, we
conclude that dyslexics’ difficulties with AM perception appear
to occur across more than one speech timescale (particularly
involving the Syllable rate). Moreover, as predicted by the temporal
sampling framework, a perceptual deficit in utilizing AM
patterns in speech is related to phonological development in
dyslexia.
A deficit in Syllable-rate combination or synchronization with
other rates would support the findings of Leong and Goswami
(2014), in which the same group of adult dyslexics tested here
showed differences in their phase of rhythmic entrainment at the
Syllable rate in a rhythmic tapping task to nursery rhyme targets.
A difference in Syllable phase of entrainment suggests that dyslex-
ics have temporal differences in their processing of Syllable-rate
information (e.g., they may perceive P-centers as occurring earlier
in a speech sound as compared to controls). Here, participants
with dyslexia were significantly poorer at recognizing the target
nursery rhymes when they had to combine Syllable AM cues with
prosodic stress AM cues (Stress +Syllable).
In fact, a circular-linear correlation analysis of the two datasets
(Leong and Goswami, 2014 and the current study) revealed that
there was a strong correlation between participants’ Syllable AM
phase of tapping in the entrainment task based on rhythmic tap-
ping, and their sensitivity to Stress +Syllable AMs in the current
task (r=0.55, p<0.01). An earlier Syllable AM phase of rhyth-
mic tapping in Leong and Goswami (2014) was associated with
poorer perception of Stress+Syllable AMs in the current study.
No other AM band in the current study yielded significant corre-
lations with tapping phase in the prior study. Others have argued
that the perception and production of rhythm both rely on sim-
ilar cognitive and neural mechanisms, such as the entrainment
of neuronal oscillatory activity (Martin, 1972; Liberman and
Mattingly, 1985; Kotz and Schwartze, 2010). In the current context,
it is noteworthy that the common locus of dyslexic deficit
across perception and production tasks involved the Syllable-rate
of temporal processing.
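A circular-linear correlation relates an angular variable (here, tapping phase) to a linear one (here, a perception score). The article does not state which estimator was used; the sketch below assumes the common formulation based on correlating the linear variable with the cosine and sine of the phase. The function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def circ_lin_corr(phases, values):
    """Circular-linear correlation between phases (radians) and linear values."""
    c = [math.cos(p) for p in phases]
    s = [math.sin(p) for p in phases]
    r_xc = pearson(values, c)
    r_xs = pearson(values, s)
    r_cs = pearson(c, s)
    r_squared = (r_xc ** 2 + r_xs ** 2 - 2 * r_xc * r_xs * r_cs) / (1 - r_cs ** 2)
    return math.sqrt(max(r_squared, 0.0))


# Toy data: scores that depend perfectly on the cosine of the phase.
phases = [0.1, 0.7, 1.3, 2.2, 3.0, 4.1, 5.5]
scores = [1 + 3 * math.cos(p) for p in phases]
r_perfect = circ_lin_corr(phases, scores)
```

This statistic ranges from 0 to 1 and carries no sign, so the direction of an association (for example, earlier phase going with poorer perception) has to be read from the data separately.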
Utilizing younger participants, Power et al. (2013) have shown
in a rhythmic speech processing task that children with dyslexia
also have a different preferred phase of entrainment in the delta
band (2 Hz), both in response to auditory speech alone, and when
speech information is audio-visual. The ‘temporal misalignment’
of both stress- and syllable-rate information in dyslexia found
by Power et al. (2013) and the current study could explain
why individuals with dyslexia develop phonological representa-
tions for words that are impaired (or specified differently) in
comparison to those of unaffected individuals. If temporal pro-
cessing of slower-rate information in speech is impaired, for
example because oscillatory phase alignment is inaccurate, then
this would affect the development of the entire mental lexicon
of word forms, not simply of syllable-level and prosodic information.
If syllable stress representation and syllabic parsing are
different in dyslexia because of a perceptual deficit in utilizing
AM patterns in speech, this would also affect phonetic-level infor-
mation. Phonemes are perceived more accurately when they are
in stressed syllables (Mehta and Cutler, 1988). Over the course
of development, if dyslexic children consistently fail to capture
rich, high-dimensional representations of the temporal patterns
that occur on multiple timescales in speech (e.g., concurrently
encoding Stress patterns, Syllable patterns and Phoneme patterns
into an integrated representation of a word), this would yield the
impoverished or atypical phonological representations that are
developed by children with dyslexia across languages.
At first glance, our data appear to be inconsistent with the
results of previous AM perception studies as summarized in the
Introduction. These non-speech studies generally indicated that
individuals with dyslexia had poorer AM perception at the 4 Hz
rate (Syllable AM). Here, we find no differences in performance
between controls and dyslexics when making rhythm judgments
on the basis of the Syllable AM (4 Hz) only. However, it should
be noted that the dependent variable being assessed in the cur-
rent study is different from that of psychophysical AM studies.
Whereas AM studies assess modulation detection thresholds based
on just noticeable differences in modulation depth or rate (e.g.,
Lorenzi et al., 2000; Rocheron et al., 2002), here we assess nursery
rhyme recognition using real-life speech AMs that contain strong
(and likely supra-threshold) modulation patterns. As such, it is
not surprising that no group differences were observed for our
single AM rate stimuli. It is possible that significant group differ-
ences could have been observed at single AM rates if we had used
sentences with weaker modulation patterns, such as whispered or
mumbled speech. However, we did observe a significant difference
in dyslexics’ ability to combine or integrate speech modulation
patterns across the Stress and Syllable rates, which is consistent
with dyslexics’ poorer speech perception performance in vocoder
studies (e.g., Lorenzi et al., 2000; Johnson et al., 2011; Nittrouer
and Lowenstein, 2013). This difference cannot be attributed to a
general lack of attention or engagement by dyslexic participants,
since they performed as well as controls with the single AM band
stimuli. Rather, dyslexics appear to have a particular difficulty in
making use of modulation information that is patterned at more
than one timescale, here when Syllable-rate information has to
be temporally synchronized with Stress-rate speech information
or Sub-beat information. However, as we did not include paired
AM combinations that did not involve the Syllable AM rate (e.g.,
Stress +Phoneme), we are not able to determine whether this dif-
ficulty is specific to Syllable AM combinations only, or whether it
would also occur for other combinations of speech AMs.
It should also be observed that our participants found the
rhythm judgment task very difficult. This high level of diffi-
culty stemmed from the fact that the sentences were (deliberately)
unintelligible, forcing our participants to rely solely on the acous-
tic modulations in the stimuli to perform rhythm judgments,
without recourse to lexical factors. Consequently, accuracy scores
for both controls and dyslexics (although significantly above
chance) were relatively low (below 50%). In future studies, the
issue of task difficulty may be ameliorated by using a tone-
vocoder with more than 1 spectral channel (i.e., 3 or 4 channels),
which would have the effect of increasing speech intelligibility.
However, increasing the intelligibility of the stimuli would also
introduce a new confound: participants would now be able to use
their lexical knowledge to augment their perceptual judgments
of speech rhythm. Nonetheless, this trade-off might produce
stronger effects. Lexical “boot-strapping” effects could be reduced
by using semantically unpredictable sentences (following Johnson
et al., 2011).
According to the temporal sampling framework (Goswami,
2011), the combination impairment for Stress +Syllable rate
AMs found here should affect speech perception even when lis-
tening to clear (i.e., fully intelligible) speech, which has strong
modulation patterns that are above the threshold for detection.
Interestingly, this was exactly what Lorenzi et al. (2000) found
in their study. They reported that dyslexic children performed
significantly more poorly than adults and control children even
when listening to clear, unprocessed (not-vocoded) VCV syllables
(these syllables will contain significant Syllable-rate modulation,
but not Stress-rate modulation). This controversial result might
possibly be explained by other factors like memory or attention;
nonetheless, data like these suggest that speech AM perception in
dyslexia clearly requires more investigation. Current data suggest
that individuals with dyslexia are less sensitive to small changes in
modulation depth and rate, particularly around the syllable and
stress rates in speech. Future studies should explore how dyslexics’
difficulties with processing slow modulations affect their ability
to integrate and synchronize slow-varying stress and syllable
information with more quickly-varying phoneme-rate informa-
tion in speech. These perceptual difficulties could be one source
of the impaired or atypical phonological representations stored in
the mental lexicon of word forms by dyslexic individuals.
Finally, we note that, given recent proposals by Poeppel and
colleagues regarding neural oscillatory phase-locking to speech
modulation patterns (e.g., Ghitza, 2011; Giraud and Poeppel,
2012), the perceptual difficulties that we observe here could be
underpinned by impaired phase alignment and cross-frequency
phase synchronization between different neuronal oscillatory
rates. For example, dyslexics could have poorer neuronal oscilla-
tory synchronization between theta oscillations (syllable rate) and
delta (stress rate) or gamma (phoneme rate) oscillations in the
cortex. Similarly, the neural interplay between theta (syllable rate)
and alpha (8–13 Hz, similar to the sub-beat rate here) oscillations
during speech comprehension might be atypical in dyslexia as
well (Obleser and Weisz, 2012). To date, such cross-frequency neu-
ral synchronization has not been studied in dyslexia (although see
Leong and Goswami, 2014, for an assessment of cross-frequency
AM synchronization in dyslexics’ speech). Such studies could be
very informative in the quest to identify cross-linguistic percep-
tual and neural deficits underpinning cognitive markers such as
impaired phonology in developmental dyslexia.
ACKNOWLEDGMENTS
This research was funded by a Harold Hyam Wingate Research
Scholarship to Victoria Leong and by a Medical Research Council
grant G0902375 to Usha Goswami.
SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found
online at: http://www.frontiersin.org/journal/10.3389/fnhum.2014.00096/abstract
Audio 1 | Stress-only AM (MFB, Trochaic).
Audio 2 | Syllable-only AM (MFB, Trochaic).
Audio 3 | Sub-beat only AM (MFB, Trochaic).
Audio 4 | Stress +Syllable AM (MFB, Trochaic).
Audio 5 | Syllable +Sub-beat AM (MFB, Trochaic).
REFERENCES
Allen, G. (1972). The location of rhythmic stress beats in English: an experimental
study. Lang. Speech 15, 72–100.
Amitay, S., Ahissar, M., and Nelken, I. (2002). Auditory processing deficits
in reading disabled adults. J. Assoc. Res. Otolaryngol. 3, 302–320. doi:
10.1007/s101620010093
Bradley, L., and Bryant, P. E. (1983). Categorising sounds and learning to read - a
causal connection. Nature 301, 419–421. doi: 10.1038/301419a0
Cummins, F. (2003). Practice and performance in speech produced synchronously.
J. Phon. 31, 139–148. doi: 10.1016/S0095-4470(02)00082-7
Cummins, F., and Port, R. (1998). Rhythmic constraints on stress timing in English.
J. Phon. 26, 145–171. doi: 10.1006/jpho.1998.0070
Dauer, R. (1983). Stress-timing and syllable timing revisited. J. Phon. 11, 51–62.
de Bree, E., Wijnen, F., and Zonneveld, W. (2006). Word stress production in
three-year-old children at risk of dyslexia. J. Res. Read. 29, 304–317. doi:
10.1111/j.1467-9817.2006.00310.x
Drullman, R. (2006). “The significance of temporal modulation frequencies for
speech intelligibility,” in Listening to Speech: An Auditory Perspective, eds S.
Greenberg and W. A. Ainsworth (Mahwah, NJ: Lawrence Erlbaum Associates),
39–47.
Drullman, R., Festen, J. M., and Plomp, R. (1994a). Effect of temporal enve-
lope smearing on speech reception. J. Acoust. Soc. Am. 95, 1053–1064. doi:
10.1121/1.408467
Drullman, R., Festen, J. M., and Plomp, R. (1994b). Effect of reducing slow tem-
poral modulations on speech reception. J. Acoust. Soc. Am. 95, 2670–2680. doi:
10.1121/1.409836
Echols, C. H., Crowhurst, M. J., and Childers, J. B. (1997). The perception of rhyth-
mic units in speech by infants and adults. J. Mem. Lang. 36, 202–225. doi:
10.1006/jmla.1996.2483
Fredrickson, N., Frith, U., and Reason, R. (1997). Phonological Assessment Battery
(Standardised Edn.). Windsor: NFER-Nelson.
Friederici, A., Friedrich, M., and Christophe, A. (2007). Brain responses in 4-
month-old infants are already language-specific. Curr. Biol. 17, 1208–1211. doi:
10.1016/j.cub.2007.06.011
Ghitza, O. (2011). Linking speech perception and neurophysiology: speech decod-
ing guided by cascaded oscillators locked to the input rhythm. Front. Psychol.
2:130. doi: 10.3389/fpsyg.2011.00130
Ghitza, O., and Greenberg, S. (2009). On the possible role of brain rhythms
in speech perception: Intelligibility of time compressed speech with periodic
and aperiodic insertions of silence. Phonetica 66, 113–126. doi: 10.1159/000208934
Giraud, A. L., and Poeppel, D. (2012). Cortical oscillations and speech processing:
emerging computational principles and operations. Nat. Neurosci. 15, 511–517.
doi: 10.1038/nn.3063
Gordon, J. W. (1987). The perceptual attack time of musical tones. J. Acoust. Soc.
Am. 82, 88–105. doi: 10.1121/1.395441
Goswami, U. (2008). The development of reading across languages. Ann. N.Y. Acad.
Sci. 1145, 1–12. doi: 10.1196/annals.1416.018
Goswami, U. (2011). A temporal sampling framework for developmental dyslexia.
Trends Cogn. Sci. 15, 3–10. doi: 10.1016/j.tics.2010.10.001
Goswami, U., Fosker, T., Huss, M., Mead, N., and Szücs, D. (2011b). Rise time and
formant transition duration in the discrimination of speech sounds: the ba-wa
distinction in developmental dyslexia. Dev. Sci. 14, 34–43. doi: 10.1111/j.1467-
7687.2010.00955.x
Goswami, U., Gerson, D., and Astruc, L. (2010). Amplitude envelope perception,
phonology and prosodic sensitivity in children with developmental dyslexia.
Read. Writ. 23, 995–1019. doi: 10.1007/s11145-009-9186-6
Goswami, U., and Leong, V. (2013). Speech rhythm and temporal structure:
converging perspectives? Lab. Phonol. 4, 67–92. doi: 10.1515/lp-2013-0004
Goswami, U., Thomson, J., Richardson, U., Stainthorp, R., Hughes, D., Rosen,
S., et al. (2002). Amplitude envelope onsets and developmental dyslexia:
a new hypothesis. Proc. Natl. Acad. Sci. U.S.A. 99, 10911–10916. doi:
10.1073/pnas.122368599
Goswami, U., Wang, H.-L., Cruz, A., Fosker, T., Mead, N., and Huss,
M. (2011a). Language-universal sensory deficits in developmental dyslexia:
English, Spanish, and Chinese. J. Cogn. Neurosci. 23, 325–337. doi:
10.1162/jocn.2010.21453
Greenberg, S. (2006). “A multi-band framework for understanding spoken lan-
guage,” in Understanding Speech: An Auditory Perspective, eds S. Greenberg and
W. Ainsworth (Mahwah, NJ: LEA), 411–434.
Greenberg, S., Carvey, H., Hitchcock, L., and Chang, S. (2003). Temporal properties
of spontaneous speech - a syllable-centric perspective. J. Phon. 31, 465–485. doi:
10.1016/j.wocn.2003.09.005
Gueron, J. (1974). The meter of nursery rhymes: an application of the Halle-Keyser
theory of meter. Poetics 12, 73–111. doi: 10.1016/0304-422X(74)90006-0
Hämäläinen, J. A., Leppänen, P. H. T., Eklund, K., Thomson, J., Richardson, U.,
Guttorm, T. K., et al. (2009). Common variance in amplitude envelope percep-
tion tasks and their impact on phoneme duration perception and reading and
spelling in Finnish children with reading disabilities. Appl. Psycholinguist. 30,
511–530. doi: 10.1017/S0142716409090250
Hämäläinen, J. A., Rupp, A., Soltész, F., Szücs, D., and Goswami, U. (2012). Reduced
phase locking to slow amplitude modulation in adults with dyslexia: an MEG
study. Neuroimage 59, 2952–2961. doi: 10.1016/j.neuroimage.2011.09.075
Hämäläinen, J., Leppänen, P., Torppa, M., Müller, K., and Lyytinen, H. (2005).
Detection of sound rise time by adults with dyslexia. Brain Lang. 94, 32–42.
doi: 10.1016/j.bandl.2004.11.005
Hari, R., Sääskilahti, A., Helenius, P., and Uutela, K. (1999). Non-impaired
auditory phase locking in dyslexic adults. Neuroreport 10, 2347–2348. doi:
10.1097/00001756-199908020-00023
Hirst, D. J. (2006). “Prosodic aspects of speech and language,” in Encyclopedia
of Language and Linguistics, 2nd Edn., ed K. Brown (Oxford: Elsevier),
539–546.
Holliman, A. J., Wood, C., and Sheehy, K. (2010). The contribution of sensitivity
to speech rhythm and non-speech rhythm to early reading development. Educ.
Psychol. 30, 247–267. doi: 10.1080/01443410903560922
Holliman, A. J., Wood, C., and Sheehy, K. (2012). A cross-sectional study of
prosodic sensitivity and reading difficulties. J. Res. Read. 35, 32–48. doi:
10.1111/j.1467-9817.2010.01459.x
Holm, S. (1979). A simple sequential rejective multiple test procedure. Scand. J.
Stat. 6, 65–70.
Howell, P. (1984). “An acoustic determinant of perceived and produced
anisochrony,” in Proceedings of the Tenth International Congress of Phonetic
Sciences, eds M. P. R. Van den Broecke and A. Cohen (Dordrecht: Foris).
Howell, P. (1988a). Prediction of P-center location from the distribution of
energy in the amplitude envelope: I. Percept. Psychophys. 43, 90–93. doi:
10.3758/BF03208978
Howell, P. (1988b). Prediction of P-center location from the distribution of
energy in the amplitude envelope: II. Percept. Psychophys. 43, 99. doi:
10.3758/BF03208980
Johnson, E. P., Pennington, B. F., Lowenstein, J. H., and Nittrouer, S. (2011).
Sensitivity to structure in the speech signal by children with speech sound
disorder and reading disability. J. Commun. Disord. 44, 294–314. doi:
10.1016/j.jcomdis.2011.01.001
Jusczyk, P. W., Cutler, A., and Redanz, N. (1993). Preference for the predominant
stress patterns of English words. Child Dev. 64, 675–687. doi: 10.2307/1131210
Jusczyk, P. W., Houston, D., and Newsome, M. (1999). The beginnings of word
segmentation in English-learning infants. Cogn. Psychol. 39, 159–207. doi:
10.1006/cogp.1999.0716
Kitzen, K. R. (2001). Prosodic Sensitivity, Morphological Ability and Reading
Ability in Young Adults With and Without Childhood Histories of Reading
Difficulty. Doctoral dissertation, University of Columbia. Dissertation Abstracts
International, 62, 0460A.
Kotz, S. A., and Schwartze, M. (2010). Cortical speech processing unplugged:
a timely subcortico-cortical framework. Trends Cogn. Sci. 14, 392–399. doi:
10.1016/j.tics.2010.06.005
Lakatos, P., Shah, A. S., Knuth, K. H., Ulbert, I., Karmos, G., and Schroeder, C.
E. (2005). An oscillatory hierarchy controlling neuronal excitability and stim-
ulus processing in the auditory cortex. J. Neurophysiol. 94, 1904–1911. doi:
10.1152/jn.00263.2005
Lehongre, K., Morillon, B., Giraud, A. L., and Ramus, F. (2013). Impaired auditory
sampling in dyslexia: further evidence from combined fMRI and EEG. Front.
Hum. Neurosci. 7:454. doi: 10.3389/fnhum.2013.00454
Lehongre, K., Ramus, F., Villiermet, N., Schwartz, D., and Giraud, A. L. (2011).
Altered low-gamma sampling in auditory cortex accounts for the three main
facets of dyslexia. Neuron 72, 1080–1090. doi: 10.1016/j.neuron.2011.11.002
Leong, V. (2012). Prosodic Rhythm in the Speech Amplitude Envelope: Amplitude
Modulation Phase Hierarchies (AMPHs) and AMPH Models. Doctoral disserta-
tion, University of Cambridge.
Leong, V., and Goswami, U. (2014). Assessment of rhythmic entrainment at multi-
ple timescales in dyslexia: evidence for disruption to syllable timing. Hear. Res.
308, 141–161. doi: 10.1016/j.heares.2013.07.015
Leong, V., Hamalainen, J., Soltesz, F., and Goswami, U. (2011). Rise time perception
and detection of syllable stress in adults with developmental dyslexia. J. Mem.
Lang. 64, 59–73. doi: 10.1016/j.jml.2010.09.003
Liberman, A. M., and Mattingly, I. G. (1985). The motor theory of
speech perception revised. Cognition 21, 1–36. doi: 10.1016/0010-0277(85)90021-6
Lorenzi, C., Dumont, A., and Füllgrabe, C. (2000). Use of temporal envelope
cues by children with developmental dyslexia. J. Speech Lang. Hear. Res. 43,
1367–1379. doi: 10.1044/jslhr.4306.1367
Martin, J. G. (1972). Rhythmic (hierarchical) versus serial structuring in speech
and other behavior. Psychol. Rev. 79, 487–509. doi: 10.1037/h0033467
McAnally, K. I., and Stein, J. F. (1997). Scalp potentials evoked by amplitude-
modulated tones in dyslexia. J. Speech Lang. Hear. Res. 40, 939–945.
Mehta, G., and Cutler, A. (1988). Detection of target phonemes in spontaneous and
read speech. Lang. Speech 31, 135–156.
Menell, P., McAnally, K. I., and Stein, J. F. (1999). Psychophysical sensitivity and
physiological response to amplitude modulation in adult dyslexic listeners. J.
Speech Lang. Hear. Res. 42, 797–803.
Morton, J., Marcus, S. M., and Frankish, C. (1976). Perceptual centres (P-centres).
Psychol. Rev. 83, 405–408. doi: 10.1037/0033-295X.83.5.405
Mundy, I. R., and Carroll, J. M. (2012). Speech prosody and developmental
dyslexia: reduced phonological awareness in the context of intact phonological
representations. J. Cogn. Psychol. 24, 560–581. doi: 10.1080/20445911.2012.662341
Nittrouer, S., and Lowenstein, J. H. (2013). Perceptual organization of speech sig-
nals by children with and without dyslexia. Res. Dev. Disabil. 34, 2304–2325.
doi: 10.1016/j.ridd.2013.04.018
Obleser, J., Herrmann, J., and Henry, M. J. (2012). Neural oscillations in
speech: don’t be enslaved by the envelope. Front. Hum. Neurosci. 6:250. doi:
10.3389/fnhum.2012.00250
Obleser, J., and Weisz, N. (2012). Suppressed alpha oscillations predict intelli-
gibility of speech and its acoustic details. Cereb. Cortex 22, 2466–2477. doi:
10.1093/cercor/bhr325
Perneger, T. V. (1998). What’s wrong with Bonferroni adjustments. Br. Med. J. 316,
1236–1238. doi: 10.1136/bmj.316.7139.1236
Plomp, R. (1983). “Perception of speech as a modulated signal,” in Proceedings of the
10th International Congress of Phonetic Sciences. (Utrecht), 29–40.
Poelmans, H., Luts, H., Vandermosten, M., Boets, B., Ghesquière, P., and
Wouters, J. (2011). Reduced sensitivity to slow-rate dynamic auditory infor-
mation in children with dyslexia. Res. Dev. Disabil. 32, 2810–2819. doi:
10.1016/j.ridd.2011.05.025
Poelmans, H., Luts, H., Vandermosten, M., Boets, B., Ghesquière, P., and
Wouters, J. (2012). Auditory steady state cortical responses indicate deviant
phonemic-rate processing in adults with dyslexia. Ear Hear. 33, 134–143. doi:
10.1097/AUD.0b013e31822c26b9
Poeppel, D. (2003). The analysis of speech in different temporal integration win-
dows: cerebral lateralization as ‘asymmetric sampling in time’. Speech Commun.
41, 245–255. doi: 10.1016/S0167-6393(02)00107-3
Power, A. J., Mead, N., Barnes, L., and Goswami, U. (2012). Neural entrainment
to rhythmically-presented auditory, visual and audio-visual speech in children.
Front. Psychol. 3:216. doi: 10.3389/fpsyg.2012.00216
Power, A. J., Mead, N., Barnes, L., and Goswami, U. (2013). Neural entrainment to
rhythmic speech in children with developmental dyslexia. Front. Hum. Neurosci.
7:777. doi: 10.3389/fnhum.2013.00777
Qin, M. K., and Oxenham, A. J. (2003). Effects of simulated cochlear implant pro-
cessing on speech reception in fluctuating maskers. J. Acoust. Soc. Am. 114,
446–454. doi: 10.1121/1.1579009
Ragó, A., Honbolygó, F., Róna, Z., Beke, A., and Csépe, V. (2014). Effect
of maturation on suprasegmental speech processing in full- and preterm
infants: a mismatch negativity study. Res. Dev. Disabil. 35, 192–202. doi:
10.1016/j.ridd.2013.10.006
Rice, W. R. (1989). Analyzing tables of statistical tests. Evolution 43, 223–225. doi:
10.2307/2409177
Richardson, U., Thomson, J., Scott, S. K., and Goswami, U. (2004). Auditory pro-
cessing skills and phonological representation in dyslexic children. Dyslexia 10,
215–233. doi: 10.1002/dys.276
Rocheron, I., Lorenzi, C., Füllgrabe, C., and Dumont, A. (2002). Temporal
envelope perception in dyslexic children. Neuroreport 13, 1683–1687. doi:
10.1097/00001756-200209160-00023
Rosen, S. (1992). Temporal information in speech: acoustic, auditory and lin-
guistic aspects. Philos. Trans. R. Soc. Lond. B Biol. Sci. 336, 367–373. doi:
10.1098/rstb.1992.0070
Schroeder, C. E., and Lakatos, P. (2009). Low-frequency neuronal oscilla-
tions as instruments of sensory selection. Trends Neurosci. 32, 9–18. doi:
10.1016/j.tins.2008.09.012
Scott, S. K. (1993). P-Centres in Speech: An Acoustic Analysis. Unpublished Ph.D.
thesis, University College London.
Scott, S. K. (1998). The point of P-centres. Psychol. Res. 61, 4–11. doi:
10.1007/PL00008162
Shannon, R. V., Zeng, F.-G., Kamath, V., Wygonski, J., and Ekelid, M. (1995).
Speech recognition with primarily temporal cues. Science 270, 303–304. doi:
10.1126/science.270.5234.303
Snowling, M. J. (1981). Phonemic deficits in developmental dyslexia. Psychol. Res.
43, 219–234. doi: 10.1007/BF00309831
Snowling, M. J. (2000). Dyslexia, 2nd Edn. Oxford: Blackwell Publishers.
Soltész, F., Szûcs, D., Leong, V., White, S., and Goswami, U. (2013). Differential
entrainment of neuroelectric delta oscillations in developmental dyslexia. PLoS
ONE 8:e76608. doi: 10.1371/journal.pone.0076608
Stefanics, G., Fosker, T., Huss, M., Mead, N., Szücs, D., and Goswami, U. (2011).
Auditory sensory deficits in developmental dyslexia: a longitudinal ERP study.
Neuroimage 57, 723–732. doi: 10.1016/j.neuroimage.2011.04.005
Stone, M. A., and Moore, B. C. J. (2003). Effect of the speed of a single-
channel dynamic range compressor on intelligibility in a competing speech task.
J. Acoust. Soc. Am. 114, 1023–1034. doi: 10.1121/1.1592160
Stuart, G. W., McAnally, K. I., McKay, A., Johnston, M., and Castles, A. (2006). A
test of the magnocellular deficit theory of dyslexia in an adult sample. Cogn.
Neuropsychol. 23, 1215–1229. doi: 10.1080/02643290600814624
Surányi, Z., Csépe, V., Richardson, U., Thomson, J. M., Honbolygó, F., and
Goswami, U. (2009). Sensitivity to rhythmic parameters in dyslexic children:
a comparison of Hungarian and English. Read. Writ. 22, 41–56. doi:
10.1007/s11145-007-9102-x
Tallal, P., and Piercy, M. (1974). Developmental aphasia: rate of auditory processing
and selective impairment of consonant perception. Neuropsychologia 12, 83–93.
doi: 10.1016/0028-3932(74)90030-X
Thomson, J., Fryer, B., Maltby, J., and Goswami, U. (2006). Auditory and motor
rhythm awareness in adults with dyslexia. J. Res. Read. 29, 334–348. doi:
10.1111/j.1467-9817.2006.00312.x
Thomson, J. M., and Goswami, U. (2008). Rhythmic processing in children with
developmental dyslexia: auditory and motor rhythms link to reading and
spelling. J. Physiol. Paris 102, 120–129. doi: 10.1016/j.jphysparis.2008.03.007
Tierney, A., and Kraus, N. (2013). The ability to tap to a beat relates to
cognitive, linguistic, and perceptual skills. Brain Lang. 124, 225–231. doi:
10.1016/j.bandl.2012.12.014
Tilsen, S., and Arvaniti, A. (2013). Speech rhythm analysis with decomposition
of the amplitude envelope: characterizing rhythmic patterns within and across
languages. J. Acoust. Soc. Am. 134, 628–639. doi: 10.1121/1.4807565
Tilsen, S., and Johnson, K. (2008). Low-frequency Fourier analysis of speech
rhythm. J. Acoust. Soc. Am. 124, EL34–EL39. doi: 10.1121/1.2947626
Treiman, R., and Zukowski, A. (1991). “Levels of phonological awareness,” in
Phonological Processes in Literacy: a Tribute to Isabelle P. Liberman, eds S. Brady
and D. Shankweiler (Hillsdale, NJ: Erlbaum), 67–83.
Turner, R. E. (2010). Statistical Models for Natural Sounds. Doctoral dissertation,
University College London. Available online at:
http://www.gatsby.ucl.ac.uk/~turner/Publications/turner-2010.html
Turner, R. E., and Sahani, M. (2007). “Probabilistic amplitude demodulation,”
in Proceedings of the 7th International Conference on Independent Component
Analysis and Signal Separation, 544–551. doi: 10.1007/978-3-540-74494-8_68
Turner, R. E., and Sahani, M. (2011). Demodulation as probabilistic infer-
ence. IEEE Trans. Audio Speech Lang. Process. 19, 2398–2411. doi:
10.1109/TASL.2011.2135852
Villing, R. (2010). Hearing the Moment: Measures and Models of the
Perceptual Centre. Doctoral dissertation, National University of
Ireland Maynooth. Available online at:
http://eprints.nuim.ie/2284/1/Villing_2010_-_PhD_Thesis.pdf
Wechsler, D. (1981). Manual for the Wechsler Adult Intelligence Scale-Revised. New
York, NY: The Psychological Corporation.
Wechsler, D. (1999). Wechsler Abbreviated Scale of Intelligence. San Antonio, TX:
The Psychological Corporation.
Whalley, K., and Hansen, J. (2006). The role of prosodic sensitivity in chil-
dren’s reading development. J. Res. Read. 29, 288–303. doi: 10.1111/j.1467-
9817.2006.00309.x
Wilkinson, G. S. (1993). Wide Range Achievement Test 3. Wilmington, DE: Wide
Range.
Witton, C., Talcott, J. B., Hansen, P. C., Richardson, A. J., Griffiths, T. D., Rees, A.,
et al. (1998). Sensitivity to dynamic auditory and visual stimuli predicts non-
word reading ability in both dyslexic and normal readers. Curr. Biol. 8, 791–797.
doi: 10.1016/S0960-9822(98)70320-3
Wood, C., and Terrell, C. (1998). Pre-school phonological ability and
subsequent literacy development. Educ. Psychol. 18, 253–274. doi:
10.1080/0144341980180301
Xu, L., Thompson, C. S., and Pfingst, B. E. (2005). Relative contributions of spectral
and temporal cues for phoneme recognition. J. Acoust. Soc. Am. 117, 3255–3267.
doi: 10.1121/1.1886405
Ziegler, J., and Goswami, U. (2005). Reading acquisition, developmental dyslexia,
and skilled reading across languages: a psycholinguistic grain size theory.
Psychol. Bull. 131, 3–29. doi: 10.1037/0033-2909.131.1.3
Zion Golumbic, E. M., Poeppel, D., and Schroeder, C. E. (2012). Temporal context
in speech processing and attentional stream selection: A behavioral and neural
perspective. Brain Lang. 122, 151–161. doi: 10.1016/j.bandl.2011.12.010
Conflict of Interest Statement: The authors declare that the research was con-
ducted in the absence of any commercial or financial relationships that could be
construed as a potential conflict of interest.
Received: 03 December 2013; accepted: 08 February 2014; published online: 24
February 2014.
Citation: Leong V and Goswami U (2014) Impaired extraction of speech rhythm
from temporal modulation patterns in speech in developmental dyslexia. Front. Hum.
Neurosci. 8:96. doi: 10.3389/fnhum.2014.00096
This article was submitted to the journal Frontiers in Human Neuroscience.
Copyright © 2014 Leong and Goswami. This is an open-access article distributed
under the terms of the Creative Commons Attribution License (CC BY). The use, distribution
or reproduction in other forums is permitted, provided the original author(s)
or licensor are credited and that the original publication in this journal is cited, in
accordance with accepted academic practice. No use, distribution or reproduction is
permitted which does not comply with these terms.