ORIGINAL RESEARCH ARTICLE
published: 24 February 2014
doi: 10.3389/fnhum.2014.00096
Impaired extraction of speech rhythm from temporal
modulation patterns in speech in developmental dyslexia
Victoria Leong* and Usha Goswami
Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Cambridge, UK
Edited by:
Pierluigi Zoccolotti, Sapienza
University of Rome, Italy
Reviewed by:
Fumiko Hoeft, University of
California, San Francisco, USA
Roeland Hankock, University of
California, San Francisco, USA
(in collaboration with Fumiko Hoeft)
Jenny Thomson, University of
Sheffield, UK
Jarmo Hämäläinen, University of
Jyväskylä, Finland
*Correspondence:
Victoria Leong, Department of
Psychology, Centre for
Neuroscience in Education,
University of Cambridge, Downing
Street, Cambridge CB2 3EB, UK
e-mail: vvec2@cam.ac.uk
Dyslexia is associated with impaired neural representation of the sound structure of
words (phonology). The “phonological deficit” in dyslexia may arise in part from impaired
speech rhythm perception, thought to depend on neural oscillatory phase-locking to
slow amplitude modulation (AM) patterns in the speech envelope. Speech contains AM
patterns at multiple temporal rates, and these different AM rates are associated with
phonological units of different grain sizes, e.g., related to stress, syllables or phonemes.
Here, we assess the ability of adults with dyslexia to use speech AMs to identify rhythm
patterns (RPs). We study 3 important temporal rates: “Stress” (~2 Hz), “Syllable” (~4 Hz)
and “Sub-beat” (reduced syllables, ~14 Hz). 21 dyslexics and 21 controls listened to
nursery rhyme sentences that had been tone-vocoded using either single AM rates from
the speech envelope (Stress only, Syllable only, Sub-beat only) or pairs of AM rates
(Stress + Syllable, Syllable + Sub-beat). They were asked to use the acoustic rhythm
of the stimulus to identify the original nursery rhyme sentence. The data showed that
dyslexics were significantly poorer at detecting rhythm compared to controls when they
had to utilize multi-rate temporal information from pairs of AMs (Stress + Syllable or
Syllable + Sub-beat). These data suggest that dyslexia is associated with a reduced
ability to utilize AMs <20 Hz for rhythm recognition. This perceptual deficit in utilizing
AM patterns in speech could be underpinned by less efficient neuronal phase alignment
and cross-frequency neuronal oscillatory synchronization in dyslexia. Dyslexics’ perceptual
difficulties in capturing the full spectro-temporal complexity of speech over multiple
timescales could contribute to the development of impaired phonological representations
for words, the cognitive hallmark of dyslexia across languages.
Keywords: amplitude modulation, envelope, speech rhythm, dyslexia, oscillations
INTRODUCTION
SPEECH RHYTHM AND PHONOLOGICAL AWARENESS IN DYSLEXIA
Dyslexia is characterized across languages by difficulties in
phonological processing (e.g., Snowling, 2000; Ziegler and
Goswami, 2005). Phonological processing encompasses the
encoding and representation of speech at a range of grain sizes,
both segmental (i.e., phoneme) and supra-segmental (e.g., rime,
syllable and stress). As simple decoding (word reading) requires
the acquisition of phonology-orthography correspondences at
different grain sizes (segmental for alphabetic languages, syllabic
for some character-based scripts), this cognitive “phonological
deficit” affects reading acquisition in dyslexia across languages.
While an impairment in segmental processing in dyslexia has long
been noted (e.g., Tallal and Piercy, 1974; Snowling, 1981), supra-
segmental sensitivity has only recently been a focus of study, and
then mainly in English (e.g., Wood and Terrell, 1998; Goswami
et al., 2002, 2010). This is surprising, as children’s phonological
sensitivity to supra-segmental features of speech develops early
in all languages, well before the onset of formal literacy instruc-
tion. Indeed, EEG studies reveal sensitivity to the dominant stress
patterns in the native language within the first months of life
(Friederici et al., 2007; Ragó et al., 2014).
For English-learning infants, this early sensitivity toward dom-
inant syllable stress patterns such as the “Strong-weak” (S-w)
trochaic motif has been shown to be important for word learn-
ing (Jusczyk et al., 1993; Echols et al., 1997). By the age of 7.5
months, English-learning infants are capable of using the trochaic
stress pattern as a template for segmenting words from con-
tinuous speech (Jusczyk et al., 1999). During early childhood,
pre-literate children across languages already exhibit an awareness
for rime and syllable units in speech. Pre-readers are able to iden-
tify pairs of words that rhyme (e.g., “mat” rhymes with “hat” but
not with “cut”), and to clap out the number of constituent sylla-
bles in a word (Bradley and Bryant, 1983; Treiman and Zukowski,
1991; Ziegler and Goswami, 2005). In fact, children’s phonolog-
ical awareness of rhyme, syllables and stress predicts their later
success in learning to read (Bradley and Bryant, 1983; de Bree
et al., 2006; Whalley and Hansen, 2006).
Sensitivity to supra-segmental features of speech, particularly
speech rhythm and syllable stress, also appears to be impaired
in children and adults with developmental dyslexia (e.g., Wood
and Terrell, 1998; Kitzen, 2001; Goswami et al., 2010; Holliman
et al., 2010, 2012; Leong et al., 2011; Mundy and Carroll, 2012).
Acoustically, prosodic rhythm and stress in the speech signal are
Frontiers in Human Neuroscience www.frontiersin.org February 2014 | Volume 8 | Article 96 |1
HUMAN NEUROSCIENCE
Leong and Goswami Impaired perception of temporal modulation in dyslexia
cued by a combination of amplitude, duration and frequency
changes (Hirst, 2006). The amplitude-based cues to rhythm
are contained within the slow-varying “amplitude envelope” of
speech (Plomp, 1983; Howell, 1984, 1988a,b; Greenberg et al.,
2003; Tilsen and Johnson, 2008; Leong, 2012; Tilsen and Arvaniti,
2013). These slowly-varying amplitude patterns also cue the
location of the rhythmic “perceptual (P)-center” or moment of
occurrence of a sound (Allen, 1972; Morton et al., 1976; Scott,
1993, 1998; Villing, 2010). The P-center forms the basis for the
deliberate rhythmic timing of speech and for synchronization of
speech between speakers (Cummins and Port, 1998; Cummins,
2003). The P-center is related perceptually to a particular rhyth-
mic marker within the speech amplitude envelope: the envelope
onset rise time. Perceptual sensitivity to rise time is impaired
in children and adults with dyslexia in a range of languages
(Goswami et al., 2002; Hämäläinen et al., 2005, 2009; Surányi
et al., 2009; Poelmans et al., 2011; Goswami et al., 2011a; see
Goswami, 2011, for a recent summary). The rise time or “attack”
time of a sound refers to the rate at which its amplitude increases
during its initial onset, and is closely related to its P-center and
rhythmic “beat strength.” For example, a trumpet note with a
fast rise time and early P-center will typically be perceived as
having a stronger beat than a bowed violin note with a slower
rise time and later P-center (Gordon, 1987). In speech, envelope
onset rise times distinguish between stressed and unstressed sylla-
bles (Leong et al., 2011; Goswami and Leong, 2013), and provide
phonetic cues to voice onset time and manner of articulation,
for example aiding in phonetic distinctions such as between /b/
and /w/ (Goswami et al., 2011b). Dyslexics’ difficulties in perceiving
amplitude envelope rise times across languages have led
to the theoretical suggestion that a deficit in neural rhythmic
entrainment to amplitude modulation (AM) patterns in speech
could underlie the phonological deficit in developmental dyslexia
(Goswami, 2011; “temporal sampling theory”).
NEURONAL OSCILLATORY ENTRAINMENT IN DYSLEXIA
The speech amplitude envelope contains a spectrum of AM at
different temporal rates, with certain key rates of AM associated
with characteristic timescales of speech information. For exam-
ple, the envelope is dominated by modulations that occur at
around 3–5 Hz, corresponding to the average duration of the syl-
lable (Greenberg et al., 2003; Greenberg, 2006). AMs at a slower
rate of 2 Hz are associated with inter-stress intervals in speech,
which have an average duration of 493 ms (Dauer, 1983). Toward
the other end of the modulation spectrum, faster modulations
immediately above the ‘classic’ syllable rate of 3–5 Hz correspond
to more quickly-uttered unstressed syllables (10 Hz, Greenberg
et al., 2003). Faster modulations up to 50 Hz are thought to
provide phonemic cues to manner of articulation, voicing, and
vowel identity (Rosen, 1992). Although the amplitude envelope
has been the focus of many speech intelligibility studies (e.g.,
Drullman et al., 1994a,b; Shannon et al., 1995), the spectral fine
structure also makes an important contribution to speech intelli-
gibility, particularly under adverse listening conditions (Qin and
Oxenham, 2003; Xu et al., 2005; Obleser et al., 2012).
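The correspondence between modulation rate and unit duration used in this paragraph is simply rate = 1/period. As a small worked check, using only the figures quoted above, the sketch below converts the 493 ms mean inter-stress interval into a rate and the 3–5 Hz syllable band into durations. The helper names are illustrative and not taken from the original study.

```python
def period_ms_to_rate_hz(period_ms: float) -> float:
    """Convert a repetition period in milliseconds to a rate in Hz."""
    return 1000.0 / period_ms


def rate_hz_to_period_ms(rate_hz: float) -> float:
    """Convert a rate in Hz to a repetition period in milliseconds."""
    return 1000.0 / rate_hz


# Mean inter-stress interval quoted in the text (Dauer, 1983)
stress_rate_hz = period_ms_to_rate_hz(493.0)

# Edges of the "classic" syllable-rate band quoted in the text (3-5 Hz)
syllable_periods_ms = (rate_hz_to_period_ms(3.0), rate_hz_to_period_ms(5.0))

print(f"Stress rate: {stress_rate_hz:.2f} Hz")  # about 2 Hz
print(f"Syllable durations: {syllable_periods_ms[1]:.0f} to {syllable_periods_ms[0]:.0f} ms")
```

On these figures, a 493 ms interval corresponds to a rate just above 2 Hz, and the 3–5 Hz band corresponds to syllable durations of roughly 200–333 ms.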
Recently, Poeppel and colleagues have proposed a neural
account of speech processing based on multi-time resolution of
the modulation patterns in the speech envelope (multi-time reso-
lution models, e.g., Poeppel, 2003; Giraud and Poeppel, 2012). In
multi-time resolution models, the brain is thought to track speech
information at different timescales using neuronal oscillations at
different frequencies. These neuronal oscillations entrain (“phase-
lock”) to speech modulation patterns on equivalent timescales,
so that peaks and troughs in oscillatory activity align with peaks
and troughs in modulations in the signal. According to Giraud
and Poeppel (2012), neuronal oscillatory activity in the Theta
band (3–7 Hz) tracks syllable patterns in speech, while slower
oscillatory activity in the Delta band (1–3 Hz) tracks phrasal
and intonational patterns, such as stress intervals. Fast oscilla-
tory activity in the Gamma band (25–80 Hz) is thought to track
quickly-varying phonetic information, such as formant transi-
tions and voice-onset times, which have timescales in the order
of tens of milliseconds. This convergence between characteristic
timescales in speech and the dominant neuronal oscillatory bands
in auditory cortex has been used to argue that oscillatory entrain-
ment (“phase locking”) may be an important neural mechanism
for parsing the speech signal into appropriately-sized linguistic
units for further lexical processing (Ghitza and Greenberg, 2009;
Schroeder and Lakatos, 2009; Giraud and Poeppel, 2012; Zion
Golumbic et al., 2012).
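Phase-locking of this kind is often quantified with a phase-locking value (PLV), the length of the mean unit vector of the phase difference between two signals. The following is a minimal sketch on synthetic phase series, not an analysis from this study. The function name, sampling rate, phase lag, jitter and the 5.3 Hz comparison rate are all illustrative assumptions.

```python
import cmath
import math
import random


def phase_locking_value(phases_a, phases_b):
    """Length of the mean unit vector of the phase difference.

    Returns a value near 1 when the phase lag is consistent over time
    and a value near 0 when there is no consistent phase relationship.
    """
    vectors = [cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b)]
    return abs(sum(vectors) / len(vectors))


random.seed(0)
fs = 100   # samples per second (assumed)
n = 1000   # 10 s of samples

# Phase of an idealized 4 Hz "syllable-rate" modulation in the stimulus
stim_phase = [2 * math.pi * 4.0 * (i / fs) for i in range(n)]

# An "entrained" oscillation: follows the stimulus with a fixed lag plus jitter
entrained_phase = [p - 0.5 + random.gauss(0.0, 0.3) for p in stim_phase]

# A free-running oscillation at a different rate (5.3 Hz): no stable lag
free_phase = [2 * math.pi * 5.3 * (i / fs) for i in range(n)]

plv_entrained = phase_locking_value(stim_phase, entrained_phase)
plv_free = phase_locking_value(stim_phase, free_phase)

print(f"PLV, entrained oscillation:    {plv_entrained:.2f}")  # close to 1
print(f"PLV, free-running oscillation: {plv_free:.2f}")       # close to 0
```

In this toy setting, the entrained oscillation yields a PLV close to 1 and the free-running oscillation a PLV close to 0. With real recordings the phase series would first have to be estimated from band-limited signals, a step that is omitted here.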
In line with dyslexics’ difficulties in rise time perception,
which are particularly evident for slower rise times (Richardson
et al., 2004; Stefanics et al., 2011), Goswami (2011) proposed
a “temporal sampling” framework to explain why the devel-
opment of accurate phonological representation of speech is
impaired across languages in developmental dyslexia. The tem-
poral sampling framework proposed that impaired phonological
representation in dyslexia could arise in part from impaired oscil-
latory entrainment to slow AMs (<10 Hz) that carry stress and
syllable patterning in speech (i.e., involving delta and theta oscil-
lations, see Goswami, 2011; Power et al., 2012, 2013; Soltész et al.,
2013). As neuronal oscillations in the cortex exhibit hierarchi-
cal nesting across slow and fast timescales (e.g., theta-gamma
phase-amplitude coupling; Lakatos et al., 2005), an impairment
in slow oscillatory activity (e.g., delta, stressed syllable rate; theta,
syllable rate) could also have consequences for speech encod-
ing at faster timescales, such as the Gamma or other phonetic
rate timescales. Indeed, recent studies using non-speech stimuli
have indicated that the hemispheric lateralization of Gamma-rate
oscillations (30 Hz) may be altered in dyslexia (Lehongre et al.,
2011, 2013).
AM PERCEPTION IN DYSLEXIA
Consistent with Goswami’s (2011) proposal, several AM percep-
tion studies based on non-speech stimuli and psychoacoustic
modulation thresholds indicate that dyslexics show poor AM
sensitivity below 10 Hz (e.g., Lorenzi et al., 2000; Amitay et al.,
2002; Rocheron et al., 2002; although note that Poelmans et al.,
2012 observed no deficit at 4 Hz). Studies reporting on modula-
tion thresholds for faster AM rates vary in whether they report
dyslexic deficits. For example, while McAnally and Stein (1997),
Witton et al. (1998), and Menell et al. (1999) all observed deficits
in dyslexics’ AM detection at 20 Hz, Hämäläinen et al. (2009)
failed to find a deficit at the same rate. Meanwhile, while no
dyslexic deficit at 80 Hz was reported by Hari et al. (1999), a
study by Poelmans et al. (2012) found atypical laterality effects
in EEG for 20 Hz AM speech-weighted noise, and a study by
Lehongre et al. (2011) found atypical laterality effects in MEG
for 35 Hz AM white noise. Similarly mixed results have been
observed for dyslexics’ perception of very slow “stress rate” AMs.
While an early study by Witton et al. (1998) found that the per-
ception of 2 Hz AMs was unimpaired in dyslexia, subsequent
studies by Stuart et al. (2006) and Hämäläinen et al. (2012) have
reported significant group differences in AM sensitivity at the
1 Hz and 2 Hz rates respectively. From the non-speech studies,
it is currently unclear whether dyslexics have a general deficit
in AM perception that affects all modulation rates, or whether
their deficit is specific to the AM rates <10 Hz that are identified
in temporal sampling theory (Goswami, 2011). It is also possi-
ble that a single auditory anomaly, impaired phonemic sampling
in left auditory cortex, accounts for the impaired phonological
processing found in dyslexia (Lehongre et al., 2011).
While AM studies are important for studying phase-locking,
their implications for real-life speech perception are limited
because the AM patterns used in these studies are artificial sinu-
soids and not real speech AMs. Real-speech AMs differ from
artificial sinusoids in several important ways. First, unlike sinu-
soids, speech AMs are not perfectly periodically regular, but
contain phase-advancements or delays that reduce their tempo-
ral predictability. Secondly, real-speech AMs differ in patterning
at different acoustic frequencies. These temporal differences in
modulation patterning across different “spectral channels” are
crucial for speech intelligibility (e.g., Shannon et al., 1995).
Finally, in real speech, AM patterns at all timescales (e.g., stress,
syllable and phoneme) are concurrently transmitted to the lis-
tener, unlike artificial AM studies in which only one AM rate is
presented at a time. During real-life speech processing, listeners
probably extract speech information using combinations of AMs
at different rates. For example, we have recently reported that lis-
teners detect prosodic RPs by computing the phase relationship
between two concurrent rates of speech AM: the “Stress” rate
(~2 Hz) and the “Syllable” rate (~4 Hz; see Leong, 2012). This
proposal is summarized in Figure 1. Dyslexics’ ability to use such
AM combinations in real speech has, to our knowledge, not been
tested.
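The Stress-Syllable phase computation described above can be illustrated with idealized sinusoidal AMs. This is only a sketch under stated assumptions. The mapping from Stress AM phase to a prominence index (PI), here PI = (1 + cos(phase)) / 2, is our assumption, chosen so that a syllable at the Stress AM peak scores 1 and a syllable at the trough scores 0. The exact transform used in Leong (2012) is not given in this section. The 0.5 threshold and the function names are likewise hypothetical.

```python
import math

STRESS_HZ = 2.0     # idealized Stress AM rate
SYLLABLE_HZ = 4.0   # idealized Syllable AM rate


def prominence_index(stress_phase: float) -> float:
    """Assumed mapping: 1 at the Stress AM peak (phase 0), 0 at the trough (phase pi)."""
    return 0.5 * (1.0 + math.cos(stress_phase))


def stress_pattern(n_syllables: int, stress_phase_offset: float = 0.0) -> str:
    """Label each syllable S (strong) or w (weak) from the Stress AM phase
    sampled at successive peaks of a cosine-shaped Syllable AM."""
    labels = []
    for k in range(n_syllables):
        t_peak = k / SYLLABLE_HZ  # time of the k-th Syllable AM peak, in seconds
        stress_phase = 2 * math.pi * STRESS_HZ * t_peak + stress_phase_offset
        labels.append("S" if prominence_index(stress_phase) > 0.5 else "w")
    return "".join(labels)


trochaic = stress_pattern(8)                              # Stress AM peaks on syllable 1
iambic = stress_pattern(8, stress_phase_offset=math.pi)   # Stress AM peaks on syllable 2

print(trochaic)  # SwSwSwSw
print(iambic)    # wSwSwSwS
```

Shifting the Stress AM by half a cycle relative to the Syllable AM is enough to turn the trochaic pattern into the iambic one, which is why the phase relationship between the two rates, rather than either rate alone, carries the rhythm pattern in this account. In real speech the AMs are not perfectly periodic, so the phase series would have to be extracted from the filtered envelopes rather than computed analytically.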
One obvious difficulty is that the complexity of the speech sig-
nal makes the extraction of specific features like cross-frequency
AM phase alignment at pre-determined rates very difficult.
Accordingly, studies using “vocoded” (envelope-only) real speech
are useful. In vocoder studies, the speech signal is split into dif-
ferent frequency channels (e.g., typically 2, 4, 8 or 16 channels),
the envelopes from each channel are used to modulate noise or
tone carriers, and are then recombined. The resulting speech
sounds like a harsh whisper, and is initially difficult to recog-
nize. Speech vocoder studies with dyslexic children consistently
suggest that their ability to use envelope cues for speech percep-
tion is impaired (e.g., Lorenzi et al., 2000; Johnson et al., 2011;
Nittrouer and Lowenstein, 2013). For example, Lorenzi et al.
(2000) used 4-channel noise-vocoded VCV syllables (e.g., /aCa/)
as stimuli, and found that both typically-developing and dyslexic
11-year-old children performed more poorly than adults when
using envelope cues (<500 Hz) for speech intelligibility. However,
while the speech recognition performance of control children
improved significantly over the course of five training sessions
during the experiment, the performance of dyslexic children did
not improve with training. Johnson et al. (2011) and Nittrouer
and Lowenstein (2013) found more direct evidence for impaired
speech envelope perception in dyslexia. In their study using 4-
and 8-channel semantically-unpredictable noise-vocoded mono-
syllabic sentences (e.g., “dumb shoes will sing”), Johnson et al.
(2011) found that 10–11 year-old children with reading diffi-
culties showed significantly poorer word recognition of vocoded
speech than control children, for both 4- and 8-channel stim-
uli. Similarly, Nittrouer and Lowenstein (2013) used 4-channel
noise-vocoded sentences and found that there were consistent
differences in speech perception performance between typically-
developing and dyslexic children, for both age groups tested (8–9
years and 10–11 years).
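The vocoding steps described above (split into frequency channels, extract each channel envelope, modulate a carrier, recombine) can be sketched as follows. This is a simplified illustration, not the processing used in any of the cited studies. It assumes NumPy, uses brick-wall FFT masking in place of a proper filter bank, and the channel edges, envelope cut-off and toy input signal are all hypothetical.

```python
import numpy as np


def bandpass_fft(x, fs, lo_hz, hi_hz):
    """Brick-wall band-pass by zeroing FFT bins outside [lo_hz, hi_hz)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo_hz) | (freqs >= hi_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def envelope(x, fs, cutoff_hz):
    """Envelope estimate: full-wave rectify, low-pass below cutoff_hz, clip at zero."""
    return np.clip(bandpass_fft(np.abs(x), fs, 0.0, cutoff_hz), 0.0, None)


def tone_vocode(x, fs, band_edges_hz, env_cutoff_hz=50.0):
    """Replace the fine structure in each channel with a sine-tone carrier."""
    t = np.arange(len(x)) / fs
    out = np.zeros(len(x))
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        env = envelope(bandpass_fft(x, fs, lo, hi), fs, env_cutoff_hz)
        carrier = np.sin(2 * np.pi * np.sqrt(lo * hi) * t)  # tone at the channel's geometric center
        out += env * carrier
    return out


# Toy input standing in for speech: a 1 kHz tone amplitude-modulated at 4 Hz
fs = 16000
t = np.arange(fs) / fs  # 1 s of samples
toy = (1 + np.cos(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)

edges = [100, 500, 1500, 4000]  # three hypothetical channels
vocoded = tone_vocode(toy, fs, edges)
print(vocoded.shape)
```

The output keeps the slow amplitude pattern of each channel but discards the original fine structure, which is the sense in which vocoded speech is "envelope-only". Lowering `env_cutoff_hz`, or band-pass filtering the envelope instead of low-pass filtering it, restricts which modulation rates survive.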
In each of these studies, the vocoded stimulus typically con-
tained a very wide range of envelope AM rates rather than a
single AM rate (e.g., the envelope was low-pass filtered under
500 Hz). Thus, a complication of these experiments is that a
deficit in perceiving speech modulations at a specific rate (e.g.,
4 Hz) would be masked if the dyslexic children were able to
extract redundant speech information at other modulation rates
(e.g., 20 Hz) to compensate for a slow AM deficit (see Drullman,
2006). Conversely, if a difference in performance is observed
(as was the case in these studies), it is not clear whether this
is caused by a general deficit in AM processing that affects all
modulation rates, a specific deficit at certain AM rates (e.g., per-
taining to stress, syllable or phoneme-rate information), or a
deficit in combining AM information across different temporal
rates. Therefore, to assess speech AM perception in dyslexia more
closely, a combination of the two approaches (from AM studies
and vocoding studies) is needed. Ideally, the stimuli should be
created from the envelopes of real speech, but AMs at specific
modulation rates (or combinations of modulation rates) should
be systematically isolated from these real envelopes. Here, we
present one such study.
EXPERIMENTAL RATIONALE AND HYPOTHESES
Given the prior literature on the relationship between rhythmic
awareness and reading (e.g., Thomson et al., 2006; Thomson
and Goswami, 2008; Goswami and Leong, 2013; Tierney and
Kraus, 2013), we were specifically interested in assessing dyslexics’
ability to use different AM rates in speech for rhythm perception
(rather than speech intelligibility per se). Accordingly,
we devised a rhythm perception task using rhythmic sentences
(nursery rhymes) that had been tone-vocoded using different
AM rates. For normal adult listeners, speech rhythm perception
relies on sensitivity to the phase-relationship between 2
key AM rates (stress ~2 Hz and syllable ~4 Hz; Leong, 2012).
Furthermore, in prior work on rhythmic entrainment, we have
shown that children and adults with dyslexia show “tapping to the
beat” impairments at 2 Hz (Thomson et al., 2006; Thomson and
Goswami, 2008), while when tapping to speech rhythms adults
with dyslexia show impairment at the syllable rate (~4 Hz; Leong
and Goswami, 2014). Accordingly, here we presented dyslexic and
control adult listeners with tone-vocoded (envelope-only) sentences
that contained only a narrow range of AM rates under
20 Hz. In order that the modulation patterns in our stimuli would
be realistically speech-like, these modulation bands did not contain
only a single AM rate (i.e., a “4 Hz” sinusoid). Rather, each
AM band contained a narrow range of AM rates centered around
a target rate (e.g., 2.3–7 Hz, centered around 4 Hz), each of which
we refer to in shorthand by the center rate (e.g., here as “4 Hz”
or “Syllable-rate AMs”).
FIGURE 1 | Computation of strong-weak (s-w) syllable stress patterns
using the phase-relationship between “Stress”- and “Syllable”-rate
amplitude modulations (AMs) in the speech envelope, illustrated
with the trochaic (s-w) nursery rhyme sentence “Mary Mary quite
contrary.” Left, (A) the original waveform of the speech signal is shown
at the top, with the whole-band amplitude envelope superimposed as a
bold line. The envelope is band-pass filtered at three different rates to
produce a Stress AM (~2 Hz), a Syllable AM (~4 Hz) and a Sub-beat AM
(~14 Hz) respectively. Right, (B) to compute the syllable stress pattern of
the sentence, the oscillatory phase series of the Stress AM and the
Syllable AM are extracted. Here, AM phase values are projected onto a
cosine function for ease of visualization. Note that the 8 Syllable AM
cycles correspond to the 8 spoken syllables in the sentence. The
concurrent Stress AM phase at Syllable AM peaks (indicated with vertical
dotted lines) is transformed into a prominence index (PI), shown in the
bar graph at the top. Syllable AM peaks that occur near the oscillatory
peak of the Stress AM achieve PI values of ~1, while Syllable AM peaks
that occur near the oscillatory trough of the Stress AM achieve PI values
of ~0. Here, syllables with a high PI (near 1) are considered “strong”
while syllables with a low PI (near 0) are considered “weak.” Note that
this Stress-Syllable AM phase relationship accurately reflects the trochaic
syllable stress pattern of the sentence.
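The Syllable band quoted in the text (2.3–7 Hz) is not symmetric about 4 Hz on a linear frequency axis. One possible reading is that the band is centered on a logarithmic axis. This is our assumption and is not stated in the text. The short check below compares the arithmetic and geometric centers of the quoted band edges; the variable names are illustrative.

```python
import math

low_hz, high_hz = 2.3, 7.0  # Syllable band edges quoted in the text

arithmetic_center = (low_hz + high_hz) / 2        # center on a linear axis
geometric_center = math.sqrt(low_hz * high_hz)    # center on a log-frequency axis

# Distance from the geometric center to each band edge, in octaves
octaves_below = math.log2(geometric_center / low_hz)
octaves_above = math.log2(high_hz / geometric_center)

print(f"Arithmetic center: {arithmetic_center:.2f} Hz")
print(f"Geometric center:  {geometric_center:.2f} Hz")
print(f"Band extends {octaves_below:.2f} octaves below and {octaves_above:.2f} octaves above")
```

On the quoted edges, the geometric center falls very close to 4 Hz, whereas the arithmetic center is noticeably higher, which is consistent with (but does not prove) a log-spaced band definition.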
Our dependent variable was the accuracy of speech rhythm
perception. We created stimuli that contained modulations from
either a single narrow AM band (i.e., Stress only ~2 Hz, Syllable
only ~4 Hz, Sub-beat only ~14 Hz), or from paired combinations
of AM bands (Stress + Syllable and Syllable + Sub-beat). On
the basis of the temporal sampling framework (Goswami, 2011),
we predicted no dyslexic impairment at the sub-beat band rate
of ~14 Hz (included as a control frequency band), but significant
impairment at both rates <10 Hz (Syllable and Stress rates). On
the basis of our prior data on rhythmic entrainment to speech
rhythms (Leong and Goswami, 2014), we also predicted that
dyslexics would have difficulty in combining speech information
across different temporal modulation rates. As Leong’s modeling
work (Leong, 2012) has shown that rhythm perception depends
critically on the Stress + Syllable AM combination, it may be that
particular dyslexic difficulty is found for this combination.
Note that in this experiment we used the “Sub-beat” rate
(~14 Hz) as a control AM band, not the “phoneme rate”
(30 Hz) that is the theoretical focus of AM work by Lehongre
et al. (2011, 2013). Our decision was motivated by the classic
psychophysical studies of Drullman et al. (1994a,b). These
studies indicated that AM rates up to 16 Hz are the most
important for speech intelligibility, and that the inclusion of
faster AM rates above 16 Hz results in little improvement to
intelligibility. Furthermore, in a rhythmic context, we noticed
that unstressed syllables are often compressed to a “sub-beat”
length in order to fit within the standard “beat” length of one
ordinary syllable. For example, in the nursery rhyme sentence
“Humpty Dumpty sat on the wall,” the syllables “sat” and “on”
are compressed together, or reduced, to fit the space of one
regular syllable like “Hum.” Consequently, the overall trochaic
rhythm of the sentence is not disrupted. Thus, the “Sub-beat”
rate (14 Hz) is likely to correspond to speech modulations
that are important for intelligibility, but which contribute little
toward the overall rhythmic patterning of “Strong” and “weak”
beats in a sentence, making this an ideal control modulation
band. As the cited “phoneme” rate (30 Hz) commonly refers
to the timescale of formant transition patterns in speech (e.g.,
Giraud and Poeppel, 2012), we plan to examine this rate in
the context of frequency modulation (FM) perception in future
studies.
METHODS
PARTICIPANTS
Twenty-one adults (9 M, 12 F) with developmental dyslexia and
26 control adults (7 M, 19 F) participated in the study. All dyslexic
participants had received a formal diagnosis of developmental
dyslexia and also showed significant reading and phonological
deficits according to our own test battery. No participant had any
other diagnosed auditory or learning difficulties; all spoke English
as a first language and were aged under 40 years. As shown in
Table 1, dyslexic and control participants were matched on IQ
[2 subscales of the Wechsler Abbreviated Scale of Intelligence
(WASI), Wechsler, 1999: A non-verbal subscale (Block Design)
and a verbal subscale (Vocabulary)]. However, there was a signif-
icant age difference between dyslexic and control groups, where
controls were slightly older on average [dyslexic mean age = 22.9
years; control mean age = 25.5 years; F(1,45) = 5.66, p < 0.05].
To account for this age difference, all our subsequent statistical
analyses include age as a covariate. As this statistical solution is
impartial, we felt that it would be preferable to manually exclud-
ing certain participants on the basis of their age, which would
entail subjectivity as to how many and which participants to
exclude.
Table 1 | Group performance on standardized ability, literacy and
phonological tests.
Task                              Dyslexic (SE)   Controls (SE)   F(1,45)
Age (years)                       22.9 (0.6)      25.5 (0.8)      5.66*
IQ                                129.6 (1.0)     129.8 (1.5)     0.01
- Non-Verbal IQ T score           70.6 (0.7)      70.7 (0.8)      0.01
- Verbal IQ T score               62.0 (1.0)      62.0 (1.5)      0.00
Auditory STM score (out of 16)    10.3 (0.4)      13.0 (0.4)      22.91***
Reading standard score            110.8 (1.4)     115.8 (1.0)     8.81**
Spelling standard score           104.7 (1.5)     117.0 (1.2)     43.68***
Phonology score (out of 30)       26.1 (0.4)      28.5 (0.3)      22.13***
Standard errors (SE) are given in parentheses. *p < 0.05; **p < 0.01; ***p < 0.001.
Consistent with their diagnosis, dyslexics performed significantly
more poorly than controls in standardized tests for literacy
[Wide Range Achievement Test (WRAT-III), Reading and
Spelling scales, Wilkinson, 1993] and phonological awareness
[Phonological Assessment Battery (PhAB), Spoonerisms task,
Fredrickson et al., 1997; Wechsler Adult Intelligence Scale-Revised
(WAIS-R) forward digit span subtest, Wechsler, 1981]. Thus,
despite the relatively high IQ of both groups (reflecting the fact
that these were high-performing students at a world-class uni-
versity), dyslexic participants still lagged behind their peers in
their reading, spelling and phonological awareness skills. Both
control and dyslexic participants also took part in other stud-
ies on rhythm perception and production (see also Leong and
Goswami, 2014). Ethical approval for the study was obtained
from the Cambridge Psychology Research Ethics Committee, and
all participants were given a modest payment for taking part in
the experiments.
MATERIALS
In line with our focus on rhythm, children’s nursery rhymes were
used as stimuli because these are a form of naturally-occurring,
rhythmically-rich speech material, whose rhythm patterns (RPs)
should be familiar to and easily identified by listeners. Four duple-
meter nursery rhymes were used for the experiment, taking the
first line of each nursery rhyme (8 syllables). The sentences fell
into either of two RPs, as shown in Table 2. Two sentences had
a “S-w” or trochaic pattern. These were “MA-ry MA-ry QUITE
con-TRA-ry” and “SIM-ple SI-mon MET a PIE-man” (stressed
syllables in CAPS). The other two sentences had a “w-S” or
iambic pattern. These were “as I was GO-ing TO st IVES” and
“the QUEEN of HEARTS she MADE some TARTS.” We chose to
use trochaic and iambic patterns because these are the dominant
prosodic motifs found in children’s nursery rhymes (Gueron,
1974), and were easily understood by our participants. A total
of 4 sentences (2 per RP) were used to encourage participants
to attend to the global “S-w” or “w-S” rhythm patterning that
was common between the 2 exemplars of each pattern. Using two
exemplars also prevented reliance on minor non-rhythmic vari-
ations (e.g., total stimulus length) to perform the task. We did
not use more than 4 sentences as this would have unnecessarily
increased the difficulty of the task (which was already high). Each
sentence was approximately 2 s in length (Mary: 2.01 s; Simon:
2.12 s; St Ives: 2.37 s; Queen: 2.31 s). The nursery rhymes were
spoken by a female native speaker of British English who was
articulating in time to a 4 Hz (syllable rate) metronome beat. The
speaker was instructed to produce the RP of each nursery rhyme
Table 2 | List of nursery rhyme sentences and their rhythm pattern.
Rhythm pattern (S, Strong; w, weak) | Nursery rhyme sentence (CAPS, Strong syllable)
SwSwSwSw (trochaic) | “MA-ry MA-ry QUITE con-TRA-ry”
SwSwSwSw (trochaic) | “SIM-ple SI-mon MET a PIE-man”
wSwSwSwS (iambic) | “as I was GO-ing TO st IVES”
wSwSwSwS (iambic) | “the QUEEN of HEARTS she MADE some TARTS”
Frontiers in Human Neuroscience www.frontiersin.org February 2014 | Volume 8 | Article 96 |5
Leong and Goswami Impaired perception of temporal modulation in dyslexia
as clearly as possible. Utterances were digitally recorded using a
TASCAM digital recorder (44.1 kHz, 24-bit), and the metronome
was not audible in the final recording.
RHYTHM PERCEPTION TASK
In each trial, participants heard one of four tone-vocoded nursery
rhyme sentences. They were asked to indicate the target sen-
tence (one of four) by selecting an appropriate response button.
Participants were told to base their judgment on the RP of the
stimulus. Given that the vocoded sentences had a clear rhythm
but were unintelligible (see Section Signal Processing Steps for
Tone Vocoding), we did not expect participants’ sentence iden-
tification to exceed 50% in accuracy (i.e., we expected accurate
discrimination between trochaic vs iambic sentences, but not
within 2 trochaic or iambic sentences). All participants were first
given 20 practice trials, during which they heard the four sen-
tences as originally spoken, without any vocoding. This enabled
participants to learn the RP of each sentence, and to become
familiar with the response button mapping. Subsequently, par-
ticipants performed the task with tone-vocoded stimuli only. The
tone-vocoded stimuli retained the temporal pattern of each nurs-
ery rhyme sentence, but were completely unintelligible. Cartoon
icons representing the four response options were displayed on
the computer screen throughout the experiment to help to reduce
the memory load of the task. Auditory stimuli were presented
diotically using Sennheiser HD580 headphones at 70 dB SPL. The
experimental task was programmed in Presentation and delivered
using a Lenovo ThinkPad Edge laptop.
Signal processing steps for tone vocoding
AM bands were extracted from the amplitude envelope of the
speech signal of each nursery rhyme sentence using two differ-
ent methods. In the first method, the amplitude envelope was
extracted using the Hilbert transform. This Hilbert envelope was
then passed through a modulation filterbank (MFB) of band-pass
filters, which effectively isolated speech AMs corresponding to
the (1) “Stress” rate (0.8–2.3 Hz), (2) “Syllable” rate (2.3–7 Hz),
and (3) “Sub-beat” (7–20 Hz) rate. Please see Stone and Moore
(2003) for details of the spectral filterbank design, which was
adapted to be used as a MFB here. It is possible that artificial
modulations may be introduced into the stimuli by the MFB
method, since band-pass filters can introduce modulations near
the center-frequency of the filter, through “ringing.” Therefore,
a second AM-hierarchy extraction method was also used. This
was Probabilistic Amplitude Demodulation (PAD; Turner and
Sahani, 2011), and did not involve the Hilbert transform or fil-
tering. Rather, the PAD method estimates the signal envelope
using a model-based approach in which the signal is assumed
to comprise the product of a positive slow envelope and a fast
carrier. Bayesian statistical inference is used to invert the model,
thereby identifying the envelope which best matches the data and
the a priori assumptions (i.e., a positive-valued envelope whose
mean is constant over time). This envelope extraction protocol
can be run recursively at different timescales, yielding AMs at
the same modulation rates as those derived from MFB filtering
(Turner and Sahani, 2007; Turner, 2010). All participants heard
both MFB-derived and PAD-derived vocoded stimuli in the same
experiment. It was reasoned that if participants produced the
same pattern of results with two methods of AM extraction that
operate using very different sets of principles, the observed effects
were likely to have arisen from real features in speech rather than
filtering artifacts.
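The first step of the Hilbert-based method can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the processing code used in the study: it builds the analytic signal with a naive discrete Fourier transform (adequate only for a short toy signal) and takes its magnitude as the amplitude envelope. The toy signal, the sample rate and all function names are assumptions made for the example; the modulation filterbank and the PAD method are not reproduced here.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (toy sizes only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT matching dft() above."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def hilbert_envelope(x):
    """Magnitude of the analytic signal of a real sequence x."""
    n = len(x)
    spectrum = dft(x)
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # zero the negative frequencies: the standard analytic-signal recipe.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        for k in range(1, n // 2):
            weights[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    analytic = idft([s * w for s, w in zip(spectrum, weights)])
    return [abs(z) for z in analytic]


# Toy signal (assumption): a 40 Hz carrier modulated at 4 Hz,
# sampled at 200 Hz for 1 s.
FS = 200
t_axis = [i / FS for i in range(FS)]
true_env = [1.0 + 0.5 * math.sin(2 * math.pi * 4 * t) for t in t_axis]
signal = [e * math.sin(2 * math.pi * 40 * t) for e, t in zip(true_env, t_axis)]
estimated_env = hilbert_envelope(signal)
max_error = max(abs(a - b) for a, b in zip(estimated_env, true_env))
```

For this toy signal the recovered envelope should match the known 4 Hz modulator closely; for real speech, an FFT-based routine would be the practical choice.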
The MFB- and PAD-derived AMs were used to modulate a
500 Hz sine-tone carrier in a single-channel vocoder. A multi-
channel vocoder was not used to ensure that the sentences would
be completely unintelligible. As the dependent variable in the
experiment was how well participants could identify each sen-
tence on the basis of its AM RP, all other cues to sentence identity
needed to be removed. Therefore, the phonetic fine structure of the
signal was intentionally discarded. In addition, the AMs derived
from the amplitude envelope were used to modulate the sine-tone
carrier, rather than being combined back with the fine struc-
ture of the signal. To create single-AM band stimuli (e.g., Stress
only), the appropriate AM band was extracted and combined with
the 500 Hz sine-tone carrier. A 30 ms-ramped pedestal at chan-
nel RMS power was added prior to combining with the carrier.
To create double-AM band stimuli (e.g., Stress + Syllable), the
two AM bands were first combined via addition (for MFB) or
multiplication (for PAD) before combining with the carrier. All
stimuli were equalized to 70 dB. These signal processing steps are
illustrated in Figure 2.
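The combination and vocoding steps described above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the AM bands are replaced by positive-valued sinusoidal stand-ins, the sample rate is an arbitrary choice, and the 30 ms-ramped pedestal and the 70 dB level equalization are omitted. Only the 500 Hz carrier and the addition (MFB) versus multiplication (PAD) rule are taken from the text; all names are hypothetical.

```python
import math

FS = 8000          # sample rate in Hz (assumption, chosen for a light example)
CARRIER_HZ = 500   # sine-tone carrier frequency stated in the text
DURATION_S = 2.0   # roughly the length of each sentence


def toy_am(rate_hz, depth, n_samples, fs=FS):
    """Positive-valued sinusoidal stand-in for an extracted AM band."""
    return [1.0 + depth * math.sin(2 * math.pi * rate_hz * i / fs)
            for i in range(n_samples)]


def combine(am_a, am_b, method):
    """Combine two AM bands: addition for MFB, multiplication for PAD."""
    if method == "MFB":
        return [a + b for a, b in zip(am_a, am_b)]
    if method == "PAD":
        return [a * b for a, b in zip(am_a, am_b)]
    raise ValueError("method must be 'MFB' or 'PAD'")


def vocode(envelope, fs=FS, carrier_hz=CARRIER_HZ):
    """Single-channel tone vocoder: envelope multiplied by a sine carrier."""
    return [e * math.sin(2 * math.pi * carrier_hz * i / fs)
            for i, e in enumerate(envelope)]


n = int(FS * DURATION_S)
stress_am = toy_am(2.0, 0.5, n)     # ~2 Hz "Stress" rate
syllable_am = toy_am(4.0, 0.5, n)   # ~4 Hz "Syllable" rate

single_band = vocode(syllable_am)
double_band_mfb = vocode(combine(stress_am, syllable_am, "MFB"))
double_band_pad = vocode(combine(stress_am, syllable_am, "PAD"))
```

Because a single carrier is used, the output carries only the temporal pattern of the envelope, which is consistent with the stated aim of removing other cues to sentence identity.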
The resulting tone-vocoded sentences had clear temporal patterns
ranging from “Morse-code” to flutter, but were otherwise
completely unintelligible (see Audios 1–5 in Supplementary
Material). Figure 3 illustrates the different types of AM-vocoded
stimuli used in the experiment, contrasting trochaic (“Mary
Mary”) and iambic (“the Queen of Hearts”) sentences.
Design
As explained in Section Experimental Rationale and Hypotheses,
five different AM bands or band combinations were used for
vocoding. This generated 3 types of single AM band stimuli
(Stress only; Syllable only; Sub-beat only) and 2 types of paired
AM band stimuli (Stress + Syllable; Syllable + Sub-beat). For
each AM combination, each of the 4 nursery rhyme sentences was
presented 10 times (5 MFB and 5 PAD stimuli) in a fully random-
ized order, giving 40 trials per AM type and 200 trials in total
for the entire experiment. Participants were scored in terms of
their sentence identification accuracy for each AM type (Accuracy
scores), and their ability to discriminate more generally between
trochaic and iambic RPs (RP scores). We had previously found
that control participants showed no difference in listening accu-
racy for MFB and PAD stimuli (Leong, 2012). In our preliminary
analysis of the current data, we likewise found that there was no
difference in performance for PAD as compared to MFB stimuli
[F(1,44)=2.74, p=0.11]. Therefore, to simplify further analy-
sis, the scores for the two types of stimuli in each condition were
averaged into a single mean score for each participant.
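The trial counts in this design can be checked with a short enumeration. The sketch below is illustrative only: the labels, the seed, and the choice to shuffle across all 200 trials (rather than within each AM type) are assumptions, since the text specifies only that the order was fully randomized.

```python
import itertools
import random

AM_TYPES = ["Stress only", "Syllable only", "Sub-beat only",
            "Stress + Syllable", "Syllable + Sub-beat"]
SENTENCES = ["Mary", "Simon", "St Ives", "Queen"]
METHODS = ["MFB", "PAD"]
REPEATS_PER_METHOD = 5


def build_trials(seed=0):
    """Return a shuffled list of (am_type, sentence, method) trials."""
    trials = [
        (am, sentence, method)
        for am, sentence, method in itertools.product(AM_TYPES, SENTENCES, METHODS)
        for _ in range(REPEATS_PER_METHOD)
    ]
    random.Random(seed).shuffle(trials)  # seed is an assumption, for repeatability
    return trials


trials = build_trials()
trials_per_am_type = {am: sum(1 for t in trials if t[0] == am) for am in AM_TYPES}
```

Each AM type receives 4 sentences × 2 methods × 5 repeats = 40 trials, and the five AM types together give 200 trials.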
RESULTS
SENTENCE IDENTIFICATION ACCURACY
Figure 4 shows the mean Accuracy scores achieved by the control
and dyslexic groups for each AM type. To check for floor effects
in performance (which could obscure group differences), we
FIGURE 2 | Illustration of the signal processing steps involved in
tone-vocoding for the nursery rhyme sentence “Mary Mary quite
contrary.” (A) The original speech signal with its wholeband amplitude
envelope overlaid in bold. (B) The Stress AM, Syllable AM and Sub-beat AMs
are extracted from the envelope using either the MFB or PAD method. Single
and double AM band vocoded stimuli are then generated by combining the
AMs with a 500 Hz sine tone. To generate single AM band stimuli (bottom
left), each single AM band is multiplied individually with the sine tone. To
generate double band AM stimuli (bottom right), the two AMs are first
combined via addition (MFB) or multiplication (PAD) before multiplication with
the sine tone. The resulting double band vocoded stimulus contains temporal
patterning at two main rates (i.e., second-order modulation).
assessed whether participants’ scores for each AM type were
significantly above the level of chance (25%). Accordingly, sep-
arate one-sample t-tests were conducted for control and dyslexic
groups against the test value of 0.25. As this necessitated 10 t-tests
in total, Holm’s sequential Bonferroni correction was applied
to the p-value threshold for significance (Holm, 1979). Holm’s
sequential Bonferroni correction entails a smaller reduction in
statistical power than the standard Bonferroni correction, and
is a widely-used alternative for controlling for Type 1 family-
wise error (Rice, 1989; Perneger, 1998). In the Holm-Bonferroni
method, the threshold for significance is computed as 0.05/(10 −
[rank of uncorrected p-value] + 1). Therefore, for the smallest
(rank 1) p-value, the Holm-Bonferroni-corrected threshold
for significance was 0.05/(10 − 1 + 1) = 0.005, whereas for the
largest (rank 10) p-value, the threshold for significance was
0.05/(10 − 10 + 1) = 0.05. The results of the t-tests indicated
that both controls and dyslexics performed significantly above
chance for all 5 AM types. Accordingly, we investigated whether
there were group differences across the 5 AM types.
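The Holm-Bonferroni thresholds described above can be reproduced with a few lines of code. This is a minimal sketch of the standard step-down procedure; the function names are hypothetical and the example p-values are invented for illustration, not taken from the study.

```python
def holm_thresholds(n_tests, alpha=0.05):
    """Threshold for each rank (1 = smallest p-value): alpha / (n - rank + 1)."""
    return [alpha / (n_tests - rank + 1) for rank in range(1, n_tests + 1)]


def holm_reject(p_values, alpha=0.05):
    """Step-down Holm procedure; returns one True/False per input p-value."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha / (n - rank + 1):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values also fail
    return reject


thresholds_10 = holm_thresholds(10)
# Hypothetical p-values, for illustration only (not taken from the study).
example_p = [0.001, 0.04, 0.03, 0.004]
example_decisions = holm_reject(example_p)
```

With 10 tests, the first and last thresholds are 0.005 and 0.05, matching the values given in the text.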
Two repeated measures ANCOVA analyses were conducted.
In the first analysis, we compared group performance for the
3 single AM bands (Stress only, Syllable only, Sub-beat only).
Single AM band (3 levels) was entered into the ANCOVA as the
within-subjects factor, and Group (2 levels) was entered as the
between subjects factor. Age was entered as a covariate factor.
The results of the first ANCOVA showed no significant main
effect of Group [F(1,44)=0.14, p=0.71], and no interaction
between single AM band and Group [F(2,88) = 0.37, p = 0.69].
This suggests that controls and dyslexics were performing equally
well in their use of single AM-band information for rhythm
perception.
In the second RM ANCOVA analysis, we investigated group
differences in the ability to combine information across more
than one AM band. The second ANCOVA entered double-AM
band (2 levels, Stress + Syllable, Syllable + Sub-beat) as the
within-subjects factor, and Group (2 levels) as the between sub-
jects factor. Age was again entered as a covariate factor. This
second ANCOVA showed a significant main effect of Group
[F(1,44)=4.51, p<0.05], but the interaction between AM band
and Group did not approach significance [F(1,44)=0.19, p=
0.66]. Therefore, dyslexic participants were significantly less
accurate than controls when they had to combine AM information
across different rates, that is, when the Syllable-rate AM was
paired with AMs at either the Stress rate or the Sub-beat rate.
RHYTHM PATTERN DISCRIMINATION
Next, we wanted to ascertain whether participants were able to
use these speech AMs to discriminate between the two major
FIGURE 3 | Comparison of the 5 types of AM tone-vocoded stimuli
produced for trochaic (S-w) and iambic (w-S) nursery rhyme sentences.
Stimuli corresponding to the trochaic sentence “Mary Mary” are shown in
the left column. Stimuli corresponding to the iambic sentence “the Queen of
Hearts” are shown in the right column. Top row: Original acoustic waveform
of each sentence in black, with whole-band amplitude envelope overlaid in
red. Rows (A–E): Stress AM, Syllable AM, Sub-beat AM, Stress + Syllable AM
and Syllable + Sub-beat AM stimuli, respectively.
RPs that characterized the 4 nursery rhyme sentences [i.e.,
trochaic (“S-w”) vs. iambic (“w-S”)]. Accordingly, we re-scored
participants’ responses according to whether they had correctly
identified the RP of each sentence as trochaic or iambic,
disregarding whether they had identified the actual sentence cor-
rectly (i.e., for the stimulus sentence “Mary Mary,” responses of
“Mary Mary” and “Simple Simon” were both scored as the cor-
rect RP, as both were trochaic responses). The resulting mean
RP scores for iambic sentences (Ives, Queen) and trochaic sentences
(Mary, Simon) are shown in Figure 5. To check for floor
effects in performance (which could obscure group differences),
we assessed whether participants’ scores for each AM type were
significantly above the level of chance (50%). Accordingly, sepa-
rate one-sample t-tests were conducted for control and dyslexic
groups against the test value of 0.5. As this necessitated 20 t-
tests in total, Holm’s sequential Bonferroni correction was applied
to the p-value threshold for significance (Holm, 1979). For the
smallest (rank 1) p-value, the Holm-Bonferroni-corrected threshold
for significance was 0.05/(20 − 1 + 1) = 0.0025, whereas for
the largest (rank 20) p-value, the threshold for significance was
0.05/(20 − 20 + 1) = 0.05.
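The re-scoring rule can be made concrete with a small sketch. The sentence labels, function names and example responses below are hypothetical and serve only to show how a single response yields both an Accuracy score and an RP score.

```python
RHYTHM_PATTERN = {
    "Mary": "trochaic",
    "Simon": "trochaic",
    "St Ives": "iambic",
    "Queen": "iambic",
}


def score_trial(stimulus, response):
    """Return (accuracy_score, rp_score) for one trial, each 0 or 1."""
    accuracy = int(response == stimulus)
    rp = int(RHYTHM_PATTERN[response] == RHYTHM_PATTERN[stimulus])
    return accuracy, rp


def mean_scores(trials):
    """Average Accuracy and RP scores over (stimulus, response) pairs."""
    scored = [score_trial(s, r) for s, r in trials]
    n = len(scored)
    return (sum(a for a, _ in scored) / n, sum(rp for _, rp in scored) / n)


# Hypothetical responses, for illustration only.
example_trials = [("Mary", "Mary"), ("Mary", "Simon"),
                  ("Mary", "Queen"), ("Queen", "St Ives")]
example_accuracy, example_rp = mean_scores(example_trials)
```

Under this rule a response can be wrong for Accuracy yet correct for RP, which is why chance is 25% for the former and 50% for the latter.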
As shown in Figure 5 (), controls and dyslexics always
performed significantly above chance when making a binary
discrimination of the rhythm of trochaic (T) sentences (with the
exception of controls in the Sub-beat AM condition). By con-
trast, for iambic (I) sentences, dyslexics never performed above
chance in binary rhythm discrimination, whereas controls per-
formed significantly above chance when listening to Stress-only,
and Stress +Syllable AM types. Given the presence of clear floor
effects for binary rhythm discrimination of iambic sentences, we
were unfortunately unable to draw further conclusions regard-
ing group differences for these sentence types (as both controls
and dyslexics were performing at chance in many conditions).
However, both groups had performed significantly above chance
for trochaic sentences when listening to Stress only AMs, Syllable
only AMs, Stress +Syllable AMs and Syllable +Sub-beat AMs.
Accordingly, we performed repeated measures ANCOVAs on these
RP scores for trochaic sentences only.
In the first ANCOVA analysis, we compared group performance
for the 2 single AM bands only, taking single AM band
(2 levels) as the within-subjects factor, Group (2 levels) as the
between subjects factor, and Age as the covariate. Consistent with
the previous Accuracy analysis, there was no significant main
effect of Group [F(1,44)=0.16, p=0.69], and no interaction
between single AM band and Group [F(1,44) = 0.11, p = 0.75].
This suggests that controls and dyslexics did not differ in their
ability to use Stress only and Syllable only AM band information
to make trochaic-iambic distinctions. We then analyzed double-
AM band performance in a similar fashion. This time double-AM
band (2 levels, Stress + Syllable, Syllable + Sub-beat) was the
within-subjects factor, Group (2 levels) was the between subjects
factor, and Age was the covariate. Unlike the Accuracy analy-
sis, the ANCOVA showed no significant main effect of Group
[F(1,44)=1.90, p=0.17]. There was also no interaction between
FIGURE 4 | Group mean Accuracy scores for each AM band and band
combination. Error bars indicate standard error.
double-AM band and Group [F(1,44)=0.17, p=0.68]. Hence
dyslexic participants appeared to recognize trochaic RPs based on
pairs of AMs as well as controls.
These results should be interpreted with caution, however.
Firstly, only performance for trochaic sentences could be analyzed
meaningfully (meaning that half the total dataset could not be
analyzed). Secondly, the RP scores computed here reflect partici-
pants’ rhythm discrimination indirectly rather than directly. The
RP scores measure the perceptual confusability of sentences (i.e.,
how participants make guesses when they are unsure of the cor-
rect sentence identity). Perceptual confusability will depend in
large part on the global RPs of the stimuli, but will also include
other factors like total duration and perceptual grouping effects,
as well as participants’ own cognitive strategies. Nevertheless, the
data show that perceptual confusability was maximal for trochaic
sentences, for both groups.
CORRELATIONS BETWEEN AM PERCEPTION, PHONOLOGY, AND
LITERACY
By hypothesis, a perceptual deficit in using AM patterns to
discriminate rhythmic sentences should be related to both
phonological awareness and reading skills in our partici-
pants. Accordingly, we investigated the relationship between
participants’ sentence identification Accuracy for each AM band
or combination, and their performance on memory, reading
and phonological tasks. Table 3 shows the partial correlation
matrix between accuracy of performance in the rhythm percep-
tion task (by AM type) and participants’ memory, reading, and
FIGURE 5 | Group mean Rhythm Pattern scores for each AM band and band combination, shown separately for iambic (“I”: Ives & Queen) and
trochaic (“T”: Mary & Simon) sentences. Error bars indicate standard error. () AM bands where performance was above chance (50%) for each group.
Table 3 | Pearson’s r partial correlation values between accuracy of performance in rhythm perception (by AM type), and general ability, literacy and phonology measures.
Partial correlations controlling for Age and IQ, by AM combination
Measure / Group | Stress only | Syllable only | Sub-beat only | Stress + Syllable | Syllable + Sub-beat
AUDITORY STM
All | 0.05 | 0.11 | 0.13 | 0.35* | 0.17
Con | 0.07 | 0.06 | 0.09 | 0.09 | 0.34
Dys | 0.07 | 0.31 | 0.26 | 0.52* | 0.55*
READING
All | 0.14 | 0.02 | 0.21 | 0.13 | 0.21
Con | 0.07 | 0.13 | 0.38$ | 0.21 | 0.09
Dys | 0.32 | 0.03 | 0.14 | 0.17 | 0.18
SPELLING
All | 0.17 | 0.12 | 0.25ˆ | 0.17 | 0.32*
Con | 0.15 | 0.12 | 0.28 | 0.09 | 0.11
Dys | 0.38 | 0.06 | 0.48* | 0.05 | 0.27
PHONOLOGY
All | 0.13 | 0.30* | 0.18 | 0.40** | 0.11
Con | 0.04 | 0.27 | 0.22 | 0.12 | 0.16
Dys | 0.17 | 0.42& | 0.21 | 0.52* | 0.07
For each measure, correlations over both groups (All), for controls only (Con) and for dyslexics only (Dys) are shown in separate rows. Age and IQ are controlled in all the correlations.
*p < 0.05; **p < 0.01; $p = 0.07; &p = 0.074; ˆp = 0.096.
phonological ability, with age and IQ controlled. Correlations
were performed with both groups combined, as well as separately.
As shown in Table 3, there were several significant relationships
between AM performance, literacy and phonology. Taking the
group as a whole, the conceptually important Stress + Syllable
speech AMs were significantly related to phonological awareness
(r=0.40, p<0.01), as well as to auditory short-term memory
(digit span, r = 0.35, p < 0.05). Performance in the Syllable +
Sub-beat condition was also significantly associated with spelling
performance, which was not predicted (r = 0.32, p < 0.05). When
considering the dyslexic group alone, the table shows that dyslex-
ics’ phonological awareness was significantly related to their sensi-
tivity to Stress +Syllable speech AMs (r=0.52, p<0.01), while
the relationship between Syllable AM performance and phono-
logical awareness approached significance (r=0.42, p=0.074).
Further, spelling skills were significantly related to Sub-beat AM
sensitivity (r=0.48, p<0.05). Dyslexics also showed a signif-
icant relationship between their auditory short-term memory
skills and their performance in the two combined AM condi-
tions (r=0.52, p<0.05 for Stress +Syllable; r=0.55, p<0.05
for Syllable +Sub-beat). This may indicate that dyslexics’ abil-
ity to use multiple patterns of temporal information to recognize
speech rhythm in our experimental paradigm was constrained by
their lower short-term memory capacity in comparison to con-
trols. When considered as a group, controls showed no significant
relationships between performance in the AM RP recognition
task, phonology and reading, although there was a trend toward
a correlation between Sub-beat AM sensitivity and spelling (r=
0.38, p=0.07). Overall, therefore, the partial correlations show
that the perceptual deficit in using AM patterns to detect speech
rhythm was related to phonological awareness for the dyslexic
participants only.
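A partial correlation with two covariates, as used in Table 3 (Age and IQ), can be computed from pairwise Pearson correlations using the standard recursive formula. The sketch below is an illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, and the toy data are constructed so that two variables correlate only through a shared covariate.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_first_order(r_xy, r_xz, r_yz):
    """Correlation of x and y with one covariate z partialled out."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_corr_two_covariates(x, y, z1, z2):
    """Correlation of x and y controlling for z1 and z2 (recursive formula)."""
    r = pearson
    # First remove z1 from every pairwise correlation involving x, y and z2.
    r_xy_z1 = partial_first_order(r(x, y), r(x, z1), r(y, z1))
    r_xz2_z1 = partial_first_order(r(x, z2), r(x, z1), r(z2, z1))
    r_yz2_z1 = partial_first_order(r(y, z2), r(y, z1), r(z2, z1))
    # Then remove z2 from the z1-adjusted correlations.
    return partial_first_order(r_xy_z1, r_xz2_z1, r_yz2_z1)


# Toy data (assumption): x and y are related only through the covariate z1.
z1 = [1, 1, 1, 1, -1, -1, -1, -1]
z2 = [1, 1, -1, -1, 1, 1, -1, -1]
a = [1, -1, 1, -1, 1, -1, 1, -1]
b = [1, -1, -1, 1, 1, -1, -1, 1]
x = [ai + 2 * zi for ai, zi in zip(a, z1)]
y = [bi + 2 * zi for bi, zi in zip(b, z1)]
raw_r = pearson(x, y)
partial_r = partial_corr_two_covariates(x, y, z1, z2)
```

In this toy case the raw correlation is substantial, yet it vanishes once the covariates are controlled, which is the kind of confound that controlling for Age and IQ is meant to rule out.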
DISCUSSION AND CONCLUSION
Here, we tested the hypothesis that perceptual difficulties in pro-
cessing the AM patterns in speech that yield speech rhythm
are associated with the development of impaired phonological
representations for words by dyslexic individuals. The devel-
opment of impaired phonological representations of speech is
the cognitive hallmark of dyslexia across languages (Snowling,
2000; Ziegler and Goswami, 2005; Goswami, 2011). We tested
the sensitivity of adults with dyslexia to AM patterning yield-
ing speech rhythm for several different AM bands and band
combinations below 20 Hz that are present within the ampli-
tude envelope of speech. We found that dyslexic participants
performed significantly more poorly than control adults when
they were required to combine Syllable-rate AMs with AMs at
other rates (Stress + Syllable or Syllable + Sub-beat). However,
the dyslexic participants performed on par with controls when
asked to utilize the temporal information at a single AM rate
only (Stress only, Syllable only, or Sub-beat only). Accordingly, we
conclude that dyslexics’ difficulties with AM perception appear
to occur across more than one speech timescale (particularly
involving the Syllable rate). Moreover, as predicted by the temporal
sampling framework, a perceptual deficit in utilizing AM
patterns in speech is related to phonological development in
dyslexia.
A deficit in Syllable-rate combination or synchronization with
other rates would support the findings of Leong and Goswami
(2014), in which the same group of adult dyslexics tested here
showed differences in their phase of rhythmic entrainment at the
Syllable rate in a rhythmic tapping task to nursery rhyme targets.
A difference in Syllable phase of entrainment suggests that dyslex-
ics have temporal differences in their processing of Syllable-rate
information (e.g., they may perceive P-centers as occurring earlier
in a speech sound as compared to controls). Here, participants
with dyslexia were significantly poorer at recognizing the target
nursery rhymes when they had to combine Syllable AM cues with
prosodic stress AM cues (Stress +Syllable).
In fact, a circular-linear correlation analysis of the two datasets
(Leong and Goswami, 2014 and the current study) revealed that
there was a strong correlation between participants’ Syllable AM
phase of tapping in the rhythmic entrainment task, and their
sensitivity to Stress + Syllable AMs in the current
task (r = 0.55, p < 0.01). An earlier Syllable AM phase of rhythmic
tapping in Leong and Goswami (2014) was associated with
poorer perception of Stress + Syllable AMs in the current study.
No other AM band in the current study yielded significant corre-
lations with tapping phase in the prior study. Others have argued
that the perception and production of rhythm both rely on sim-
ilar cognitive and neural mechanisms, such as the entrainment
of neuronal oscillatory activity (Martin, 1972; Liberman and
Mattingly, 1985; Kotz and Schwartze, 2010). In the current context,
it is noteworthy that the common locus of dyslexic deficit
across perception and production tasks involved the Syllable-rate
of temporal processing.
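A circular-linear correlation relates an angular variable (here, tapping phase) to a linear one (here, an accuracy score). The text does not state which estimator was used, so the sketch below assumes the commonly used coefficient built from the correlations of the linear variable with the cosine and sine of the angle. Function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def circ_linear_corr(angles_rad, x):
    """Circular-linear correlation between angles (radians) and a linear variable."""
    cos_a = [math.cos(a) for a in angles_rad]
    sin_a = [math.sin(a) for a in angles_rad]
    r_xc = pearson(x, cos_a)
    r_xs = pearson(x, sin_a)
    r_cs = pearson(cos_a, sin_a)
    r_squared = (r_xc ** 2 + r_xs ** 2 - 2 * r_xc * r_xs * r_cs) / (1 - r_cs ** 2)
    return math.sqrt(max(r_squared, 0.0))  # guard against tiny negative rounding


# Toy data (assumption): eight evenly spaced phases.
phases = [2 * math.pi * k / 8 for k in range(8)]
phase_locked = [math.cos(p - math.pi / 4) for p in phases]  # depends on phase
unrelated = [math.cos(2 * p) for p in phases]               # no first-order link
r_locked = circ_linear_corr(phases, phase_locked)
r_unrelated = circ_linear_corr(phases, unrelated)
```

This coefficient is non-negative by construction, so the direction of an effect (for example, that earlier phases go with poorer perception) has to be read from the data rather than from the sign of r.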
Utilizing younger participants, Power et al. (2013) have shown
in a rhythmic speech processing task that children with dyslexia
also have a different preferred phase of entrainment in the delta
band (2 Hz), both in response to auditory speech alone, and when
speech information is audio-visual. The ‘temporal misalignment’
of both stress- and syllable-rate information in dyslexia found
by Power et al. (2013) and the current study could explain
why individuals with dyslexia develop phonological representa-
tions for words that are impaired (or specified differently) in
comparison to those of unaffected individuals. If temporal pro-
cessing of slower-rate information in speech is impaired, for
example because oscillatory phase alignment is inaccurate, then
this would affect the development of the entire mental lexicon
of word forms, not simply of syllable-level and prosodic information.
If syllable stress representation and syllabic parsing are
different in dyslexia because of a perceptual deficit in utilizing
AM patterns in speech, this would also affect phonetic-level infor-
mation. Phonemes are perceived more accurately when they are
in stressed syllables (Mehta and Cutler, 1988). Over the course
of development, if dyslexic children consistently fail to capture
rich, high-dimensional representations of the temporal patterns
that occur on multiple timescales in speech (e.g., concurrently
encoding Stress patterns, Syllable patterns and Phoneme patterns
into an integrated representation of a word), this would yield the
impoverished or atypical phonological representations that are
developed by children with dyslexia across languages.
At first glance, our data appear to be inconsistent with the
results of previous AM perception studies as summarized in the
Introduction. These non-speech studies generally indicated that
individuals with dyslexia had poorer AM perception at the 4 Hz
rate (Syllable AM). Here, we find no differences in performance
between controls and dyslexics when making rhythm judgments
on the basis of the Syllable AM (4 Hz) only. However, it should
be noted that the dependent variable being assessed in the cur-
rent study is different from that of psychophysical AM studies.
Whereas AM studies assess modulation detection thresholds based
on just noticeable differences in modulation depth or rate (e.g.,
Lorenzi et al., 2000; Rocheron et al., 2002), here we assess nursery
rhyme recognition using real-life speech AMs that contain strong
(and likely supra-threshold) modulation patterns. As such, it is
not surprising that no group differences were observed for our
single AM rate stimuli. It is possible that significant group differ-
ences could have been observed at single AM rates if we had used
sentences with weaker modulation patterns, such as whispered or
mumbled speech. However, we did observe a significant difference
in dyslexics’ ability to combine or integrate speech modulation
patterns across the Stress and Syllable rates, which is consistent
with dyslexics’ poorer speech perception performance in vocoder
studies (e.g., Lorenzi et al., 2000; Johnson et al., 2011; Nittrouer
and Lowenstein, 2013). This difference cannot be attributed to a
general lack of attention or engagement by dyslexic participants,
since they performed as well as controls with the single AM band
stimuli. Rather, dyslexics appear to have a particular difficulty in
making use of modulation information that is patterned at more
than one timescale, here when Syllable-rate information has to
be temporally synchronized with Stress-rate speech information
or Sub-beat information. However, as we did not include paired
AM combinations that did not involve the Syllable AM rate (e.g.,
Stress + Phoneme), we are not able to determine whether this difficulty
is specific to Syllable AM combinations only, or whether it
would also occur for other combinations of speech AMs.
It should also be observed that our participants found the
rhythm judgment task very difficult. This high level of diffi-
culty stemmed from the fact that the sentences were (deliberately)
unintelligible, forcing our participants to rely solely on the acous-
tic modulations in the stimuli to perform rhythm judgments,
without recourse to lexical factors. Consequently, accuracy scores
for both controls and dyslexics (although significantly above
chance) were relatively low (below 50%). In future studies, the
issue of task difficulty may be ameliorated by using a tone-
vocoder with more than 1 spectral channel (i.e., 3 or 4 channels),
which would have the effect of increasing speech intelligibility.
However, increasing the intelligibility of the stimuli would also
introduce a new confound: participants would now be able to use
their lexical knowledge to augment their perceptual judgments
of speech rhythm. Nonetheless, this trade-off might produce
stronger effects. Lexical “boot-strapping” effects could be reduced
by using semantically unpredictable sentences (following Johnson
et al., 2011).
According to the temporal sampling framework (Goswami,
2011), the combination impairment for Stress +Syllable rate
AMs found here should affect speech perception even when lis-
tening to clear (i.e., fully intelligible) speech, which has strong
modulation patterns that are above the threshold for detection.
Interestingly, this was exactly what Lorenzi et al. (2000) found
in their study. They reported that dyslexic children performed
significantly more poorly than adults and control children even
when listening to clear, unprocessed (not-vocoded) VCV syllables
(these syllables will contain significant Syllable-rate modulation,
but not Stress-rate modulation). This controversial result might
possibly be explained by other factors like memory or attention;
nonetheless, data like these suggest that speech AM perception in
dyslexia clearly requires more investigation. Current data suggest
that individuals with dyslexia are less sensitive to small changes in
modulation depth and rate, particularly around the syllable and
stress rates in speech. Future studies should explore how dyslexics’
difficulties with processing slow modulations affect their ability
to integrate and synchronize slow-varying stress and syllable
information with more quickly-varying phoneme-rate informa-
tion in speech. These perceptual difficulties could be one source
of the impaired or atypical phonological representations stored in
the mental lexicon of word forms by dyslexic individuals.
Finally, we note that, given recent proposals by Poeppel and
colleagues regarding neural oscillatory phase-locking to speech
modulation patterns (e.g., Ghitza, 2011; Giraud and Poeppel,
2012), the perceptual difficulties that we observe here could be
underpinned by impaired phase alignment and cross-frequency
phase synchronization between different neuronal oscillatory
rates. For example, dyslexics could have poorer neuronal oscilla-
tory synchronization between theta oscillations (syllable rate) and
delta (stress rate) or gamma (phoneme rate) oscillations in the
cortex. Similarly, the neural interplay between theta (syllable rate)
and alpha (8–13 Hz, similar to the sub-beat rate here) oscillations
during speech comprehension might be atypical in dyslexia as
well (Obleser and Weisz, 2012). To date, such cross-frequency neu-
ral synchronization has not been studied in dyslexia (although see
Leong and Goswami, 2014, for an assessment of cross-frequency
AM synchronization in dyslexics’ speech). Such studies could be
very informative in the quest to identify cross-linguistic percep-
tual and neural deficits underpinning cognitive markers such as
impaired phonology in developmental dyslexia.
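As an illustration of the kind of cross-frequency measure that such studies might use, the following Python sketch computes an n:m phase synchronization index between a simulated delta-rate (2 Hz) phase series and a simulated theta-rate (4 Hz) phase series. It is a minimal sketch and not the analysis used in the present study: the function names, the random-walk model of phase jitter, and all parameter values are assumptions made purely for illustration.

```python
import cmath
import math
import random


def phase_sync_index(phase_slow, phase_fast, n, m):
    """n:m phase synchronization index: |mean(exp(i * (n*phase_slow - m*phase_fast)))|.

    Returns a value between 0 (no consistent phase relation) and 1 (perfect locking).
    """
    total = sum(cmath.exp(1j * (n * a - m * b)) for a, b in zip(phase_slow, phase_fast))
    return abs(total) / len(phase_slow)


def simulate_phases(freq_hz, duration_s, rate_hz, jitter_sd, rng):
    """Phase (radians) of an oscillator at freq_hz with random-walk phase jitter.

    Hypothetical toy model: jitter_sd is the standard deviation of the phase
    step added at every sample; 0.0 gives a perfectly regular oscillator.
    """
    phases, drift = [], 0.0
    for k in range(int(duration_s * rate_hz)):
        drift += rng.gauss(0.0, jitter_sd)
        phases.append(2 * math.pi * freq_hz * k / rate_hz + drift)
    return phases


rng = random.Random(0)
delta = simulate_phases(2.0, 10.0, 100.0, 0.0, rng)           # stress-rate phase
theta_locked = simulate_phases(4.0, 10.0, 100.0, 0.0, rng)    # exactly two cycles per delta cycle
theta_drifting = simulate_phases(4.0, 10.0, 100.0, 0.5, rng)  # same mean rate, unstable phase

# Two theta cycles occur per delta cycle, so 2 * delta phase is compared with 1 * theta phase.
psi_locked = phase_sync_index(delta, theta_locked, n=2, m=1)      # 1.0 by construction
psi_drifting = phase_sync_index(delta, theta_drifting, n=2, m=1)  # much lower than psi_locked

print(f"locked: {psi_locked:.3f}, drifting: {psi_drifting:.3f}")
```

In this toy example, the index falls when the faster oscillator's phase wanders relative to the slower one, even though both keep the same average rates. With neural recordings, the phase series would instead have to be estimated from band-limited signals, a step that is not shown here.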
ACKNOWLEDGMENTS
This research was funded by a Harold Hyam Wingate Research
Scholarship to Victoria Leong and by a Medical Research Council
grant G0902375 to Usha Goswami.
SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found
online at: http://www.frontiersin.org/journal/10.3389/fnhum.2014.00096/abstract
Audio 1 | Stress-only AM (MFB, Trochaic).
Audio 2 | Syllable-only AM (MFB, Trochaic).
Audio 3 | Sub-beat only AM (MFB, Trochaic).
Audio 4 | Stress + Syllable AM (MFB, Trochaic).
Audio 5 | Syllable + Sub-beat AM (MFB, Trochaic).
REFERENCES
Allen, G. (1972). The location of rhythmic stress beats in English: an experimental
study. Lang. Speech 15, 72–100.
Amitay, S., Ahissar, M., and Nelken, I. (2002). Auditory processing deficits
in reading disabled adults. J. Assoc. Res. Otolaryngol. 3, 302–320. doi:
10.1007/s101620010093
Bradley, L., and Bryant, P. E. (1983). Categorising sounds and learning to read - a
causal connection. Nature 301, 419–421. doi: 10.1038/301419a0
Cummins, F. (2003). Practice and performance in speech produced synchronously.
J. Phon. 31, 139–148. doi: 10.1016/S0095-4470(02)00082-7
Cummins, F., and Port, R. (1998). Rhythmic constraints on stress timing in English.
J. Phon. 26, 145–171. doi: 10.1006/jpho.1998.0070
Dauer, R. (1983). Stress-timing and syllable timing revisited. J. Phon. 11, 51–62.
de Bree, E., Wijnen, F., and Zonneveld, W. (2006). Word stress production in
three-year-old children at risk of dyslexia. J. Res. Read. 29, 304–317. doi:
10.1111/j.1467-9817.2006.00310.x
Drullman, R. (2006). “The significance of temporal modulation frequencies for
speech intelligibility,” in Listening to Speech: An Auditory Perspective, eds S.
Greenberg and W. A. Ainsworth (Mahwah, NJ: Lawrence Erlbaum Associates),
39–47.
Drullman, R., Festen, J. M., and Plomp, R. (1994a). Effect of temporal enve-
lope smearing on speech reception. J. Acoust. Soc. Am. 95, 1053–1064. doi:
10.1121/1.408467
Drullman, R., Festen, J. M., and Plomp, R. (1994b). Effect of reducing slow tem-
poral modulations on speech reception. J. Acoust. Soc. Am. 95, 2670–2680. doi:
10.1121/1.409836
Echols, C. H., Crowhurst, M. J., and Childers, J. B. (1997). The perception of rhyth-
mic units in speech by infants and adults. J. Mem. Lang. 36, 202–225. doi:
10.1006/jmla.1996.2483
Fredrickson, N., Frith, U., and Reason, R. (1997). Phonological Assessment Battery
(Standardised Edn.). Windsor: NFER-Nelson.
Friederici, A., Friedrich, M., and Christophe, A. (2007). Brain responses in 4-
month-old infants are already language-specific. Curr. Biol. 17, 1208–1211. doi:
10.1016/j.cub.2007.06.011
Ghitza, O. (2011). Linking speech perception and neurophysiology: speech decod-
ing guided by cascaded oscillators locked to the input rhythm. Front. Psychol.
2:130. doi: 10.3389/fpsyg.2011.00130
Ghitza, O., and Greenberg, S. (2009). On the possible role of brain rhythms
in speech perception: Intelligibility of time compressed speech with periodic
and aperiodic insertions of silence. Phonetica 66, 113–126. doi: 10.1159/000208934
Giraud, A. L., and Poeppel, D. (2012). Cortical oscillations and speech processing:
emerging computational principles and operations. Nat. Neurosci. 15, 511–517.
doi: 10.1038/nn.3063
Gordon, J. W. (1987). The perceptual attack time of musical tones. J. Acoust. Soc.
Am. 82, 88–105. doi: 10.1121/1.395441
Goswami, U. (2008). The development of reading across languages. Ann. N.Y. Acad.
Sci. 1145, 1–12. doi: 10.1196/annals.1416.018
Goswami, U. (2011). A temporal sampling framework for developmental dyslexia.
Trends Cogn. Sci. 15, 3–10. doi: 10.1016/j.tics.2010.10.001
Goswami, U., Fosker, T., Huss, M., Mead, N., and Szücs, D. (2011b). Rise time and
formant transition duration in the discrimination of speech sounds: the ba-wa
distinction in developmental dyslexia. Dev. Sci. 14, 34–43. doi: 10.1111/j.1467-7687.2010.00955.x
Goswami, U., Gerson, D., and Astruc, L. (2010). Amplitude envelope perception,
phonology and prosodic sensitivity in children with developmental dyslexia.
Read. Writ. 23, 995–1019. doi: 10.1007/s11145-009-9186-6
Goswami, U., and Leong, V. (2013). Speech rhythm and temporal structure:
converging perspectives? Lab. Phonol. 4, 67–92. doi: 10.1515/lp-2013-0004
Goswami, U., Thomson, J., Richardson, U., Stainthorp, R., Hughes, D., Rosen,
S., et al. (2002). Amplitude envelope onsets and developmental dyslexia:
a new hypothesis. Proc. Natl. Acad. Sci. U.S.A. 99, 10911–10916. doi:
10.1073/pnas.122368599
Goswami, U., Wang, H.-L., Cruz, A., Fosker, T., Mead, N., and Huss,
M. (2011a). Language-universal sensory deficits in developmental dyslexia:
English, Spanish, and Chinese. J. Cogn. Neurosci. 23, 325–337. doi:
10.1162/jocn.2010.21453
Greenberg, S. (2006). “A multi-band framework for understanding spoken lan-
guage,” in Understanding Speech: An Auditory Perspective, eds S. Greenberg and
W. Ainsworth (Mahwah, NJ: LEA), 411–434.
Greenberg, S., Carvey, H., Hitchcock, L., and Chang, S. (2003). Temporal properties
of spontaneous speech - a syllable-centric perspective. J. Phon. 31, 465–485. doi:
10.1016/j.wocn.2003.09.005
Gueron, J. (1974). The meter of nursery rhymes: an application of the Halle-Keyser
theory of meter. Poetics 12, 73–111. doi: 10.1016/0304-422X(74)90006-0
Hämäläinen, J. A., Leppänen, P. H. T., Eklund, K., Thomson, J., Richardson, U.,
Guttorm, T. K., et al. (2009). Common variance in amplitude envelope percep-
tion tasks and their impact on phoneme duration perception and reading and
spelling in Finnish children with reading disabilities. Appl. Psycholinguist. 30,
511–530. doi: 10.1017/S0142716409090250
Hämäläinen, J. A., Rupp, A., Soltész, F., Szücs, D., and Goswami, U. (2012). Reduced
phase locking to slow amplitude modulation in adults with dyslexia: an MEG
study. Neuroimage 59, 2952–2961. doi: 10.1016/j.neuroimage.2011.09.075
Hämäläinen, J., Leppänen, P., Torppa, M., Müller, K., and Lyytinen, H. (2005).
Detection of sound rise time by adults with dyslexia. Brain Lang. 94, 32–42.
doi: 10.1016/j.bandl.2004.11.005
Hari, R., Sääskilahti, A., Helenius, P., and Uutela, K. (1999). Nonimpaired
auditory phase locking in dyslexic adults. Neuroreport 10, 2347–2348. doi:
10.1097/00001756-199908020-00023
Hirst, D. J. (2006). “Prosodic aspects of speech and language,” in Encyclopedia
of Language and Linguistics, 2nd Edn., ed K. Brown (Oxford: Elsevier),
539–546.
Holliman, A. J., Wood, C., and Sheehy, K. (2010). The contribution of sensitivity
to speech rhythm and non-speech rhythm to early reading development. Educ.
Psychol. 30, 247–267. doi: 10.1080/01443410903560922
Holliman, A. J., Wood, C., and Sheehy, K. (2012). A cross-sectional study of
prosodic sensitivity and reading difficulties. J. Res. Read. 35, 32–48. doi:
10.1111/j.1467-9817.2010.01459.x
Holm, S. (1979). A simple sequential rejective multiple test procedure. Scand. J.
Stat. 6, 65–70.
Howell, P. (1984). “An acoustic determinant of perceived and produced
anisochrony,” in Proceedings of the Tenth International Congress of Phonetic
Sciences, eds M. P. R. Van den Broecke and A. Cohen (Dordrecht: Foris).
Howell, P. (1988a). Prediction of P-center location from the distribution of
energy in the amplitude envelope: I. Percept. Psychophys. 43, 90–93. doi:
10.3758/BF03208978
Howell, P. (1988b). Prediction of P-center location from the distribution of
energy in the amplitude envelope: II. Percept. Psychophys. 43, 99. doi:
10.3758/BF03208980
Johnson, E. P., Pennington, B. F., Lowenstein, J. H., and Nittrouer, S. (2011).
Sensitivity to structure in the speech signal by children with speech sound
disorder and reading disability. J. Commun. Disord. 44, 294–314. doi:
10.1016/j.jcomdis.2011.01.001
Jusczyk, P. W., Cutler, A., and Redanz, N. (1993). Preference for the predominant
stress patterns of English words. Child Dev. 64, 675–687. doi: 10.2307/1131210
Jusczyk, P. W., Houston, D., and Newsome, M. (1999). The beginnings of word
segmentation in English-learning infants. Cogn. Psychol. 39, 159–207. doi:
10.1006/cogp.1999.0716
Kitzen, K. R. (2001). Prosodic Sensitivity, Morphological Ability and Reading
Ability in Young Adults With and Without Childhood Histories of Reading
Difficulty. Doctoral dissertation, University of Columbia. Dissertation Abstracts
International, 62, 0460A.
Kotz, S. A., and Schwartze, M. (2010). Cortical speech processing unplugged:
a timely subcortico-cortical framework. Trends Cogn. Sci. 14, 392–399. doi:
10.1016/j.tics.2010.06.005
Lakatos, P., Shah, A. S., Knuth, K. H., Ulbert, I., Karmos, G., and Schroeder, C.
E. (2005). An oscillatory hierarchy controlling neuronal excitability and stim-
ulus processing in the auditory cortex. J. Neurophysiol. 94, 1904–1911. doi:
10.1152/jn.00263.2005
Lehongre, K., Morillon, B., Giraud, A. L., and Ramus, F. (2013). Impaired auditory
sampling in dyslexia: further evidence from combined fMRI and EEG. Front.
Hum. Neurosci. 7:454. doi: 10.3389/fnhum.2013.00454
Lehongre, K., Ramus, F., Villiermet, N., Schwartz, D., and Giraud, A. L. (2011).
Altered low-gamma sampling in auditory cortex accounts for the three main
facets of dyslexia. Neuron 72, 1080–1090. doi: 10.1016/j.neuron.2011.11.002
Leong, V. (2012). Prosodic Rhythm in the Speech Amplitude Envelope: Amplitude
Modulation Phase Hierarchies (AMPHs) and AMPH Models. Doctoral disserta-
tion, University of Cambridge.
Leong, V., and Goswami, U. (2014). Assessment of rhythmic entrainment at multi-
ple timescales in dyslexia: evidence for disruption to syllable timing. Hear. Res.
308, 141–161. doi: 10.1016/j.heares.2013.07.015
Leong, V., Hamalainen, J., Soltesz, F., and Goswami, U. (2011). Rise time perception
and detection of syllable stress in adults with developmental dyslexia. J. Mem.
Lang. 64, 59–73. doi: 10.1016/j.jml.2010.09.003
Liberman, A. M., and Mattingly, I. G. (1985). The motor theory of
speech perception revised. Cognition 21, 1–36. doi: 10.1016/0010-0277(85)90021-6
Lorenzi, C., Dumont, A., and Füllgrabe, C. (2000). Use of temporal envelope
cues by children with developmental dyslexia. J. Speech Lang. Hear. Res. 43,
1367–1379. doi: 10.1044/jslhr.4306.1367
Martin, J. G. (1972). Rhythmic (hierarchical) versus serial structuring in speech
and other behavior. Psychol. Rev. 79, 487–509. doi: 10.1037/h0033467
McAnally, K. I., and Stein, J. F. (1997). Scalp potentials evoked by amplitude-
modulated tones in dyslexia. J. Speech Lang. Hear. Res. 40, 939–945.
Mehta, G., and Cutler, A. (1988). Detection of target phonemes in spontaneous and
read speech. Lang. Speech 31, 135–156.
Menell, P., McAnally, K. I., and Stein, J. F. (1999). Psychophysical sensitivity and
physiological response to amplitude modulation in adult dyslexic listeners. J.
Speech Lang. Hear. Res. 42, 797–803.
Morton, J., Marcus, S. M., and Frankish, C. (1976). Perceptual centres (P-centres).
Psychol. Rev. 83, 405–408. doi: 10.1037/0033-295X.83.5.405
Mundy, I. R., and Carroll, J. M. (2012). Speech prosody and developmental
dyslexia: reduced phonological awareness in the context of intact phonological
representations. J. Cogn. Psychol. 24, 560–581. doi: 10.1080/20445911.2012.662341
Nittrouer, S., and Lowenstein, J. H. (2013). Perceptual organization of speech sig-
nals by children with and without dyslexia. Res. Dev. Disabil. 34, 2304–2325.
doi: 10.1016/j.ridd.2013.04.018
Obleser, J., Herrmann, J., and Henry, M. J. (2012). Neural oscillations in
speech: don’t be enslaved by the envelope. Front. Hum. Neurosci. 6:250. doi:
10.3389/fnhum.2012.00250
Obleser, J., and Weisz, N. (2012). Suppressed alpha oscillations predict intelli-
gibility of speech and its acoustic details. Cereb. Cortex 22, 2466–2477. doi:
10.1093/cercor/bhr325
Perneger, T. V. (1998). What’s wrong with Bonferroni adjustments. Br. Med. J. 316,
1236–1238. doi: 10.1136/bmj.316.7139.1236
Plomp, R. (1983). “Perception of speech as a modulated signal,” Proceedings of the
10th International Congress of Phonetic Sciences. (Utrecht), 29–40.
Poelmans, H., Luts, H., Vandermosten, M., Boets, B., Ghesquière, P., and
Wouters, J. (2011). Reduced sensitivity to slow-rate dynamic auditory infor-
mation in children with dyslexia. Res. Dev. Disabil. 32, 2810–2819. doi:
10.1016/j.ridd.2011.05.025
Poelmans, H., Luts, H., Vandermosten, M., Boets, B., Ghesquière, P., and
Wouters, J. (2012). Auditory steady state cortical responses indicate deviant
phonemic-rate processing in adults with dyslexia. Ear Hear. 33, 134–143. doi:
10.1097/AUD.0b013e31822c26b9
Poeppel, D. (2003). The analysis of speech in different temporal integration win-
dows: cerebral lateralization as ‘asymmetric sampling in time’. Speech Commun.
41, 245–255. doi: 10.1016/S0167-6393(02)00107-3
Power, A. J., Mead, N., Barnes, L., and Goswami, U. (2012). Neural entrainment
to rhythmically-presented auditory, visual and audio-visual speech in children.
Front. Psychol. 3:216. doi: 10.3389/fpsyg.2012.00216
Power, A. J., Mead, N., Barnes, L., and Goswami, U. (2013). Neural entrainment to
rhythmic speech in children with developmental dyslexia. Front. Hum. Neurosci.
7:777. doi: 10.3389/fnhum.2013.00777
Qin, M. K., and Oxenham, A. J. (2003). Effects of simulated cochlear implant pro-
cessing on speech reception in fluctuating maskers. J. Acoust. Soc. Am. 114,
446–454. doi: 10.1121/1.1579009
Ragó, A., Honbolygó, F., Róna, Z., Beke, A., and Csépe, V. (2014). Effect
of maturation on suprasegmental speech processing in full- and preterm
infants: a mismatch negativity study. Res. Dev. Disabil. 35, 192–202. doi:
10.1016/j.ridd.2013.10.006
Rice, W. R. (1989). Analyzing tables of statistical tests. Evolution 43, 223–225. doi:
10.2307/2409177
Richardson, U., Thomson, J., Scott, S. K., and Goswami, U. (2004). Auditory pro-
cessing skills and phonological representation in dyslexic children. Dyslexia 10,
215–233. doi: 10.1002/dys.276
Rocheron, I., Lorenzi, C., Füllgrabe, C., and Dumont, A. (2002). Temporal
envelope perception in dyslexic children. Neuroreport 13, 1683–1687. doi:
10.1097/00001756-200209160-00023
Rosen, S. (1992). Temporal information in speech: acoustic, auditory and lin-
guistic aspects. Philos. Trans. R. Soc. Lond. B Biol. Sci. 336, 367–373. doi:
10.1098/rstb.1992.0070
Schroeder, C. E., and Lakatos, P. (2009). Low-frequency neuronal oscillations
as instruments of sensory selection. Trends Neurosci. 32, 9–18. doi:
10.1016/j.tins.2008.09.012
Scott, S. K. (1993). P-Centres in Speech: An Acoustic Analysis. Unpublished Ph.D.
thesis, University College London.
Scott, S. K. (1998). The point of P-centres. Psychol. Res. 61, 4–11. doi:
10.1007/PL00008162
Shannon, R. V., Zeng, F.-G., Kamath, V., Wygonski, J., and Ekelid, M. (1995).
Speech recognition with primarily temporal cues. Science 270, 303–304. doi:
10.1126/science.270.5234.303
Snowling, M. J. (1981). Phonemic deficits in developmental dyslexia. Psychol. Res.
43, 219–234. doi: 10.1007/BF00309831
Snowling, M. J. (2000). Dyslexia, 2nd Edn. Oxford: Blackwell Publishers.
Soltész, F., Szücs, D., Leong, V., White, S., and Goswami, U. (2013). Differential
entrainment of neuroelectric delta oscillations in developmental dyslexia. PLoS
ONE 8:e76608. doi: 10.1371/journal.pone.0076608
Stefanics, G., Fosker, T., Huss, M., Mead, N., Szücs, D., and Goswami, U. (2011).
Auditory sensory deficits in developmental dyslexia: a longitudinal ERP study.
Neuroimage 57, 723–732. doi: 10.1016/j.neuroimage.2011.04.005
Stone, M. A., and Moore, B. C. J. (2003). Effect of the speed of a single-
channel dynamic range compressor on intelligibility in a competing speech task.
J. Acoust. Soc. Am. 114, 1023–1034. doi: 10.1121/1.1592160
Stuart, G. W., McAnally, K. I., McKay, A., Johnston, M., and Castles, A. (2006). A
test of the magnocellular deficit theory of dyslexia in an adult sample. Cogn.
Neuropsychol. 23, 1215–1229. doi: 10.1080/02643290600814624
Surányi, Z., Csépe, V., Richardson, U., Thomson, J. M., Honbolygó, F., and
Goswami, U. (2009). Sensitivity to rhythmic parameters in dyslexic children:
a comparison of Hungarian and English. Read. Writ. 22, 41–56. doi:
10.1007/s11145-007-9102-x
Tallal, P., and Piercy, M. (1974). Developmental aphasia: rate of auditory processing
and selective impairment of consonant perception. Neuropsychologia 12, 83–93.
doi: 10.1016/0028-3932(74)90030-X
Thomson, J., Fryer, B., Maltby, J., and Goswami, U. (2006). Auditory and motor
rhythm awareness in adults with dyslexia. J. Res. Read. 29, 334–348. doi:
10.1111/j.1467-9817.2006.00312.x
Thomson, J. M., and Goswami, U. (2008). Rhythmic processing in children with
developmental dyslexia: auditory and motor rhythms link to reading and
spelling. J. Physiol. Paris 102, 120–129. doi: 10.1016/j.jphysparis.2008.03.007
Tierney, A., and Kraus, N. (2013). The ability to tap to a beat relates to
cognitive, linguistic, and perceptual skills. Brain Lang. 124, 225–231. doi:
10.1016/j.bandl.2012.12.014
Tilsen, S., and Arvaniti, A. (2013). Speech rhythm analysis with decomposition
of the amplitude envelope: characterizing rhythmic patterns within and across
languages. J. Acoust. Soc. Am. 134, 628–639. doi: 10.1121/1.4807565
Tilsen, S., and Johnson, K. (2008). Low-frequency Fourier analysis of speech
rhythm. J. Acoust. Soc. Am. 124, EL34–EL39. doi: 10.1121/1.2947626
Treiman, R., and Zukowski, A. (1991). “Levels of phonological awareness,” in
Phonological Processes in Literacy: a Tribute to Isabelle P. Liberman, eds S. Brady
and D. Shankweiler (Hillsdale, NJ: Erlbaum), 67–83.
Turner, R. E. (2010). Statistical Models for Natural Sounds. Doctoral dissertation,
University College London. Available online at: http://www.gatsby.ucl.ac.uk/turner/Publications/turner-2010.html
Turner, R. E., and Sahani, M. (2007). “Probabilistic amplitude demodulation,”
in Proceedings of the 7th International Conference on Independent Component
Analysis and Signal Separation, 544–551. doi: 10.1007/978-3-540-74494-8_68
Turner, R. E., and Sahani, M. (2011). Demodulation as probabilistic infer-
ence. IEEE Trans. Audio Speech Lang. Process. 19, 2398–2411. doi:
10.1109/TASL.2011.2135852
Villing, R. (2010). Hearing the Moment: Measures and Models of the
Perceptual Centre. Doctoral dissertation, National University of
Ireland Maynooth. Available online at: http://eprints.nuim.ie/2284/1/Villing_2010_-_PhD_Thesis.pdf
Wechsler, D. (1981). Manual for the Wechsler Adult Intelligence Scale-Revised. New
York, NY: The Psychological Corporation.
Wechsler, D. (1999). Wechsler Abbreviated Scale of Intelligence. San Antonio, TX:
The Psychological Corporation.
Whalley, K., and Hansen, J. (2006). The role of prosodic sensitivity in children’s
reading development. J. Res. Read. 29, 288–303. doi: 10.1111/j.1467-9817.2006.00309.x
Wilkinson, G. S. (1993). Wide Range Achievement Test 3. Wilmington, DE: Wide
Range.
Witton, C., Talcott, J. B., Hansen, P. C., Richardson, A. J., Griffiths, T. D., Rees, A.,
et al. (1998). Sensitivity to dynamic auditory and visual stimuli predicts non-
word reading ability in both dyslexic and normal readers. Curr. Biol. 8, 791–797.
doi: 10.1016/S0960-9822(98)70320-3
Wood, C., and Terrell, C. (1998). Pre-school phonological ability and
subsequent literacy development. Educ. Psychol. 18, 253–274. doi:
10.1080/0144341980180301
Xu, L., Thompson, C. S., and Pfingst, B. E. (2005). Relative contributions of spectral
and temporal cues for phoneme recognition. J. Acoust. Soc. Am. 117, 3255–3267.
doi: 10.1121/1.1886405
Ziegler, J., and Goswami, U. (2005). Reading acquisition, developmental dyslexia,
and skilled reading across languages: a psycholinguistic grain size theory.
Psychol. Bull. 131, 3–29. doi: 10.1037/0033-2909.131.1.3
Zion Golumbic, E. M., Poeppel, D., and Schroeder, C. E. (2012). Temporal context
in speech processing and attentional stream selection: A behavioral and neural
perspective. Brain Lang. 122, 151–161. doi: 10.1016/j.bandl.2011.12.010
Conflict of Interest Statement: The authors declare that the research was con-
ducted in the absence of any commercial or financial relationships that could be
construed as a potential conflict of interest.
Received: 03 December 2013; accepted: 08 February 2014; published online: 24
February 2014.
Citation: Leong V and Goswami U (2014) Impaired extraction of speech rhythm
from temporal modulation patterns in speech in developmental dyslexia. Front. Hum.
Neurosci. 8:96. doi: 10.3389/fnhum.2014.00096
This article was submitted to the journal Frontiers in Human Neuroscience.
Copyright © 2014 Leong and Goswami. This is an open-access article distributed
under the terms of the Creative Commons Attribution License (CC BY). The use, distribution
or reproduction in other forums is permitted, provided the original author(s)
or licensor are credited and that the original publication in this journal is cited, in
accordance with accepted academic practice. No use, distribution or reproduction is
permitted which does not comply with these terms.
... Plus spécifiquement, le TSF propose qu'il existe dans la dyslexie un défaut de synchronisation des rythmes cérébraux delta et thêta dans le cortex auditif lors de l'écoute de signaux acoustiques. Ce fonctionnement oscillatoire atypique associé au traitement et à la perception de basses fréquences acoustiques (2 à 10 Hz) entraverait en conséquence le traitement des syllabes et de l'accent tonique qui apparaissent respectivement à une fréquence de 5Hz et de 2Hz (Goswami, 2011(Goswami, , 2019Leong & Goswami, 2014). Ceci impacterait plus généralement le développement de la phonologie sur lequel repose l'apprentissage du langage écrit. ...
... Néanmoins, notre étude n'intègre que deux mesures du traitement phonologique (conscience phonémique et dénomination rapide d'images). Il est possible que les difficultés rythmiques soient associées à l'altération d'autres habiletés phonologiques -comme le traitement syllabique ou le traitement de l'accent tonique (Goswami, 2011;Leong & Goswami, 2014 Whitall et al., 2008). Par ailleurs, les études en neuro-imagerie suggèrent que le traitement rythmique active des structures cérébrales impliquées dans la réalisation d'activités motrices comme le cervelet, les ganglions de la base, le cortex prémoteur ou encore l'aire motrice supplémentaire, et ce même en l'absence de mouvement (Chen et al., 2008;Konoike et al., 2012Konoike et al., , 2015 (Goswami, 2011;Ladányi et al., 2020;Wolff, 2002). ...
... ude de l'enveloppe temporelle de la parole et montée d'amplitude au niveau de la syllabe. (a) Exemple des variations de l'enveloppe temporelle (représentée en rouge) dans une courte phrase. (b) Exemple de la montée de l'amplitude (rise time) pour la syllabe « my ». L'enveloppe temporelle est représentée en rouge et la montée de l'amplitude en bleu.(Leong & Goswami, 2014). ...
Thesis
Alors que l’influence des habiletés langagières sur l’apprentissage du langage écrit fait consensus, plusieurs études tendent à montrer que certaines habiletés non-langagières, telles que la motricité fine et les habiletés rythmiques, jouent également un rôle dans cet apprentissage. Ce travail de thèse s’inscrit dans la continuité de ces travaux et cherche à la fois à confirmer l’influence des habiletés motrices et rythmiques sur différentes dimensions du langage écrit, et à examiner les mécanismes qui sous-tendent ces effets chez des élèves du CE2 au CM2 avec ou sans troubles des apprentissages. L’objectif final de ce travail était d’évaluer les bénéfices d’une intervention ciblant ces habiletés au travers d’un jeu vidéo sur l’apprentissage de l’écrit, dans le but d’ouvrir la voie vers de nouvelles pistes de remédiation pour les élèves présentant des difficultés de lecture et d’écriture. Afin d’apporter un éclairage sur les relations entre ces deux habiletés non-langagières et l’apprentissage de l’écrit, trois études corrélationnelles ont été menées chez des élèves de CE2 et CM1 sans difficultés d’apprentissage, et ont permis de confirmer les liens entre les habiletés motrices et rythmiques, et différentes dimensions du langage écrit (lecture et orthographe de mots, compréhension écrite et rédaction). Deux facteurs explicatifs de ces relations ont été identifiés grâce à une méthode de modélisation par équations structurales : l’automatisation de l’écriture manuscrite, qui sous-tend l’effet de la motricité à la fois sur la production écrite et sur la lecture de mots, et les fonctions exécutives, qui expliquent les effets des habiletés motrices et rythmiques sur l’ensemble des dimensions du langage écrit évaluées. Par ailleurs, nous avons cherché dans un deuxième temps à identifier les conséquences de difficultés motrices et/ou rythmiques associées à un trouble de l’apprentissage du langage écrit. 
Les résultats révèlent que dans cette population, une altération de ces habiletés non-langagières se traduit par une aggravation du déficit du traitement orthographique par rapport à un trouble isolé, ce qui conforte l’hypothèse d’une influence de ces deux habiletés sur l’apprentissage du langage écrit. Enfin, dans un troisième temps, les bénéfices d’une intervention basée sur la pratique d’un jeu vidéo ciblant les habiletés motrices ont été évalués. Bien que les résultats de l’étude ne permettent pas de démontrer clairement les bénéfices d’un tel entraînement, tant soit sur la motricité que sur l’apprentissage du langage écrit, les analyses exploratoires suggèrent que l’intervention pourrait être bénéfique pour les élèves présentant des difficultés motrices. En conclusion, l’ensemble des travaux réalisés dans cette thèse apporte un nouvel éclairage sur les liens entre les habiletés motrices et rythmiques, et l’apprentissage de l’écrit. Ce travail souligne la nécessité de tenir compte de ces deux compétences non-langagières dans l’étude de cet apprentissage.
... The neurobiological foundation for the auditory-phonological processes in reading begins in infancy with the entrainment of the temporal rhythms in speech. Phase alignment, and continuous resetting, of the brain's endogenous oscillations with the high excitability cycle of the rapidly changing acoustic spectrum of incoming speech: (1) enhances intelligibility (Ghitza and Greenberg, 2009;Ortez-Barajas et al., 2021); but also (2) entrains the discrete units of phonology recruited subsequently to enable beginning and fluent reading (Leong and Goswami, 2014). Entrainment of the delta band (1-3 Hz) consolidates the mental representations in memory for tonal prosody and syllabic stress, theta (4-8 Hz) for syllables, and low gamma (25-45) for phonemes (Goswami, 2011). ...
... Indeed, delta/theta PACs with gamma are correlated with reading ability in the auditory-phonological (Leong and Goswami, 2014) and visuospatial processing domains (Archer et al., 2020). However, while the theta-gamma PAC for trans-hemispheric communication is dominant in the auditory-phonological pathways (Goswami, 2019;Gross et al., 2013), a reverse coupling with gamma driving theta is typical of the MDS in visuospatial processing (Archer et al., 2020;Vidyasagar, 2013). ...
Article
Full-text available
A fundamental educational requirement of beginning reading is to learn, access, and rapidly process associations between novel visuospatial symbols and their phonological representations in speech. Children with difficulties in such cross-modal integration are often divided into dyslexia subtypes, based on whether their primary problem is with the written or spoken component of decoding. The present review suggests that starting in infancy, perceptions of audiovisual speech are integrated by mutual oscillatory phase-resetting between sensory cortices, and throughout development visual and auditory experiences are coupled into unified perceptions. Entirely separate subtypes are incompatible with this view. Visual or auditory deficits will invariably affect processing to some degree in both domains. It is suggested that poor auditory/visual integration may be diagnostic for both forms of dyslexia, stemming from an encoding weakness in the early cross-sensory binding of audiovisual speech. The review presents a model of dyslexia as a dysfunction of the large-scale ventral and dorsal attention networks controlling such binding. Excessive glutamatergic neuronal excitability of the attention networks by the Locus coeruleus-norepinephrine system may interfere with multisensory integration, with deleterious effects on the acquisition of reading by degrading grapheme/phoneme conversion.
... And if there is some flaw in this skill might this deficit impair linguistic competence, at least for native speakers? There is some evidence that this is indeed the case, both in speech production (Fujii and Wan, 2014) and in reading (Leong and Goswami, 2014). Perhaps rhythm is a foundational property, one that holds the key to understanding language's neural bases. ...
... For example, a growing body of recent works investigated the relationship between rhythmic skills, language, and the reading disorder (Flaugnacco et al., 2014;Boll-Avetisyan et al., 2020;Pagliarini et al., 2020;Thomson et al., 2006;Thomson & Goswami, 2008). Dyslexic readers seem to show impairments in rhythmical tasks as maintaining a regular tapping both in childhood (Leong & Goswami, 2014a; Thomson & Goswami, 2008) and adulthood (Leong & Goswami, 2014b; Thomson et al., 2006). These findings also promoted rhythmical and musical intervention (Bonacina et al., 2015;Flaugnacco et al., 2015;Overy, 2003;Thomson et al., 2013), often in a computerized form (Cancer et al., 2020). ...
Preprint
Full-text available
In this study, we validated the “ReadFree tool”: a computerized battery of 12 hierarchically organized tasks in the visual and auditory modalities, which do not imply reading. The tool has been developed to identify poor readers irrespective of their specific language background, thus, to be also suitable for Minority-Language Children (MLC).Each task's discriminant power was tested on 142 Italian-monolingual participants (8-13 years-old) that either presented a reading deficit (i.e., monolingual poor readers (mPR); N = 37) or not (i.e., monolingual good readers (mGR); N = 105). The performances at the discriminant tasks were analysed by means of a multivariate machine learning approach based on a classification and regression tree (CART) model to classify mPR versus mGR.To test the diagnostic accuracy of the ReadFree tool in MLC, we first compared reading and ReadFree tool performances in MLC (N = 68) with those in monolingual readers (mGR + mPR; N = 142). The two groups did not show any significant differences, suggesting that (i) the two samples had the same distribution of good and poor readers and (ii) the ReadFree tool can be used to test MLC without introducing any systematic bias associated with their language use experience and exposure. Secondly, the MLC’s CART classification of good and poor readers was compared to the one obtained by adopting clinical reading tests standardized on Italian monolingual children. Interestingly, the percentage of MLC evaluated as poor readers through the standardized reading tests was higher than the one produced by the ReadFree tool. This evidence supports the idea that reading tasks standardized also on MLC population are needed.
... In future studies, it would be interesting to investigate how this relates to potentially atypical temporal alignment of slow oscillatory activities (delta/theta) to the speech envelope in individuals with dyslexia ( Goswami, 2011 ;Lallier et al., 2017 ;Hämäläinen et al., 2013 ). In particular, a reduction in speech-brain phase alignment widely reported in previous studies ( Hämäläinen, Rupp, Soltész, Szücs, & Goswami, 2012 ;Leong & Goswami, 2014 ;Leong et al., 2011 ) does not necessarily result in diminishing synchrony across brain regions. Hence, additional studies are needed to probe into the interaction between local and largescale oscillatory activities during speech processing, and its association with (a)typical reading development. ...
Article
Developmental dyslexia is often accompanied by altered phonological processing of speech. Underlying neural changes have typically been characterized in terms of stimulus- and/or task-related responses within individual brain regions or their functional connectivity. Less is known about potential changes in the more global functional organization of brain networks. Here we recorded electroencephalography (EEG) in typical and dyslexic readers while they listened to (a) a random sequence of syllables and (b) a series of tri-syllabic real words. The network topology of the phase synchronization of evoked cortical oscillations was investigated in four frequency bands (delta, theta, alpha and beta) using minimum spanning tree graphs. We found that, compared to syllable tracking, word tracking triggered a shift toward a more integrated network topology in the theta band in both groups. Importantly, this change was significantly stronger in the dyslexic readers, who also showed increased reliance on a right frontal cluster of electrodes for word tracking. The current findings point towards an altered effect of word-level processing on the functional brain network organization that may be associated with less efficient phonological and reading skills in dyslexia.
... A change in the slope in the region of 7 Hz is visible on the higher frequency side of the spectrum. This rate has been identified as "sub-beats" (Goswami and Leong, 2013; Leong and Goswami, 2014b), and corresponds to the presence of time-compressed, unstressed syllables that occur in the story phrases, such as "Never before" and "Toppling head-over-heels". Indeed, the average sub-beat rate of 7.2 Hz was twice the average syllable rate of the sentences (3.6 Hz). ...
Article
Phonological difficulties characterise individuals with dyslexia across languages. Currently debated is whether these difficulties arise from atypical neural sampling of (or entrainment to) auditory information in speech at slow rates (<10 Hz, related to speech rhythm), faster rates, or neither. MEG studies with adults suggest that atypical sampling in dyslexia affects faster modulations in the neurophysiological gamma band, related to phoneme-level representation. However, dyslexic adults have had years of reduced experience in converting graphemes to phonemes, which could itself cause atypical gamma-band activity. The present study was designed to identify specific linguistic timescales at which English children with dyslexia may show atypical entrainment. Adopting a developmental focus, we hypothesised that children with dyslexia would show atypical entrainment to the prosodic and syllable-level information that is exaggerated in infant-directed speech and carried primarily by amplitude modulations <10 Hz. MEG was recorded in a naturalistic story-listening paradigm. The modulation bands related to different types of linguistic information were derived directly from the speech materials, and lagged coherence at multiple temporal rates spanning 0.9-40 Hz was computed. Group differences in lagged speech-brain coherence between children with dyslexia and control children were most marked in neurophysiological bands corresponding to stress and syllable-level information (<5 Hz in our materials), and phoneme-level information (12-40 Hz). Functional connectivity analyses showed network differences between groups in both hemispheres, with dyslexic children showing significantly reduced global network efficiency. Global network efficiency correlated with dyslexic children's oral language development and with control children's reading development.
These developmental data suggest that dyslexia is characterised by atypical neural sampling of auditory information at slower rates. They also throw new light on the nature of the gamma band temporal sampling differences reported in MEG dyslexia studies with adults.
... This asynchrony corresponds to a reduced integration of left hemispheric fine-grained and right hemispheric supra-segmental signal representations, which leads to difficulties in discriminating onsets of syllables and perceiving rhythmic structures in speech and music. These difficulties are characteristic of children with dyslexia [116,125] and are frequently associated with AD(H)D [126]. There is evidence that children and adolescents with AD(H)D demonstrate an atypical development of the N1 component with growing latency over time, whereas non-affected individuals are characterized by a declining latency [127]. ...
Article
Research has shown that dyslexia and attention deficit (hyperactivity) disorder (AD(H)D) are characterized by specific neuroanatomical and neurofunctional differences in the auditory cortex. These neurofunctional characteristics in children with ADHD, ADD and dyslexia are linked to distinct differences in music perception. Group-specific differences in the musical performance of patients with ADHD, ADD and dyslexia have not been investigated in detail so far. We investigated the musical performance and neurophysiological correlates of 21 adolescents with dyslexia, 19 with ADHD, 28 with ADD and 28 age-matched, unaffected controls using a music performance assessment scale and magnetoencephalography (MEG). Musical experts independently assessed pitch and rhythmic accuracy, intonation, improvisation skills and musical expression. Compared to dyslexic adolescents, controls as well as adolescents with ADHD and ADD performed better in rhythmic reproduction, rhythmic improvisation and musical expression. Controls were significantly better in rhythmic reproduction than adolescents with ADD and scored higher in rhythmic and pitch improvisation than adolescents with ADHD. Adolescents with ADD and controls scored better in pitch reproduction than dyslexic adolescents. In pitch improvisation, the ADD group performed better than the ADHD group, and controls scored better than dyslexic adolescents. Discriminant analysis revealed that rhythmic improvisation and musical expression discriminate the dyslexic group from controls and adolescents with ADHD and ADD. A second discriminant analysis based on MEG variables showed that absolute P1 latency asynchrony |R-L| distinguishes the control group from the disorder groups best, while P1 and N1 latencies averaged across hemispheres separate the control, ADD and ADHD groups from the dyslexic group. 
Furthermore, rhythmic improvisation was negatively correlated with auditory-evoked P1 and N1 latencies, pointing in the following direction: the earlier the P1 and N1 latencies (mean), the better the rhythmic improvisation. These findings provide novel insight into the differences between music processing and performance in adolescents with and without neurodevelopmental disorders. A better understanding of these differences may help to develop tailored preventions or therapeutic interventions.
... For example, it becomes possible to compare reading behaviour more directly with evidence from other measurement modalities (such as oscillatory brain activation data [53,54]) and to other cognitive-psychological domains (such as attention [15,55]), which typically do not have the advantage of exact duration measurements for different events of interest (for example, during covert attention). Maybe most importantly, the frequency perspective on reading offers direct links to several neurodynamic phenomena in speech perception [5,6], including the observation that dyslexic children [56,57] and adults [58] show altered cortical tracking of speech signals in the oscillatory domain. ...
Article
Across languages, the speech signal is characterized by a predominant modulation of the amplitude spectrum between about 4.3 and 5.5 Hz, reflecting the production and processing of linguistic information chunks (syllables and words) every ~200 ms. Interestingly, ~200 ms is also the typical duration of eye fixations during reading. Prompted by this observation, we demonstrate that German readers sample written text at ~5 Hz. A subsequent meta-analysis of 142 studies from 14 languages replicates this result and shows that sampling frequencies vary across languages between 3.9 Hz and 5.2 Hz. This variation systematically depends on the complexity of the writing systems (character-based versus alphabetic systems and orthographic transparency). Finally, we empirically demonstrate a positive correlation between speech spectrum and eye movement sampling in low-skilled non-native readers, with tentative evidence from post hoc analysis suggesting the same relationship in low-skilled native readers. On the basis of this convergent evidence, we propose that during reading, our brain’s linguistic processing systems imprint a preferred processing rate—that is, the rate of spoken language production and perception—onto the oculomotor system.
... An alternative to this problem lies in another type of method relying on the automatic extraction of the signal envelope modulations, or Envelope Modulation Spectrum (EMS), first proposed by [1] and since then used in several studies on speech rhythm [18,2,19,20]. EMS provides spectral analysis of the low-rate amplitude modulations. This line of studies is of paramount interest to encompass the whole picture of speech intelligibility. ...
Article
Dyslexia is a frequent developmental disorder in which reading acquisition is delayed and that is usually associated with difficulties understanding speech in noise. At the neuronal level, children with dyslexia were reported to display abnormal cortical tracking of speech (CTS) at phrasal rate. Here, we aimed to determine if abnormal tracking relates to reduced reading experience, and if it is modulated by the severity of dyslexia or the presence of acoustic noise. We included 26 school-age children with dyslexia, 26 age-matched controls and 26 reading-level matched controls. All were native French speakers. Children's brain activity was recorded with magnetoencephalography while they listened to continuous speech in noiseless and multiple noise conditions. CTS values were compared between groups, conditions and hemispheres, and also within groups, between children with mild and severe dyslexia. Syllabic CTS was significantly reduced in the right superior temporal gyrus in children with dyslexia compared with controls matched for age but not for reading level. Severe dyslexia was characterized by lower rapid automatized naming (RAN) abilities compared with mild dyslexia, and phrasal CTS lateralized to the right hemisphere in children with mild dyslexia and all control groups but not in children with severe dyslexia. Finally, an alteration in phrasal CTS was uncovered in children with dyslexia compared with age-matched controls in babble noise conditions but not in other less challenging listening conditions (non-speech noise or noiseless conditions); no such effect was seen in comparison with reading-level matched controls. Overall, our results confirmed the finding of altered neuronal basis of speech perception in noiseless and babble noise conditions in dyslexia compared with age-matched peers. 
However, the absence of alteration in comparison with reading-level matched controls demonstrates that such alterations are associated with reduced reading level, suggesting they are merely driven by reduced reading experience rather than a cause of dyslexia. Finally, our result of altered hemispheric lateralization of phrasal CTS in relation with altered RAN abilities in severe dyslexia is in line with a temporal sampling deficit of speech at phrasal rate in dyslexia.
Article
Electroencephalographic (EEG) oscillations are hypothesized to reflect cyclical variation in the excitability of neuronal ensembles [1], with particular frequency bands reflecting differing types [2-4] and spatial scales [5-7] of brain operations. Interdependence between the gamma and theta bands [5, 8] suggests an underlying structure to the EEG spectrum, and there is also evidence that ongoing activity influences sensory responses [9, 10]. However, there is no unifying theory of EEG organization and the role of the ongoing oscillatory activity in sensory processing remains controversial. This study analyzed laminar profiles of synaptic activity and action potentials, both spontaneous and stimulus-driven, in primary auditory cortex [11]. We find that (1) the EEG is hierarchically organized: delta (1-4 Hz) phase modulates theta (4-10 Hz) amplitude, and theta phase modulates gamma (30-50 Hz) amplitude; and (2) this oscillatory hierarchy controls baseline excitability and action potential generation, as well as stimulus-related responses in a neuronal ensemble. We propose that the hierarchical organization of ambient oscillatory activity allows auditory cortex to structure its temporal activity pattern so as to optimize the processing of rhythmic inputs.
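The idea that the phase of a slow band modulates the amplitude of a faster band can be illustrated with a toy calculation. The sketch below is not the analysis used in the study above; it assumes a normalised mean vector length as the coupling measure, and it builds the slow phase and the fast-band amplitude directly instead of extracting them from recordings by band-pass filtering and a Hilbert transform. The function names (`mean_vector_length`, `simulate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def mean_vector_length(phases, amplitudes):
    """Normalised mean vector length: |sum(a * exp(i*phi))| / sum(a)."""
    vector_sum = sum(a * cmath.exp(1j * p) for p, a in zip(phases, amplitudes))
    return abs(vector_sum) / sum(amplitudes)


def simulate(coupled, fs=200, seconds=10, slow_hz=2.0, depth=0.8):
    """Return (slow phase, fast-band amplitude) samples for a toy signal.

    Assumption: the fast-band amplitude is given directly; in real data it
    would be estimated from the recording rather than constructed.
    """
    n = fs * seconds
    # Phase of a 2 Hz (delta-range) oscillation sampled at fs Hz.
    phases = [2 * math.pi * slow_hz * k / fs for k in range(n)]
    if coupled:
        # Amplitude peaks at slow phase 0 and dips at slow phase pi.
        amplitudes = [1.0 + depth * math.cos(p) for p in phases]
    else:
        # Constant amplitude: no dependence on the slow phase.
        amplitudes = [1.0] * n
    return phases, amplitudes


coupled_mvl = mean_vector_length(*simulate(coupled=True))
flat_mvl = mean_vector_length(*simulate(coupled=False))
print(f"coupled: {coupled_mvl:.3f}, uncoupled: {flat_mvl:.3f}")
```

Over a whole number of slow cycles, the coupled case gives a value of depth/2 (0.4 here) and the uncoupled case gives a value near zero, so the measure separates the two toy signals. Real recordings would additionally need surrogate or shuffled data to judge whether a given value is above chance.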
Thesis
Perceptual centres, or P-centres, represent the perceptual moments of occurrence of acoustic signals - the 'beat' of a sound. P-centres underlie the perception and production of rhythm in perceptually regular speech sequences. P-centres have been modelled both in speech and non-speech (music) domains. The three aims of this thesis were to a) test out current P-centre models to determine which best accounted for the experimental data; b) identify a candidate parameter to map P-centres onto (a local approach), as opposed to the previous global models which rely upon the whole signal to determine the P-centre; and c) develop a model of P-centre location which could be applied to speech and non-speech signals. The first aim was investigated by a series of experiments testing a) whether different models could account for variation between speakers, using speech from different speakers; b) whether rendering the amplitude time plot of a speech signal affects the P-centre of the signal; and c) whether increasing the amplitude at the offset of a speech signal alters P-centres in the production and perception of speech. The second aim was carried out by a) manipulating the rise time of different speech signals to determine whether the P-centre was affected, and whether the type of speech sound ramped affected the P-centre shift; b) manipulating the rise time and decay time of a synthetic vowel to determine whether the onset alteration had more effect on the P-centre than the offset manipulation; and c) testing whether the duration of a vowel affected the P-centre if other attributes (amplitude, spectral contents) were held constant. The third aim - modelling P-centres - was based on these results. The Frequency dependent Amplitude Increase Model of P-centre location (FAIM) was developed using a modelling protocol, the APU GammaTone Filterbank and the speech from different speakers.
The P-centres of the stimuli corpus were highly predicted by attributes of the increase in amplitude within one output channel of the filterbank. When this was used to make predictions of the P-centres for all the stimuli used in the thesis, 85% of the observed variance was accounted for. The FAIM approach combines aspects of previous speech and non-speech models (Gordon 1987, Marcus 1981, Vos and Rasch 1981). P-centres were thus modelled in a non-speech-specific, local manner.
Article
Daniel Hirst has been working in the field of prosody and phonology for thirty years. In this time he has completed two doctoral theses: a Doctorat de 3e Cycle 1974 (published by Mouton in the Janua Linguarum series) and a Doctorat d'Etat 1987. He is, at present, Directeur de Recherches at the CNRS laboratory Parole et Langage at the University of Provence where he heads a research team working in the field of speech prosody.