Fine-grained pitch processing of music and speech
in congenital amusia
CNRS, UMR5292; INSERM, U1028; Lyon Neuroscience Research Center, Auditory Cognition and
Psychoacoustics Team, Lyon, F-69000, France
University College London, Institute of Cognitive Neurosciences, London WC1N 3AR, United Kingdom
International Laboratory for Brain, Music and Sound Research (BRAMS), Université de Montréal,
Montréal, Québec H3C 3J7, Canada
University College London, Institute of Cognitive Neurosciences, London WC1N 3AR, United Kingdom
Carlo Umiltà
Università di Padova, Dipartimento di Psicologia Generale, 35131 Padova, Italy
International Laboratory for Brain, Music and Sound Research (BRAMS), Université de Montréal,
Montréal, Québec H3C 3J7, Canada
(Received 9 December 2010; revised 28 September 2011; accepted 10 October 2011)
Congenital amusia is a lifelong disorder of music processing that has been ascribed to impaired
pitch perception and memory. The present study tested a large group of amusics (n = 17) and
provided evidence that their pitch deficit affects pitch processing in speech to a lesser extent:
Fine-grained pitch discrimination was better in spoken syllables than in acoustically matched tones.
Unlike amusics, control participants performed fine-grained pitch discrimination better for musical
material than for verbal material. These findings suggest that pitch extraction can be influenced by
the nature of the material (music vs speech), and that amusics’ pitch deficit is not restricted to
musical material, but extends to segmented speech events.
© 2011 Acoustical Society of America.
PACS number(s): 43.75.Cd [LD] Pages: 4089–4096
Congenital amusia can occur in the context of normal
cognitive functioning (e.g., memory, attention), normal lan-
guage processing and normal exposure to music (Peretz and
Hyde, 2003). Genetic origins have been suggested (Drayna
et al., 2001; Peretz et al., 2007), and recent data report neural
anomalies in white matter concentration, cortical thickness
and fiber tracts in the right hemisphere (Hyde et al., 2006;
Hyde et al., 2007; Loui et al., 2009), with some evidence
also for grey matter anomalies in the left hemisphere
(Mandell et al., 2007).
Congenital amusia has been thought to result from a mu-
sical pitch-processing disorder. Amusic individuals are
unable to recognize familiar tunes without lyrics and to
detect dissonances and out-of-key tones in tonal melodies.
This musical deficit is not limited to impaired encoding of
pitch in terms of musical scales, but also affects basic pitch
discrimination (e.g., Peretz et al., 2002; Foxton et al., 2004;
Hyde and Peretz, 2004). Amusic individuals have larger
thresholds for the detection of pitch changes and for pitch
direction judgments (Foxton et al., 2004), and have difficul-
ties detecting pitch changes that are smaller than two semi-
tones in repeated tone sequences (Hyde and Peretz, 2004).
In contrast to the observed pitch-processing deficits in
musical and acoustic contexts, initial reports have shown
normal speech and prosody processing (Ayotte et al., 2002).
Individuals with congenital amusia have been reported to be
unimpaired in language and prosody tasks, such as learning
and recognizing lyrics, classifying a spoken sentence as
statement or question based on final, falling or rising pitch
information as well as identifying or discriminating stressed
words in sentences (Ayotte et al., 2002; Peretz et al., 2002;
Patel et al., 2005).
Because of this spared intonation processing, Peretz and
Hyde (2003) suggested that the difference between speech
and music perception is linked to the size of relevant pitch
variations. In speech intonation of non-tonal languages, varia-
tions in fundamental frequency (F0), in particular those indi-
cating statements and questions, are typically coarse (up to
more than 12 semitones for the pitch rise of the final word in
a question; see Patel et al., 2008, Table I). In music, however,
the pitch variations are typically more fine-grained (1 or 2
semitones; see Vos and Troost, 1989). Accordingly, amusics’
a) Also at: University Lyon, Lyon, F-69000, France. Author to whom correspondence
should be addressed. Electronic mail: barbara.tillmann@olfac.
J. Acoust. Soc. Am. 130 (6), December 2011
© 2011 Acoustical Society of America 0001-4966/2011/130(6)/4089/8/$30.00
pitch deficit would affect music more than speech because
music is more demanding in pitch resolution, and congenital
amusia would represent a music-relevant deficit, not neces-
sarily a music-specific deficit.
However, amusics’ deficit may persist for large pitch
differences when these are derived from speech intonation
patterns and presented as musical analogues (respecting glid-
ing pitch changes or transformed into discrete steps). Amu-
sics were impaired when asked to discriminate imitations of
spoken intonation (without words). The deficit was limited
to the non-speech context because amusics were normal at
discriminating the same changes in the context of speech
(Ayotte et al., 2002; Patel et al., 2005). Ayotte et al. (2002)
suggested a beneficial effect of the linguistic cues that could
serve as anchoring points for poor pitch processing abilities.
Another possible account for this speech advantage is that
labeling the word with the salient pitch change might pro-
vide a strategy that decreases the memory load of the task—
a strategy that is not possible for the tone analogues. In the
latter case, the pitch memory problem experienced by amu-
sics (Gosselin et al., 2009; Tillmann et al., 2009; Williamson
et al., 2010) may explain why they fail to discriminate into-
nation patterns in musical analogues even though they con-
tain relatively large pitch changes.
The hypothesis that amusics cannot fully compensate
for their pitch deficit by using speech-based strategies is sup-
ported by recent data. Amusics showed mild deficits in proc-
essing speech for intonation (questions vs statement) in their
native language (English or French; Patel et al., 2008; Liu
et al., 2010), or for pitch contrasts in tone language words
(Mandarin or Thai; Nguyen et al., 2009; Nan et al., 2010;
Tillmann et al., 2011). However, it has also been shown that
amusics perform better with speech than with musical analogues,
especially those amusics with high pitch discrimination
thresholds (over one semitone; Tillmann et al.,
2011). The speech advantage extends to singing in amusia:
singing a song with lyrics leads to better performance than
singing with /la/ only (Dalla Bella et al., 2009). This finding
suggests that, in singing, lyrics serve as anchoring points for
poor pitch processing abilities. Thus, speech may enhance
pitch processing in amusia, even though it does not necessarily
restore normal processing.
The present study aimed to further compare amusics’
pitch processing in music and speech material in identical
conditions. In contrast to previous studies, we focused on
fine-grained pitch discrimination (i.e., down to 25-cent
changes) and used musical and speech stimuli that were
matched in their acoustic structure as closely as possible. To
that end, the stimuli were created with a synthesis program
based on a source-filter model of the voice (Bélanger et al.,
2007). The spoken syllable [ka] and the acoustically matched
musical tone, which sounded like a trumpet, were matched in
F0 (215 Hz), amplitude envelope, and duration, and differed only
by the presence or absence of formants (Fig. 1). To focus on
pitch perception, we used a basic pitch change detection task
(as in Hyde and Peretz, 2004). This task was not as demand-
ing in terms of memory load as the task used in prior studies
(e.g., Patel et al., 2005), so as to avoid confounds between
impairments in pitch discrimination and impairments of
pitch memory (Gosselin et al., 2009; Tillmann et al., 2009;
Williamson et al., 2010). Furthermore, the pitch changes
were implemented in discrete, segmented events (a tone for
the musical material, the syllable /ka/ for the verbal material),
in contrast to gliding pitch changes for which amusics have
shown smaller thresholds (Foxton et al., 2004).
If the deficit of fine-grained pitch processing typically
found in congenital amusia is specific to music, performance
should be impaired for music materials, but not for speech
materials, in comparison to control participants. If, however,
the pitch processing deficit is more general and/or transfers
to speech materials, then we should observe lower perform-
ance levels for both music and speech materials in amusics
in comparison to controls. A key aspect of our study was to
assess whether amusics’ pitch processing would be better for
speech than for music when the spoken segments were
repeated and could not serve as distinctive cues for memory.
In unimpaired (nonamusic) listeners, a music advantage was
expected. Indeed, previous studies have reported higher
(worse) pitch discrimination thresholds for speech (e.g., a
vowel embedded in a short syllable, such as heed; Smith
et al., 2005) than for non-speech material (e.g., pure tones,
complex sounds; see Moore, 2008, for a review).
Note that other recent studies have reported that musical
expertise improves pitch processing not only for music but
also for speech (Schön et al., 2004; Wong et al., 2007; Moreno
et al., 2009). However, these data cannot reveal the
material-related differences in subtle pitch processing that
we are investigating here because these studies varied pitch
height and pitch changes between music and speech in a way
that aimed to match materials for task difficulty rather than
to test for pitch discrimination thresholds.
All participants completed the Montreal Battery of Eval-
uation of Amusia (MBEA; Peretz et al., 2003), which is cur-
rently used as a standard method to identify cases of
congenital amusia (e.g., Stewart, 2008, 2011). The MBEA
involves six tests that aim to assess the various components
that are known to contribute to melody processing. The
stimuli are novel melodic sequences, played one note at a
TABLE I. Characteristics of amusic and control participants.
Average age in years
1.16 (range: 0–5)
0.94 (range: 0–3)
Average education in years
(average years of
Average MBEA score (% correct) 68 ± 5 91 ± 4
J. Acoust. Soc. Am., Vol. 130, No. 6, December 2011. Tillmann et al.: Pitch in music and speech in amusia
time on a piano. They are written in accordance with the rules
of the tonal structure of the Western idiom. The sequences
are 4 bars long, last about 4 s and contain from 8 to 19 tones
(mean: 10.7). These melodies are arranged in various tests
so as to assess abilities to discriminate pitch and rhythmic
variations, and to recognize musical sequences heard in prior
tests of the battery. Peretz et al. (2003) tested a large population
and defined a cut-off score of 78% correct responses:
participants scoring below it can be defined as amusics, and
participants scoring above it can be considered as controls.
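The cut-off rule just described can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and the handling of a score exactly equal to 78% is an assumption, since the text only specifies scores below and above the cut-off.

```python
MBEA_CUTOFF_PERCENT = 78.0  # cut-off reported for the MBEA (Peretz et al., 2003)


def classify_by_mbea(score_percent: float) -> str:
    """Label a participant from a global MBEA score (% correct).

    Assumption: a score exactly at the cut-off is treated as "control";
    the source text does not say how such a score is handled.
    """
    if not 0.0 <= score_percent <= 100.0:
        raise ValueError("MBEA score must be a percentage between 0 and 100")
    return "amusic" if score_percent < MBEA_CUTOFF_PERCENT else "control"
```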
Seventeen individuals with congenital amusia partici-
pated in the experiment (see Table I). They performed below
the cut-off score on the MBEA, and were compared to 17
matched controls who obtained normal MBEA scores (see
Monotonic, isochronous sequences of five sounds (either
tones or syllables) defined the standard sequences (i.e., the
same standard sound presented repeatedly). All sounds had a
duration of 100 ms and were presented with inter-stimulus
intervals (ISIs) of 350 ms. In sequences containing a pitch
change, the fourth sound was changed upwards or downwards
by a pitch distance of 25, 50, 100, 200, or 300 cents
(100 cents correspond to 1 semitone) in comparison
to the standard. Sound duration, ISIs and pitch changes were
defined as in Hyde and Peretz (2004), except that the F0 of
the standard was set to 215 Hz (instead of 1047 Hz). This
pitch level is more ecologically valid for speech and corresponds to
the level of adult female voices.
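To make the pitch distances concrete, the following sketch converts each distance in cents into the corresponding F0 for a 215-Hz standard, using the usual definition of the cent (1200 cents per octave). The function and constant names are illustrative assumptions, not part of the original procedure.

```python
STANDARD_F0_HZ = 215.0  # F0 of the standard sound, as stated in the text
PITCH_DISTANCES_CENTS = (25, 50, 100, 200, 300)


def shift_by_cents(f0_hz: float, cents: float) -> float:
    """Return the frequency lying `cents` above f0_hz (below if cents < 0)."""
    return f0_hz * 2.0 ** (cents / 1200.0)


if __name__ == "__main__":
    for cents in PITCH_DISTANCES_CENTS:
        up = shift_by_cents(STANDARD_F0_HZ, cents)
        down = shift_by_cents(STANDARD_F0_HZ, -cents)
        print(f"{cents:>3} cents: up {up:.2f} Hz, down {down:.2f} Hz")
```

With these definitions, the smallest change (25 cents) moves the F0 by only about 3 Hz in either direction, whereas the largest (300 cents) moves it by roughly 35 to 40 Hz.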
The sequences comprised either tones or syllables. Both
were constructed with a synthesis program based on a
source-filter model of the voice (Bélanger et al., 2007).
Tones were composed of a 20-ms noise burst followed by a
low-pass filtered impulse train (80 ms) emulating the glottal
pulse when pronouncing the syllable [ka]. Syllables were
created by passing these tones through four band-pass filters
emulating the vocal formants. This procedure simulates the
production of the same vocal sound at the level of the larynx
and of the lips, thus before and after modulation by the artic-
ulatory gesture (inducing formant transitions). As F0 is con-
trolled by the vocal folds in the larynx, the pitch information
is present in both cases. The only difference is the absence
or the presence of filtering of this excitation signal emulating
the glottal source. The sound duration was matched across
stimuli (100 ms) as well as the duration of the portion containing
pitch information (80 ms), starting right after the
noise burst. In the portion of the stimulus containing pitch
information, the F0 was constant (in both types of materials).
Note that the synthesis model controls F0 and central fre-
quencies of the formants independently. In the syllables, the
central frequencies of the four formants of the target vowel
[a] were 950, 1570, 3150, and 4370 Hz. These frequencies
vary through time from the onset for the consonant [k] until
they reach the target values. As shown in Fig. 1, the first,
third, and fourth formants are gliding upwards and the sec-
ond formant is gliding downwards.
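The source-filter idea described above can be sketched roughly as follows. This is not the synthesis program of Bélanger et al. (2007). It is a simplified illustration in which the sampling rate, the noise amplitude, the low-pass cutoff, the formant bandwidth, the parallel summation of the four filters, and the use of static formants (no glides from [k] to [a]) are all assumptions.

```python
import math
import random

SAMPLE_RATE = 44100  # Hz; assumption, not given in the text
FORMANTS_HZ = (950.0, 1570.0, 3150.0, 4370.0)  # target values for [a] from the text


def synthesize_tone(f0_hz=215.0, burst_ms=20, voiced_ms=80, cutoff_hz=2000.0, seed=0):
    """Noise burst followed by a low-pass filtered impulse train at f0_hz."""
    rng = random.Random(seed)
    n_burst = int(SAMPLE_RATE * burst_ms / 1000)
    n_voiced = int(SAMPLE_RATE * voiced_ms / 1000)
    burst = [0.1 * rng.uniform(-1.0, 1.0) for _ in range(n_burst)]

    # Impulse train: one unit impulse per glottal period.
    impulses = [0.0] * n_voiced
    period = SAMPLE_RATE / f0_hz
    k = 0
    while round(k * period) < n_voiced:
        impulses[round(k * period)] = 1.0
        k += 1

    # One-pole low-pass filter (the cutoff is an illustrative assumption).
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / SAMPLE_RATE)
    voiced, y = [], 0.0
    for x in impulses:
        y += alpha * (x - y)
        voiced.append(y)
    return burst + voiced


def resonator(signal, center_hz, bandwidth_hz=100.0):
    """Two-pole band-pass resonator; the bandwidth is an assumed value."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * center_hz / SAMPLE_RATE)
    a2 = -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def synthesize_syllable(f0_hz=215.0):
    """Tone passed through four static formant resonators, summed in parallel."""
    tone = synthesize_tone(f0_hz)
    bands = [resonator(tone, fc) for fc in FORMANTS_HZ]
    return [sum(samples) for samples in zip(*bands)]
```

In this sketch the tone and the syllable share the same excitation, so F0, duration, and the timing of the pitch-bearing portion are identical by construction. The voiced portion of the tone never goes below zero, in line with the unipolar tone waveform mentioned in the caption of Fig. 1.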
The stimuli were presented over headphones at a com-
fortable loudness level, and the experiment was run on
PSYSCOPE Software (Cohen, MacWhinney, Flatt, and Provost,
1993), which also recorded participants’ responses.
Participants were required to detect whether each
sequence contained a change or not; they were informed that
each sequence contained five sounds and that the fourth
sound could be changed in pitch (or not). Participants
responded by pressing one of two keys on the computer key-
board. Experimental trials were presented in separate blocks
for syllables and for tones. There were 4 blocks separated by
short breaks and presented in one of two orders: tones, sylla-
bles, syllables, tones, or syllables, tones, tones, syllables.
Each block contained 180 trials, with 90 trials being sequen-
ces without change and 90 trials being sequences with
FIG. 1. (Color online) Spectrograms and waveforms of tones (left) and syllables (right) for the standard (top) and the largest changes (bottom). Tones and syllables
are matched in fundamental frequency, amplitude envelope, and duration, and differ by the presence or absence of formants. Because of the synthesis model
used here (Bélanger et al., 2007), tones are unipolar signals and syllables bipolar signals (as can be seen in the waveforms), a difference that does not affect
pitch processing. Sound examples are available at http://olfac.univ-lyon1.fr/bt-sound.
changes (equally distributed across the different pitch distan-
ces). The experimental trials were presented without feed-
back. The first block of each material type was preceded by
40 training trials (with feedback). The order of trials inside
blocks was randomized for each participant. Short breaks
were allowed between the blocks, and within a block, trial
presentation was self-paced; that is, participants started the
next trial by pressing a predefined key on the computer keyboard.
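The block structure described above (180 trials, half without change, the other half spread equally over the five pitch distances, order randomized per participant) can be sketched as follows. The function name and data layout are hypothetical, and the equal split between upward and downward changes is an assumption: the text states only that changes went upwards or downwards.

```python
import random

PITCH_DISTANCES_CENTS = (25, 50, 100, 200, 300)


def build_block(n_trials=180, rng=None):
    """Return a shuffled list of trials for one block.

    Each trial is a dict with "change" (bool) and "cents" (signed pitch
    change applied to the fourth sound; 0 for no-change trials).
    """
    rng = rng or random.Random()
    n_change = n_trials // 2
    per_distance = n_change // len(PITCH_DISTANCES_CENTS)
    trials = [{"change": False, "cents": 0} for _ in range(n_trials - n_change)]
    for distance in PITCH_DISTANCES_CENTS:
        for i in range(per_distance):
            direction = 1 if i % 2 == 0 else -1  # assumption: balanced directions
            trials.append({"change": True, "cents": direction * distance})
    rng.shuffle(trials)  # order randomized for each participant
    return trials
```

With the default of 180 trials this yields 90 no-change trials and 18 change trials per pitch distance.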
Performance was analyzed by calculating proportions of
hits (number of correct responses for different trials/number
of different trials) minus false alarms (FAs; number of incorrect
responses for same trials/number of same trials). These
proportions (Fig. 2) were analyzed with a 2 × 2 × 5 analysis
of variance with group (amusics/controls) as between-participants
factor and material (verbal/musical) and pitch
distance (25/50/100/200/300) as within-participants factors. A
significant main effect of group indicated that, overall, amusics
performed at a lower level than did controls,
F(1,32) = 11.31, MSE = 705.92, p = 0.002. The main effect
of pitch distance was significant, F(4,128) = 176.62, MSE
= 176.62, p < 0.0001, and interacted significantly with group,
F(4,128) = 18.14, MSE = 176.62, p < 0.0001, indicating that
the influence of pitch distance was stronger for amusics
than for controls: in particular, the performance of the
amusic group decreased more strongly for the small pitch
changes than did the performance of the control group. The
interaction between group and material just reached significance,
F(1,32) = 4.11, MSE = 94.66, p = 0.05, indicating that
amusics’ performance was better for verbal material than for
musical material, while the reverse was observed for controls.
Most importantly, this interaction was modulated by pitch
distance, as shown by the significant three-way interaction
between group, material, and pitch distance, F(4,128) = 7.30,
MSE = 53.96, p < 0.0001. Planned contrast analyses indicated
that performance at 25-cent changes was better for
speech than for music in the amusic group, F(1,32) = 8.70,
p = 0.006, whereas the reverse was observed in the control
group, F(1,32) = 8.12, p = 0.008. At this pitch distance, the
interaction between group and material was also significant,
F(1,32) = 16.81, p = 0.0003.
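The performance measure defined at the start of this paragraph (hits minus false alarms) can be written out as a small function. The function name and the example counts are hypothetical; only the formula comes from the text.

```python
def hits_minus_fa_percent(n_hits, n_change_trials, n_false_alarms, n_same_trials):
    """Hit rate minus false-alarm rate, expressed as a percentage.

    n_hits: "change" responses on trials that contained a pitch change
    n_false_alarms: "change" responses on trials without a pitch change
    """
    if n_change_trials <= 0 or n_same_trials <= 0:
        raise ValueError("trial counts must be positive")
    hit_rate = n_hits / n_change_trials
    fa_rate = n_false_alarms / n_same_trials
    return 100.0 * (hit_rate - fa_rate)


if __name__ == "__main__":
    # Made-up listener: 9 hits out of 18 change trials at one pitch distance,
    # 9 false alarms out of 90 no-change trials.
    print(round(hits_minus_fa_percent(9, 18, 9, 90), 1))
```

Subtracting the false-alarm rate corrects for a tendency to respond "change" regardless of the stimulus: a listener who always answers "change" obtains a score of 0 rather than 100.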
For the other pitch distances, there were no significant
differences between verbal and musical material in either
group, except for the 300-cent changes, where performance
was slightly higher for music than speech in the amusic
group only, F(1,32) = 7.10, p = 0.01. However, amusics’
performance was very high for these large pitch changes (94
and 97% for the verbal and musical material, respectively),
and even better than controls for the musical material (97%
vs 93%; p = 0.02). This slightly higher performance can be
related to the dynamics of the experimental session. For
amusics, the proportion of perceived change-trials was lower
than for controls and thus may enhance the oddball effect for
the detected change-trials (hence, the trials with large pitch
distances). This observation is in agreement with an electro-
physiological study using the material of Hyde and Peretz
(2004): the N2-P3 response, which is known to be inversely
proportional to the probability of the deviant (Donchin et al.,
1978, 1981), was enhanced for large pitch changes in amu-
sics as compared to controls (Peretz et al., 2005).
The advantage for verbal material over musical material
in amusics can also be seen in the comparison of performance
level with controls. For verbal material, amusics’ performance
was below controls’ performance at the 25-cent
change, F(1,32) = 12.31, p = 0.001, and the 50-cent change,
F(1,32) = 15.55, p = 0.0004, but did not differ from controls
at the 100-cent change (and at the larger changes, p > 0.25).
In contrast, for 100-cent changes in the musical material,
amusics’ performance was still below controls’ performance,
p = 0.04 (as was also observed for 25- and 50-cent changes,
p < 0.003). The verbal material thus allowed the amusics to
reach the level of controls for relatively small pitch changes
(one semitone), that is, for smaller changes than did the musical material.
To assess whether the effect of material might relate to
the severity of amusia, the performance difference between the
two material types (at 25 cents) was examined as a function
of the MBEA scores (Fig. 3). The obtained correlation,
r(32) = −0.60, p < 0.0001, seemed to be driven by the group
extremes, as correlations vanished when computed separately
for amusics and controls [r(15) = −0.29 and r(15) = 0.09,
respectively]. This correlation pattern was also observed
with the scale, contour, interval and meter tests of the MBEA
(ps < 0.001, but p = 0.08 for the rhythm test). For the memory
test, the correlation for the amusic group alone showed a
trend, r(15) = −0.43, p < 0.09 (in addition to a significant
correlation over groups, p < 0.0001): the lower the memory
score, the stronger the advantage for the verbal material.
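For readers less familiar with the r(df) notation used above, the following sketch computes a Pearson correlation together with its degrees of freedom (n − 2), so that r(32) corresponds to the 34 participants and r(15) to the 17 participants of one group. The function name is hypothetical and no data from the study are reproduced.

```python
import math


def pearson_r(x, y):
    """Return (r, degrees_of_freedom) for two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy), n - 2
```

In the analysis described here, x would hold each participant's MBEA score and y the difference between verbal and musical performance at 25 cents; a negative r then means that lower MBEA scores go with a larger verbal advantage.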
To investigate individual pitch discrimination perform-
ance for verbal and musical materials, we calculated the cor-
relation between the scores obtained for syllables and for
tones at the 25-cent distance (Fig. 4). For all participants,
this correlation was positive, r(32) = 0.83, p < 0.0001: the
better the performance for tones, the better the performance
for syllables. These correlations were also significant when
separating amusics, r(15) = 0.79, p < 0.0001, from controls,
r(15) = 0.71, p < 0.001. A closer look at the scatter plot suggests
two subgroups of amusics, separated as a function of
FIG. 2. Performance with syllables and acoustically matched tones as a
function of the size of the fundamental frequency (F0) change (from 25 to
300 cents) for the group of congenital amusics and the matched control
group. Performance is expressed as the mean percentage of hits minus false
alarms (FA). Error bars represent standard errors.
their performance on the tone material. More specifically,
for the nine amusics who scored poorly on tones (below
20% hits minus FAs), performance did not correlate significantly
between the two material types, r(7) = 0.34, p = 0.38, while
tone and syllable scores correlated significantly for the other
eight amusics performing above 20%, r(6) = 0.82, p = 0.01.
Although these two groups differed in both their level and
range of performance for the tone material (between 5 and
19% vs 32 and 77%), this pattern suggests that (a) amusics
with better pitch discrimination capacities performed simi-
larly on tones and syllables, as did unimpaired controls, and
(b) amusics with poor pitch discrimination exhibited a larger
advantage for the verbal material over the musical material.
Because some amusics complain that music sounds
like noise, or is irritating or unpleasant (Peretz et al., 2002;
McDonald and Stewart, 2008), the observed lower performance
for musical sounds than for syllables might be due to a
similar reaction. To address this possibility, we examined
whether such a complaint was reported by the amusics tested
here. Out of the 17 amusics, three responded “yes” to “music
is like noise for me,” two responded “yes” to “I find most
music irritating,” and none responded “yes” to “music is a
very unpleasant experience for me.” These responses suggest
that a negative attitude towards musical sounds cannot
explain the speech advantage observed here in pitch process-
ing. The 25-cent performance of the three amusics who indi-
cated that “music is like noise” showed an advantage for
verbal material (33% versus 19% for the tone material)
which is similar in size to that of the other fourteen amusics
(39% for the verbal material versus 29% for the tone mate-
rial). This comparison suggests that the observed speech
advantage in pitch processing is not restricted to a minority
of amusics who reported altered timbre perception and who
might be characterized by musical dystimbria (Griffiths
et al., 2006).
The novelty of the present study was to investigate pitch
processing in verbal and musical stimuli that were matched
as closely as possible in their acoustic structure. The results
confirmed the presence of a fine-grained pitch deficit in amu-
sia and extended this disorder to the processing of speech.
However, for the amusic individuals, the speech material
was easier to process than the musical analogues, whereas
the reverse was observed for their matched controls. Acous-
tic differences between the syllable and tone materials, other
than those necessary to define verbal vs nonverbal materials
(i.e., presence vs absence of formants), cannot account for
this pattern of performance because of our procedure for
controlling material construction (see Sec. II A). Both sylla-
bles and tones were controlled for F0 changes and no addi-
tional cues or differences were introduced for syllables in
comparison to the tones; this was made possible by the independent
control of source and filter in the synthesis algorithm used,
which did not alter formants and formant trajectories for
the events with the changed F0. Moreover, the present material
was created with a synthesis program that controlled for
the acoustic features for both types of stimuli, including
acoustic variations that would be naturally present in spoken
utterances, where pitch variations correlate with intensity
changes (e.g., Peng et al., 2009).
The amusic group exhibited an interesting pattern for the
smallest pitch changes (25 cents): they performed better
for syllables than tones. Furthermore, they reached normal
performance for syllables with intermediate pitch changes
(100 cents). For the same pitch deviations embedded in tone
sequences, amusics performed below controls. These find-
ings indicate a less pronounced deficit for the processing of
pitch changes in verbal material in comparison to the proc-
essing of these changes in musical material.
The advantage conferred by speech on pitch extraction
was related to the severity of the deficit. The speech context
improved pitch processing in amusia, and in particular for
individuals with the most severe pitch impairment for musi-
cal material. This is consistent with the results obtained with
tone language material (i.e., Thai) and non-verbal analogues
thereof (that is, gliding pitch changes) in a short-term mem-
ory task (Tillmann et al., 2011). Here, we showed that the
FIG. 3. Difference scores between verbal and musical material (i.e., for 25-
cent changes) plotted as a function of participants’ global MBEA scores.
Positive values indicate better performance for verbal material and negative
values, better performance for musical material.
FIG. 4. Performance (% hit-FA) of amusic and control participants for the
syllable and the tone materials (i.e., 25-cent changes).
speech advantage can be obtained with discrete, segmented
pitches in a simple oddball task. These findings suggest that
early perceptual processes involved in pitch extraction can
be influenced by context (music and speech). This influence
can occur as early as the acoustic analysis of the input or
may result from higher-level processing.
Psychoacoustic studies investigating early processes of
pitch extraction mostly used pure tones or complex tones
(see Moore, 2008, for a review) and the rare studies using
verbal material focused on vowel formants (e.g., Lyzenga
and Horst, 1995). Our study compared for the first time the
same series of pitch changes for music and speech in the
same listeners. The findings in control participants are con-
sistent with previous reports that examined each domain sep-
arately, suggesting that pitch extraction is more precise for
musical material than for speech material in normal listeners.
For example, pitch discrimination thresholds (above 100 Hz)
are around 0.2% for complex tones (Moore, 2008) and are
10 times larger (2%) for vowels (Smith et al., 2005).
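To relate these percentage thresholds to the cent scale used for the pitch changes in the present experiment, the conversion can be sketched as below. The function name is hypothetical; the formula is the standard definition of the cent.

```python
import math


def percent_to_cents(percent: float) -> float:
    """Convert a relative frequency difference (in %) into cents."""
    return 1200.0 * math.log2(1.0 + percent / 100.0)


if __name__ == "__main__":
    for label, percent in (("complex tones", 0.2), ("vowels", 2.0)):
        print(f"{label}: {percent}% is about {percent_to_cents(percent):.1f} cents")
```

On this scale a 0.2% threshold is roughly 3.5 cents and a 2% threshold roughly 34 cents, so the smallest change used here (25 cents) lies well above the first value and somewhat below the second.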
In natural speech tokens, F0 variations are typically
larger than in musical tones (except for instruments allowing
for portamento and vibrato). However, this was not the case
for our material that was matched as closely as possible for
various acoustic features (e.g., F0, duration, intensity, amplitude
waveform). In both verbal and musical materials, F0 was
kept constant within each syllable/note. Our verbal and musical
materials, by definition and design, differed only by
the presence versus absence of formants. This creates differences
in the energy distribution of the sound spectrum,
which might influence pitch extraction. While normal listen-
ers show a cost in pitch extraction with verbal material, amu-
sic individuals benefit from the specific energy distribution
of the sound spectrum, which is created by the presence of
formants. Amusics’ impaired pitch extraction leads to the
use of this information in spoken syllables; information that
a normally functioning system does not rely on. One might
hypothesize that amusics benefit from a speech processing
mode; notably the recognition of articulatory gestures (e.g.,
opening of the jaw) induced by the varying formants of the
verbal material might activate an innate auditory-articulatory
loop, which allows a more finely tuned perception of the
characteristic parameters of the stimuli, including pitch
(Galantucci et al., 2006). The present observation leads to
the new testable prediction that congenital amusics might
process pitch better for tones sung on a syllable than for
tones produced by a musical instrument.
Alternatively, or additionally, the observed difference
between verbal and musical sounds may be related to
higher-level processing related to strategic influences, atten-
tion or memory. On this view, pitch extraction of tones in
congenital amusia might not be impaired per se, but the defi-
cit would occur at later processing stages and be caused by
material-specific top-down processes. This phenomenon is
akin to effects of top-down processing previously observed
for language and music, respectively. For example, native
language knowledge leads to phonological and stress deaf-
ness or impaired phoneme perception for non-native lan-
guage material, despite successful discrimination of isolated
features (Miyawaki et al., 1975; Dupoux et al., 2008). A
brief acoustic event such as a rapidly changing plosive
sound can elicit a Mismatch Negativity with a language-
characteristic left-hemispheric dominance when embedded
in a word, but not in a nonspeech context or a pseudoword
(Shtyrov et al., 2005). Similarly, in music, listeners’ knowl-
edge of the Western tonal system facilitates pitch processing
of a tone when it occurs in tonal sequences over atonal, ran-
dom sequences (Lynch and Eilers, 1992; Brattico et al.,
2002) and when fulfilling an important musical function
over a less-important musical function in tonal sequences
(Marmel et al., 2008). These findings suggest that top-down
influences might also affect early processing steps of pitch
extraction in congenital amusia.
In congenital amusia, our data reveal differences in fine-
grained pitch discrimination for verbal and musical sounds,
with a less-pronounced pitch deficit for verbal material.
Future research is needed to investigate the neural correlates
of these differences. More specifically, it remains to be
tested whether the musical deficit in congenital amusia
shapes pitch processing at the subcortical level (as observed
in musicians, Musacchia et al., 2007; Wong et al., 2007)
and/or the cortical level (e.g., Musacchia et al., 2008). Brain
imaging data suggest that linguistic and non-linguistic infor-
mation are preferentially processed in left and right auditory
cortices, respectively (e.g., Golestani and Zatorre, 2004;
Dehaene-Lambertz et al., 2005; Tervaniemi et al., 2006).
For congenital amusia, neural anomalies have been reported
for cortical thickness in right inferior frontal and auditory
cortices of congenital amusics (Hyde et al., 2007) and the
right arcuate fasciculus (Loui et al., 2009). Based on these
sets of findings, we postulate that the data pattern observed
in congenital amusia can be related to this anomaly in corti-
cal structures, affecting mostly right auditory-frontal connec-
tivity (see Mandell et al., 2007, for one report on left frontal
cortex anomaly) and thus leading to more severely impaired
processing of pitch in music than in speech.
Our findings showed that amusics’ pitch deficit is not restricted
to musical sounds, but also affects the processing of
verbal sounds, albeit to a lesser extent. This outcome
differs from the first reports on congenital amusia indicating
unimpaired pitch processing in language material (Ayotte
et al., 2002; Patel et al., 2005), but is in agreement with
more recent reports revealing pitch deficits also for language
materials (Patel et al., 2008; Nguyen et al., 2009; Hutchins
et al., 2010; Liu et al., 2010; Nan et al., 2010; Tillmann
et al., 2011). The impaired pitch processing in language material
suggests that a domain-general pitch deficit underlies
amusia. Amusics’ deficit for verbal material is in agreement
with previous data on normal, nonamusic listeners showing
the influence of expertise across domains. Notably, musical
training can facilitate pitch perception not only in musical
material, but also in language material (e.g., Burnham and
Brooker, 2002; Schön et al., 2004; Wong et al., 2007) and
expertise in a tonal language can facilitate pitch perception
(Pfordresher and Brown, 2009) or interfere with it (Bent
et al., 2006; Peretz et al., 2011). These data suggest a
domain-general pitch mechanism. When this mechanism is
trained with one type of material (music or tone language), this
training can also benefit the processing of the other
material. Similarly, the data on amusia suggest that the
domain-general pitch mechanism can be impaired, and that this
impairment leads to deficits not only in musical pitch, but
also in verbal pitch.
This research was supported by grants from the Cluster
11 of Rhône-Alpes (B.T.), the Italian Ministry of Education,
University and Research PRIN 2007 (C.U.), the Natural Sci-
ences and Engineering Research Council of Canada, the
Canada Institute of Health Research and a Canada Research
Ayotte, J., Peretz, I., and Hyde, K. L. (2002). “Congenital amusia: A group
study of adults afflicted with a music-specific disorder,” Brain 125,
Bélanger, O., Traube, C., and Piché, J. (2007). “Designing and controlling a
source-filter model for naturalistic and expressive singing voice syn-
thesis,” Proceedings of the International Computer Music Conference
(ICMC’07), Copenhagen, Denmark.
Bent, T., Bradlow, A. R., and Wright, B. A. (2006). “The influence of lin-
guistic experience on the cognitive processing of pitch in speech and non-
speech sounds,” J. Exp. Psych.: HPP 32, 97–103.
Brattico, E., Näätänen, R., and Tervaniemi, M. (2002). “Context effects on
pitch perception in musicians and nonmusicians: Evidence from
event-related-potential recordings,” Music Percep. 19, 199–222.
Burnham, D., and Brooker, R. (2002). “Absolute pitch and lexical tones:
Tone perception by non-musician, musician, and absolute pitch non-tonal
language speakers,” Proceedings of the 7th International Conference on
Spoken Language Processing, Denver, USA, pp. 257–260.
Cohen, J., MacWhinney, B., Flatt, M., and Provost, J. (1993). “PsyScope:
An interactive graphic system for designing and controlling experiments
in the psychology laboratory using Macintosh computers,” Behav. Res.
Methods Instrum. Comput. 25, 257–271.
Dalla Bella, S., Giguère, J-F., and Peretz, I. (2009). “Singing in congenital
amusia,” J. Acoust. Soc. Am. 126, 414–424.
Dehaene-Lambertz, G., Pallier, C., Serniclaes, W., Sprenger-Charolles, L.,
Jobert, A., and Dehaene, S. (2005). “Neural correlates of switching from
auditory to speech perception,” NeuroImage 24, 21–33.
Donchin, E. (1981). “Surprise!… Surprise?,” Psychophys. 18, 493–513.
Donchin, E., Ritter, W., and McCallum, C. (1978). “Cognitive psychophysi-
ology: The endogenous components of the ERP,” in Brain Event-Related
Potentials in Man, edited by E. Callaway, P. Tueting, and S. Koslow (Aca-
demic Press, New York), pp. 349–441.
Drayna, D., Manichaikul, A., de Lange, M., Snieder, H., and Spector, T.
(2001). “Genetic correlates of musical pitch recognition in humans,” Sci-
ence 291, 1969–1972.
Dupoux, E., Sebastian-Galles, N., Navarrete, E., and Peperkamp, S.
(2008). “Persistent stress ‘deafness’: The case of French learners of Span-
ish,” Cognition 106, 682–706.
Foxton, J. M., Dean, J. L., Gee, R., Peretz, I., and Griffiths, T. D. (2004).
“Characterization of deficits in pitch perception underlying ‘tone deaf-
ness’,” Brain 127, 801–810.
Galantucci, B., Fowler, C. A., and Turvey, M. T. (2006). “The motor theory
of speech perception reviewed,” Psych. Bull. Rev. 13, 361–377.
Golestani, N., and Zatorre, R. J. (2004). “Learning new sounds of speech:
reallocation of neural substrates,” NeuroImage 21, 494–506.
Gosselin, N., Jolicoeur, P., and Peretz, I. (2009). “Impaired memory for
pitch in congenital amusia,” Ann. N. Y. Acad. Sci. 1169, 270–272.
Griffiths, T. D., Warren, J. D., and Jennings, A. R. (2006). “Dystimbria: A
distinct musical syndrome?,” Proceedings of the Ninth International
Conference for Musical Perception and Cognition, Bologna, Italy.
Hyde, K. L., and Peretz, I. (2004). “Brains that are out of tune but in time,”
Psych. Sci. 15, 356–360.
Hyde, K. L., Zatorre, R. J., Griffiths, T. D., and Peretz, I. (2006).
“Morphometry of the amusic brain: A two-site study,” Brain 129,
Hyde, K. L., Lerch, J. P., Zatorre, R. J., Griffiths, T. D., Evans, A. C., and
Peretz, I. (2007). “Cortical thickness in congenital amusia: When less is
better than more,” J. Neurosci. 27, 13028–13032.
Hutchins, S., Gosselin, N., and Peretz, I. (2010). “Identification of changes
along a continuum of speech intonation is impaired in congenital amusia,”
Front. Psychol. 1, 236, doi: 10.3389/fpsyg.2010.00236.
Liu, F., Patel, A. D., Fourcin, A., and Stewart, L. (2010). “Intonation proc-
essing in congenital amusia: Discrimination, identification, and imitation,”
Brain 133, 1682–1693.
Loui, P., Alsop, D., and Schlaug, G. (2009). “Tone deafness: A new discon-
nection syndrome?,” J. Neurosci. 29, 10215–10220.
Lynch, M. P., and Eilers, R. E. (1992). “A study of perceptual development
for musical tuning,” Percep. Psychophys. 52, 599–608.
Lyzenga, J., and Horst, J. W. (1995). “Frequency discrimination of band-
limited harmonic complexes related to vowel formants,” J. Acoust. Soc.
Am. 98, 1943–1955.
Mandell, J., Schultz, K., and Schlaug, G. (2007). “Congenital amusia: An
auditory-motor feedback disorder?,” Restor. Neuro. Neurosci. 25, 323–334.
Marmel, F., Tillmann, B., and Dowling, W. J. (2008). “Tonal expectations
influence pitch perception,” Percep. Psychophys. 70, 841–852.
McDonald, C., and Stewart, L. (2008). “Uses and functions of music in con-
genital amusia,” Music Percep. 25, 345–355.
Miyawaki, K., Strange, W., Verbrugge, R., Liberman, A. M., Jenkins, J. J.,
and Fujimura, O. (1975). “Effect of linguistic experience—discrimination
of [r] and [l] by native speakers of Japanese and English,” Percep. Psycho-
phys. 18, 331–340.
Moore, B. C. J. (2008). “Basic auditory processes involved in the analysis of
speech sounds,” Philos. Trans. R. Soc. London, Ser. B 363, 947–963.
Moreno, S., Marques, C., Santos, A., Santos, M., Castro, S. L., and Besson,
M. (2009). “Musical training influences linguistic abilities in 8-year-old
children: More evidence for brain plasticity,” Cereb. Cortex 19, 712–723.
Musacchia, G., Sams, M., Skoe, E., and Kraus, N. (2007). “Musicians have
enhanced subcortical auditory and audiovisual processing of speech and
music,” Proc. Natl. Acad. Sci. U.S.A. 104, 15894–15898.
Musacchia, G., Strait, D., and Kraus, N. (2008). “Relationships between
behavior, brainstem and cortical encoding of seen and heard speech in
musicians and non-musicians,” Hear. Res. 241, 34–42.
Nan, Y., Sun, Y., and Peretz, I. (2010). “Congenital amusia in speakers of a
tonal language: Association with lexical tone agnosia,” Brain 133,
Nguyen, S., Tillmann, B., Gosselin, N., and Peretz, I. (2009). “Tonal lan-
guage processing in congenital amusia,” Ann. N.Y. Acad. Sci. 1169,
Patel, A. D., Foxton, J. M., and Griffiths, T. D. (2005). “Musically tone-deaf
individuals have difficulty discriminating intonation contours extracted
from speech,” Brain Cogn. 59, 310–313.
Patel, A. D., Wong, M., Foxton, J., Lochy, A., and Peretz, I. (2008). “Speech
intonation perception deficits in musical tone deafness (congenital
amusia),” Music Percep. 25, 357–368.
Peng, S-C., Lu, N., and Chatterjee, M. (2009). “Effects of cooperating and
conflicting cues on speech intonation recognition by cochlear implant
users and normal hearing listeners,” Audiol. Neurootol. 14, 327–337.
Peretz, I., Ayotte, J., Zatorre, R. J., Mehler, J., Ahad, P., Penhune, V. B., and
Jutras, B. (2002). “Congenital amusia: A disorder of fine-grained pitch dis-
crimination,” Neuron 33, 185–191.
Peretz, I., and Hyde, K. L. (2003). “What is specific to music processing?
Insights from congenital amusia,” Trends Cogn. Sci. 7, 362–367.
Peretz, I., Champod, S., and Hyde, K. L. (2003). “Varieties of musical disor-
ders: The Montreal battery of evaluation of amusia,” Ann. N.Y. Acad. Sci.
Peretz, I., Brattico, E., and Tervaniemi, M. (2005). “Abnormal electrical
brain responses to pitch in congenital amusia,” Ann. Neurol. 58, 478–482.
Peretz, I., Cummings, S., and Dubé, M. P. (2007). “The genetics of congenital
amusia (tone deafness): A family-aggregation study,” Am. J. Hum.
Genetics 81, 582–588.
Peretz, I., Gosselin, N., Tillmann, B., Cuddy, L. L., Trimmer, C., Paquette,
S., and Bouchard, B. (2008). “Online identification of congenital amusia,”
Music Percep. 25, 331–343.
Peretz, I., Nguyen, S., and Cummings, S. (2011). “Tone language fluency
impairs pitch discrimination,” Front. Psychol. 2, 145, doi: 10.3389/
Pfordresher, P. Q., and Brown, S. (2009). “Enhanced production and
perception of musical pitch in tone language speakers,” Attent. Percep.
Psychophys. 71, 1385–1398.
Schön, D., Magne, C., and Besson, M. (2004). “The music of speech: music
training facilitates pitch processing in both music and language,” Psycho-
phys. 41, 341–349.
Shtyrov, Y., Pihko, E., and Pulvermüller, F. (2005). “Determinants of
dominance: Is language laterality explained by physical or linguistic features of
speech?,” NeuroImage 27, 37–47.
Smith, D. R. R., Patterson, R. D., and Turner, R. (2005). “The processing
and perception of size information in speech sounds,” J. Acoust. Soc. Am.
Stewart, L. (2008). “Fractionating the musical mind: Insights from congeni-
tal amusia,” Current Opin. Neurobiol. 18, 127–130.
Stewart, L. (2011). “Characterizing congenital amusia,” Q. J. Exp. Psych.
Tervaniemi, M., Szameitat, A. J., Kruck, S., Schröger, E., Alter, K., De
Baene, W., and Friederici, A. D. (2006). “From air oscillations to
music and speech: Functional magnetic resonance imaging evidence
for fine-tuned neural networks in audition,” J. Neurosci. 26, 8647–
Tillmann, B., Schulze, K., and Foxton, J. (2009). “Congenital amusia: A
short-term memory deficit for nonverbal, but not verbal sounds,” Brain
Cogn. 71, 259–264.
Tillmann, B., Burnham, D., Nguyen, S., Grimault, N., Gosselin, N., and Peretz, I.
(2011). “Congenital amusia (or tone-deafness) interferes with pitch processing
Vos, P. G., and Troost, J. M. (1989). “Ascending and descending melodic
intervals: Statistical findings and their perceptual relevance,” Music Per-
cep. 6, 383–396.
Williamson, V. J., McDonald, C., Deutsch, D., Griffiths, T. D., and Stewart,
L. (2010). “Faster decline of pitch memory over time in congenital
amusia,” Adv. Cog. Psych. 26, 15–22.
Wong, P. C. M., Skoe, E., Russo, N. M., Dees, T., and Kraus, N. (2007).
“Musical experience shapes human brainstem encoding of linguistic pitch
patterns,” Nat. Neurosci. 10, 420–422.