PRINTED FROM OXFORD HANDBOOKS ONLINE (www.oxfordhandbooks.com). (c) Oxford University Press, 2015. All Rights
Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in
Oxford Handbooks Online for personal use (for details see Privacy Policy).
Subscriber: McMaster University; date: 29 July 2016
Auditory and Musical Development
Laurel J. Trainor and Chao He
The Oxford Handbook of Developmental Psychology, Vol. 1: Body and Mind
Edited by Philip David Zelazo
Abstract and Keywords
The development of auditory perception is examined in relation to (1) identity and
location of objects (auditory scene analysis) and (2) musical structure and meaning.
Behavioral and brain research converges to indicate that some capacity to process the
frequency, pitch, intensity, timbre, location, and timing of sounds is present very early in
development, although there is a protracted experience-driven period of plasticity, with
adult levels of maturity typically not reached until well into childhood. Young infants are
also able to process aspects of musical structure. At the same time, enculturation to the
specific melodic, harmonic, and rhythmic structure of the musical system of a person’s
culture depends on the considerable exposure to that musical system experienced by all
members of the culture, and intensive musical training affects the speed and degree of
that enculturation.
Keywords: auditory perception, music, pitch, rhythm, auditory scene analysis, sound localization, temporal
resolution, melody, harmony, meter
Key Concepts:
1. Auditory information informs us about (1) objects and their locations (auditory
scene analysis), (2) musical structure and meaning, and (3) linguistic structure and
meaning.
2. Development of the auditory system is influenced by the particular experiences of
each individual.
3. Auditory development for music and language proceeds according to processes of
perceptual narrowing: Young infants are initially capable of a wide range of
discriminations, which subsequently become refined through experience with a
particular musical system or language. Specifically, perception improves for
discriminations that matter in the particular musical system or language, and
becomes worse for discriminations that do not matter.
Print Publication Date: Mar 2013
Online Publication Date: Dec 2013
Subject: Psychology, Developmental Psychology
DOI: 10.1093/oxfordhb/9780199958450.013.0011
Oxford Handbooks Online
4. Although very young infants are already able to process differences in sound
frequency, pitch, timbre, intensity, and location, these abilities continue to improve
well into the childhood years. Furthermore, EEG measures indicate that mature
processing in auditory cortex is not achieved until the late teenage years.
5. Very young infants are sensitive to sensory consonance, octave equivalence, and
relative pitch (or transpositional invariance) as well as to different rhythmic
(metrical) structures. They also process unequal-interval better than equal-interval
musical scales.
6. Through exposure to a particular musical system, infants acquire sensitivity to its
scale structure (key membership), harmonic structure, and rhythmic (metrical)
structure.
7. Formal musical training in infancy and childhood has profound effects on
brain development that go beyond those effects that are seen with mere exposure to
music.
Introduction
The auditory system is fundamental to human communication through speech and music,
and good auditory processing is critical for many aspects of human development. The
human auditory system extracts three basic types of information from sound: (1) identity
and location of objects, (2) musical structure and meaning, and (3) linguistic structure
and meaning. Although object perception is based on contributions from all of the senses,
the vast majority of research on this topic focuses on vision. In this chapter we examine
the important contribution of the auditory sense to object perception and how it develops.
In contrast to object processing, the communicative functions of speech and music are
based primarily in the auditory modality, although other senses contribute as well. The
development of speech perception is reviewed by Werker and Gervain (in this handbook),
so here we focus on the development of basic auditory processes that enable object
perception and on the development of musical communication.
It is not possible in one chapter to detail all aspects of auditory development. We will give
an overview of what is known about the developmental time course for processing various
basic sound features, sounding objects, and musical structure in the context of general
developmental principles. Development involves neuroplasticity, or structural and
functional changes in the brain that enable the emergence of new processing capabilities
and behaviors (see the chapter by Maurer & Lewis in this handbook). Many factors
contribute to these changes. Neurons proliferate and migrate to their end locations under
genetic guidance (see Markant & Thomas in this handbook). Genes also guide waves of
myelination, synaptic proliferation, and pruning. The presence and amount of different
neurotransmitters vary across age (Murphy, Beston, Boley, & Jones, 2005). Through
these changes, neural processing becomes faster and more efficient, and internal noise is
decreased. While genetic programs may constrain the general processes and times over
which these changes occur, the details are largely determined by experiential factors.
Synaptic connections receiving concurrent input are strengthened, whereas nonuseful
connections are eliminated. Thus, the specific auditory input an infant receives has a
large influence on how the neurons connect and function, which in the end determines
what language the child understands and the musical features to which he or she is
sensitive.
One basic principle of auditory development is that of perceptual narrowing. The
neuronal connections in auditory cortex are initially somewhat random, making the infant
inefficient at sound processing. With experience, the infant develops representations for
important sounds in the environment and develops more efficient neural networks for
processing and distinguishing details of these sounds, and in the process the brain
becomes specialized for these sounds. Thus, with increasing age, learning a new
language or a new musical system can become more difficult. However, substantial
plasticity is maintained at certain levels of processing, such that specific training in
adults can lead to changes seen at the cortical level in terms of altered responses as
assessed by functional magnetic resonance imaging (fMRI), electroencephalography
(EEG), and magnetoencephalography (MEG). Nonetheless, experience appears to have a
greater influence on plasticity at certain developmental stages, termed sensitive periods,
although for most basic sound features, we know rather little about sensitive periods.
The cochlea in the inner ear is structurally and functionally adultlike by birth and the
auditory system is functioning from around the sixth prenatal month (Werner, 2007). Yet,
as outlined below, the auditory system does not reach maturity until well into the teenage
years. Thus, from birth, the major changes in auditory abilities stem from changes in
neural processing. Furthermore, development of the auditory system cannot be
completely understood in isolation. Auditory input converges with other sensory input
(e.g., somatosensory) as early as the cochlear nucleus (e.g., Shore, Koehler, Oldakowski,
Hughes, & Syed, 2008), so the maturation of auditory processing depends on the
maturation of other sensory systems as well. Moreover, the auditory system is not
simply a feed-forward system where information is processed linearly from one stage to
the next. Rather, there are at least as many efferent as afferent connections. Indeed,
characteristics of the basilar membrane itself can be changed through feedback from the
brainstem (e.g., see Musiek, Weihing, & Oxholm, 2007). In addition, processes such as
identifying objects by the sounds they make and recognizing musical melodies involve
memory and attention, so the auditory system cannot function without reciprocal
connections between auditory cortex and many other cortical areas.
Consequently, trajectories of improvement in auditory abilities cannot be understood
without reference to these interactions.
Finally, as with all aspects of developmental research, studying auditory development
presents a number of methodological challenges. Challenges for behavioral research
include the fact that preverbal infants cannot be given explicit instructions, that children
of different ages may interpret verbal instructions differently, that the behavioral
repertoire of responses that can be measured changes with age, and that improvements
in other capacities such as attention and memory will influence performance on various
tasks. Physiological measures also have challenges. The no-movement requirements of
fMRI and MEG make them difficult to use with young children, and the morphology (what
components are present) of EEG recordings changes with age, making comparisons
across age difficult. Nonetheless, we have learned a tremendous amount about auditory
development over the past few decades. In the following sections, some important
findings from this research effort are summarized. Where possible, methodological
limitations are discussed, the mechanisms of change are considered, and the extent to
which specific experience affects outcome is evaluated.
Development of Basic Auditory Perception
Isolated sounds are generally described as having four main perceptual features:
loudness, which is related to sound intensity; pitch, which is related to sound frequency;
duration, which is related to sound length; and timbre, which describes sound quality and
is related to several sound features, most notably the speed of sound onset and the
distribution of energy across frequency. Sounds are also perceived as coming from
particular locations in space. In addition to these basic sound features, important
information is contained in patterns of sound, both in terms of sound sequences, such as
melodies or sentences, and simultaneous sounds, as in a musical chord. One of the main
tasks of the auditory system is to identify over time what objects are present and where
they are located. Indeed, in the real world, sounds rarely occur in isolation, so
identifying the features of a sound (loudness, pitch, duration, timbre, location)
requires the auditory system to separate incoming complex sound signals into parts
that each represent a sounding object in the environment. This process is called
auditory scene analysis. Research on auditory
development has tended to focus on the processing of individual features. In the following
sections, we summarize the main findings of this research and then consider the
development of auditory scene analysis.
A number of methods have been used to study auditory abilities in infants and young
children who cannot understand verbal instructions (see Saffran, Werker, & Werner,
2006; Werner, 2007). Most methods fall into one of two broad categories, behavioral and
EEG. In behavioral methods, a motor response of some kind is measured from infants.
For example, in the conditioned head turn method, one auditory stimulus or category of
stimulus is presented continuously from a speaker to the infant’s side, and the infant is
rewarded with dancing toys for turning his or her head to occasional changes in the
stimulus or stimulus category. Infants can also be familiarized with a particular auditory
stimulus, after which the precision of their encoding of that stimulus is tested in a head turn
preference task. In this task, trials of the familiarized auditory stimulus and the stimulus
to be discriminated are presented. In each case, the stimulus continues to sound while
the infant looks at a light and/or toy and is turned off when the infant looks away. Longer
listening times to one or the other of the stimuli indicate discrimination. Infants younger than
about 5 months of age do not have good motor control of head movements, but head
turns can be replaced with eye movement responses. With young infants, nonnutritive
sucking can be used: increases or decreases in sucking rate can be paired with one of two
auditory stimuli. Finally, observer-based procedures are sometimes used, in which a
trained experimenter judges, on the basis of infant movement responses, whether or not
a sound was presented on a particular trial.
In EEG methods, electrical potentials generated by the depolarization of neurons in the
brain are measured across time at the surface of the head as sounds are presented (e.g.,
Luck, 2005). The EEG can be analyzed in the frequency domain by examining, for
example, the relative power in different frequency bands such as delta (0–4 Hz), theta (4–
8 Hz), alpha (8–12 Hz), beta (12–30 Hz), and gamma (>30 Hz) bands. Alternatively, event-
related potentials (ERPs), representing the brain’s response to a presented sound event,
can be derived from EEG recordings. To create an electrical field large enough to
measure, a group of neurons whose axons point in the same direction must fire
synchronously. Even so, on individual trials, there is sufficient “noise” from brain activity
unrelated to the processing of the sound that the ERP is difficult to see. Therefore,
typically many trials are presented and the EEG measured on each trial is
averaged over these trials. Because brain activity unrelated to the processing of the
sound is not time-locked to the onset of the sound, it will tend to average to zero as the
number of trials in the average is increased, and the resulting ERP will largely show the
brain’s processing of the sound. Components of the ERP track the stages of processing of
a sound through the auditory system, with early components representing subcortical
nuclei, later components representing primary and secondary auditory cortex, and still
later components representing attentional and decision-making stages. One very useful
component for infant research is the mismatch negativity (MMN), which is seen in
response to occasional changes in an ongoing sequence of sounds (e.g., Näätänen,
Paavilainen, Rinne, & Alho, 2007; Picton, Alain, Otten, Ritter, & Achim, 2000; Trainor &
Zatorre, 2009). For example, if a tone of one pitch is repeated over and over, occasional
replacement of that tone with a tone of a different pitch will generate an MMN response
if the two pitches are discriminable.
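The logic of trial averaging described above can be illustrated with a minimal simulation. The sketch below is not drawn from any of the studies cited here; the waveform shape, noise level, number of samples, and number of trials are all assumed values chosen purely for illustration. Each simulated trial contains the same time-locked response plus independent random noise, and averaging across trials shrinks the noise while leaving the time-locked response intact.

```python
import math
import random

def true_response(i, n_samples):
    """Hypothetical evoked response: one sinusoidal cycle with peak amplitude 1."""
    return math.sin(2 * math.pi * i / n_samples)

def simulate_trial(n_samples, rng, noise_sd=5.0):
    """One simulated epoch: the time-locked response plus much larger random noise."""
    return [true_response(i, n_samples) + rng.gauss(0.0, noise_sd)
            for i in range(n_samples)]

def average_trials(trials):
    """Point-by-point average across trials, giving the ERP estimate."""
    n_trials = len(trials)
    return [sum(samples) / n_trials for samples in zip(*trials)]

def rms_error(waveform):
    """Root-mean-square deviation of a waveform from the true response."""
    n_samples = len(waveform)
    squared = [(waveform[i] - true_response(i, n_samples)) ** 2
               for i in range(n_samples)]
    return math.sqrt(sum(squared) / n_samples)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n_samples = 200          # assumed number of time points per epoch
trials = [simulate_trial(n_samples, rng) for _ in range(400)]

err_single = rms_error(trials[0])              # error when looking at one trial
err_avg = rms_error(average_trials(trials))    # error after averaging 400 trials
print(f"single trial error: {err_single:.2f}, averaged error: {err_avg:.2f}")
```

With independent noise, the residual error is expected to shrink roughly in proportion to the square root of the number of trials, which is why many trials are needed before the response becomes visible.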
Each method used with young children has strengths and weaknesses, so the gold
standard should be to use a variety of methods. When they converge on a common
answer to a question, we can be most confident that the results accurately reflect infants’
abilities.
Thresholds for Hearing and Intensity Discrimination
Perhaps the most basic question in auditory development is that of absolute thresholds:
how intense a sound needs to be in order to be detected. The general answer to this
question is that fully adult levels are probably not achieved until about 10 years of age,
although there are large improvements in absolute thresholds during infancy. However,
the question is complicated by both methodological and interpretational issues, and
thresholds depend on the frequency of the sound. The fetus will move to externally
presented sound by 28 weeks gestation, but the accurate measurement of absolute
thresholds is extremely difficult in utero (see Lecanuet, 1996). After birth, observing
spontaneous responses to sound presentations, Weir (1976, 1979) concluded that
neonatal thresholds are up to 70 dB higher than those of adults
between 250 and 2,000 Hz. Studies using an observer-based procedure, in which a
trained experimenter determines on the basis of observing an infant whether or not a
sound was presented on a given trial, have found infant/adult differences at 1 to 2 months
of about 45 dB at 500 Hz and about 35 dB at 4,000 Hz (Trehub, Schneider, Thorpe, &
Judge, 1991; Werner & Gillenwater, 1990). It is not clear whether this represents a large
improvement over the first couple of months in sensitivity or whether it is a result of
different methods of measurement. By 3 months, infant thresholds improve further,
particularly at higher frequencies (by 20 dB at 4,000 Hz and 10 dB at 500 Hz; Olsho,
Koch, Carter, Halpin, & Spetner, 1988). The trend for earlier maturation of absolute
thresholds for high than for low frequencies continues in childhood, with adult levels
achieved by age 5 at 4,000 and 10,000 Hz, but not until age 10 at 1,000 Hz (Trehub,
Schneider, Morrongiello, & Thorpe, 1988).
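Because the decibel scale is logarithmic, threshold differences of this size correspond to very large physical differences. As a rough worked illustration (the function names are ours and do not come from the cited studies), the standard decibel definitions convert a level difference into a ratio of sound intensities or of sound pressures:

```python
def intensity_ratio(db_difference):
    """Ratio of sound intensities (power) for a given level difference in dB."""
    return 10 ** (db_difference / 10)

def pressure_ratio(db_difference):
    """Ratio of sound pressures (amplitude) for a given level difference in dB."""
    return 10 ** (db_difference / 20)

# Infant/adult threshold differences mentioned in the text, as physical ratios
examples = [
    ("neonates (up to)", 70),
    ("1-2 months at 500 Hz", 45),
    ("1-2 months at 4,000 Hz", 35),
]
for label, db in examples:
    print(f"{label}: {db} dB -> intensity x{intensity_ratio(db):,.0f}, "
          f"pressure x{pressure_ratio(db):,.0f}")
```

On this scale, a 70 dB elevation means that a sound must carry about ten million times more intensity to be detected.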
Of more importance for determining what an object is, where it is located, and whether it
is getting closer to you is the ability to discriminate sounds of different intensity.
Although there are few studies, 7- to 9-month-olds appear to be considerably worse than
adults, with thresholds of 6 dB for detecting intensity differences between 1,000-Hz tones
compared to adult thresholds of 2 dB (Sinnott & Aslin, 1985), and thresholds of 9 dB for
detecting differences in broadband noises compared to adult thresholds of 3 dB (Werner,
2007). By 4 years of age, intensity discrimination appears to be quite good, although
improvements are seen until 10 to 12 years of age for both discrimination of sounds with
different intensities (Maxon & Hochberg, 1982) and masked thresholds or the ability to
detect a change in the intensity of a continuous sound (Schneider, Trehub, Morrongiello,
& Thorpe, 1989).
There are likely several factors that contribute to the developmental trajectories for
absolute and difference thresholds (Saffran et al., 2006; Werner, 2007). While the inner
ear, containing the cochlea where vibrations are translated into neural firings, is
essentially adultlike at birth (see Werner, 2007, for a review), the smaller ear canal of the
infant is better at conducting high frequencies compared to the adult ear. It has also been
documented that there are large improvements (about 20 dB) in the efficiency with which
the middle ear conducts sound into the inner ear between birth and adulthood (Keefe,
Bulen, Arehart, & Burns, 1993; Keefe & Levi, 1996; Okabe, Tanaka, Hamada, Miura, &
Funai, 1988). This improvement in efficiency is largest during infancy and largest for high
frequencies. It likely makes a large contribution to absolute threshold improvements with
age, and to the earlier maturation of absolute thresholds for high than for low
frequencies. Indeed, this has been confirmed with studies measuring auditory brainstem
responses (ABR) using electrophysiological recordings (Sininger & Abdala, 1996;
Sininger, Abdala, & Cone-Wesson, 1997) and studies using otoacoustic emissions (OAE)
that measure inner ear function (Werner & Holmer, 2002).
Middle ear efficiency cannot account for age-related changes in intensity discrimination
because the two sounds of different intensity that are to be compared will both go
through the child’s middle ear and would be affected in the same way. Two types of
immaturity likely contribute to poor intensity discrimination: inefficient neural processing
in the auditory pathway and immature attentional processing. Myelination of the
subcortical auditory pathways is largely complete by 6 to 12 months (Moore, Perazzo, &
Braun, 1995), and the increase in processing speed enabled by this process is reflected in
decreased latency of the ABR and middle latency responses of the auditory evoked
potential (see Moore & Linthicum, 2007). This development likely accounts in part for the
rapid development of intensity discrimination in infancy. Indeed, behavioral thresholds in
infants for intensity are correlated with ABR latencies (Werner, Folsom, & Mancl,
1993a,b).
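The argument that middle ear efficiency cannot explain immature intensity discrimination can be written out explicitly. As a simplifying assumption that is ours rather than the cited authors', suppose that at a given frequency the immature middle ear attenuates every sound by a fixed amount $a$ (in dB) relative to the adult ear, and that the inner ear requires a level $T$ to respond. A sound of level $L$ is then detected only if its level after attenuation reaches $T$, whereas the difference between two sounds of levels $L_1$ and $L_2$ is unchanged:

```latex
\begin{align*}
L - a \ge T \;&\Longleftrightarrow\; L \ge T + a
  && \text{(absolute threshold raised by } a \text{)} \\
(L_1 - a) - (L_2 - a) &= L_1 - L_2
  && \text{(level difference unaffected)}
\end{align*}
```

Under this assumption, an attenuation of 20 dB raises the absolute threshold by 20 dB but leaves a 6 dB difference between two tones at 6 dB, which is why the explanation for elevated difference thresholds is sought in neural and attentional immaturity instead.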
In general, thresholds obtained through behavioral methods are influenced by the infant’s
or child’s ability to attend to the stimulus. Modeling studies have shown, however, that
such inattention cannot account for most of the difference between children and adults
(Schneider & Trehub, 1992; Viemeister & Schlauch, 1992; Werner, 1992; Wightman &
Allen, 1992). Infants are nonetheless immature in another respect. Adults have lower
thresholds if they know the frequency of the sound to listen for, whereas infants appear
to be unable to engage in selective listening (Bargones & Werner, 1994). This difference
likely reflects immaturity in auditory cortex and beyond, immaturities that remain until at
least 12 years of age (Moore & Linthicum, 2007; Ponton, Eggermont, Kwong, & Don,
2000; Shahin, Roberts, & Trainor, 2004).
In sum, thresholds for hearing and intensity difference thresholds are substantially
elevated in early infancy. There is rapid improvement over the first year after birth. Adult
levels are reached earlier for sounds of high than of low frequency, with overall adult
levels not obtained until about 10 years of age. This developmental profile is consistent
with maturational timetables for conductive efficiency in the middle ear and for the
development of subcortical and cortical pathways. To date there have been no studies of
the influence of experience on intensity perception, so the role of the
environment remains unknown.
Frequency Resolution and Frequency Discrimination
The vast majority of sounds in the natural world have complex vibration patterns made up
of many frequencies. The basilar membrane in the inner ear acts as a sort of Fourier
analyzer, separating the incoming signal into its frequency components. This is
accomplished through variation in the stiffness of the membrane along its length, such
that low frequencies move the membrane maximally at one end and high frequencies at
the other. Inner hair cells along the length of the basilar membrane move when the
membrane moves. This mechanical motion is transduced into electrical signals in
auditory nerve fibers. Thus the inner ear contains a tonotopic frequency map. This
frequency organization is referred to as the “place code” and is maintained in the
auditory nerve, through subcortical nuclei, and into primary auditory cortex. There is also
a “temporal code” for frequency. Because auditory nerve fibers fire when the basilar
membrane is maximally displaced, the rate of firing over a population of adjacent nerve
fibers is directly related to the frequency of the incoming sound signal.
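The description of the basilar membrane as a sort of Fourier analyzer can be made concrete with a small numerical sketch. The code below is an analogy only: it applies a discrete Fourier transform to a synthetic complex tone and is not a model of cochlear mechanics. The component frequencies, amplitudes, sampling rate, and detection criterion are assumed values, chosen so that each component falls exactly on an analysis bin.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Normalized magnitude of each frequency bin up to half the sampling rate."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        magnitudes.append(abs(acc) / n)
    return magnitudes

sample_rate = 4000                           # samples per second (assumed)
n = 400                                      # 0.1 s of signal, so bins are 10 Hz wide
components = {200: 1.0, 400: 0.5, 600: 0.25}  # frequency in Hz -> amplitude (assumed)

# Complex tone: the sum of three sinusoids, as a single waveform
samples = [sum(amp * math.sin(2 * math.pi * freq * t / sample_rate)
               for freq, amp in components.items())
           for t in range(n)]

magnitudes = dft_magnitudes(samples)
bin_width = sample_rate / n
# Keep only bins carrying appreciable energy (criterion of 0.05 is arbitrary)
detected = [round(k * bin_width) for k, m in enumerate(magnitudes) if m > 0.05]
print(detected)  # [200, 400, 600]
```

The single summed waveform is separated back into its three components, each mapped to a distinct position along the frequency axis, which is the sense in which the place code is tonotopic.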
“Frequency resolution” refers to the ability of the place code to discriminate between
different frequencies. This is generally measured in masking studies. The ability of a
person to detect a target tone of a particular frequency (or a narrow band of noise
centered at a particular frequency) is tested in the presence of a masking tone (or noise).
The general finding in adults is that detection of the target tone is affected by the
presence of the masker only when it falls within a critical band (about a third of an
octave for most of the frequency range) of the target tone.
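The critical band finding amounts to a simple rule about distance on a logarithmic frequency scale. The sketch below assumes, as a simplification, a rectangular band centered on the target, so that a masker interferes when it lies within half the bandwidth on either side; real auditory filters have sloping rather than sharp edges. The function names and the bandwidth value passed in are illustrative assumptions.

```python
import math

def octave_distance(f1_hz, f2_hz):
    """Separation between two frequencies in octaves (a doubling equals 1.0)."""
    return abs(math.log2(f1_hz / f2_hz))

def masker_interferes(target_hz, masker_hz, band_octaves):
    """True if the masker lies inside a band of the given width centered on the target."""
    return octave_distance(target_hz, masker_hz) <= band_octaves / 2

# Illustrative bandwidth only (an assumption); substitute the value
# appropriate to the frequency range of interest.
BAND_OCTAVES = 0.3

print(masker_interferes(1000, 1050, BAND_OCTAVES))  # nearby masker
print(masker_interferes(1000, 1500, BAND_OCTAVES))  # distant masker
```

Expressing the band in octaves rather than in Hz reflects the fact that the same proportional separation applies across most of the frequency range.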
Behavioral studies indicate that frequency resolution (i.e., the size of the critical
bandwidth) is mature for low frequencies at birth and for high frequencies by 6 months of
age. Studies of cochlear function suggest that the cochlea is mature at birth, and ABR
studies suggest that the early limitations in high-frequency resolution before 6 months
are due to immature processing of high frequencies in the brainstem (Abdala & Folsom,
1995a,b; Folsom & Wynne, 1987), as discussed earlier in the section “Thresholds for
Hearing and Intensity Discrimination.”
Although frequency resolution appears mature at 6 months, the ability to
discriminate two frequencies continues to improve until about 10 years of age (Jensen &
Neff, 1993; Maxon & Hochberg, 1982; Thompson, Cranford, & Hoyer, 1999). This is likely
because the place code is too coarse to account for adults’ ability to detect fine
differences in pitch, which relies on the temporal mechanism. Discrimination for high
frequencies matures earlier than for low frequencies, perhaps reflecting the fact that low-
frequency discrimination depends to a greater extent on the temporal mechanism. The
temporal mechanism might be expected to mature later as it depends on precise
temporal firing patterns. Despite the long developmental trajectory, it should be noted
that thresholds in infancy are still sufficiently good to support musical perception. For
example, at 1,000 Hz, 6-month-olds can detect a 1.5% to 3.0% change in frequency under
conditions where adults detect a 1.0% change.
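To see why thresholds of this size are adequate for music, it helps to express them in musical units. The sketch below converts a percentage change in frequency into cents (hundredths of an equal-tempered semitone). The comparison with the semitone is our illustration rather than a claim made in the studies cited, and the function name is hypothetical.

```python
import math

def percent_change_to_cents(percent):
    """Size of a frequency change in cents, where 100 cents is one equal-tempered semitone."""
    return 1200 * math.log2(1 + percent / 100)

# Frequency change corresponding to one equal-tempered semitone, in percent
SEMITONE_PERCENT = (2 ** (1 / 12) - 1) * 100

for label, percent in [("adult", 1.0), ("infant, best", 1.5), ("infant, worst", 3.0)]:
    print(f"{label}: {percent}% = {percent_change_to_cents(percent):.1f} cents")
print(f"one semitone = {SEMITONE_PERCENT:.2f}%")
```

A 3.0% change is roughly half a semitone, so even the upper end of the infant range is smaller than the smallest step between adjacent notes in Western scales.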
The protracted development of frequency discrimination probably reflects the protracted
development of auditory cortex. Studies of human auditory cortex from autopsy cases
indicate that in early infancy, subcortical input appears to go only to layer I and does not
contain frequency-specific information (Moore & Linthicum, 2007). Neurofilament is not
yet expressed in the other layers, indicating that mature, fast connections between
neurons are not yet present. Between 6 months and 5 years of age, neural connections
proliferate in the deeper cortical layers (IV, V, VI, and lower III) and myelination is
gradually completed. This development supports frequency-specific input to layer IV and
processing of that information within cortical columns in auditory cortex. Only after age 5
does this anatomical maturation begin in the upper layers (II and upper III) and adult
levels of maturation are not reached until age 12. The maturation of auditory evoked
potentials follows this anatomical maturation, with the emergence of the N1 response
around 5 years of age and an increase in its amplitude until about age 12 (Ponton et al.,
2000; Shahin et al., 2004). The upper layers contain the main connections to other
cortical areas, so it is likely not until they begin to mature that top-down attentional
processes can be fully brought to bear on auditory discrimination tasks.
Animal studies indicate that auditory experience is critical for the development of
tonotopic organization in the auditory neural pathways. Cortical plasticity is evident even
in adult animals. For example, lesioning an area of the cochlea of an adult guinea pig
leads to a reorganization of tonotopic maps in auditory cortex, such that areas that
formerly represented the lesioned frequencies come to respond to adjacent frequencies
instead (Robertson & Irvine, 1989). Indeed, simple training of frequency discrimination at
a particular frequency in owl monkeys leads to an expansion of the representation of that
frequency in cortical tonotopic maps (Recanzone, Schreiner, & Merzenich, 1993).
Recently, EEG studies in humans have also shown larger responses to frequencies trained
in the laboratory than untrained frequencies, suggesting that the discrimination training
led to more neurons dedicated to representing the trained frequencies (e.g., Bosnyak,
Eaton, & Roberts, 2004). Of most interest from a developmental perspective are studies
indicating that animals exposed only to pulsed white noise (which contains all frequencies
in the absence of any pattern) during the critical period for cortical tonotopic
organization develop very abnormal cortical tonotopic maps where neurons respond
broadly to many frequencies (Zhang, Bao, & Merzenich, 2002).
It is of course not ethical to conduct controlled deprivation studies in humans. However,
a recent study indicates that extensive active participation in music classes between 6
and 12 months of age results in more mature auditory cortical responses to sound
compared to an equal amount of passive exposure to music (hearing music in the
background) at infant classes (Trainor, Marie, Gerry, Whiskin, & Unrau, 2012).
In sum, frequency resolution reflecting the place mechanism reaches adult levels by 6
months, whereas more fine-grained frequency discrimination takes many years to reach
adult levels. The early maturation of the place mechanism reflects the early maturation of
the cochlea. The protracted development of frequency discrimination reflects the
protracted development of auditory cortex. Animal studies indicate that experience with
tonal sounds appears to be crucial for the development of normal tonotopic maps and
normal frequency discrimination. Recent research suggests that extensive exposure to
music in humans may lead to enhanced frequency representations.
Pitch and Timbre
Sounds containing only one frequency component are very rare in the natural
environment. Sounds perceived to have pitch typically contain energy at a fundamental
frequency (which corresponds to the perceived pitch) and at integer multiples
(harmonics) of that frequency. For example, a sound perceived to have a pitch of
440 Hz (concert A) would also contain energy at 880, 1,320, 1,760, and 2,200 Hz, etc.
Although the basilar membrane separates at least the lower (resolvable) harmonics into
different frequency channels, adults do not perceive a separate sound for each harmonic.
Rather, the harmonics are fused into a single percept at a later stage of processing.
Primary auditory cortex contains frequency (tonotopic) maps but does not contain pitch
representations, even at a neural population level (Fishman, Reser, Arezzo, &
Steinschneider, 1998). Pitch appears to be first represented in a region adjacent to
primary auditory cortex in marmoset monkeys (Bendor & Wang, 2005). Studies using
fMRI (Patterson, Uppenkamp, Johnsrude, & Griffiths, 2002; Penagos, Melcher, &
Oxenham, 2004; Schneider et al., 2005) and analyses of lesion cases (Schönwiesner &
Zatorre, 2008) confirm generalization of this finding to humans.
That the brain integrates the harmonics, and does not simply use the fundamental
frequency when determining the pitch, is clear from a phenomenon known as the pitch of
the missing fundamental. If the fundamental frequency of a tone is removed, the pitch is
not affected (although the timbre does change). Indeed, the repetition period of the
complex sound wave does not change. This phenomenon enables study of the
development of pitch perception. In a series of studies, Clarkson and colleagues
(Clarkson & Clifton, 1985, 1995; Clarkson & Rogers, 1995; Montgomery & Clarkson,
1997) showed that 7-month-old infants hear the pitch of the missing fundamental.
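The claim that the repetition period is unchanged when the fundamental is removed can be checked with a small sketch. It assumes component frequencies given as whole numbers of hertz and uses their greatest common divisor as a stand-in for the repetition rate of the summed waveform; the function names are illustrative only, and the sketch is not a model of how the auditory system derives pitch.

```python
from functools import reduce
from math import gcd


def harmonic_series(f0_hz, n_harmonics):
    """Return the first n harmonics (integer multiples) of f0_hz."""
    return [f0_hz * k for k in range(1, n_harmonics + 1)]


def repetition_rate_hz(freqs_hz):
    """Greatest common divisor of integer component frequencies.

    A sum of sinusoids at these frequencies repeats once every
    1 / gcd seconds, so the gcd is the rate of the common period.
    """
    return reduce(gcd, freqs_hz)


full_tone = harmonic_series(440, 5)       # concert A with 5 harmonics
missing_fundamental = full_tone[1:]       # same tone without 440 Hz

print(full_tone)                                # [440, 880, 1320, 1760, 2200]
print(repetition_rate_hz(full_tone))            # 440
print(repetition_rate_hz(missing_fundamental))  # 440
```

Both versions of the tone share a 440-Hz repetition rate, which is consistent with the pitch being unaffected by removal of the fundamental.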
However, given the immaturity of auditory cortex, especially prior to 6 months of age, as
discussed above, and given the fact that auditory cortex appears to be necessary for pitch
processing, it might be expected that very young infants cannot integrate harmonics into
a single sound with pitch. He and Trainor (2009) tested this using EEG responses. They
presented sequences of trials where each trial consisted of a rising pair of tones, both
with fundamentals and several harmonics present. Each harmonic rose in pitch from the
first to second tone, but the starting pitch and amount of pitch rise varied from trial to
trial. Occasionally, the harmonics in the second tone lined up so as to produce a missing
fundamental that was lower than the pitch of the first tone. If infants could hear the pitch
of the missing fundamental, they would show a brain response indicating violation of
expectation for a rising pitch. However, if they could not integrate the harmonics, each
harmonic still rose from the first to the second tone, so there would be no violation of
expectation. They found that adults, 7-month-olds, and 4-month-olds heard the pitch of
the missing fundamental, but no evidence that 3-month-olds were able to do so. In sum, it
appears that the ability to integrate harmonics into a percept of pitch emerges between 3
and 4 months of age as cortex matures. It remains unknown whether specific
experience affects the emergence of this ability.
The perception of pitch is intimately tied to the perception of timbre or sound quality.
Timbre is defined negatively as the perceptual difference between sounds with the same
pitch, duration, and loudness that nonetheless sound different. Examples include musical
sounds, such as a violin versus a flute; one human voice versus another; and one vowel
sound versus another. In each case, even when equated for pitch, duration, and loudness,
a difference in sound quality remains. The perception of timbre is multidimensional, but
the main physical correlates are the frequency spectrum (the relative amounts of energy
at each frequency present in the sound) and sound onset characteristics (e.g., a violin has
a slow onset whereas a piano has a fast onset). Because of its multidimensional nature,
timbre is difficult to study, and there are few studies of its development. It appears that
infants can discriminate between sounds with different spectral shapes (i.e., different
relative intensities of different frequency regions) (Clarkson, Clifton, & Perris, 1988;
Trehub, Endman, & Thorpe, 1990), but their resolution compared to adults has not been
tested. There is some evidence in children that discrimination of spectral shape
differences does not reach adult levels until 9 years of age (Allen & Wightman, 1992).
Few studies have examined the role of experience in timbre perception. However, Tsang
and Trainor (2002) showed that infants are better at spectral shape discrimination for
sounds with spectral shapes that are typical of speech and music compared to those that
are not. Voices and musical instruments have negative spectral slopes (i.e., intensity fall-
off with increasing frequency) between −4 and −12 dB/octave. Tsang and Trainor (2002)
found that infants were better able to discriminate tones with spectral slopes in this
region compared to sounds with positive or highly negative spectral slopes. A question for
future research is whether this sensitivity for spectral slopes that are relevant in the
human environment is innate or the result of experience with human voices. However,
one recent study indicates that a small amount of experience with particular timbres can
modify cortical EEG responses (Trainor, Lee, & Bosnyak, 2011). Infants who
listened for 20 minutes a day for a week to children’s songs, played in either guitar
timbre or marimba timbre, showed larger responses to tones and pitch changes in the
trained timbre.
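To make the notion of a spectral slope in dB/octave concrete, the following sketch computes the level of each harmonic relative to the fundamental for a given slope. It assumes a constant slope across the whole spectrum, which is a simplification, and the slope values chosen are illustrative rather than the stimuli used by Tsang and Trainor (2002).

```python
import math


def harmonic_levels_db(n_harmonics, slope_db_per_octave):
    """Level of each harmonic relative to the fundamental (0 dB).

    Harmonic k lies log2(k) octaves above the fundamental, so with a
    constant slope its level is slope * log2(k).
    """
    return [slope_db_per_octave * math.log2(k)
            for k in range(1, n_harmonics + 1)]


# Illustrative slopes: two inside the -4 to -12 dB/octave range, one positive.
for slope in (-4.0, -12.0, 3.0):
    levels = harmonic_levels_db(4, slope)
    print(slope, [round(level, 1) for level in levels])
```

With a slope of −12 dB/octave, for instance, the second harmonic is 12 dB below the fundamental and the fourth harmonic is 24 dB below it.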
Temporal Resolution
The adult auditory system is capable of resolving timing differences of a few milliseconds
and also of integrating information in time windows of 200 to 300 ms. The former ability
is very useful, for example, in speech perception, where differences of a few milliseconds
can change perception from one speech sound category (phoneme) to another. Almost all
developmental work in this area has been directed at temporal resolution. Although there
is some variation depending on methodology, the consensus is that when other factors
are controlled, temporal resolution is relatively mature early on. At the same time, many
factors affect performance on temporal resolution tasks, and adult levels on many tasks
are not reached until well into childhood.
One of the most common ways to measure temporal resolution is gap detection, where
the smallest silent interval between two sound markers that can be detected is
determined. Performance on gap detection tasks is affected by a number of factors (see
Phillips, 1999). Performance is worse for longer sound markers, probably because the
first marker creates forward masking of the silence and the second marker creates
backward masking. Performance is also worse for cross-channel gap detection, a
situation in which the two markers are in different frequency regions and therefore
processed in different spatial frequency channels (see the section “Frequency Resolution
and Frequency Discrimination” above). Performance can also be adversely affected by
band-limited noise markers because such stimuli naturally contain amplitude fluctuations
that can be confused with the gap. Using behavioral methods and short 500-Hz tone pips
(with Gaussian on and off ramps), Trehub, Schneider, and Henderson (1995) found that 6-
month-old infants’ gap thresholds were about 12 ms, in comparison to adult thresholds of
6 ms. Using objective EEG measures and stimuli similar to those of Trehub and colleagues
(1995), Trainor and colleagues (2003) measured event-related EEG responses to
occasional presentations of gap stimuli in sequences of 2,000-Hz no-gap stimuli matched
in duration and intensity. They found that both infants and adults exhibited reliable
responses to gaps as small as 4 ms. This result indicates that temporal resolution is quite
mature by 6 months of age.
In contrast, using continuous noise markers, Werner, Marean, Halpin, Spetner, and
Gillenwater (1992) found that infant gap detection thresholds (50 ms) were much worse
than those of adults (10 ms). This suggests that infants are particularly disadvantaged by
the noise markers, whether by increased masking or by amplitude fluctuations in the
markers. Indeed, several studies indicate that even much older children are very
susceptible to backward masking (Buss, Hall, Grose, & Dev, 1999; Hartley, Wright,
Hogan & Moore, 2000; Rosen, van der Lely, Adlard, & Manganari, 2000). For example,
Hartley and colleagues (2000) found that compared to adults a 1,000-Hz tone had to be
34 dB more intense at 6 years and 20 dB more intense at 10 years in order to be detected
in the presence of backward masking.
Infants are also particularly affected by cross-channel gap detection. In contrast to
thresholds close to adult levels by 6 months when the sound markers are short and have
the same frequency content (Trainor, McFadden, et al., 2003; Trehub et al., 1995), when
the sound markers must be processed in different frequency channels, infant thresholds
are 30 to 40 ms under conditions where adult thresholds are 10 to 20 ms (Smith, Trainor,
& Shore, 2006). This suggests that infants have particular problems comparing timing
across different channels. However, this ability is probably very important for speech
perception, where small temporal silences often occur between voiced segments of
different frequencies (Phillips, 1999).
Another approach to measuring temporal resolution that, like gap detection with short
tone pips, gets around confounds of masking is to measure the temporal modulation
transfer function (Viemeister, 1979). Sounds are presented with and without Gaussian
amplitude modulation (i.e., periodic fluctuations in loudness) and listeners indicate the
presence or absence of the amplitude modulation. The rate of modulation is varied,
typically from about 4 Hz to beyond 60 Hz. The depth (size of the modulation in dB) that
can be detected is determined at each modulation rate. Typically for adults, the threshold
(size of modulation that can just be detected) is similar between 4 Hz and 50 to 60 Hz,
after which much larger modulation is needed. This indicates that adults can perceive
modulations of up to about 50 to 60 Hz, which corresponds to a modulation period of about 17
to 20 ms. Although 4- to 7-year-old children need larger modulations in general
(indicating poorer intensity discrimination; 9- to 10-year-olds are adultlike), they
also show consistent thresholds until about 50 to 60 Hz. Thus, temporal resolution itself
appears to be adultlike in children as young as 4 years. One study in infants using
temporal modulation transfer functions suggests that by this measure, infants are also
quite mature in temporal resolution (Levi & Werner, 1996).
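The correspondence between modulation rate and time scale stated above is simply the reciprocal relation between frequency and period. A minimal sketch, with an illustrative function name:

```python
def modulation_period_ms(rate_hz):
    """Duration of one modulation cycle, in milliseconds."""
    return 1000.0 / rate_hz


for rate_hz in (4, 50, 60):
    print(rate_hz, "Hz ->", round(modulation_period_ms(rate_hz), 1), "ms")
# 4 Hz -> 250.0 ms
# 50 Hz -> 20.0 ms
# 60 Hz -> 16.7 ms
```

The 50- to 60-Hz cutoff therefore corresponds to cycles lasting roughly 17 to 20 ms, and the 4-Hz lower end of the tested range to cycles of 250 ms.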
In sum, temporal resolution appears to mature quite early. However, situations involving
intensity comparison, the presence of masking, and the need to compare timing across
frequency channels can all lead to poor performance on temporal resolution tasks well
into childhood. The effects of experience during childhood on these factors remain
largely unknown, but one study in adults shows that gap detection thresholds improve
substantially with training (Smith, Trainor, Gray, Plantinga, & Shore, 2008).
Sound Localization
The ability to localize sounds in space is very useful, in that knowing the location of a
predator or speeding car aids survival. Locating the source of sounds is also helpful for
detecting and identifying objects when more than one is sounding at the same time. The
main cues for localizing sounds in the horizontal plane involve comparing the intensity
level (interaural level difference [ILD]) or timing difference (interaural timing difference
[ITD]) between the ears. Sounds to the right of midline will be louder and arrive earlier at
the right than left ear, and vice versa for sounds to the left of midline. Cues to location in
the vertical plane primarily involve changes in the frequency spectrum as sounds from
different elevations hit the pinna (outer ear) at different angles, causing differential
filtering of different frequencies. Sound localization is not as good in the vertical plane as
in the horizontal plane, and it relies to a greater extent on familiarity with the sounds to
be located because this information is needed to determine the extent of frequency
distortions caused by the pinna.
Most studies of sound localization in infants have measured infants’ ability to make a
head turn to the location of a sound. Muir and Field (1979) first showed that newborns
will turn their head to the right or left to localize broadband sounds. Interestingly, head
turning to sound location follows a U-shaped function. Newborn head turning is very slow
and imprecise (Muir, Clifton, & Clarkson, 1989). Around 12 weeks, the response
disappears entirely, and when it returns at around 16 weeks, it is faster and more
accurate and is accompanied by visual search for the object (Muir et al., 1989). Clifton
(1992) proposed that the early head turn response is reflexive and driven by subcortical
structures; that as cortex matures, it inhibits subcortical processing but is not yet able to
perform sound localization; and that by 4 months, sound localization abilities return as
cortex takes over this function. However, despite the lack of head turn responses at 3
months, event-related EEG responses to a change in sound location can be seen at this
age (Sonnadara & Trainor, 2005). This suggests that infants are still able to localize
sounds during the time in which they do not make head turn responses, but that cortical
sensorimotor integration between location and head turns has not yet been achieved.
Sound localization has a fairly protracted development. In the horizontal plane, the
minimum audible angle (the smallest difference in sound location that can be detected,
measured in degrees) is about 27 degrees at 1 month and reaches 5 degrees at 18
months and adult levels of 1 to 2 degrees at 5 years (Ashmead, Clifton, & Perris, 1987;
Clifton, Morrongiello, Kulig, & Dowd, 1981; Morrongiello, 1988; Morrongiello, Fenwick,
& Chance, 1990; Morrongiello, Fenwick, Hillier, & Chance, 1994; Morrongiello & Rocca,
1987a, 1990). In the vertical plane, high-frequency sounds are easier to localize than low-
frequency sounds as they are more affected by the pinna. Few studies have examined the
development of localization in the vertical plane; however, for an 8- to 12-kHz noise band,
the minimum audible angle is about 16 degrees at 6 months and improves to 4 degrees at
18 months, which is comparable to that of adults (Morrongiello & Rocca, 1987b,c).
Sounds reflect off surfaces such as walls and furniture, resulting in multiple copies
actually reaching the ear. Infants learn to ignore these reflections in a process known as
the precedence effect (Clifton, Morrongiello, & Dowd, 1984). However, even though
sound localization in nonreverberant spaces appears mature by age 5 years, children
perform more poorly in reverberant spaces containing more reflections (Litovsky, 1997).
It remains unknown when sound localization reaches adult levels in all
environments, but in general children perform more poorly on many tasks in the presence
of background noise (Werner & Marean, 1996).
In part, development of accurate sound localization is protracted because, with
increasing age, the head becomes bigger, the ears become further apart, and interaural
cues become larger. Not only do representations for ILD and ITD need to be
recalibrated as the head grows, but they become more reliable as well (Clifton, Gwiazda,
Bauer, Clarkson, & Held, 1988). However, this increased reliability is not enough to
account entirely for the poor performance of infants (Ashmead, Davis, Whalen, & Odom,
1991). Furthermore, infants are much better at discriminating ITDs between 16 and 28
weeks of age than would be predicted by their localization performance (Ashmead et al.,
1991). ITD and ILD cues are known to be processed in subcortical nuclei, specifically the
medial superior olive and the lateral superior olive, respectively (King, Parsons, & Moore,
2000). However, in adults at least, maps of space are found in auditory cortex, and these
are necessary for localization of sounds in space (e.g., Al’tman, 1983; Clarke et al., 2002;
Cornelisse & Kelly, 1987; Efron, Crandall, Koss, Divenyi & Yund, 1983; Zatorre, Bouffard,
Ahad, & Belin, 2002). Thus, it appears that much of the sound localization limitations
early in development are probably due to immaturities in cortical maps of space rather
than to processing of localization cues.
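The dependence of interaural timing cues on head size can be illustrated with Woodworth's spherical-head approximation, ITD = (r / c)(θ + sin θ), where r is the head radius, c the speed of sound, and θ the azimuth in radians (valid for 0 to 90 degrees). This formula is not taken from the studies cited here, and the infant head radius below is a hypothetical value chosen only for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # approximate value for air at room temperature

INFANT_HEAD_RADIUS_M = 0.055   # hypothetical value, for illustration only
ADULT_HEAD_RADIUS_M = 0.0875   # value often used for an adult spherical head


def itd_microseconds(azimuth_deg, head_radius_m):
    """Woodworth's spherical-head estimate of the interaural time difference.

    azimuth_deg is the source angle from the midline (0 to 90 degrees).
    """
    theta = math.radians(azimuth_deg)
    seconds = head_radius_m / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))
    return seconds * 1e6


for azimuth in (5, 30, 90):
    infant = itd_microseconds(azimuth, INFANT_HEAD_RADIUS_M)
    adult = itd_microseconds(azimuth, ADULT_HEAD_RADIUS_M)
    print(azimuth, round(infant), round(adult))
```

Under these assumptions the ITD at every azimuth scales in proportion to head radius, so a given ITD signals a different location in a small head than in a large one. This is one way to see why the mapping from cue to location must be recalibrated as the head grows.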
Because the cues to sound location change as the head grows, auditory maps of space
must remain plastic for an extended period. Interestingly, Sonnadara and Trainor (2005)
found that event-related brain responses to changes in location remain immature past 8
months of age although responses to changes in pitch and timing take on an adult
morphology between 4 and 6 months of age. There is evidence from animal studies for a
sensitive period for the development of auditory maps of space (e.g., Binns, Withington, &
Keating, 1995; Gray, 1992; Knudsen, 1988; Moore, 1983). The same appears to be true
for humans, who, if deprived of binaural hearing during development, do not develop
spatial hearing as adults even if binaural hearing is restored (Wilmington, Gray, &
Jahrsdorfer, 1994). Furthermore, auditory maps of space must converge with visual maps
of space as our perception of objects involves the integration of information from
different senses into a unified perception of an object and its location. Typically, visual
information about location is dominant, but auditory cues to location can override visual cues
if the visual cues are sufficiently degraded (Alais & Burr, 2004; Battaglia, Jacobs, & Aslin,
2003). To date, few studies have addressed the role of experience in human development
of sound localization abilities, and the extent and timing of a sensitive period remain
largely unknown.
Auditory Scene Analysis
Natural auditory environments are typically complex, containing multiple objects that
emit sounds that change over time and overlap with each other. For example, there may
be several people talking, cars driving by, music playing in the background, a baby
crying, and a washing machine churning. The sound waves produced by these objects
simply sum as they travel through the air and bounce off various objects, such that what
impinges on the ear is a complex waveform in which the information about separate
objects is jumbled together. Unlike in the visual system, where the two-dimensional
shapes of objects and their relative distances from each other are mapped in some
fashion onto the receptors in the retina, the auditory periphery encodes spectral
(frequency) information along the length of the basilar membrane and temporal information
through firing patterns in the auditory nerve. Objects and their locations are not given
directly in this mapping as each object likely contains many frequency components that
may overlap and change over time. Extracting information about what objects are present
and where they are requires considerable complex processing, which is likely the reason
why the subcortical auditory pathway is so much more extensive than the visual
subcortical pathway.
The process of determining the number, identity, and location of objects is referred to as
“auditory scene analysis,” and it depends on the basic abilities described in the previous
sections of frequency and pitch processing, intensity discrimination, temporal resolution,
and sound localization (Bregman, 1990). There are two basic complementary processes in
auditory scene analysis, one being integration of components that belong to the same
object and the other being the segregation of components that belong to different
objects. Both of these processes apply to both sequences of sounds and simultaneous
sounds. For example, the successive notes of a melody or the successive phonemes of a
person talking need to be integrated and perceived as coming from one object, and they
need to be segregated (or streamed) from other sounds such as another melody or the
phonemes coming from a different person. Similarly, the simultaneous harmonics of a
sound with pitch (e.g., a musical tone or a vowel) need to be integrated into a percept of
a single object, and they need to be separated from other harmonics that may belong to a
different auditory object that is sounding at the same time.
Almost all of the small amount of developmental work on auditory scene analysis has
focused on streaming, or sequential integration and segregation. Whether the
elements of a sequence of sounds are heard as being in one stream (emanating from one
object) or two streams (emanating from two objects) depends on a number of factors, the
most important being frequency or pitch differences between elements, rate of
presentation, and timbre differences between elements. Specifically, if some elements are
high in pitch while others are low, the high elements will tend to be heard as one stream
and the low elements as another. If the sequence is played more rapidly, it is more likely
that the high and low tones will segregate. This is consistent with objects in the real
world, which typically do not jump about rapidly in pitch. Similarly, if some elements
have one timbre and other elements have another timbre, the elements of similar timbre
will tend to integrate into streams and segregate from the elements with different timbre,
and this becomes more likely the faster the presentation rate. Again this is consistent
with objects in the natural world, which tend not to jump back and forth rapidly between
timbres.
Developmentally, one of the most interesting questions is whether the heuristic rules,
described above, as to when elements will integrate and when they will segregate are
learned through experience with sounds in the world or whether they are innate.
Bregman (1990) suggested that, although there are rules that do depend on learning,
those described above do not depend a great deal on experience; rather, they are bottom-
up and only partially amenable to conscious control. In this case, they would be expected
to operate in infancy, and indeed several studies show that infants can segregate
sequences of sounds into two streams (Demany, 1982; Fassbender, 1993; McAdams &
Bertoncini, 1997). These studies make use of the fact that the auditory system keeps good
track of the temporal order of elements within a stream but rather poor track of the
temporal order of elements in different streams. Thus, if one hears a particular sequence
of four elements that repeat in a given order as coming from one source, one will easily
detect a change in the order of its elements. However, if every other element is low in
pitch and the remaining alternating elements are high in pitch, then two streams will be
perceived, one consisting of the low-pitched elements and the other consisting of the
high-pitched elements. The temporal order of the low-pitched tones will be encoded
accurately, as will the temporal order of the high-pitched tones. However, the temporal
order of the high and low tones relative to each other will not be encoded accurately.
Winkler and colleagues (2003) made use of this fact to show that 2- to 5-day-old infants
can do stream segregation.
As far as integrating simultaneous components into a single percept, one study indicates
that infants can do this at 6 months of age (Folland, Butler, Smith, & Trainor, 2011). If a
set of harmonics are all integer multiples of a common fundamental frequency, adults will
typically integrate the components into a single complex sound with pitch equal to that of
the fundamental frequency. If one harmonic is mistuned, it will not be integrated into the
complex, and two simultaneous sounds will be heard, one higher pitched tone at the
frequency of the mistuned harmonic, and one lower pitched tone at the frequency of the
fundamental. Folland et al. (2011) showed that 6-month-old infants are quite good at
detecting mistuned harmonics. Furthermore, as discussed above, from 4 months of age,
infants perceive the pitch of the missing fundamental, which also implies that they can
integrate the harmonics into a single percept.
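The difference between an in-tune and a mistuned complex can be sketched as a list of component frequencies. The 200-Hz fundamental, the choice of the third harmonic, and the 8% mistuning are hypothetical values for illustration; they are not the stimulus parameters of Folland et al. (2011).

```python
def complex_tone_components(f0_hz, n_harmonics, mistuned_harmonic=None,
                            mistuning_pct=0.0):
    """Component frequencies of a harmonic complex.

    If mistuned_harmonic is given (1 = fundamental), that component is
    shifted upward by mistuning_pct percent of its in-tune frequency.
    """
    freqs = []
    for k in range(1, n_harmonics + 1):
        freq = float(f0_hz * k)
        if k == mistuned_harmonic:
            freq *= 1.0 + mistuning_pct / 100.0
        freqs.append(freq)
    return freqs


def fits_harmonic_series(freqs_hz, f0_hz, tolerance_hz=1e-6):
    """True if every component lies on an integer multiple of f0_hz."""
    return all(abs(f - f0_hz * round(f / f0_hz)) <= tolerance_hz
               for f in freqs_hz)


in_tune = complex_tone_components(200, 6)
mistuned = complex_tone_components(200, 6, mistuned_harmonic=3,
                                   mistuning_pct=8.0)

print(fits_harmonic_series(in_tune, 200))    # True
print(fits_harmonic_series(mistuned, 200))   # False
```

In the mistuned complex the third component no longer falls on a multiple of the fundamental, which is the condition under which adults tend to hear it as a separate tone.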
While the studies described above demonstrate that auditory scene analysis is present in
very young infants, they do not indicate whether such abilities improve with age or are
dependent on experience. Sussman and colleagues (Sussman, Čeponienė, Shestakova,
Näätänen, & Winkler, 2001; Sussman, Wong, Horváth, Winkler, & Wang, 2007)
conducted studies in children using stimuli similar to those of Winkler and colleagues (2003).
They found that although children between 5 and 11 years all demonstrated stream
segregation, the younger children were less efficient than the older children and adults.
Although no training studies on auditory scene analysis have been conducted in children,
comparing adult musicians and nonmusicians provides a natural experiment of the effects
of musical training. Fujioka, Trainor, Ross, Kakigi, and Pantev (2005) found that when
two simultaneous melodies (polyphonic music) were presented to adults, MEG brain
responses indicated that musicians were able to form more robust representations of the
two melodies compared to nonmusicians. This suggests that experience likely plays a
significant role in the efficiency of auditory scene analysis processes.
Summary of Basic Auditory Development and Neural Correlates
In general, basic auditory-processing abilities improve greatly during infancy but do not
fully mature until around 10 years of age. The early improvements are likely associated
with maturation of auditory cortical areas. In particular, myelination and the
expression of neurofilament in the deeper cortical layers during the first couple of years
after birth enable input of specific auditory information from subcortex to layer IV, and
processing of this information internal to auditory areas. However, the upper layers that
contain the majority of connections to cortical areas beyond auditory cortex do not begin
to show mature neural connections until after age 5 years. The maturation of these layers
is associated with improvements in attention, the ability to filter out certain sounds and
listen selectively. These abilities are likely achieved through connections from higher
cortical areas back to auditory areas. Such connections likely enable selective priming of
auditory cortical neurons in order to direct processing to particular sound features of
interest. Event-related potential studies indicate that auditory evoked potentials continue
to mature until about 18 years of age (Ponton et al., 2000; Trainor, Shahin, & Roberts,
2003). For example, the N1 and P2 components of the evoked potential, occurring about
100 ms and 170 ms after onset of an isolated sound, respectively, are obligatory
responses in adults. They likely reflect recurrent activations in auditory areas that are
influenced by connections from other cortical areas. However, in children, they are not
seen robustly until after about 4 years of age. Thereafter, they increase in amplitude with
age, reaching a maximum around 10 to 12 years of age, the point at which neurofilament
development matures. After age 12, these responses decrease in amplitude, probably
reflecting fewer, more efficient connections for processing sound, and they reach stable
adult levels around 18 years of age. Also of interest is the maturation of oscillatory
activity seen in the EEG in response to sound. Shahin, Roberts, Chau, Trainor, and Miller
(2008) have shown that induced gamma-band activity, which is associated with top-down
processing, attention, and memory, also matures rather late. Finally, although direct tests
of the effects of auditory experience on these ERPs have not been done, several studies
show that N1 and P2 (Shahin et al., 2004), N2 (a component related to attentional
processing; Fujioka, Ross, Kakigi, Pantev, & Trainor, 2006), and induced gamma-band
activity (Shahin et al., 2008) are all larger in preschool children engaged in music lessons
compared to children engaged in nonmusical activities. These studies suggest that the
development of auditory cortex is greatly influenced by specific auditory experience
during childhood.
Musical Development
Music is produced by instruments that create sound vibrations, including the striking of
percussion instruments, the plucking or bowing of strings, the blowing of air columns in
wind instruments, and the vibrating of vocal cords in singing. The ability to perceive the
individual sounds from which music is composed depends on the ability to process the
basic sound features of intensity, frequency, pitch, timbre, duration, and location outlined
in the previous section. However, meaning in music arises largely from how the individual
sounds are put together, both sequentially and simultaneously. Musical structure has two
basic aspects, a temporal (rhythmic) aspect and a pitch (melodic and harmonic) aspect.
The perception of music relies intimately on the general principles of auditory scene
analysis outlined in the section “Auditory Scene Analysis” above. Some sound events are
perceived as being grouped together, while different groups of sounds are perceived as
being segregated from each other. For example, to perceive a melody, successive tones
from one sound source (voice or stream) must be grouped into a coherent whole.
Similarly, to perceive the different colors or timbral qualities of different chords, the
relationships between simultaneous tones must be perceived in an integrated manner. On
the other hand, to hear the different parts (voices or streams) of polyphonic music, each
part or melody must be segregated from the others.
Musical behavior is found in all human societies, past and present, and, like language,
music is a defining characteristic of the human species. People spend a tremendous
amount of time and resources engaging in musical activity. For example, the male
Mekranoti Indians of the Amazon sing for 1 to 2 hours every morning before dawn (see
Huron, 2003). In modern Western society, much time and money are also devoted to music,
as can be seen by the fact that the United States makes more money exporting music
than pharmaceuticals (Huron, 2003). Music engages sensory, perceptual, and cognitive
systems, but it also has direct effects on the emotions (see Huron, 2006; Sloboda, 1991;
Trainor & Schmidt, 2003). Music serves important social functions and is found at
birthday parties, weddings, religious ceremonies, and political rallies, and in rallying
armies for warfare. In adults, singing and playing music together appears to engender a
common emotional feeling across people and to increase their
willingness to cooperate (Wiltermuth & Heath, 2009). Singing is also an everyday activity
among young children, who (p. 322) incorporate music into their games. It is likely not by
chance that music is an integral part of daycare and preschool programs everywhere.
Perhaps most interesting is that caregivers around the world communicate with their
preverbal infants through singing (Trehub, 2009). Across cultures, infant-directed singing
is distinguishable from other types of singing (Trehub & Trainor, 1998; Trehub, Unyk, &
Trainor, 1993a,b). In keeping with infants’ processing limitations, infant-directed singing
uses simple structures and a lot of repetition and is often somewhat conversational in
that caregivers will modify what they sing based on the infants’ reactions (Smith &
Trainor, 2008; Unyk, Trehub, Trainor, & Schellenberg, 1992). The communicative intent
of infant-directed singing is evident in the fact that caregivers sing in different styles
when attempting to achieve different parenting goals—for example, singing in a quiet,
slow, lower-pitched voice with an airy timbre when putting infants to sleep, and in a
faster, higher-pitched, and more enunciated voice when playing with infants (Trainor,
Clark, Huntley, & Adams, 1997).
Infants are very responsive to music (Trehub, 2009). For example, they prefer to listen to
renditions of songs that are sung in an infant-directed style compared to the same songs
sung in an adult-directed style (Trainor, 1996), particularly preferring the loving tone and
the higher pitch (Trainor & Zacharias, 1998) of the infant-directed versions. Infants also
react differently to different types of infant-directed singing, focusing inward and looking
downward during the presentation of lullabies and actively looking outward at people in
the room during the presentation of play songs (Rock, Trainor, & Addison, 1999). Speech
directed to preverbal infants contains musical features such as exaggerated pitch
contours and rhythmic patterning, leading to the suggestion that infant-directed music
and speech serve similar functions related to emotional regulation and social bonding
(Dissanayake, 2000). Music and language development also appear to be related during
childhood. For example, Anvari, Trainor, Woodside, and Levy (2002) showed that musical
abilities and early reading abilities are correlated in preschool children, even after other
factors such as memory, vocabulary, and phonological awareness are factored out.
Furthermore, participation of school-aged children in musical activities appears to
improve early reading ability (Magne, Schon, & Besson, 2006; Moreno & Besson, 2006)
and general intelligence (Schellenberg, 2004).
In the following sections, we examine the development of sensitivity to musical pitch and
rhythmic structure. In both cases, there are universal or near-universal features, but, as
is the case with language, some aspects of musical structure vary considerably between
musical systems. Different musical systems use different scales (sets of notes from which
musical compositions are formed), different melodic conventions, and different
predominant rhythmic structures, and they may or may not use complex harmonic
structure. Thus, as with language, children must acquire the specific musical system to
which they are exposed, and adults, even those without specific musical training, process
music through mental structures that were sculpted by the music to which they were
exposed as children. In the following sections we consider early musical abilities and how
specific musical pitch and rhythmic structures are acquired through everyday exposure
to music, a process known as “enculturation.” We then examine the research evidence for
effects of enriched early musical training on musical acquisition and development in
general.
Development of Musical Pitch Organization
Some aspects of musical pitch structure are near universal, whereas others vary greatly
from musical system to musical system. It might be expected that near-universal aspects
reflect general brain mechanisms for processing auditory information, including how
sound is represented in cortical tissue, basic memory limitations, and the integration and
segmentation processes involved in auditory scene analysis. Sensitivity to near-universal
aspects of musical pitch structure might be expected to appear early in development. On
the other hand, the fact that musical system-specific aspects have evolved in only some
musical systems suggests that these aspects may reflect processing that is less
dependent on general auditory mechanisms and may require more experience with a
particular musical system to be acquired by children. In the following, we first consider
early-developing musical capabilities and their relation to near-universal musical
features, then later-developing capabilities and their relation to enculturation, and,
finally, effects of enriched musical training.
Early Abilities for Perceiving Musical Pitch Organization
Consonance and dissonance. When the ratio between the fundamental
frequencies of two tones can be expressed as a small-integer ratio (p. 323) (e.g., intervals
of an octave, 1:2, and perfect fifth, 2:3), adults perceive the tones as consonant (smooth,
pleasant, without roughness), but when the ratios are more complex (e.g., major
seventh, 8:15; tritone, 32:45) the tones sound dissonant (rough, unpleasant) (e.g., see Plomp &
Levelt, 1965; Terhardt, 1984; Tramo, Cariani, Delgutte, & Braida, 2001). This perception
is thought to arise from both spatial and temporal mechanisms. The more
consonant the perception of two tones, the more the frequencies of the harmonics tend to
be either identical or more than a critical band apart so that their vibration pattern
representations on the basilar membrane are separated and do not interact. On the other
hand, the more dissonant the perception of two tones, the greater the chance that there
are harmonics whose frequencies are nonidentical but within a critical band. In this case,
there is interference in the spatial neural representation of these frequency components.
Studies have shown that the temporal mechanism is also involved. Small versus large
integer frequency ratios set up firing patterns in the auditory nerve fibers that are
distinct (Tramo et al., 2001).
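The spatial account can be illustrated numerically. The following Python sketch is not taken from the studies cited. It assumes complex tones with six harmonics each, a lower fundamental of 220 Hz, and the Glasberg and Moore equivalent rectangular bandwidth (ERB) formula as a stand-in for the critical band; the function names are hypothetical. It counts pairs of harmonics, one from each tone, that are nonidentical yet fall within one bandwidth of each other.

```python
from fractions import Fraction


def erb_hz(freq_hz):
    """Approximate auditory filter bandwidth in Hz (Glasberg & Moore ERB formula).

    Assumption: used here only as a rough proxy for the critical band.
    """
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def interfering_pairs(f0_low, ratio, n_harmonics=6):
    """Count harmonic pairs that are nonidentical but closer than one ERB."""
    f0_high = f0_low * ratio
    low = [f0_low * k for k in range(1, n_harmonics + 1)]
    high = [f0_high * k for k in range(1, n_harmonics + 1)]
    count = 0
    for a in low:
        for b in high:
            # Identical harmonics coincide and do not interfere in this toy model.
            if a != b and abs(a - b) < erb_hz((a + b) / 2):
                count += 1
    return count


# Exact fractions keep the "identical harmonic" test free of rounding error.
intervals = {
    "octave (1:2)": Fraction(2, 1),
    "perfect fifth (2:3)": Fraction(3, 2),
    "major seventh (8:15)": Fraction(15, 8),
    "tritone (32:45)": Fraction(45, 32),
}
for name, ratio in intervals.items():
    print(name, interfering_pairs(Fraction(220), ratio))
```

Under these assumptions the octave produces no interfering pairs, and the major seventh and tritone produce more than the perfect fifth. The count is only a crude index: it ignores harmonic amplitudes and the temporal mechanism entirely.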
The consonance/dissonance continuum is commonly used as an organizing structural
feature for musical pitch. For example, in Western musical structure, a dissonant interval
creates tension, which is typically subsequently resolved by a following consonant
interval, giving rise to an emotional experience of ebb and flow of tension. Given its
origin at relatively peripheral levels of the auditory system, one would expect sensitivity
to consonance and dissonance to be present early in life. Indeed, a number of studies in
6-month-olds show that infants of this age can categorize consonant and dissonant
intervals (Trainor, 1997) and that they find two consonant intervals to sound more similar
than a consonant and a dissonant interval, even when the pitch distance between the two
consonant intervals is larger than that between the consonant and dissonant interval
(Schellenberg & Trainor, 1996). Furthermore, infants between 2 and 6 months of age
prefer to listen to consonance compared to dissonance, whether with isolated intervals or
in the context of a simple piece of music (Trainor & Heinmiller, 1998; Trainor, Tsang, &
Cheung, 2002; Zentner & Kagan, 1998). Even hearing newborns of deaf parents prefer
consonance to dissonance (Masataka, 2006). Given the fundamental importance of
consonance and dissonance in musical pitch structure and its early development, we
propose that the perceptual continuum between consonance and dissonance might
provide the starting point for the development of musical pitch systems.
Octave equivalence. Another basic feature of sound representation that is related to
consonance is octave equivalence. Tones an octave apart (1:2 ratio of fundamental
frequencies) are perceived as very similar. Indeed, because of their small-integer ratio
relation, all of the harmonics in the higher tone are contained within the lower tone. In
terms of musical structure, a near-universal feature of musical systems is that tones an
octave apart are considered structurally identical (Burns, 1999). In Western music, tones
an octave apart are given the same note names (e.g., there are seven As on the piano
keyboard). When males and females sing together and their voice ranges do not overlap,
they commonly sing octaves apart. Infants also appear to be sensitive to octave relations
(Demany & Armand, 1984).
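The harmonic containment that underlies octave equivalence can be checked directly. This is a minimal sketch, assuming idealized harmonic tones and arbitrary illustrative fundamentals of 110, 220, and 165 Hz; the helper name is hypothetical.

```python
def harmonics(f0, n):
    """Return the first n harmonic frequencies of a tone with fundamental f0."""
    return {f0 * k for k in range(1, n + 1)}


low = harmonics(110, 16)       # lower tone
octave_up = harmonics(220, 8)  # tone one octave higher (1:2)
fifth_up = harmonics(165, 8)   # tone a perfect fifth higher (2:3)

print(octave_up <= low)  # True: every harmonic of the octave is in the lower tone
print(fifth_up <= low)   # False: e.g., 165 Hz is not a harmonic of 110 Hz
```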
Unequal interval scales. Music is not typically composed using continuous pitch. Rather,
the octave is divided into a small set of pitch intervals, and the notes formed by these
intervals are used to make melodies and harmonies. This near-universal convention is
probably related to memory constraints of the human auditory system, which make
continuous pitch compositions difficult to encode and remember. While different musical
systems use different scales (e.g., pentatonic, ragas of Indian classical music, Western
major and minor), the vast majority of scales have the characteristic that they contain
two (or more) sizes of intervals. For example, the Western major scale contains intervals
of a semitone (1/12th octave) and a tone (1/6th octave). Use of more than one interval
size gives rise to the possibility of making unique sets of relations between each note of
the scale and every other note of that scale (Balzano, 1980). This structure allows for
different notes to take on different functions. For example, in the Western major scale,
the tonic is the most stable pitch, and compositions that end on this note sound most
complete. The fifth tone of the scale, the dominant, and the seventh note, the leading
tone, require resolution to the tonic. Although infants need to learn the intervals in the
scales used in their culture (see below), even before they have done this they show
processing advantages for scales with two interval sizes compared to scales with one
interval size (Trehub, Schellenberg, & Kamenetsky, 1999).
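The uniqueness property described by Balzano (1980) can be made concrete with a short sketch. It assumes a 12-step equal division of the octave and represents each scale as pitch classes in semitones above the tonic; the whole-tone scale serves as an example of a scale with a single interval size. The function name is hypothetical.

```python
def interval_profiles(scale, octave=12):
    """For each note, list its intervals (in semitones, mod octave) to every other note."""
    return [
        tuple(sorted((q - p) % octave for q in scale if q != p))
        for p in scale
    ]


major = [0, 2, 4, 5, 7, 9, 11]   # steps of 2, 2, 1, 2, 2, 2, 1 semitones
whole_tone = [0, 2, 4, 6, 8, 10]  # steps of 2 semitones only

# Number of distinct profiles: one per note if every note has a unique
# set of relations to the others.
print(len(set(interval_profiles(major))))       # 7
print(len(set(interval_profiles(whole_tone))))  # 1
```

In the major scale every note has a different pattern of relations to the rest of the scale, so each can take on a distinct function. In the equal-step scale all notes are interchangeable.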
Transpositional invariance. One universal aspect of musical pitch structure is that of
transpositional invariance. A melody maintains its identity (p. 324) regardless of the
absolute pitch of the starting note as long as the pitch distances (intervals) between notes
are correct. For example, Happy Birthday is recognizable whether transposed to a higher
or lower pitch range. Indeed, most adults do not readily remember the absolute pitches of
a melody, but favor a representation in the nervous system where the distances between
notes are encoded. The ability to compare the pitch distance between one set of two
tones and another set of two tones is called relative pitch. The fragility of absolute pitch
representations in long-term memory in most adults is evident in studies showing that the
ability to judge whether two tones have the same or different pitch deteriorates rapidly as
more distracter tones with random pitch are placed between them, although a very small
percentage of people do readily remember absolute pitch and are not affected by such
distracter tones (Ross et al., 2004).
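The distinction between absolute and relative pitch codes can be sketched as follows. The example assumes MIDI note numbers (one unit per semitone) and an illustrative transcription of the opening of Happy Birthday starting on G4; both are conveniences for illustration, not part of the studies cited.

```python
def intervals(melody):
    """Relative pitch code: signed semitone distance between successive notes."""
    return [b - a for a, b in zip(melody, melody[1:])]


def transpose(melody, semitones):
    """Shift every note by the same number of semitones."""
    return [note + semitones for note in melody]


# Assumed transcription: G4 G4 A4 G4 C5 B4 as MIDI note numbers.
opening = [67, 67, 69, 67, 72, 71]
up_a_fifth = transpose(opening, 7)  # perfect fifth = 7/12 of an octave

print(opening == up_a_fifth)                        # False: absolute pitches differ
print(intervals(opening) == intervals(up_a_fifth))  # True: relative code is unchanged
print(intervals(opening))                           # [0, 2, -2, 5, -1]
```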
Infants may also process absolute pitch under some circumstances (Saffran &
Griepentrog, 2001; Volkova, Trehub, & Schellenberg, 2006), but, like adults, they favor
relative pitch representations (Plantinga & Trainor, 2005, 2008; Trehub, Bull, & Thorpe,
1984). Also similar to adults, infants’ memory for absolute pitch fades rapidly. Infants’
ability to determine whether two pitches are the same or different deteriorates as the
number of distracter tones placed between them increases (Plantinga & Trainor, 2008).
On the other hand, infants readily process relative pitch. Several studies show that
infants can detect a change in one note of a melody, even when the comparison melody is
transposed with respect to the original (e.g., Trainor & Trehub, 1992; Trehub et al., 1984).
Infants’ long-term memory representations also favor relative pitch. Plantinga and
Trainor (2005) exposed infants to one of two melodies every day for a week. After this
exposure, infants showed a novelty preference, preferring to listen to whichever of the
two melodies they had not heard previously. Of most interest here, this preference for the
novel melody remained as strong when the melodies at test were transposed up or down
by either a perfect fifth (7/12ths of an octave) or tritone (1/2 octave) compared to their
presentation during familiarization the prior week. However, infants showed no
preference for the melody of exposure presented at the pitch level heard during the week
of exposure compared to that same melody presented at a different pitch level. These
results are interesting because relative pitch representations are more complex than
absolute pitch representations in that the former require the comparison of pitch
distances whereas the latter require simple encoding of isolated pitches. These results
are particularly interesting when it is considered that absolute frequency representations
are present already on the basilar membrane in the inner ear and are maintained through
subcortical tonotopic maps and into primary auditory cortex. Yet melodic representations
largely discard this absolute pitch information in favor of relative pitch representations.
This attests to the usefulness of the relative pitch representations. Different people speak
and sing at different pitch levels, and therefore relative pitch representations are
essential for recognizing musical input across such variation.
Enculturation to Specific Musical Pitch Systems
Although infants show some precocious musical processing abilities as indicated in the
previous section, these abilities appear to concern near-universal aspects of musical pitch
structure. Just as it takes time for children to acquire a particular language, it takes time
for them to acquire a musical system. Yet, for both music and language, implicit
knowledge of the structure is acquired without any formal instruction. Indeed, some have
argued that, when given appropriate implicit behavioral tests that do not require explicit
music knowledge, nonmusicians show considerable knowledge of the musical system of
their culture (e.g., Bigand & Poulin-Charronnat, 2006; Tillmann, Bigand, Escoffier, &
Lalitte, 2006). For example, Trainor, McDonald, and Alain (2002) found that Western
nonmusicians show preattentive automatic brain responses to changes to a note that
violate Western musical structure (out-of-key notes) in an unfamiliar melody, indicating
that they have internalized the structure of Western music and process music through
representations that instantiate these expectations.
There are two basic aspects of musical pitch structure that vary considerably from
musical system to musical system: scale (or key) structure and harmonic structure.
Lynch, Eilers, Oller, and Urbano (1990) showed that Western adults are much better at
detecting changes to Western scales than to unfamiliar Balinese scales, whereas Western
infants are equally good at detecting changes to both. Trainor and Trehub (1992) showed
that Western adults, whether formally musically trained or not, process melodies in terms
of Western major scale structure. Specifically, they are much better at detecting changes
to an unfamiliar Western melody that go outside the notes of the key of the melody
(p. 325) compared to changes that remain within the key of the melody. Thus, they have
implicit knowledge of Western key structure. Infants, on the other hand, detect both
types of changes equally well, and even perform better than adults under some conditions
on within-key changes, indicating that they have not yet learned what notes belong in the
Western major scale. EEG studies indicate that event-related potential measures of
cortical representations for melodies take on an adultlike morphology later than
representations for individual pitches (He, Hotson, & Trainor, 2007, 2009; Tew, Fujioka,
He, & Trainor, 2009), indicating that processing melodic information develops more
slowly than processing the pitch of individual tones. The exact age at which children
acquire scale knowledge is not known; however, it is certainly present by age 4 or 5 years
(Corrigall & Trainor, 2009; Trainor, 2005; Trainor & Corrigall, 2010; Trainor & Unrau,
2012; Trehub, Cohen, Thorpe, & Morrongiello, 1986).
Although the vast majority of musical systems use some kind of scale structure for
melodic composition, elaborate harmonic structure is relatively rare across musical
systems. Interestingly, harmonic structure is acquired rather late in development and, at
least in the absence of musical training, does not reach adult levels until around 12 years
of age (Costa-Giomi, 2003). However, younger children do show some sensitivity to
harmony. Schellenberg, Bigand, Poulin-Charronnat, Garnier, and Stevens (2005) showed
that when harmonic progressions ended on the expected tonic chord, 6-year-olds were
faster to make judgments about that chord (e.g., which of two vowels was sung on the
chord) compared to when the harmonic progression ended on an unexpected
subdominant chord (based on the fourth note of the scale), even though the tonic and
subdominant chords are structurally identical in isolation (i.e., composed of the same
intervals). They take on different roles only in the context of a key. Koelsch and
colleagues (2003) used EEG to demonstrate that children as young as 5 years show a
brain response to harmonically very unexpected chords. And Corrigall and Trainor (2009)
demonstrated that 4-year-old children rate sequences that end on the tonic chord as
sounding “good” significantly more often than sequences that end on the subdominant
chord.
Even when chords do not accompany a melody in Western music, the notes of the melody
alone imply a harmonic accompaniment. Trainor and Trehub (1994) investigated the
development of sensitivity to implied harmony. They found that adults and 7-year-olds
readily detected changes to a melody that remained within the key of the melody (and so
did not violate scale-based expectations) but that violated harmonic expectations at that
point in the melody. This result indicates that these age groups are processing melodies
according to a sophisticated implied harmonic representation. On the other hand,
5-year-olds did not detect changes that violated the implied harmonic structure better than
changes that did not, indicating that at this age they do not have a well-developed sense
of implied harmony.
In sum, although infants show early development of sensitivity to near-universal musical
features, it takes many years to fully acquire system-specific musical pitch processing.
Yet, as in language, such processing is learned through everyday exposure to music in
the absence of formal training. Interestingly, the earlier acquisition of scale (key)
knowledge and the later acquisition of harmonic knowledge parallel how commonly
these music features are seen across the world’s musical traditions. Scale structure is
very common, whereas elaborate harmonic structure is relatively rare. One caveat to
these conclusions, however, is that almost all of the research evidence to date comes
from the study of the acquisition of Western musical structure, and we can only speculate
that acquisition of other musical systems follows similar patterns.
Effects of Formal Musical Training on Musical Pitch Development
In contrast to language, where syntax (grammar), vocabulary, semantics (meaning), and
the ability to read written words are trained in all children in school, musical training in
Western societies varies considerably from individual to individual, from no formal
training to decades of intensive practice for hours a day. Thus, it is possible to
examine the effects of extensive musical training. Most studies in this area have
compared adult musicians and nonmusicians. This provides a starting point for this line of
inquiry, although it can be difficult to disentangle genetic from experiential factors in
these studies (see discussion below).
A number of MRI studies indicate structural brain differences between musicians and
nonmusicians that extend across a wide network of areas (e.g., Koelsch & Siebel, 2005).
In particular, musical training is associated with enlarged areas in auditory cortex,
particularly on the right side (Bermudez, Lerch, Evans, & Zatorre, 2009; Schneider et al.,
2002), Broca’s area (Sluming et al., 2002), cerebellum (p. 326) (Hutchinson, Lee, Gaab, &
Schlaug, 2003), and motor areas (Bangert & Schlaug, 2006; Gaser & Schlaug, 2003).
Musical performance places great demands on fast encoding, memory, retrieval,
multisensory integration, and executive functions such as attention and inhibition, and
this network of brain differences likely reflects the training of these functions.
While MRI studies give information about brain structures, when sounds are presented
the stages of musical information processing can be tracked in detail with EEG and MEG
(see Näätänen et al., 2007, and Trainor & Zatorre, 2009, for reviews). Functional
differences between musicians and nonmusicians have been found at virtually every stage
of sound processing. Auditory brainstem responses occurring within 12 ms of sound onset
are already enhanced in musicians (Musacchia, Sams, Skoe, & Kraus, 2007). Likewise,
middle latency responses originating in primary auditory cortex are also enhanced
(Schneider et al., 2002). Several responses from secondary auditory cortex are earlier
and larger in musicians, including the N1b occurring around 100 ms after stimulus onset,
the N1c around 170 ms, and the P2 around 200 ms (Kuriki, Kanda, & Hirata, 2006;
Pantev et al., 1998; Shahin, Roberts, Pantev, Trainor, & Ross, 2005). Finally, P3a
responses indicating attentional capture of sounds in an unattended stream (e.g., Fujioka,
Trainor, Ross, Kakigi, & Pantev, 2004, 2005; Trainor, McDonald, & Alain, 2002) and P3b
responses reflecting memory and conscious attending to the sound (e.g., Tervaniemi, Just,
Koelsch, Widmann, & Schröger, 2005; Trainor, Desjardins, & Rockel, 1999) are also larger
in musicians. Another ERP response that has been studied with respect to musical
training is MMN. Musicians show larger and/or earlier MMN responses to occasional
note changes in a single melody presented in transposition (Fujioka et al., 2004), to
occasional changes in each of two simultaneously presented melodies in a polyphonic
musical texture (Fujioka et al., 2005), and to unexpected harmonies in chord progressions
(Koelsch, Schmidt, & Kansok, 2002). A study by Shahin and colleagues (2008) also
indicates larger induced gamma-band (40- to 100-Hz oscillation) responses in musicians
compared to nonmusicians. The evoked gamma-band response is phase-locked to the
sound stimulus and occurs primarily between about 50 and 100 ms after sound onset
(e.g., Pantev et al., 1991). However, the induced gamma-band response is longer
lasting and is not phase-locked to the incoming sound (e.g., Kaiser & Lutzenberger,
2003). It is therefore thought to reflect the entrainment of intrinsic oscillatory networks
in the brain to the incoming sound. Induced gamma-band activity likely reflects top-down
executive processes (e.g., Fujioka, Trainor, Large, & Ross, 2009; Gurtubay, Alegre,
Valencia, & Artieda, 2006; Snyder & Large, 2005), and its enhancement in musicians
suggests that musical training might influence general attentional and executive
functions (Trainor, Shahin, & Roberts, 2009).
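The practical consequence of phase-locking can be shown with a toy simulation. This sketch is not the analysis used in the studies cited, which rely on time-frequency decompositions of recorded EEG or MEG. It assumes noise-free 40-Hz sinusoidal "trials" sampled at 1000 Hz, and all function names are hypothetical. Averaging across trials before computing power retains only phase-locked (evoked) activity, whereas computing power on each trial before averaging retains activity regardless of phase.

```python
import math


def trial(freq_hz, phase, n_samples=200, fs=1000.0):
    """One simulated trial: a sinusoid with a given starting phase."""
    return [math.sin(2 * math.pi * freq_hz * t / fs + phase) for t in range(n_samples)]


def mean_power(signal):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in signal) / len(signal)


def evoked_power(trials):
    """Power of the across-trial average: only phase-locked activity survives."""
    n = len(trials)
    average = [sum(tr[i] for tr in trials) / n for i in range(len(trials[0]))]
    return mean_power(average)


def total_power(trials):
    """Average of single-trial power: phase-locked and non-phase-locked activity."""
    return sum(mean_power(tr) for tr in trials) / len(trials)


n_trials = 50
# Phase-locked: every trial starts at the same phase.
locked = [trial(40, 0.0) for _ in range(n_trials)]
# Not phase-locked: phases spread evenly around the cycle (a stand-in for random phase).
jittered = [trial(40, 2 * math.pi * k / n_trials) for k in range(n_trials)]

for name, trials in (("phase-locked", locked), ("not phase-locked", jittered)):
    print(name, round(evoked_power(trials), 3), round(total_power(trials), 3))
```

In this idealized case the non-phase-locked oscillation vanishes from the trial average but remains fully present in single-trial power, which is why induced activity must be measured on individual trials rather than on the averaged waveform.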
Are these musician/nonmusician differences related to musical experience or do they
reflect innate factors such that musicians simply have a genetic endowment that favors
good sound processing? The only way to definitively show effects of experience is to
examine children and randomly assign them to musical training or not, an expensive and
time-consuming enterprise. However, a number of factors point to a large role of
experience in the adult comparison data. First, the amount of musical training is often
correlated with the extent of the brain enhancement, including for structural differences
(e.g., Schneider et al., 2002) and ERP differences (e.g., Pantev et al., 1998; Trainor,
Desjardins, et al., 1999). Second, enhancements are greatest for sounds in the timbre of
the instrument of practice in comparison to timbres of other instruments (e.g., Pantev,
Roberts, Schulz, Engelien, & Ross, 2001). Third, components that are enhanced in
musicians can also be affected even in adulthood through laboratory training (e.g.,
Bosnyak et al., 2004; Lappe, Herholz, Trainor, & Pantev, 2008).
Effects of musical experience can be measured most directly by studying children, but
because such studies are difficult to carry out, few have been completed. Most of these
studies compare children taking music lessons to those not taking music lessons at one
point in time without random assignment to groups, so the conclusions must be treated
with caution. However, these studies consistently find enhanced processing in musician
children. As discussed above, ERP responses to sound (including N1, P2, N2, and induced
gamma band) do not reach adult levels of maturity until well into the teenage years
(Ponton et al., 2000; Trainor, Shahin, & Roberts, 2003; Shahin et al., 2004), but musician
children appear to be advanced along this trajectory (Fujioka et al., 2006; Jentschke,
Koelsch, & Friederici, 2005; Shahin et al., 2004, 2008).
A few studies have used longitudinal designs to compare children at two time points.
Differential gains by musician and nonmusician children over the time period lend
support to the idea that musical (p. 327) experience is involved. Corrigall and Trainor (2009)
examined how musical processing develops in two groups of 4- to 5-year-old children, one
engaging in musical training and the other not. They found no differences between
groups at the first measurement. However, by the second measurement 8 months later,
the group engaging in musical training showed superior ability to detect harmonically
unexpected chords. Fujioka and colleagues (2006) measured MEG responses to sound in
4- to 5-year-old children, first when they were about to start music lessons and then every
3 months for a year during musical training. The responses of these children were
compared to those of a control group engaging in other activities such as athletic
training. The largest differences between groups in how the ERP responses changed over
the year were in the N2 component, which likely reflects greater attentional and memory
gains in the musician group compared to the nonmusician group. Similarly, Shahin and
colleagues (2008) examined the development of induced gamma-band responses in 4- to
5-year-old children, measuring musician and nonmusician children twice separated by a
year. They found that none of the children showed induced gamma-band responses at the
first measurement and that only those engaged in musical training did so at the second
measurement. These results also strongly suggest that musical training enhances
executive functions such as memory and attention. Finally, in one of the few studies
randomly assigning children to either music or drama lessons, Schellenberg (2004) found
that after a year of experience, those in music lessons showed greater improvement in IQ
scores than those in drama lessons.
In sum, although even nonmusicians develop brains specialized for processing the
structure of the music in their environment, musical training greatly enhances structural
and functional aspects of the brain for musical processing. Furthermore, musical training
in the preschool period results in superior musical abilities and, perhaps most interesting,
in superior executive functioning as well, which may well lead to benefits for other
cognitive domains (for reviews, see Moreno et al., 2011; Schellenberg, 2011; Trainor &
Corrigall, 2010; Trainor & Unrau, 2012).
Development of Musical Rhythmic Organization
Musical pitch structure must be realized over time, so the temporal structure of music is
in a sense the most basic aspect of musical structure. Indeed, a considerable amount of
music has only rhythmic structure, with little or no pitch variation. Brain representations
for rhythm appear to involve two aspects that are closely linked to general auditory scene
analysis processes. The first is metrical structure, whereby listeners abstract a steady
beat (and its subdivisions and superdivisions) from a presented series of sound events.
The metrical structure is not given in the stimulus directly. Indeed, perceived beats of a
steady metrical structure may occur at places where there is no actual sound event. The
perception of a steady beat depends on regularities in the sequence of sound event onset-
to-onsets as well as on the durations, relative intensities, and pitches of the sound events.
The metrical structure is hierarchical, usually with an obvious tactus (or tempo at which
one would tap along) on which most people agree (Drake, Penel, & Bigand, 2000; Repp,
2005; Snyder & Krumhansl, 2001). In Western music, beats are typically evenly spaced
and successive levels of the hierarchy divide each beat of the previous level into two or
three beats. The second aspect of rhythmic structure is grouping, whereby sequences of
sound events are divided or grouped into phrases and subphrases and segregated from
surrounding phrases and subphrases.
Most research has been conducted on the perception of metrical structure, so that will be
the focus of the following sections. Metrical perception in music has been linked closely
with motor rhythms (Grahn & Brett, 2007; Grahn & Rowe, 2009; Phillips-Silver & Trainor,
2005, 2007, 2008; Repp, 2005; Trainor, Gao, Lei, Lehtovaara, & Harris, 2009). Indeed,
rhythmic music makes people want to move and dance. A close connection between
auditory and movement rhythms also underlies people’s ability to synchronize when
singing together or playing musical instruments together, a coupling that is not required
by speech. Few species appear to have this ability to synchronize movement to an
external auditory beat, but those that do also appear to be capable of vocal imitation
(Patel, Iversen, Bregman, & Schulz, 2009; Schachner, Brady, Pepperberg, & Hauser,
2009). Some of these species are evolutionarily distantly related, such as humans and
cockatoos, suggesting that this ability evolved independently in different species.
Although metrical structure is a near-universal organizing principle in music, the details
of metrical organization differ substantially across musical systems. Thus, as with musical
pitch structure, learning is necessary for rhythmic enculturation. In the following, we
first consider early-developing rhythmic capabilities, (p. 328) then enculturation to the
predominant rhythms of one’s culture, and finally effects of enriched musical training.
Early Abilities for Perceiving Rhythmic Organization
As with musical pitch perception, young infants show considerable sensitivity to rhythmic
structure. Winkler, Háden, Ladinig, Sziller, and Honing (2009) used EEG to demonstrate
that newborn infants show surprise when a downbeat is omitted in a rhythmic context
that sets up an expectation for that downbeat. By 2 months of age infants are best at
tempo discrimination around 600 ms onset-to-onset (Baruch & Drake, 1997), which is in
the optimal range for adults across cultures. Infants of this age can also discriminate
simple rhythmic patterns (Demany, McKenzie, & Vurpillot, 1977). By 6 months, Western
infants use duration to extract grouping structure, perceiving relatively longer sound
events as the ends of groups (Trainor & Adams, 2000). At this age, infants can also
distinguish metrical structures where successive levels of the metrical hierarchy involve
groups of three beats (as in a waltz) from those that involve groups of two beats (as in a
march) (Hannon & Johnson, 2005; Hannon & Trehub, 2005a; Morrongiello, 1984; Phillips-
Silver & Trainor, 2005). At least as young as 7 to 9 months, infants can generalize across
pitch and tempo and recognize a rhythm across variation in these aspects (Trehub &
Thorpe, 1989).
Infants are motorically immature. However, Phillips-Silver and Trainor (2005) showed
that for infants as young as 7 months of age, the way that they are moved affects how
they perceive musical rhythms. The researchers created a repeating six-beat rhythm
pattern that was ambiguous in that it had no physical accents, but could readily be
perceived either with accents on every second beat (as in a march) or with accents on
every third beat (as in a waltz). They played this ambiguous rhythmic pattern for infants,
while bouncing half of the infants on every second beat and the other half on every third
beat. After this training, they found that infants bounced on every second beat preferred
to listen to a version of the rhythmic pattern with physical accents added on every second
beat over a version with accents on every third beat, whereas infants bounced on every
third beat showed the opposite pattern of preferences. Because the infants did not move
themselves, this suggests that motor planning may not be necessary for the interaction
between movement and auditory rhythm perception. Indeed, subsequent studies with
adults indicate that vestibular input, which is necessary for balance and movement in a
gravitational field, is crucial for this interaction (Phillips-Silver & Trainor, 2008; Trainor,
Gao, et al., 2009).
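Why a six-beat pattern lends itself to both interpretations can be shown with a small sketch. The function name and the 0/1 accent coding below are hypothetical conventions chosen for illustration and are not the stimuli used in the study: a group size "fits" the pattern when it divides the number of beats evenly, and six is divisible by both two and three.

```python
def accent_pattern(n_beats, group_size):
    """Mark the first beat of every group of `group_size` beats as accented (1)."""
    return [1 if beat % group_size == 0 else 0 for beat in range(n_beats)]


n_beats = 6
duple = accent_pattern(n_beats, 2)   # march-like: accent on every second beat
triple = accent_pattern(n_beats, 3)  # waltz-like: accent on every third beat

# Group sizes that tile the six-beat cycle without a remainder.
fitting_groups = [size for size in range(2, n_beats) if n_beats % size == 0]

print(duple)           # [1, 0, 1, 0, 1, 0]
print(triple)          # [1, 0, 0, 1, 0, 0]
print(fitting_groups)  # [2, 3]
```

Because both groupings tile the same six beats, an unaccented version of the pattern is compatible with either, and an added cue such as movement can tip perception one way or the other.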
There is rather little research on the development of rhythmic abilities in children, and
much of it has been conducted in the laboratory of Carolyn Drake. With respect to the
reproduction of rhythm patterns, at age 7, but not at age 5, children are as accurate as
musically untrained adults in reproducing short rhythms (Drake, 1993). Younger children,
like adults, find duple meters easier than triple meters, rhythms with fewer different note
durations easier, and rhythms with intensity accents easier. With respect to metrical
perception, young children have difficulty moving in time with an external beat, whether
by tapping or using whole-body movement (Eerola, Luck, & Toiviainen, 2006). However,
the ability to tap to a beat improves dramatically between ages 4 and 11 years (Drake,
Jones, & Baruch, 2000). Finally, children’s preferred tapping tempo decreases with age
and with musical training, suggesting that they are able to deal with longer spans of time
as they get older (Drake, Jones, & Baruch, 2000; McAuley, Jones, Holub, Johnston, &
Miller, 2006).
In sum, young infants show precocious abilities to discriminate rhythmic patterns, and
their auditory perception of rhythm is influenced by movement. However, the motor skills
needed to produce musical rhythms take considerable time to develop.
Enculturation to the Rhythmic Structure of Specific Musical Systems
Western music predominantly contains simple metrical structures (Fraisse, 1982), and
those who have grown up listening to Western music find simple metrical structures such
as those with duration ratios of 1:1 or 1:2 (as in a march) easier to process than those
with more complex rhythms, such as ratios of 2:3 (i.e., a group of two beats followed by a
group of three beats) (Hannon & Trehub, 2005a; Repp, London, & Keller, 2005; Snyder,
Hannon, Large, & Christiansen, 2006). A privileged status for simple rhythmic structures
might reflect the simple ratios involved in human movements, such as heartbeats and
walking. However, many musical systems, such as those in Bulgaria and Macedonia, use
more complex rhythms in their folk music, and adults in these cultures have no difficulty
in perceiving these complex rhythms (Hannon & Trehub, 2005a).
(p. 329) As with the development of system-specific scale structure discussed above,
Hannon and Trehub (2005a,b) have shown that at 6 months infants are able to perceive
both simple and complex rhythms, but that they lose the ability to process complex
rhythms by 12 months if these rhythms are not present in their musical system of
exposure. They presented Western adults and infants of 6 and 12 months, as well as
Bulgarian and Macedonian adults, with musical excerpts that had either simple or
complex metrical structures. They found that Western 6-month-olds and Bulgarian and
Macedonian adults could detect timing changes in rhythms with both types of structures,
but that Western adults and Western 12-month-olds could do so only for rhythms with
simple metrical structures. In sum, the brain appears to become specialized by 12 months
of age for the metrical structures that are predominant in the musical system of one’s
culture.
Effects of Formal Musical Training on Rhythmic Development
Very few studies have directly examined effects of musical training on rhythmic
development. However, a large range of rhythmic abilities exist in the general population,
and these individual differences extend to motor manifestations of rhythm such as
dancing. fMRI studies show that rhythmic stimuli activate a network of auditory and
motor regions in the brain that is similar in musicians and nonmusicians (Limb,
Kemeny, Ortigoza, Rouhani, & Braun, 2006). At the same time, several studies show
differences between adult musicians and nonmusicians, and it is reasonable to speculate
that these differences are caused at least in part by their different musical experiences in
childhood. For example, when five beats in a row are occasionally omitted in an
isochronous beat sequence, musicians are more accurate than nonmusicians at tapping at
the point where they think the fifth tone should occur (Jongsma, Desain, & Honing,
2004). Musicians’ ERPs are also temporally less variable than those of nonmusicians.
Both fMRI studies (Limb et al., 2006) and EEG studies measuring MMN (Vuust et al.,
2005) indicate that musicians show greater left activation when engaging in rhythmic
processing compared to nonmusicians. The right hemisphere may be necessary for
sequencing, but the left hemisphere is likely better for precise timing (Zatorre, 2001), so
these studies suggest that musical training may have its greatest effect on rhythm
processing in the networks of the left hemisphere for precise timing.
It remains unknown exactly how rhythms are encoded in cortex. However,
theoretical models have shown that rhythmic entrainment can be accomplished by a bank
of oscillators, each flexible but maximally driven by a particular best frequency (e.g.,
Large & Jones, 1999). Recent evidence suggests that this type of model may be
biologically plausible. The past decade has seen an increased interest in oscillatory
activity at various frequencies that is present in EEG and MEG recordings. Some of this
activity is directly driven by the stimulus in that it is precisely phase-locked to the onset
of the presented sound, such as the evoked gamma-band response (Pantev et al., 1991).
However, some of this activity increases in response to the presented stimulus but is not
precisely phase-locked to the stimulus, such as the induced gamma-band response
(between 40 and 100 Hz). The induced gamma-band response can occur when no sound
is actually present, but when there is an expectation for a sound at a certain point in a
rhythmic pattern (Snyder & Large, 2005). With respect to effects of musical training,
gamma-band responses are larger in musicians than in nonmusicians (Bhattacharya,
Petsche, & Pereda, 2001) and develop earlier in children taking music lessons than in
children not taking formal music lessons (Shahin et al., 2008). Relations between activity
in various frequency bands are likely important as well. For example, Fujioka and
colleagues (2009) found that activity in the beta band (15 to 30 Hz) followed each sound
event in a regular sequence of presented sound beats but did not respond to omitted
beats, whereas activity in the gamma band increased after each presented beat as well as
after omitted beats. Furthermore, oscillatory activity is likely related to rhythmic coupling
between auditory and motor networks. Fujioka, Trainor, Large, and Ross (2012) used
source analysis techniques on MEG data to demonstrate that presentation of an auditory
beat in the absence of any motor movement or instruction to imagine movement leads to
similar modulation of oscillatory beta band activity across auditory and motor regions
that follows the tempo of the auditory beats.
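The general idea of an oscillator entraining to a beat can be sketched in a deliberately simplified form. The code below is not the Large and Jones (1999) model; it assumes a single oscillator with linear phase and period correction, and the function name and gain values are hypothetical choices made for illustration. At each sound onset the oscillator compares the onset time with its prediction, then nudges both its period and its next prediction toward the stimulus.

```python
def entrain(onsets_ms, initial_period_ms, phase_gain=0.5, period_gain=0.2):
    """Track a beat with linear phase and period correction.

    Returns the oscillator's period estimate (in ms) after each onset
    following the first. Gains are illustrative assumptions.
    """
    period = initial_period_ms
    # First prediction: one oscillator period after the first onset.
    expected = onsets_ms[0] + period
    history = []
    for onset in onsets_ms[1:]:
        # Positive error: the sound arrived later than predicted.
        error = onset - expected
        # Adapt the period (tempo) toward the stimulus.
        period += period_gain * error
        # Correct the phase, then step forward by one (updated) period.
        expected += phase_gain * error + period
        history.append(period)
    return history


# A steady beat at 600 ms onset-to-onset, heard by an oscillator that starts at 500 ms.
beat_onsets = [600.0 * i for i in range(40)]
periods = entrain(beat_onsets, initial_period_ms=500.0)
print(round(periods[0], 1), round(periods[-1], 1))
```

A bank of such oscillators, each starting from a different preferred period and perhaps limited in how far it may adapt, would be one way to approximate the "best frequency" idea described above; that extension is left out here to keep the sketch short.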
There are very few scientific studies examining the effects of musical training on
rhythmic development. However, Drake and her colleagues have consistently found that
compared to children not taking music lessons, children training musically are better at
reproducing rhythmic patterns and are more flexible in their ability to tap the beat at
different levels of the metrical hierarchy (Drake, 1993; Drake, Jones, & Baruch, 2000). To
date there is one study of the (p. 330) effects of music experience in infancy on rhythm
development. Gerry, Faux, and Trainor (2010) used the methods of Phillips-Silver and
Trainor (2005) discussed above (see the section “Early Abilities for Perceiving Rhythmic
Organization”) to compare infants enrolled in parent-and-infant Kindermusik classes with
infants not engaged in formal musical classes. Specifically, this method was developed to
measure the influence of movement on whether an ambiguous rhythmic pattern is
perceived as a march or as a waltz. Infants in Kindermusik classes get a lot of experience
being walked and swayed to musical rhythms, and this study examined whether such
enriched auditory-movement experience would influence metrical development. Gerry
and colleagues found that infants in Kindermusik tended to listen longer to the rhythm
patterns in general in the preference test phase, suggesting a greater interest in musical
rhythms. The music used in the Kindermusik classes is predominantly in duple rather
than triple meter, following conventions of Western music. Of most interest with respect
to enculturation was the finding that, for the Kindermusik group but not for the group
not engaged in formal musical training, the effects of movement on auditory
disambiguation of the metrical structure were much stronger when infants were moved on
every second beat of the ambiguous pattern than when they were moved on every third beat.
Thus, Kindermusik training is associated with an earlier processing bias for the dominant
duple rhythm patterns of Western music.
In sum, although there are few studies directly addressing this issue, musical training
likely has a large effect on the level of rhythmic accomplishment achieved. However, little
research to date addresses the ages at which formal training has the largest effects on
rhythmic perception and production.
Conclusions
Some common themes apply to the discrimination of sound features of isolated sounds,
the perception of auditory objects, and the perception of music with complex
spectrotemporal structure. First, for each of these levels of auditory processing, young
infants show some sophisticated processing abilities. Second, however, mature adult levels
of processing are not usually achieved until well into childhood or even, in many
cases, into the teenage years. Third, through exposure to specific sounds with specific
structures, children’s processing becomes refined and specialized for the structure of
their auditory input. For example, exposure to sounds with pitch is necessary for the
development of tonotopic representations and the ability to discriminate different
frequencies of sound. Similarly, enculturation—development of specialized brain
representations—to the specific melodic, harmonic, and rhythmic structure of the musical
system of a person’s culture depends on considerable exposure to that musical system.
Fourth, specific intense experience has a profound effect on perception. For example,
early exposure to complex rhythms appears to be necessary for fluent processing of those
rhythms, and brain representations for musical pitch structure are considerably different
in both child and adult musicians compared to nonmusicians.
It can be readily argued that auditory scene analysis and the representation of sounding
objects developed under evolutionary pressure, as there is survival value in being able to
identify which conspecifics and predators are present in the environment and where they
are located. This pressure in turn could easily translate into pressure for better encoding
of basic sound features such as frequency, pitch, timbre, timing, and sound location as
they are all necessary for optimal auditory scene analysis. The perception of musical
structure also depends on basic sound processing and on auditory scene analysis, but
music presents more of a puzzle in terms of survival value. However, music serves
important communication functions in preverbal infants and continues to be important in
childhood and adulthood for emotional regulation and social bonding. Thus, it is possible
that the social/emotional functions of music confer survival value, and that music, in turn,
has led to evolutionary pressure for better basic pitch processing and auditory scene
analysis.
Questions for Future Research
General issues that remain for future research include the following:
1. Why does basic auditory processing take many years to reach adult levels?
2. How does auditory plasticity differ at different ages?
3. What are the sensitive periods in humans for processing different sound features?
4. How are physiological and behavioral auditory development related?
5. Are there sensitive periods for the acquisition of musical skills?
6. Does the orderly acquisition of musical skills in Western children apply to the
acquisition of other musical systems?
(p. 331) Acknowledgments
The writing of this chapter was supported by grants to LJT from the Natural Sciences and
Engineering Research Council of Canada, the Canadian Institutes of Health Research, the
Canada Foundation for Innovation, and the Grammy Foundation. We thank Andrea Unrau
for comments on an earlier draft.
References
Abdala, C., & Folsom, R. C. (1995a). The development of frequency resolution in humans
as revealed by the auditory brain-stem response recorded with notched-noise masking.
Journal of the Acoustical Society of America, 98, 921–930.
Abdala, C., & Folsom, R. C. (1995b). Frequency contribution to the click-evoked auditory
brain stem response in human adults and infants. Journal of the Acoustical Society of
America, 97, 2394–2404.
Alais, D., & Burr, D. (2004). The ventriloquist effect results from near-optimal bimodal
integration. Current Biology, 14, 257–262.
Allen, P., & Wightman, F. (1992). Spectral pattern discrimination by children. Journal of
Speech and Hearing Research, 35, 222–233.
Al’tman, I. (1983). Role of higher divisions of the auditory system in localizing a moving
sound source. Zhurnal vyssheĭ nervnoĭ deiatelnosti imeni I P Pavlova, 33, 88–94.
Anvari, S. H., Trainor, L. J., Woodside, J., & Levy, B. A. (2002). Relations among musical
skills, phonological processing and early reading ability in preschool children. Journal of
Experimental Child Psychology, 83, 111–130.
Ashmead, D. H., Clifton, R. K., & Perris, E. E. (1987). Precision of auditory localization in
human infants. Developmental Psychology, 23, 641–647.
Ashmead, D. H., Davis, D., Whalen, T., & Odom, R. (1991). Sound localization and
sensitivity to interaural time differences in human infants. Child Development, 62, 1211–
1226.
Balzano, G. J. (1980). The group theoretic description of 12-fold and microtonal pitch
systems. Computer Music Journal, 4, 66–84.
Bangert, M., & Schlaug, G. (2006). Specialization of the specialized in features of
external human brain morphology. European Journal of Neuroscience, 24, 1832–1834.
Bargones, J. Y., & Werner, L. A. (1994). Adults listen selectively; Infants do not.
Psychological Science, 5, 170–174.
Baruch, C., & Drake, C. (1997). Tempo discrimination in infants. Infant Behavior &
Development, 20, 573–577.
Battaglia, P. W., Jacobs, R. A., & Aslin, R. N. (2003). Bayesian integration of visual and
auditory signals for spatial localization. Journal of the Optical Society of America A, 20,
1391–1397.
Bendor, D., & Wang, X. (2005). The neuronal representation of pitch in primate auditory
cortex. Nature, 436, 1161–1165.
Bermudez, P., Lerch, J. P., Evans, A. C., & Zatorre, R. J. (2009). Neuroanatomical
correlates of musicianship as revealed by cortical thickness and voxel-based
morphometry. Cerebral Cortex, 19, 1583–1596.
Bhattacharya, J., Petsche, H., & Pereda, E. (2001). Long-range synchrony in the gamma
band: role in music perception. Journal of Neuroscience, 21, 6329–6337.
Bigand, E., & Poulin-Charronnat, B. (2006). Are we “experienced listeners”? A review of
the musical capacities that do not depend on formal musical training. Cognition, 100,
100–130.
Binns, K. E., Withington, D. J., & Keating, M. J. (1995). The developmental emergence of
the representation of auditory azimuth in the external nucleus of the inferior colliculus of
the guinea-pig: The effects of visual and auditory deprivation. Brain Research:
Developmental Brain Research, 85, 14–24.
Bosnyak, D. J., Eaton, R. A., & Roberts, L. E. (2004). Distributed auditory cortical
representations are modified when non-musicians are trained at pitch discrimination with
40 Hz amplitude modulated tones. Cerebral Cortex, 14, 1088–1099.
Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sound.
Cambridge, MA: MIT Press.
Burns, E. M. (1999). Intervals, scales and tuning. In D. Deutsch (Ed.), The psychology of
music (2nd ed., pp. 215–264). San Diego, CA: Academic Press.
Buss, E., Hall, J. W., Grose, J. H., & Dev, M. B. (1999). Development of adult-like
performance in backward, simultaneous, and forward masking. Journal of Speech
Language and Hearing Research, 42, 844–849.
Clarke, S., Bellman, T. A., Maeder, P., Adriani, M., Vernet, O., Regli, L., et al. (2002).
What and where in human audition: selective deficits following focal hemispheric lesions.
Experimental Brain Research, 147, 8–15.
Clarkson, M. G., & Clifton, R. K. (1985). Infant pitch perception: Evidence for responding
to pitch categories and the missing fundamental. Journal of the Acoustical Society of
America, 77, 1521–1528.
Clarkson, M. G., & Clifton, R. K. (1995). Infants’ pitch perception: Inharmonic tonal
complexes. Journal of the Acoustical Society of America, 98, 1372–1379.
Clarkson, M. G., Clifton, R. K., & Perris, E. E. (1988). Infant timbre perception:
Discrimination of spectral envelopes. Perception & Psychophysics, 43, 15–20.
Clarkson, M. G., & Rogers, E. C. (1995). Infants require low-frequency energy to hear the
pitch of the missing fundamental. Journal of the Acoustical Society of America, 98, 148–
154.
Clifton, R. K. (1992). The development of spatial hearing in human infants. In L. A.
Werner & E. W. Rubel (Eds.), Developmental psychoacoustics (pp. 135–157). Washington,
DC: American Psychological Association.
Clifton, R. K., Gwiazda, J., Bauer, J., Clarkson, M., & Held, R. (1988). Growth in head size
during infancy: Implications for sound localization. Developmental Psychology, 24, 477–
483.
Clifton, R. K., Morrongiello, B. A., & Dowd, J. M. (1984). A developmental look at an
auditory illusion: The precedence effect. Developmental Psychobiology, 17, 519–536.
Clifton, R. K., Morrongiello, B. A., Kulig, J. W., & Dowd, J. M. (1981). Developmental
changes in auditory localization in infancy. In R. Aslin, J. Alberts, & M. R. Peterson
(Eds.), Development of perception (pp. 141–160). Academic Press.
Cornelisse, L. E., & Kelly, J. B. (1987). The effect of cerebrovascular accident on the
ability to localize sounds under conditions of the precedence effect. Neuropsychologia,
25, 449–452.
Corrigall, K. A., & Trainor, L. J. (2009). Effects of musical training on key and harmony
perception. Annals of the New York Academy of Sciences, 1169, 164–168.
Costa-Giomi, E. (2003). Young children’s harmonic perception. Annals of the New York
Academy of Sciences, 999, 477–484.
Demany, L. (1982). Auditory stream segregation in infancy. Infant Behavior and
Development, 5, 261–276.
Demany, L., & Armand, F. (1984). The perceptual reality of tone chroma in early infancy.
Journal of the Acoustical Society of America, 76, 57–66.
Demany, L., McKenzie, B., & Vurpillot, E. (1977). Rhythm perception in early infancy.
Nature, 266, 718–719.
Dissanayake, E. (2000). Antecedents of the temporal arts in early mother-infant
interaction. In N. L. Wallin, B. Merker, & S. Brown (Eds.), Florentine workshops in
biomusicology (pp. 389–410). Cambridge, MA: MIT Press.
Drake, C. (1993). Reproduction of musical rhythms by children, adult musicians, and
adult nonmusicians. Perception & Psychophysics, 53, 25–33.
Drake, C., Jones, M. R., & Baruch, C. (2000). The development of rhythmic attending in
auditory sequences: Attunement, referent period, focal attending. Cognition, 77, 251–288.
Drake, C., Penel, A., & Bigand, E. (2000). Tapping in time with mechanically and
expressively performed music. Music Perception, 18, 1–23.
Eerola, T., Luck, G., & Toiviainen, P. (2006, August). An investigation of pre-schoolers’
corporeal synchronization with music. Presented at the 9th International Conference on
Music Perception and Cognition, Bologna, Italy.
Efron, R., Crandall, P. H., Koss, B., Divenyi, P. L., & Yund, E. W. (1983). Central auditory
processing. III. The “cocktail party” effect and anterior temporal lobectomy. Brain and
Language, 19, 254–263.
Fassbender, C. (1993). Auditory grouping and segregation processes in infancy.
Norderstedt: Kaste Verlag.
Fishman, Y. I., Reser, D. H., Arezzo, J. C., & Steinschneider, M. (1998). Pitch vs. spectral
encoding of harmonic complex tones in primary auditory cortex of the awake monkey.
Brain Research, 786, 18–30.
Folland, N. A., Butler, B. E., Smith, N. A., & Trainor, L. J. (2012). Processing simultaneous
auditory objects: Infants’ ability to detect mistunings in harmonic complexes. Journal of
the Acoustical Society of America, 131, 993–997.
Folsom, R. D., & Wynne, M. K. (1987). Auditory brain stem responses from human adults
and infants: Wave V tuning curves. Journal of the Acoustical Society of America, 81, 412–
417.
Fraisse, P. (1982). Rhythm and tempo. In D. Deutsch (Ed.), The psychology of music (pp.
149–180). New York: Academic Press.
Fujioka, T., Ross, B., Kakigi, R., Pantev, C., & Trainor, L. J. (2006). One year of musical
training affects development of auditory cortical-evoked fields in young children. Brain: A
Journal of Neurology, 129, 2593–2608.
Fujioka, T., Trainor, L. J., Large, E. W., & Ross, B. (2009). Beta and gamma rhythms in
human auditory cortex during musical beat processing. Annals of the New York Academy
of Sciences, 1169, 89–92.
Fujioka, T., Trainor, L. J., Large, E. W., & Ross, B. (2012). Internalized timing of
isochronous sounds is represented in neuromagnetic beta oscillations. The Journal of
Neuroscience, 32, 1791–1802.
Fujioka, T., Trainor, L. J., Ross, B., Kakigi, R., & Pantev, C. (2004). Musical training
enhances automatic encoding of melodic contour and interval structure. Journal of
Cognitive Neuroscience, 16, 1010–1021.
Fujioka, T., Trainor, L. J., Ross, B., Kakigi, R., & Pantev, C. (2005). Automatic encoding of
polyphonic melodies in musicians and non-musicians. Journal of Cognitive Neuroscience,
17, 1578–1592.
Gaser, C., & Schlaug, G. (2003). Brain structures differ between musicians and non-
musicians. Journal of Neuroscience, 23, 9240–9245.
Gerry, D. W., Faux, A. L., & Trainor, L. J. (2010). Effects of Kindermusik training on
infants’ rhythmic enculturation. Developmental Science, 13, 545–551.
Grahn, J. A., & Brett, M. (2007). Rhythm and beat perception in motor areas of the brain.
Journal of Cognitive Neuroscience, 19, 893–906.
Grahn, J. A., & Rowe, J. B. (2009). Feeling the beat: Premotor and striatal interactions in
musicians and non-musicians during beat perception. Journal of Neuroscience, 29, 7540–
7548.
Gray, L. (1992). Interactions between sensory and nonsensory factors in the responses of
newborn birds to sound. In L. A. Werner & E. W. Rubel (Eds.), Developmental
psychoacoustics (pp. 89–112). Washington, DC: American Psychological Association.
Gurtubay, I. G., Alegre, M., Valencia, M., & Artieda, J. (2006). Cortical gamma activity
during auditory tone omission provides evidence for the involvement of oscillatory
activity in top-down processing. Experimental Brain Research, 175, 463–
470.
Hannon, E. E., & Johnson, S. P. (2005). Infants use meter to categorize rhythms and
melodies: Implications for musical structure learning. Cognitive Psychology, 50, 354–377.
Hannon, E. E., & Trehub, S. E. (2005a). Metrical categories in infancy and adulthood.
Psychological Science, 16, 48–55.
Hannon, E. E., & Trehub, S. E. (2005b). Tuning in to musical rhythms: Infants learn more
readily than adults. Proceedings of the National Academy of Sciences USA, 102, 12639–
12643.
Hartley, D. E. H., Wright, B. A., Hogan, S. C., & Moore, D. R. (2000). Age-related
improvements in auditory backward and simultaneous masking in 6- to 10-year-old
children. Journal of Speech Language and Hearing Research, 43, 1402–1415.
He, C., Hotson, L., & Trainor, L. J. (2007). Mismatch responses to pitch changes in early
infancy. Journal of Cognitive Neuroscience, 19, 878–892.
He, C., Hotson, L., & Trainor, L. J. (2009). Maturation of cortical mismatch responses to
occasional pitch change in early infancy: Effects of presentation rate and magnitude of
change. Neuropsychologia, 47, 218–229.
He, C., & Trainor, L. J. (2009). Finding the pitch of the missing fundamental in infants.
Journal of Neuroscience, 29, 7718–7722.
Huron, D. (2003). Is music an evolutionary adaptation? In I. Peretz & R. Zatorre (Eds.),
The cognitive neuroscience of music (pp. 57–75). Oxford: Oxford University Press.
Huron, D. (2006). Sweet anticipation: Music and the psychology of expectation.
Cambridge, MA: MIT Press.
Hutchinson, S., Lee, L. H., Gaab, N., & Schlaug, G. (2003). Cerebellar volume of
musicians. Cerebral Cortex, 13, 943–949.
Jensen, J. K., & Neff, D. L. (1993). Development of basic auditory discrimination in
preschool children. Psychological Science, 4, 104–107.
Jentschke, S., Koelsch, S., & Friederici, A. D. (2005). Investigating the relationship of
music and language in children: Influences of musical training and language impairment.
Annals of the New York Academy of Sciences, 1060, 231–242.
Jongsma, M. L. A., Desain, P., & Honing, H. (2004). Rhythmic context influences the
auditory evoked potentials of musicians and nonmusicians. Biological Psychology, 66,
129–152.
Kaiser, J., & Lutzenberger, W. (2003). Induced gamma-band activity and human brain
function. Neuroscientist, 9, 475–484.
Keefe, D. H., Bulen, J. C., Arehart, K. H., & Burns, E. M. (1993). Ear-canal impedance and
reflection coefficient in human infants and adults. Journal of the Acoustical Society of
America, 94, 2617–2638.
Keefe, D. H., & Levi, E. C. (1996). Maturation of the middle and external ears: Acoustic
power-based responses and reflectance tympanometry. Ear and Hearing, 17, 1–13.
King, A. J., Parsons, C. H., & Moore, D. R. (2000). Plasticity in the neural coding of
auditory space in the mammalian brain. Proceedings of the National Academy of Sciences
USA, 97, 11821–11828.
Knudsen, E. I. (1988). Experience shapes sound localization and auditory unit properties
during development in the barn owl. In G. M. Edelman, W. E. Gall, & W. M. Cowan (Eds.),
Auditory function: Neurobiological bases of hearing (pp. 137–149). New York: John Wiley
& Sons.
Koelsch, S., Grossmann, T., Gunter, T. C., Hahne, A., Schröger, E., & Friederici, A. D.
(2003). Children processing music: Electric brain responses reveal musical competence
and gender differences. Journal of Cognitive Neuroscience, 15, 683–693.
Koelsch, S., Schmidt, B., & Kansok, J. (2002). Effects of musical expertise on the early
right anterior negativity: An event-related brain potential study. Psychophysiology, 39,
657–663.
Koelsch, S., & Siebel, W. A. (2005). Towards a neural basis of music perception. Trends in
Cognitive Sciences, 9, S78–S84.
Kuriki, S., Kanda, S., & Hirata, Y. (2006). Effects of musical experience on different
components of MEG responses elicited by sequential piano-tones and chords. Journal of
Neuroscience, 26, 4046–4053.
Lappe, C., Herholz, S. C., Trainor, L. J., & Pantev, C. (2008). Cortical plasticity induced by
short-term unimodal and multimodal musical training. Journal of Neuroscience, 28, 9632–
9639.
Large, E. W., & Jones, M. R. (1999). The dynamics of attending: How people track time-
varying events. Psychological Review, 106, 119–159.
Lecanuet, J. P. (1996). Fetal sensory competencies. European Journal of Obstetrics and
Gynecology, 68, 1–23.
Levi, E. C., & Werner, L. A. (1996). Amplitude modulation detection in infancy: Update on
3-month-olds. Abstracts of the Association for Research in Otolaryngology, 19, 142.
Limb, C. J., Kemeny, S., Ortigoza, E. B., Rouhani, S., & Braun, A. R. (2006). Left
hemispheric lateralization of brain activity during passive rhythm perception in
musicians. Anatomical Record Part A: Discoveries in Molecular, Cellular, and
Evolutionary Biology, 288A, 382–389.
Litovsky, R. Y. (1997). Developmental changes in the precedence effect: Estimates of
minimal audible angle. Journal of the Acoustical Society of America, 102, 1739–1745.
Luck, S. J. (2005). An introduction to the event-related potential technique. Cambridge,
MA: MIT Press.
Lynch, M. P., Eilers, R. E., Oller, D. K., & Urbano, R. C. (1990). Innateness, experience,
and music perception. Psychological Science, 1, 272–276.
Magne, C., Schön, D., & Besson, M. (2006). Musician children detect pitch violations in
both music and language better than nonmusician children: Behavioral and
electrophysiological approaches. Journal of Cognitive Neuroscience, 18, 199–211.
Masataka, N. (2006). Preference for consonance over dissonance by hearing newborns of
deaf parents and of hearing parents. Developmental Science, 9, 46–50.
Maxon, A. B., & Hochberg, I. (1982). Development of psychoacoustic behavior: Sensitivity
and discrimination. Ear and Hearing, 3, 301–308.
McAdams, S., & Bertoncini, J. (1997). Organization and discrimination of repeating sound
sequences by newborn infants. Journal of the Acoustical Society of America, 102, 2945–
2953.
McAuley, J. D., Jones, M. R., Holub, S., Johnston, H. M., & Miller, N. S. (2006). The time
of our lives: Life span development of timing and event tracking. Journal of Experimental
Psychology General, 135, 348–367.
Montgomery, C. R., & Clarkson, M. G. (1997). Infants’ pitch perception: Masking by low-
and high-frequency noises. Journal of the Acoustical Society of America, 102, 3665–3672.
Moore, D. R. (1983). Development of inferior colliculus and binaural audition. In R.
Romand (Ed.), Development of auditory and vestibular systems (pp. 121–166). San Diego,
CA: Academic Press.
Moore, J. K., & Linthicum, F. H. (2007). The human auditory system: A timeline of
development. International Journal of Audiology, 46, 460–478.
Moore, J. K., Perazzo, L. M., & Braun, A. (1995). Time course of axonal myelination in
human brainstem auditory pathway. Hearing Research, 87, 21–31.
Moreno, S., & Besson, M. (2006). Musical training and language-related brain electrical
activity in children. Psychophysiology, 43, 287–291.
Moreno, S., Bialystok, E., Barac, R., Schellenberg, E. G., Cepeda, N. J., & Chau, T. (2011).
Short-term music training enhances verbal intelligence and executive function.
Psychological Science, 22, 1425–1433.
Morrongiello, B. A. (1984). Auditory temporal pattern perception in 6- and 12-month-old
infants. Developmental Psychology, 20, 441–448.
Morrongiello, B. A. (1988). Infants’ localization of sounds in the horizontal plane:
Estimates of minimal audible angle. Developmental Psychology, 24, 8–13.
Morrongiello, B. A., Fenwick, K., & Chance, G. (1990). Sound localization acuity in very
young infants: An observer-based testing procedure. Developmental Psychology, 26, 75–
84.
Morrongiello, B. A., Fenwick, K. D., Hillier, L., & Chance, G. (1994). Sound localization in
newborn human infants. Developmental Psychobiology, 27, 519–538.
Morrongiello, B. A., & Rocca, P. T. (1987a). Infants’ localization of sounds in the
horizontal plane: Effects of auditory and visual cues. Child Development, 58, 918–927.
Morrongiello, B. A., & Rocca, P. T. (1987b). Infants’ localization of sounds in the median
sagittal plane: Effects of signal frequency. Journal of the Acoustical Society of America,
82, 900–905.
Morrongiello, B. A., & Rocca, P. T. (1987c). Infants’ localization of sounds in the median
vertical plane: Estimates of minimal audible angle. Journal of Experimental Child
Psychology, 43, 181–193.
Morrongiello, B. A., & Rocca, P. T. (1990). Infants’ localization of sounds within
hemifields: Estimates of minimum audible angle. Child Development, 61, 1258–1270.
Muir, D. W., Clifton, R. K., & Clarkson, M. G. (1989). The development of a human
auditory localization response: A U-shaped function. Canadian Journal of Psychology, 43,
199–216.
Muir, D., & Field, J. (1979). Newborn infants orient to sounds. Child Development, 50,
431–436.
Murphy, K. M., Beston, B. R., Boley, P. M., & Jones, D. G. (2005). Development of human
visual cortex: A balance between excitatory and inhibitory plasticity mechanisms.
Developmental Psychobiology, 46, 209–221.
Musacchia, G., Sams, M., Skoe, E., & Kraus, N. (2007). Musicians have enhanced
subcortical auditory and audiovisual processing of speech and music. Proceedings of the
National Academy of Sciences, 104, 15894–15898.
Musiek, F. E., Weihing, J. A., & Oxholm, V. B. (2007). Anatomy and physiology of the
peripheral auditory system. In R. J. Roeser, M. Valente, & H. Hosford-Dunn (Eds.),
Audiology diagnosis (Vol. 2, 2nd ed., pp. 50–56). New York: Thieme Medical Publishers.
Näätänen, R., Paavilainen, P., Rinne, T., & Alho, K. (2007). The mismatch negativity
(MMN) in basic research of central auditory processing: A review. Clinical
Neurophysiology, 118, 2544–2590.
Okabe, K. S., Tanaka, S., Hamada, H., Miura, T., & Funai, H. (1988). Acoustic impedance
measured on normal ears of children. Journal of the Acoustical Society of Japan, 9, 287–
294.
Olsho, L. W., Koch, E. G., Carter, E. A., Halpin, C. F., & Spetner, N. B. (1988). Pure-tone
sensitivity of human infants. Journal of the Acoustical Society of America, 84, 1316–1324.
Pantev, C., Makeig, S., Hoke, M., Galambos, R., Hampson, S., & Gallen, C. (1991). Human
auditory evoked gamma band magnetic fields. Proceedings of the National Academy of
Sciences, 88, 8996–9000.
Pantev, C., Oostenveld, R., Engelien, A., Ross, B., Roberts, L. E., & Hoke, M. (1998).
Increased auditory cortical representation in musicians. Nature, 392, 811–814.
Pantev, C., Roberts, L. E., Schulz, M., Engelien, A., & Ross, B. (2001). Timbre-specific
enhancement of auditory cortical representations in musicians. NeuroReport, 12, 169–
174.
Patel, A. D., Iversen, J. R., Bregman, M. R., & Schulz, I. (2009). Experimental evidence for
synchronization to a musical beat in a nonhuman animal. Current Biology, 19, 827–830.
Patterson, R. D., Uppenkamp, S., Johnsrude, I. S., & Griffiths, T. D. (2002). The
processing of temporal pitch and melody information in auditory cortex. Neuron, 36, 767–
776.
Penagos, H., Melcher, J. R., & Oxenham, A. J. (2004). A neural representation of pitch
salience in nonprimary human auditory cortex revealed with functional magnetic
resonance imaging. Journal of Neuroscience, 24, 6810–6815.
Phillips, D. P. (1999). Auditory gap detection, perceptual channels, and temporal
resolution in speech perception. Journal of the American Academy of Audiology, 10, 343–
354.
Phillips-Silver, J., & Trainor, L. J. (2005). Feeling the beat: Movement influences infant
rhythm perception. Science, 308, 1430.
Phillips-Silver, J., & Trainor, L. J. (2007). Hearing what the body feels: Auditory encoding
of rhythmic movement. Cognition, 105, 533–546.
Phillips-Silver, J., & Trainor, L. J. (2008). Vestibular influence on auditory metrical
interpretation. Brain and Cognition, 67, 94–102.
Picton, T. W., Alain, C., Otten, L., Ritter, W., & Achim, A. (2000). Mismatch negativity:
Different water in the same river. Audiology & Neurotology Special Issue: Mismatch
Negativity, 5, 111–139.
Plantinga, J., & Trainor, L. J. (2005). Memory for melody: Infants use a relative pitch
code. Cognition, 98, 1–11.
Plantinga, J., & Trainor, L. J. (2008). Infants’ memory for isolated tones and the effects of
interference. Music Perception, 26, 121–127.
Plomp, R., & Levelt, W. J. M. (1965). Tonal consonance and critical bandwidth. Journal of
the Acoustical Society of America, 38, 548–560.
Ponton, C. W., Eggermont, J. J., Kwong, B., & Don, M. (2000). Maturation of human
central auditory system activity: Evidence from multi-channel evoked potentials. Clinical
Neurophysiology, 111, 220–236.
Recanzone, G. H., Schreiner, C. E., & Merzenich, M. M. (1993). Plasticity in the
frequency representation of primary auditory cortex following discrimination training in
adult owl monkeys. Journal of Neuroscience, 13, 87–103.
Repp, B. H. (2005). Sensorimotor synchronization: A review of the tapping literature.
Psychonomic Bulletin & Review, 12, 969–992.
Repp, B. H., London, J., & Keller, P. E. (2005). Production and synchronization of uneven
rhythms at fast tempi. Music Perception, 23, 61–78.
Robertson, D., & Irvine, D. R. F. (1989). Plasticity of frequency organization in auditory
cortex of guinea pigs with partial unilateral deafness. Journal of Comparative Neurology,
282, 456–471.
Rock, A. M. L., Trainor, L. J., & Addison, T. (1999). Distinctive messages in infant-directed
lullabies and play songs. Developmental Psychology, 35, 527–534.
Rosen, S., van der Lely, H., Adlard, A., & Manganari, E. (2000). Backward masking in
children with and without language disorders. British Journal of Audiology, 34, 124.
Ross, D. A., Olden, I. R., Marks, L. E., & Gore, J. C. (2004). A nonmusical paradigm for
identifying absolute pitch possessors. Journal of the Acoustical Society of America, 116,
1792–1799.
Saffran, J. R., & Griepentrog, G. J. (2001). Absolute pitch in infant auditory learning:
Evidence for developmental reorganization. Developmental Psychology, 37, 74–85.
Saffran, J. R., Werker, J. F., & Werner, L. A. (2006). The infant’s auditory world: Hearing,
speech, and the beginnings of language. In R. Siegler & D. Kuhn (Eds.), Handbook of
child psychology: Vol. 2, Cognition, perception and language (pp. 58–108). New York:
Wiley.
Schachner, A., Brady, T. F., Pepperberg, I. M., & Hauser, M. D. (2009). Spontaneous
motor entrainment to music in multiple vocal mimicking species. Current Biology, 19, 1–6.
Schellenberg, E. G. (2004). Music lessons enhance IQ. Psychological Science, 15, 511–
514.
Schellenberg, E. G. (2011). Music lessons, emotional intelligence, and IQ. Music
Perception, 29, 185–194.
Schellenberg, E. G., Bigand, E., Poulin-Charronnat, B., Garnier, C., & Stevens, C. (2005).
Children’s implicit knowledge of harmony in Western music. Developmental Science, 8,
551–566.
Schellenberg, E. G., & Trainor, L. J. (1996). Sensory consonance and the perceptual
similarity of complex-tone harmonic intervals: Tests of adult and infant listeners. Journal
of the Acoustical Society of America, 100, 3321–3328.
Schneider, P., Scherg, M., Dosch, H. G., Specht, H. J., Gutschalk, A., & Rupp, A. (2002).
Morphology of Heschl’s gyrus reflects enhanced activation in the auditory cortex of
musicians. Nature Neuroscience, 5, 688–694.
Schneider, P., Sluming, V., Roberts, N., Scherg, M., Goebel, R., Specht, H. J., et al.
(2005). Structural and functional asymmetry of lateral Heschl’s gyrus reflects pitch
perception preference. Nature Neuroscience, 8, 1241–1247.
Schneider, B. A., & Trehub, S. E. (1992). Sources of developmental change in auditory
sensitivity. In L. A. Werner & E. W. Rubel (Eds.), Developmental psychoacoustics (pp. 3–
46). Washington, DC: American Psychological Association.
Schneider, B. A., Trehub, S. E., Morrongiello, B. A., & Thorpe, L. A. (1989).
Developmental changes in masked thresholds. Journal of the Acoustical Society of
America, 86, 1733–1742.
Schönwiesner, M., & Zatorre, R. J. (2008). Depth electrode recordings show double
dissociation between pitch processing in lateral Heschl’s gyrus and sound onset
processing in medial Heschl’s gyrus. Experimental Brain Research, 187, 97–105.
Shahin, A., Roberts, L. E., Pantev, C., Trainor, L. J., & Ross, B. (2005). Modulation of P2
auditory evoked responses by the spectral complexity of musical sounds. NeuroReport,
16, 1781–1785.
Shahin, A., Roberts, L. E., & Trainor, L. J. (2004). Enhancement of auditory cortical
development by musical experience in children. NeuroReport, 15, 1917–1921.
Shahin, A. J., Roberts, L. E., Chau, W., Trainor, L. J., & Miller, L. (2008). Musical training
leads to the development of timbre-specific gamma band activity. NeuroImage, 41, 113–
122.
Shore, S. E., Koehler, S., Oldakowski, M., Hughes, L. F., & Syed, S. (2008). Dorsal
cochlear nucleus responses to somatosensory stimulation are enhanced after noise-
induced hearing loss. European Journal of Neuroscience, 27, 155–168.
Sininger, Y. S., & Abdala, C. (1996). Auditory brainstem response thresholds of newborns
based on ear canal levels. Ear and Hearing, 17, 395–401.
Sininger, Y. S., Abdala, C., & Cone-Wesson, B. (1997). Auditory threshold sensitivity of
the human neonate as measured by the auditory brainstem response. Hearing Research,
104, 1–22.
Sinnott, J. M., & Aslin, R. N. (1985). Frequency and intensity discrimination in human
infants and adults. Journal of the Acoustical Society of America, 78, 1986–1992.
Sloboda, J. A. (1991). Music structure and emotional response: Some empirical findings.
Psychology of Music, 19, 110–120.
Sluming, V., Barrick, T., Howard, M., Cezayirli, E., Mayes, A., & Roberts, N. (2002).
Voxel-based morphometry reveals increased gray matter density in Broca’s area in male
symphony orchestra musicians. NeuroImage, 17, 1613–1622.
Smith, N. A., & Trainor, L. J. (2008). Infant-directed speech is modulated by infant
feedback. Infancy, 13, 410–420.
Smith, N. A., Trainor, L. J., Gray, K., Plantinga, J., & Shore, D. (2008). Stimulus, task and
learning effects on measures of temporal resolution: Implications for predictors of
language outcome. Journal of Speech Language and Hearing Research, 51, 1630–1642.
Smith, N. A., Trainor, L. J., & Shore, D. I. (2006). The development of temporal resolution:
Between-channel gap detection in infants and adults. Journal of Speech Language and
Hearing Research, 49, 1104–1123.
Snyder, J. S., Hannon, E. E., Large, E. W., & Christiansen, M. H. (2006). Synchronization
and continuation tapping to complex meters. Music Perception, 24, 135–145.
Snyder, J., & Krumhansl, C. L. (2001). Tapping to ragtime: Cues to pulse finding. Music
Perception, 18, 455–489.
Snyder, J. S., & Large, E. W. (2005). Gamma-band activity reflects the metric structure of
rhythmic tone sequences. Cognitive Brain Research, 24, 117–126.
Sonnadara, R., & Trainor, L. J. (2005). Event-related potentials elicited by occasional
changes in sound location across the first 8 months of life. Presented at the Society for
Psychophysiological Research 44th Annual Meeting, Lisbon.
Sussman, E., Čeponienė, R., Shestakova, A., Näätänen, R., & Winkler, I. (2001). Auditory
stream segregation processes operate similarly in school-aged children as adults. Hearing
Research, 153, 108–114.
Sussman, E., Wong, R., Horváth, J., Winkler, I., & Wang, W. (2007). The development of
the perceptual organization of sound by frequency separation in 5–11 year-old children.
Hearing Research, 225, 117–127.
Terhardt, E. (1984). The concept of musical consonance: A link between music and
psychoacoustics. Music Perception, 1, 276–295.
Tervaniemi, M., Just, V., Koelsch, S., Widmann, A., & Schröger, E. (2005). Pitch-
discrimination accuracy in musicians vs. non-musicians—an event-related potential and
behavioral study. Experimental Brain Research, 161, 1–10.
Tew, S., Fujioka, T., He, C., & Trainor, L. (2009). Neural representation of transposed
melody in infants at 6 months of age. Annals of the New York Academy of Sciences, 1169,
287–290.
Thompson, N. C., Cranford, J. L., & Hoyer, E. (1999). Brief-tone frequency discrimination
by children. Journal of Speech Language and Hearing Research, 42, 1061–1068.
Tillmann, B., Bigand, E., Escoffier, N., & Lalitte, P. (2006). The influence of musical
relatedness on timbre discrimination. European Journal of Cognitive Psychology, 18, 343–
358.
Trainor, L. J. (1996). Infant preferences for infant-directed versus noninfant-directed
playsongs and lullabies. Infant Behavior and Development, 19, 83–92.
Trainor, L. J. (1997). The effect of frequency ratio on infants’ and adults’ discrimination of
simultaneous intervals. Journal of Experimental Psychology: Human Perception and
Performance, 23, 1427–1438.
Trainor, L. J. (2005). Are there critical periods for music development? Developmental
Psychobiology, 46, 262–278.
Trainor, L. J., & Adams, B. (2000). Infants’ and adults’ use of duration and intensity cues
in the segmentation of tone patterns. Perception & Psychophysics, 62, 333–340.
Trainor, L. J., Clark, E. D., Huntley, A., & Adams, B. (1997). The acoustic basis of
preferences for infant-directed singing. Infant Behavior and Development, 20, 383–396.
Trainor, L. J., & Corrigall, K. A. (2010). Music acquisition and effects of musical
experience. In M. Riess-Jones & R. R. Fay (Eds.), Springer Handbook of Auditory
Research: Music Perception (pp. 89–128). Heidelberg: Springer.
Trainor, L. J., Desjardins, R. N., & Rockel, C. (1999). A comparison of contour and interval
processing in musicians and nonmusicians using event-related potentials. Australian
Journal of Psychology Special Issue: Music as a Brain and Behavioural System, 51, 147–
153.
Trainor, L. J., Gao, X., Lei, J., Lehtovaara, K., & Harris, L. R. (2009). The primal role of
the vestibular system in determining musical rhythm. Cortex, 45, 35–43.
Trainor, L. J., & Heinmiller, B. M. (1998). The development of evaluative responses to
music: Infants prefer to listen to consonance over dissonance. Infant Behavior &
Development, 21, 77–88.
Trainor, L. J., Lee, K., & Bosnyak, D. J. (2011). Cortical plasticity in 4-month-old infants:
Specific effects of experience with musical timbres. Brain Topography, 24, 192–203.
Trainor, L. J., Marie, C., Gerry, D., Whiskin, E., & Unrau, A. (2012). Becoming musically
enculturated: Effects of music classes for infants on brain and behavior. Annals of the
New York Academy of Sciences, 1252, 129–138.
Trainor, L. J., McDonald, K. L., & Alain, C. (2002). Automatic and controlled processing of
melodic contour and interval information measured by electrical brain activity. Journal of
Cognitive Neuroscience, 14, 430–442.
Trainor, L. J., McFadden, M., Hodgson, L., Darragh, L., Barlow, J., Matsos, L., et al.
(2003). Changes in auditory cortex and the development of mismatch negativity between
2 and 6 months of age. International Journal of Psychophysiology, 51, 5–15.
Trainor, L. J., & Schmidt, L. A. (2003). Processing emotions induced by music. In I. Peretz
& R. Zatorre (Eds.), The cognitive neuroscience of music (pp. 311–324). New York: Oxford
University Press.
Trainor, L. J., Shahin, A., & Roberts, L. E. (2003). Effects of musical training on the
auditory cortex in children. Annals of the New York Academy of Sciences, 999, 506–513.
Trainor, L. J., Shahin, A., & Roberts, L. E. (2009). Understanding the benefits of musical
training: Effects on oscillatory brain activity. Annals of the New York Academy of
Sciences, 1169, 133–142.
Trainor, L. J., & Trehub, S. E. (1992). A comparison of infants’ and adults’ sensitivity to
Western musical structure. Journal of Experimental Psychology: Human Perception and
Performance, 18, 394–402.
Trainor, L. J., & Trehub, S. E. (1994). Key membership and implied harmony in Western
tonal music: Developmental perspectives. Perception & Psychophysics, 56, 125–132.
Trainor, L. J., Tsang, C. D., & Cheung, V. H. W. (2002). Preference for consonance in 2-
and 4-month-old infants. Music Perception, 20, 187–194.
Trainor, L. J., & Unrau, A. J. (2012). Development of pitch and music perception. In L.
Werner, R. R. Fay & A. N. Popper (Eds.), Springer Handbook of Auditory Research:
Human Auditory Development (pp. 223–254). New York: Springer.
Trainor, L. J., & Zacharias, C. A. (1998). Infants prefer higher-pitched singing. Infant
Behavior and Development, 21, 799–805.
Trainor, L. J., & Zatorre, R. (2009). The neurobiological basis of musical expectations:
From probabilities to emotional meaning. In S. Hallam, I. Cross, & M. Thaut (Eds.),
Oxford handbook of music psychology (pp. 171–183). Oxford: Oxford University Press.
Tramo, M. J., Cariani, P. A., Delgutte, B., & Braida, L. D. (2001). Neurobiological
foundations for the theory of harmony in western tonal music. In R. J. Zatorre & I. Peretz
(Eds.), The biological foundations of music (pp. 92–116). New York: New York Academy of
Sciences.
Trehub, S. E. (2009). Music lessons from infants. In S. Hallam, I. Cross, & M. Thaut (Eds.),
Oxford handbook of music psychology (pp. 229–234). Oxford: Oxford University Press.
Trehub, S. E., Bull, D., & Thorpe, L. A. (1984). Infants’ perception of melodies: The role of
melodic contour. Child Development, 55, 821–830.
Trehub, S. E., Cohen, A. J., Thorpe, L. A., & Morrongiello, B. A. (1986). Development of
the perception of musical relations: Semitone and diatonic structure. Journal of
Experimental Psychology: Human Perception and Performance, 12, 295–301.
Trehub, S. E., Endman, M. W., & Thorpe, L. A. (1990). Infants’ perception of timbre:
Classification of complex tones by spectral structure. Journal of Experimental Child
Psychology, 49, 300–313.
Trehub, S. E., Schneider, B. A., & Henderson, J. (1995). Gap detection in infants,
children, and adults. Journal of the Acoustical Society of America, 98, 2532–2541.
Trehub, S. E., Schellenberg, E. G., & Kamenetsky, S. B. (1999). Infants’ and adults’
perception of scale structure. Journal of Experimental Psychology: Human Perception and
Performance, 25, 965–975.
Trehub, S. E., Schneider, B. A., Morrongiello, B. A., & Thorpe, L. A. (1988). Auditory
sensitivity in school-age children. Journal of Experimental Child Psychology, 46, 273–285.
Trehub, S. E., Schneider, B. A., Thorpe, L. A., & Judge, P. (1991). Observational measures
of auditory sensitivity in early infancy. Developmental Psychology, 27, 40–49.
Trehub, S. E., & Thorpe, L. A. (1989). Infants’ perception of rhythm: Categorization of
auditory sequences by temporal structure. Canadian Journal of Psychology Special Issue:
Infant Perceptual Development, 43, 217–229.
Trehub, S. E., & Trainor, L. J. (1998). Singing to infants: Lullabies and playsongs.
Advances in Infancy Research, 12, 43–77.
Trehub, S. E., Unyk, A. M., & Trainor, L. J. (1993a). Adults identify infant-directed music
across cultures. Infant Behavior & Development, 16, 193–211.
Trehub, S. E., Unyk, A. M., & Trainor, L. J. (1993b). Maternal singing in cross-cultural
perspective. Infant Behavior & Development, 16, 285–295.
Tsang, C. D., & Trainor, L. J. (2002). Spectral slope discrimination in infancy: Sensitivity
to socially important timbres. Infant Behavior and Development, 25, 183–194.
Unyk, A. M., Trehub, S. E., Trainor, L. J., & Schellenberg, E. G. (1992). Lullabies and
simplicity: A cross-cultural perspective. Psychology of Music, 20, 15–28.
Viemeister, N. F. (1979). Temporal modulation transfer functions based upon modulation
thresholds. Journal of the Acoustical Society of America, 66, 1364–1380.
Viemeister, N. F., & Schlauch, R. S. (1992). Issues in infant psychoacoustics. In L. A.
Werner & E. W. Rubel (Eds.), Developmental psychoacoustics (pp. 191–210). Washington,
DC: American Psychological Association.
Volkova, A., Trehub, S. E., & Schellenberg, E. G. (2006). Infants’ memory for musical
performances. Developmental Science, 9, 583–589.
Vuust, P., Pallesen, K. J., Bailey, C., Van Zuijen, T. L., Gjedde, A., Roepstorff, A., et al.
(2005). To musicians the message is in the meter: Pre-attentive neural responses to
incongruent rhythm are left-lateralized in musicians. NeuroImage, 24, 560–564.
Weir, C. (1976). Auditory frequency sensitivity in the neonate: A signal detection analysis.
Journal of Experimental Child Psychology, 21, 219–225.
Weir, C. (1979). Auditory frequency sensitivity of human newborns: Some data with
improved acoustic and behavioral controls. Perception & Psychophysics, 26, 287–294.
Werner, L. A. (1992). Interpreting developmental psychoacoustics. In L. A. Werner & E.
W. Rubel (Eds.), Developmental psychoacoustics (pp. 47–88). Washington, DC: American
Psychological Association.
Werner, L. A. (2007). Human auditory development. In R. Hoy, P. Dallos, & D. Oertel
(Eds.), The senses: A comprehensive reference, Vol. 3: Audition (pp. 871–894). New York:
Academic Press.
Werner, L. A., Folsom, R. C., & Mancl, L. R. (1993a). The relationship between auditory
brainstem response and behavioral thresholds in normal hearing infants and adults.
Hearing Research, 68, 131–141.
Werner, L. A., Folsom, R. C., & Mancl, L. R. (1993b). The relationship between auditory
brainstem response latencies and behavioral thresholds in normal hearing infants and
adults. Hearing Research, 77, 88–98.
Werner, L. A., & Gillenwater, J. M. (1990). Pure-tone sensitivity of 2- to 5-week-old
infants. Infant Behavior and Development, 13, 355–375.
Werner, L. A., & Holmer, N. M. (2002). Infant hearing thresholds measured in the ear
canal. Paper presented at the meeting of the American Auditory Society, Scottsdale, AZ.
Werner, L. A., & Marean, G. C. (1996). Human auditory development. Boulder, CO:
Westview.
Werner, L. A., Marean, G. C., Halpin, C. F., Spetner, N. B., & Gillenwater, J. M. (1992).
Infant auditory temporal acuity: Gap detection. Child Development, 63, 260–272.
Wightman, F., & Allen, P. (1992). Individual differences in auditory capability among
preschool children. In L. A. Werner & E. W. Rubel (Eds.), Developmental psychoacoustics
(pp. 113–133). Washington, DC: American Psychological Association.
Wilmington, D., Gray, L., & Jahrsdoerfer, R. (1994). Binaural processing after corrected
congenital unilateral conductive hearing loss. Hearing Research, 74, 99–114.
Wiltermuth, S. S., & Heath, C. (2009). Synchrony and cooperation. Psychological Science,
20, 1–5.
Winkler, I., Háden, G. P., Ladinig, O., Sziller, I., & Honing, H. (2009). Newborn infants
detect the beat in music. Proceedings of the National Academy of Sciences, 106, 2468–
2471.
Winkler, I., Kushnerenko, E., Čeponienė, R., Fellman, V., Huotilainen, M., et al. (2003).
Newborn infants can organize the auditory world. Proceedings of the National Academy
of Sciences USA, 100, 11812–11815.
Zatorre, R. J. (2001). Neural specializations for tonal processing. Annals of the New York
Academy of Sciences, 930, 193–210.
Zatorre, R. J., Bouffard, M., Ahad, P., & Belin, P. (2002). Where is ‘where’ in the human
auditory cortex? Nature Neuroscience, 5, 905–909.
Zentner, M. R., & Kagan, J. (1998). Infants’ perception of consonance and dissonance in
music. Infant Behavior & Development, 21, 483–492.
Zhang, L. I., Bao, S., & Merzenich, M. M. (2002). Disruption of primary auditory cortex by
synchronous auditory inputs during a critical period. Proceedings of the National
Academy of Sciences, 99, 2309–2314.
Laurel J. Trainor
Laurel J. Trainor, Department of Psychology, Neuroscience, and Behaviour,
McMaster University, Hamilton, Ontario, Canada; Rotman Research Institute,
Baycrest Hospital, Toronto, Canada
Chao He
Chao He, Rotman Research Institute, McMaster University, Canada