This article was downloaded by: [Cummings, Alycia], 30 June 2009 (subscription number 912835986).
Psychology Press, Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Language Learning and Development
Publication details, including instructions for authors and subscription information:
Infants' Recognition of Meaningful Verbal and Nonverbal Sounds
Alycia Cummings (San Diego State University and University of California—San Diego; University of North Dakota); Ayse Pinar Saygin (Institute of Cognitive Neuroscience, University College London; University of California—San Diego); Elizabeth Bates (University of California—San Diego); Frederic Dick (University of California—San Diego; Birkbeck College, University of London)
Online Publication Date: 01 July 2009
To cite this article: Cummings, Alycia, Saygin, Ayse Pinar, Bates, Elizabeth, & Dick, Frederic (2009). ‘Infants’ Recognition of Meaningful Verbal and Nonverbal Sounds’, Language Learning and Development, 5(3), 172–190.
To link to this Article: DOI: 10.1080/15475440902754086
Language Learning and Development, 5: 172–190, 2009
Copyright © Taylor & Francis Group, LLC
ISSN: 1547-5441 print / 1547-3341 online
DOI: 10.1080/15475440902754086
Infants’ Recognition of Meaningful Verbal
and Nonverbal Sounds
Alycia Cummings
San Diego State University and University of California—San Diego
University of North Dakota
Ayse Pinar Saygin
Institute of Cognitive Neuroscience, University College London
University of California—San Diego
Elizabeth Bates
University of California–San Diego
Frederic Dick
University of California–San Diego
Birkbeck College, University of London
To examine how young children recognize the association between two different types of meaningful
sounds and their visual referents, we compared 15-, 20-, and 25-month-old infants’ looking time
responses to familiar naturalistic environmental sounds (e.g., the sound of a dog barking) and their
empirically matched verbal descriptions (e.g., “Dog barking”) in an intermodal preferential looking
paradigm. Across all three age groups, performance was indistinguishable over the two domains.
Infants with the largest vocabularies were more accurate in processing the verbal phrases than the
environmental sounds. However, after taking into account each child’s verbal comprehension/
production and the onomatopoetic test items, all cross-domain differences disappeared. Correla-
tional analyses revealed that the processing of environmental sounds was tied to chronological age,
while the processing of speech was linked to verbal proficiency. Overall, while infants’ ability to rec-
ognize the two types of sounds did not differ behaviorally, the underlying processes may differ
depending on the type of auditory input.
What is the relationship between the development of speech and language comprehension and
the comprehension of other complex auditory signals? Previous developmental studies have
addressed this intriguing question in a variety of manners and modalities (e.g., Hollich et al.,
Correspondence should be addressed to Alycia Cummings, Center for Research in Language, 9500 Gilman Drive,
UCSD Mail Code 0526, La Jolla, CA 92093-0526. E-mail:
2000; Gogate & Bahrick, 1998; Gogate, Bolzani, & Betancourt, 2006; Gogate, Walker-Andrews, &
Bahrick, 2000), though few have directly compared how infants recognize sound–object associa-
tions for nonverbal sounds versus language. Rather, most have contrasted infants’ preference for
speech and nonspeech sounds. For instance, even at a young age (2.5 to 6.5 months) infants
show a preference for natural speech sounds over synthetic sine-wave speech (Vouloumanos &
Werker, 2004); 9-month-old infants also appear to show a preference for the human voice, as
opposed to similar sounds conveyed by musical instruments (Glenn, Cunningham, & Joyce,
1981). However, these studies do not directly address the infant’s use and understanding of
speech as a means of conveying semantic information. This is important to understand since the
ability to recognize the association between a sound and/or word and an object is a likely precursor
to word comprehension, as knowing that a word can “stand for” an object is the first step toward
developing language (Golinkoff, Mervis, & Hirsh-Pasek, 1994).
Indeed, studies examining the development of intermodal semantic associations suggest that
for much of the second year of life infants will accept gestures, spoken words, and nonverbal
sounds as possible object associates when these are presented in an appropriate context. Early in
development, perceptual saliency can assist young infants’ (7- to 12-month-olds) associative
abilities (Bahrick, 1994; Gogate & Bahrick, 1998; Gogate et al., 2006; Hollich et al., 2000).
However, sometime after their first birthday, infants start to attend to and use social and referential
cues for symbolic associations. For instance, Hollich and colleagues (2000) and Roberts (1995)
found that 15-month-old infants developed a categorization bias for novel objects when previous
object presentations were predictably paired with either instrumental music or speech sounds.
Similarly, results of Namy (2001), Namy and Waxman (1998, 2002), and Namy, Acredolo, and
Goodwyn (2000) have shown that pictograms, gestures, and nonverbal sounds were equally
likely to be learned as labels for object categories at 17 or 18 months. In particular, Campbell
and Namy (2003) showed that both at 13 and 18 months, toddlers were equally likely to recog-
nize associations between novel objects and either novel words or novel, relatively complex
artificial sounds (e.g., a series of tone pips) when either type of sound was linked to social and
referential cues during learning. However, neither words nor nonverbal sounds were associated
with the objects when such social or referential cues were not present during learning (see also
Balaban & Waxman, 1997, and Xu, 2002, for instances of infants not categorizing objects in the
absence of referential cues).
As children get older, they become less likely to accept a nonlinguistic label for an object
than they are a word. For instance, Woodward and Hoyne (1999) found that 13-month-old
infants would learn a novel association between a nonverbal label (an artificially produced
sound) and an object, but 20-month-old infants did not; Namy and Waxman (1998) found
similar trends in gestural comprehension between 18 and 26 months. Likewise, in gesture
production, Iverson, Capirci, and Caselli (1994) found a greater acceptance of verbal rather than
gestural labels between 16 and 20 months (see also Volterra, Iverson, & Emmorey, 1995).
Undoubtedly, multiple factors drive the emergence of this greater acceptance of spoken
words as labels (for discussion and review, see Volterra, Caselli, Capirci, & Pizzuto, 2005).
However, it is unclear whether this greater acceptance of spoken words holds when words
are compared with natural, complex, and frequently occurring sounds that convey meaningful
information, such as environmental sounds. Both word–object and environmental sound–object
associations occur frequently in everyday situations (e.g., Ballas & Howard, 1987), and this
frequency of occurrence may be important for learning meaningful sound–object relationships.
However, the relationship between objects and their associated verbal and environmental sound
labels are somewhat different. Environmental sounds can be defined as sounds generated by real
events—for example, a dog barking or a drill boring through wood—that gain sense or meaning
by their association with those events (Ballas & Howard, 1987). Thus, individual environmental
sounds are typically causally bound to the sound source or referent, unlike the arbitrary linkage
between a word’s pronunciation and its referent. Environmental sounds might be considered
icons or indexes, rather than true symbols.
There is wide variation in how environmental sounds are generated. They can be produced by
live beings (e.g., dogs, cats) or by animate beings acting upon inanimate objects (e.g., playing an
instrument, using a tool). Inanimate objects are also able to produce a sound on their own
without the intervention of an animate being (e.g., water running in a river, alarm clock ringing).
With the advent of electronic toys, books, and other media, the linking of environmental sounds
to their source has become somewhat more abstract; for instance, it is much more likely for a
child in San Diego to hear the mooing of a cow from a picture book or video than on a farm.
Studies with adult subjects have shown that, like words, processing of individual environ-
mental sounds is modulated by contextual cues (Ballas & Howard, 1987), item familiarity, and
frequency (Ballas, 1993; Cycowicz & Friedman, 1998). Environmental sounds can prime
semantically related words and vice versa (Van Petten & Rheinfelder, 1995), and may also
prime other semantically related sounds (Stuart & Jones, 1995; but cf. Chiu & Schacter, 1995,
and Friedman, Cycowicz, & Dziobek, 2003, who showed priming from environmental sounds to
language stimuli, but no priming in the reverse direction). Spoken words and environmental
sounds share many spectral and temporal characteristics, and the recognition of both classes of
sounds breaks down in similar ways under acoustical degradation (Gygi, 2001). Finally, Saygin,
Dick, Wilson, Dronkers, and Bates (2003) have shown that in adult patients with aphasia,
the severity of patients’ speech comprehension deficits strongly predicts the severity of their
environmental sounds comprehension deficits.
Environmental sounds also differ from speech in several fundamental ways. The “lexicon” of
environmental sounds is small and semantically stereotyped; these sounds are also not easily
recombined into novel sound phrases (Ballas, 1993). There is wide individual variation in
exposure to different sounds (Gygi, 2001), and correspondingly, healthy adults show much
variability in their ability to recognize and identify these sounds (Saygin, Dick, & Bates, 2005).
Finally, the human vocal tract does not produce most environmental sounds. In fact, the neural
mechanisms of nonlinguistic environmental sounds that can and cannot be produced by the
human body may differ significantly (Aziz-Zadeh, Iacoboni, Zaidel, Wilson, & Mazziotta,
2004; Lewis, Brefczynski, Phinney, Janik, & DeYoe, 2005; Pizzamiglio et al., 2005).
Recognizing the association between environmental sounds and their paired objects or events
appears to utilize many of the same cognitive and/or neural mechanisms as those associated with
a verbal label, especially when task and stimulus demands are closely matched (reviewed in
Ballas, 1993; Saygin et al., 2003, 2005; Cummings, Ceponiene, Dick, Saygin, & Townsend,
2008; Cummings, Ceponiene, Koyama, Saygin, Townsend, & Dick, 2006). Indeed, spoken
language and environmental sounds comprehension appear to show similar trajectories in
typically developing school-age children (Dick, Saygin, Paulsen, Trauner, & Bates, 2004;
Cummings et al., 2008), as well as in children with language impairment and perinatal focal
lesions (Cummings, Ceponiene, Williams, Townsend, & Wulfeck, 2006; Borovsky, Saygin,
Cummings, & Dick, in preparation).
The acoustical and semantic similarities between environmental sounds and speech sounds
discussed above and the findings of previous studies have led us to explore the relationship
between the early development of speech processing and that of environmental sounds. Previous
studies have shown an emerging preference for speech compared to other complex sounds, but
these sounds were not naturally meaningful or referential. Although it has been shown that
adults and school age children may use similar resources for processing meaningful environ-
mental and verbal sounds, it is still possible that distinct mechanisms may direct the recognition
of the sound–object associations between these sounds early in development.
There is some evidence from early development that recognizing sound–object associations
may rely on domain-general perceptual mechanisms. For example, Bahrick (1983, 1994) dem-
onstrated how the perception of amodal relations in auditory-visual events could serve as a
building block for the detection of arbitrary auditory-visual mappings. Infants at 3 months of age
were able to identify color/shape changes in objects and pitch changes in sound, but it was not
until 7 months of age that the infants were able to identify the arbitrary relationship between
objects and sounds. Thus, it is possible that by first attending to invariant amodal relations,
infants learn about arbitrary intermodal relationships (Bahrick, 1992). Alternatively, there has
also been recent evidence that infants are better able to learn nonlinguistic information by first
learning about speech (Marcus, Fernandes, & Johnson, 2007, but see Dawson & Gerken, 2006;
Saffran, Pollak, Seibel, & Shkolnik, 2007).
Thus, it is important to examine not only the links between the receptive processing of verbal
and environmental sounds and overall (chronological) development, but also the links between
receptive abilities and the development of the child’s production of verbal sounds. Although no
previous study has examined the relationship of environmental sounds and productive vocabulary
size, previous looking-while-listening studies have reported positive relationships between
looking accuracy and productive vocabulary (e.g., Conboy & Thal, 2006; Zangl, Klarman, Thal,
Fernald, & Bates, 2005). Following a large sample of children longitudinally, Fernald, Perfors,
and Marchman (2006) observed that accuracy in spoken word recognition at 25 months was cor-
related with measures of lexical and grammatical development gathered from 12 to 25 months.
Moreover, Marchman and Fernald (2008) observed that infants’ vocabularies at 25 months are
strongly related to expressive language, IQ, and working memory skills tested at 8 years of age.
Thus, a link between vocabulary size and environmental sound–object associations might
suggest that an understanding of sound–object relationships, verbal or nonverbal, is linked to
more general language and cognitive development. Alternatively, we may see that productive
vocabulary growth is linked only to increases in language comprehension, suggesting that verbal
and nonverbal sound processing might be differentiated to some degree over early development.
Here, by testing infants before, during, and after the vocabulary spurt, we examined whether
infants are able to recognize associations to an object for verbal descriptions and environmental
sounds similarly, and whether this recognition would shift in response to dramatic gains in
language abilities due to the vocabulary spurt. In particular, we designed our study to address the
following questions, motivated by the literature summarized above: (a) Are infants and toddlers
better able to recognize the association between pictures of objects and speech labels, as compared
to pictures of objects and nonspeech sounds, even when the sounds are naturalistic and meaningful?;
(b) If there is no initial difference in the matching of meaningful speech sounds and environmental
sounds to associated pictures, will one emerge around 18 to 20 months of age (or later), as sug-
gested by the literature discussed above?; and (c) Will a child’s recognition of environmental
sounds and spoken words be predicted by chronological age, productive vocabulary size, or some
combination of the two?
To test infants’ ability to recognize sound–object associations, we used an intermodal preferen-
tial looking paradigm. We played either an environmental sound (e.g., the sound of a dog barking)
or the spoken language gloss of the sound (the phrase Dog Barking) and measured infants’ looking
to matching and mismatching pictures (in this case, a matching picture would be a dog, a mis-
matching picture would be a piano). Based on the results of previous studies of early linguistic
and nonlinguistic acquisition and development briefly presented above, we might expect younger
(15- and 20-month-old) infants to process environmental sounds and spoken language similarly,
whereas the oldest and/or most linguistically advanced children (e.g., 25-month-olds or infants
with very high productive vocabularies) would process the two types of sounds differently, more
readily recognizing language sounds than environmental sounds. Alternatively, given the results of
environmental sounds/language comparisons in older children and adults, the trajectories of auditory
linguistic and nonlinguistic processing might go hand in hand across this age span.
Participants were 60 typically developing, English-learning infants of three ages: twenty 15-
month-old children (9 female), twenty 20-month-old children (11 female), and twenty 25-month-
old children (12 female). Parents of the infants were asked to fill out the MacArthur-Bates Com-
municative Development Inventory (CDI; Fenson et al., 1994). The Words and Gestures version
was used for the 15-month-olds, while the Words and Sentences form was given to the parents of
the 20- and 25-month-olds. The Words and Gestures form tracks both infants’ comprehension and
production of words, but that form is valid only up to the age of 17 months. The Words and Sen-
tences form is valid from 16 to 30 months, but it tracks only productive vocabulary (due to the dif-
ficulty parents would have in keeping track of the hundreds or thousands of words older infants
know). Although it would have been ideal to know all our infant subjects’ receptive vocabularies,
our vocabulary assessment instruments limited us to using only productive vocabulary measures.
Verbal proficiency groups were defined by dividing the entire infant sample into thirds based on
their productive vocabulary: Low (< 50 words), Mid (51 to 261 words), and High (≥ 262 words).1
Experimental Design and Materials
There was a single within-subjects factor, Domain (Verbal or Environmental Sounds); this was
crossed with the between-subjects factors Age Group (15-, 20-, 25-month-old) or CDI Verbal
1The parents of four of the infants (two 15-month-olds and two 20-month-olds) did not return the CDIs. One
20-month-old’s mother did keep track of his vocabulary and stated that he was producing more than 400 words at the
time of testing, so he was informally placed in the High vocabulary group. The parents of the two 15-month-olds
reported that their infants spoke only a few words, so those two children were informally grouped into the Low vocabu-
lary group. The last 20-month-old child was reported to speak more than 50 words but was not overly verbal, thus she
was informally placed into the Mid vocabulary group. These four children were not included in the analyses that examined
the production of specific words (see Results for more detail).
Proficiency (Low, Medium, High). The experiment contained three types of stimuli: full-color pho-
tographs, nonverbal environmental sounds, and speech sounds. All of the sound tokens, nonverbal
and verbal, were presented after the attention-grabbing carrier phrase, “Look!”. The duration of a
single trial—composed of carrier phrase, sound, and intervening silences—was 4000 msec (see
Supplemental Web Materials for individual sound
durations and sample stimuli). All auditory stimuli were digitized at a 44.1 kHz sampling
rate with 16-bit resolution. The average intensity of all auditory stimuli was normalized to 65 dB SPL using the Praat
4.4.30 computer program (Boersma, 2001). All sound–picture pairs were presented for the same
amount of time.
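The intensity normalization described above was performed in Praat; the underlying operation can be sketched in plain Python. This is an illustrative approximation only: actual SPL depends on playback calibration, so the full-scale reference level below (91 dB) is an arbitrary assumption, and `normalize_to_db`, `rms`, and the test tone are hypothetical, not part of the study's pipeline.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_to_db(samples, target_db, full_scale_db=91.0):
    """Scale samples so their RMS corresponds to target_db.

    full_scale_db is the (assumed) SPL produced by a full-scale signal on
    the playback system; real calibration is hardware-specific.
    """
    target_rms = 10 ** ((target_db - full_scale_db) / 20.0)
    gain = target_rms / rms(samples)
    return [s * gain for s in samples]

# A 1 kHz test tone, one second long at the stimuli's 44.1 kHz sampling rate.
sr = 44100
tone = [0.5 * math.sin(2 * math.pi * 1000 * t / sr) for t in range(sr)]
levelled = normalize_to_db(tone, 65.0)
```

The same scaling applied per file, followed by zero-padding to a fixed duration, would reproduce the equal-length, equal-intensity trials described in this section.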
Nonverbal environmental sounds. The sounds came from many different nonlinguistic
sources: animal cries (n = 8, e.g., cow mooing), human nonverbal vocalizations (n = 4, e.g., sneez-
ing), vehicle noises (n = 2, e.g., airplane flying), alarms/alerts (n = 3, e.g., phone ringing), water
sounds (n = 1, toilet flushing), and music (n = 2, e.g., piano playing) (see Appendix for a complete
list of stimuli). All environmental sound trials were edited to 4000 ms in length, including the carrier
phrase, “Look!”. The average length of the environmental sound labels was 3149 ms (Range 1397
to 4000 ms), including 964 ms for the “Look!” and its subsequent silence prior to the onset of the
environmental sound. In order to extend the files to the required 4000 ms, 11 of the environmental
sound labels had “silence” appended to the end of the sound files. The addition of the silence
standardized the amount of time the infants were exposed to each picture set (2000 ms prior to the
onset of the sound, followed by 4000 ms of auditory input), even if the amount of audible auditory
information varied across trials.
Sounds were selected to be familiar and easily recognizable based on extensive norming studies
(detailed in Saygin et al., 2005) and a novel “sound exposure” questionnaire that was completed by
the children’s parents. This questionnaire asked parents to rate how often their child heard the envi-
ronmental sounds presented in the study (e.g., every day, three to four times/week, once a week,
twice a month, once a month, every 3 to 4 months, twice a year, yearly, or never). Because there was
little consistency in parents’ responses (perhaps due to problems with the survey itself, as well as
with differences in kind and degree of exposure to sounds), we did not have enough confidence in
these data to use them as covariates.
Verbal labels. Speech labels were derived from a previous norming study, Saygin et al.
(2005), and were of the form ‘noun verb-ing’ (see Appendix). Labels were read by an adult male
North American English speaker, digitally recorded, and edited to 4000 ms in length, including the
carrier phrase, “Look!”. The average duration of the verbal labels was 2431 ms (range 2230 to 2913
ms), including 964 ms for the “Look!” and its subsequent silence prior to the onset of the verbal
label. In order to extend the files to the required 4000 ms, all of the verbal labels had “silence”
appended to the end of the sound files. Fifteen (of 20) of these labels were matched to items on the
MacArthur-Bates CDI, allowing us to calculate whether each label was in a child’s receptive or
productive vocabulary.
Visual stimuli. Pictures were full-color, digitized photos (280 × 320 pixels) of common
objects corresponding to the auditory stimuli. Pictures were paired together so that visually their
images did not overlap, thus lessening the potential confusion infants would have with two
perceptually similar items (e.g., two animals with four feet: dog versus cat). These pairings
remained the same throughout the experiment and were the same for all infants. Trials were
quasi-randomized, where all 20 targets were presented once (either with a word or an environmental
sound) before being repeated (presented with the other auditory stimulus). In addition, each picture
was presented twice on each side; as the “target” picture, it was the target once on the left side and
once on the right side to control for looking-side bias. Over the course of the experiment, each
picture appeared four times in a counterbalanced design: picture type (target/distracter) × domain (verbal/environmental sound).
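The counterbalancing constraints described above (each target heard once in each domain, once on each side, with all targets presented before any repeats) can be sketched as follows. This is a hypothetical reconstruction with invented item names and function names, not the authors' Babylab software:

```python
import random

# Illustrative subset of the 20 target items (hypothetical names).
TARGETS = ["dog", "cat", "piano", "phone"]
DOMAINS = ("verbal", "environmental")

def build_trials(targets, seed=0):
    """Quasi-randomized test order: every target appears once (in a random
    domain, on a random side) before any target repeats; the repeat uses
    the other domain and the other side."""
    rng = random.Random(seed)
    other = {"verbal": "environmental", "environmental": "verbal",
             "left": "right", "right": "left"}
    first_domain = {t: rng.choice(DOMAINS) for t in targets}
    first_side = {t: rng.choice(("left", "right")) for t in targets}
    block1 = [(t, first_domain[t], first_side[t]) for t in targets]
    block2 = [(t, other[first_domain[t]], other[first_side[t]]) for t in targets]
    rng.shuffle(block1)
    rng.shuffle(block2)
    return block1 + block2

trials = build_trials(TARGETS)
```

With the full 20-item set this yields the 40 test trials reported below, each target balanced across domain and screen side.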
This experiment was run on a PC using the Babylab Preferential Looking software, custom
developed by the Oxford Babylab (Oxford, England). The participants sat on their mothers’ laps
in a darkened room in front of a computer booth display (Figure 1). The two computer monitors,
90 cm apart, and the loudspeaker were placed in the front panel of the booth for stimuli presen-
tation. The infants sat 72 cm away from the monitors, which were placed at infant eye level. A
small video camera placed between the monitors recorded subjects’ eye movements. The
camera was connected to a VCR in the adjacent observation area, where the computer running
the study was located.
The experiment consisted of 47 experimenter-advanced trials (Figure 2). In each trial, infants
were presented with two pictures on the two computer screens. After 2000 ms, the sound stimu-
lus (either verbal or nonverbal) was presented through the loudspeaker. The infants’ looking
FIGURE 1 Infants visually matched each verbal or environmental
sound to 1 of 2 pictures presented on computer screens in front of them.
A small video camera placed between the monitors recorded subjects’
eye movements during the experiment.
task was to visually match each verbal or environmental sound to one of two pictures presented
on computer screens in front of them. For example, the target picture “dog” appeared twice: once
with a verbal sound (‘dog barking’) and once with a nonlinguistic sound (the sound of a dog
barking). Overall, 20 pictures were targets, for a total of 40 test trials.2 An additional seven
attention-getting trials (Hey, you’re doing great!, paired with “fun” pictures, such as Cookie
Monster and Thomas the Train) were interspersed throughout the experiment to keep infants
on task.
During the session a digital time code accurate to 33 ms was superimposed on the video-
tape. Videos were digitized by Apple iMovie 2.1.1, converted to Quicktime 6.2 movie files,
and coded using custom-designed software (from the Stanford University Center for Infant
Studies) on a Macintosh G3 by a highly trained observer, blind to the side of the target.
Each session was analyzed frame by frame, assessing whether the child was looking at the
target picture, distracter picture, or neither picture. Accuracy was measured by calculating
the amount of time in each time window an infant looked at the target picture divided by the
total amount of time the child fixated on both the target and distracter pictures. Time win-
dows were defined as Early (533 to 1500 msec) and Late (1533 to 2500 msec), with time
measured from the onset of each phrase or environmental sound (e.g., Fernald et al., 2006;
Houston-Price, Plunkett, & Harris, 2005; see Figure 3). In other words, the analysis win-
dows did not include the carrier phrase, “Look!”. Rather, they began at the onset of the
verbal label (e.g., “dog barking”) or the environmental sound (e.g., the sound of a dog barking).
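The accuracy measure defined above (time looking at the target divided by total time on target and distracter, within the Early and Late windows) can be sketched from frame-by-frame looking codes. The 33 ms frame duration matches the video time code; the coding symbols and function name are hypothetical:

```python
# Frame-by-frame looking codes, one per 33 ms video frame from sound onset:
# "T" = looking at target, "D" = distracter, "A" = away (hypothetical scheme).
FRAME_MS = 33

def window_accuracy(frames, start_ms, end_ms):
    """Proportion of target looking in [start_ms, end_ms]: time on target
    divided by total time on target and distracter; None if no looking."""
    target = distracter = 0
    for i, code in enumerate(frames):
        t = i * FRAME_MS
        if start_ms <= t <= end_ms:
            if code == "T":
                target += 1
            elif code == "D":
                distracter += 1
    looking = target + distracter
    return target / looking if looking else None

# Toy trial: the child starts on the distracter, then shifts to the target.
frames = ["D"] * 20 + ["T"] * 60
early = window_accuracy(frames, 533, 1500)   # Early window, as in the text
late = window_accuracy(frames, 1533, 2500)   # Late window
```

In this toy trial the shift to the target happens during the Early window, so Early accuracy is below 1.0 while Late accuracy is perfect, the same Early-to-Late improvement pattern reported in the Results.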
2The present study used many more different stimuli pairs than most preferential looking studies. However, the large
number of items did not necessarily decrease the infants’ ability to semantically map the auditory stimuli onto the visual
stimuli. For instance, Killing and Bishop (2008) used a bimodal preferential looking paradigm, similar to that used in the
present study. They presented their infants (20 to 24 months) with 16 different pairs of items, which were also semanti-
cally and visually distinct (e.g., comb versus star, tongue versus boot, clock versus bee), and demonstrated interpretable
results, as in the present study.
FIGURE 2 Two pictures first appeared on two computer screens for
2000 ms. The pictures remained visible while the auditory stimulus
(verbal or environmental sound label) was presented for 4000 ms. At the
conclusion of the auditory stimulus, both pictures disappeared together
from the computer screens. The experimenter manually started the next
trial as long as the child was attending to the computer screens.
Data from the first 500 ms of each environmental sound and verbal label are not reported as this
time window did not show significant differences between modalities or age groups. (It is standard
in infant preferential looking paradigms to disregard at least the initial 367 ms of an analysis
window due to the amount of time infants need to initiate eye movements; e.g., Fernald, Pinto,
Swingley, Weinberg, & McRoberts, 1998; Fernald et al., 2006).
Two different sets of mixed-effects analyses of variance (ANOVA) were performed to
examine looking time accuracy: (a) a 2-within-subjects, Domain (Verbal/Nonverbal) and
Window (Early/Late), by 1-between-subjects, Age (15/20/25), ANOVA; and (b) a 2-within-
subjects, Domain (Verbal/Nonverbal) and Window (Early/Late), by 1-between-subjects,
CDI Verbal Proficiency (Low/Mid/High), ANOVA. Due to the small number of varied
stimulus items, ANOVAs were carried out with both subjects (F1) and items (F2) as random
factors. When appropriate, Bonferroni corrections were applied.
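The F1/F2 logic above amounts to collapsing the same trial-level accuracies two ways, once to a mean per subject (subjects as the random factor) and once to a mean per item, and running a separate ANOVA on each. A minimal sketch with invented data, using a one-way between-groups F over subject means to stand in for the by-subjects Age analysis:

```python
from collections import defaultdict

# Hypothetical trial-level records: (subject, age group, item, accuracy).
trials = [
    ("s1", 15, "dog", 0.48), ("s1", 15, "cat", 0.55),
    ("s2", 15, "dog", 0.50), ("s2", 15, "cat", 0.57),
    ("s3", 20, "dog", 0.63), ("s3", 20, "cat", 0.67),
    ("s4", 20, "dog", 0.60), ("s4", 20, "cat", 0.64),
    ("s5", 25, "dog", 0.71), ("s5", 25, "cat", 0.74),
    ("s6", 25, "dog", 0.69), ("s6", 25, "cat", 0.73),
]

def unit_means(trials, index):
    """Collapse trials to one mean per unit: index 0 = subject (for F1),
    index 2 = item (for F2)."""
    acc = defaultdict(list)
    for row in trials:
        acc[row[index]].append(row[3])
    return {u: sum(v) / len(v) for u, v in acc.items()}

def one_way_F(groups):
    """One-way between-groups F statistic over lists of unit means."""
    scores = [s for g in groups for s in g]
    grand = sum(scores) / len(scores)
    k, n = len(groups), len(scores)
    ss_b = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_w = sum((s - sum(g) / len(g)) ** 2 for g in groups for s in g)
    return (ss_b / (k - 1)) / (ss_w / (n - k))

subj_age = {row[0]: row[1] for row in trials}
subj_means = unit_means(trials, 0)
groups = [[m for s, m in subj_means.items() if subj_age[s] == a]
          for a in (15, 20, 25)]
F1 = one_way_F(groups)          # by-subjects test of Age
item_means = unit_means(trials, 2)  # input to the by-items (F2) analysis
```

The actual analyses also include the within-subjects Domain and Window factors, which require repeated-measures terms this sketch omits.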
The predictability of infants’ performance on the task was also tested with multiple
regression using age and productive vocabulary as predictors of looking accuracy. All
estimates of unique variance were derived from adjusted r².
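The regression step can be illustrated with a small self-contained sketch: ordinary least squares with age and productive vocabulary as predictors of looking accuracy, and adjusted r² computed from the fit. The data below are invented for illustration; the paper's actual values appear in the Results.

```python
def ols_adjusted_r2(y, xs):
    """Fit y = b0 + b1*x1 + ... by least squares (normal equations solved
    by Gaussian elimination) and return adjusted r-squared."""
    n, k = len(y), len(xs)
    p = k + 1
    X = [[1.0] + [x[i] for x in xs] for i in range(n)]
    # Normal equations: (X'X) b = X'y
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
         for i in range(p)]
    b = [sum(X[r][i] * y[r] for r in range(n)) for i in range(p)]
    # Forward elimination with partial pivoting
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    beta = [0.0] * p
    for i in range(p - 1, -1, -1):
        beta[i] = (b[i] - sum(A[i][j] * beta[j] for j in range(i + 1, p))) / A[i][i]
    fitted = [sum(beta[j] * X[r][j] for j in range(p)) for r in range(n)]
    ybar = sum(y) / n
    ss_res = sum((y[r] - fitted[r]) ** 2 for r in range(n))
    ss_tot = sum((v - ybar) ** 2 for v in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - p)

# Hypothetical data: accuracy predicted by age (months) and vocabulary size.
age = [15, 15, 20, 20, 25, 25]
vocab = [10, 40, 80, 200, 300, 450]
accuracy = [0.52, 0.55, 0.61, 0.66, 0.70, 0.74]
adj_r2 = ols_adjusted_r2(accuracy, [age, vocab])
```

Comparing the adjusted r² of the full model against models with each predictor dropped gives the unique-variance estimates the paragraph refers to.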
Looking Time Results
Age group analysis. Accuracy significantly improved with age, F1(2,57) = 12.84, p < .0001;
F2(2,38) = 19.50, p < .0001 (Table 1). Preplanned group comparisons using ANOVA revealed that
FIGURE 3 The Early Time Window ran from 533 ms after utterance/
sound onset to 1500 ms after onset. The Late Time Window ran from
1533 to 2500 ms after sentence onset. Note that the analyses did not
include the carrier phrase, “Look!”. Here, the 15-, 20-, and 25-month-old
infants’ responses to the Verbal trials are represented by dashed lines and
the Environmental Sound trials by solid lines.
both the 20- and 25-month-old infants performed significantly better than the 15-month-olds, 20
months: F1(1,38) = 9.71, p < .004; 25 months: F1(1,38) = 25.62, p < .0001, while not being statisti-
cally different from each other. Infants’ accuracy performance also improved significantly from the
Early to Late Window, F1(1,57) = 94.88, p < .0001; F2(1,19) = 42.25, p < .0001. No main effect of
Domain (i.e., verbal or environmental sound labels) was observed. No other accuracy effects were significant.
Verbal proficiency group analysis. Infants’ accuracy improved as their verbal proficiency
grew, F1(2,57) = 17.53, p < .0001; F2(2,38) = 18.6, p < .0001 (Table 1). Preplanned group compar-
isons using ANOVA revealed that infants in both the Mid and High verbal proficiency groups
were more accurate than those in the Low group, Mid: F1(1,38) = 27.45, p < .0001; High: F1(1,38) =
27.20, p < .0001, while not differing from each other. Infants’ accuracy also improved from the
Early to Late time window, F1(1,57) = 93.76, p < .0001; F2(1,19) = 42.44, p < .0001. As with the
Age group analyses, no main effect of Domain was observed.
Unlike age, there were differential effects of verbal proficiency on Verbal and Environ-
mental Sounds looking accuracy, as shown by the Domain × Verbal Proficiency interaction,
F1(2,57) = 6.59, p < .003; F2(2,38) = 3.38, p < .06 (Figure 4). Preplanned group comparisons
using ANOVA (Table 1) demonstrated that for Environmental Sound trials, the Low profi-
ciency group’s performance was significantly worse than the Mid group, F1(1,38) = 15.69, p <
.0002, and showed a strong trend for being less accurate than infants in the High group; the
Mid and High groups did not differ from each other. Similarly in the Verbal trials, infants in
the Low group were less accurate than those in both the Mid, F1(1,38) = 10.84, p < .002, and
High, F1(1,38) = 33.56, p < .0001, groups, who did not differ from each other. Interestingly,
when the individual proficiency groups were examined, only the High proficiency group
showed significant processing differences for speech and environmental sounds, F1(1,19) =
21.55, p < .0002, being more accurate in response to the Verbal labels. No other effects or
interactions were significant.
TABLE 1
Looking Accuracy Results

                              Domain            Early Window        Late Window
                   Overall   Sounds   Verbal    Sounds   Verbal    Sounds   Verbal
15 months           53.69    54.87    52.51     50.18    50.16     59.56    54.86
20 months           60.71¹   61.25    60.17     57.73    54.24     64.77    64.75
25 months           65.10²   64.38    65.84     58.35    61.38     70.39    70.30
Low Proficiency     52.60    54.72    50.49     49.73    47.97     59.72    53.01
Mid Proficiency     62.26³   64.22⁴   60.29⁵    58.70    55.31     69.75    65.28
High Proficiency    64.64³   61.54    67.74⁶,⁷  57.83    62.50     65.26    72.99

Note. Accuracy was calculated by dividing the amount of time infants looked at the target picture by the total amount of time infants looked at both the target and distracter pictures. All accuracy measures represent the percent of time infants attended to the target during the designated analysis window. Data are significant as compared to: ¹15 months, p < .003; ²15 months, p < .0001; ³Low Proficiency, p < .0001; ⁴Low Proficiency Sounds, p < .0003; ⁵Low Proficiency Verbal, p < .003; ⁶Low Proficiency Verbal, p < .0001; ⁷High Proficiency Sounds, p < .0003. F values are cited in the text.
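The accuracy measure described in the Note reduces to a single ratio of looking times. A one-line restatement (the function name is ours, for illustration only):

```python
def looking_accuracy(target_time, distracter_time):
    """Percent of total looking time (target + distracter) spent on the target picture."""
    return 100.0 * target_time / (target_time + distracter_time)

# An infant who looks 600 ms at the target and 400 ms at the distracter scores 60%.
acc = looking_accuracy(600, 400)
```

Values above 50% therefore indicate a preference for the named/labeled picture, which is why 50% serves as the chance baseline in the analyses below.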
A lack of item repetition across trials may have potentially influenced infants’ performance
in the task.3 Thus, a comparison to chance performance (i.e., 50% responding rate) is a prudent
check to establish a difference between systematic and random responding (i.e., whether or not
infants were recognizing the sound–object associations in the task), especially when comparatively
young and nonverbal infants are involved. As expected, for the 25-month-old group, t tests
showed that group accuracy was significantly greater than chance in each time window and
domain, Verbal Early: t(1,19) = 4.35, p < .0004; Verbal Late: t(1,19) = 8.10, p < .0001; Sounds
Early: t(1,19) = 4.89, p < .0002; Sounds Late: t(1,19) = 9.64, p < .0001. The 20-month-old
infants responded significantly greater than chance to the environmental sounds in both
windows, Early: t(1,19) = 5.21, p < .0001; Late: t(1,19) = 6.12, p < .0001, as well as to the
words in the Late window, t(1,19) = 5.48, p < .0001. The 15-month-olds’ accuracy was also
greater than chance for environmental sounds in the Late window, t(1,19) = 4.76, p < .0002, but
not for the words in either window. Thus, the word stimuli, particularly in the Early time window, did not appear to give the younger infants (15-month-olds, 20-month-olds) enough information to begin systematically processing the presented pictures.

3Relative to other preferential looking studies (e.g., Swingley, Pinto, & Fernald, 1999; Fernald et al., 1998), overall looking accuracy in our study was relatively low; this may have been driven by several factors. First, while many preferential looking studies test on a small set of items presented repeatedly, our experiment had 40 stimuli, each presented once. Thus, higher accuracy observed in other studies may be in part driven by opportunities for stimulus learning over multiple trials. In addition, because the two modalities were not separately blocked within the experiment, infants were not able to build up expectancies for the upcoming trial. The mixed trial design also may have decreased accuracy, since young children's performance can be modulated by stimulus predictability (Fernald & McRoberts, 1999).

FIGURE 4 While the Low and Mid proficiency groups did not differ in their responses to Verbal and Environmental Sound labels, infants in the High proficiency group responded significantly more accurately to Verbal labels. Note that after taking into account each child's verbal comprehension/production and the onomatopoetic test items, all cross-domain differences disappeared (see text for details).

However, as more of the
verbal information was presented (i.e., the verb was spoken) and/or more time was allotted for
the infants to process the information, their performance increased above chance levels. Indeed,
while nouns are the more prevalent and salient word class in young infants’ vocabularies
(Fenson et al., 1994), the verbs in the phrases might have provided at least some of the infants
with additional context to help them recognize the sound–object associations (e.g., kiss is com-
prehended by 52% of infants by 10 months of age; Dale & Fenson, 1996). In regard to the envi-
ronmental sound stimuli, the youngest infants did perform at chance levels during the Early
window. However, no other age group for either analysis window was at or below chance in
their sound–object mapping of the environmental sounds onto the pictures.
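The chance comparisons above are one-sample t tests of group accuracy against the 50% baseline. A stdlib sketch of the statistic (the accuracy values below are invented for illustration, not the study's data):

```python
import math

def one_sample_t(scores, chance=50.0):
    """One-sample t statistic testing mean looking accuracy against chance."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    se = math.sqrt(var / n)                               # standard error of the mean
    return (mean - chance) / se

# Hypothetical per-infant accuracies (%) for one window/domain cell
t = one_sample_t([60.0, 55.0, 65.0, 50.0, 70.0])  # df = n - 1 = 4
```

A t near zero indicates random looking; a large positive t indicates systematic looking toward the target.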
In order to ascertain that these results were not simply due to the children not knowing the
words in the linguistic stimuli, we also analyzed the data after recalculating each child’s mean
looking accuracy based only on items whose verbal labels were in their individual receptive
(15-month-olds) and/or productive (15-, 20-, 25-month-olds) vocabulary according to the
MacArthur-Bates CDI (see Appendix for list of corresponding items). As in the analyses where
mean accuracy was calculated over all items, there were no significant interactions of Domain
with Age Group at either time window, but there was an interaction of Domain by Verbal Profi-
ciency Group, F1(2,50) = 3.90, p < .03.4
Since onomatopoeic sounds are arguably closer to linguistic sounds than sounds that are not
typically verbalized, this subset of sounds could make infants’ looking to environmental sounds
seem more similar to verbal sounds than it truly is. Therefore, we wanted to ensure that the pat-
tern of results in mean verbal and nonverbal looking accuracy, in particular the similarities in the
patterns, was not driven by the subset of animal-related environmental sounds that were paired
with labels containing onomatopoetic verbs, such as ‘cow mooing’, ‘rooster crowing’ (8 of 20
experimental items, see Appendix for list). Thus we reran each ANOVA listed above, including
only the 12 non-onomatopoetic items in the analyses. The only difference was in the CDI-based
analyses: when the eight onomatopoeic items were excluded, the interaction between Verbal
Proficiency Group and Domain was no longer significant. Otherwise, the exclusion of these
items did not change the direction or significance of any analyses, aside from an approximately
2 to 3% overall drop in accuracy.
Finally, because 8 of the 10 picture pairs pitted an animate versus an inanimate object (e.g.,
with the target being ‘dog’ and foil being ‘piano’, and vice versa), we wanted to make sure that
simple animacy cues were not allowing children to distinguish between environmental sounds
on the basis of animacy alone, as this would have confounded a proper comparison of environ-
mental sound and verbal label comprehension. If indeed accuracy on the environmental sounds
trials was being driven (even in part) by a difference in animacy in 8 of 10 target/foil pairs, then
the infants’ looking accuracy would have been correlated over the environmental sounds target/
distracter pairs. For example, accuracy for the target ‘piano’ should positively predict target
accuracy for ‘dog’, if infants are using animacy differences rather than, or in addition to, item
knowledge to guide their sound/picture matching. Instead, we found no hint of a positive
correlation between paired items. For each age group (20 and 25 months) and time window
(Early/Late), correlations between environmental sounds items in a pair were all negative, with Spearman's rho values of −0.7619 (20-month-olds, Window 2), −0.4072 (20-month-olds, Window 3), −0.3810 (25-month-olds, Window 2), and −0.1557 (25-month-olds, Window 3).

4The change in degrees of freedom from the original analyses is due to the fact that 4 subjects' parents did not return a completed CDI questionnaire and that there was a lack of sufficient items in some cells for 3 subjects.

Thus, these
analyses give no indication that animacy differences were affecting infants’ looking accuracy.
(Note that we did not include the 15-month-old data here, as they were on average near chance
levels; see Discussion).
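The animacy check relies on Spearman rank correlations between the accuracies of paired items; for tie-free data the statistic reduces to the classic rank-difference formula. A small sketch (the data are invented; the pairing logic is the point):

```python
def spearman_rho(x, y):
    """Spearman's rho for tie-free data: rank both series, then 1 - 6*sum(d^2)/(n(n^2-1))."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# If accuracy on one member of each pair rose as its partner's fell
# (e.g., 'piano' vs. 'dog' targets), rho would be strongly negative.
rho = spearman_rho([1, 2, 3, 4], [8, 6, 4, 2])  # perfectly reversed ranks -> -1.0
```

A consistently positive rho across pairs would have signaled that a shared cue such as animacy, rather than item knowledge, was driving looking; the negative values reported above point the other way.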
Regression Analyses
Chronological age and productive vocabulary. We carried out a set of regressions to
examine the predictive value of age and CDI productive vocabulary for verbal and nonverbal
accuracy in Early and Late time windows. One might expect chronological age and productive
vocabulary size to have roughly equivalent predictive value, given that children’s lexicons typi-
cally grow as they get older. Indeed, the two measures showed a very strong positive association
(adjusted r2 = .527, p < .0001). However, as suggested by the ANOVAs above, young children’s
accuracy with verbal and environmental sounds was differentially predicted by these two vari-
ables when both age and productive vocabulary were entered into the regression model.
Increases in environmental sounds accuracy were associated with greater chronological age in
both Early (adjusted r2 = .140, p < .0026) and Late (adjusted r2 = .1804, p < .0006) time win-
dows, whereas productive vocabulary had no significant predictive value by itself, or
when it was added to chronological age in the regression model. In contrast to environmental
sounds accuracy, increases in verbal accuracy were associated both with chronological age in
Early (adjusted r2 = .1495, p < .0019) and Late (adjusted r2 = .1963, p < .0004) windows and
with productive vocabulary (Early Window: adjusted r2 = .2401, p < .0001; Late Window:
adjusted r2 = .2944, p < .0001). However, whereas productive vocabulary accounted for signifi-
cant additional variance when entered after chronological age in the regression model in both
Early (increase in adjusted r2 = .0789, p < .0136) and Late (increase in adjusted r2 = .0905, p <
.0071) time windows, chronological age accounted for no significant additional variance when
entered after productive vocabulary.
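The hierarchical step described above (entering vocabulary after age and asking whether adjusted r² rises) can be sketched in miniature. The data below are invented for illustration; only the adjusted-r² bookkeeping mirrors the analysis:

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting for a small linear system a·x = b."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col and m[col][col] != 0:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def adjusted_r2(predictors, y):
    """OLS with an intercept via the normal equations; returns adjusted r-squared."""
    n, p = len(y), len(predictors)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = p + 1
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    yhat = [sum(b * x for b, x in zip(beta, row)) for row in X]
    ybar = sum(y) / n
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Invented example: accuracy tracks vocabulary exactly, while vocabulary
# also grows with age but varies within each age group.
age   = [15, 15, 20, 20, 25, 25]
vocab = [5, 20, 60, 75, 110, 130]
acc   = [0.2 * v + 50 for v in vocab]
gain = adjusted_r2([age, vocab], acc) - adjusted_r2([age], acc)  # vocab adds variance
```

In this toy setup, vocabulary entered after age yields a positive adjusted-r² gain, while adding age after vocabulary would add nothing, which is the pattern reported for the verbal trials.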
DISCUSSION
This study compared 15-, 20-, and 25-month-old infants’ eye-gaze responses associated
with the recognition of intermodal sound–object associations for verbal and environmental
sound stimuli. Infants’ looking time accuracy increased from 15 to 20 months of age, while
no measurable improvement was noted between the 20- and 25-month-old infants. Looking
time accuracy also improved with infants’ reported verbal proficiency levels. Infants with
very small productive vocabularies (< 50 words) were less accurate than children with
larger vocabularies.
Based on earlier research and as described in the introduction, two different outcomes could
have been expected when comparing verbal and nonverbal processing. There could be an early
advantage for the nonverbal sounds, with a gradual preference for the verbal modality as chil-
dren get older, as has been found in studies comparing gesture and language development in the
first 2 years of life. Alternatively, we could observe similar patterns of processing, as evidenced
by studies with older children and adults.
We found that the recognition of sound–object associations for environmental sounds and
verbal sounds is quite similar for all infants, especially when examined in the late time window
(1500 to 2500 ms poststimulus onset). We also observed that infants’ ability to recognize these
sound–object associations improves with increasing chronological age, especially for
environmental sounds. Changes in performance for verbal sounds, on the other hand, are more
closely tied to infants’ productive vocabulary sizes (consistent with Conboy & Thal, 2006;
Fernald et al., 2006). Thus, although the infants’ overt behavior (i.e., directed eye gaze) might
be indistinguishable as they process verbal and nonverbal sounds, it is likely that there are
some differences in the underlying processes driving the infants’ responses to the two types of
sounds. The design of the present study, and the data collected, cannot shed further light on
this issue, but neuroimaging studies using methods such as event-related potentials (ERPs) or
near-infrared spectroscopy (NIRS) may be useful in exploring potential underlying differ-
ences or similarities.
In the present study, we did not observe a pronounced shift toward better recognizing sound–
object associations for verbal stimuli as children get older, although, as noted before, children
with the largest productive vocabularies did show greater accuracy for verbal labels than for
environmental sounds. Most studies that observed such a shift were carried out in the context of
learning new associations. For example, studies have compared infants’ ability to associate
novel objects with either new words (an experience quite familiar to children of this age group)
or with artificial, arbitrary sounds (a relatively unnatural and unfamiliar experience; e.g., Wood-
ward & Hoyne, 1999). The relationship between the nonverbal environmental sounds and visual
stimuli in the present study was quite different: Infants were already familiar with most of the
audiovisual associations in our stimuli. Thus, these associations were not arbitrary but were
based on infants’ prior real-world knowledge, with sensitivity to such meaningful associations
increasing with age and presumably experience. Future research could examine whether infants
learn new nonarbitrary associations between objects and novel/unfamiliar environmental sounds
as quickly as verbal labels, or whether there are differences in the processing of verbal and
nonverbal stimuli in the context of learning new associations versus using already learned associations.
While our results do not necessarily bear on the question of how infants initially acquire
and learn words and the environmental sounds, they suggest that at this stage in develop-
ment, children are capable of recognizing correct sound–object associations, regardless of
whether it involves a word or meaningful nonverbal sound. More generally, infants can
detect mismatches between heard and seen stimuli, and their ability to do so improves with
age, as has been reported in previous studies on the perception of object–sound and face–
voice relationships (e.g., Bahrick, 1983; Dodd, 1979; Hollich, Newman, & Jusczyk, 2005;
Kuhl & Meltzoff, 1982, 1984; Lewkowicz, 1996; Patterson & Werker, 2003; Walker-
Andrews & Lennon, 1991).
It is possible that the frequency with which infants are exposed to verbal and environ-
mental sounds, as well as to intermodal associations involving each sound type, may have
affected their responses in the present study. Obtaining specific measurements of infants’
exposure to sounds and intermodal associative situations is a vast and complicated under-
taking (see Ballas & Howard, 1987, who conducted a study of adults’ exposure to different
environmental sounds) and beyond the scope of this study. However, given the influence
frequency of occurrence has on language learning and processing (Dick et al., 2001; Matthews,
Lieven, Theakston, & Tomasello, 2005; Roland, Dick, & Elman, 2007), it is very likely that
frequency of occurrence does play a role in establishing both nonverbal and verbal sound–object associations.
As mentioned above, it appears that the children who are more sensitive to intermodal
sound–object associations (and thus are presumably more attentive to the important distin-
guishing characteristics in their world) also have larger vocabularies. As would be predicted by
innumerable studies, we found a strong correlation between chronological age and productive
vocabulary. Presumably, infants who attend to all types of meaningful input—verbal, visual, or
otherwise—are more likely to benefit and learn from their environmental interactions. More
prosaically, productive vocabulary may also reflect more basic individual differences, such as
“speed of processing” that would influence children’s ability to learn new words quickly and
efficiently. We cannot rule out this possibility here, as we have no independent measure of
intellectual and sensorimotor ability for this cohort.
The advantage the most verbally proficient infants showed for the verbal stimuli was no
longer significant when the sounds that could be verbally produced (i.e., onomatopoeias) were
excluded from the analyses. This suggests that onomatopoeic items are a significant, and
potentially important, component of toddlers’ (15 to 25 months) vocabularies. This is interesting
since, except for onomatopoeic items, the recognition of verbal sound–object associations is
thought to be a completely arbitrary process and varies across languages. However, there is
evidence that young children are able to recognize, and prefer, the association between coor-
dinating sounds and objects: Maurer, Pathman, and Mondloch (2006) found that 30-month-old
children appear to have a bias for a sound–shape correspondence (i.e., mouth shape and vowel
sound) during the production of nonsense words, a preference that may serve to potentiate
language development. Even very young infants can match sounds and their sources by local-
izing sound sources in space (e.g., Morrongiello, Fenwick, & Chance, 1998; Morrongiello,
Lasenby, & Lee, 2003; Muir & Field, 1979), as well as identify transparent dynamic relations
between mouth movement and sound (e.g., Bahrick, 1994; Gogate & Bahrick, 1998; Hollich
et al., 2005). Thus, these early-acquired biases for bimodal correspondence could structure
how young infants interact during potential language learning situations (e.g., focusing on
mouths speaking or producing noise, as opposed to feet). Onomatopoeic items provide young
children with information regarding intermodal associations, potentially providing a basis
from which to learn and recognize more arbitrary sound–object associations, such as those
involving words.
CONCLUSION
To our knowledge, this study was the first of its kind to examine young children’s recognition of
sound–object associations for different types of complex, meaningful sounds. Overall, across
the age span studied, infants were just as accurate in recognizing the association between a
familiar object and its related environmental sound, as when the object was verbally named. In
addition, we observed that the processing of environmental sounds is closely tied to chronologi-
cal age, while the processing of speech was linked to both chronological age and verbal profi-
ciency, thus demonstrating the subtle differences in the mechanisms underlying recognition in
verbal and nonverbal domains early in development. The consistency with which nonverbal
sounds and onomatopoeias are associated with their object referents may play an important role
in teaching young children about intermodal associations, which may bootstrap the learning of
more arbitrary word–object relationships.
ACKNOWLEDGMENTS
Alycia Cummings was supported by NIH training grants #DC00041 and #DC007361, Ayse
Pinar Saygin was supported by European Commission Marie Curie Award FP6-025044, and
Frederic Dick was supported by MRC New Investigator Award G0400341. We would like to
thank Leslie Carver and Karen Dobkins for allowing us to use their subject pool at UCSD and
the Project in Cognitive and Neural Development at UCSD for the use of their facility, supported
by NIH P50 NS22343. We would also like to thank Arielle Borovsky and Rob Leech for their
comments on previous versions of this paper.
REFERENCES
Aziz-Zadeh, L., Iacoboni, M., Zaidel, E., Wilson, S., & Mazziotta, J. (2004). Left hemisphere motor facilitation in
response to manual action sounds. European Journal of Neuroscience, 19, 2609–2612.
Bahrick, L. E. (1983). Infants’ perception of substance and temporal synchrony in multimodal events. Infant Behavior
and Development, 6, 429–451.
Bahrick, L. E. (1992). Infants’ perceptual differentiation of amodal and modality-specific audio-visual relations. Journal
of Experimental Child Psychology, 53, 180–199.
Bahrick, L. E. (1994). The development of infants’ sensitivity to arbitrary intermodal relations. Ecological Psychology,
6(2), 111–123.
Balaban, M., & Waxman, S. (1997). Do words facilitate object categorization in 9-month-old infants? Journal of Experimental Child Psychology, 64, 3–26.
Ballas, J. (1993). Common factors in the identification of an assortment of brief everyday sounds. Journal of Experimen-
tal Psychology: Human Perception and Performance, 19(2), 250–267.
Ballas, J., & Howard, J. (1987). Interpreting the language of environmental sounds. Environment and Behavior, 19(1),
Boersma, P. (2001). Praat, a system for doing phonetics by computer. Glot International, 5(9/10), 341–345.
Borovsky, A., Saygin, A. P., Cummings, A., & Dick, F. (in preparation). Emerging paths to sound understanding:
Language and environmental sound comprehension in specific language impairment and typically developing chil-
dren. Manuscript in preparation.
Campbell, A., & Namy, L. (2003). The role of social-referential context in verbal and nonverbal symbol learning. Child
Development, 74(2), 549–563.
Chiu, C., & Schacter, D. (1995). Auditory priming for nonverbal information: Implicit and explicit memory for environ-
mental sounds. Consciousness and Cognition, 4(4), 440–458.
Conboy, B., & Thal, D. (2006). Ties between the lexicon and grammar: Cross-sectional and longitudinal studies of
bilingual toddlers. Child Development, 77(3), 712–735.
Cummings, A., Ceponiene, R., Dick, F., Saygin, A. P., & Townsend, J. (2008). A developmental ERP study of verbal
and non-verbal semantic processing. Brain Research, 1208, 137–149.
Cummings, A., Ceponiene, R., Koyama, A., Saygin, A. P., Townsend, J., & Dick, F. (2006). Auditory semantic net-
works for words and natural sounds. Brain Research, 1115, 92–107.
Cummings, A., Ceponiene, R., Williams, C., Townsend, J., & Wulfeck, B. (2006, June). Auditory word versus environ-
mental sound processing in children with Specific Language Impairment: An event-related potential study. Poster
session presented at the 2006 Symposium on Research in Child Language Disorders, Madison, WI.
Cycowicz, Y., & Friedman, D. (1998). Effect of sound familiarity on the event-related potentials elicited by novel envi-
ronmental sounds. Brain and Cognition, 36, 30–51.
Dale, P., & Fenson, L. (1996). Lexical development norms for young children. Behavior Research Methods,
Instruments, & Computers, 28, 125–127.
Dawson, C., & Gerken, L. A. (2006). 4-month-olds discover algebraic patterns in music that 7.5-month-olds do not.
In Proceedings of the Twenty-ninth Annual Conference of the Cognitive Science Society (pp. 1198–1203). Mahwah,
NJ: Erlbaum.
Dick, F., Bates, E., Wulfeck, B., Aydelott Utman, J., Dronkers, N., & Gernsbacher, M. A. (2001). Language deficits,
localization and grammar: Evidence for a distributive model of language breakdown in aphasics and normals.
Psychological Review, 108(4), 759–788.
Dick, F., Saygin, A. P., Paulsen, C., Trauner, D., & Bates, E. (2004, July). The co-development of environmental sounds
and language comprehension in school-age children. Poster session presented at the Meeting for Attention and
Performance, Winter Park, Colorado.
Dodd, B. (1979). Lip reading in infants: Attention to speech presented in- and out-of synchrony. Cognitive Psychology,
11(4), 478–484.
Fenson, L., Dale, P. S., Reznick, J. S., Bates, E., Thal, D. J., & Pethick, S. J. (1994). Variability in early communicative development. Monographs of the Society for Research in Child Development, 59(5, Serial No. 242).
Fernald, A., & McRoberts, G. W. (1999, November). Listening ahead: How repetition enhances infants’ ability to rec-
ognize words in fluent speech. Paper presented at the 24th Annual Boston University Conference on Language
Development, Boston, MA.
Fernald, A., Perfors, A., & Marchman, V. (2006). Picking up speed in understanding: Speech processing efficiency and
vocabulary growth across the 2nd year. Developmental Psychology, 42(1), 98–116.
Fernald, A., Pinto, J. P., Swingley, D., Weinberg, A., & McRoberts, G. W. (1998). Rapid gains in speed of verbal pro-
cessing by infants in the second year. Psychological Science, 9, 72–75.
Friedman, D., Cycowicz, Y., & Dziobek, I. (2003). Cross-form conceptual relations between sounds and words: Effects
on the novelty P3. Cognitive Brain Research, 18(1), 58–64.
Glenn, S., Cunningham, C., & Joyce, P. (1981). A study of auditory preferences in nonhandicapped infants and infants
with Down’s syndrome. Child Development, 52(4), 1303–1307.
Gogate, L. J., & Bahrick, L. E. (1998). Intersensory redundancy facilitates learning of arbitrary relations between vowel
sounds and objects in seven-month-old infants. Journal of Experimental Child Psychology, 69, 133–149.
Gogate, L. J., Bolzani, L. H., & Betancourt, E. A. (2006). Attention to maternal multimodal naming by 6- to 8-month-old
infants and learning of word-object relations. Infancy, 9(3), 259–288.
Gogate, L. J., Walker-Andrews, A. S., & Bahrick, L. E. (2001). The intersensory origins of word comprehension: An
ecological-dynamic systems view. Developmental Science, 4(1), 1–18.
Golinkoff, R., Mervis, C., & Hirsh-Pasek, K. (1994). Early object labels: The case for a developmental lexical principles
framework. Journal of Child Language, 21, 125–155.
Gygi, B. (2001). Factors in the identification of environmental sounds. Unpublished doctoral dissertation, Indiana
University– Bloomington.
Hollich, G., Hirsh-Pasek, K., Golinkoff, R., Brand, R., Brown, E., Chung, H., et al. (2000). Breaking the language barrier:
An emergentist coalition model for the origins of word learning. Monographs of the Society for Research in Child
Development, 65(3), v–123.
Hollich, G., Newman, R. S., & Jusczyk, P. W. (2005). Infants’ use of synchronized visual information to separate
streams of speech. Child Development, 76(3), 598–613.
Houston-Price, C., Plunkett, K., & Harris, P. (2005). Word-learning wizardry at 1;6. Journal of Child Language, 32,
Iverson, J., Capirci, O., & Caselli, M. C. (1994). From communication to language in two modalities. Cognitive Devel-
opment, 9, 23–43.
Killing, S., & Bishop, D. (2008). Move it!: Visual feedback enhances validity of preferential looking as a measure of
individual differences in vocabulary in toddlers. Developmental Science, 11(4), 525–530.
Kuhl, P., & Meltzoff, A. N. (1982). Bimodal perception of speech in infancy. Science, 218(4577), 1138–1141.
Kuhl, P., & Meltzoff, A. N. (1984). The intermodal representation of speech in infants. Infant Behavior and Development,
7, 361–381.
Lewis, J. W., Brefczynski, J. A., Phinney, R. E., Janik, J. J., & DeYoe, E. (2005). Distinct cortical pathways for process-
ing tool versus animal sounds. Journal of Neuroscience, 25(21), 5148–5158.
Lewkowicz, D. J. (1996). Perception of auditory-visual temporal synchrony in human infants. Journal of Experimental
Psychology: Human Perception and Performance, 22(5), 1094–1106.
Marchman, V., & Fernald, A. (2008). Speed of word recognition and vocabulary knowledge in infancy predict cognitive
and language outcomes in later childhood. Developmental Science, 11(3), F9–F16.
Marcus, G. F., Fernandes, K. J., & Johnson, S. P. (2007). Infant rule learning facilitated by speech. Psychological
Science, 18(5), 387–391.
Matthews, D., Lieven, E., Theakston, A., & Tomasello, M. (2005). The role of frequency in the acquisition of English
word order. Cognitive Development, 20, 121–136.
Maurer, D., Pathman, T., & Mondloch, C. J. (2006). The shape of boubas: Sound-shape correspondences in toddlers and
adults. Developmental Science, 9(3), 316–322.
Morrongiello, B., Fenwick, K., & Chance, G. (1998). Crossmodal learning in newborn infants: Inferences about proper-
ties of auditory-visual events. Infant Behavior & Development, 21(4), 543–554.
Morrongiello, B., Lasenby, J., & Lee, N. (2003). Infants’ learning, memory, and generalization of learning for bimodal
events. Journal of Experimental Child Psychology, 84, 1–19.
Muir, D., & Field, J. (1979). Infants orient to sounds. Child Development, 50(2), 431–436.
Namy, L. (2001). What’s in a name when it isn’t a word? 17-month-olds’ mapping of nonverbal symbols to object categories. Infancy, 2, 73–86.
Namy, L., Acredolo, L., & Goodwyn, S. (2000). Verbal labels and gestural routines in parental communication with young children. Journal of Nonverbal Behavior, 24, 63–79.
Namy, L., & Waxman, S. (1998). Words and gestures: Infants’ interpretations of different forms of symbolic reference. Child Development, 69(2), 295–308.
Namy, L., & Waxman, S. (2002). Patterns of spontaneous production of novel words and gestures within an experimental setting in children ages 1;6 and 2;2. Journal of Child Language, 24(2), 911–921.
Patterson, M. L., & Werker, J. F. (2003). Two-month-old infants match phonetic information in lips and voice. Developmental Science, 6(2), 191–196.
Pizzamiglio, L., Aprile, T., Spitoni, G., Pitzalis, S., Bates, E., D’Amico, S., et al. (2005). Separate neural systems for processing action- or non-action-related sounds. NeuroImage, 24(3), 852–861.
Roberts, K. (1995). Categorical responding in 15-month-olds: Influence of the noun-category bias and the covariation between visual fixation and auditory input. Cognitive Development, 10(1), 21–41.
Roland, D., Dick, F., & Elman, J. L. (2007). Frequency of basic English grammatical structures: A corpus analysis. Journal of Memory and Language, 57, 348–379.
Saffran, J., Pollak, S., Seibel, R., & Shkolnik, A. (2007). Dog is a dog is a dog: Infant rule learning is not specific to language. Cognition, 105(3), 669–680.
Saygin, A. P., Dick, F., & Bates, E. (2005). An online task for contrasting auditory processing in the verbal and nonverbal domains and norms for younger and older adults. Behavior Research Methods, Instruments, & Computers, 37(1),
Saygin, A. P., Dick, F., Wilson, S., Dronkers, N., & Bates, E. (2003). Neural resources for processing language and environmental sounds: Evidence from aphasia. Brain, 126, 928–945.
Stuart, G., & Jones, D. (1995). Priming the identification of environmental sounds. Quarterly Journal of Experimental Psychology, 48A(3), 741–761.
Swingley, D., Pinto, J., & Fernald, A. (1999). Continuous processing in word recognition at 24 months. Cognition, 71,
Van Petten, C., & Rheinfelder, H. (1995). Conceptual relationships between spoken words and environmental sounds: Event-related brain potential measures. Neuropsychologia, 33, 485–508.
Volterra, V., Caselli, M. C., Capirci, O., & Pizzuto, E. (2005). Gesture and the emergence and development of language. In M. Tomasello & D. Slobin (Eds.), Beyond nature/nurture: A festschrift for Elizabeth Bates (pp. 3–40). Mahwah, NJ: Lawrence Erlbaum Associates.
Volterra, V., Iverson, J., & Emmorey, K. (1995). When do modality factors affect the course of language acquisition? In J. Reilly (Ed.), Language, gesture, and space (pp. 371–390). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Vouloumanos, A., & Werker, J. (2004). Tuned to the signal: The privileged status of speech for young infants. Developmental Science, 7(3), 270–276.
Walker-Andrews, A. S., & Lennon, E. (1991). Infants’ discrimination of vocal expressions: Contributions of auditory and visual information. Infant Behavior and Development, 14, 131–142.
Woodward, A., & Hoyne, K. (1999). Infants’ learning about words and sounds in relation to objects. Child Development, 70, 65–77.
Xu, F. (2002). The role of language in acquiring object kind concepts in infancy. Cognition, 85(3), 223–250.
Zangl, R., Klarman, L., Thal, D., Fernald, A., & Bates, E. (2005). Dynamics of word comprehension in infancy: Developments in timing, accuracy, and resistance to acoustic degradation. Journal of Cognition and Development, 6(2),
Target Picture | Distracter Picture | Verbal Phrase
Cow | Piano | Cow Mooing*
Lion | Alarm Clock | Lion Growling*
Trumpet | Kissing Couple | Trumpet Playing
Horse | Telephone | Horse Neighing*
Dog | Sneezing Person | Dog Barking*
Toilet | Cat | Toilet Flushing
Baby | Grandfather Clock | Baby Crying*
Sneezing Person | Dog | Someone Sneezing
Airplane | Singing Woman | Airplane Flying*
Fly | Sheep | Fly Buzzing
Alarm Clock | Lion | Alarm Clock Ringing*
Singing Woman | Airplane | Woman Singing*
Kissing Couple | Trumpet | Someone Kissing*
Cat | Toilet | Cat Meowing*
Grandfather Clock | Baby | Grandfather Clock Chiming*
Train | Rooster | Train Going By*
Piano | Cow | Piano Playing
Sheep | Fly | Sheep Baaing*
Telephone | Horse | Telephone Ringing*
Rooster | Train | Rooster Crowing*

Note: Items in italics were counted as onomatopoeic; starred verbal phrases had equivalents in the MacArthur-Bates CDI, and thus were included in the individually tailored item analyses (see Results).
... To the best of our knowledge, there are only two developmental studies that investigated whether young children process known words and sounds similarly [15,16]. In the Cummings et al. [15] study, 15-, 20-, and 25-month-old toddlers participated in a looking-while-listening task, during which they viewed pairs of images (e.g. dog-piano) and heard either associated sounds (e.g. ...
... This result must be interpreted with caution. Though it matches the suggestion of Hendrickson et al. [16] that associated sounds require a longer time to process the semantic match between the visual object and the generated sound, it contradicts the results of Cummings and colleagues [15], whereby object recognition was similar in the word and associated-sound conditions. Toom and Kukona's [11] VWP study with adults found greater looking times and semantic activation of the competitors in the associated-sound relative to the word condition. ...
In adults, words are more effective than sounds at activating conceptual representations. We aimed to replicate these findings and extend them to infants. In a series of experiments using an eye tracker object recognition task, suitable for both adults and infants, participants heard either a word (e.g. cow) or an associated sound (e.g. mooing) followed by an image illustrating a target (e.g. cow) and a distracter (e.g. telephone). The results showed that adults reacted faster when the visual object matched the auditory stimulus and even faster in the word relative to the associated sound condition. Infants, however, did not show a similar pattern of eye-movements: only eighteen-month-olds, but not 9- or 12-month-olds, were equally fast at recognizing the target object in both conditions. Looking times, however, were longer for associated sounds, suggesting that processing sounds elicits greater allocation of attention. Our findings suggest that the advantage of words over associated sounds in activating conceptual representations emerges at a later stage during language development.
... Such an investigation can further our understanding of the relation between language and cognition by examining whether an interconnected auditory-semantic network can be instantiated independent of language early in development. What's more, it has recently been suggested that the consistency with which environmental sounds are associated with their object referents may bootstrap the learning of more arbitrary word-object relations (Cummings, Saygin, Bates, & Dick, 2009). However, this claim is based on the assumption that the mechanisms of semantic integration that subserve the processing of words and environmental sounds are similar in the developing brain. ...
... The N400 incongruity effect denotes the relative increase in N400 amplitude to a semantically unrelated stimulus. N400 incongruity effects have been found for words or pictures primed by related and unrelated environmental sounds (Daltrozzo & Schön, 2009; Frey, Aramaki, & Besson, 2014; Schön, Ystad, Kronland-Martinet, & Besson, 2010; Van Petten & Rheinfelder, 1995) and for environmental sounds primed by related and unrelated words, pictures, or other environmental sounds (Aramaki, Marie, Kronland-Martinet, Ystad, & Besson, 2010; Cummings, Čeponienė, Dick, Saygin, & Townsend, 2008; Cummings et al., 2006, 2009; Daltrozzo & Schön, 2009; Orgs, Lange, Dombrowski, & Heil, 2006; Plante, Van Petten, & Senkfor, 2000; Schirmer, Soh, Penney, & Wyse, 2011; Schön et al., 2010; Van Petten & Rheinfelder, 1995). ...
... Only one study has examined the semantic processing of environmental sounds in children younger than age seven. Cummings et al. (2009) tested 15-, 20-, and 25-month-olds using a looking-while-listening paradigm. Participants heard environmental sounds or spoken words while viewing pairs of images, and eye movements to matching versus non-matching pictures were captured to determine the accuracy of object identification. ...
The majority of research examining early auditory-semantic processing and organization is based on studies of meaningful relations between words and referents. However, a thorough investigation into the fundamental relation between acoustic signals and meaning requires an understanding of how meaning is associated with both lexical and non-lexical sounds. Indeed, it is unknown how meaningful auditory information that is not lexical (e.g., environmental sounds) is processed and organized in the young brain. To capture the structure of semantic organization for words and environmental sounds, we record event-related potentials (ERPs) as 20-month-olds view images of common nouns (e.g., dog) while hearing words or environmental sounds that match the picture (e.g., “dog” or barking), that are within-category violations (e.g., “cat” or meowing), or that are between-category violations (e.g., “pen” or scribbling). Results show both words and environmental sounds exhibit larger negative amplitudes to between-category violations relative to matches. Unlike words, which show a greater negative response early and consistently to within-category violations, such an effect for environmental sounds occurs late in semantic processing. Thus, as in adults, the young brain represents semantic relations between words and between environmental sounds, though it more readily differentiates semantically similar words compared to environmental sounds.
... Our findings of the general development pattern of prelinguistic skills, e.g., eye contact (Berger & Cunningham, 1981; Brooks & Meltzoff, 2005; Dawson et al., 2000), gestures (Bates et al., 1979; Crais et al., 2004; Guidetti, 2002; Iverson et al., 1994; Masur, 1983; Messinger & Fogel, 1998; Perrault et al., 2019; Tomasello et al., 2007), vocalization (Davis & Macneilage, 1994; Vihman et al., 2009), first words (Brooks & Kempe, 2012; Hadley Pamela et al., 2016; Hsu et al., 2017; Mahmoudi Bakhtiyari et al., 2011; Tardif et al., 2008), facial expressions (Cole, 1986; Herba & Phillips, 2004; McClure, 2000), behaviour regulation (Carpenter et al., 1983), joint attention (Carpenter et al., 1998; Mundy et al., 2007), social interaction (Papousek & Papousek, 1975), imitation (Jones, 2009; Wang et al., 2015), object permanence (Baillargeon & DeVos, 1991; Corrigan, 1978; Gopnik & Meltzoff, 2021; Tomasello & Farrar, 1986), play (Casby, 2003) and language comprehension (Cummings et al., 2009; Gervain & Werker, 2008) are consistent with some studies. ...
Prelinguistic skills play an important role in children’s communication development. These skills are considered significant bases for language acquisition and are conducive to later social development. Means of communication, communicative functions, skills with cognitive bases, and language comprehension are important prelinguistic skills. There is a critical period for acquiring prelinguistic skills, and early identification of communication deficits is an important issue to be considered. The present study aimed to develop a communication skills checklist for Persian children aged 6 to 24 months and evaluate its psychometric properties. Parents of 277 Persian children aged 6 to 24 months participated in the current study. A checklist was first developed after an extensive literature review, and various psychometric analyses in addition to regression analyses were carried out to determine its validity and reliability. The final checklist contained 36 items with high face validity and content validity (CVI > 0.62, CVR > 0.79). Also, the checklist demonstrated a high association with the CNCS (Pearson’s correlation coefficient = 0.85, p < 0.001), and the construct validity showed significant differences between the four age groups (F-test = 197.881, p < 0.001). The results of the internal consistency measurement (Cronbach’s alpha coefficient = 0.952) and the test-retest reliability test (ICC = 0.933, p < 0.001) revealed excellent reliability of the checklist. In conclusion, based on the psychometric assessment, this checklist is a promising tool for assessing communication skills in Persian children aged 6 to 24 months.
... In summary, semantic competition effects (e.g., Huettig & Altmann, 2005; Yee & Sedivy, 2006) reveal that spoken words activate a rich network of concepts during lexical processing. In contrast, prior research on environmental sounds has focused on the activation of target concepts (e.g., puppy during barking; e.g., Cummings, Saygin, Bates, & Dick, 2009; Dick et al., 2007; Edmiston & Lupyan, 2015; Iordanescu et al., 2008, 2010, 2011; Lupyan & Thompson-Schill, 2012; Van Petten & Rheinfelder, 1995; Saygın et al., 2003; Thierry et al., 2003; Thierry & Price, 2006); conversely, research on the activation of comprehenders' wider semantic knowledge (e.g., semantically related concepts like bone) is limited, and the literature reveals a mixed set of findings (e.g., Hendrickson et al., 2015). In the current study, two visual world experiments investigated the activation of semantically related competitor representations during the processing of environmental sounds and spoken words, and their corresponding impacts on comprehenders' eye movements. ...
Two visual world experiments investigated the activation of semantically related concepts during the processing of environmental sounds and spoken words. Participants heard environmental sounds such as barking or spoken words such as “puppy” while viewing visual arrays with objects such as a bone (semantically related competitor) and candle (unrelated distractor). In Experiment 1, a puppy (target) was also included in the visual array; in Experiment 2, it was not. During both types of auditory stimuli, competitors were fixated significantly more than distractors, supporting the coactivation of semantically related concepts in both cases; comparisons of the two types of auditory stimuli also revealed significantly larger effects with environmental sounds than spoken words. We discuss implications of these results for theories of semantic knowledge.
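The competitor-versus-distractor comparison described in this abstract reduces to comparing the proportion of eye-tracking samples that land on each region of interest. A minimal sketch of that computation, with hypothetical data and names (nothing below comes from the study itself):

```python
# Illustrative only: compare fixation proportions for a semantically
# related competitor vs. an unrelated distractor in one toy trial.

def fixation_proportion(samples, roi):
    """Proportion of eye-tracking samples that fall on a region of interest."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if s == roi) / len(samples)

# One simulated trial: each entry names the object fixated at one sample.
trial = (["bone"] * 30) + (["candle"] * 12) + (["blank"] * 18)

competitor_prop = fixation_proportion(trial, "bone")    # semantically related
distractor_prop = fixation_proportion(trial, "candle")  # unrelated

print(competitor_prop, distractor_prop)
```

In a real analysis these proportions would be aggregated over trials and participants before any significance test; the sketch shows only the per-trial quantity being compared.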
... In the first year of life, typical infants are awake for ∼4000 hours, during which they are presented with a wide variety of environmental sounds, infant-directed speech, and a companion visual stream of over 1M images (assuming 1 fps). It is only after this pre-verbal exposure that our abilities of object tracking, color discrimination, object recognition, word and phoneme recognition, and environmental sound recognition emerge [1,2,3,4,5,6]. Beginning in the second year, children become proficient at knowing what they do not know and solicit explicit labels for novel classes of stimuli they encounter using finger pointing and direct questions [7,8,9,10]. ...
Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on multimodal unsupervised learning (as infants) and active learning (as children). With this motivation, we present a learning framework for sound representation and recognition that combines (i) a self-supervised objective based on a general notion of unimodal and cross-modal coincidence, (ii) a clustering objective that reflects our need to impose categorical structure on our experiences, and (iii) a cluster-based active learning procedure that solicits targeted weak supervision to consolidate categories into relevant semantic classes. By training a combined sound embedding/clustering/classification network according to these criteria, we achieve a new state-of-the-art unsupervised audio representation and demonstrate up to a 20-fold reduction in the number of labels required to reach a desired classification performance.
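The cluster-based active learning step described above can be illustrated with a toy sketch: embeddings are clustered, one representative per cluster is sent to an oracle for a label, and that label is propagated cluster-wide as weak supervision. This is a hedged illustration of the general idea, not the authors' implementation; all names and data are invented.

```python
import math

def dist(a, b):
    return math.dist(a, b)

def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))

def init_centroids(points, k):
    # Farthest-point initialization: deterministic and well spread out.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=20):
    centroids = init_centroids(points, k)
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            buckets[nearest(p, centroids)].append(p)
        centroids = [
            tuple(sum(dim) / len(b) for dim in zip(*b)) if b else centroids[i]
            for i, b in enumerate(buckets)
        ]
    return centroids

def active_label(points, k, oracle):
    """One oracle query per cluster; the label is propagated to all members."""
    centroids = kmeans(points, k)
    assignments = [nearest(p, centroids) for p in points]
    labels = {}
    for i, c in enumerate(centroids):
        members = [p for p, a in zip(points, assignments) if a == i]
        rep = min(members, key=lambda p: dist(p, c))  # most central member
        labels[i] = oracle(rep)                       # the single query per cluster
    return [labels[a] for a in assignments]

# Toy "sound embeddings": two well-separated clusters.
points = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (5.0, 5.1), (5.1, 5.0), (4.9, 5.2)]
oracle = lambda p: "speech" if p[0] < 2 else "environmental"
print(active_label(points, k=2, oracle=oracle))
```

The payoff the abstract describes is visible even here: six points are labeled with only two oracle queries, one per cluster.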
... Whereas research on nonverbal signs primarily has focused on the impact of iconic visually based signs, such as pictures and gestures, on children's word learning, the use of sound as an iconic nonverbal support has received less attention. Studies on infants and toddlers have demonstrated that at 1 and 2 years old, children are able to map nonverbal sound effects to visual objects (Campbell & Namy, 2003;Cummings, Saygin, Bates, & Dick, 2009;Hollich et al., 2000;Woodward & Hoyne, 1999). However, the extent to which children can use sounds to learn information about words at older ages remains unexplored. ...
Early vocabulary knowledge is vital for later reading comprehension and academic success. Studies have found that augmenting explicit teaching of word meanings with nonverbal visual aids, particularly pictures and gestures, assists young learners in building rich lexical representations. Research has focused on the effects of visual supports in fostering word knowledge but has not considered the effectiveness of using sound‐based supports. Working from a semiotics perspective, the authors used a music instructional strategy known as a sound story to examine the impact of using sound effects to teach words to first‐grade students. Words were taught with explicit instruction in combination with sound effects or no sound effects during music class. All sound effects were created and performed using musical instruments in the classroom. Students’ receptive and productive definitional word knowledge were assessed. The productive measure was used as a measure of depth of word knowledge. The authors found that students had deeper knowledge of words that were taught with an associated sound effect compared with words taught with no sound effect. Analysis of the types of information students provided about words showed that students gave more contextual information and gestural responses for words that were taught with sound compared with words taught with no sound. These results provide evidence that vocabulary learning can be fostered during specialist music classes using methods familiar to music educators.
... A number of theories have speculated that the expression of coherent and reproducible language or its elements is an inherent feature of human perception and cognition (Pinker, 1999;Kuhl, 2000;Holden, 2004;Wong, 2005). A large body of empirical and theoretical literature on word acquisition exists (Colunga and Smith, 2005;Regier et al., 2005;Garagnani et al., 2008;Cummings et al., 2009;Frank et al., 2009;Mayor and Plunkett, 2010). Many proposals have advocated a role for symbols and names as sensory representations (Harnad, 1990;Humphreys et al., 1999;Feldman and Narayanan, 2004;Sheridan, 2005). ...
... Infants' word comprehension also shares a developmental trajectory similar to their understanding of familiar environmental sounds (i.e., meaningful, nonlinguistic sounds such as a cow mooing or a car starting). Cummings, Saygin, Bates, and Dick (2009) found that 15-to 25-month-old infants' accuracy in comprehending environmental sounds and spoken phrases was approximately equivalent (with a slight advantage for environmental sound recognition early in development). These results suggest that, in fact, speech does not appear to start out as being "privileged" as an acoustical transmitter of referential information. ...
Typically developing children will rapidly and comprehensively master at least one of the more than 6,000 languages that exist around the globe. The complexity of these language systems and the speed and apparent facility with which children master them have been the topic of philosophical and scientific speculation for millennia. In 397 ad, in reflecting on his own acquisition of language, St. Augustine wrote “… as I heard words repeatedly used in their proper places in various sentences, I gradually learnt to understand what objects they signified; and after I had trained my mouth to form these signs, I used them to express my own desires” (quoted in Wittgenstein, 1953/2001). St. Augustine’s intuitions notwithstanding, more recent thinking and research on children’s language acquisition suggest that the problem facing a child is much more intricate than simply remembering the association between a sound and an object and learning to reproduce the word’s sound. The rich and multitiered nature of this problem—and the many and varied paths to its solution (Bates, Bretherton, & Snyder, 1988)—make the process of language acquisition a unique window into multiple low-level and high-level developmental processes.
Selective deficits in aphasic patients' grammatical production and comprehension are often cited as evidence that syntactic processing is modular and localizable in discrete areas of the brain (e.g., Y. Grodzinsky, 2000). The authors review a large body of experimental evidence suggesting that morphosyntactic deficits can be observed in a number of aphasic and neurologically intact populations. They present new data showing that receptive agrammatism is found not only over a range of aphasic groups, but is also observed in neurologically intact individuals processing under stressful conditions. The authors suggest that these data are most compatible with a domain-general account of language, one that emphasizes the interaction of linguistic distributions with the properties of an associative processor working under normal or suboptimal conditions.
This study examined European American and Hispanic American mothers' multimodal communication to their infants (N= 24). The infants were from three age groups representing three levels of lexical-mapping development: prelexical (5 to 8 months), early-lexical (9 to 17 months), and advanced-lexical (21 to 30 months). Mothers taught their infants four target (novel) words by using distinct objects during a semistructured play episode. Recent research suggests that young infants rely on temporal synchrony to learn syllable–object relations, but later, the role of synchrony diminishes. Thus, mothers' target and nontarget naming were coded for synchrony and other communication styles. The results indicated that mothers used target words more often than nontarget words in synchrony with object motion and sometimes touch. Thus, ‘multimodal motherese’ likely highlights target word-referent relations for infants. Further, mothers tailored their communication to infants' level of lexical-mapping development. Mothers of prelexical infants used target words in synchrony with object motion more often than mothers of early- and advanced-lexical infants. Mothers' decreasing use of synchrony across age parallels infants' decreasing reliance on synchrony, suggesting a dynamical and reciprocal environment–organismic relation.
Infants improve substantially in language ability during their 2nd year. Research on the early development of speech production shows that vocabulary begins to expand rapidly around the age of 18 months. During this period, infants also make impressive gains in understanding spoken language. We examined the time course of word recognition in infants from ages 15 to 24 months, tracking their eye movements as they looked at pictures in response to familiar spoken words. The speed and efficiency of verbal processing increased dramatically over the 2nd year. Although 15-month-old infants did not orient to the correct picture until after the target word was spoken, 24-month-olds were significantly faster, shifting their gaze to the correct picture before the end of the spoken word. By 2 years of age, children are progressing toward the highly efficient performance of adults, making decisions about words based on incomplete acoustic information.
The development of infants' ability to detect the arbitrary relation between the color/shape of an object and the pitch of its impact sound was investigated using an infant-control habituation procedure. Ninety-six infants of 3, 5, or 7 months were habituated to films of two objects differing in color and shape, striking a surface in an erratic pattern. One object produced an impact sound of a high pitch, and the other object produced a low pitch. During test trials, infants in the experimental conditions received a change in the pairing of pitch with color/shape, whereas controls received no change. Results indicate that visual recovery to the change in pitch-color/shape relations was significantly greater than that of age-matched controls at 7 months, but not at 3 or 5 months. A prior study demonstrated that by 3 months, infants were able to discriminate the color/shape and pitch changes of these events. However, it is not until 7 months that they show evidence of detecting the arbitrary relation between these attributes.
Although aphasia is often characterized as a selective impairment in language function, left hemisphere lesions may cause impairments in semantic processing of auditory information, not only in verbal but also in nonverbal domains. We assessed the ‘online’ relationship between verbal and nonverbal auditory processing by examining the ability of 30 left hemisphere-damaged aphasic patients to match environmental sounds and linguistic phrases to corresponding pictures. The verbal and nonverbal task components were matched carefully through a norming study; 21 age-matched controls and five right hemisphere-damaged patients were also tested to provide further reference points. We found that, while the aphasic groups were impaired relative to normal controls, they were impaired to the same extent in both domains, with accuracy and reaction time for verbal and nonverbal trials revealing unusually high correlations (r = 0.74 for accuracy, r = 0.95 for reaction time). Severely aphasic patients tended to perform worse in both domains, but lesion size did not correlate with performance. Lesion overlay analysis indicated that damage to posterior regions in the left middle and superior temporal gyri and to the inferior parietal lobe was a predictor of deficits in processing for both speech and environmental sounds. The lesion mapping and further statistical assessments reliably revealed a posterior superior temporal region (Wernicke’s area, traditionally considered a language-specific region) as being differentially more important for processing nonverbal sounds compared with verbal sounds. These results suggest that, in most cases, processing of meaningful verbal and nonverbal auditory information break down together in stroke and that subsequent recovery of function applies to both domains. This suggests that language shares neural resources with those used for processing information in other domains.
Infants begin acquiring object labels as early as 12 months of age. Recent research has indicated that the ability to acquire object names extends beyond verbal labels to other symbolic forms, such as gestures. This experiment examines the latitude of infants' early naming abilities. We tested 17-month-olds' ability to map gestures, nonverbal sounds, and pictograms to object categories using a forced-choice triad task. Results indicated that infants accept a wide range of symbolic forms as object names when they are embedded in familiar referential naming routines. These data suggest that infants may initially have no priority for words over other symbolic forms as object names, although the relative status of words appears to change with development. The implications of these findings for the development of criteria for determining whether a symbol constitutes an object name early in development are considered.
Forty toddlers aged 20 to 24 months were presented with 32 pairs of images with the auditory stimulus Look followed by the name of the target image (e.g. Look . . . tree) in an intermodal preferential looking (IPL) paradigm. The same series of 16 items was presented first with one image as target and then with the other member of the pair as target. Half the children were given feedback, in the form of movement of the target image at the end of the trial, while the other half were presented with static images. IPL performance was quantified in terms of number of words showing at least 15% increase in proportion of looking time in the post-naming interval. Looking preference for the named item was correlated with parental report of vocabulary, this effect being stronger for those receiving feedback. The correlation with parental report of vocabulary comprehension was .65 for those receiving feedback, but only .37 for those with no feedback. It is concluded that the preferential looking task, which has been widely used in group studies, has the potential to act as a reliable index of comprehension level in individual children, especially when movement feedback is used to maintain attention.
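The comprehension criterion described above (a word "counts" if looking to the named image rises by at least 15% from the pre-naming to the post-naming interval) can be sketched as a simple pre/post comparison. Reading the increase as percentage points rather than relative change is an assumption here, and all names are illustrative, not the authors' code:

```python
# Hypothetical sketch of the 15% looking-increase criterion.

def looking_proportion(target_ms, total_ms):
    """Fraction of an interval spent looking at the target image."""
    if total_ms == 0:
        return 0.0
    return target_ms / total_ms

def shows_comprehension(pre_target_ms, pre_total_ms,
                        post_target_ms, post_total_ms,
                        threshold=0.15):
    """True if looking proportion rises by at least `threshold` after naming."""
    pre = looking_proportion(pre_target_ms, pre_total_ms)
    post = looking_proportion(post_target_ms, post_total_ms)
    return (post - pre) >= threshold

# Example: 40% looking at the target pre-naming, 70% post-naming.
print(shows_comprehension(800, 2000, 1400, 2000))
```

An IPL score for one child would then be the count of words for which this criterion holds across the item set.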