Infant-directed speech facilitates lexical learning in adults hearing Chinese: Implications for language acquisition

Abstract

Experiments 1 and 2 examined the effects of infant-directed (ID) speech on adults' ability to learn an individual target word in sentences in an unfamiliar, non-Western language (Chinese). English-speaking adults heard pairs of sentences read by a female, native Chinese speaker in either ID or adult-directed (AD) speech. The pairs of sentences described slides of 10 common objects. The Chinese name for the object (the target word) was placed in an utterance-final position in experiment 1 (n = 61) and in a medial position in experiment 2 (n = 79). At test, each Chinese target word was presented in isolation in AD speech in a recognition task. Only subjects who heard ID speech with the target word in utterance-final position demonstrated learning of the target words. The results support assertions that ID speech, which tends to put target words in sentence-final position, may assist infants in segmenting and remembering portions of the linguistic stream. In experiment 3 (n = 23), subjects judged whether each of the ID and AD speech samples prepared for experiments 1 and 2 was directed to an adult or to an infant. Judgements were above chance for two types of sentence: ID speech with the target word in the final position and AD speech with the target word in a medial position. In addition to indirectly confirming the results of experiments 1 and 2, these findings suggest that at least some of the prosodic features which comprise ID speech in Chinese and English must overlap.
... CDS has characteristics that are shared across various cultures (Broesch & Bryant, 2015; Ferguson, 1977; Fernald, 1989), but the extent to which CDS and adult-directed speech (ADS) differ from one another varies across cultures (Broesch & Bryant, 2018; Farran et al., 2016; Fernald et al., 1989). Despite this variability, listeners can readily distinguish whether the speech they hear addresses a child or an adult and extract intentions from CDS even in unfamiliar languages (Bryant & Barrett, 2007; Fernald, 1989; Golinkoff & Alioto, 1995). By showing that CDS is not only evident, but also readily recognised across cultures, these studies suggest universal links between physical characteristics of CDS and its communicative purpose, supporting the biological significance of this signal (e.g., Bryant & Barrett, 2007). ...
... However, they do not seem to be necessary: Listeners could still make correct age-related inferences in the low-pass filtered condition, where the content cues were largely unavailable and listeners presumably had to rely on prosodic cues instead. Previous findings suggest that adults not only identify CDS (Golinkoff & Alioto, 1995), but also distinguish different intentions in CDS (Bryant & Barrett, 2007; Fernald, 1989) even in the absence of the semantic and lexical cues. The present findings suggest that listeners' sensitivity extends to age-related variations and this sensitivity, similar to previous findings, does not seem to be solely dependent on the content-related cues in speech. ...
Article
This study investigated adult listeners’ ability to detect age-related cues in child-directed speech (CDS). Participants (N = 186) listened to two speech recordings directed at children between the ages of 6 and 44 months and guessed which had addressed a younger or an older child. The recordings came from North American English-speaking mothers and listeners were native speakers of Turkish with varying degrees of English knowledge. Participants were randomly assigned to listen either to the original recordings or to the low-pass filtered versions. Accuracy was above chance level across all groups. Participants’ English level, age and the age difference between the addressees significantly predicted accuracy. After controlling for these variables, we found a significant effect of condition. Participants’ accuracy tended to be better in the unfiltered condition with the exception of male participants without children. These results suggest that age-related variations in child-directed speech are perceptually available to adult listeners. Further, even though sensitivity to the age-related cues is facilitated by the availability of content-related cues in speech, it does not seem to be solely dependent on these cues, providing further support for the form-function relations in CDS.
... IDS is the speech style or register typically used by mothers and caregivers when speaking to babies and is characterized by the use of larger pitch variations. Many studies have shown that IDS can facilitate word learning in infants (Ma et al., 2011; Graf Estes and Hurley, 2013) and adults (Golinkoff and Alioto, 1995) due to higher salience leading to enhanced attentional processing (Golinkoff and Alioto, 1995; Kuhl et al., 1997; Houston-Price and Law, 2013; Ellis, 2016). Despite it facilitating infant and adult speech learning, IDS may have a negative effect for those with strong musical perception abilities as they might think they are hearing different words due to varying pitch contours when only one word is presented. ...
Article
Perception of music and speech is based on similar auditory skills, and it is often suggested that those with enhanced music perception skills may perceive and learn novel words more easily. The current study tested whether music perception abilities are associated with novel word learning in an ambiguous learning scenario. Using a cross-situational word learning (CSWL) task, nonmusician adults were exposed to word-object pairings between eight novel words and visual referents. Novel words were either non-minimal pairs differing in all sounds or minimal pairs differing in their initial consonant or vowel. In order to be successful in this task, learners need to be able to correctly encode the phonological details of the novel words and have sufficient auditory working memory to remember the correct word-object pairings. Using the Mistuning Perception Test (MPT) and the Melodic Discrimination Test (MDT), we measured learners’ pitch perception and auditory working memory. We predicted that those with higher MPT and MDT values would perform better in the CSWL task and in particular for novel words with high phonological overlap (i.e., minimal pairs). We found that higher musical perception skills led to higher accuracy for non-minimal pairs and minimal pairs differing in their initial consonant. Interestingly, this was not the case for vowel minimal pairs. We discuss the results in relation to theories of second language word learning such as the Second Language Perception model (L2LP).
... Such facilitative effects of IDS prosody may not be limited to child language acquisition. Golinkoff and Alioto (1995) found that English-speaking adults learned Chinese words better when these words were produced in IDS-like speech (exaggerated in prosody) and were placed in utterance-final position, suggesting that properties of IDS (including prosody and word order) may continue to promote second language learning in adults. Recent evidence suggests that adult native English speakers learn Chinese target words better when they are presented in IDS-like speech (Ma, Fiveash, Margulis, Behrend, & Thompson, 2020). ...
Article
This study examines (1) whether infant-directed speech (IDS) facilitates children’s word learning compared to adult-directed speech (ADS); and (2) the link between the prosody of IDS in word-learning contexts and children’s word learning from ADS and IDS. Twenty-four Dutch mother-child dyads participated when children were 18 and 24 months old. We collect mothers’ ADS and IDS at both ages and test children’s word learning from ADS and IDS at 24 months. We find that Dutch 24-month-old children could reliably learn novel words from both ADS and IDS, and IDS had a facilitative effect. In addition, children’s word learning from IDS (but not ADS) is predicted by IDS pitch range when mothers introduce unfamiliar words to children at 18 months. Our findings contribute to an understanding of the role of IDS prosody in language development, highlighting both individual differences and contextual differences in IDS prosody.
... For reviews, see (Pine, 1994; Richards, 1994). Each are thought to bestow distinct learning advantages on the language learner, such as more accurate speech segmentation, and word recognition (Golinkoff & Alioto, 1995). Given the supporting role of child-directed speech in facilitating various aspects of language acquisition, we considered the possibility that distributional signals in the language environment of younger learners could, by extension, better support the discovery of lexical classes compared to speech to older children. ...
Chapter
Prior work has demonstrated that distributional dependencies between word or morpheme-like entities in artificial and naturalistic language can detect clusters of words which broadly conform to the categories of the adult language (Brent & Siskind, 2001; Mintz, 2002; Redington & Chater, 1998). In this work, we examine the hypothesis that the distributional statistics useful for the discovery of the noun category are more useful in speech to younger children compared to older children (approximately 1-3 vs. 3-6 years of age). First, using a novel method for quantifying the extent that nouns occur in mutually shared contexts, we demonstrate an advantage for speech to younger compared to older children. Second, we develop a theoretical framework for understanding why caregiver speech might be scaffolded in this way, and test its predictions against an array of information theoretic patterns computed on child-directed speech. Our account, based on entropy-maximization, and anchoring originally proposed by (Cameron-Faulkner et al., 2003), clarifies issues in incremental learning from non-stationary input (the problem faced by language learners) and paves the way towards integrating the scaffolded organisation of children's early language environment into computational models of acquisition.
... For instance, speech stimuli with the prosodic properties of IDS have been shown to facilitate infants' neural encoding of speech (Kalashnikova, Peter, et al., 2018; Zangl & Mills, 2007), vowel discrimination (Trainor & Desjardins, 2002), segmentation of continuous speech (Thiessen et al., 2005), and word learning (Graf Estes & Hurley, 2013; Ma et al., 2011). Adults also benefit from these properties as they are more successful at learning novel words when they are produced in IDS than in ADS (Golinkoff & Alioto, 1995; Ma et al., 2020). ...
Preprint
This review considers the acoustic features of a clear speech register directed to non-native listeners known as Foreigner Directed Speech (FDS). We identify vowel hyperarticulation and low speech rate as the most representative acoustic features of FDS; other features, including wide pitch range and high intensity, are still under debate. We also discuss factors that may influence the outcomes and characteristics of FDS. We start by examining accommodation theories (Lindblom, 1990), outlining the reasons why FDS is likely to serve a didactic function by helping listeners acquire a second language (L2). We examine how this speech register adapts to listeners’ identities and linguistic needs, suggesting that FDS also takes listeners’ L2 proficiency into account. To confirm the didactic function of FDS, we compare it to other clear speech registers, specifically Infant Directed Speech and Lombard Speech. However, our review reveals that research has not yet established whether FDS, in fact, succeeds as a didactic tool that supports L2 acquisition. Our review reveals a complex set of contextual factors that determine specific realisations of FDS, which need further exploration. We conclude our review by summarising open questions and indicating directions and recommendations for future research.
... Children of both language groups were more likely to accept "pralt" as an object reference than "skik", which was more likely to be paired with an action. The general characteristics of child-directed speech, in terms of varied intonation and emphasised prosody, can facilitate learning novel word-object mappings (Fernald & Mazzie, 1991; Golinkoff & Alioto, 1995). Furthermore, Messer (1981) found that in almost 50% of utterances the word in a sentence with the highest amplitude was the referring word, therefore a potential source of the benefit of child-directed speech for word learning being that prosody provides important cues to the speaker's intended meaning, and can thus constrain the identification of the referring word in a complex utterance. ...
Article
Purpose: This scoping review considers the acoustic features of a clear speech register directed to nonnative listeners known as foreigner-directed speech (FDS). We identify vowel hyperarticulation and low speech rate as the most representative acoustic features of FDS; other features, including wide pitch range and high intensity, are still under debate. We also discuss factors that may influence the outcomes and characteristics of FDS. We start by examining accommodation theories, outlining the reasons why FDS is likely to serve a didactic function by helping listeners acquire a second language (L2). We examine how this speech register adapts to listeners' identities and linguistic needs, suggesting that FDS also takes listeners' L2 proficiency into account. To confirm the didactic function of FDS, we compare it to other clear speech registers, specifically infant-directed speech and Lombard speech. Conclusions: Our review reveals that research has not yet established whether FDS succeeds as a didactic tool that supports L2 acquisition. Moreover, a complex set of factors determines specific realizations of FDS, which need further exploration. We conclude by summarizing open questions and indicating directions and recommendations for future research.
Article
Language shapes object categorization in infants. This starts as a general enhanced attentional effect of language, which narrows to a specific link between labels and categories by twelve months. The current experiments examined this narrowing effect by investigating when infants track a consistent label across varied input. Six-month-old infants (N = 48) were familiarized to category exemplars, each presented with the exact same labeling phrase or the same label in different phrases. Evidence of object categorization at test was only found with the same phrase, suggesting that infants were not tracking the label’s consistency, but rather that of the entire input. Nine-month-olds (N = 24) did show evidence of categorization across the varied phrases, suggesting that they were tracking the consistent label across the varied input.
Article
Acquiring language is a major developmental feat that all typical, healthy children achieve during the first years of their lives. The ease and speed with which they acquire their native language(s) has puzzled parents, scholars, and the general public alike. The last five decades have brought about a spectacular increase in our knowledge of how young infants acquire their mother tongues. Sophisticated behavioral, corpus-based, and brain imaging techniques have been developed to query young learners' journey into language. This chapter summarizes what we currently know of typical language development during the first years of life. It starts out by reviewing the existing theoretical accounts of language development. It then presents the most important empirical findings about speech perception and language acquisition grouped by different subdomains, such as newborns' speech perception abilities, phoneme perception, word learning, and the early acquisition of grammar, focusing mainly on the first 3 years of life, an age by which the major milestones of language development are typically accomplished. Differences between monolingual and multilingual development are also discussed.
Article
Examined patterns of preboundary lengthening in mother–child speech by analyzing the speech of 9 women engaged in recorded play sessions with their 9–13 mo old daughters over a 6-mo period. It was found that maternal speech to children on the verge of expressive language ability was characterized by a statistically significant increase in the degree of preboundary lengthening normally expected in conversational speech. Observable but nonsignificant preboundary lengthening was observed in maternal speech to children using 1-word utterances and to children using combinatorial language. It is suggested that maternal speech exaggerates a salient cue to the identification of major syntactic units in conversational speech. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Two studies investigated adults' use of prosodic emphasis to mark focused words in speech to infants and adults. In Exp 1, 18 mothers told a story to a 14-mo-old infant and to an adult, using a picture book in which 6 target items were the focus of attention. Prosodic emphasis was measured both acoustically and subjectively. In speech to infants, mothers consistently positioned focused words on exaggerated pitch peaks in utterance-final position, whereas in speech to adults prosodic emphasis was more variable. In Exp 2, 12 women taught another adult an assembly procedure involving familiar and novel terminology. In both studies, stressed words in adult-directed speech rarely coincided with pitch peaks. However, in infant-directed speech, mothers regularly used pitch prominence to convey primary stress. The use of exaggerated pitch peaks at the ends of utterances to mark focused words may facilitate speech processing for the infant.
Article
Mothers were recorded as they informally sang a song of their choice, once to their infant and once in the infant's absence. Paired excerpts from different mothers were then presented to adult listeners, who were required to identify the infant-directed song in each pair. In Experiment 1, with singers and listeners of North American origin, infant-directed excerpts were identified with a high level of accuracy. Mothers in Experiment 2, all of Indian descent, sang Hindi songs in both contexts. Listeners of Indian and North American origin identified the infant-directed excerpts significantly better than chance, with women outperforming men and native Hindi speakers outperforming native English speakers. These findings document a distinctive style of singing to infants, some aspects of which are recognizable across cultures and musical systems. Cross-cultural differences in singing style and the relations between infant-directed song and speech are discussed.
Article
In this report we ask how mother-infant interaction affects the rate of early language acquisition. Mothers and 15-month-old infants were videotaped playing at home. Coders described (a) infants' attention to people and/or objects, (b) mothers' use of literal and conventional acts to direct infants' attention and (c) functions of mothers' utterances. Taken together, these aspects of mother-infant play predicted 40% of the variance in infants' vocabulary size at 18 months. Significant unique contributions to this prediction were made by mothers' conventional object-marking and metalingual use of language. The more mothers highlighted both (a) shared objects using conventional means, and (b) the linguistic code, the greater the variety of words their infants used at 18 months of age.
Article
Segmentation of continuous speech into its component words is a nontrivial task for listeners. Previous work has suggested that listeners develop heuristic segmentation procedures based on experience with the structure of their language; for English, the heuristic is that strong syllables (containing full vowels) are most likely to be the initial syllables of lexical words, whereas weak syllables (containing central, or reduced, vowels) are nonword-initial, or, if word-initial, are grammatical words. This hypothesis is here tested against natural and laboratory-induced missegmentations of continuous speech. Precisely the expected pattern is found: listeners erroneously insert boundaries before strong syllables but delete them before weak syllables; boundaries inserted before strong syllables produce lexical words, while boundaries inserted before weak syllables produce grammatical words.
Article
In three experiments, subjects heard alternate versions of naturally spoken sentences that allowed two prosodic realizations (They are FRYING chickens vs They are frying CHICKENS). In Experiment 1, a paraphrase choice task showed that interpretations reliably depended on the prosody used. In Experiment 2, presented sentences were tested for recognition against unfamiliar sentences. Recognition ratings were higher if the sentences were spoken with the same prosody at presentation and test than if two different prosodic versions were presented on these two occasions. This matching effect was not larger for sentences whose prosodies biased listeners toward different syntactic analyses than for sentences whose prosodies effected changes in focus. Experiment 3 showed this integration of prosody and text for nonsense sentences.
Article
The prosodic features of maternal speech addressed to 2-month-old infants were measured quantitatively in a tonal language, Mandarin Chinese, to determine whether the features are similar to those observed in nontonal languages such as English and German. Speech samples were recorded when 8 Mandarin-speaking mothers addressed an adult and their own infants. Eight prosodic features were measured by computer: fundamental frequency (pitch), frequency range per sample, frequency range per phrase, phrase duration, pause duration, number of phrases per sample, number of syllables per phrase, and the proportion of phrase time as opposed to pause time per sample. Results showed that fundamental frequency was significantly higher and exhibited a larger range over the entire sample as well as a larger range per phrase in infant-directed as opposed to adult-directed speech. Durational analyses indicated significantly shorter utterances and longer pauses in infant-directed speech. Significantly fewer phrases per sample, fewer syllables per phrase, and less phrase-time per sample occurred in infant-directed speech. This pattern of results for Mandarin motherese is similar to that reported in other languages and suggests that motherese may exhibit universal prosodic features.
Article
It has been frequently claimed that the meaning of syntactically ambiguous sentences (such as “Visiting relatives can be a nuisance”) can be made explicit by phonetic means such as stress and intonation. This study describes some ways in which such disambiguation can be accomplished. Fifteen ambiguous sentences were first read by four speakers. The ambiguities were then pointed out, and each speaker stated which meaning he had intended to produce. Each sentence was then produced again twice, the speaker making a conscious effort to differentiate between the two potential meanings. Thirty listeners tried to identify the meaning which the speakers had intended. In 10 out of 15 cases, listeners performed at better than chance level, which implies that the intention of the speakers was successfully communicated to the listeners. The suprasegmental patterns employed by the speakers in successful disambiguation were established by acoustic analysis. While stress and intonation play a part, timing seems to have been the principal means by which the two meanings of the sentence were differentiated.
Article
Adult subjects attempted to identify structures (words and constituents) in sentences of a language they did not know. They heard each sentence twice: once with a pause interrupting a structural component and once with a pause separating different structural components. They were asked to choose the version that sounded more natural. An experimental group of subjects who had been previously exposed to a spoken passage in the same language as the test sentences was more successful in identifying structures of the sentences than was the control group with previous exposure to another language. This result was interpreted as demonstrating that language structure may be partially acquired during a brief exposure without reliance on meaning. It was also noted that the experimental group identified constituents more accurately than words. This result suggested that constituents, more than words, function as acquisitional units of language.
Article
The subjects' ability to segment foreign speech was examined. Naturalness judgments regarding three syntactically defined pauses [between constituents (noun and verb phrases), words, or syllables] were obtained using a paired-presentation, forced-choice paradigm. It was hypothesized that segmentation skill developed through exposure to lexical and syntactic markers. The existence and effect of such markers was investigated by assigning subjects to various exposure conditions. Results indicated that lexical and syntactic markers exist and can be utilized by subjects in segmenting speech. Contrary to previous research, however, exposure did not facilitate performance. All groups discriminated constituents from either words or syllables, and words from syllables. Results were interpreted as reflecting the interdependence of syntax and suprasegmental phonology. Results challenged the credibility of traditional associationist accounts of language acquisition and speech perception. Results were discussed in the context of Martin's theory of the rhythmic structure of speech.