Article

Speech-like vocalized lip-smacking in geladas

Abstract

Recently, we have seen a surge of interest in identifying possible evolutionary links between primate facial communication and human speech (for example [1: Ghazanfar A.A., Takahashi D.Y., Mathur N., Fitch W. Cineradiography of monkey lip-smacking reveals putative precursors of speech dynamics. Curr. Biol. 2012; 22: 1176-1182]). One suggestion is that primate ‘lip-smacking’, a non-vocal, rhythmic movement of lips usually given in conjunction with affiliative behavior, may have been a precursor to speech [1]. This idea arose because lip-smacking shares several production features with human speech that the vocalizations of non-human primates lack, most notably a 3–8 Hz rhythm [1]. Evidence that non-human primates are indeed able to vocalize while simultaneously producing rhythmic facial movements would lend initial, but important, support to the notion that lip-smacking is a plausible evolutionary step towards speech. Here, I report that a wild primate, the gelada (Theropithecus gelada), makes a derived vocalization (the vocalization is absent in their close relatives, the Papio baboons) that is produced while lip-smacking, called a ‘wobble’. The rhythm of wobbles (6–9 Hz) closely matches that of human speech, indicating that a vocalized lip-smack produces sounds that are structurally similar to speech. Geladas are highly gregarious primates with a relatively large vocal repertoire. Their independent evolution of a speech-like vocalization involving complex facial movements provides initial support for the hypothesis that lip-smacking was a precursor to the emergence of human speech.

... Intriguingly, there is one apparent example of a vocalized lip-smacking sequence: the "wobble" call of the gelada baboon Theropithecus gelada. 110 These are prolonged vocalizations, produced mainly in affiliative contexts by males toward females, consisting essentially of a "normal" lip-smacking sequence, coupled with clear voicing (by itself termed a "moan"). Unlike speech, wobbles are consistently voiced throughout, with no alternation of voiced and unvoiced components, but nonetheless they represent a clear intermediate case, and thus proof-of-concept, that voiced lip-smacking can evolve and be stable (they are not present in other closely related baboon species). ...
... Similar rates of vocal modulation have been recently shown in multiple primate vocalizations. 109,110,114,115 An important recent paper used functional magnetic resonance imaging to examine brain activation in macaques who were induced to produce lip smacks by showing them videos of conspecific faces. 116 The production of these communicative facial gestures activated both medial and lateral motor areas, though with a bias toward the medial regions. ...
... Clearly, proto-musical songs would have had "rhythm" in the sense that modern speech does, 135 at a rate roughly shared by many nonhuman primate vocalizations. 109,110,114,115 Whether these earliest songs also had an isochronic beat cannot be determined. This suggests that the role of the medial system in generating a syllabic frame would have been present early, as suggested by MacNeilage, but in the different context of song. ...
Article
Full-text available
I explore the neural and evolutionary origins of phonological hierarchy, building on Peter MacNeilage's frame/content model, which suggests that human speech evolved from primate nonvocal jaw oscillations, for example, lip smack displays, combined with phonation. Considerable recent data, reviewed here, support this proposition. I argue that the evolution of speech motor control required two independent components. The first, identified by MacNeilage, is the diversification of phonetic “content” within a simple sequential “frame,” and would be within reach of nonhuman primates, by simply intermittently activating phonation during lip smack displays. Such voicing control requires laryngeal control, hypothesized to necessitate direct corticomotor connections to the nucleus ambiguus. The second component, proposed here, involves imposing additional hierarchical rhythmic structure upon the “flat” control sequences typifying mammalian vocal tract oscillations and is required for the flexible combinatorial capacity observed in modern phonology. I hypothesize that phonological hierarchy resulted from a marriage of a preexisting capacity for sequential structure seen in other primates, with novel hierarchical motor control circuitry (potentially evolved in tool use and/or musical contexts). In turn, this phonological hierarchy paved the way for phrasal syntactic hierarchy. I support these arguments using comparative and neural data from nonhuman primates and birdsong.
... Unlike primate vocalizations or human speech, however, lipsmacking is a comparatively simple gestural communication form and most importantly it does not typically include phonation. An exception is lip-smacking in geladas (Theropithecus gelada) which is sometimes accompanied by audible sounds (Bergman, 2013), referred to as "wobbles" (Gustison, le Roux, and Bergman, 2012). These vocal lip-smacks are also speech-like in duration and rhythmicity, at 0.41-1.09 sec and containing one to five discernable cycles, each corresponding to an opening and closing of the mouth, at a rate of 6.3-9.0 Hz (mean: 7.6 Hz) (Bergman, 2013). ...
... Although the observations from geladas show that nonhuman primates possess the ability to produce signals with similarities to speech (Bergman, 2013), this does not necessarily mean that speech evolved from lip-smacking, and in fact the rhythmic similarity of lip-smacking and speech may be coincidental and homoplastic. Considering the dearth of information regarding the origins of speech, however, the hypothesis of lip-smacking as a precursor to human speech rhythm provides one of few possible comparative avenues for investigation, and led to the discovery that, in addition to its 3-8 Hz rhythm, the temporal coordination of vocal tract elements during lip-smacking has similarities to that of speech (Ghazanfar et al., 2012; Toyoda et al., 2017). ...
Article
Objectives: It has long been recognized that in gibbons both sexes disperse from the natal group. However, the fate of dispersed individuals was rarely documented. Here we provide the first detailed information on sex differences in dispersal patterns by analyzing the spatial genetic structure of a well‐known white‐handed gibbon (Hylobates lar) population. Materials and methods: Mitochondrial DNA (mtDNA) and Y‐chromosomal haplotypes, and autosomal microsatellite genotypes were determined for individuals of the Mo Singto study site, Khao Yai National Park, Thailand. Mantel tests for the three genetic marker types were performed for 17 gibbon groups comprising 23 adult males and 18 adult females. Results: Significant positive Mantel correlations were observed for spatial distance and both autosomal microsatellite‐based as well as Y‐chromosomal haplotype‐based genetic distance among adult males. Neighboring adult males tended to be genetically related and share Y‐chromosomal haplotypes. Conversely, no significant Mantel correlations were observed either in autosomal microsatellites or mtDNA among adult females. Discussion: Our results confirm, at a genetic level, hypotheses from long‐term demographic observations that white‐handed gibbon males of the Mo Singto population primarily disperse into adjacent groups. Females, by contrast, disperse more opportunistically either to adjacent or more distant groups. This sex‐specific difference reflects an apparent greater tolerance between males than between females. The higher tolerance of adult males allows the formation of stable multimale groups and facilitates male dispersal into an adjacent group. Stable multifemale groups have never been documented for white‐handed gibbons probably due to feeding competition between females.
... In non-human primates, lipsmacking appeases the recipient of the behaviour and facilitates affiliation (Evers et al., 2014). Some studies have assessed its frequency, duration and inter-individual variability, as well as the tuning process throughout ontogeny (Bergman, 2013). Lipsmacking is related to the regulatory mechanisms of the infant (Bergman, 2013) and in mother-infant interactions this display may be presented in an exaggerated way in combination with mouth-to-mouth contact (Ferrari et al., 2009). However, we are far from a full comprehension of its function, especially since this display can vary across species, individuals and contexts. ... [https://doi.org/10.1017/ehs.2023.10, published online by Cambridge University Press]
Article
Full-text available
Capuchin monkeys have rich social relationships and from very young ages they participate in complex interactions with members of their group. Lipsmacking behaviour, which involves at least two individuals in socially-mediated interactions, may tell about processes that maintain, accentuate or attenuate emotional exchanges in monkeys. Lipsmacking is a facial expression associated with the establishment and maintenance of affiliative interactions, falling under the “emotional regulation” umbrella, which accounts for the ability to manage behavioural responses. We investigated behaviours related to the emitter and to the receiver (infant) of lipsmacking to answer the question of how lipsmacking occurs. In capuchin monkeys, lipsmacking has been previously understood solely as a face-to-face interaction. Our data show that emitters are engaged with infants, looking longer towards their face and seeking eye contact during the display. However, receivers spend most of the time looking away from the emitter and stay in no contact for nearly half of the time. From naturalistic observations of wild infant capuchin monkeys from Brazil we found that lipsmacking is not restricted to mutual gaze, meaning there are other mechanisms in place than previously known. Our results open paths to new insights about the evolution of socio-emotional displays in primates.
... Primates' vocalisations and human speech present homologies in terms of articulation and acoustics by production and use of proto-vowels (through typical, 'voiced' calls/vocalisations such as grunts and barks, resulting from the activation of their vocal folds and their regular oscillation) and proto-consonants (through atypical, 'voiceless' calls such as lipsmacks and raspberries, resulting from supra-laryngeal manoeuvring) either singly or in relatively simple syllable-like call combinations (Lameira, 2014, 2018) [... (Preuschoft, 1995); crested macaques, Macaca nigra (Thierry et al., 2000); gelada baboons, Theropithecus gelada (Bergman, 2013); rhesus macaques (Partan, 2002)]. These findings suggest that (i) our last common ancestor with Cercopithecoidea (around 25 million years ago) exhibited ancestral articulatory abilities; and (ii) early hominids (around 7 million years ago) could have been able to produce a small repertoire of consonants and consonant-vowel combinations (i.e. ...
... Furthermore, reports show that monkey lipsmacking [a lip-smack is a rhythmic oro-facial expression commonly used during face-to-face affiliative interactions between primates (e.g. Van Hoof, 1962; Van Lawick-Goodall, 1968)] and adult human speech both exhibit a 3-8 Hz rhythmic frequency [humans (Greenberg et al., 2003; Chandrasekaran et al., 2009); gelada baboons (Bergman, 2013; Gustison & Bergman, 2017); rhesus macaques (Ghazanfar, Chandrasekaran & Morrill, 2010; Ghazanfar, 2012)]. In addition, the structure and development of macaque monkeys' lip-smacking is consistent with the rhythmic structure of human language, from infant babbling to adult speech (Morrill et al., 2012). ...
Article
Full-text available
Investigating in depth the mechanisms underlying human and non-human primate intentional communication systems (involving gestures, vocalisations, facial expressions and eye behaviours) can shed light on the evolutionary roots of language. Reports on non-human primates, particularly great apes, suggest that gestural communication would have been a crucial prerequisite for the emergence of language, mainly based on the evidence of large communication repertoires and their associated multifaceted nature of intentionality that are key properties of language. Such research fuels important debates on the origins of gestures and language. We review here three non-mutually exclusive processes that can explain mainly great apes’ gestural acquisition and development: phylogenetic ritualisation, ontogenetic ritualisation, and learning via social negotiation. We hypothesise the following scenario for the evolutionary origins of gestures: gestures would have appeared gradually through evolution via signal ritualisation following the principle of derived activities, with the key involvement of emotional expression and processing. The increasing level of complexity of socioecological lifestyles and associated daily manipulative activities might then have enabled the acquisition and development of different interactional strategies throughout the life cycle. Many studies support a multimodal origin of language. However, we stress that the origins of language are not only multimodal, but more broadly multicausal. We propose a multicausal theory of language origins which better explains current findings. It postulates that primates’ communicative signalling is a complex trait continually shaped by a cost–benefit trade-off of signal production and processing of interactants in relation to four closely interlinked categories of evolutionary and life cycle factors: species, individual and context-related characteristics as well as behaviour and its characteristics. 
We conclude by suggesting directions for future research to improve our understanding of the evolutionary roots of gestures and language.
... Only data from the remaining 28 participants were analysed. These participants had a median age of 21 years (range 18-29), 20 were female, nine self-identified as a good singer, 15 self-identified as a good whistler, 26 had some degree of formal musical training (2-15 years) but only two of these had any vocal training. All participants reported normal hearing and no vocal pathology. ...
... Non-human apes have been observed to produce a variety of novel sounds, but these are most often in the form of oral sounds, such as a 'raspberry' or a whistle that use the lips or tongue as a sound source [25][26][27][118], although these species may also have a limited degree of flexibility at the laryngeal sound source [22][23][24]. The most well-documented case is that of Koko, the encultured gorilla. ...
Article
Full-text available
Vocal imitation is a hallmark of human communication that underlies the capacity to learn to speak and sing. Even so, poor vocal imitation abilities are surprisingly common in the general population and even expert vocalists cannot match the precision of a musical instrument. Although humans have evolved a greater degree of control over the laryngeal muscles that govern voice production, this ability may be underdeveloped compared with control over the articulatory muscles, such as the tongue and lips, volitional control of which emerged earlier in primate evolution. Human participants imitated simple melodies by either singing (i.e. producing pitch with the larynx) or whistling (i.e. producing pitch with the lips and tongue). Sung notes were systematically biased towards each individual’s habitual pitch, which we hypothesize may act to conserve muscular effort. Furthermore, while participants who sung more precisely also whistled more precisely, sung imitations were less precise than whistled imitations. The laryngeal muscles that control voice production are under less precise control than the oral muscles that are involved in whistling. This imprecision may be due to the relatively recent evolution of volitional laryngeal-motor control in humans, which may be tuned just well enough for the coarse modulation of vocal-pitch in speech.
... In addition, there are cases of other, non-human, even non-vocal-learning species that are capable of producing human-like vowels (Vs) and consonants (Cs), such as the gelada baboons (Theropithecus gelada), which seem to possess an extremely rich sound repertoire, comparable to that of humans. More specifically, it has been shown that this species is able to produce vocalizations that not only employ what we would perceive as Cs and Vs, but also are structured in a way that resembles human sound systems, with different vowel qualities and Cs distinguished by manner and place of articulation, as well as duration similar to that of human speech (Richman, 1976, et seq.; Bergman, 2013). There are, of course, different ways of articulating sounds with the same acoustic effect, even among humans, but the very fact that there are indeed other species that are able to produce Cs and Vs in a dynamic manner and yet lack human-like speech shows that merely having that inventory is not diagnostic of either speech or language. ...
Article
Full-text available
There is still no categorical answer as to why humans, and no other species, have speech, or why speech is the way it is. Several purely anatomical arguments have been put forward, but they have been shown to be false, biologically implausible, or of limited scope. This perspective paper supports the idea that evolutionary theories of speech could benefit from a focus on the cognitive mechanisms that make speech possible, for which antecedents in evolutionary history and brain correlates can be found. This type of approach is part of a very recent but rapidly growing trend that has already provided crucial insights on the nature of human speech by focusing on the biological bases of vocal learning. Here we contend that a general mechanism of attention, which manifests itself not only in the visual but also in the auditory modality, might be one of the key ingredients of human speech, in addition to the mechanisms underlying vocal learning, and the pairing of facial gestures with vocalic units.
... For example, a study on captive chimpanzees found that individuals who combined vocalizations and orofacial movements to produce attention-getting sounds had a higher deposition of gray matter in brain regions associated with motor control, compared to individuals that did not produce these signals (Bianchi et al. 2016). Similarly, other researchers have proposed that oropharyngeal motor coordination, such as lip-smacking in primates, creates rhythmic sound utterances that may have provided the ancestral basis for human speech (Bergman 2013; Ghazanfar and Takahashi 2014). These slight movements of the mouth, tongue and larynx are thereby argued to have facilitated the transition in language evolution from a primarily gestural mode to the acoustic channel (Corballis 2017). ...
Chapter
Full-text available
Human evolution is defined by a multifaceted interplay of biological and cultural factors, which comprise the focus of a diverse spectrum of scientific fields. This edited volume aims to establish interdisciplinary links through a series of nine studies that critically discuss the current methods, hypotheses frameworks, and future perspectives for reconstructing habitual behavior in past humans. The authors are specialists in the fields of biological anthropology, primatology, experimental archaeology, and linguistics.
... An apparent existence proof showing perhaps a first step in such an evolutionary trajectory comes from the vocalized lip smack or 'wobble' of the gelada (Theropithecus gelada), which appears to involve overlaying lip smacking on a 'moan' vocalization, both of which are separately present in their repertoire. 159 This does not explain the evolution of pitch coordination, but it does complicate the claim that the pitch coordination system evolved first. A second possible objection comes from vocal convergence in great apes (e.g. Watson et al. 160), a simple form of vocal learning where vocalizations of individuals in a group become similar, or the apparent ability of orangutans to modify the pitch (higher versus lower) of an existing call in response to a higher or lower pitch cue. ...
Article
Classical neural architecture models of speech production propose a single system centered on Broca's area coordinating all the vocal articulators from lips to larynx. Modern evidence has challenged both the idea that Broca's area is involved in motor speech coordination and that there is only one coordination network. Drawing on a wide range of evidence, here we propose a dual speech coordination model in which laryngeal control of pitch-related aspects of prosody and song are coordinated by a hierarchically organized dorsolateral system while supralaryngeal articulation at the phonetic/syllabic level is coordinated by a more ventral system posterior to Broca's area. We argue further that these two speech production subsystems have distinguishable evolutionary histories and discuss the implications for models of language evolution.
... Similar conclusions were drawn from the analysis of other primate species. For instance, geladas produce so-called wobbles, which are produced while lip-smacking (Bergman, 2013). ...
Chapter
A unique overview of the human language faculty at all levels of organization. Language is not only one of the most complex cognitive functions that we command, it is also the aspect of the mind that makes us uniquely human. Research suggests that the human brain exhibits a language readiness not found in the brains of other species. This volume brings together contributions from a range of fields to examine humans' language capacity from multiple perspectives, analyzing it at genetic, neurobiological, psychological, and linguistic levels. In recent decades, advances in computational modeling, neuroimaging, and genetic sequencing have made possible new approaches to the study of language, and the contributors draw on these developments. The book examines cognitive architectures, investigating the functional organization of the major language skills; learning and development trajectories, summarizing the current understanding of the steps and neurocognitive mechanisms in language processing; evolutionary and other preconditions for communication by means of natural language; computational tools for modeling language; cognitive neuroscientific methods that allow observations of the human brain in action, including fMRI, EEG/MEG, and others; the neural infrastructure of language capacity; the genome's role in building and maintaining the language-ready brain; and insights from studying such language-relevant behaviors in nonhuman animals as birdsong and primate vocalization. Section editors: Christian F. Beckmann, Carel ten Cate, Simon E. Fisher, Peter Hagoort, Evan Kidd, Stephen C. Levinson, James M. McQueen, Antje S. Meyer, David Poeppel, Caroline F. Rowland, Constance Scharff, Ivan Toni, Willem Zuidema
... Nevertheless, modality preference aside, there is a growing body of research showing the potential of voluntary behavior in the vocal apparatus of primates, against some of the previously mentioned accounts (e.g., Hewes, 1973). The recent evidence shows that primates exhibit voluntary control over their lip and tongue movements, including rhythmic lip-smacking (Bergman 2013; Fitch 2010; Ghazanfar et al. 2012; see Kendon 2017, 167). Koko, a well-known human-fostered female gorilla, exhibited advanced control over her breathing behavior (see Perlman, 2017, for an overview). ...
Thesis
Full-text available
This dissertation is concerned with the major theme of iconicity and its prevalence on different linguistic levels. Iconicity refers to a resemblance between the linguistic form and the meaning of a referent (cf. Perniss and Vigliocco, 2014). Just like a sculpture resembles an object or a model, so can the sound or shape of words resemble the thing they refer to. Previous theoretical approaches emphasize that arbitrariness of the linguistic sign is one of the main features of human language; iconicity, however, may have played a role for language evolution, but is negligible in contemporary language. In contrast, the main point of this thesis is to explore the potential and the importance of iconicity in the language nowadays. The individual chapters of the dissertation can be viewed as separate parts that, taken together, reveal the comprehensive spectrum of iconicity. Starting from the language evolutionary debate, the individual chapters address iconicity on different linguistic levels. I present experimental evidence on sound symbolism, using the example of German Pokémon names, on iconic prosody, and on iconic words, the so-called ideophones. The results of the individual investigations point to the widespread use of iconicity in contemporary German. Moreover, this dissertation deciphers the communicative potential of iconicity as a force that not only enabled the emergence of language, but also persists after millennia, unfolding again and again and encountering us every day in speech, writing, and gestures.
... Another theory proposes that spoken language emerged from the ability of primates to rhythmically articulate the lips and tongue in their vocal and facial communication (e.g. Bergman 2013; Ghazanfar and Takahashi 2014). When humans speak, the closing and opening movements of the mouth follow a typical rhythm of 2 to 7 Hz (Chandrasekaran et al. 2009), and the acoustic signals are emitted with a rhythmicity ranging from 3 to 8 Hz (Ghazanfar and Takahashi 2014). ...
Thesis
Full-text available
This work contributes to the study of the evolutionary origins of language by searching for language-like properties in the gestural and multimodal communication of captive cercopithecid primates, collared mangabeys. Using a dual observational and experimental approach, we showed that mangabey gestures meet the defining criteria of intentional communication and can be produced flexibly in different contexts. Our observations also provide initial evidence for intentionality in the facial expressions of cercopithecids, which are often regarded as mere cues to emotional state. This sociocognitive property of language could thus be older in the evolutionary history of primates than previously thought, and may have been inherited from the gestural communication of the ancestors of catarrhines, about 29 million years ago. In addition, we found a significant effect of interactional context on the gestural laterality of mangabeys, suggesting that social factors were particularly important in the emergence of hemispheric specialization for intentional communication, including human language. Finally, using an original method based on sequence and network analyses, we described the multimodal and multicomponent communication of collared mangabeys and showed that they flexibly combine different types and modalities of signals depending on context and sociodemographic factors. Our results underline the importance of a multimodal approach for understanding the complexity of primate communication, and provide initial insight into the function of signal combinations. Future comparisons with other species and in different environments could refine our knowledge of the possible evolutionary constraints that favored such complexity in the communication of human and non-human primates.
... A myriad of evidence points to the existence of rhythmic structures in vocal behaviors in humans and vertebrate animals (Moore et al., 2014). This rhythmicity of vocal production has been shown in primates [humans and non-human primates (NHPs)], mammals (such as rodents), fish, and songbirds (Bergman, 2013; Castellucci et al., 2018; Feng & Bass, 2016; Ghazanfar, 2013; Gustison et al., 2012; Herzing, 2015; Lemasson et al., 2011; Moore et al., 2014; Norton & Scharff, 2016; Ravignani, 2018). For instance, vocal patterns of songbirds are important for understanding rhythm in vocalization (Benichov et al., 2016; Fehér et al., 2009). ...
Article
Full-text available
Central pattern generators (CPGs) generate the rhythmic and coordinated neural features necessary for the proper conduction of complex behaviors. In particular, CPGs are crucial for complex motor behaviors such as locomotion, mastication, respiration, and vocal production. While the importance of these networks in modulating behavior is evident, the mechanisms driving these CPGs are still not fully understood. On the other hand, accumulating evidence suggests that astrocytes have a significant role in regulating the function of some of these CPGs. Here, we review the location, function, and role of astrocytes in locomotion, respiration, and mastication CPGs and propose that, similarly, astrocytes may also play a significant role in the vocalization CPG. • CPGs are crucial for complex motor behaviors such as locomotion, mastication, and respiration. Astrocytes have a significant circuit‐specific role in regulating the function of these CPGs. We propose that astrocytes may have a critical role in modulation of vocal production CPG.
... If true, and if different species share components of musicality to differing degrees, then across species, production or proficiency in "musical" behaviors should predict both the number and complexity of social bonds. For example, gelada baboons live in unusually large and complex groups for primates, and they also exhibit rhythmic and melodic vocal features that are unique among primates (Bergman, 2013; Gustison, Aliza, & Bergman, 2012; Richman, 1978, 1987). Similar to geladas, many parrot species live in large fission-fusion social groups, and members of the parrot clade show vocal imitation, call convergence, duetting, and the capacity for rhythmic synchronization (Balsby & Scarl, 2008; Bradbury, 2001; Scarl & Bradbury, 2009; Schachner et al., 2009). ...
Article
Full-text available
We compare and contrast the 60 commentaries by 109 authors on the pair of target articles by Mehr et al. and ourselves. The commentators largely reject Mehr et al.'s fundamental definition of music and their attempts to refute (1) our social bonding hypothesis, (2) byproduct hypotheses, and (3) sexual selection hypotheses for the evolution of musicality. Instead, the commentators generally support our more inclusive proposal that social bonding and credible signaling mechanisms complement one another in explaining cooperation within and competition between groups in a coevolutionary framework (albeit with some confusion regarding terminologies such as “byproduct” and “exaptation”). We discuss the proposed criticisms and extensions, with a focus on moving beyond adaptation/byproduct dichotomies and toward testing of cross-species, cross-cultural, and other empirical predictions.
Article
Savage et al. argue for musicality as having evolved for the overarching purpose of social bonding. By way of contrast, we highlight contemporary predictive processing models of human cognitive functioning in which the production and enjoyment of music follows directly from the principle of prediction error minimization.
Article
We propose that not social bonding, but rather a different mechanism underlies the development of musicality: being unable to survive alone. The evolutionary constraint of being dependent on other humans for survival provides the ultimate driving force for acquiring human faculties such as sociality and musicality, through mechanisms of learning and neural plasticity. This evolutionary mechanism maximizes adaptation to a dynamic environment.
Preprint
We compare and contrast the 60 commentaries by 109 authors on the pair of target articles by Mehr et al. and ourselves. The commentators largely reject Mehr et al.’s fundamental definition of music and their attempts to refute 1) our social bonding hypothesis, 2) byproduct hypotheses, and 3) sexual selection hypotheses for the evolution of musicality. Instead, the commentators generally support our more inclusive proposal that social bonding and credible signaling mechanisms complement one another in explaining cooperation within and competition between groups in a coevolutionary framework (albeit with some confusion regarding terminology such as “byproduct” and “exaptation”). We discuss proposed criticisms and extensions, with a focus on moving beyond adaptation/byproduct dichotomies and toward testing of cross-species, cross-cultural, and other empirical predictions.
... For instance, some researchers have looked at facial communication and have shown that lip-smacking, a common form of primate facial movement, is produced with a periodicity that closely matches the periodicity of the gaps between syllables in many human languages. They, therefore, suggested that lip-smacking may be an evolutionary precursor to speech (Chandrasekaran et al. 2009; Ghazanfar 2013; Bergman 2013). More recently, Lameira and colleagues (2015) have shown that other facial movements, "clicks" and "faux-speech", involving lips and tongue, are also produced at speech-like rates in orangutans. ...
Thesis
Animals exhibit an astonishing diversity of communicative systems, with substantial variation in both the nature and the number of signals they produce. Variation in communicative complexity has been conceptually and empirically attributed to social complexity and formalized as the “social-complexity hypothesis for communicative complexity” (SCHCC). Indeed, group-living animals face complex social environments where they engage in a wide range of interactions with different social partners triggering the need for transmission of a broader diversity of messages. In chapter I (Peckre et al. 2019), I review the literature on the current tests of the SCHCC, pointing out and discussing what I identified as the main gaps in the current state of the art. Specifically, three key issues emerged from my analysis. The first issue concerns the operational definition of the main variables, social and communicative complexity. Notably, when defining communicative complexity, most empirical tests of the SCHCC focus on a single modality (e.g., acoustic, visual, olfactory) whereas several good reasons exist for acknowledging the multimodal nature of both, signals and communicative systems in this framework. At the system level, focusing on only one modality may lead to over- or underestimation of the relationship between social and communicative complexity. The second issue relates to the fact that while numerous studies have highlighted a link between social and communicative complexity, their correlative nature does not permit conclusions about the direction of causality. Indeed, alternative hypotheses involving anatomical, phylogenetical, or ecological factors have also been proposed to explain the evolution of more complex forms of communication. Finally, I note that researchers rarely address the actual ways in which social factors directly affect variation in signaling. 
Indeed, the underlying mechanisms of this link are usually left unexplored, failing to uncover the specific attribute of communication that would be co-evolving with specific aspects of sociality. I, therefore, make a plea for expanding tests of the SCHCC in 1) scope (systematic approach across modalities) and 2) depth (characterization of the observed relationships) as I believe it may significantly advance our understanding of the intricate links between animal sociality and communication. To address point 1), I offer in chapter II a comprehensive approach of the cross-modal communicative systems of two closely related true lemur species having similar morphology, living in similar habitats, but differing in their social systems. I studied wild Eulemur rufifrons and E. mongoz in Madagascar, respectively in Kirindy and Ankatsabe forests for 12 months. I describe a new analytical framework to assess the complexity of signaling systems across modalities. Applying a multimodal approach may help to uncover the different selective pressures acting on the communicative system and to understand better adaptive functions that might be unclear from the study of its separate components independently. E. rufifrons, the species having the more complex social system, also had overall a more complex communicative system than the one of the E. mongoz. Both careful choices of the species to compare to limit the effect of possible additional selective pressures and exploration of the social function of the non-homologous signals allow concluding that this increased complexity of the communicative system in E. rufifrons is most likely associated with social selective pressures. I developed this new analytical framework, partly based on using a cross-modal network approach, with the perspective of facilitating cross-taxonomic comparisons. 
Moreover, this approach may be combined with new multi-dimensional approaches to social complexity and contribute to a more holistic approach to tests of the SCHCC. In this way, we should be able to derive new testable hypotheses that would contribute to a better understanding of the course of events that led to the evolution of communication diversity in its distinct dimensions. In chapter III, I address point 2) by investigating the impacts of sociality on the expression of a multimodal signal, the anogenital scent-marking behavior of wild red-fronted lemurs. I specifically investigated intragroup audience effects on anogenital scent-marking behaviors in a wild population of red-fronted lemurs, and in particular whether males and females differed in this respect and whether these differences may reveal functional differences associated with anogenital scent-marking across sexes. I found an intragroup audience effect in males but not in females. Males deposited anogenital marks less often when more males were present within a three-meter range compared to five- or ten-meter ranges. Males may prefer to reduce the risk of physical contact by avoiding scent-marking near other males, and/or give other males priority to scent-mark. With these results, I provide important insights into the functional significance of anogenital scent-marking in red-fronted lemurs and support the idea of greater intragroup social pressures associated with anogenital scent-marking in males than in females in egalitarian species. Studying the flexibility of complex signal usage (e.g., occurrence or structural modifications) across social contexts (audiences) should permit the identification of different individual social characteristics that may elicit or constrain complex signal expression. These social characteristics may later constitute social pressures acting for or against the evolution of these complex signaling behaviors. 
In chapters IV and V, I also address ethical questions related to this project and the way I tried to adapt and best address my responsibilities for animal welfare. In chapter IV, I describe some technical details and ethical concerns that arose during the choice of my field sites, while in chapter V (Buil and Peckre et al. 2019) I present a remotely releasable collar system developed in collaboration with the Neurobiology Laboratory (German Primate Center, Göttingen, Germany), intended as a tool to significantly reduce the number of captures in studies using bio-logging for medium-sized mammal species. Overall, by emphasizing the importance of the multimodal nature of communicative systems and the social context in which signals are exchanged, I hope to stimulate the development of new tests of the SCHCC based on this expanded framework. I additionally argue for the importance of looking across research fields since striking parallels may be observed between animal behavior and linguistic research when addressing the origins of communication complexity, be it in the form of human language or animal signaling.
... To date, speech-like rhythms have been detected in the facial expressions of Old World monkeys during teeth chattering [15] and vocal [16] and non-vocal lip smacking [13]. Furthermore, recent studies in apes also identified theta rhythms in the song phrases of gibbons [12] and voiceless clicks and faux-speech in one orangutan [17]. ...
Article
Human speech shares a 3-8-Hz theta rhythm across all languages [1-3]. According to the frame/content theory of speech evolution, this rhythm corresponds to syllabic rates derived from natural mandibular-associated oscillations [4]. The underlying pattern originates from oscillatory movements of articulatory muscles [4, 5] tightly linked to periodic vocal fold vibrations [4, 6, 7]. Such phono-articulatory rhythms have been proposed as one of the crucial preadaptations for human speech evolution [3, 8, 9]. However, the evolutionary link in phono-articulatory rhythmicity between vertebrate vocalization and human speech remains unclear. From the phonatory perspective, theta oscillations might be phylogenetically preserved throughout all vertebrate clades [10-12]. From the articulatory perspective, theta oscillations are present in non-vocal lip smacking [1, 13, 14], teeth chattering [15], vocal lip smacking [16], and clicks and faux-speech [17] in non-human primates, potential evolutionary precursors for speech rhythmicity [1, 13]. Notably, a universal phono-articulatory rhythmicity similar to that in human speech is considered to be absent in non-human primate vocalizations, typically produced with sound modulations lacking concomitant articulatory movements [1, 9, 18]. Here, we challenge this view by investigating the coupling of phonatory and articulatory systems in marmoset vocalizations. Using quantitative measures of acoustic call structure, e.g., amplitude envelope, and call-associated articulatory movements, i.e., inter-lip distance, we show that marmosets display speech-like bi-motor rhythmicity. These oscillations are synchronized and phase locked at theta rhythms. Our findings suggest that oscillatory rhythms underlying speech production evolved early in the primate lineage, identifying marmosets as a suitable animal model to decipher the evolutionary and neural basis of coupled phono-articulatory movements.
... Communicative oscillations in non-human primates are typically non-vocal. Any sound that accompanies these communications is generally produced by percussive sounds of the oral effectors, rather than through phonation at the larynx; an exception is found in the "wobble" of gelada baboons, in which a "moan" vocalization occurs during some lip smacking (Bergman, 2013). This stands in contrast to human speech, where vibration of the vocal folds in the larynx is the primary sound-source for both speaking and singing. ...
Article
A prominent model of the origins of speech, known as the “frame/content” theory, posits that oscillatory lowering and raising of the jaw provided an evolutionary scaffold for the development of syllable structure in speech. Because such oscillations are non‐vocal in most non‐human primates, the evolution of speech required the addition of vocalization onto this scaffold in order to turn such jaw oscillations into vocalized syllables. In the present functional MRI study, we demonstrate overlapping somatotopic representations between the larynx and the jaw muscles in the human primary motor cortex. This proximity between the larynx and jaw in the brain might support the coupling between vocalization and jaw oscillations to generate syllable structure. This model suggests that humans inherited voluntary control of jaw oscillations from ancestral species, but added voluntary control of vocalization onto this via the evolution of a new brain area that came to be situated near the jaw region in the human motor cortex.
... Speech-like rhythm has been uncovered in a growing number of primate signals: lip-smacks of various macaque species [11,12], stump-tailed macaques' panting calls [12], gelada's wobbles [13], gibbon song [14] and orangutan clicks and faux speech [15]. Further studies have shown that, in macaques, lip-smacks ...
Article
Speech is a human hallmark, but its evolutionary origins continue to defy scientific explanation. Recently, the open–close mouth rhythm of 2–7 Hz (cycles/second) characteristic of all spoken languages has been identified in the orofacial signals of several nonhuman primate genera, including orangutans, but evidence from any of the African apes remained missing. Evolutionary continuity for the emergence of speech is, thus, still inconclusive. To address this empirical gap, we investigated the rhythm of chimpanzee lip-smacks across four populations (two captive and two wild). We found that lip-smacks exhibit a speech-like rhythm at approximately 4 Hz, closing a gap in the evidence for the evolution of speech-rhythm within the primate order. We observed sizeable rhythmic variation within and between chimpanzee populations, with differences of over 2 Hz at each level. This variation did not result, however, in systematic group differences within our sample. To further explore the phylogenetic and evolutionary perspective on this variability, inter-individual and inter-population analyses will be necessary across primate species producing mouth signals at speech-like rhythm. Our findings support the hypothesis that speech recruited ancient primate rhythmic signals and suggest that multi-site studies may still reveal new windows of understanding about these signals' use and production along the evolutionary timeline of speech.
... behaviours [16-19], some of which can be compared, at a purely mechanical level, with the use of music instruments [20,21]. Analyses of these behaviours have been seldom done in light of dance evolution and rarely adopted accurate measures of rhythm and synchrony, with only a few exceptions in the field [22]. ...
Article
Dance is an icon of human expression. Despite astounding diversity around the world’s cultures and dazzling abundance of reminiscent animal systems, the evolution of dance in the human clade remains obscure. Dance requires individuals to interactively synchronize their whole-body tempo to their partner’s, with near-perfect precision. This capacity is motorically-heavy, engaging multiple neural circuitries, but also dependent on an acute socio-emotional bond between partners. Hitherto, these factors helped explain why no dance forms were present amongst nonhuman primates. Critically, evidence for conjoined full-body rhythmic entrainment in great apes that could help reconstruct possible proto-stages of human dance is still lacking. Here, we report an endogenously-effected case of ritualized dance-like behaviour between two captive chimpanzees – synchronized bipedalism. We submitted video recordings to rigorous time-series analysis and circular statistics. We found that individual step tempo was within the genus’ range of “solo” bipedalism. Between-individual analyses, however, revealed that synchronisation between individuals was non-random, predictable, phase concordant, maintained with instantaneous centi-second precision and jointly regulated, with individuals also taking turns as “pace-makers”. No function was apparent besides the behaviour’s putative positive social affiliation. Our analyses show a first case of spontaneous whole-body entrainment between two ape peers, thus providing tentative empirical evidence for phylogenies of human dance. Human proto-dance, we argue, may have been rooted in mechanisms of social cohesion among small groups that might have granted stress-releasing benefits via gait-synchrony and mutual-touch. An external sound/musical beat may have been initially uninvolved. We discuss dance evolution as driven by ecologically-, socially- and/or culturally-imposed “captivity”.
... These phenomena can affect the salience of rhythmic similarity (Pardo et al., 2018; Reichel et al., 2018), and it remains to be seen whether the demonstrated effect is actually transferrable from laboratory speech to spontaneous interactions. Animals perceive rhythmic cues to make judgments regarding social affiliation (Bergman, 2013; Connor, Smolker, & Bejder, 2006; Ghazanfar & Takahashi, 2014a, 2014b; Ręk & Osiejus, 2010). Here, we aimed to detect this effect in humans in a situation in which the referential code was not shared by all the parties. ...
Article
Patterns of nonverbal and verbal behavior of interlocutors become more similar as communication progresses. Rhythm entrainment promotes pro-social behavior and signals social bonding and cooperation. Yet it is unknown if the convergence of rhythm in human speech is perceived and is used to make pragmatic inferences regarding the cooperative urge of the interactors. We conducted two experiments to answer this question. For analytical purposes, we separate pulse (recurring acoustic events) and meter (hierarchical structuring of pulses based on their relative salience). We asked the listeners to make judgments on the hostile or collaborative attitude of interacting agents who exhibit different or similar pulse (exp 1) or meter (exp 2). The results suggest that rhythm convergence can be a marker of social cooperation at the level of pulse, but not at the level of meter. The mapping of rhythmic convergence onto social affiliation or opposition is important at the early stages of language acquisition. The evolutionary origin of this faculty is possibly the need to transmit and perceive coalition information in social groups of human ancestors. We suggest that this faculty could promote the emergence of the speech faculty in humans.
... Corncrakes (a bird in the rail family), for example, switch from regular rhythm with isochronous intervocalization intervals to irregular rhythm when signaling aggression [38]. Monkeys, including geladas, baboons, macaques, and marmosets, produce specific rhythms with lip-smacking frequency at around 4-5 Hz to signal affiliation [39-41]. In dolphins, specific motor rhythm patterns and synchronization of motor behavior between interacting individuals signal cooperation [42]. ...
Article
Rhythm is fundamental to every motor activity. Neural and physiological mechanisms that underlie rhythmic cognition, in general, and rhythmic pattern generation, in particular, are evolutionarily ancient. As speech production is a kind of motor activity, investigating speech rhythm can provide insight into how general motor patterns have been adapted for more specific use in articulation and speech production. Studies on speech rhythm may further provide insight into the development of speech capacity in humans. As speech capacity is putatively a prerequisite for developing a language faculty, studies on speech rhythm may cast some light on the mystery of language evolution in the human genus. Hereby, we propose an approach to exploring speech rhythm as a window on speech emergence in ontogenesis and phylogenesis, as well as on diachronic linguistic changes.
... We found that geladas are indeed capable of vocalizing while lip-smacking (Bergman, 2013; Gustison & Bergman, 2017). One of the derived affiliative calls of gelada males is the wobble. ...
Article
The origins of speech, the most complex form of animal communication, remain a puzzle. Human speech and nonhuman primate vocalizations have traditionally been viewed dichotomously, with several aspects of speech having no clear analogues in the calls of our primate relatives. The putative unique aspects of speech include a diverse array of learned sounds that are rapidly produced in rhythmic strings and continuously recombined in new sequences. However, recent research challenges the idea that these features are indeed unique to humans and suggests more continuity between nonhuman and human primates than was previously appreciated. Here we review recent findings in four areas of this emerging continuity. In light of these studies, we argue that the evolution of human speech abilities most likely originated in a primate ancestor capable of (1) producing a ‘speech-ready’ range of vowel-like sounds, (2) vocalizing with simultaneous rhythmic mouth movements, (3) combining long strings of varied and structured sounds and (4) exercising some volitional control over calls that were modified based on experience. Taken together, these results suggest that the considerable latent vocal ability that we observe in nonhuman primates is consistent with the hypothesis that a key step towards human speech was the evolution of greater cognitive control of the vocal apparatus (and not the evolution of speech-specific anatomical adaptations). By shifting research emphasis away from the mechanics of how speech is produced to the conditions that favoured more diverse, open-ended and imitative vocal systems, we hope to encourage new avenues for future comparative research.
... One interesting subset of MNs has been found to discharge in response to lip-smacking or tongue protrusion (the so-called mouth communicative MNs) (Ferrari et al., 2003). As lip-smacking is normally associated with affiliative behavior, and sometimes linked to specific vocalizations (Partan, 2002) or even to speech-like sounds (Bergman, 2013), these neurons seem to be connected to facial communication and constitute one of its neural bases (Tramacere et al., 2017; Ferrari et al., 2017). Monkey vocal communication has been classically attributed to mesial and subcortical structures and has been thought to be involuntary, resulting from emotional and motivational activations (Jürgens, 2002). ...
Article
Songbirds possess mirror neurons (MNs) that activate during the perception and execution of specific features of songs. These neurons are located in the high vocal center (HVC), a premotor nucleus implicated in song perception, production and learning, which makes it worthwhile to examine their properties and functions in vocal recognition and imitative learning. By integrating a body of brain and behavioral data, we discuss the neurophysiological, anatomical and computational properties and possible functions of songbird MNs. We state that the neurophysiological properties of songbird MNs depend on sensorimotor regions that are outside the auditory neural system. Interestingly, songbird MNs can be the result of the specific type of song representation possessed by some songbird species. At the functional level, we discuss whether songbird MNs are involved in the recognition of others' songs, by dissecting the function of recognition into various different but possibly overlapping processes: action-oriented perception, discriminative-oriented perception and identification of the signaler. We conclude that songbird MNs may be involved in recognizing other singers' vocalizations, while their role in imitative learning still requires resolving how auditory feedback is used to correct a bird's own vocal performance to match the tutor song. Finally, we compare songbird and human mirror responses, hypothesizing a case of convergent evolution, and proposing new experimental directions.
... An excellent species in which to study the specific social functions of complex vocalizations are geladas (Theropithecus gelada), a non-human primate known for its high levels of sociality and unique vocal abilities compared to closely related species like baboons (Papio spp.) (Richman 1976, 1987; Aich et al. 1990; Gustison et al. 2012; Bergman 2013). Geladas live in "reproductive units" comprised of a dominant leader male, 0-3 subordinate males, and 1-11 females with their immature offspring (Snyder-Mackler et al. 2012b). ...
Article
Several studies show that highly social taxa produce relatively more complex vocalizations. Yet, very few of these cases have demonstrated the function that vocal complexity plays within a highly social setting. Here, we assess potential functions of vocal complexity in male geladas (Theropithecus gelada) living in the Simien Mountains National Park, Ethiopia. Geladas are known for both their diverse vocalizations (routinely produced in long sequences) and their complex social structure (extremely large groups and long-term male-female bonds). We tested whether sequence complexity (i.e., including elaborate “derived” calls that are unique to geladas and absent in closely related taxa) or size (i.e., number of calls) may function (1) to counteract the challenges of living in a large group (overcoming conspecific noise and crowding, maintaining cohesion), or (2) to maintain social bonds with females. We found that an increase in conspecific noise contributed to the use of longer and more complex sequences. However, behavioral contexts in which the risk of separation was highest (i.e., traveling) were associated with only longer (but not more complex) sequences. We also found that sequence complexity (but not size) was associated with male-female bonding as complex call sequences were produced primarily when males were in close proximity to and approached females, and they led to males being groomed by females. Together, these findings suggest that, while a noisy backdrop of conspecific vocalizations might contribute to vocal complexity, the potential driver of gelada vocal complexity is the need to maintain cross-sex bonds. Significance statement Why do some animals make many diverse sounds while others make only a few simple sounds? Broad comparisons suggest that sociality may be important as more social species (e.g., those with large group size and social bonding) tend to make more types of sounds. 
Yet, it remains unclear why gregarious species need an expanded call repertoire. Here, we take advantage of previous work on a highly social primate (geladas) that identified several complex vocalizations that contribute to the gelada's expanded vocal repertoire. To better understand why geladas evolved an expanded set of calls, we focus on the context where complex calls are produced and the responses those calls elicit. We found that the potential driver of the use of more call types is the need to maintain cross-sex bonds, suggesting an important role for male-female bonds in the evolution of vocal complexity.
... Recent views on vocal production in non-human primates have revised the assumption that their speech-related motor abilities are limited. For example, facial expressions such as lip-smacking would be considered a 'homologous' motor dimension of speech, as they can be controlled [52-56]. Likewise, vocal tract control has been recently confirmed by great ape vocalizations [46]. ...
Article
Voluntary control of vocal production is an essential component of the language faculty, which is thought to distinguish humans from other primates. Recent experiments have begun to reveal the capability of non-human primates to perform vocal control; however, the mechanisms underlying this ability remain unclear. Here, we revealed that Japanese macaque monkeys can learn to vocalize voluntarily through a different mechanism than that used for manual actions. The monkeys rapidly learned to touch a computer monitor when a visual stimulus was presented and showed a capacity for flexible adaptation, such that they reacted when the visual stimulus was shown at an unexpected time. By contrast, successful vocal training required additional time, and the monkeys exhibited difficulty with vocal execution when the visual stimulus appeared earlier than expected; this occurred regardless of extensive training. Thus, motor preparation before execution of an action may be a key factor in distinguishing vocalization from manual actions in monkeys; they do not exhibit a similar ability to perform motor preparation in the vocal domains. By performing direct comparisons, this study provides novel evidence regarding differences in motor control abilities between vocal and manual actions. Our findings support the suggestion that the functional expansion from hand to mouth might be a critical evolutionary event for the acquisition of voluntary control of vocalizations.
... The lip-smacking facial expression has an affiliative function in geladas and consists of a protrusion of the lips, which are smacked together repeatedly. Grunts and moans in geladas, as in baboon species [94,95], have an important role in affiliation mechanisms [96,97]. ...
Article
Post-conflict affiliation is a mechanism favored by natural selection to manage conflicts in animal groups thus avoiding group disruption. Triadic affiliation towards the victim can reduce the likelihood of redirection (benefits to third-parties) and protect and provide comfort to the victim by reducing its post-conflict anxiety (benefits to victims). Here, we test specific hypotheses on the potential functions of triadic affiliation in Theropithecus gelada, a primate species living in complex multi-level societies. Our results show that higher-ranking geladas provided more spontaneous triadic affiliation than lower-ranking subjects and that these contacts significantly reduced the likelihood of further aggression on the victim. Spontaneous triadic affiliation significantly reduced the victim’s anxiety (measured by scratching), although it was not biased towards kin or friends. In conclusion, triadic affiliation in geladas seems to be a strategy available to high-ranking subjects to reduce the social tension generated by a conflict. Although this interpretation is the most parsimonious one, it cannot be totally excluded that third parties could also be affected by the negative emotional state of the victim thus increasing a third party’s motivation to provide comfort. Therefore, the debate on the linkage between third-party affiliation and emotional contagion in monkeys remains to be resolved.
... A fascinating hypothesis concerning the evolution of rhythmicity in speech starts with the observation that certain primate facial displays, such as lip-smacking, occur in the same theta frequency range as human speech [115,116]. These displays are typically, but not always, nearly silent [117] and consist of complex, synchronized movements of the lips, jaw, and tongue that are highly similar to speech movements [118]. Human data show that neural entrainment to visual components of speech enhances auditory perception, but only in the speech-typical theta frequency range, suggesting that the origin of speech rhythmicity may lie in pre-existing perceptual neural oscillations, to which lip-smacking, and later speech movements, became 'tuned' during evolution [119] (Box 2). ...
Article
Behavioral and brain rhythms in the millisecond-to-second range are central in human music, speech, and movement. A comparative approach can further our understanding of the evolution of rhythm processing by identifying behavioral and neural similarities and differences across cognitive domains and across animal species. We provide an overview of research into rhythm cognition in music, speech, and animal communication. Rhythm has received considerable attention within each individual field, but to date, little integration. This review article on rhythm processing incorporates and extends existing ideas on temporal processing in speech and music and offers suggestions about the neural, biological, and evolutionary bases of human abilities in these domains.
... Most scholars who study animal communication aim to identify the prerequisites of different aspects of human language capacity, as well as analogies between some traits in animal communication and human language. Thus, the ability to count is regarded as a prerequisite for recursion, preverbal concepts provide a basis for the development of language signs, birdsong syntax is considered as analogous to human language syntax (Okanoya 2002; Hurford 2012), geladas' lip-smacking is considered a precursor to speech (Bergman, 2013), and so on. In many works, animal communication systems are compared to human language in order to determine the distinctive features between the two (Hockett, 1960; Pinker & Jackendoff, 2008). ...
Preprint
Rhythm is an important component of human language and music production. Rhythms like isochrony (intervals spaced equally in time), are also present in vocalisations of certain non-human species, including several birds and mammals. This study aimed to identify rhythmic patterns with music-based methods within display behaviour of chimpanzees (Pan troglodytes), humans' closest living relatives. Behavioural observations were conducted on individuals from two zoo-housed colonies. We found isochronous rhythms in vocal (e.g. pants, grunts and hoots), as well as in motoric (e.g. swaying and stomping) behavioural sequences. Among individuals, variation was found in the duration between onsets of behavioural elements, resulting in individual-specific tempi. Despite this variation in individual tempi, display sequences were consistently structured with stable, isochronous rhythms. Overall, directed displays, targeted at specific individuals, were less isochronous than undirected displays. The presence of rhythmic patterns across two independent colonies of chimpanzees, suggests that underlying mechanisms for rhythm production may be shared between humans and non-human primates. This shared mechanism indicates that the cognitive requirements for rhythm production potentially preceded human music and language evolution.
Article
Yawning is undeniably contagious and hard to resist. Interestingly, in our species, even the mere sound of a yawn can trigger this contagious response, especially when the yawner is someone familiar. Together with humans, one other mammal species is known to produce loud and distinct vocalisations while yawning, Theropithecus gelada. Geladas are known for their complex social interactions and rich vocal communication, making them intriguing subjects for studying yawning behaviour. To explore the contagious effect of yawn sounds on geladas, we conducted playback experiments in a zoo-housed colony with animals living in two groups. We exposed them to yawn sounds (Test) or affiliative grunts (Control) produced by males from either their own group or the other one. The results were remarkable, as simply hearing yawn sounds led to yawn contagion in geladas, with multiple responses observed when the yawns came from members of their own group. This finding adds a significant contribution to the research on mimicry and behavioural contagion in primates. Moreover, it raises intriguing questions about the involvement of sensory modalities beyond visual perception in these phenomena.
Article
The so-called articulate language that we practice has always occupied philosophers and scholars, both with regard to the anatomy that makes it possible and to the question of its specific nature (i.e., exclusive to our species). This article offers a history of the main studies carried out since Antiquity, highlighting paradigm shifts and the evolution of the questions asked. In the 19th century, two proposals came to shape research lastingly: Paul Broca's identification of the seat of articulate language in the cerebral cortex, and the importance of vocal tract length and its variations, argued by Robert Willis, for explaining vowel production, which opened the way to phonetic studies. A century later, it became possible to visualize vowels in an acoustic space, and Gunnar Fant proposed the so-called source-filter theory for the production of contrasting sounds. It was in this context that Philip Lieberman set out to demonstrate that the descent of the larynx is a necessary condition for language, which allowed him to explain why monkeys and apes and Neanderthals lack it. This proposal dominated for half a century, paralyzing research on the vocal production of non-human primates used as models for addressing the capacities of fossil hominins. The recent demonstration of vocal tract dynamics in mammals, including primates, and of a maximal acoustic space shared by all now makes it possible to consider that it is not the size of the pharynx that is decisive but rather the control of the speech articulators.
This paradigm shift, while liberating for studies of non-human primates, whose anatomical and physiological particularities need to be better understood, makes the emergence of speech during human evolution even more difficult to assess from fossil material, which preserves little information about the articulators involved (mainly the mandible, tongue, and lips).
Article
This article delves into the analysis of musical affiliation in the Altai kam and kaichi mystery, by applying the methods of analyzing European musical experience to traditional Altai culture. The authors explore the physiology and psychology of music perception, along with the phenomenology and semiotics of the formation of musical experience. Furthermore, the study highlights similarities between the pre-secret understanding of music, including the formation of societal perception of coordinate systems and internment, the role of the performer in culture, and the structural function of the European and Altai musical agent. The article concludes by discussing the relevance of these findings to the shamanic and bardic traditions. In summary, understanding musical experience necessitates a comprehensive exploration of both physiological and cultural factors. Keywords: music, musical experience, semiotics, phenomenology, kam, kaichi, shaman, bard.
Article
This article is devoted to the analysis of the musical affiliation in the mystery of the Altai kam and kaichi. To do this, the authors transfer the methods of analyzing the European musical experience to the traditional Altai culture. We observe the physiology, psychology of music perception, as well as the phenomenology and semiology of the formation of musical experience. Similarities in the pre-secret understanding of music, as the formation of the perception of society of the system of coordinates and internment, the role of the figure of the performer in culture, and the relevance of comparing the structural function of the European and Altai musical agent are also substantiated.
Article
Darwin and other pioneering scholars made comparisons between human facial signals and those of non-human primates, suggesting they share evolutionary history. We now have tools available (Facial Action Coding System: FACS) to make these comparisons anatomically based and standardised, as well as analytical methods to facilitate comparative studies. Here we review the evidence establishing a shared anatomical basis between the facial behaviour of human and non-human primate species, concluding which signals are likely related, and which are not. We then review the evidence for shared function and discuss the implications for understanding human communication. Where differences between humans and other species exist, we explore possible explanations and future directions for enquiry.
Article
The lip-smack is a communicative sound object that has received very little research attention, with most work examining their occurrence in nonhuman primate interaction. The current paper aims to dissect the social potential of lip-smacks in human interaction. The analysis examines a corpus of 391 lip-smack particles produced by English-speaking parents while feeding their infants. A multimodal interaction analysis details the main features: (1) rhythmical production in a series, (2) facial-embodied aspects, and (3) temporal organisation. Lip-smacks occurred in prosodically grouped chains of mostly 3 or 5 particles, with accompanying facial expressions, and were co-ordinated with the infants’ chewing. They highlight the mechanics of chewing while framing eating as a pleasant interactional event. The paper contributes not only to the distinctly social functions of a sound object hitherto ignored in linguistics but also to research on interactional exchanges in early childhood and their potential connection to the sociality of nonhuman primates.
Article
Human language is thought to have evolved from non-linguistic communication systems present in the primate lineage. Scientists rely on data from extant primate species to estimate how this happened, with debates centering around which modality (vocalization, gesture, facial expression) was a likely precursor. In 2011, we demonstrated that different theoretical and methodological approaches are used to collect data about each modality, rendering datasets incomplete and comparisons problematic. Here, 10 years later, we conducted a follow-up systematic review to test whether patterns have changed, examining the primate communication literature published between 2011 and 2020. In sum, despite the promising progress in addressing some gaps in our knowledge, systematic biases still exist and multimodal research remains uncommon. We argue that theories of language evolution are unlikely to advance until the field of primate communication research acknowledges and rectifies the gaps in our knowledge.
Article
Why do humans make music? Theories of the evolution of musicality have focused mainly on the value of music for specific adaptive contexts such as mate selection, parental care, coalition signaling, and group cohesion. Synthesizing and extending previous proposals, we argue that social bonding is an overarching function that unifies all of these theories, and that musicality enabled social bonding at larger scales than grooming and other bonding mechanisms available in ancestral primate societies. We combine cross-disciplinary evidence from archaeology, anthropology, biology, musicology, psychology, and neuroscience into a unified framework that accounts for the biological and cultural evolution of music. We argue that the evolution of musicality involves gene-culture coevolution, through which proto-musical behaviors that initially arose and spread as cultural inventions had feedback effects on biological evolution due to their impact on social bonding. We emphasize the deep links between production, perception, prediction, and social reward arising from repetition, synchronization, and harmonization of rhythms and pitches, and summarize empirical evidence for these links at the levels of brain networks, physiological mechanisms, and behaviors across cultures and across species. Finally, we address potential criticisms and make testable predictions for future research, including neurobiological bases of musicality and relationships between human music, language, animal song, and other domains. The music and social bonding (MSB) hypothesis provides the most comprehensive theory to date of the biological and cultural evolution of music.
Preprint
Why do humans make music? Theories of the evolution of musicality have focused mainly on the value of music for specific adaptive contexts such as mate selection, parental care, coalition signaling, and group cohesion. Synthesizing and extending previous proposals, we argue that social bonding is an overarching function that unifies all of these theories, and that musicality enabled social bonding at larger scales than grooming and other bonding mechanisms available in ancestral primate societies. We combine cross-disciplinary evidence from archaeology, anthropology, biology, musicology, psychology, and neuroscience into a unified framework that accounts for the biological and cultural evolution of music. We argue that the evolution of music’s social bonding functions involves gene-culture coevolution, through which proto-musical behaviors that initially arose and spread as cultural inventions had feedback effects on biological evolution due to their impact on social bonding. We emphasize the deep links between production, perception, prediction, and social reward arising from repetition, synchronization, and harmonization of rhythms and pitches, and summarize empirical evidence for these links at the levels of brain networks, physiological mechanisms, and behaviors across cultures and across species. Finally, we address potential criticisms and make testable predictions for future research, including neurobiological bases of musicality and relationships between human music, language, animal song, and other domains. The music and social bonding (MSB) hypothesis provides the most comprehensive theory to date of the biological and cultural evolution of music.
Chapter
Language is a cognitive system unique to humans; no other animal has shown an equivalent. Recent studies strongly suggest that language did not originate from a single unique ability, but instead emerged from the integration of multiple abilities that are commonly shared with nonhuman animals. These ideas highlight the importance of comparative approaches to cognition and communication in humans and nonhuman animals. In particular, speech and vocal communication are typical examples of the basic biological components necessary for the emergence of language. The evolutionary pathway from primate vocal systems to human communication via language has always received special attention from evolutionary biologists. Here I review recent progress in studies of vocal evolution in the primate lineage, focusing on two dimensions of primate vocalization: vocal controllability and speech-homologous facial expressions in monkeys. Limited control over vocal production distinguishes other primates from humans. Recent experiments, however, have begun to reveal voluntary vocal control in nonhuman primates, with successful attempts at vocal operant conditioning. These accumulating findings point to a functional expansion of cognitive motor control from hand to mouth, suggesting an evolutionary history in which motor cortex expansion from forelimb to larynx occurred during human evolution. In addition, orofacial action is a crucial component of speech movements, which are characterized by ∼5 Hz oscillations of the lips, mouth, or jaw. A promising candidate homologue for these facial actions is lip-smacking, a primate facial display that seems to be independent of speech.
Interestingly, lip-smacking is also characterized by stable 5 Hz oscillations of the jaw or mandible, matching those of speech. Recent studies have confirmed parallels in development and kinematics between speech and lip-smacking, suggesting a common neural mechanism, a central pattern generator underlying orofacial movements, which would have evolved into speech on a sensory foundation of perceptual salience for 5 Hz rhythms that is widely observed in primate species. Thus, the stepwise acquisition of multiple independent components, such as vocal controllability and facial actions, would all have been necessary evolutionary events before speech emerged, and the integration of these components would have been key to finally establishing human speech.
Chapter
This chapter surveys progress in the realm of language evolution. Its main message is that, like vocal learning in birds, human language is a highly complex, polygenic trait that recruits numerous brain regions, over and above the classical language regions, to provide a computational regime supporting linguistic cognition. Strong adherence to the Darwinian logic of descent, with its emphasis on a rich cognitive life for nonhuman species, offers the hope to shed light on what looks like a very human-specific, domain-specific capacity like language. Such an approach requires studies that embrace the multifactorial (not only genes, not only environment), multidimensional (genome, connectome, “dynome,” “cognome”) nature of the capacity to master grammatical systems of the kinds humans do.
Article
Vocal production is hierarchical in the time domain. These hierarchies build upon biomechanical and neural dynamics across various timescales. We review studies in marmoset monkeys, songbirds, and other vertebrates. To organize these data in an accessible and across-species framework, we interpret the different timescales of vocal production as belonging to different levels of an autonomous systems hierarchy. The first level accounts for vocal acoustics produced on short timescales; subsequent levels account for longer timescales of vocal output. The hierarchy of autonomous systems that we put forth accounts for vocal patterning, sequence generation, dyadic interactions, and context dependence by sequentially incorporating central pattern generators, intrinsic drives, and sensory signals from the environment. We then show the framework's utility by providing an integrative explanation of infant vocal production learning in which social feedback modulates infant vocal acoustics through the tuning of a drive signal.
Article
Regular rhythm facilitates audiomotor entrainment and synchronization in motor behavior and vocalizations between individuals. As rhythm entrainment between interacting agents is correlated with higher levels of cooperation and prosocial affiliative behavior, humans can potentially map regular speech rhythm onto higher cooperation and friendliness between interacting individuals. We tested this hypothesis at two rhythmic levels: pulse (recurrent acoustic events) and meter (hierarchical structuring of pulses based on their relative salience). We asked the listeners to make judgments of the hostile or collaborative attitude of two interacting agents who exhibit either regular or irregular pulse (Experiment 1) or meter (Experiment 2). The results confirmed a link between the perception of social affiliation and rhythmicity: evenly distributed pulses (vowel onsets) and consistent grouping of pulses into recurrent hierarchical patterns are more likely to be perceived as cooperation signals. People are more sensitive to regularity at the level of pulse than at the level of meter, and they are more confident when they associate cooperation with isochrony in pulse. The evolutionary origin of this faculty is possibly the need to transmit and perceive coalition information in social groups of human ancestors. We discuss the implications of these findings for the emergence of speech in humans.
Article
Human speech and vocal communication of nonhuman primates share many features in semantic ability such as referentiality of calls or pragmatic inference. By contrast, a large gap is found in the phonological ability; vocal learning or volitional vocal control are hardly observed in nonhuman primates. However, recent studies have shown similarity between speech and the vocal, facial communication of nonhuman primates used in low-arousal situations. Here I focused on such vocalizations, called contact calls and close calls. These vocalizations are widely seen among primate taxa. Contact calls function to keep group cohesiveness. Close calls are soft vocalizations used in face-to-face communication and function to signal the benign intent of the caller. These vocalizations exhibit the subcomponent abilities that enable human speech as follows. First, they show some plasticity in usage and acoustic features. The use of close calls becomes partner-specific as individuals mature, and this process is influenced by other group members. The acoustic features of contact calls are modified not only developmentally through interactions with parents but also temporally during vocal exchanges. Second, just as turn-taking in human conversation, timing adjustment has been found in both contact calls and close calls, indicating a vocal coordination ability in nonhuman primates. Third, the facial movements of close calls or lip-smacking have been revealed to be putative precursors of speech rhythm. Finally, some studies have shown that monkeys can acquire volitional vocal control through training. Although primates lack direct projection from the cortex to the vocal control region in the brainstem, initiation of such vocal utterance or lip-smacking involves similar brain areas as in human speech. 
Further investigation of vocal control during contact and close calls in social, natural settings as well as in experimental settings, would contribute to a better understanding of the evolutionary course leading to speech.
Article
Nonhuman primate vocal communication is often investigated with the intention of elucidating its similarities with human speech and thus reconstructing the evolutionary history of this important behaviour. However, putative parallels between primate and human vocal behaviours have, in some respects, remained elusive. Here, we review two lines of research in marmoset monkeys on volitional vocal control and vocal learning during development that could bridge our understanding of the relationship between primate vocalizations and human speech. Regarding volitional control, we review how changes in vocal output are not solely due to changes in arousal levels and their effect on the vocal apparatus; extrinsic factors like the vocalizations from other conspecifics also have an important influence. With regard to vocal learning, we describe not only how infant marmoset vocalizations undergo dramatic acoustic changes during development that are not wholly explained by physical growth, but also how, as in humans, contingent vocal responses from parents influence the rate of vocal development. We argue that the similarities in the vocal systems of marmoset monkeys and humans may be due to their shared cooperative breeding strategy and prosociality.
Chapter
The emergence of human language is one of the biggest wonders in the universe. In this chapter, I define “a language-like communication system” and examine the components necessary for the emergence of such a system, not only on Earth but in any habitable planet. Human language is a unique system among animal communication. Language is a system of transmitting an infinite variety of meanings by combining a finite number of tokens based on a set of rules. Language is not only used in communication but also in thinking. Thus, language is a system that enables compositional semantics. I propose that at least three components are necessary for the emergence of a language-like system: segmentation of context and behavior, the association between them, and the honesty of the emitted signals. When a signal conveys sufficient information regarding the behavioral state of the sender, that signal is defined as “honest,” meaning that its production incurs physiological, temporal, and social costs. I explain each of these conditions and discuss the possibility of “language as it could be” on other planets. I also extend my argument to the future of linguistic communication.
Book
Cambridge Core - Neurology and Clinical Neuroscience - Assembly of the Executive Mind - by Michael W. Hoffmann
Book
For ninety per cent of our history, humans have lived as 'hunters and gatherers', and for most of this time, as talking individuals. No direct evidence for the origin and evolution of language exists; we do not even know if early humans had language, either spoken or signed. Taking an anthropological perspective, Alan Barnard acknowledges this difficulty and argues that we can nevertheless infer a great deal about our linguistic past from what is around us in the present. Hunter-gatherers still inhabit much of the world, and in sufficient number to enable us to study the ways in which they speak, the many languages they use, and what they use them for. Barnard investigates the lives of hunter-gatherers by understanding them in their own terms, to create a book which will be welcomed by all those interested in the evolution of language. Presents new ideas on the history of language in a scholarly but student-friendly text. Up-to-date coverage of research in linguistics, anthropology, archaeology, psychology and other fields. Includes an extensive glossary covering all relevant topics. Written by an anthropologist with over forty years of research in hunter-gatherer studies.
Article
Primates are intensely social and exhibit extreme variation in social structure, making them particularly well suited for uncovering evolutionary connections between sociality and vocal complexity. Although comparative studies find a correlation between social and vocal complexity, the function of large vocal repertoires in more complex societies remains unclear. We compared the vocal complexity found in primates to both mammals in general and human language in particular and found that non-human primates are not unusual in the complexity of their vocal repertoires. To better understand the function of vocal complexity within primates, we compared two closely related primates (chacma baboons and geladas) that differ in their ecology and social structures. A key difference is that gelada males form long-term bonds with the 2-12 females in their harem-like reproductive unit, while chacma males primarily form temporary consortships with females. We identified homologous and non-homologous calls and related the use of the derived non-homologous calls to specific social situations. We found that the socially complex (but ecologically simple) geladas have larger vocal repertoires. Derived vocalizations of geladas were primarily used by leader males in affiliative interactions with 'their' females. The derived calls were frequently used following fights within the unit suggesting that maintaining cross-sex bonds within a reproductive unit contributed to this instance of evolved vocal complexity. Thus, our comparison highlights the utility of using closely related species to better understand the function of vocal complexity.
Article
Many previous accounts of imitation have pointed out that children's copying behavior is a means by which to learn from others, while virtually ignoring the social factors which influence imitation. These accounts have thus far been unable to explain flexibility in children's copying behavior (e.g., why children sometimes copy exactly and sometimes copy selectively). We propose that the complexity of children's imitation can only be fully understood by considering the social context in which it is produced. Three critical factors in determining what is copied are children's own (learning and/or social) goals in the situation, children's identification with the model and with the social group in general, and the social pressures which children experience within the imitative situation. The specific combination of these factors which is present during the imitative interaction can lead children to produce a more or less faithful reproduction of the model's act. Beyond explaining flexibility in children's copying behavior, this approach situates the developmental study of imitation within a broader social psychological framework, linking it conceptually with closely related topics such as mimicry, conformity, normativity, and the cultural transmission of group differences.
Article
In over-imitation, children copy even elements of a goal-directed action sequence that appear unnecessary for achieving the goal. We demonstrate in 4-year olds that the unnecessary action is specifically associated with the goal, not generally associated with the apparatus. The unnecessary action is performed flexibly: 4-year olds usually omit it if it has already been performed by an adult. Most 5-year olds do not verbally report the unnecessary action as necessary when achieving the goal, although most of them report an equivalent but necessary action as necessary. Most 5-year olds explain the necessary action in functional terms, but are unsure as to the function of the unnecessary action. These verbal measures do not support the hypothesis that children over-imitate primarily because they encode unnecessary actions as causing the goal even in causally transparent systems. In a causally transparent system, explanations for over-imitation fitting the results are that children are ignorant of the unnecessary action's purpose, and that they learn a prescriptive norm that it should be carried out. In causally opaque systems, however, for children and for adults, any action performed before achieving the goal is likely to be inferred as causally necessary-this is not over-imitation, but ordinary causal learning.
Article
Humans, like other animals, are exposed to a continuous stream of signals, which are dynamic, multimodal, extended, and time varying in nature. This complex input space must be transduced and sampled by our sensory systems and transmitted to the brain where it can guide the selection of appropriate actions. To simplify this process, it's been suggested that the brain exploits statistical regularities in the stimulus space. Tests of this idea have largely been confined to unimodal signals and natural scenes. One important class of multisensory signals for which a quantitative input space characterization is unavailable is human speech. We do not understand what signals our brain has to actively piece together from an audiovisual speech stream to arrive at a percept versus what is already embedded in the signal structure of the stream itself. In essence, we do not have a clear understanding of the natural statistics of audiovisual speech. In the present study, we identified the following major statistical features of audiovisual speech. First, we observed robust correlations and close temporal correspondence between the area of the mouth opening and the acoustic envelope. Second, we found the strongest correlation between the area of the mouth opening and vocal tract resonances. Third, we observed that both area of the mouth opening and the voice envelope are temporally modulated in the 2-7 Hz frequency range. Finally, we show that the timing of mouth movements relative to the onset of the voice is consistently between 100 and 300 ms. We interpret these data in the context of recent neural theories of speech which suggest that speech communication is a reciprocally coupled, multisensory event, whereby the outputs of the signaler are matched to the neural processes of the receiver.
Article
Some representative vocalizations of captive rhesus monkey, chimpanzee, and gorilla were recorded and analyzed by means of sound spectrograms and oscillograms. It was found that these animals' vocal mechanisms do not appear capable of producing human speech. The laryngeal output was breathy and irregular. A uniform cross section, schwa-like configuration appeared to underlie all the vocalizations. These animals did not modify the shape of their supralaryngeal vocal tracts by means of tongue maneuvers during a vocalization. Formant transitions occurred in some vocalizations, but they appeared to have been generated by means of laryngeal and possibly velar or lip movements. The nonhuman primates lack a pharyngeal region like man's, where the cross‐sectional area continually changes during speech. The data suggest that speech cannot be viewed as an overlaid function that makes use of a vocal tract that has evolved solely for respiratory and deglutitious purposes; the skeletal evidence of human evolution shows a series of changes from the primate vocal tract that may have been, in part, for the purpose of generating speech. Articulate speech may not have been fully developed in some of man's ancestors. The study of the peripheral speech‐production apparatus of a fossil thus may be useful in the assessment of its phylogenetic grade.
Article
Viewing a speaker's articulatory movements substantially improves a listener's ability to understand spoken words, especially under noisy environmental conditions. It has been claimed that this gain is most pronounced when auditory input is weakest, an effect that has been related to a well-known principle of multisensory integration: "inverse effectiveness." In keeping with the predictions of this principle, the present study showed substantial gain in multisensory speech enhancement at even the lowest signal-to-noise ratios (SNRs) used (-24 dB), but it was also evident that there was a "special zone" at a more intermediate SNR of -12 dB where multisensory integration was additionally enhanced beyond the predictions of this principle. As such, we show that inverse effectiveness does not strictly apply to the multisensory enhancements seen during audiovisual speech perception. Rather, the gain from viewing visual articulations is maximal at intermediate SNRs, well above the lowest auditory SNR where the recognition of whole words is significantly different from zero. We contend that the multisensory speech system is maximally tuned for SNRs between extremes, where the system relies on either the visual (speech-reading) or the auditory modality alone, forming a window of maximal integration at intermediate SNR levels. At these intermediate levels, the extent of multisensory enhancement of speech recognition is considerable, amounting to more than a 3-fold performance improvement relative to an auditory-alone condition.
Article
It has long been asserted that habitat acoustics can determine the frequency band best adapted for long-range communication, but the generality and validity of measurements claiming to demonstrate a window of best frequencies have recently been questioned. We report the discovery of a prominent sound window in Kenyan rain forest in a study that is free of methodological difficulties. Our results allow us to calculate the range advantage attained by an animal vocalizing within the sound window, and show that sound windows can be a potent factor for the evolution of primate communication.
Article
A key feature of speech is its stereotypical 5 Hz rhythm. One theory posits that this rhythm evolved through the modification of rhythmic facial movements in ancestral primates. If the hypothesis has any validity, then a comparative approach may shed some light. We tested this idea by using cineradiography (X-ray movies) to characterize and quantify the internal dynamics of the macaque monkey vocal tract during lip-smacking (a rhythmic facial expression) versus chewing. Previous human studies showed that speech movements are faster than chewing movements, and the functional coordination between vocal tract structures is different between the two behaviors. If rhythmic speech evolved through a rhythmic ancestral facial movement, then one hypothesis is that monkey lip-smacking versus chewing should also exhibit these differences. We found that the lips, tongue, and hyoid move with a speech-like 5 Hz rhythm during lip-smacking, but not during chewing. Most importantly, the functional coordination between these structures was distinct for each behavior. These data provide empirical support for the idea that the human speech rhythm evolved from the rhythmic facial expressions of ancestral primates.
Article
The acoustic structure of mammal vocalizations often encodes information on different phenotypic traits of the caller that is potentially available to receivers. Here we used a source–filter theory approach to investigate, in light of their biophysical modes of production, which acoustic features of male and female giant panda bleats have the potential to convey information on the caller's sex, age and body size. Our results revealed that both source- and filter-related acoustic features of giant panda bleats provided reliable information on a caller's sex, age and body size. However, when we considered both sexes separately we found that male and female giant panda bleats differed in their potential to signal information on the caller's age and body size: only male bleats contained reliable information on the caller's body size while female bleats were more predictive of the caller's age. We go on to discuss the anatomical basis for our findings, consider the potential functional relevance of signalling this type of information in giant panda sexual communication, and suggest future empirical investigations that should better enable us to understand this species' mating behaviour.
Article
The idea that social motivation deficits play a central role in Autism Spectrum Disorders (ASD) has recently gained increased interest. This constitutes a shift in autism research, which has traditionally focused more intensely on cognitive impairments, such as theory-of-mind deficits or executive dysfunction, and has granted comparatively less attention to motivational factors. This review delineates the concept of social motivation and capitalizes on recent findings in several research areas to provide an integrated account of social motivation at the behavioral, biological and evolutionary levels. We conclude that ASD can be construed as an extreme case of diminished social motivation and, as such, provides a powerful model to understand humans' intrinsic drive to seek acceptance and avoid rejection.
Article
Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information. Temporal envelopes of speech were extracted from broad frequency bands and were used to modulate noises of the same bandwidths. This manipulation preserved temporal envelope cues in each band but restricted the listener to severely degraded information on the distribution of spectral energy. The identification of consonants, vowels, and words in simple sentences improved markedly as the number of bands increased; high speech recognition performance was obtained with only three bands of modulated noise. Thus, the presentation of a dynamic temporal pattern in only a few broad spectral regions is sufficient for the recognition of speech.
Article
This study was designed to test the prediction that adolescents with autism would have specific limitations in imitating the "style" of another person's actions. In a series of original tasks that tested the delayed imitation of novel nonsymbolic actions, 16 participants with autism and 16 nonautistic participants group-matched for age and verbal ability were proficient in copying goal-directed actions, but in 3 out of 4 tasks, strikingly fewer participants with autism imitated the style with which the demonstrator executed the actions. An additional finding was that on 2 conditions that involved copying self-orientated actions, only 5 of the participants with autism but 15 of the 16 nonautistic participants spontaneously adopted the orientation-to-self on at least 1 occasion. The results are discussed with reference to theories concerning imitation deficits in autism, and with regard to the proposal that autism involves an impairment in intersubjective contact between affected individuals and others (Hobson, 1989, 1993; Rogers & Pennington, 1991).
Article
This study provides an overview of the vocalizations of Barbary macaques, Macaca sylvanus. Spectrographic displays of calls are presented along with photographs of the accompanying facial gestures. We give a general description of the contexts in which the different calls are uttered, with special regard to the age and sex of the caller. The vocal repertoire of Barbary macaques mainly consists of screams, shrill barks, geckers, low-frequency pants and grunts, with gradation occurring within and between call types. The spectrograms document that typically, Barbary macaques produce series of several consecutive calls. The influence of habitat, social structure and phylogenetic descent on the morphology of the repertoire and call diversity are discussed in comparison to other species.
Article
This study explored whether the tendency of chimpanzees and children to use emulation or imitation to solve a tool-using task was a response to the availability of causal information. Young wild-born chimpanzees from an African sanctuary and 3- to 4-year-old children observed a human demonstrator use a tool to retrieve a reward from a puzzle-box. The demonstration involved both causally relevant and irrelevant actions, and the box was presented in each of two conditions: opaque and clear. In the opaque condition, causal information about the effect of the tool inside the box was not available, and hence it was impossible to differentiate between the relevant and irrelevant parts of the demonstration. However, in the clear condition causal information was available, and subjects could potentially determine which actions were necessary. When chimpanzees were presented with the opaque box, they reproduced both the relevant and irrelevant actions, thus imitating the overall structure of the task. When the box was presented in the clear condition they instead ignored the irrelevant actions in favour of a more efficient, emulative technique. These results suggest that emulation is the favoured strategy of chimpanzees when sufficient causal information is available. However, if such information is not available, chimpanzees are prone to employ a more comprehensive copy of an observed action. In contrast to the chimpanzees, children employed imitation to solve the task in both conditions, at the expense of efficiency. We suggest that the difference in performance of chimpanzees and children may be due to a greater susceptibility of children to cultural conventions, perhaps combined with a differential focus on the results, actions and goals of the demonstrator.