
Making music: Let’s not be too quick to abandon the byproduct hypothesis

  • University of Nottingham Malaysia


It is premature to conclude that music is an adaptation. Given the danger of overextending the adaptationist mode of explanation, the default position should be the byproduct hypothesis, and it should take very strong evidence to drag us into the adaptationist camp. As yet, the evidence isn’t strong enough – and the proposed adaptationist explanations have a number of unresolved difficulties.
found that selection for more elaborate songs can drive the evolution of the capacity to learn throughout life (Creanza, Fogarty, & Feldman, 2016; Robinson, Snyder, & Creanza, 2019). We propose that this evolutionary paradigm in songbirds – that selection on a learned trait can drive evolution of the brain – provides a possible example of the phenomenon depicted in Savage et al. (Fig. 2, left panel): Musical features can act as an intermediary between social functions and their neurobiological underpinnings.
Savage et al. describe musicality as a cognitive toolkit. How might the framing of musicality as a set of tools affect our understanding of its evolution? Our lab modeled the evolution of birdsong features as culturally transmitted functional traits, similar to tools, wherein learners aim to imitate proficient tutors (Hudson & Creanza, 2021). Like other fitness-altering cultural traits, functional signals based on rhythmicity or pitch modulation could have gradually become more complex if learners preferentially choose tutors with complex signals. Over time, the cultural development of functional signals could elevate the minimum cognitive baseline to recognize and reproduce these signals, thereby influencing brain evolution to favor attention to and learning capacity for these acoustic features. In this context, elements of musicality might have been under selection for purposes other than the umbrella explanation of social bonding.
Savage et al. describe the neural synchronization between auditory and motor brain regions during rhythm perception to explain the origins of dance, but only briefly mention other functions of coordinated behavior. Could rhythmic movement have functioned as a fitness-enhancing tool? Rhythmicity allows for synchronization of actions between individuals and for individuals to accurately predict the actions of others. It is thus conceivable that the development of rhythmicity would have facilitated a large repertoire of coordinated behaviors that could have impacted group survival.
Finally, both target articles discuss the hypothesis that musicality evolved through sexual selection, concluding that it is inadequate to explain the evolution of musicality. However, this hypothesis is framed from an intraspecific mate selection perspective, where females choose males with the most attractive musical displays. Studying the evolution of birdsong and its role in species recognition suggests another perspective: in our evolutionary past, could musicality have served an interspecific function, mediating the interactions between the ancestors of Homo sapiens and other hominin lineages? Although musicality appears to be uniquely human among extant species, Mehr et al. conjecture that the basic elements of musicality are ancestral to all primates – just as song is to all songbirds. Did musicality contribute to species recognition when our ancestors formed groups or selected mates, perhaps before the emergence of language? We are unable to know how much musical predisposition we shared with our evolutionary cousins – those we interbred with, and those we didn’t. However, considering songbirds as a model system suggests that the evolutionary implications of musicality need not be limited to interactions within our own species.
Financial support. This study was supported by the NSF (NC & KS, grant
number BCS-1918824); and Vanderbilt University (KS & NC).
Conflict of interest. None.
Araki, M., Bandi, M. M., & Yazaki-Sugiyama, Y. (2016). Mind the gap: Neural coding of species identity in birdsong prosody. Science (New York, N.Y.), 354, 1282–1287.
Benichov, J. I., Benezra, S. E., Vallentin, D., Globerson, E., Long, M. A., & Tchernichovski, O. (2016). The forebrain song system mediates predictive call timing in female and male zebra finches. Current Biology, 26, 309–318.
Benichov, J. I., & Vallentin, D. (2020). Inhibition within a premotor circuit controls the timing of vocal turn-taking in zebra finches. Nature Communications, 11, 221.
Colombelli-Négrel, D., Hauber, M. E., Robertson, J., Sulloway, F. J., Hoi, H., Griggio, M., & Kleindorfer, S. (2012). Embryonic learning of vocal passwords in superb fairy-wrens reveals intruder cuckoo nestlings. Current Biology, 22, 2155–2160.
Creanza, N., Fogarty, L., & Feldman, M. W. (2016). Cultural niche construction of repertoire size and learning strategies in songbirds. Evolutionary Ecology, 30, 285–305.
Hall, M. L. (2009). A review of vocal duetting in birds. Advances in the Study of Behavior, 40, 67–121. doi: 10.1016/s0065-3454(09)40003-2.
Hoffmann, S., Trost, L., Voigt, C., Leitner, S., Lemazina, A., Sagunsky, H., (2019). Duets recorded in the wild reveal that interindividually coordinated motor control enables cooperative behavior. Nature Communications, 10, 2577.
Hudson, E. J., & Creanza, N. (2021). Ornament, armament, or toolkit? Modelling how population size drives the evolution of birdsong, a functional cultural trait. bioRxiv, 2021.04.29.442039.
Hudson, E. J., Creanza, N., & Shizuka, D. (2020). The role of nestling acoustic experience in song discrimination in a sparrow. Frontiers in Ecology and Evolution, 8, 99. doi: 10.
Hudson, E. J., & Shizuka, D. (2017). Introductory whistle is sufficient for early song recognition by golden-crowned sparrow nestlings. Animal Behaviour, 133, 83–88. doi: 10.
Marler, P., & Peters, S. (1977). Selective vocal learning in a sparrow. Science (New York, N.Y.), 198, 519–521.
Robinson, C. M., Snyder, K. T., & Creanza, N. (2019). Correlated evolution between repertoire size and song plasticity predicts that sexual selection on song promotes open-ended learning. eLife, 8, 44454. doi: 10.7554/eLife.44454.
Soha, J. A., & Marler, P. (2000). A species-specific acoustic cue for selective song learning in the white-crowned sparrow. Animal Behaviour, 60, 297–306.
Making music: Let’s not be too quick to abandon the byproduct hypothesis
Steve Stewart-Williams
School of Psychology, University of Nottingham Malaysia, Jalan Broga, 43500 Semenyih, Selangor Darul Ehsan, Malaysia.
doi:10.1017/S0140525X20001119, e113
Mehr et al. and Savage et al. have both put forward interesting and very reasonable adaptationist accounts of music – or more precisely, of certain aspects of musicality and musical behavior. I’m more sympathetic to such accounts than I was before. On balance, though, I think it’s still premature to conclude that music is an adaptation, and more plausible to think that it’s a byproduct. There are three main reasons for this.
First, a lot of the evidence adduced in favor of adaptationist explanations of music is equally amenable to a byproduct explanation. The cross-cultural universality of music is consistent with the claim that music is an adaptation – but it’s also
consistent with the claim that it’s a byproduct of other adaptations that are universal but not music-specific (e.g., emotional responsiveness to the prosody in speech, which cultures might independently learn to trigger with melodies). The complex design evident in music could come from biological evolution – but it could also come from cumulative cultural evolution; after all, smart phones and bureaucracies exhibit complex design as well, but are clearly not adaptations. Children take to music early and easily – but they also take to iPads and TV; sometimes ease of acquisition is a result of culture evolving for our minds, rather than the other way around. Damage to certain areas of the brain impairs the ability to make or appreciate music – but none of these areas is involved exclusively in music, and it’s possible that the areas in question evolved primarily for their nonmusical functions (which are presumably also impaired by damage to those areas). Music-like abilities in nonhuman animals show that traits of that kind can evolve – but they don’t show that they necessarily did evolve in our species, as human culture can sometimes independently discover traits that evolved in other animals: The fact that leaf-cutter ants engage in something akin to agriculture doesn’t imply that human agriculture is an adaptation; similarly, the fact that various nonhuman animals produce auditory displays doesn’t imply that human music is an adaptation. In short, much of the evidence is ambiguous. Given the danger of overextending the adaptationist mode of explanation, the byproduct approach seems like the safer default position in lieu of more decisive evidence.
Second, the byproduct approach has a number of advantages over its adaptationist rivals. Uncontroversial adaptations, such as arms and the basic motivations, are found in all typically developing human beings and are reasonably similar across cultures, subcultures, and historical periods. Music, in contrast, varies greatly from place to place and from time to time, and many people spend little time making or consuming it. These facts are easier to square with a byproduct explanation than an adaptationist one. Even if one argues that certain core features of music are found in every culture, it remains the case that plenty of individuals within those cultures devote little time to music, whereas almost every individual has arms and the basic motivations. And even if one argues that, in traditional cultures, almost every individual devotes substantial time to music, the fact that many individuals in modern cultures do not is still surprising on an adaptationist account – after all, even in modern cultures, every typically developing human being uses language frequently, and it would be surprising on an adaptationist account of language if this were not the case.
Third and finally, the adaptationist accounts of music proposed in this dual treatment face a number of challenges that byproduct explanations do not. If stronger social bonds are adaptive, as Savage et al. argue, why not select directly for a tendency to bond more strongly, rather than a tendency to make and enjoy rhythmically patterned pitch-sequences and to bond with others who do the same? Regarding Mehr et al.’s account, does it seem plausible that raiding parties would be less inclined to attack a group that kept perfect time than an equivalently fierce group whose rhythms were slightly off, or that such a strategy would be particularly useful? Keeping time isn’t important in chimpanzee territorial displays, so the closest animal analogy doesn’t support the idea. Is music-making prowess a reliable way to assess a group’s potential as allies? People could make beautiful music together but be hopeless at hunting, making tools, or doing anything else that might make an alliance valuable. Why not assess the valuable abilities directly, rather than assessing people’s musical chops? If rhythm evolved for territorial signaling, why aren’t men notably more rhythmical than women, given that men have historically done the bulk of the territorial displaying and defense? If melody evolved for infant-directed song, why aren’t women notably more melodic than men, given that women have historically done the bulk of the infant care? Although some studies suggest such differences (e.g., Miles, Miranda, & Ullman, 2016), the broader literature is mixed and it’s certainly not obvious that the sexes differ much in these domains. Is infant-directed song a reliable signal of commitment in any evolutionarily meaningful way? It tells the baby that it has the parent’s undivided attention at that particular moment, while the parent is singing the song. However, the fact that it has their attention in a context where it isn’t especially costly to the parent doesn’t guarantee that the parent will prioritize the baby if and when difficult trade-offs need to be made – for example, if the parent has to choose to invest either in the baby or in one of the baby’s siblings. A peacock can’t grow a decent tail unless it’s in good condition; in contrast, it’s easy enough to sing a baby a song then withdraw support later on, if one’s circumstances change.
I don’t claim that these difficulties are necessarily insurmountable, and I concede that some of the evidence presented in favor of an evolved contribution to human musicality is at the very least suggestive. However, the difficulties do hint that it’s premature to accept an adaptationist account at this stage – and if I had to make a bet today, my money would be on the byproduct approach.
Financial support. The author received no specific grant for this work from
any funding agency.
Conflict of interest. None.
Miles, S. A., Miranda, R. A., & Ullman, M. T. (2016). Sex differences in music: A female advantage at recognizing familiar melodies. Frontiers in Psychology, 7, 278–278.
Pre-hunt charade as the cradle of human musicality
Szabolcs Számadó a,b,c
a Department of Sociology and Communication, Budapest University of Technology and Economics, Egry J. u. 1., Budapest, 1111, Hungary;
b Centre for Social Sciences (TK CSS) Lendület Research Center for Educational and Network Studies (CSS-RECENS), Tóth Kálmán u. 4, Budapest, 1097, Hungary;
c Evolutionary Systems Research Group, Centre for Ecological Research, Klebelsberg Kuno u. 3, Tihany 8237, Hungary.
doi:10.1017/S0140525X20001077, e114
Human language and human music are both unique communication systems that evolved in the human lineage. Here, I propose that they share the same root: they evolved from an ancestral communication system yet to be described in detail. I suggest that pre-hunt charade was this shared root, which helped organize and coordinate the hunt of early hominins.
Commentary/Mehr et al.: Origins of music in credible signaling 121
Downloaded from University of Nottingham Malaysia, on 30 Sep 2021 at 10:56:37, subject to the Cambridge Core terms of use, available at
Music is both universal, appearing in every known human culture, and culture-specific, often defying intelligibility across cultural boundaries. This duality has been the source of debate within the broad community of music researchers, and there have been significant disagreements both on the ontology of music as an object of study and the appropriate epistemology for that study. To help resolve this tension, I present a culture-cognition-mediator model that situates music as a mediator in the mutually constitutive cycle of cultures and selves representing the ways individuals both shape and are shaped by their cultural environments. This model draws on concepts of musical grammars and schema, contemporary theories in developmental and cultural psychology that blur the distinction between nature and nurture, and recent advances in cognitive neuroscience. Existing evidence of both directions of causality is presented, providing empirical support for the conceptual model. The epistemological consequences of this model are discussed, specifically with respect to transdisciplinarity, hybrid research methods, and several potential empirical applications and testable predictions as well as its import for broader ontological conversations around the evolutionary origins of music itself.
Vocal turn-taking is a fundamental organizing principle of human conversation but the neural circuit mechanisms that structure coordinated vocal interactions are unknown. The ability to exchange vocalizations in an alternating fashion is also exhibited by other species, including zebra finches. With a combination of behavioral testing, electrophysiological recordings, and pharmacological manipulations we demonstrate that activity within a cortical premotor nucleus orchestrates the timing of calls in socially interacting zebra finches. Within this circuit, local inhibition precedes premotor neuron activation associated with calling. Blocking inhibition results in faster vocal responses as well as an impaired ability to flexibly avoid overlapping with a partner. These results support a working model in which premotor inhibition regulates context-dependent timing of vocalizations and enables the precise interleaving of vocal signals during turn-taking.
Oscine songbirds are an ideal system for investigating how early experience affects behavior. Young songbirds face a challenging task: how to recognize and selectively learn only their own species' song, often during a time-limited window. Because birds are capable of hearing birdsong very early in life, early exposure to song could plausibly affect recognition of appropriate models; however, this idea conflicts with the traditional view that song learning occurs only after a bird leaves the nest. Thus, it remains unknown whether natural variation in acoustic exposure prior to song learning affects the template for recognition. In a population where sister species, golden-crowned and white-crowned sparrows, breed syntopically, we found that nestlings discriminate between heterospecific and conspecific song playbacks prior to the onset of song memorization. We then asked whether natural exposure to more frequent or louder heterospecific song explained any variation in golden-crowned nestling response to heterospecific song playbacks. We characterized the amount of each species' song audible in golden-crowned sparrow nests and showed that even in a relatively small area, the ratio of heterospecific to conspecific song exposure varies widely. However, although many songbirds hear and respond to acoustic signals before fledging, golden-crowned sparrow nestlings that heard different amounts of heterospecific song did not behave differently in response to heterospecific playbacks. This study provides the first evidence that song discrimination at the onset of song learning is robust to the presence of closely related heterospecifics in nature, which may be an important adaptation in sympatry between potentially interbreeding taxa.
Some oscine songbird species modify their songs throughout their lives (‘adult song plasticity’ or ‘open-ended learning’), while others crystallize their songs around sexual maturity. It remains unknown whether the strength of sexual selection on song characteristics, such as repertoire size, affects adult song plasticity, or whether adult song plasticity affects song evolution. Here, we compiled data about song plasticity, song characteristics, and mating system and then examined evolutionary interactions between these traits. Across 67 species, we found that lineages with adult song plasticity show directional evolution toward increased syllable and song repertoires, while several other song characteristics evolved faster, but in a non-directional manner. Song plasticity appears to drive bi-directional transitions between monogamous and polygynous social mating systems. Notably, our analysis of correlated evolution suggests that extreme syllable and song repertoire sizes drive the evolution of adult song plasticity or stability, providing novel evidence that sexual selection may indirectly influence open- versus closed-ended learning.
Many organisms coordinate rhythmic motor actions with those of a partner to generate cooperative social behavior such as duet singing. The neural mechanisms that enable rhythmic interindividual coordination of motor actions are unknown. Here we investigate the neural basis of vocal duetting behavior by using an approach that enables simultaneous recordings of individual vocalizations and multiunit vocal premotor activity in songbird pairs ranging freely in their natural habitat. We find that in the duet-initiating bird, the onset of the partner's contribution to the duet triggers a change in rhythm in the periodic neural discharges that are exclusively locked to the initiating bird's own vocalizations. The resulting interindividually synchronized neural activity pattern elicits vocalizations that perfectly alternate between partners in the ongoing song. We suggest that rhythmic cooperative behavior requires exact interindividual coordination of premotor neural activity, which might be achieved by integration of sensory information originating from the interacting partner.
Birdsong is a complex cultural and biological system, and the selective forces driving evolutionary changes in aspects of song learning vary considerably among species. The extent to which repertoire size, the number of syllables or song types sung by a bird, is subject to sexual selection is unknown, and studies to date have provided inconsistent evidence. Here, we propose that selection pressure on the size and complexity of birdsong repertoires may facilitate the construction of a niche in which learning, sexual selection, and song-based homophily may co-evolve. We show, using a review of the birdsong literature and mathematical modeling, that learning mode (open-ended or closed-ended learning) is correlated with the size of birdsong repertoires. Underpinning this correlation may be a form of cultural niche construction in which a costly biological trait (for example, open-ended learning) can spread in a population (or be lost) as a result of direct selection on an associated cultural trait (for example, song repertoire size).
Although sex differences have been observed in various cognitive domains, there has been little work examining sex differences in the cognition of music. We tested the prediction that women would be better than men at recognizing familiar melodies, since memories of specific melodies are likely to be learned (at least in part) by declarative memory, which shows female advantages. Participants were 24 men and 24 women, with half musicians and half non-musicians in each group. The two groups were matched on age, education, and various measures of musical training. Participants were presented with well-known and novel melodies, and were asked to indicate their recognition of familiar melodies as rapidly as possible. The women were significantly faster than the men in responding, with a large effect size. The female advantage held across musicians and non-musicians, and across melodies with and without commonly associated lyrics, as evidenced by an absence of interactions between sex and these factors. Additionally, the results did not seem to be explained by sex differences in response biases, or in basic auditory or motor processes as tested in a control task. Though caution is warranted given that this is the first study to examine sex differences in familiar melody recognition, the results are consistent with the hypothesis motivating our prediction, namely that declarative memory underlies knowledge about music (particularly about familiar melodies), and that the female advantage at declarative memory may thus lead to female advantages in music cognition (particularly at familiar melody recognition). Additionally, the findings argue against the view that female advantages at tasks involving verbal (or verbalizable) material are due solely to a sex difference specific to the verbal domain. 
Further, the results may help explain previously-reported cognitive commonalities between music and language: since declarative memory also underlies language, such commonalities may be partly due to a common dependence on this memory system. More generally, because declarative memory is well studied at many levels, evidence that music cognition depends on this system may lead to a powerful research program generating a wide range of novel predictions for the neurocognition of music, potentially advancing the field.
Many songbird species have a predisposition to learn conspecific songs, suggesting song learning may be guided by an innate auditory template. Evidence for such a template includes preferential response to conspecific song in early life, even before song learning begins. A prime example of an innate cue for selective song learning is the introductory whistle of white-crowned sparrows, Zonotrichia leucophrys. The songs of its sister species, the golden-crowned sparrow, Zonotrichia atricapilla, also contain an introductory whistle, which differs in structure from that of white-crowned sparrows. Here we tested the ability of nestling golden-crowned sparrows in a sympatric population to discriminate between conspecific and heterospecific songs based on introductory whistles alone, prior to the onset of song learning. Golden-crowned sparrow nestlings responded with more chirps to playbacks of conspecific whistles than to heterospecific (white-crowned sparrow) whistles, and they responded similarly to full conspecific songs and conspecific whistles alone. We suggest that the introductory whistle alone is sufficient for song recognition in the golden-crowned sparrow. We discuss similarities and differences in the role of the introductory whistle between these sister taxa, and how this divergent song phrase may share a role in species recognition in both sister species. Identifying the cues underlying song recognition prior to song learning could be key to understanding the evolution of behavioural isolation between closely related songbird species.
Birds of a feather sing together. How do birds know that a song that they hear is from a member of their own species, and how do they learn their songs in the first place? Araki et al. identified two types of brain cells involved in how finches learn their songs (see the Perspective by Tchernichovski and Lipkind). When zebra finches were raised by Bengalese finch foster parents, they learned a song whose morphology resembled that of their foster father. However, the temporal structure remained zebra finch–specific, suggesting that it is innate. Gadagkar et al. recorded activity in specific dopamine neurons in singing zebra finches while controlling perceived song quality with distorted auditory feedback. This distorted feedback represented worse performance than predicted and resulted in negative prediction errors. These findings suggest again that finches have an innate internal goal for their learned songs. Science, this issue p. 1282, p. 1234; see also p. 1278
The dichotomy between vocal learners and non-learners is a fundamental distinction in the study of animal communication. Male zebra finches (Taeniopygia guttata) are vocal learners that acquire a song resembling their tutors’, whereas females can only produce innate calls. The acoustic structure of short calls, produced by both males and females, is not learned. However, these calls can be precisely coordinated across individuals. To examine how birds learn to synchronize their calls, we developed a vocal robot that exchanges calls with a partner bird. Because birds answer the robot with stereotyped latencies, we could program it to disrupt each bird’s responses by producing calls that are likely to coincide with the bird’s. Within minutes, the birds learned to avoid this disruptive masking (jamming) by adjusting the timing of their responses. Notably, females exhibited greater adaptive timing plasticity than males. Further, when challenged with complex rhythms containing jamming elements, birds dynamically adjusted the timing of their calls in anticipation of jamming. Blocking the song system cortical output dramatically reduced the precision of birds’ response timing and abolished their ability to avoid jamming. Surprisingly, we observed this effect in both males and females, indicating that the female song system is functional rather than vestigial. We suggest that descending forebrain projections, including the song-production pathway, function as a general-purpose sensorimotor communication system. In the case of calls, it enables plasticity in vocal timing to facilitate social interactions, whereas in the case of songs, plasticity extends to developmental changes in vocal structure.
How do parents recognize their offspring when the cost of making a recognition error is high [1–3]? Avian brood parasite-host systems have been used to address this question because of the high cost of parasitism to host fitness. We discovered that superb fairy-wren (Malurus cyaneus) females call to their eggs, and upon hatching, nestlings produce begging calls with key elements from their mother’s “incubation call.” Cross-fostering experiments showed highest similarity between foster mother and nestling calls, intermediate similarity with genetic mothers, and least similarity with parasitic Horsfield's bronze-cuckoo (Chalcites basalis) nestlings. Playback experiments showed that adults respond to the begging calls of offspring hatched in their own nest and respond less to calls of other wren or cuckoo nestlings. We conclude that wrens use a parent-specific password [4] learned embryonically to shape call similarity with their own young and thereby detect foreign cuckoo nestlings.