
Empirical comparisons of pitch patterns in music, speech, and birdsong

Authors:
  • Toronto Metropolitan University

Abstract and Figures

In music, large intervals ("pitch skips") are often followed by reversals, and phrases often have an arch-like shape and final durational lengthening. These regularities could reflect motor constraints on pitch production or could reflect the melodic characteristics of speech. To distinguish between these possibilities we compared pitch patterns in instrumental musical themes, sentences, and birdsongs. Patterns due to production-related constraints should be common to all three domains, whereas patterns due to statistical learning from speech should be present in speech but not birdsong. Sequences were taken from English and French instrumental classical music, sentences from 4 languages, and songs of 56 songbird families. For sentences and birdsongs each syllable/note was assigned one pitch. For each sequence, we quantified patterns of post-skip reversals, the direction of the initial and final interval, the relative duration of the final syllable/note, and the pitch contour shape. Post-skip reversals predominated in all domains, likely reflecting a shared constraint: skips frequently take melodies toward the edges of the pitch range, forcing a subsequent reversal (as suggested by Von Hippel & Huron, 2000). Arch-like contours and final lengthening were found in music and speech but not birdsong, possibly reflecting an influence of speech patterns on musical structure.
A. T. Tierney (a), F. A. Russo (b) and A. D. Patel (c)
(a) UC San Diego Dept. of Cognitive Science, Neurosciences Institute, 9500 Gilman Drive, La Jolla, CA 92093-0515, USA
(b) Ryerson University Department of Psychology, 350 Victoria Street, Toronto, ON M5B 2K3, Canada
(c) Neurosciences Institute, 10640 John Jay Hopkins Drive, La Jolla, CA 92121, USA
adamtierney@gmail.com
Acoustics 08 Paris
4723
In music, large intervals (“skips”) are often followed by reversals, and phrases often have an arch-like shape and final
durational lengthening. These regularities could reflect motor constraints on pitch production or the melodic characteristics of
speech. To distinguish between these possibilities we compared pitch patterns in instrumental musical themes, sentences, and
birdsongs. Patterns due to production-related constraints should be present in all three domains, whereas patterns due to
statistical learning from speech should be present in speech but not birdsong. Sequences were taken from classical music of 5
countries, sentences from 4 languages, and songs of 56 songbird families. For sentences and birdsongs each syllable/note was
assigned one pitch. For each sequence, we quantified patterns of post-skip reversals, the direction of the initial and final
interval, the relative duration of the final vowel/note, and the pitch contour shape. Final lengthening and post-skip reversals
predominated in all domains, likely reflecting shared motor constraints; the latter may result from skips’ tendency to take
melodies toward the edges of the pitch range, forcing subsequent reversals (suggested by von Hippel & Huron [5]). Arch-like
contours were found in music and speech but not birdsong, possibly reflecting an influence of speech patterns on musical
structure.
1 Introduction
Research over the last decade has identified a number of
patterns in the pitch sequences found in a variety of forms
of music. Huron [1], for example, found that a large corpus
of folk songs exhibited an arch-like shape when phrases of
the same note length were averaged together: they tended to
contain an initial rise, followed by a plateau, followed by a
final fall. Studies have also found that music performers
tend to increase the duration of notes just preceding phrase
boundaries [2,3], possibly to make the boundaries more
clear to listeners. Finally, a much commented-on pattern [4]
is that skips—i.e., large jumps in pitch—tend to be
followed by reversals in direction.
Despite the interest these patterns have generated, their
origins remain obscure. There are at least three possible
sources of these regularities. First, they could be due to a
conscious effort by the composer to communicate or
generate an effect in the listener, as has been claimed for
the skip-reversal pattern [4]. Second, they could be due to
motor constraints. Von Hippel and Huron [5], for example,
present data strongly suggesting that the skip-reversal
pattern is due in large part to the fact that pitches
in music tend to follow a roughly Gaussian distribution [6];
large jumps in pitch tend to lead away from the center of
this distribution, and thus simple regression to the mean
will cause reversals to follow skips more often than
continuations do.
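The regression-to-the-mean argument can be illustrated with a small simulation. The sketch below is not taken from the cited studies: it assumes that successive pitches are drawn independently from a Gaussian (a deliberately crude model of a melody), and the standard deviation, sample size, seed, and function name are arbitrary illustrative choices. Only the skip definition (an interval larger than two semitones) follows the text.

```python
import random

def reversal_rate_after_skips(n_notes=20000, sd=3.0, skip_size=2, seed=0):
    """Share of skips followed by a change of direction when pitches are
    independent Gaussian draws (an illustrative assumption, not real data)."""
    rng = random.Random(seed)
    # Pitches in semitones around an arbitrary center of 0.
    pitches = [round(rng.gauss(0.0, sd)) for _ in range(n_notes)]
    reversals = continuations = 0
    for a, b, c in zip(pitches, pitches[1:], pitches[2:]):
        first, second = b - a, c - b
        if abs(first) <= skip_size or second == 0:
            continue  # not a skip, or a repeated note with no direction
        if (first > 0) != (second > 0):
            reversals += 1
        else:
            continuations += 1
    return reversals / (reversals + continuations)

rate = reversal_rate_after_skips()
print(f"share of skips followed by a reversal: {rate:.2f}")
```

Even though nothing in this toy model "intends" a reversal, most skips are followed by one, which is the point of the central-tendency account.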
Finally, another possible source of regularities in musical
pitch patterns is statistical learning of pitch patterns present
in speech. Like music, speech consists of a sequence of
sounds associated with particular fundamental frequency
values, and it is perhaps the most prominent feature of
one’s auditory environment from an early age. Final
lengthening, for example, has been found to occur at the
boundaries of speech phrases as well [7], especially when
there is a syntactic ambiguity that needs to be resolved. It is
possible that composers and performers, after hearing final
lengthening associated with phrase boundaries in language
repeatedly during development, appropriated the same
technique when marking phrase boundaries in music. Patel
and Daniele [8] presented evidence that another pattern in
the rhythm of language—English’s tendency to have a
larger contrast between neighboring vowel durations, and
French’s tendency to show a smaller contrast—is reflected
in the music of composers from those cultures. Patel et al.
[9] showed, moreover, that the degree of pitch interval
variability of the two languages is also reflected in the
music written by composers from each country: both
English speech and English music tend to have higher
interval variability than French speech and music.
In order to distinguish between these three sources of
pattern in music—patterns specific to music, resulting from
motor constraints, and learned from speech—we analyzed
spoken sentences, birdsongs, and musical themes as
sequences of pitches. Any patterns found in all three
domains are most likely due to motor constraints. Any
patterns found in music and speech, but not birdsong, may
be due to statistical learning of speech pitch patterns by
composers. Any patterns found in music but not speech
may be specific to music, possibly the product of cultural
tradition.
2 Methods
Corpora: The French, English, and Japanese sentences
were taken from the database of Nazzi et al. [10]. Four
female speakers per language read five unique sentences
each, for a total of twenty sentences per language. The data
set also included three speakers of Yoruba reading seven
unique sentences each; these were taken from the database
of Marina Nespor and Jacques Mehler. These languages
were selected because they possess a wide variety of
rhythmic and prosodic features: Yoruba is a tone language,
Japanese is a pitch-accent language and is mora-timed,
English is a stress-timed intonation language, and French is
a syllable-timed intonation language.
Musical themes were selected from Barlow and
Morgenstern’s Dictionary of Musical Themes [11].
Following Patel and Daniele, themes were selected for all
English, French, German, Italian, and Russian composers in
the dictionary who were born in the 1800s and died in the
1900s. In order to be included, themes were required to
contain at least twelve notes and no internal rests, fermatas,
or grace notes. Moreover, themes from pieces with titles
suggestive of a particular rhythm (e.g. marches, waltzes) or
an attempt to produce an exotic style (children’s music,
music evocative of another composer or country) were also
excluded. These criteria yielded 136 English themes from 6
composers, 180 French themes from 10 composers, 112
German themes from 5 composers, 53 Italian themes from
4 composers, and 238 Russian themes from 6 composers.
The birdsong dataset included one song each from 56 of the
84 families of birds in the oscine suborder listed in the
Howard and Moore Complete Checklist of the Birds of the
World [12]. Songs were required to have at least five notes,
a tonal quality strong enough for f0 analysis to be
performed on them, low background noise, and significant
tonal variation (that is, we excluded songs in which only a
single note was repeated). All songs consisted of a
sequence of notes, both preceded and followed by a long
pause, relative to the duration of the notes. Songs were
provided by the Cornell Laboratory of Ornithology, the
Borror Laboratory of Bioacoustics, the British Museum
Library, and compact discs accompanying Music of the
Birds by Lang Elliot [13], Nature’s Music by Peter Marler
and Hans Slabbekoorn [14], and The Singing Life of Birds
by Donald Kroodsma [15].
Duration in musical themes was encoded relative to the
time signature, such that the basic beat for each theme was
assigned a duration of one. Thus, in the time signature 4/4,
a quarter note would be assigned a value of 1, an eighth
note would be given a 0.5, etc. Durational data was
collected from the speech samples by marking vowel
boundaries in each sentence using speech spectrograms
generated with Praat running on a personal computer. Both
the waveform and the spectrogram were available during
this analysis, plus interactive playback. For birdsong, the
onset and offset of each note was marked using wide-band
spectrograms generated in SIGNAL running on a modified
personal computer (frequency resolution = 125 Hz, time
resolution = 8 ms, one FFT every 3 ms, Hanning window).
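The beat-relative duration encoding for musical themes can be sketched as follows. This is a minimal illustration, assuming the basic beat is the note value named by the denominator of the time signature; the text does not say how compound meters were handled, and the function name is hypothetical.

```python
from fractions import Fraction

def duration_in_beats(note_value, beat_unit=4):
    """Duration of a note relative to the basic beat.

    note_value: note length as a fraction of a whole note
                (Fraction(1, 4) is a quarter note).
    beat_unit:  denominator of the time signature (4 in 4/4); treating this
                as the basic beat is an assumption of this sketch.
    """
    return Fraction(note_value) * beat_unit

quarter = duration_in_beats(Fraction(1, 4))  # quarter note in 4/4
eighth = duration_in_beats(Fraction(1, 8))   # eighth note in 4/4
print(float(quarter), float(eighth))  # prints 1.0 0.5
```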
Pitch sequences in music were encoded as a series of
distances from A440. Thus, a note two half-steps above
A440 would be encoded as 2, and a note three half-steps
below would be encoded as -3. Pitch sequences in speech
were encoded using the prosogram version 1.3.6 as
instantiated in Praat. The prosogram is a representation of
F0 contour based on human pitch perception. Vowels with
a rate of pitch change exceeding the glide threshold of $0.32/T^2$ semitones/s
are marked as glides (where T = vowel duration in s). This
threshold is based on meta-analysis of a number of studies
of the threshold for human perception of pitch change in
speech [16]. Vowels with rates of pitch change below this
threshold are treated as perceptually equivalent to level
tones. Using the threshold $0.32/T^2$ semitones/s, a large
majority of tones are represented as level tones, allowing
melodic patterns in speech to be directly compared to
music. Pitch sequences in birdsong were encoded using F0
analysis in SIGNAL. The fundamental frequency contour
of each note was measured, and the mean pitch was
extracted.
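The two encodings described above can be sketched in a few lines. This is a simplified illustration and not the prosogram algorithm itself, which stylizes the F0 contour before applying its threshold. It assumes each vowel is summarized by a start and end frequency in Hz; the function names are hypothetical, and only the A440 reference and the 0.32/T^2 threshold come from the text.

```python
import math

A4_HZ = 440.0  # reference pitch used for the semitone encoding

def semitones_from_a440(freq_hz):
    """Signed distance from A440 in semitones (equal temperament)."""
    return 12.0 * math.log2(freq_hz / A4_HZ)

def is_glide(start_hz, end_hz, duration_s, coefficient=0.32):
    """True if the rate of pitch change (semitones/s) exceeds
    coefficient / T**2, where T is the vowel duration in seconds."""
    change_st = abs(12.0 * math.log2(end_hz / start_hz))
    rate_st_per_s = change_st / duration_s
    return rate_st_per_s > coefficient / duration_s ** 2

# A note two half-steps above A440 (B4, about 493.88 Hz) encodes as 2.
print(round(semitones_from_a440(493.88)))
# A short vowel with a small rise stays below the threshold (level tone).
print(is_glide(200.0, 210.0, 0.10))
# A larger rise over a slightly longer vowel exceeds it (glide).
print(is_glide(200.0, 260.0, 0.15))
```

Because the threshold shrinks with the square of the duration, short vowels tolerate fast pitch movement before being classed as glides, which is consistent with most vowels being represented as level tones.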
In order to further investigate the proposal [5] that skips
precede reversals in music due to simple regression to the
mean, we calculated the intervals following skips in speech,
music, and birdsong. Large jumps in pitch should, more
often than not, bring a melody closer to the edge of the
available pitch range. Once a melody has landed near the
edge of the range, it will most likely reverse direction, for
the simple reason that pitches closer to the center of the
range are more common than pitches closer to the edge
(assuming the pitches fall under a Gaussian distribution).
Therefore, if this effect is driving the skip-reversal pattern,
skips that cross the median or depart from it should precede
reversals, skips that approach the median should be
followed by continuations, and skips that land on the
median should lead to an equal proportion of reversals and
continuations. This was already found to be the case in a
corpus of folk songs [5]. Skips (intervals larger than two
semitones) were categorized as departing from the median,
crossing the median, landing on the median, or approaching
the median. The shape of the distribution of pitches in each
domain was also assessed by converting pitch values to
semitones, then normalizing each note in a given sequence
by subtracting the mean pitch of that sequence.
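The skip categories can be made concrete with a short sketch. It assumes pitches are given in semitones, that the median is computed per sequence, and that a skip followed by a repeated note is left out of the tally; the text does not specify these details. The function names and the example sequence are hypothetical.

```python
from collections import Counter
from statistics import median

def classify_skip(start, end, med):
    """Label a skip by where it ends relative to the median pitch."""
    if end == med:
        return "landing"
    if (start - med) * (end - med) < 0:
        return "crossing"      # start and end lie on opposite sides
    if abs(end - med) > abs(start - med):
        return "departing"     # ends farther from the median
    return "approaching"       # ends closer to the median, same side

def tally_skips(pitches, skip_size=2):
    """Count (skip type, following motion) pairs in one pitch sequence."""
    med = median(pitches)
    counts = Counter()
    for a, b, c in zip(pitches, pitches[1:], pitches[2:]):
        skip, nxt = b - a, c - b
        if abs(skip) <= skip_size or nxt == 0:
            continue  # not a skip, or followed by a repeated note
        outcome = "reversal" if (skip > 0) != (nxt > 0) else "continuation"
        counts[(classify_skip(a, b, med), outcome)] += 1
    return counts

# Hypothetical sequence in semitones; its median is 2.
theme = [0, 2, 7, 5, 4, 0, 2, 4, -3, 0, 2]
counts = tally_skips(theme)
for key, n in sorted(counts.items()):
    print(key, n)
```

In this made-up example the departing and crossing skips are followed by reversals and the approaching skip by a continuation, which is the pattern the regression-to-the-mean account predicts.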
To test the hypothesis that both speech and music show
final lengthening, and if so, to question whether or not this
is due to motor constraints also shared with birdsong, the
durations of the final note of each musical theme and
birdsong and the final vowel of each spoken sentence were
calculated and compared to all of the other durations within
that same domain. A similar comparison was also made
between the duration of the initial note/vowel and all
remaining notes/vowels.
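The duration comparison can be sketched as below. The text does not name the statistical test that was used, so this sketch only computes the two means being compared. The function name and the duration values are hypothetical.

```python
from statistics import mean

def edge_vs_rest(sequences, position="final"):
    """Mean duration of the final (or initial) element of each sequence
    versus the mean of all remaining elements, pooled within a domain."""
    if position == "final":
        edge = [seq[-1] for seq in sequences]
        rest = [d for seq in sequences for d in seq[:-1]]
    else:
        edge = [seq[0] for seq in sequences]
        rest = [d for seq in sequences for d in seq[1:]]
    return mean(edge), mean(rest)

# Hypothetical durations (in beats) for two short themes.
durations = [[1, 0.5, 0.5, 2], [0.5, 0.5, 1, 1.5]]
final_mean, nonfinal_mean = edge_vs_rest(durations, "final")
initial_mean, noninitial_mean = edge_vs_rest(durations, "initial")
print(final_mean, nonfinal_mean)
print(initial_mean, noninitial_mean)
```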
In order to test the hypothesis that speech, music, and
potentially birdsong share the “melodic arch” contour, the
initial and final intervals of each phrase were calculated and
compared to all of the remaining intervals. In the case of
speech, intervals were required to consist of two level tones
in order to be included in the analysis.
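A minimal sketch of the interval comparison follows. It assumes each phrase is a list of pitches in semitones, with `None` standing in for a vowel marked as a glide, so that any interval touching a glide is dropped, as the text requires for speech. The function names and the example phrases are hypothetical.

```python
from statistics import mean

def intervals(pitches):
    """Successive pitch differences; None marks an interval that
    involves a glide (a pitch given as None)."""
    return [None if a is None or b is None else b - a
            for a, b in zip(pitches, pitches[1:])]

def arch_summary(sequences):
    """Mean initial, middle, and final interval across sequences."""
    initial, middle, final = [], [], []
    for seq in sequences:
        ivs = intervals(seq)
        if len(ivs) < 3:
            continue  # need at least one interval between first and last
        if ivs[0] is not None:
            initial.append(ivs[0])
        if ivs[-1] is not None:
            final.append(ivs[-1])
        middle.extend(iv for iv in ivs[1:-1] if iv is not None)
    return mean(initial), mean(middle), mean(final)

# Hypothetical phrases in semitones; None marks a glide.
phrases = [[0, 4, 5, 4, 0], [2, 5, None, 5, 3, 0]]
init_mean, mid_mean, final_mean_iv = arch_summary(phrases)
print(init_mean, mid_mean, final_mean_iv)
```

An arch-like contour would show up as a positive mean initial interval and a negative mean final interval relative to the remaining intervals.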
3 Results
Birdsong, speech, and music all showed a tendency for
small intervals to predominate over large intervals (figure
1).
Fig. 1 Interval sizes in speech, music, and birdsong.
Small intervals tend to predominate over large intervals in
all three domains, extending to speech and birdsong a
finding reported for music by von Hippel and Huron [6]. In
addition, a large peak at 2 semitones was found for music
but not for speech or birdsong.
Pitches in birdsong, speech, and music fell into a roughly
Gaussian distribution, as figure 2 shows.
Fig. 2 Histogram of pitches, in distance from average pitch
of each song/theme.
This data suggests that pitch sequences in music, speech,
and birdsong all show a central tendency, a phenomenon
previously observed in music by von Hippel and Huron [6].
As a result, we would expect to find similar skip-reversal
patterns in all three domains, as figures 3, 4, and 5 show.
Fig. 3 Skip-reversal patterns in music.
Fig. 4 Skip-reversal patterns in speech.
Fig. 5 Skip-reversal patterns in birdsong.
These patterns not only show that skips are followed by
reversals in all three domains, but also suggest that in all
three cases this is driven at least in part by regression to the
mean: median-departing and median-crossing skips tend to
be followed by reversals, whereas median-approaching
skips tend to be followed by continuations, and
median-landing skips do not give rise to a strong pattern.
As figure 6 shows, duration analysis revealed that in
speech, music, and birdsong, the last note/vowel tends to be
lengthened with respect to the average note. (In each case,
the difference between final and average duration was
significant.) For music phrases, each beat was arbitrarily
given a duration of 50 msec in order to display all domains
on the same graph.
Fig. 6 Final durations.
Figure 7 shows that in speech, the duration of the initial
vowel tends to be longer than the average vowel, whereas
in music the opposite trend holds. There is no significant
difference between the first and average note in birdsong.
Fig. 7 Initial durations.
As predicted, we found evidence of a “melodic arch”
contour in music: final intervals were more negative than
average intervals, and initial intervals were more positive
than average intervals (figures 8 and 9). Surprisingly,
though, this pattern also held true for speech (no significant
effect was found for birdsong, and the trend was in the
opposite direction).
Fig. 8 Final intervals.
Fig. 9 Initial intervals.
4 Conclusion
The present study analyzed spoken sentences, birdsongs,
and musical themes for the presence of three patterns:
lengthening of notes/vowels at the end of phrases, the
presence of an arch-like pitch contour in which pitch rises
sharply, plateaus, then falls sharply, and evidence for skips
being followed by reversals due to regression to the mean.
The latter pattern was found in all three domains,
suggesting that it is caused by simple motor constraints:
pitches near the center of a speaker/instrumentalist/bird’s
pitch range are easier to produce than pitches at the edges.
This gives rise to distributions displaying central tendency.
Thus, the skip-reversal pattern is likely due to regression to
the mean in all three cases, and is unlikely to be the
consequence of conscious deliberation on the part of either
speakers or composers. Final lengthening was also found in
all three domains, suggesting that it may be due in part to
shared motor constraints, although that does not rule out the
possibility that it is taken advantage of by listeners and by
speakers as a form of phrase-marking.
A “melodic arch” contour, on the other hand, was found in
speech and music, but not in birdsong. This pattern may,
therefore, be learned by musicians from the pitch sequences
contained in speech, despite the fact that most people are
rarely consciously aware of the pitch changes present in
speech. If this pattern is indeed shared by both domains, it
remains to be determined what function, if any, it serves. It
is possible that the effect helps mark the beginnings and
ends of phrases, which would facilitate initial syntactic
learning in both domains and would help disambiguate
ambiguous syntactic structures. It remains to be seen
whether listeners are actually able to respond to these cues,
and what effect doing so has on their comprehension.
Two patterns were found that appear to be unique to music:
a peak in the interval distribution at 2 semitones, and a
tendency for initial durations to be longer than subsequent
durations. The first effect is most likely due to the tendency
of musical melodies to move by small steps rather than
leaps [17], a tendency which may itself reflect motor
constraints (i.e., smaller intervals are easier to produce than
larger ones). Since steps of 2 semitones are more common
than steps of 1 semitone in musical scales, this could lead to
a predominance of 2-semitone intervals in musical
melodies. The cause of the second effect (the tendency for
initial durations in melodies to be longer than subsequent
ones) is less clear, but it may stem from melodies tending to
begin at strong metrical positions.
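The claim about scale steps can be checked by counting. The sketch below uses the major scale as one example of a common scale; this choice is an assumption of the sketch, since the text refers to musical scales in general.

```python
from collections import Counter

# Major scale degrees in semitones above the tonic, including the octave.
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11, 12]

# Count the sizes of the steps between adjacent scale degrees.
steps = Counter(b - a for a, b in zip(MAJOR_SCALE, MAJOR_SCALE[1:]))
print(steps[2], steps[1])  # prints 5 2
```

With five 2-semitone steps against two 1-semitone steps, stepwise motion within such a scale produces more 2-semitone intervals than 1-semitone intervals.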
Acknowledgments
Supported by Neurosciences Research Foundation as part
of its research program on music and the brain at The
Neurosciences Institute, where ADP is the Esther J.
Burnham Senior Fellow.
References
[1] D. Huron, “The melodic arch in western folksongs.”
Computing in Musicology 10, 3-23 (1996)
[2] B. Repp, “Patterns of expressive timing in
performances of a Beethoven minuet by nineteen
famous pianists.” JASA 88, 622-641 (1990)
[3] A. Penel, C. Drake, “Timing variations in music
performance: musical communication, perceptual
compensation, and/or motor control?” Perception and
Psychophysics 66, 545-562 (2004)
[4] L. Meyer, Emotion and Meaning in Music. University
of Chicago Press, Chicago (1961)
[5] P. von Hippel, D. Huron, “Why do skips precede
reversals? The effect of tessitura on melodic structure.”
Music Perception 18, 59-85 (2000)
[6] P. von Hippel, “Redefining pitch proximity: tessitura
and mobility as constraints on melodic intervals.”
Music Perception 17, 315-327 (2000)
[7] A. Schafer, “Intonational disambiguation in sentence
production and comprehension.” Journal of
Psycholinguistic Research 29, 169-182 (2000)
[8] A. Patel, J. Daniele, “An empirical comparison of
rhythm in language and music.” Cognition 87, B35-
B45 (2003)
[9] A. Patel, J. Iversen, J. Rosenberg, “Comparing the
rhythm and melody of speech and music: the case of
British English and French.” JASA 119, 3034-3047
(2006)
[10] T. Nazzi, J. Bertoncini, J. Mehler, “Language
discrimination in newborns: Toward an understanding
of the role of rhythm.” J. Exp. Psychol. Hum. Percept.
Perform. 24, 756-777 (1998)
[11] H. Barlow, S. Morgenstern, A Dictionary of Musical
Themes, revised edition. Faber and Faber, London
(1983)
[12] R. Howard, A. Moore, A Complete Checklist of Birds
of the World. Macmillan, London (1984)
[13] L. Elliot, Music of the Birds: a Celebration of Bird
Song. NatureSound Studio, New York (1999)
[14] P. Marler, H. Slabbekoorn, Nature’s Music: the Science
of Birdsong. Elsevier, London (2004)
[15] D. Kroodsma, The Singing Life of Birds. Houghton
Mifflin, New York (2005)
[16] J. ‘t Hart, R. Collier, A. Cohen, A Perceptual Study of
Intonation. Cambridge University Press, Cambridge
(1990)
[17] D. Huron, Sweet Anticipation: Music and the
Psychology of Expectation. MIT Press, Cambridge
(2006)
... First of all, the pitch-tracks were simplified using an automated pitch-processing module that runs in Praat: the Prosogram (Mertens, 2004). (This is the approach endorsed by Patel, 2008, and followed by both Patel, Iversen, &Rosenberg, 2006, andby Tierney, Russo, &Patel, 2008.) The prosogram algorithm applies statistical and speech-perceptual calculations that produce perceptually satisfactory straight-line approximations of a pitch-track input; any resulting prosodic segments having F 0 slope below the algorithm's ''glissando threshold'' are simplified into a level tone. ...
... Comparing the interval distributions of the present study with those from other studies yields further insights. First of all, intervals in running speech more generally have been shown to yield a roughly (negative) exponential distribution-that is, a distribution with a peak at 0 semitones and a consistent inverse relationship between interval size and occurrence (see Figure 6a, from Tierney et al., 2008;see also Patel, 2008). (The discrepancy with the present study-where intervals were more widely distributed, with peak distributions for some sentence-types as high as 8 to 10 semitonescan be explained by the fact that these tokens were all in the sentence-final position, where a relatively large interval would be expected.) ...
... (The discrepancy with the present study-where intervals were more widely distributed, with peak distributions for some sentence-types as high as 8 to 10 semitonescan be explained by the fact that these tokens were all in the sentence-final position, where a relatively large interval would be expected.) Intervals in music, on the other hand, yield a roughly unimodal distribution with heavy density below six semitones, a peak at two semitones, and a thin right tail (see Figure 6b, from Vos & Troost, 1989; see also Tierney et al., 2008). That peak, of course, in part reflects an inherent structural characteristic of the diatonic scale and other common scales: the sheer combinatorial prevalence of major seconds, as compared to minor seconds, between notes of the scale. ...
Article
Full-text available
This paper describes the first laboratory study of an idiosyncratic linguistic form that represents a crucial point of contact between speech and song: what is referred to here as the stylized interjection. The stylized interjection, as described throughout the musicological and linguistic literature, is associated with a particular intonational formula—the calling contour—and intriguingly, with a purportedly cross-cultural musical fingerprint: the interval of the minor third. A reading task was used to systematically compare the stylized interjection to four other linguistic forms, and to compare spoken to called production. Analysis of several acoustic variables (involving pitch, duration, intensity, and timbre) demonstrates many significant effects of sentence-type and production, which together establish the characteristics of the English stylized interjection and suggest its interpretation as sung speech. The unique sound-meaning correspondence of the stylized interjection is thereby elucidated. Implications for music-language studies (especially vis-a-vis the minor third) are also discussed.
... For example, intonation-like patterns might be analyzed by, for example, examining whether pitch generally decreases over the course of a motif or song bout, or whether it is reset to a higher frequency when a new unit starts. In contrast to human speech and music, no evidence is found for initial notes to be higher than non-initial notes (Tierney et al., 2008). If such effects were to be found, these might be comparable to the effect of declination and pitch resetting at phrase boundaries, as found in human prosody. ...
... Pre-boundary lengthening in human speech refers to the phenomenon that a word is realized with a longer duration at phrase-final position than at phrase-nonfinal position (see Section 3.1). There are preliminary indications for longer notes at the end of a song in birdsong as well (Lachlan and Nowicki, 2015;Tierney et al., 2008). However, this pattern does not necessarily indicate pre-boundary lengthening, because it is unknown if notes are actually lengthened or, alternatively, if intrinsically longer notes tend to appear at the end of songs. ...
Article
MOL, C., Chen, A., Kager, R., ter Haar, S.M. Prosody in birdsong: A review and perspective. NEUROSCI BIOBEHAV REV XX(X) XXX-XXX, XXXX. - Birdsong shows striking parallels with human speech. Previous comparisons between birdsong and human vocalizations focused on syntax, phonology and phonetics. In this review, we propose that future comparative research should expand its focus to include prosody, i.e. the temporal and melodic properties that extend over larger units of song. To this end, we consider the similarities between birdsong structure and the prosodic hierarchy in human speech and between context-dependent acoustic variations in birdsong and the biological codes in human speech. Moreover, we discuss songbirds' sensitivity to prosody-like acoustic features and the role of such features in song segmentation and song learning in relation to infants' sensitivity to prosody and the role of prosody in early language acquisition. Finally, we make suggestions for future comparative birdsong research, including a framework of how prosody in birdsong can be studied. In particular, we propose to analyze birdsong as a multidimensional signal composed of specific acoustic features, and to assess whether these acoustic features are organized into prosody-like structures.
... On the other hand, we found that the size of pitch intervals was linked to the change in musicality ratings after repetition. This is somewhat surprising, as small intervals predominate in both music (Von Hippel & Huron, 2000) and speech [when intervals are defined as pitch distances between the mean fundamental frequency of successive syllables] (Tierney, Russo, & Patel, 2008). Given that a predominance of small intervals is not unique to music, it is unclear why listeners would rely on interval size when making the decision whether a phrase is more characteristic of song than speech. ...
Article
Full-text available
In the “speech-to-song illusion,” certain spoken phrases are heard as highly song-like when isolated from context and repeated. This phenomenon occurs to a greater degree for some stimuli than for others, suggesting that particular cues prompt listeners to perceive a spoken phrase as song. Here we investigated the nature of these cues across four experiments. In Experiment 1, participants were asked to rate how song-like spoken phrases were after each of eight repetitions. Initial ratings were correlated with the consistency of an underlying beat and within-syllable pitch slope, while rating change was linked to beat consistency, within-syllable pitch slope, and melodic structure. In Experiment 2, the within-syllable pitch slope of the stimuli was manipulated, and this manipulation changed the extent to which participants heard certain stimuli as more musical than others. In Experiment 3, the extent to which the pitch sequences of a phrase fit a computational model of melodic structure was altered, but this manipulation did not have a significant effect on musicality ratings. In Experiment 4, the consistency of intersyllable timing was manipulated, but this manipulation did not have an effect on the change in perceived musicality after repetition. Our methods provide a new way of studying the causal role of specific acoustic features in the speech-to-song illusion via subtle acoustic manipulations of speech, and show that listeners can rapidly (and implicitly) assess the degree to which nonmusical stimuli contain musical structure.
... Zebra finches are also reported to display final lengthening (Scharff and Jarvis, personal communication). Finally, Tierney, Russo, and Patel (2008) look at 56 songbird families, and find some evidence of a statistically significant tendency for final notes to be longer than non-final ones. Their data do not allow one to distinguish between a preference for selecting longer note types in final position versus different durations for a single note type depending on its position in the song. ...
Chapter
Prominent scholars consider the cognitive and neural similarities between birdsong and human speech and language. Scholars have long been captivated by the parallels between birdsong and human speech and language. In this book, leading scholars draw on the latest research to explore what birdsong can tell us about the biology of human speech and language and the consequences for evolutionary biology. After outlining the basic issues involved in the study of both language and evolution, the contributors compare birdsong and language in terms of acquisition, recursion, and core structural properties, and then examine the neurobiology of song and speech, genomic factors, and the emergence and evolution of language. ContributorsHermann Ackermann, Gabriël J.L. Beckers, Robert C. Berwick, Johan J. Bolhuis, Noam Chomsky, Frank Eisner, Martin Everaert, Michale S. Fee, Olga Fehér, Simon E. Fisher, W. Tecumseh Fitch, Jonathan B. Fritz, Sharon M.H. Gobes, Riny Huijbregts, Eric Jarvis, Robert Lachlan, Ann Law, Michael A. Long, Gary F. Marcus, Carolyn McGettigan, Daniel Mietchen, Richard Mooney, Sanne Moorman, Kazuo Okanoya, Christophe Pallier, Irene M. Pepperberg, Jonathan F. Prather, Franck Ramus, Eric Reuland, Constance Scharff, Sophie K. Scott, Neil Smith, Ofer Tchernichovski, Carel ten Cate, Christopher K. Thompson, Frank Wijnen, Moira Yip, Wolfram Ziegler, Willem Zuidema
... There have been many attempts at qualitative and quantitative descriptions of speech and song. [2] analyzed pitch patterns in spoken sentences, birdsong and instrumental music themes. Final lengthening and post-skip reversals predominated in all domains, based on which the authors suggest possible shared motor constraints for all three coordinated actions (read speech, spontaneous speech and singing); in addition, arch-like pitch contours were found in music and speech but not birdsong, possibly reflecting an influence of speech patterns on musical structure. ...
Preprint
Full-text available
Music is a complex learned behavior that is ubiquitous among humans, and many musical patterns are shared across geography and cultures ("music universals"). Knowing whether these universals are specific to humans or shared with other animals is important to understand how production-related factors (motor biases and constraints) or cognitive factors (learning) contribute to the emergence of these acoustic patterns. Bird song is often described as an animal analogue of human music, and some studies of individual avian species highlight acoustic similarities between bird song and music. However, expansive and comparative approaches are necessary to identify universal patterns within bird song, reveal mechanisms associated with these patterns , and draw parallels to music universals. Here, we adopt such an approach and analyze the prevalence of acoustic patterns (sequences) across ~300 species of passerines, spanning both oscines (songbirds; vocal learners) and their sister clade, suboscines (passerines that produce songs that are not learned), as well as within a global corpus of human vocal music. This approach allowed us to directly test hypotheses that phonation mechanisms or vocal learning shape the emergence of universal patterns. We first document acoustic patterns that were widely shared across passerines and similar to music universals (e.g., small pitch intervals), highlighting the role of shared vocal production mechanisms in these patterns. Consistent with a contribution of vocal learning, we observed patterns (e.g., alternation in durations) there were more similar between oscines and humans than between suboscines and humans . Interestingly, we also discovered patterns (e.g., pitch alternation) that were inconsistent with a contribution of vocal learning and were more similar between suboscines and humans than between oscines and humans. 
This research provides the broadest evidence of shared universals in vocal performance across birds and humans and highlights convergent mechanisms shaping communication patterns.
Article
Organizational patterns can be shared across biological systems, and revealing the factors shaping common patterns can provide insight into fundamental biological mechanisms. The behavioral pattern that elements with more constituents tend to consist of shorter constituents (Menzerath’s law [ML]) was described first in speech and language (e.g., words with more syllables consist of shorter syllables) and subsequently in music and animal communication. Menzerath’s law is hypothesized to reflect efficiency in information transfer, but biases and constraints in motor production can also lead to this pattern. We investigated the evolutionary breadth of ML and the contribution of production mechanisms to ML in the songs of 15 songbird species. Negative relationships between the number and duration of constituents (e.g., syllables in phrases) were observed in all 15 species. However, negative relationships were also observed in null models in which constituents were randomly allocated into observed element durations, and the observed negative relationship for numerous species did not differ from the null model; consequently, ML in these species could simply reflect production constraints and not communicative efficiency. By contrast, ML was significantly different from the null model for more than half the cases, suggesting additional organizational rules are imposed onto birdsongs. Production mechanisms are also underscored by the finding that canaries and zebra finches reared without the auditory experiences that guide vocal development produced songs with ML patterning nearly identical to that of typically reared birds. These analyses highlight the breadth with which production mechanisms contribute to this prevalent organizational pattern in behavior.
Article
In melodies from a wide variety of cultures, a large pitch interval tends to be followed by a change of direction. Although this tendency is often attributed to listeners' expectations, it might arise more simply from constraints on melodic ranginess or tessitura. Skips tend toward the extremes of a melody's tessitura, and from those extremes a melody has little choice but to retreat by changing direction. Statistical analyses of vocal melodies from four different continents are consistent with this simple explanation. The results suggest that, in the sampled repertoires, patterns such as "gap-fill," "registral direction," and "registral return" (L. Meyer, 1956, 1973; E. Narmour, 1990) are mere side effects of constraints on melodic tessitura.
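The tessitura explanation in this abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes pitches are given as MIDI-style semitone numbers, treats any interval of at least 3 semitones as a skip (a hypothetical threshold), and uses two hypothetical helpers, `post_skip_reversal_rate` and `random_bounded_melody`. The random melody draws each note independently around a central pitch, so it has no built-in rule about direction; any excess of reversals after skips can only come from the bounded pitch distribution.

```python
import random


def post_skip_reversal_rate(pitches, skip_threshold=3):
    """Fraction of skips followed by an interval in the opposite direction.

    A skip is any interval of at least `skip_threshold` semitones (an
    assumed cutoff). Skips followed by a repeated pitch are ignored.
    Returns None when no skip can be evaluated.
    """
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    reversals = 0
    counted = 0
    for current, following in zip(intervals, intervals[1:]):
        if abs(current) < skip_threshold or following == 0:
            continue
        counted += 1
        if (current > 0) != (following > 0):
            reversals += 1
    return reversals / counted if counted else None


def random_bounded_melody(n_notes, center=60, spread=4.0, seed=0):
    """Draw notes independently around `center` (assumed toy model of tessitura)."""
    rng = random.Random(seed)
    return [round(rng.gauss(center, spread)) for _ in range(n_notes)]


melody = random_bounded_melody(5000)
print(post_skip_reversal_rate(melody))
```

In this toy model a skip usually lands far from the center of the range, so the next independently drawn note tends to fall back toward the center, which registers as a reversal even though no reversal rule was programmed.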
Book
The psychological theory of expectation that David Huron proposes in Sweet Anticipation grew out of the author's experimental efforts to understand how music evokes emotions. These efforts evolved into a general theory of expectation that will prove informative to readers interested in cognitive science and evolutionary psychology as well as those interested in music. The book describes a set of psychological mechanisms and illustrates how these mechanisms work in the case of music. All examples of notated music can be heard on the Web. Huron proposes that emotions evoked by expectation involve five functionally distinct response systems: reaction responses (which engage defensive reflexes); tension responses (where uncertainty leads to stress); prediction responses (which reward accurate prediction); imagination responses (which facilitate deferred gratification); and appraisal responses (which occur after conscious thought is engaged). For real-world events, these five response systems typically produce a complex mixture of feelings. The book identifies some of the aesthetic possibilities afforded by expectation, and shows how common musical devices (such as syncopation, cadence, meter, tonality, and climax) exploit the psychological opportunities. The theory also provides new insights into the physiological psychology of awe, laughter, and spine-tingling chills. Huron traces the psychology of expectations from the patterns of the physical/cultural world through imperfectly learned heuristics used to predict that world to the phenomenal qualia we experience as we apprehend the world. Bradford Books imprint
Book
The voices of birds have always been a source of fascination. Nature's Music brings together some of the world's experts on birdsong, to review the advances that have taken place in our understanding of how and why birds sing, what their songs and calls mean, and how they have evolved. All contributors have striven to speak not only to fellow experts but also to the general reader. The result is a book of readable science, richly illustrated with recordings and pictures of the sounds of birds. Bird song is much more than just one behaviour of a single, particular group of organisms. It is a model for the study of a wide variety of animal behaviour systems, ecological, evolutionary and neurobiological. Bird song sits at the intersection of breeding, social and cognitive behaviour and ecology. As such, interest in this book will extend far beyond the purely ornithological, to behavioural ecologists, psychologists and neurobiologists of all kinds.
Article
In descriptions of melodic structure, pitch proximity is usually defined as the tendency for small pitch intervals to outnumber large ones. This definition is valid as far as it goes; however, an alternative definition is preferable. The alternative defines pitch proximity in terms of two more basic constraints - a constraint on tessitura (or pitch distribution) and a constraint on mobility (or freedom of motion). This new definition offers several advantages. Whereas the usual definition predicts only interval size, the new definition predicts interval direction as well. The usual definition predicts small intervals generally, whereas the new definition predicts context-sensitive variations in interval size. Finally, if the new definition is given the first few notes in a melody, it can assign a probability to each of the pitches that could occur next. In sum, the new definition offers a more precise and detailed description of melodic structure.
Book
Contents: List of tables; List of figures; Preface; Acknowledgments; 1. Introduction; 2. Phonetic aspects of intonation; 3. The IPO approach; 4. A theory of intonation; 5. Declination; 6. Linguistic generalizations; 7. Applications; 8. Conclusion; References; Author index; Subject index.
Article
2nd ed., completely revised and updated. Bibliography on pp. 9 and 14-34, at the beginning of the book.