
A road to a better understanding of rhythms in speech using a comparative approach


Abstract

The aim of the paper is to once again point out the advantages of a comparative approach between human and non-human animals as a road to a better understanding of the origin of rhythms in language evolution. To illustrate another field that would highly benefit from the comparative approach, we focus on acoustic signals of human and non-human animals using two concrete phenomena: final lengthening and f0 declination. These two phonetic markers have been chosen, because they are involved in determining the units that can form a rhythmic sequence. Final lengthening is a phonetic signature of the end of a phrase and signals a boundary, while the slope of f0 declination can signal the length of an entire phrase. Both are relatively frequent in human communication, they have been considered as being universal even if language-specific modifications may be found. There have been several debates about the origin of prosodic phenomena in general, e.g. whether they rely on linguistic representations or general physical properties. Similar to other scientists (Pika et al., 2018; Matzinger and Fitch, 2021; Pouw and Fuchs, 2022; Hersh et al., 2023; Hoeschele et al., 2023), we strongly believe that more interdisciplinary cross-fertilization is needed to better understand what properties are shared among human and non-human animals in their rhythm communication. The road towards such interdisciplinary exchange is clearly not without obstacles, but the joint venture, in our case between a biologist and a linguist, can enhance the collection of species producing selected phenomena, can initiate new recording and open source databases (Hersh et al., 2023), may reduce the human-centric view on the evolution of complex acoustic communication and the insider bias when doing comparative work (Hoeschele et al., 2023).
Lara S. Burchardt and Susanne Fuchs
Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin
Emails: burchardt@leibniz-zas.de, fuchs@leibniz-zas.de
1. Introduction
The production and perception of acoustic communication signals in different species, including humans, have intrigued scientists all over the world and from diverse research fields such as linguistics, biology, and psychology. Although these fields have completely different foci, some questions are important to all of them. One of these is the temporal structure of acoustic communication and how it is used to convey information. These temporal structures, or rhythms, can act on different levels, within a phrase or between phrases, and they can also help in determining phrase boundaries. Perspectives on these connected questions are manifold, and cross-talk between the three disciplines is often limited. In this chapter, we advocate for more cross-talk, using two prosodic markers connected to rhythms as examples to showcase the advantages of combining linguists’ and biologists’ knowledge about the respective phenomena.
Rhythms can be defined in various ways (for an overview, see Turk and Shattuck-Hufnagel, 2013). One example is given by Patel, where rhythm is the “systematic patterning of sounds in terms of timing, accent and grouping” (Patel, 2008:96). Building on this, we consider rhythm as a non-random, ordered, predictable, and repeated alternation of different elements in a sequence. The motivation to use this definition is to be relatively independent of theoretical concepts in one or the other research domain. We think that a broader perspective including humans and non-human animals alike is important because it allows us to better understand the underlying principles and may root rhythms of human language in evolution.
According to this definition, the building blocks of rhythm are elements in a sequence. Determining these elements in a known human language, i.e. linguistic units like syllables, words, and prosodic phrases, is a challenge on its own (Fletcher, 2010:524), but it might still be more straightforward than determining the building blocks of rhythm in non-human vocalizations. Amongst other approaches, rhythmic markers might be a way to tackle this issue in non-human animals. In the current chapter, we focus on two prosodic markers of rhythm, fundamental frequency (f0) declination and final lengthening. Both are frequently found in human communication and have also been reported for specific species in non-human communication.
Final lengthening refers to a phenomenon where a final or penultimate syllable of an utterance or prosodic phrase is produced with a longer duration than when the same syllable is uttered within an utterance or prosodic phrase (Fletcher, 2010:540) (Figure 1A). Thus, the lengthening of an element can signal the end of a unit, and its extent varies nonlinearly with the strength of the boundary (Kentner et al., 2023). It can co-occur with pauses and fundamental frequency (f0) lowering (Petrone et al., 2017). From a perceptual side, it is a phonetic signature that might help to mark the end of a unit (Schel et al., 2009).
F0 declination has been widely discussed in phonological (Ladd, 2008, 1988) and phonetic terms (Strik and Boves, 1995). We broadly define f0 declination as “the gradual decrease of f0 over the course of an utterance” (Fuchs et al., 2015:35) in order to be as inclusive as possible concerning non-human animal communication (Figure 1 A-D). F0 declination is common in statements as opposed to questions. From a phonetic perspective, it is calculated as a linear regression through f0 values in a given temporal window (e.g., 1-4 s in Yuan and Liberman, 2014:69) that often corresponds to interpausal units or annotated prosodic phrases. The slope of the linear regression is negative by definition (f0 declines) and varies with utterance length. Various studies (among others Cooper and Sorensen, 1981; Swerts et al., 1996; Fuchs et al., 2015) have shown that the longer the utterance, the shallower the slope, and the shorter the utterance, the steeper the slope.
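As a minimal sketch of the regression-based measure described above, the following Python function fits an ordinary least-squares line through the f0 samples of one unit and returns its slope in Hz per second. The function name, the parameter names, the default minimum duration, and the example values are illustrative assumptions and are not taken from the cited studies.

```python
from typing import Sequence


def declination_slope(
    times_s: Sequence[float],
    f0_hz: Sequence[float],
    min_duration_s: float = 1.0,  # assumed default, chosen for illustration only
) -> float:
    """Return the least-squares slope (Hz/s) of f0 over time for one unit.

    times_s: sample times in seconds, in ascending order (voiced frames only).
    f0_hz:   f0 value in Hz at each sample time.
    """
    if len(times_s) != len(f0_hz) or len(times_s) < 2:
        raise ValueError("need two or more paired time/f0 samples")
    if times_s[-1] - times_s[0] < min_duration_s:
        raise ValueError("unit is shorter than the minimum analysis window")

    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_f = sum(f0_hz) / n
    # Sum of squared deviations in time, and co-deviation of time and f0.
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, f0_hz))
    if sxx == 0:
        raise ValueError("all sample times are identical")
    return sxy / sxx


# Hypothetical unit: f0 falls from 220 Hz to 180 Hz over 2 seconds.
example_slope = declination_slope(
    [0.0, 0.5, 1.0, 1.5, 2.0],
    [220.0, 210.0, 200.0, 190.0, 180.0],
)
print(example_slope)  # -20.0 (Hz per second)
```

A negative return value corresponds to declination. Comparing slopes across units of different duration is one way to examine the reported relationship between utterance length and slope steepness.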
Together, f0 declination and final lengthening can be used to create a sense of units in a rhythmic sequence, helping to signal boundaries between units and convey information about the length of a unit in the sequence itself that may be repeated over time to produce structured time events.
There is a lot of potential in studying the described phenomena in a more comparative way between species. Linguistics could help to answer long-standing questions in animal communication. Conversely, knowledge about animal communication, and the opportunities offered by studying a wide variety of animal species with different cognitive abilities or physical constraints, could help to resolve debates in linguistics. We can observe variations in pitch and timing in non-human animal communication similar to prosodic features in human speech (Briefer, 2012; Hotchkin and Parks, 2013; Filippi, 2016). Often these variations are caused by physiological changes (for example, the emotional state can influence muscle tension (Briefer, 2012)). Different kinds of information can be conveyed by those changes, be it individual identity in birds and many other species (e.g., Linhart et al., 2022) or context in primates (Crockford et al., 2018). There are many other examples of such changes that are species-specific and can have very different functions.
Given the variety of functions, production mechanisms, and environmental constraints, the specific mechanisms and functions of prosodic communication in animals are still not fully understood. An interdisciplinary exchange will enrich linguistics and biology in a complementary fashion. Linguists may consider language and communication as less uniquely human when taking results on other species into account. Many recently published papers have found more and more similarities between human and non-human animal communication (Favaro et al., 2020; Huang et al., 2020; Valente et al., 2021). In biology, more research is needed to explore the extent and complexity of prosodic features in non-human communication. Animal communication may resemble human prosody, but as far as we know there are important differences: many animal vocalizations are innate rather than learned through acquisition, and they may also be less flexible and expressive than human speech. At the same time, the spectrum between innate, adjusted and learned vocalizations gives interesting options to study the prerequisites for prosodic phenomena.
The aim of the paper is to once again point out the advantages of a comparative approach between human and non-human animals as a road to a better understanding of the origin of rhythms in language evolution. To illustrate another field that would highly benefit from the comparative approach, we focus on acoustic signals of human and non-human animals using two concrete phenomena: final lengthening and f0 declination. These two phonetic markers have been chosen because they are involved in determining the units that can form a rhythmic sequence. Final lengthening is a phonetic signature of the end of a phrase and signals a boundary, while the slope of f0 declination can signal the length of an entire phrase. Both are relatively frequent in human communication and have been considered universal, even if language-specific modifications may be found. There have been several debates about the origin of prosodic phenomena in general, e.g. whether they rely on linguistic representations or general physical properties. Similar to other scientists (Pika et al., 2018; Matzinger and Fitch, 2021; Pouw and Fuchs, 2022; Hersh et al., 2023; Hoeschele et al., 2023), we strongly believe that more interdisciplinary cross-fertilization is needed to better understand what properties are shared among human and non-human animals in their rhythm communication. The road towards such interdisciplinary exchange is clearly not without obstacles, but the joint venture, in our case between a biologist and a linguist, can enhance the collection of species producing selected phenomena, can initiate new recordings and open source databases (Hersh et al., 2023), and may reduce the human-centric view on the evolution of complex acoustic communication and the insider bias when doing comparative work (Hoeschele et al., 2023).
2. F0 declination in acoustic signals: Humans vs. animals
2.1 Characteristics of f0 declination in human speech
F0 declination, the gradual decrease of f0 over the course of an utterance, spans several words in human language and operates at the level of macro rhythm (rhythms in longer time units) in human speech. Strik and Boves, among others, proposed using utterances of at least 1 second, since otherwise the calculation of the declination slope might be heavily affected by local f0 variations (Strik and Boves, 1995). F0 declination is not a rhythm in itself, but similar to final lengthening, it is a prosodic signature of a unit that can be repeated over time and form a rhythm.
While more cross-linguistic work on f0 declination using the same methodology is missing, it has been reported for various languages, such as Danish, Dutch, English, French, German, Greek, Japanese, and Spanish (see Fuchs et al., 2015, for an overview), so there is reason to believe it is a relatively robust phenomenon (Hauser and Fowler, 1992), at least in Indo-European languages. However, it is not mandatory and can be modulated: for example, f0 can rise at the end of a phrase (called continuation rise) to signal the speaker’s motivation to continue talking, or f0 can rise to signal question intonation. There are also language specificities as well as other factors that may shape the declination slope. For example, Lieberman et al. found more negative slopes in reading than in spontaneous speech (Lieberman et al., 1985).
Since at least the late 1960s, a potential physiological origin of this phenomenon has been postulated. Lieberman measured subglottal pressure and f0 declination in three human participants reading declarative sentences and found a positive correlation between the two variables (Lieberman, 1967). He therefore suggested a potential origin in respiratory behavior, because respiration is a driving force of phonation. Others have argued that muscular tension in the vocal folds may rather be at the origin of f0 declination (e.g., Ohala, 1978) and that tensioning of the vocal folds is independent of respiration because the primary function of the larynx is to save lives and protect the lungs from foreign bodies, hence it must be independent and quick (Ohala, 1990). The challenge in arguing for or against either account, or a mix of the two, is that measuring subglottal pressure and laryngeal tension is very invasive, so empirical data are limited so far.
Apart from the physiological origin of f0 declination, cognitive processes seem plausible as well. Since the slope of the f0 declination is correlated with the length of the upcoming utterance, some anticipatory planning may be involved (Yuan and Liberman, 2014).
2.2 F0 declination in animal communication with a focus on bird species and primates
Descriptions hinting at f0 declination in monkeys are scarce. For example, in a description of the vocalizations of the Black and White Colobus monkey, we find the following: “The final phrase is often deeper pitched than the others [...]” (Marler, 1972:181). This refers to the alarm call “roar”. This led Schel et al. to hypothesize that this phenomenon is perceptually conspicuous and marks the end of the sequence (Schel et al., 2009). In Colobus monkeys, a roaring sequence consists of one or more roaring phrases, where a phrase is a basic unit made up of about 15 “pulses”, each with an average duration of 0.7 seconds, so that a whole phrase lasts around 10 seconds (Marler, 1972).
Confusingly, in the same species in a different study, the exact opposite was reported, at least for the Black colobus monkey (Colobus satanas): “Initial phrases all decreased in pitch during delivery, the terminal phrases all increased” in pitch (Oates and Trocco, 1983:100). This could have different explanations: in this species the phenomenon could depend on unknown parameters that differed between the studies, or the decline in final phrases could coincidentally have been observed in a different colobus monkey species.
The most detailed study on f0 declination in non-human animals was conducted in vervet monkeys (Cercopithecus aethiops) and rhesus macaques (Macaca mulatta), two very common model species (Hauser and Fowler, 1992). For both species, vocal production also shows f0 declination, which is suggested to serve a similar communicative function as in human language (Hauser and Fowler, 1992). The study investigated vocalizations uttered during aggressive interactions by vervet monkeys in the wild and an affiliative vocalization of wild rhesus macaques. For both species, rhesus macaques and vervet monkeys, an almost linear decline in f0 could be shown over call bouts of two calls and over call bouts of three calls. Furthermore, for two-call bouts of vervet monkeys, a correlation between the duration of the bout and the magnitude of the f0 decline could be shown, as expected from the human language literature (e.g., Yuan and Liberman, 2014). No correlation could be found between the duration of the bout and the magnitude of the f0 decline in rhesus macaques. It is also interesting to note that the structured decline of f0 in vervet monkeys could only be seen in adults, not in juveniles. This could be explained by the fact that inter-call intervals in juveniles are generally longer, indicating that juveniles, in contrast to adults, might take a breath between calls. This makes it even more interesting to study the phenomenon and its exact mechanism in more depth in this species.
Another instance of f0 declination in monkeys has been reported for baboons, where f0 also declines within bouts of calling. This decline is described as independent of rank and age, in a species where f0 is generally highly correlated with these two parameters (Fischer et al., 2004), which makes it a more general phenomenon worthy of further investigation.
In birds, f0 decline was found in the vocalizations of the budgerigar (Melopsittacus undulatus). Here “mean F0 measurements were lower for segments in syllable-final position when compared to medial segments” (Mann et al., 2021:6, fig. 4 caption). This makes the budgerigar the only non-human species so far in which both final lengthening and f0 declination have been observed. This is very likely not representative of the abundance of the phenomena, but reflects a sampling bias, as those phenomena have not been studied in many species.

Figure 1: Explanation of f0 declination and final lengthening using two examples. A) Oscillogram of human language. The spoken text is: Always there had been war between the giants and the gods. The duration of the ‘s’ of giants and gods is annotated; the final ‘s’ is much longer, demonstrating final lengthening. B) The spectrogram for the same human sentence is shown with a solid line indicating the fundamental frequency. A clear f0 decline is visible. C) An oscillogram of budgerigar twittering. D) The corresponding spectrogram of the twittering; the fundamental frequency is shown as an extra solid line and an f0 decline is visible.

2.3 Comparing the use of f0 declination in human and animal communication
The number and level of detail of papers published on f0 declination are clearly greater in the linguistic domain, which should not be taken to imply that this is an exclusively human phenomenon. Papers published on f0 declination in animal communication include primates and birds. They often lack empirical work on the underlying mechanisms (note, however, that this is different for final lengthening, see below). While papers on animal communication are a stepping stone toward a broader perspective, they are mostly descriptive.
Whether or not f0 declination is primarily the result of a decrease in subglottal pressure, a reduction in laryngeal tension over the course of an utterance, a marker of anticipatory planning of an utterance, or a mixture of these is still unclear. The invasiveness of recording data in favor of the first two explanations also limits the empirical evidence in humans.
There may also be some challenges when comparing human and non-human animals. The minimum length of an utterance that can be considered for calculating the f0 declination slope may have to be adjusted for various animal species, as may the terminology used. For a linguist not trained in animal communication, the term “syllable” as used in that field may have a very different connotation than for a biologist, for whom it may be clear that this is an utterance between pauses. Joint venture investigations on human and non-human animals might also reveal deeper insights because they differ in their respiratory, vocal, and cognitive repertoire.
3. Final lengthening: Humans vs. animals
3.1 Final lengthening in human speech
Over the last 50 years, phonetic studies that investigated temporal properties in a variety of languages found that final lengthening is a reliable phonetic marker determining the end of a speech chunk and is very pronounced next to a following pause (e.g., Klatt, 1976; Edwards et al., 1991; for a review on languages, see Paschen et al., 2022). There have also been considerations that the lengthening of the final segment is part of the pause (e.g., Krivokapić et al., 2022) or can, in extreme cases, be produced instead of a pause at a fast speech rate. Indeed, pause is an important determiner of rhythm.
Final lengthening has been claimed to be universal in human language (Fletcher, 2010) with additional language specificities (e.g., Nakai et al., 2009). Since the term “universal” can be ambiguous in meaning (Bickel, 2011), we refer to statistically universal here, which relies on robust statistical evidence across languages but also allows for exceptions. Only recently has empirical evidence, obtained with the same methodology across 25 mostly understudied languages, shown that the lengthening of vowels is a statistically robust cross-linguistic phenomenon (Paschen et al., 2022). Language-specific variations, driven by phonological vowel length, were found as well. Sound-specific variations have also been reported, for example by Berkovits, who reported stronger lengthening effects for final fricatives than for stops in Hebrew (Berkovits, 1993). Paschen provided evidence for Lower Sorbian that lengthening occurs in vowels, sonorants, and fricatives, but not in stops (Paschen, in press). While these latter segmental influences may be specific to human language, they may also give some hints that continuous airstream mechanisms make final lengthening more likely.
The degree of lengthening has been extensively discussed in the linguistic literature. For example, the pi-gesture model (Byrd and Saltzman, 2003) proposes that segments are lengthened more the closer they are to the boundary. Moreover, the degree of lengthening varies with the boundary type. Without going too deeply into theoretical proposals such as the Prosodic Hierarchy (Jun, 1998), it is fair to say that final segments at major boundaries, for example at the end of a sentence, may be longer than segments at phrase boundaries within a sentence. While language-specific variants exist, this does not exclude the assumption that the underlying principles are physical in nature but have been shaped in various ways by the properties of the sounds and the users of individual languages.
The puzzling question is: what are the underlying mechanisms that may cause final lengthening? Do we also find it in other behavior or other species?
3.2 Final lengthening in animal communication with a focus on bird species and primates
Final lengthening is receiving increasing attention in animal acoustic studies and has been found in birds and primates. As with f0 declination, final lengthening has been found in the budgerigar, a vocal learning parrot species. It was reported that segments at the end of vocalizations were more likely to be longer. In 14 adult budgerigars, it was found that segments in syllable-final positions are on average longer than medial segments (Mann et al., 2021). In another study on 80 song snippets from 80 different songbird species, the same could be shown: song-final notes were on average significantly longer than non-final notes (Tierney et al., 2011). This is especially impressive, as the analysis was not conducted for individual species, but across all 80 species, indicating that this is a generally observable phenomenon in songbirds. A more detailed analysis of these 80 songs to find possible differences between families would nevertheless be interesting, to get a better understanding of how widespread final lengthening in songbirds really is. Both papers argue that final lengthening can be observed in these birds as well as in humans because of similar motor constraints. Both humans and songbirds show a high degree of control over their vocal articulators, with the possibility to rapidly adjust them during vocal production. Nevertheless, abrupt termination of these movements might be difficult, as opposed to a gradual relaxation and therefore slowing of articulators, resulting in final lengthening. This argument is further strengthened by the fact that budgerigar segments in particular are produced within a single breath (Mann et al., 2021; Tierney et al., 2011). It would be interesting to investigate respiratory kinematics and final lengthening in human speech production.
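The comparison of final and non-final element durations described above can be sketched as a simple ratio. This is only an illustration under stated assumptions: the function name and the input layout (one list of element durations per sequence, in order of production) are hypothetical, and the cited studies used their own statistical procedures rather than this ratio.

```python
from statistics import mean
from typing import List, Sequence


def final_lengthening_ratio(sequences: Sequence[Sequence[float]]) -> float:
    """Mean duration of sequence-final elements divided by the mean
    duration of all non-final elements. Values above 1 indicate that
    final elements are longer on average.

    sequences: one list of element durations (seconds) per sequence;
               sequences with fewer than two elements are ignored.
    """
    usable = [seq for seq in sequences if len(seq) >= 2]
    if not usable:
        raise ValueError("need one or more sequences with two or more elements")
    finals: List[float] = [seq[-1] for seq in usable]
    non_finals: List[float] = [d for seq in usable for d in seq[:-1]]
    return mean(finals) / mean(non_finals)


# Hypothetical durations (seconds) for two sequences of notes or segments.
example_ratio = final_lengthening_ratio([
    [0.10, 0.10, 0.20],
    [0.10, 0.10, 0.10, 0.20],
])
print(example_ratio)
```

Because the ratio is unit-free, it can in principle be computed in the same way for syllables, notes, or calls, although what counts as an element still has to be defined per species.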
Final lengthening was also investigated in at least three different primate species: two crested gibbon species and the indri. For all three species, an elongation of unit duration near the end of a phrase was found (Huang et al., 2020; Valente et al., 2021). In the gibbons, this effect was found on two different structural levels, in vocal sequences and in vocal bouts (where a vocal sequence is a short unit and a bout is made up of several vocal sequences). The suggested reasons for this form of final lengthening differ from those discussed in humans. The authors suggest a connection to the performance hypothesis. Gibbons might be modulating the frequency of their calls rapidly at the end of sequences, to advertise their individual quality to females as potential mating partners. An increase in frequency modulations, even though fast, would potentially lead to an increase in the duration of notes (Huang et al., 2020). If this is true, the observed phenomenon of final lengthening in crested gibbons would only be a byproduct of other processes.
3.3 Comparing the use of final lengthening in human and animal communication
There have been different explanations for final lengthening in the literature, which may not be mutually exclusive. It has been understood as a general motor property rather than being innate. Tierney et al. attribute it to the energy efficiency of the underlying motor actions in humans and singing birds (Tierney et al., 2011). As mentioned above, Huang et al. explain it as a byproduct rather than a phenomenon on its own (Huang et al., 2020). Matzinger and Fitch mention the possibility that the slowing down of articulators could be a result of a change from exhalation to inhalation, i.e., respiratory dynamics and the occurrence of a breathing pause cause final lengthening (Matzinger and Fitch, 2021). We think that this could be a plausible explanation for those species that produce segments within one breath, but in humans, final lengthening can also be found without a change from exhalation to inhalation. Nevertheless, the relation of one breath to one segment (or one phrase in human communication) may have been at the origin of human language evolution. In spontaneous interactive dialogues, more than 50% of all turns consisted of only one breathing cycle (Rochet-Capellan and Fuchs, 2014). Linguistic studies have mostly focused on language- and segment-specific properties in the implementation of final lengthening. Because the number of published papers on these specificities is so much larger than in animal communication, one may implicitly assume that it is a phenomenon of human language.
Even if language variations exist, this does not exclude the possibility of an underlying motor principle. Human and non-human animals clearly have motor constraints, but these may not persist in each and every situation, and animals may also be able to compensate for them if needed. Monocausalities are rather rare in biological systems. For example, the available exhalation air in human speech may be physically constrained by the lung volume (vital capacity) of a speaker. In human and non-human acquisition, a high positive correlation between lung volume and utterance/vocalization length has been found (see review in Fuchs and Rochet-Capellan, 2021). Evidence for a similarly strong correlation between utterance length and vital capacity in adults is missing, because humans with smaller lung capacities may adjust their laryngeal resistance and lose less expiratory air to compensate for their physical constraints.
All in all, evidence for final lengthening in non-human animals has shifted the perspective from viewing it as a purely linguistic phenomenon to the notion that it might be grounded in general motor constraints. There is clearly an advantage when human and non-human animals are taken into account.
4. Future directions for research
4.1 A roadmap

In the venture of finding the biological underpinnings of prosodic phenomena like f0 declination and final lengthening through a comparative approach combining knowledge and advantages from studying human and non-human communication, there are several issues to overcome and steps to take. We lay out a possible roadmap to achieve this here, with concrete steps.
1) Establishing a common vocabulary and conceptual framework
To ensure effective communication between researchers from different disciplines, it is essential to establish a common vocabulary and conceptual framework for discussing prosodic features like f0 declination and final lengthening. This touches on issues mentioned earlier, where terms like “syllable” might be understood differently between linguists and biologists, but also within biology. Clear terminology, glossaries, and clear but adjustable definitions and criteria for measuring and analyzing features are important steps.
2) Establishing the necessary data basis
We would need to develop a comprehensive database of vocalizations from a wide range of species to study these and other phenomena comparatively. A database should include both natural or spontaneous and elicited vocalizations. Most importantly, it should include meta-information about the social or ecological context, as well as age and sex where available.
3) Identifying universal and species-specific patterns
Once the database is established, researchers can use it to begin identifying robust patterns in the use of prosodic features like f0 declination and final lengthening across species, as well as species-specific variations. Eventually, this can help to shed light on the evolutionary and ecological factors that shape the use of these features in different species with different needs and skills.
4) Integrating insights from different disciplines
To fully understand the mechanisms and functions of prosodic features in speech and animal communication, researchers need to integrate insights from different disciplines, including linguistics, biology, psychology, and neuroscience. With comprehensive knowledge of theories, mechanisms, and constraints in all these fields, gained through interdisciplinary collaborations, we can identify new research questions and generate novel insights into the biological and cognitive foundations of communication.
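The meta-information named in step 2 could be captured in a record structure along the following lines. This is a hypothetical sketch: the class name, the field names, and the helper function are assumptions made for illustration and do not describe an existing database or standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VocalizationRecord:
    """One recording entry in a hypothetical comparative database."""
    species: str                     # scientific name of the species
    recording_id: str                # identifier chosen by the contributor
    elicited: bool                   # False for natural or spontaneous vocalizations
    context: str                     # social or ecological context of the recording
    age_class: Optional[str] = None  # e.g. "adult" or "juvenile", if known
    sex: Optional[str] = None        # if known


def missing_metadata(record: VocalizationRecord) -> List[str]:
    """List the optional fields that are still unknown for a record."""
    optional_fields = ("age_class", "sex")
    return [name for name in optional_fields if getattr(record, name) is None]


# Hypothetical entry with context known but age and sex not yet annotated.
example_record = VocalizationRecord(
    species="Melopsittacus undulatus",
    recording_id="example-001",
    elicited=False,
    context="affiliative",
)
print(missing_metadata(example_record))  # ['age_class', 'sex']
```

Keeping age and sex optional reflects that such information is not always available for recordings of wild animals, while context is treated as required here because the text singles it out as most important.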
4.2 Prosodic rhythm markers and respiration
A future endeavor towards a better understanding of prosodic markers of rhythm could be to investigate the interaction between respiration, phonation (voice quality) and anticipatory planning in human and non-human animals. Respiration itself is a biological rhythm and the duration of breathing cycles can constrain how long an animal can phonate. At the end of long sequences when lung volume is reduced, laryngeal adjustments may be required to continue phonation. While this has been modelled for humans (Zhang, 2016), it may also exist in non-human animals. The amount of air inhaled may further shape acoustic properties like intensity and to some extent f0 (Watson et al., 2003).
Vocalizations in non-human primates such as screams or grunts may have specific prosodic
features such as f0 declination and final lengthening, which can be shaped by respiratory
control. Birds have more complex respiratory and laryngeal systems than humans, including
not only lungs but also several additional air sacs that are necessary during avian flight. Phonation
is produced with a syrinx, which sits much closer to the lungs than the human larynx, at the place where
the trachea forks into the lungs. On the one hand, these differences in physiology and control
mechanisms are a big challenge for comparative work; on the other hand, they may give us
insights into which physiological or cognitive systems are able to produce these patterns and
which features such systems have. Recent technological developments in the area of thermography
allow us to record breathing and acoustics even in free-ranging animals (Demartsev et al.,
2022).
4.3. The potential for interdisciplinary collaborations between linguistics, psychology, and biology
The potential for interdisciplinary collaborations has been discussed from different angles
throughout this paper and is summarized here. Opportunities arise on different
levels and for different questions. Any phenomenon studied comparatively has a strong potential
to yield evolutionary insights, and prosodic phenomena
such as f0 declination and final lengthening are no exception. By comparing the use of prosodic features in
human and non-human animal communication, we can gain insights into their evolutionary
origins and into the development of speech rhythms. We can also shed light on the cognitive and biological
underpinnings of human language and communication. On a different level, the comparative
approach can provide a valuable perspective on the diversity of prosodic features used across
different languages and language families and might therefore raise new questions in more
specific research areas. We will be better able to identify universal patterns and language-specific
variations.
5. References
Berkovits, R., 1993. Progressive Utterance-Final Lengthening in Syllables with Final
Fricatives. Lang. Speech 36, 89–98. https://doi.org/10.1177/002383099303600105
Bickel, B., 2011. Statistical modeling of language universals. Linguist. Typol. 15, 401–413.
https://doi.org/10.1515/lity.2011.027
Briefer, E.F., 2012. Vocal expression of emotions in mammals: mechanisms of production
and evidence. J. Zool. 288, 1–20. https://doi.org/10.1111/j.1469-7998.2012.00920.x
Byrd, D., Saltzman, E., 2003. The elastic phrase: modeling the dynamics of boundary-adjacent
lengthening. J. Phon. 31, 149–180. https://doi.org/10.1016/S0095-4470(02)00085-2
Cooper, W., Sorensen, J., 1981. Fundamental frequency in sentence production. Springer
Verlag, New York.
Crockford, C., Gruber, T., Zuberbühler, K., 2018. Chimpanzee quiet hoo variants differ
according to context. R. Soc. Open Sci. 5, 172066.
https://doi.org/10.1098/rsos.172066
Demartsev, V., Manser, M.B., Tattersall, G.J., 2022. Vocalization-associated respiration
patterns: thermography-based monitoring and detection of preparation for calling. J.
Exp. Biol. 225, jeb243474. https://doi.org/10.1242/jeb.243474
Edwards, J., Beckman, M.E., Fletcher, J., 1991. The articulatory kinematics of final
lengthening. J. Acoust. Soc. Am. 89, 369–382. https://doi.org/10.1121/1.400674
Favaro, L., Gamba, M., Cresta, E., Fumagalli, E., Bandoli, F., Pilenga, C., Isaja, V.,
Mathevon, N., Reby, D., 2020. Do penguins’ vocal sequences conform to linguistic
laws? Biol. Lett. 16, 20190589. https://doi.org/10.1098/rsbl.2019.0589
Filippi, P., 2016. Emotional and Interactional Prosody across Animal Communication
Systems: A Comparative Approach to the Emergence of Language. Front. Psychol.
7.
Fischer, J., Kitchen, D.M., Seyfarth, R.M., Cheney, D.L., 2004. Baboon loud calls advertise
male quality: acoustic features and their relation to rank, age, and exhaustion. Behav.
Ecol. Sociobiol. 56, 140–148. https://doi.org/10.1007/s00265-003-0739-4
Fletcher, J., 2010. The Prosody of Speech: Timing and Rhythm, in: The Handbook of
Phonetic Sciences. John Wiley & Sons, Ltd, pp. 521–602.
https://doi.org/10.1002/9781444317251.ch15
Fuchs, S., Petrone, C., Rochet-Capellan, A., Reichel, U.D., Koenig, L.L., 2015. Assessing
respiratory contributions to f0 declination in German across varying speech tasks and
respiratory demands. J. Phon. 52, 35–45. https://doi.org/10.1016/j.wocn.2015.04.002
Fuchs, S., Rochet-Capellan, A., 2021. The Respiratory Foundations of Spoken Language.
Annu. Rev. Linguist. 7, 13–30. https://doi.org/10.1146/annurev-linguistics-031720-103907
Hauser, M.D., Fowler, C.A., 1992. Fundamental frequency declination is not unique to
human speech: Evidence from nonhuman primates. J. Acoust. Soc. Am. 91, 363–369.
https://doi.org/10.1121/1.402779
Hersh, T.A., Ravignani, A., Burchardt, L.S., 2023. Robust rhythm reporting will advance
ecological and evolutionary research. Methods Ecol. Evol. n/a.
https://doi.org/10.1111/2041-210X.14118
Hoeschele, M., Wagner, B., Mann, D.C., 2023. Lessons learned in animal acoustic cognition
through comparisons with humans. Anim. Cogn. 26, 97–116.
https://doi.org/10.1007/s10071-022-01735-0
Hotchkin, C., Parks, S., 2013. The Lombard effect and other noise-induced vocal
modifications: insight from mammalian communication systems. Biol. Rev. 88, 809–824.
https://doi.org/10.1111/brv.12026
Huang, M., Ma, H., Ma, C., Garber, P.A., Fan, P., 2020. Male gibbon loud morning calls
conform to Zipf’s law of brevity and Menzerath’s law: insights into the origin of human
language. Anim. Behav. 160, 145–155.
https://doi.org/10.1016/j.anbehav.2019.11.017
Jun, S.-A., 1998. The Accentual Phrase in the Korean prosodic hierarchy. Phonology 15,
189–226. https://doi.org/10.1017/S0952675798003571
Kentner, G., Franz, I., Knoop, C.A., Menninghaus, W., 2023. The final lengthening of
pre-boundary syllables turns into final shortening as boundary strength levels increase. J.
Phon. 97, 101225. https://doi.org/10.1016/j.wocn.2023.101225
Klatt, D.H., 1976. Linguistic uses of segmental duration in English: Acoustic and perceptual
evidence. J. Acoust. Soc. Am. 59, 1208–1221. https://doi.org/10.1121/1.380986
Krivokapić, J., Styler, W., Byrd, D., 2022. The role of speech planning in the articulation of
pauses. J. Acoust. Soc. Am. 151, 402–413. https://doi.org/10.1121/10.0009279
Ladd, D.R., 2008. Intonational phonology, 2nd ed. Cambridge University Press, Cambridge,
UK.
Ladd, D.R., 1988. Declination “reset” and the hierarchical organization of utterances. J.
Acoust. Soc. Am. 84, 530–544. https://doi.org/10.1121/1.396830
Lieberman, P., 1967. Intonation, perception and language. MIT Press, Cambridge, MA.
Lieberman, P., Katz, W., Jongman, A., Zimmerman, R., Miller, M., 1985. Measures of the
sentence intonation of read and spontaneous speech in American English. J. Acoust.
Soc. Am. 77, 649–657. https://doi.org/10.1121/1.391883
Linhart, P., Mahamoud-Issa, M., Stowell, D., Blumstein, D.T., 2022. The potential for
acoustic individual identification in mammals. Mamm. Biol. 102, 667–683.
https://doi.org/10.1007/s42991-021-00222-2
Mann, D.C., Fitch, W.T., Tu, H.-W., Hoeschele, M., 2021. Universal principles underlying
segmental structures in parrot song and human speech. Sci. Rep.
11. https://doi.org/10.1038/s41598-020-80340-y
Marler, P., 1972. Vocalizations of East African monkeys: II. Black and white colobus.
Behaviour 42, 175–197. https://doi.org/10.1163/156853972X00266
Matzinger, T., Fitch, W.T., 2021. Voice modulatory cues to structure across languages and
species. Philos. Trans. R. Soc. B Biol. Sci. 376, 20200393.
https://doi.org/10.1098/rstb.2020.0393
Nakai, S., Kunnari, S., Turk, A., Suomi, K., Ylitalo, R., 2009. Utterance-final lengthening and
quantity in Northern Finnish. J. Phon. 37, 29–45.
https://doi.org/10.1016/j.wocn.2008.08.002
Oates, J.F., Trocco, T.F., 1983. Taxonomy and phylogeny of black-and-white colobus
monkeys. Inferences from an analysis of loud call variation. Folia Primatol. Int. J.
Primatol. 40, 83–113. https://doi.org/10.1159/000156092
Ohala, J.J., 1990. Respiratory activity in speech, in: Hardcastle, Marchal (Eds.),
Speech Production and Speech Modelling. Kluwer, Dordrecht, pp. 23–53.
Ohala, J.J., 1978. The production of tone, in: Fromkin, V.A. (Ed.), Tone: A Linguistic Survey.
Academic Press, New York, pp. 5–39.
Paschen, L., in press. Final Lengthening across various sound classes in a language
documentation corpus of Lower Sorbian, in: Proceedings of the 20th ICPhS.
Presented at the ICPhS, Prague.
Paschen, L., Fuchs, S., Seifart, F., 2022. Final Lengthening and vowel length in 25
languages. J. Phon. 94, 101179. https://doi.org/10.1016/j.wocn.2022.101179
Patel, A.D., 2008. Music, Language and the Brain. Oxford University Press, New York.
Petrone, C., Truckenbrodt, H., Wellmann, C., Holzgrefe-Lang, J., Wartenburger, I., Höhle,
B., 2017. Prosodic boundary cues in German: Evidence from the production and
perception of bracketed lists. J. Phon. 61, 71–92.
https://doi.org/10.1016/j.wocn.2017.01.002
Pika, S., Wilkinson, R., Kendrick, K.H., Vernes, S.C., 2018. Taking turns: bridging the gap
between human and animal communication. Proc. R. Soc. B Biol. Sci. 285,
20180598. https://doi.org/10.1098/rspb.2018.0598
Pouw, W., Fuchs, S., 2022. Origins of vocal-entangled gesture. Neurosci. Biobehav. Rev.
141, 104836. https://doi.org/10.1016/j.neubiorev.2022.104836
Rochet-Capellan, A., Fuchs, S., 2014. Take a breath and take the turn: how breathing meets
turns in spontaneous dialogue. Philos. Trans. R. Soc. B Biol. Sci. 369, 20130399.
https://doi.org/10.1098/rstb.2013.0399
Schel, A.M., Tranquilli, S., Zuberbühler, K., 2009. The alarm call system of two species of
black-and-white colobus monkeys (Colobus polykomos and Colobus guereza). J.
Comp. Psychol. 123, 136–150. https://doi.org/10.1037/a0014280
Strik, H., Boves, L., 1995. Downtrend in F0 and Psb. J. Phon. 23, 203–220.
https://doi.org/10.1016/S0095-4470(95)80043-3
Swerts, M., Strangert, E., Heldner, M., 1996. F0 declination in read-aloud and spontaneous
speech, in: Proceedings of ICSLP. Presented at the ICSLP, Philadelphia, PA, pp.
1501–1504.
Tierney, A.T., Russo, F.A., Patel, A.D., 2011. The motor origins of human and avian song
structure. Proc. Natl. Acad. Sci. 108, 15510–15515.
https://doi.org/10.1073/pnas.1103882108
Turk, A., Shattuck-Hufnagel, S., 2013. What is speech rhythm? A commentary on Arvaniti
and Rodriquez, Krivokapić, and Goswami and Leong. Lab. Phonol. 4, 93–118.
https://doi.org/10.1515/lp-2013-0005
Valente, D., De Gregorio, C., Favaro, L., Friard, O., Miaretsoa, L., Raimondi, T.,
Ratsimbazafy, J., Torti, V., Zanoli, A., Giacoma, C., Gamba, M., 2021. Linguistic laws
of brevity: conformity in Indri indri. Anim. Cogn. 24, 897–906.
https://doi.org/10.1007/s10071-021-01495-3
Watson, P.J., Ciccia, A.H., Weismer, G., 2003. The relation of lung volume initiation to
selected acoustic properties of speech. J. Acoust. Soc. Am. 113, 2812–2819.
https://doi.org/10.1121/1.1567279
Yuan, J., Liberman, M., 2014. F0 declination in English and Mandarin Broadcast News
Speech. Speech Commun. 65, 67–74. https://doi.org/10.1016/j.specom.2014.06.001
Zhang, Z., 2016. Respiratory Laryngeal Coordination in Airflow Conservation and Reduction
of Respiratory Effort of Phonation. J. Voice Off. J. Voice Found. 30, 760.e7-760.e13.
https://doi.org/10.1016/j.jvoice.2015.09.015