
Synchronization of speech rhythm between Spanish-speaking interlocutors

Abstract

Paper presented at Phonetics and Phonology in Europe (PaPE) 2017
Leonardo Barón Birchenall, Noël Nguyen
Aix Marseille Univ, CNRS, LPL, Aix-en-Provence, France
In spoken language communication, accommodation refers to the many processes
employed by talkers to adapt to each other. Among these processes, phonetic convergence is
associated with an increase in similarity in speech patterns between interlocutors over the
course of a conversation. In some theoretical frameworks, phonetic convergence is thought to
occur in an involuntary and automatic manner rather than intentionally [1].
The present work focuses on the interaction between speech rhythm and phonetic
convergence in a semi-interactive task (understanding rhythm as a “temporally regular
[iteration] of events which embody alternating strong and weak values of an observable
parameter” [2]). Specifically, given that a repeated speech stimulus requires both less
processing time and lower neural activation across repetitions, and that multiple repetitions
significantly enhance memory and learning [3], we propose that the use of regular rhythmic
structures during conversations produces more convergence between speakers than the use of
irregular rhythmic structures. To the best of our knowledge, the only existing research on this
particular topic is that conducted by Späth et al. [4], who found more rhythmic
convergence between a healthy person and a model speaker than between individuals with
Parkinson’s disease and the same model speaker.
To test our hypothesis, we created a set of stimuli consisting of seven groups of 16 nine-
or eight-syllable Spanish sentences each. Each group had a particular rhythmic structure,
obtained through the arrangement of different types of words (oxytones, paroxytones,
proparoxytones and unstressed words) in feet of different lengths. Rhythmic structures were
composed as follows (in the patterns, a lowercase x represents an unstressed syllable and an
uppercase X a stressed syllable; in the example sentences, stressed syllables are capitalized):
Regular structures: (1) Three feet, head to the right: xxXxxXxxX (e.g. la re-PÚ-bli-ca
NO ter-mi-NÓ) [the republic did not end]. (2) Three feet, head to the left: XxxXxxXxx (e.g.
E-llos es-PE-ran al MÉ-di-co) [they wait for the doctor]. (3) Four feet, head to the left:
XxXxXxXx (e.g. JUAN es-TÁ bus-CAN-do BA-rro) [John is looking for mud].
Irregular structures: (4) Three feet, head to the right: xxXxXxxXx (e.g. la es-PO-sa
CA-e sin VI-da) [the wife falls down dead]. (5) Three feet, head to the left: XxxxXxxXx (e.g.
CAR-los ter-mi-NÓ mi lla-MA-da) [Charles ended my call]. (6) Four feet, head to the left:
XxXXxxxXx (e.g. JUAN es-TÁ SIEM-pre so-me-TI-do) [John is always under control]. (7)
Four feet, head to the left: XxXXxxXx (e.g. JUAN sa-LIÓ RÁ-pi-do SIEM-pre) [John always
left quickly].
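The x/X notation above can be checked mechanically. The following is a minimal sketch and not part of the original study: the function names `stress_pattern` and `is_regular` are hypothetical, and the regularity criterion (equal distances between successive stressed syllables) is an assumption made for illustration. Under that assumption, the criterion reproduces the regular/irregular labels of the seven patterns listed above.

```python
def stress_pattern(syllabified: str) -> str:
    """Convert a hyphen-syllabified sentence into an x/X stress string.

    Assumption: a syllable is stressed when it is written entirely in
    uppercase, as in the example sentences (e.g. "es-PE-ran").
    """
    syllables = syllabified.replace("-", " ").split()
    return "".join("X" if s.isupper() else "x" for s in syllables)


def is_regular(pattern: str) -> bool:
    """Return True when all inter-stress intervals have the same length.

    This is an illustrative criterion, not the authors' definition.
    """
    stressed = [i for i, c in enumerate(pattern) if c == "X"]
    gaps = {b - a for a, b in zip(stressed, stressed[1:])}
    return len(gaps) <= 1


# Two example sentences taken from structures (2) and (7).
for sentence in ("E-llos es-PE-ran al MÉ-di-co", "JUAN sa-LIÓ RÁ-pi-do SIEM-pre"):
    pattern = stress_pattern(sentence)
    print(sentence, pattern, "regular" if is_regular(pattern) else "irregular")
```

A check of this kind could be used to verify that every sentence in a stimulus group actually realizes the group's intended pattern.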
We tested four dyads of native Spanish speakers separately in a reading-repetition task
with different combinations of the rhythmic structures. In the task, one member of the dyad
read a sentence and the other immediately repeated it. Participants alternated
between reading and repeating the sentences of each group. The order of presentation of the
sentences within a group, and that of the groups themselves, were randomized. A rhythmic
distance score, proposed by Späth et al. [4], was then used to determine the degree of
convergence between the interlocutors’ rhythms.
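The trial structure described above (randomized groups, randomized sentences within each group, alternating reader and repeater roles) can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_schedule`, the speaker labels "A" and "B", and the strict trial-by-trial role alternation are hypothetical choices, since the abstract does not specify how the alternation was implemented. The rhythmic distance score of Späth et al. [4] is not reproduced here.

```python
import random


def build_schedule(groups, speakers=("A", "B"), seed=None):
    """Return a list of (group_id, sentence, reader, repeater) trials.

    groups: dict mapping a group id to its list of sentences.
    Assumption: roles swap on every trial within a group.
    """
    rng = random.Random(seed)  # seeded generator for a reproducible order

    group_ids = list(groups)
    rng.shuffle(group_ids)  # randomize the order of the groups

    schedule = []
    for gid in group_ids:
        sentences = list(groups[gid])
        rng.shuffle(sentences)  # randomize sentence order within the group
        for i, sentence in enumerate(sentences):
            reader = speakers[i % 2]
            repeater = speakers[(i + 1) % 2]
            schedule.append((gid, sentence, reader, repeater))
    return schedule


# Placeholder stimuli: two groups of 16 dummy sentence labels each.
demo_groups = {g: [f"group{g}_sentence{i}" for i in range(16)] for g in (1, 2)}
for trial in build_schedule(demo_groups, seed=0)[:4]:
    print(trial)
```

Passing a fixed `seed` makes a given dyad's presentation order reproducible, which can help when matching recordings to stimuli afterwards.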
Results indicate a greater amount of convergence for regular structures than for
irregular ones when feet nuclei are left-aligned. We observed an overall tendency for the
regular utterances to present more similar metrical timing patterns between interlocutors than
the irregular ones, rather than a gradual increase in the resemblance between regular
utterances’ rhythms over the course of the task. Details will be given on the response patterns
observed in the other conditions (right-aligned feet nuclei), and implications for current
models of phonetic convergence in speech will be discussed.


[1] Louwerse, M., Dale, R., Bard, E., & Jeuniaux, P. 2012. Behavior matching in multimodal
communication is synchronized. Cognitive Science, 36(8), 1404-1426.
[2] Gibbon, D. 2015. Speech rhythms – modeling the groove. In R. Vogel & R. van de Vijver
(Eds.), Rhythm in Cognition and Grammar: A Germanic Perspective (pp. 108-161).
Berlin: De Gruyter Mouton.
[3] Falk, S., Rathcke, T., & Dalla Bella, S. 2014. When speech sounds like music. Journal of
Experimental Psychology: Human Perception and Performance, 40(4), 1491-1506.
[4] Späth, M., Aichert, I., Ceballos, A., Wagner, E., Miller, N., & Ziegler, W. 2016.
Entraining with another person’s speech rhythm: Evidence from healthy speakers and
individuals with Parkinson’s disease. Clinical Linguistics and Phonetics, 30(1), 68-85.