Conference Paper

A Crosslinguistic Study on the Interplay of Fillers and Silences

Abstract

We present a crosslinguistic study on the interplay of hesitation silences and fillers in conversation. The research questions were addressed for English in a previous DiSS workshop paper (Betz & Kosmala, 2019), and this study extends the analysis to German, Italian and French. The research questions are: 1) Does the type of filler influence the duration of the following silence? 2) Does the duration of the filler correlate with silence duration? 3) Does silence duration vary depending on its distance from the filler? The analysis shows crosslinguistic similarities and differences, thus highlighting the role of disfluencies and their language- and culture-specific nature.
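As a rough illustration of how the three research questions could be operationalized, the sketch below groups silence durations by filler type, computes a Pearson correlation between filler and silence duration, and groups silence durations by distance from the filler. It is not the authors' method: the record layout, the signed `distance` encoding, the names `Observation`, `mean_silence_by`, `pearson_r`, and `TOY_DATA`, and all numeric values are hypothetical and invented purely for demonstration.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean


@dataclass
class Observation:
    """One filler-silence pair (hypothetical annotation layout)."""
    filler_type: str    # e.g. "uh" or "um"
    filler_dur: float   # filler duration in seconds
    silence_dur: float  # silence duration in seconds
    distance: int       # assumed encoding: negative = silence before filler


def mean_silence_by(observations, key):
    """Mean silence duration per group, where key(obs) names the group."""
    groups = {}
    for obs in observations:
        groups.setdefault(key(obs), []).append(obs.silence_dur)
    return {name: mean(durs) for name, durs in groups.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Invented toy values, NOT data from the study.
TOY_DATA = [
    Observation("uh", 0.20, 0.30, -1),
    Observation("uh", 0.30, 0.40, 1),
    Observation("um", 0.40, 0.60, 1),
    Observation("um", 0.50, 0.70, -1),
]

# Question 1: silence duration by filler type.
by_type = mean_silence_by(TOY_DATA, lambda o: o.filler_type)
# Question 2: correlation of filler duration with silence duration.
r = pearson_r([o.filler_dur for o in TOY_DATA],
              [o.silence_dur for o in TOY_DATA])
# Question 3: silence duration by distance from the filler.
by_distance = mean_silence_by(TOY_DATA, lambda o: o.distance)

print(by_type, round(r, 3), by_distance)
```

A real analysis would additionally need significance testing and per-language grouping, which this sketch leaves out.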

... For example, the FPs used by Japanese speakers [8] differ from those used by speakers of Hungarian [9], English [10], German [11], Urdu/Hindi [12], and European Portuguese [13]. The position and duration of SPs have also been shown to be variable [14,12]. Speaker-based variation in the frequency and selection of hesitation types has been reported by [15,16]. ...
... For example, speakers 12 and 13 used a high number of hesitations per minute compared with speakers 11 and 14. Individual variation in the frequency of hesitations has also been reported in other languages such as Italian [15], English [16], German [11], and French [14]. ...
... The current study presents evidence for speaker-based variation in the form and distribution of hesitations in Punjabi semi-spontaneous narratives. The variation in the position of hesitations shown in our data has also been reported for Italian [15], German, and Dutch [14]. Punjabi speakers' use of variable vowel quality in FPs is similar to that reported for European Portuguese [13]. ...
... This alludes to the variable use of hesitations in discourse as they may be placed at different positions in turn units and utterances. The occurrence of different types of hesitations and their combinations has also been found to vary between languages [11,16,17]. Recent analyses have shown cross-linguistic differences in the duration of silences preceding and following fillers in an utterance. ...
... Recent analyses have shown cross-linguistic differences in the duration of silences preceding and following fillers in an utterance. [18,17] found that there is an interplay between fillers and associated silences in English and German as silences following fillers are significantly longer compared with the preceding ones. In French, on the other hand, the silences preceding fillers have been found to be significantly longer [17]. ...
... However, this difference is not significant (t = 0.107, df = 50, p = 0.915). This observation is particularly interesting since it places Romanian on an intermediate position between Germanic and Romance languages in terms of the correlation between filler duration and silence duration (for recent cross-linguistic studies on this topic, see [24] for English data, and [25] for German, French and Italian data). These remarks, nonetheless, must be treated with caution, since they involve pauses and filler particles present only in IRs. ...
... Overall, in the analyzed speech, hesitation is mostly marked by lexicalized filled pauses, which means that the three tourist guides prefer to cover the time needed for speech planning producing fillers that consist of a proper lexical form (though semantically bleached and not adding anything to the propositional content of the utterance; Bazzanella, 2006;Crible, 2018) rather than other hesitation pauses such as vocalizations and silences that may be perceived as being more salient and disruptive (Betz, Bryhadyr, Kosmala & Schettino, 2021). ...
Chapter
This study concerns hesitation strategies that tourist guides may use to manage their speech, with particular attention to individual variability. Previous work has pointed out that hesitation phenomena may occur as a tool to structure discourse and gain visitors' attention, and that idiosyncratic linguistic behavior may affect their production. Given these findings, the proposed investigation delves deeper into the linguistic analysis of formal, phonetic, and functional aspects of hesitations occurring in a small corpus of Italian tourist guides' speech. It aims at describing the speaker-specific and common uses of hesitation phenomena and at determining whether different types of hesitations and their phonetic features correlate with different discourse functions. The results reveal a formal differentiation between hesitations involved in speech planning for lexical coding and those involved in the structuring of information.
... It has been attested in various studies that hesitations can occur as standalones or cluster with other hesitations (e.g. Betz et al. 2015, Schettino et al. 2021). In this study, the cases where silence duration is measured at positions -1 and 0 are clusters of directly neighboring silences and fillers. ...
Chapter
It is a widely known phenomenon that silence among people engaged in dialogue can be awkward. In linguistic terms, the awkward silence could be understood as a silence threshold: a period of time that has to pass silently in dialogue, after which any speaker will give in to the urge to contribute anything. Sometimes there will be nothing to contribute, in which case it is assumed that speakers produce linguistic hesitations or non-committing, low-content material. Silences are among the most frequent hesitations, along with fillers (uh, uhm). The interplay of silences and fillers can thus be very revealing for linguistic approaches to the awkward silence. In this study, I propose that fillers ground hesitations in dialogue and increase acceptance for subsequent silences. To test this hypothesis, I analyze spontaneous speech data from four languages with regard to co-occurrences of fillers and silences. The hypothesis is that silences are longer when occurring after fillers. The hypothesis is supported for German and English, but cannot be accepted for French and Italian, which suggests that both the concept of the awkward silence and general properties of hesitations are subject to cross-cultural and cross-linguistic differences that demand further attention.
Keywords: Hesitation, Filler, Silence, Awkward silence
Article
In this paper, (dis)fluencies will be examined during tandem interactions in French and English by exploring the notions of secondary didacticity and pedagogical intention outside the classroom environment. While (dis)fluencies have typically been viewed as disturbances and markers of production difficulty, or have only been analyzed from a strictly verbal or vocal point of view, this paper offers a fresh multimodal perspective on these processes by taking into account the visual-gestural features of spoken interactions, mainly manual gestures and eye gaze. Based on the qualitative analyses of two sequences, this paper will illustrate how native and non-native speakers co-construct meaning during the course of their talk by relying on several semiotic resources. Our detailed analyses allow for a richer and deeper understanding of (dis)fluencies as they show the way (dis)fluencies can be negotiated multimodally in context during jointly collaborative activities in tandem settings.
Conference Paper
The present study is part of a research project conducted on (dis)fluencies in a French oral corpus of spontaneous and prepared speech which takes into account the different modalities of discourse (linguistic, vocal, visual, and gestural). (Dis)fluencies, which are characterized by an interruption of the vocal and verbal channel, have often been strictly analyzed from a production perspective. Grounded in an interactional approach to language, the aim of the present study is to go beyond this formal approach and to take into account their functional ambivalence and their contribution to the interaction. The analysis is taken from one pair of the corpus, and the preliminary results indicate a higher rate of (dis)fluencies in prepared speech than in conversation. This finding will be used as a starting point for future quantitative analyses on the data. The point of this article is to focus on qualitative analyses of a video extract taken from a humorous sequence in order to stress their interactional dimension and the way speakers make use of (dis)fluencies for discursive and rhetorical purposes.
Conference Paper
In order to model hesitations for technical applications such as conversational speech synthesis, it is desirable to understand interactions between individual hesitation markers. In this study, we explore a pair of markers that has been subject to many discussions: silences and fillers. While it is generally acknowledged that fillers occur in two distinct forms, um and uh, it is not agreed on whether these forms systematically influence the form of associated silences. This notion will be investigated on a small dataset of English spontaneous speech data and the measure of distance between filler and silence will be introduced to the analyses. Results suggest that filler type influences associated silence duration systematically and that silences tend to gravitate towards fillers in utterances, exhibiting systematically lower duration when preceding them. These results provide valuable insights for improving existing hesitation models.
Conference Paper
Synthetic speech can be used to express uncertainty in dialogue systems by means of hesitation. If a phrase like "Next to the green tree" is uttered in a hesitant way, that is, containing lengthening, silences, and fillers, the listener can infer that the speaker is not certain about the concepts referred to. However, we do not know anything about the referential domain of the uncertainty; if only a particular word in this sentence would be uttered hesitantly, e.g. "the greee:n tree", the listener could infer that the uncertainty refers to the color in the statement, but not to the object. In this study, we show that the domain of the uncertainty is controllable. We conducted an experiment in which color words in sentences like "search for the green tree" were lengthened in two different positions: word onsets or final consonants, and participants were asked to rate the uncertainty regarding color and object. The results show that initial lengthening is predominantly associated with uncertainty about the word itself, whereas final lengthening is primarily associated with the following object. These findings enable dialogue system developers to finely control the attitudinal display of uncertainty, adding nuances beyond the lexical content to message delivery.
Conference Paper
We investigate segment prolongation as a means of disfluent hesitation in spontaneous German speech. We describe phonetic and structural features of disfluent prolongation and compare it to data of other languages and to non-disfluent prolongations.
Conference Paper
This study explores inter- and intra-speaker variation in use of time-management strategies. How do speakers differ in their use of pauses, fillers and other resources aimed at managing time while they plan their next contribution? Taking a rather qualitative approach, we describe individual speakers' production, using a small limited-domain corpus of task-oriented German speech. It is assumed that spoken dialogue systems can benefit from mimicking individual speaking styles. This study is a first step in that direction, aiming to describe in detail speaker-specific productions of selected time-buying disfluencies for later use in synthesis.
Article
This paper characterizes and exemplifies management in dialogue. Own communication management (choice and change directed) is distinguished from interactive communication management (sequences, turn management and feedback). An attempt is then made to motivate and explain the existence of various types of interactive management in dialogue. The suggested explanations involve a combination of general rational and ethical factors with more specific factors related to particular types of management.
1. Purpose
A point of departure for this paper is that a number of different phenomena in spoken dialogue (like self correction, hesitation, feedback, and turntaking) exist primarily in order to enable management of dialogue. The term management has been chosen instead of the related terms regulation and control because it is less authoritarian and machine-like and allows, but does not require, intentional control. The purpose of the paper is to briefly describe and initiate an explanation of some of these types of management.
2. Background
Article
This study reports on a number of highly significant differences found between English, German, and Dutch hesitation markers. English and German native speakers used significantly more vocalic-nasal hesitation markers than Dutch native speakers, who used predominantly vocalic hesitation markers. English hesitation markers occurred most frequently when preceded by silence and followed by a lexical item, or when surrounded by silence. German and Dutch hesitation markers occurred most frequently surrounded by lexical items. In Dutch, vocalic-nasal hesitation markers dominated only when surrounded by silence. Vocalic-nasal hesitation markers dominated in all positions in English and German, although in the former language this was more salient than in the latter. Nasal hesitation markers were used significantly more frequently in German than in English or Dutch. In addition to overall language trends, speaker-specific differences, especially within German and Dutch, were observed. These results raise questions in terms of the symptom versus signal hypotheses regarding the function of hesitation markers.
Article
This paper introduces the concept of speech management (SM), which refers to processes whereby a speaker manages his or her linguistic contributions to a communicative interaction, and which involves phenomena which have previously been studied under such rubrics as “planning”, “editing”, “(self-)repair”, etc. It is argued that SM phenomena exhibit considerable systematicity and regularity and must be considered part of the linguistic system. Furthermore, it is argued that SM phenomena must be related not only to such intraindividual factors as planning and memory, but also to interactional factors such as turntaking and feedback, and to informational content. Structural and functional taxonomies are presented together with a formal description of complex types of SM. The structural types are exemplified with data from a corpus of SM phenomena.
Article
Eye-tracking and gating experiments examined reference comprehension with fluent (Click on the red. . .) and disfluent (Click on [pause] thee uh red . . .) instructions while listeners viewed displays with 2 familiar (e.g., ice cream cones) and 2 unfamiliar objects (e.g., squiggly shapes). Disfluent instructions made unfamiliar objects more expected, which influenced listeners' on-line hypotheses from the onset of the color word. The unfamiliarity bias was sharply reduced by instructions that the speaker had object agnosia, and thus difficulty naming familiar objects (Experiment 2), but was not affected by intermittent sources of speaker distraction (beeps and construction noises; Experiment 3). The authors conclude that listeners can make situation-specific inferences about likely sources of disfluency, but there are some limitations to these attributions.
Article
The conventions used by conversation analysts for transcribing speech are . . . illustrated / presents a number of excerpts transcribed in conversation-analytic orthography, as well as a glossary of transcript symbols / attempts to uncover what she calls a 'standard maximum' of one second for silences occurring during conversation / she carries out a series of analyses on a large corpus of data, extracting all silences with a duration of at least nine-tenths of a second (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Unlike read or laboratory speech, spontaneous speech contains high rates of disfluencies (e.g. repetitions, repairs, filled pauses, false starts). This paper aims to promote ‘disfluency awareness’, especially in the field of phonetics, which has much to offer in the way of increasing our understanding of these phenomena. Two broad claims are made, based on analyses of disfluencies in different corpora of spontaneous American English speech. First, an Ecology Claim suggests that disfluencies are related to aspects of the speaking environments in which they arise. The claim is supported by evidence from task effects, location analyses, speaker effects and sociolinguistic effects. Second, an Acoustics Claim argues that disfluency has consequences for phonetic and prosodic aspects of speech that are not represented in the speech patterns of laboratory speech. Such effects include modifications in segment durations, intonation, voice quality, vowel quality and coarticulation patterns. The ecological and acoustic evidence provide insights about human language production in real-world contexts. Such evidence can also guide methods for the processing of spontaneous speech in automatic speech recognition applications.
Article
People responding to questions are sometimes uncertain, slow, or unable to answer. They handle these problems of self-presentation, we propose, by the way they respond. Twenty-five respondents were each asked 40 factual questions in a conversational setting. Later, they rated for each question their feeling that they would recognize the correct answer, then took a recognition test on all 40 questions. As found previously, the weaker their feeling of knowing, the slower their answers, the faster their nonanswers ("I don't know"), and the worse their recognition. But further, as proposed, the weaker their feeling of knowing, the more often they answered with rising intonation, used hedges such as "I guess," responded "I don't know" instead of "I can't remember," and added "uh" or "um," self-talk, and other face-saving comments. They reliably used "uh" to signal brief delays and "um" longer ones.
Article
This paper reports an exploratory investigation of hesitation phenomena in spontaneously spoken English. Following a brief review of the literature bearing on such phenomena, a quantitative study of filled and unfilled pauses, repeats, and false starts in the speech of some twelve participants in a conference is described. Analysis in terms of both individual differences and linguistic distribution is made, and some psycholinguistic implications are drawn, particularly as to the nature of encoding units and their relative uncertainty. A distinction between non-chance statistical dependencies and all-or-nothing dependencies in linguistic methodology is made.
Article
Speech disfluencies have different effects on comprehension depending on the type and placement of disfluency. Words following false starts (such as "windmill" after "in the" in the eleventh example, "is um in the a windmill") have longer word monitoring latencies than the same tokens with the false starts excised. The decremental effect seems to be limited to false starts that occur in the middle of sentences or after discourse markers. I suggest it is at these points that the repair process is most burdened by the false start. In contrast, words following repetitions ("heart" in "of a of a heart") do not have longer word monitoring latencies than the same tokens with the repetitions excised. In two experiments, words following spontaneously produced repetitions have faster word monitoring latencies. Two other experiments suggest that this seeming repetition advantage is more likely the result of slowed monitoring after a phonological phrase disruption. Inserting repetitions where they did not occur in a manner that preserved the original phonological phrases resulted in neither an advantage nor a disadvantage of repeating. These studies provide a first glimpse at how speech disfluencies affect understanding, and also provide information about the types of comprehension models that can accommodate the effects of speech disfluencies.
Article
Listeners often encounter disfluencies (like uhs and repairs) in spontaneous speech. How is comprehension affected? In four experiments, listeners followed fluent and disfluent instructions to select an object on a graphical display. Disfluent instructions included mid-word interruptions (Move to the yel- purple square), mid-word interruptions with fillers (Move to the yel- uh, purple square), and between-word interruptions (Move to the yellow- purple square). Relative to the target color word, listeners selected the target object more quickly, and no less accurately, after hearing mid-word interruptions with fillers than after hearing comparable fluent utterances as well as utterances that replaced disfluencies with pauses of equal length. Hearing less misleading information before the interruption site led listeners to make fewer errors, and fillers allowed for more time after the interruption for listeners to cancel misleading information. The information available in disfluencies can help listeners compensate for disruptions and delays in spontaneous utterances.
Article
For people to contribute to discourse, they must do more than utter the right sentence at the right time. The basic requirement is that they add to their common ground in an orderly way. To do this, we argue, they try to establish for each utterance the mutual belief that the addressees have understood what the speaker meant well enough for current purposes. This is accomplished by the collective actions of the current contributor and his or her partners, and these result in units of conversation called contributions. We present a model of contributions and show how it accounts for a variety of features of everyday conversations.
Article
Most research on the rapid mental processes of on-line language processing has been limited to the study of idealized, fluent utterances. Yet speakers are often disfluent, for example, saying "thee, uh, candle" instead of "the candle." By monitoring listeners' eye movements to objects in a display, we demonstrated that the fluency of an article ("thee uh" vs. "the") affects how listeners interpret the following noun. With a fluent article, listeners were biased toward an object that had been mentioned previously, but with a disfluent article, they were biased toward an object that had not been mentioned. These biases were apparent as early as lexical information became available, showing that disfluency affects the basic processes of decoding linguistic input.
Belz, M., & Trouvain, J. (2019). Are 'silent' pauses always silent? In 19th International Congress of Phonetic Sciences (ICPhS).
Cataldo, V. (2019). Phonetic and functional features of pauses, and concurrent gestures, in tourist guides' speech. In c. XV Convegno Nazionale AISV Gli archivi sonori al crocevia tra scienze fonetiche, informatica umanistica e patrimonio digitale (Vol. 6, pp. 205-231).
Chafe, W. (1980). Some reasons for hesitating. Temporal variables in speech: Studies in Honour of Frieda Goldman-Eisler, 169-180.
Clark, H. H., & Fox Tree, J. E. (2002). Using uh and um in spontaneous speaking. Cognition, 84(1), 73-111. https://doi.org/10.1016/S0010-0277(02)00017-3
Di Napoli, J. (2020). Filled pauses and prolongations in Roman Italian task-oriented dialogue. In Laughter and Other Non-Verbal Vocalisations Workshop: Proceedings (2020).
Lickley, R. J. (2015). Fluency and disfluency. The handbook of speech production, 445.
Morrison, L. (2018). The subtle power of uncomfortable silences. BBC online article. https://www.bbc.com/worklife/article/20170718-the-subtle-power-of-uncomfortable-silences (accessed 07/09/2021).
Savy, R., & Cutugno, F. (2009). Diatopic, diamesic and diaphasic variations in spoken Italian. In Proceedings of the 5th Corpus Linguistics Conference: CL2009 (pp. 20-23).
Schettino, L., Betz, S., & Wagner, P. (2021, submitted). Hesitation marker distribution in Italian discourse.