Conference Paper

An analysis of multimodal cues of interruption in dyadic spoken interactions

Authors: Chi-Chun Lee, Sungbok Lee, Shrikanth S. Narayanan

Abstract and Figures

Interruptions are integral elements of natural, spontaneous human interaction. Competitive and cooperative interruptions each serve a distinct role in the flow of conversation. This paper analyzes their differences using two features, change and activeness, computed from audio, visual, and disfluency data. These features capture differences between the two types of interruption better than the average feature values of any single modality. Discriminant analysis further shows that multimodal cues provide a 21% improvement in classification accuracy between the two types relative to the baseline, whereas no single-modality cue provides a significant improvement.
An Analysis of Multimodal Cues of Interruption in Dyadic Spoken Interactions
Chi-Chun Lee, Sungbok Lee, Shrikanth S. Narayanan


  !""#! 
{chiclee,sungbokl,shri}@usc.edu
Abstract
             
  $  %      
    &$
'  (   &  change
activeness,     $
'              &   
&       
$ &
    )*+  
&&
   &  
$
1. Introduction
,               
    $ '-.
/*0 &          
.. &
          
.    . $ 1   
          &    
          -  
-$
2 ,   &  3  
             /)0$  4  .
                  
($ &
                 
$
 &&
  -.          3  
   
              
$ 5 /60   
      &  7  
$
 
            &  &
.$  4        
        &      &
.& 
   &$  '        
                 
          
          
 $ 2,8/90 (
,   &
$ :& , 
    
   $ 2     
       
                
.$8;/<0( 
   &&
       
 .   $ 4
           
$
;        
     &  
              
     /= >0$  ; &    

$'?4@/#0&
            
-     $  : (   
  (      &      
    $ 2 

&
  $            
    &    &    
          
$
'  (&  
) , 
  6   &. 
9$
2. Research Methodology
4           &    &    
7    $  :    
 A: /!0
               B 
           &    &    
    C$ ' 
   D D       
$ :  
           & 
        &    /*"0      3  
  &    &    &  
$
2.1. Database and Annotation
: ?4@  $&
      
,$ '  & 
 -
              
      $  :         
        $  '    
 $     =*
.)   <6    6  
&  D  
x , y , z

 . $ '  .  &         
       &          
    $  '          
D  &      $  '    &
   $ 
 6$*$6 &  
  $            
      &        
 $ ' &    
            D$  4
  &  ,            
$ ' E'/**0&
       &
  $ 
&?4@**66
*=&D.$
      &
.  &  & . 
     F    
      .    $  BC
/*)0$' 
(  /*60$, 
&
Competitive Interruption
?F)GH G$
2FH&$$$
Cooperative Interruption
?F&& & H&$
2FH $$$
& &
  $  %.      ,    
$'*
    $  '          
     
             
&.$
'*FSummary of Interruptions
  '
? 6!*= *>> <=)6
2 ==6) 6**# !><"
' *"<9# 9#)< *<6>6
2.2. Feature Extraction
'   , F
 F   F  
 $% 
      &        
   &             
 $ :      
$ ;& &    
  .G  
.$ ' ,
$
   &
 change activeness, 
    &          &  
$ Change        
&    
  &         $ Activeness 
               
&$
2.2.1. Speech-Intensity
1& *" &&&
  6"      @    /*90$  '       

?,D-&
?,&
?D-&
?&
change D-&
activeness &
changeD-&
\[ I_w = I_{f_{ew}} - I_{f_{sw}} \qquad (1) \]
&
few, fsw

&$activeness
)
\[ I_{ov} = \frac{\sum_{i=f_{sov}+1}^{f_{eov}} \left| I_i - I_{i-1} \right|}{T_{eov} - T_{sov}} \qquad (2) \]
&
feov , fsov

   
Teov , Tsov
  
$
2.2.2. Hand Motions
;              .
?4@$'

1activeness D-&
1change D-&
activeness D-&
change D-&
    6 .     
.
.$'&activeness
\[ V_k = \frac{\sum_{i=f_{sw}+1}^{f_{ew}} \sqrt{(x_i - x_{i-1})^2 + (y_i - y_{i-1})^2 + (z_i - z_{i-1})^2}}{T_e - T_s} \qquad (3) \]
&
xi, yi, z i
          
.   
  &$'
& 
&  
$
;&              
            -
 &. &
Vk

Vk
 $ '     &
                 &    
$Activeness,
rw

\[ r_w = \frac{V_1}{\langle V \rangle}, \quad \text{where} \quad \langle V \rangle = \frac{\sum_{k=1}^{tot_w} V_k}{tot_w} \qquad (4) \]
V1
          &      
 6 
totw
 
&$
Change        &     
  &
&$ ' 3 &   
    $
'   
$  ; &  .-   &  . K  9  
 
x , y , z

  .    
 2*$
2  *       D
x , y , z
  
x , z
,      &$          G
$'
2*Fxample of clustering of right hand
motions into 4 regions
,   
          
$  &  &
   &  
     
  $  ' change       
  F  *             
  &    &    "   &$  '
  &   
$
2.2.3. Disfluencies
:. *
    " &  
$:
&
    &            
 $ '       & 
(/*<0
false start:   .        

repetition: .
filled pause: BC BC BC BC
2.2.4. Feature Normalization
         (  &      
        D      &
.-$ 2 

F
   .  &  
Fref
    

\[ F_{ref} = \frac{\sum_{sbj} \sum_{neu} F}{num_{sbj} \cdot num_{neu}} \qquad (5) \]
&
sbj
neu
    D  .     
$  
numsbj
numneu
     
D        D$
'(
Fnorm

\[ F_{norm} = \frac{F}{C_{sbj}}, \quad \text{where} \quad C_{sbj} = \frac{\sum_{neu} F_{sbj}}{num_{neu}} \cdot \frac{1}{F_{ref}} \qquad (6) \]
&
Csbj
(D$'
( 
&($
3. Results and Discussion
'  ,            &  &  
3
Does each feature listed in Section 2 behave differently
for the two types of interruptions?
Can we obtain a better discriminating power by
incorporating multimodal cues?
; two samplet-test, two proportions test,
fisher's exact test-      (,    
3$
3.1. Hypothesis Testing
4         
3 &
              $
   .  & 
 -    . 
$:,
           $  '  )   
                 
)$
3.1.1. Speech-Intensity
'
        D-  &       
  &  &          
$  :           
 $ 
                 ,
$ '  &            
(    $ '   change 
activeness     &    
&.'&
       

$
3.1.2. Hand Motions
' ' )  
              $ '
         
              
 $ E  change    fisher's exact
test &two sample
proportions test$ :  &D   .
 L  
.&$ % &   
&&&$
3.1.3. Disfluencies
'p-value two proportions test$')
 &   
      $
:( 
     &  
  .   
&
.3.$
3.2. Discriminant Analysis
& 
@& 
Intensity-only Features
Hand Motions-only Features
Combination of Both Modalities
' 6         
$ '
             &  
      6$*$  2  '  *     
       & .
Table 2: Summary Results of Interrupting Utterances
M ;? 
?, ?
 
 
E
Word Overlap Word Overlap Left Right Left Right
 =>$>< >"$# ="$9# =*$9! !*$!) **$<> )$#= )$<< *< *< 69
 =<$9) =#$66 <!$6# ="$)* >*$6! #$"" *$=< )$*< * 6 >
p-value 0.05 0.013 "$*# "$"! 0.007 0.02 0.02 "$)< 0.006 "$"> 0.016
               & 
     $       
              
            
 $ ' %,G ? 
"$"<   
$'   
 &$
3.2.1. Intensity-only
'               ,   
 change  activeness. ':.G
    "$!"9         G  
"$">"$' 
  &     '  6  #!$=+      
            )#+  
        $  :
                  
                  
     
($
3.2.2. Hand Motions-only
'         G activenesschange
    G change.     G  changep-
valueN"$"< &
 $' :.G  "$#<! 
 G "$"*9$2 '
6   
$ 
           
,
$
3.2.3. Combination
'  :.G        "$>#)        
"$*"$2'6  
$
;&                
       
        *9+  )*+     
    >!$< +$  '          
          
   (  
$-&&  
  F,
  Gchange. % D & &
   >*$)+ &
$
4. Conclusions and Future Work
4              
         
 $  '      &
         &  

   
  $  :           
     $$   &   & 
   $ 
             
$' &
 
    
&          
$4 
  6    &            
$ ' (
   &   
$
 &&.&
  ?4@              
$'&&
   & &   
          &.$      &  
      
$;&         
/*=0              &    
        ,     
&  $  2            
&. 
        &    .G  
$  '            
             &  
$& &
               
                
$'
&.$
5. Acknowledgment
'    &             E2  
$
6. References
/*0 $ B1'..'
 C Journal of Personality and Social
Psychology,)6F)#6-!) *!>)
/)0 $  % @$5$  5    $  E  O1-
       -
 OICASSP ; ;   )"">  $
) $=#<-=##$
/60 P$  5$ B        F 
         &-  -
C Journal of Pragmatics *9 ##6-!"6 *!!"
/90 $  8 BM( .  F   
    C   Proc. Second SIGdial
Workshop on Discourse and Dialog $*= *-*" )""*
/<0 2$8$@$;$ B1
C Proc. NAACL HLT 2007 1E8 
)""> *>-)9
/=0 $ ?E Hand and Minds: What Gestures Reveal about
Thoughts,$@   *!!)
/>0 $; B$;?
C AISB ; Q$ )""<$
/#0 $% ?$% $$ $Q(( $?& $Q
P$E$ $ $$E O?4@F
   O P
1 )"">$
/!0 $;  A  $ : B,  1   
 C Language and Sex: Difference and
Dominance,%' E; 1&
?FE&; *!><
/*"0 5 4.   1   -  B?
F      ,  ?    
C Social Psychology Quarterly, $ =<  $* 
6#-<<
/**0 :$?  2 Syllabification Software, '  .  E
@5  E  
' P*!!>$FLL&&&$$LLL
/*)0 ?$  Q  BR          
C Proc. of the Eurpspeech,$*6=>R*6>" )""*
/*60 ;$A$ 8-8 1$8 $ 8$? 8$A 
B       F    
  'SC Journal of Intercultural
Communication Research,$69 $9 )66-)<9 $)""<
/*90 @$%$:. B@  
 C @ 
 E ' 1 *6) *!!= 
FLL&&&$$
/*<0 @$$;P$2$$B 
 .F?.T .
C Computational Linguistics )<9F<)<R<>* *!!!
/*=0 $% A$  ?$5 $ E  $E 
O1,F
 O '     
@ $*< $6 $*"><-*"#= ?)"">$
Table 3: Summary of Classification Result
                                          Overall
Baseline            100.0%      0.0%       65.7%
Intensity-only       89.6%     28.0%       68.5%
Hand Motions-only    54.2%     88.0%       65.8%
Combination          89.6%     60.0%       79.5%
... Yang [57] reported that competitive interruptions have higher pitch and intensity levels, while collaborative interruptions have a relatively lower pitch level. Lee and Narayanan [40] proposed a multimodal analysis method to classify the interruption type. They observed that the absence of hand motions signal the occurrence of cooperative interruptions with high probability. ...
... We did not find significant differences for the other features. We could not replicate all the results from previous studies [40,48,26]. Such dissimilarities may come from the scenarios of the corpora used in the different studies. ...
Chapter
Full-text available
During an interaction, interactants exchange speaking turns. Exchanges can be done smoothly or through interruptions. Listeners can display backchannels, send signals to grab the speaking turn, wait for the speaker to yield the turn, or even interrupt and grab the speaking turn. Interruptions are very frequent in natural interactions. To create believable and engaging interaction between human interactants and embodied conversational agent ECA, it is important to endow virtual agent with the capability to manage interruptions, that is to have the ability to interrupt, but also to react to an interruption. As a first step, we focus on the later one where the agent is able to perceive and interpret the user’s multimodal behaviors as either an attempt or not to take the turn. To this aim, we annotate, analyse and characterize interruptions in human-human conversations. In this paper, we describe our annotation schema that embeds different types of interruptions. We then provide an analysis of multimodal features, focusing of prosodic features (F0 and loudness) and body (head and hand) activity, to characterize interruptions.
... This paper concerns the pragmatic implications of the alternation between task-oriented and non-task oriented discourse among coworkers. Research has demonstrated how boundaries of spoken exchanges can reveal important information on the mechanics and power dynamics of interpersonal relationships (see for example, Angouri & Marra 2010, 2011Bolden 2006;Laver 1975Laver , 1981Lee, Lee, and Narayanan 2008;Lindström 1994). Different strategies and linguistic markers are used to introduce a new topic, exhausting the topic at hand or enacting engagement on the ongoing discussion on a certain topic. ...
Article
Full-text available
The focus of this article revolves around discourse markers (DMs) that are used when switching between work talk and small talk in workplace interactions. Research in this field has showed how discourse markers are used to manage several interpersonal dynamics in interaction. This study is aimed at identifying which DMs are used in the workplace to operate a shift of topic, how often DMs are used at the juncture of interaction, and what are their specific pragmatic and discursive function when they are used in these situations. This study is based on a workplace small-talk corpus of spoken American English. Results show that DMs are often used to mark the shift to a different topic or mode of discourse; in particular, shifts to work talk are marked more often than shifts to more small talk on different topics. Also, speakers may select different DMs based on the type of shift. The role and function of the highest-ranking discourse markers were observed, as well as pragmatic implications and impact in the daily interactions among co-workers.
... To automatically distinguish different types of interruptions, Lee et al. [6] analysed the differences in speech intensity, hand motion, and disfluency between cooperative and competitive interruptions. Yang et al. [13] also mentioned acoustic and prosodic differences. ...
... Oertel et al. [8] used prosodic features (from the overlapper) and body movement features (from both overlapper and overlappee) to investigate the context surrounding overlaps. Multimodal cues such as speech intensity, hand motions, and disfluencies were used in Lee et al. [9] to classify overlaps as either competitive or cooperative. Rather than classifying overlaps, Lee and Narayanan [10] aim to predict interruptions. ...
... Goldberg [8] and others, on the other hand, claim that interruptions may indeed be competitive, but they may also be neutral (e.g., requests for clarification) or used even to convey rapport with the interlocutor; these are often termed collaborative interruptions, in which a speaker helps their interlocutor, e.g. by completing their utterance. Collaborative interruptions are described as indicators of coordination and alignment in dialogue [9], and their production presents prosodic and gestural differences from competitive interruptions [10]. Cross-cultural studies show differences both in the frequency of interruptions and in the sociocultural value attached to them [11,12]. ...
... It has been proposed that gestures and gaze are relevant resources for overlap management in face-to-face discourse. Lee et al. (2008) found that hand movements helped to discriminate between turn-competitive and non-competitive overlaps in a corpus of acted dialogues. In a study of French natural conversations, Mondada and Oloff (2011) showed that continuing vs. abandoning gesturing during overlap is associated with how problematic participants take the overlap to be. ...
Article
Objectives: Training software to facilitate participation in conversations where overlapping talk is common was to be developed with the involvement of Cochlear implant (CI) users. Methods: Examples of common types of overlap were extracted from a recorded corpus of 3.5 hours of British English conversation. In eight meetings, an expert panel of five CI users tried out ideas for a computer-based training programme addressing difficulties in turn-taking. Results: Based on feedback from the panel, a training programme was devised. The first module consists of introductory videos. The three remaining modules, implemented in interactive software, focus on non-overlapped turn-taking, competitive overlaps and accidental overlaps. Discussion: The development process is considered in light of feedback from panel members and from an end of project dissemination event. Benefits, limitations and challenges of the present approach to user involvement and to the design of self-administered communication training programmes are discussed. Conclusion: The project was characterized by two innovative features: the involvement of service users not only at its outset and conclusion but throughout its course; and the exclusive use of naturally occurring conversational speech in the training programme. While both present practical challenges, the project has demonstrated the potential for ecologically valid speech rehabilitation training.
Chapter
Contemporary technical devices obey the paradigm of naturalistic multimodal interaction and user-centric individualisation. Users expect devices to interact intelligently, to anticipate their needs, and to adapt to their behaviour. To do so, companion-like solutions have to take into account the affective and dispositional state of the user, and therefore to be trained and modified using interaction data and corpora. We argue that, in this context, big data alone is not purposeful, since important effects are obscured, and since high-quality annotation is too costly. We encourage the collection and use of enriched data. We report on recent trends in this field, presenting methodologies for collecting data with rich disposition variety and predictable classifications based on a careful design and standardised psychological assessments. Besides socio-demographic information and personality traits, we also use speech events to improve user state models. Furthermore, we present possibilities to increase the amount of enriched data in cross-corpus or intra-corpus way based on recent learning approaches. Finally, we highlight particular recent neural recognition approaches feasible for smaller datasets, and covering temporal aspects.
Article
Although they are a frequently occurring phenomenon in spoken communication, speech overlaps have not yet received the attention they deserve in research on either Human-Human Interaction (HHI) or Human-Computer Interaction (HCI). It is common knowledge that overlaps can figure as a competitive, rude interruption as well as a cooperative, convenient feedback signal giving important insight on the course of the interaction, but how are they related to the internal state of the overlapping speaker or the overlapped speaker? In this paper, we investigate dyadic human-human interactions and focus on the relations between the emotional changes occurring around overlaps in both interaction participants. In addition to an in-depth statistical analysis of the changes in control and valence levels with respect to the nature of the overlap, we also present a classification approach based on features derived from such emotional changes surrounding an overlap. We show that the automatic classification of competitive and cooperative overlaps using the changes in valence and control levels of the overlapping speaker outperforms common approaches employing acoustic and linguistic features.
Article
In this paper we focus on a long-standing debate surrounding the measurement of interruptions in conversational behavior. This debate has implications for conversational analysts interested in turn-taking structures, researchers interested in close relationships who interpret them as an exercise of power, and group processes researchers studying status-organizing structures. We explore two different measurements of interruptions: (1) a syntactic measurement that operationalizes an interruption as simultaneous talk initiated more than two syllables from the end of a current speaker's sentence, and (2) a more contextual measurement that takes into account situational factors such as the current speaker's intentions and the content of what both speakers say when judging whether a speech act is an interruption. We coded transcripts from 86 task group discussions using West and Zimmerman's (1983) syntactic criteria and Murray's (1985) context-sensitive method for identifying interruptions. Factor analyses found a one-factor solution, an indication that both measurements capture the same underlying construct. Confirmatory factor analyses identified more subtle variations, however, suggesting that gender and subcultural differences affect how coders construe interruptions.
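The syntactic criterion described in the abstract above can be illustrated with a minimal sketch. It assumes syllable onset times for the current speaker's sentence are already available from some alignment step, which the abstract does not describe; the function name, its arguments, and the rule that a syllable counts as "remaining" when its onset falls at or after the start of the overlap are all hypothetical choices made for illustration, not the coding procedure used in the study.

```python
def is_syntactic_interruption(syllable_onsets, sentence_end, overlap_start, threshold=2):
    """Flag simultaneous talk that begins more than `threshold` syllables
    before the end of the current speaker's sentence.

    syllable_onsets: onset times (seconds) of each syllable in the sentence
                     (hypothetical input, assumed to come from an aligner).
    sentence_end:    time (seconds) at which the sentence ends.
    overlap_start:   time (seconds) at which the second speaker starts talking.
    """
    # If the second speaker starts after the sentence ends, there is no overlap.
    if overlap_start >= sentence_end:
        return False
    # Count syllables the current speaker still has to begin producing.
    remaining = sum(1 for onset in syllable_onsets if onset >= overlap_start)
    return remaining > threshold
```

A contextual measurement of the kind the abstract contrasts with this one would need human judgment about speaker intent and content, so it is not sketched here.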
Conference Paper
Intelligent environments equipped with audio-visual sensors provide suitable means for automatically monitoring and tracking the behavior, strategies and engagement of the participants in multiperson meetings. In this paper, high-level features are calculated from active speaker segmentations, automatically annotated by our smart room system, to infer the interaction dynamics between the participants. These features include the number and the average duration of each turn, statistics of turn-taking such as time as active speaker, and turn-taking transition patterns between participants. The results show that it is possible to accurately estimate in real-time not only the flow of the interaction, but also how dominant and engaged each participant was during the discussion. These high-level features, which cannot be inferred from any of the individual modalities by themselves, can be useful for summarization, classification, retrieval and (after action) analysis of meetings.
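The turn-level features listed in the abstract above (number of turns, average turn duration, time as active speaker, and transition patterns) can be sketched from a list of active-speaker segments. This is only an illustration: the input format, the function name `turn_features`, and the choice to merge consecutive segments by the same speaker into one turn are assumptions, not details taken from the paper.

```python
from collections import Counter, defaultdict


def turn_features(segments):
    """Compute simple turn-taking statistics.

    segments: list of (speaker, start_sec, end_sec) tuples sorted by start
              time (hypothetical format for an active-speaker segmentation).
    Returns (per_speaker, transitions).
    """
    # Merge consecutive segments by the same speaker into a single turn.
    turns = []
    for speaker, start, end in segments:
        if turns and turns[-1][0] == speaker:
            turns[-1] = (speaker, turns[-1][1], end)
        else:
            turns.append((speaker, start, end))

    # Turn durations span from the first to the last merged segment.
    durations = defaultdict(list)
    for speaker, start, end in turns:
        durations[speaker].append(end - start)

    # Time as active speaker is summed over the raw segments, so pauses
    # inside a merged turn are not counted as speech.
    speaking_time = defaultdict(float)
    for speaker, start, end in segments:
        speaking_time[speaker] += end - start

    per_speaker = {
        s: {
            "num_turns": len(d),
            "mean_turn_duration": sum(d) / len(d),
            "speaking_time": speaking_time[s],
        }
        for s, d in durations.items()
    }

    # Count speaker-to-speaker transitions between consecutive turns.
    transitions = Counter((a[0], b[0]) for a, b in zip(turns, turns[1:]))
    return per_speaker, transitions
```

The transition counts could be normalized per speaker to obtain transition probabilities, which is one plausible reading of "transition patterns".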
Conference Paper
Anvil is a tool for the annotation of audiovisual material containing multimodal dialogue. Annotation takes place on freely definable, multiple layers (tracks) by inserting time-anchored elements that hold a number of typed attribute-value pairs. Higher-level elements (suprasegmental) consist of a sequence of elements. Attributes contain symbols or cross-level links to arbitrary other elements. Anvil is highly generic (usable with different annotation schemes), platform-independent, XML-based and fitted with an intuitive graphical user interface. For project integration, Anvil offers the import of speech transcription and export of text and table data for further statistical processing.
Article
Since emotions are expressed through a combination of verbal and non-verbal channels, a joint analysis of speech and gestures is required to understand expressive human communication. To facilitate such investigations, this paper describes a new corpus named the “interactive emotional dyadic motion capture database” (IEMOCAP), collected by the Speech Analysis and Interpretation Laboratory (SAIL) at the University of Southern California (USC). This database was recorded from ten actors in dyadic sessions with markers on the face, head, and hands, which provide detailed information about their facial expressions and hand movements during scripted and spontaneous spoken communication scenarios. The actors performed selected emotional scripts and also improvised hypothetical scenarios designed to elicit specific types of emotions (happiness, anger, sadness, frustration and neutral state). The corpus contains approximately 12 h of data. The detailed motion capture information, the interactive setting to elicit authentic emotions, and the size of the database make this corpus a valuable addition to the existing databases in the community for the study and modeling of multimodal and expressive human communication.
Article
Studied the turn-taking mechanism, whereby participants manage the smooth and appropriate exchange of speaking turns in face-to-face interaction, in 2 videotapes showing a therapist-patient interview and a discussion between 2 therapists. 3 basic signals were noted: (a) turn-yielding signals by the speaker, (b) attempt-suppressing signals by the speaker, and (c) back-channel signals by the auditor. These signals were used and responded to in a relatively structured manner, describable in terms of a set of rules. Results indicate that behaviors in every communication modality examined (content, syntax, intonation, paralanguage, and body motion) were active as elements of the turn-taking signals.
Article
In this paper we show that interruptions are important elements in the interactive character of discourse and in the resolution of issues of cognitive uncertainty and planning. By representing discourse graphically, we also show that interruptions are part of the local and global coherence that is brought about through the systematic phrase-to-phrase prosodic patterns of discourse. The specific pitch height of the interruption varies with the expression of emotion, signals of attention-getting, and signals of competitiveness. These prosodic forms are potentially usable in spoken dialogue systems to provide intelligent responding systems that are responsive to human motivations in dialogues.
Article
Analysts interested in the social significance of conversational behavior have traditionally treated interruptions as reliable, objective indicators of the interlocutors' power, control or dominance. The relational significance interruptions have for the participants themselves, however, was rarely considered. Recently, researchers have become increasingly aware that interruptions are not and need not be synonymous with power. This paper attempts to differentiate between power and non-power interruptions. It provides a means for assessing the ‘meaning’ of each interruption as a display of relational power or rapport, or as a non-relational display of ‘neutrality’.
Conference Paper
In this paper, we report on an empirical study on initiative conflicts in human-human conversation. We examined these conflicts in two corpora of task-oriented dialogues. The results show that conversants try to avoid initiative conflicts, but when these conflicts occur, they are efficiently resolved by linguistic devices, such as volume.