Article

Degrees of freedom of facial movements in face-to-face conversational speech

DOI:HAL:http://hal.archives-ouvertes.fr/hal-00195551/en/
Source: OAI

ABSTRACT In this paper we analyze the degrees of freedom (DoF) of facial movements in face-to-face conversation. We propose a method for automatically selecting, from a large fine-grained motion capture corpus, the expressive frames that best complement an initial shape model built using neutral speech. Using conversational data from one speaker, we extract 11 DoF that reconstruct facial deformations with an average precision of less than a millimeter. Gestural scores are then built that gather movements and discursive labels. This modeling framework offers a productive analysis of conversational speech that seeks in the multimodal signals the rendering of given communicative functions and linguistic events.
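The frame-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' published procedure: it assumes the initial shape model is a PCA basis fitted on neutral-speech frames, that "best complement" means "largest reconstruction residual", and that selection is greedy. All function names and the flattened frame layout (one row of marker coordinates per frame) are hypothetical.

```python
import numpy as np

def fit_shape_model(frames, n_components):
    """Fit a linear shape model (mean + orthonormal basis) by PCA via SVD."""
    mean = frames.mean(axis=0)
    # Rows of vt are orthonormal directions of decreasing variance.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(frames, mean, basis):
    """Per-frame RMS residual after projecting onto the model subspace."""
    centered = frames - mean
    residual = centered - centered @ basis.T @ basis
    return np.sqrt((residual ** 2).mean(axis=1))

def select_expressive_frames(neutral, candidates, n_components, n_select):
    """Greedily pick the candidate frames the current model reconstructs worst.

    Assumption: each picked frame adds its normalised residual to the basis,
    so the next pick favours deformations that are still unexplained.
    """
    mean, basis = fit_shape_model(neutral, n_components)
    selected = []
    for _ in range(n_select):
        errors = reconstruction_error(candidates, mean, basis)
        if selected:
            errors[selected] = -np.inf  # never pick the same frame twice
        worst = int(np.argmax(errors))
        selected.append(worst)
        centered = candidates[worst] - mean
        residual = centered - basis.T @ (basis @ centered)
        norm = np.linalg.norm(residual)
        if norm > 1e-12:
            basis = np.vstack([basis, residual / norm])
    return selected, mean, basis
```

With marker coordinates expressed in millimeters, `reconstruction_error` gives a per-frame figure in millimeters, which is the kind of quantity the sub-millimeter claim refers to, although the exact error measure used in the paper is not stated here.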

  • Article: Three-dimensional linear articulatory modeling of tongue, lips and face, based on MRI and video images
    ABSTRACT: In this study, previous articulatory midsagittal models of tongue and lips are extended to full three-dimensional models. The geometry of these vocal organs is measured on one subject uttering a corpus of sustained articulations in French. The 3D data are obtained from magnetic resonance imaging of the tongue, and from front and profile video images of the subject's face marked with small beads. The degrees of freedom of the articulators, i.e., the uncorrelated linear components needed to represent the 3D coordinates of these articulators, are extracted by linear component analysis from these data. In addition to a common jaw height parameter, the tongue is controlled by four parameters while the lips and face are also driven by four parameters. These parameters are for the most part extracted from the midsagittal contours, and are clearly interpretable in phonetic/biomechanical terms. This implies that most 3D features such as tongue groove or lateral channels can be controlled by articulatory parameters defined for the midsagittal model. Similarly, the 3D geometry of the lips is determined by parameters such as lip protrusion or aperture, that can be measured from a profile view of the face.
    Journal of Phonetics.
  • Article: Audiovisual Speech Synthesis
    ABSTRACT: This paper presents the main approaches used to synthesize talking faces, and provides greater detail on a handful of these approaches. An attempt is made to distinguish between facial synthesis itself (i.e. the manner in which facial movements are rendered on a computer screen), and the way these movements may be controlled and predicted using phonetic input. The two main synthesis techniques (model-based vs. image-based) are contrasted and presented by a brief description of the most illustrative existing systems. The challenging issues—evaluation, data acquisition and modeling—that may drive future models are also discussed and illustrated by our current work at ICP.
    International Journal of Speech Technology 01/2003; 6(4):331-346.
  • Article: The Power of a Nod and a Glance: Envelope Vs. Emotional Feedback in Animated Conversational Agents.
    Applied Artificial Intelligence. 01/1999; 13:519-538.

Full-text (2 Sources), available from 27 Sep 2012

Keywords

discursive labels
DoF
expressive frames
Gestural scores
initial shape model
large fine-grained motion capture corpus
linguistic events
neutral speech
productive analysis