Conference Proceeding

Context dependent viseme models for voice driven animation

Comput. Sci. & Eng., Northwestern Polytech. Univ., Xi'an, China
08/2003; DOI: 10.1109/VIPMC.2003.1220537; ISBN: 953-184-054-7; pp. 649-654, vol. 2. In proceedings of: Video/Image Processing and Multimedia Communications, 2003. 4th EURASIP Conference focused on, Volume: 2
Source: IEEE Xplore

ABSTRACT This paper addresses the problem of animating a talking figure, such as an avatar, using speech input only. The system that was developed is based on hidden Markov models for the acoustic observation vectors of the speech sounds that correspond to each of 16 visually distinct mouth shapes (visemes). The acoustic variability with context was taken into account by building acoustic viseme models that are dependent on the left and right viseme contexts. Our experimental results show that it is indeed possible to obtain visually relevant speech segmentation data directly from the purely acoustic speech signal.
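
As an illustration of what "dependent on the left and right viseme contexts" can mean in practice, the following minimal sketch expands a plain viseme sequence into context-dependent model labels. It is not taken from the paper: the function name `to_context_dependent`, the `left-center+right` label notation (borrowed from the common triphone convention in speech recognition), the `sil` boundary symbol, and the viseme names `v1`, `v2`, ... are all assumptions made for the example.

```python
from typing import List


def to_context_dependent(visemes: List[str], boundary: str = "sil") -> List[str]:
    """Expand a viseme sequence into left-center+right model labels.

    Hypothetical helper: the label format and the boundary symbol are
    assumptions, not the notation used by the paper's authors.
    """
    labels = []
    for i, center in enumerate(visemes):
        # Use the boundary symbol where there is no neighbouring viseme.
        left = visemes[i - 1] if i > 0 else boundary
        right = visemes[i + 1] if i + 1 < len(visemes) else boundary
        labels.append(f"{left}-{center}+{right}")
    return labels


# Example with placeholder viseme names (the abstract mentions 16 classes).
sequence = ["v3", "v7", "v3", "v12"]
print(to_context_dependent(sequence))
```

Each label would then name one acoustic model, so the same viseme can be represented by different models depending on its neighbours. With 16 viseme classes, the number of distinct left/center/right combinations is at most 16^3 = 4096, which is why systems of this kind usually train only the contexts that actually occur in the data or share parameters between similar ones; whether and how the paper does this is not stated in the abstract.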
