EUROGRAPHICS 2022 / R. Chaine and M. H. Kim
(Guest Editors)
Volume 41 (2022), Number 2
Compact Facial Landmark Layouts for Performance Capture
E. Zell1,2 and R. McDonnell1
1Trinity College Dublin
2University of Bonn
[Figure 1: Facial Rig → Compact Facial Landmarks → Capturing (marker-based data annotation); landmark counts M = 28, 32, 36 (female) and M = 24, 29, 33 (male)]
Figure 1: In contrast to previous work, we suggest deriving facial landmarks from a low-dimensional facial rig by analyzing its degrees of freedom. Our method (red) is based purely on the existing animation model and does not require large character databases or person-specific 4D sequences. Different compact layouts are computed by our method for two of Epic's MetaHuman characters, with ε = 0.3, 0.5 and 0.7 for the female and ε = 0.5, 0.7 and 0.8 for the male character.
Abstract
An abundance of work, both older and recent, exists at the intersection of computer vision and computer graphics on the accurate estimation of dynamic facial landmarks, with applications in facial animation, emotion recognition, and beyond. However, only a few publications optimize the actual layout of facial landmarks to ensure an optimal trade-off between compact layouts and detailed capturing. At the same time, we observe that applications like social games prefer simplicity and performance over detail to reduce the computational budget, especially on mobile devices. Other common attributes of such applications are predefined low-dimensional models to animate and a large, diverse user base. In contrast to existing methods that focus on creating person-specific facial landmarks, we suggest deriving application-specific facial landmarks. We formulate our optimization method on the widely adopted blendshape model. First, a score is defined that is suitable for computing a characteristic landmark for each blendshape. In a subsequent step, we optimize a global function, which mimics the merging of similar landmarks into one. The optimization is solved in less than a second using integer linear programming and guarantees a globally optimal solution to an NP-hard problem. Our application-specific approach is faster than, and fundamentally different from, previous actor-specific methods. The resulting layouts are more similar to empirical layouts. Compared to empirical landmarks, our layouts require only a fraction of the landmarks to achieve the same numerical error when reconstructing the animation from landmarks. The method is compared against previous work and tested on various blendshape models, representing a wide spectrum of applications.
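As a rough illustration of the two stages sketched in the abstract, the toy Python code below picks, for each blendshape, the vertex with the largest displacement from the neutral mesh as its characteristic landmark, and then merges landmarks closer than a threshold eps. The max-displacement score, the greedy merge, and the 2D point data are simplified assumptions for illustration only; the paper's actual method uses a different score and solves the merge globally via integer linear programming.

```python
from math import dist

def characteristic_landmarks(neutral, blendshapes):
    """For each blendshape, pick the vertex that moves the most
    relative to the neutral mesh (a simplified stand-in score)."""
    landmarks = []
    for shape in blendshapes:
        idx = max(range(len(neutral)),
                  key=lambda i: dist(neutral[i], shape[i]))
        landmarks.append(neutral[idx])
    return landmarks

def merge_landmarks(landmarks, eps):
    """Greedily keep a landmark only if no kept landmark lies within
    eps (a stand-in for the globally optimal ILP merge)."""
    merged = []
    for p in landmarks:
        if all(dist(p, q) >= eps for q in merged):
            merged.append(p)
    return merged

# Hypothetical data: a 4-vertex neutral mesh and three blendshapes,
# each displacing one vertex.
neutral = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
shapes = [
    [(0.0, 0.2), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],   # moves vertex 0
    [(0.0, 0.0), (1.0, 0.3), (0.0, 1.0), (1.0, 1.0)],   # moves vertex 1
    [(0.05, 0.1), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],  # vertex 0 again
]
lms = characteristic_landmarks(neutral, shapes)   # one landmark per shape
compact = merge_landmarks(lms, eps=0.5)           # duplicates collapse
```

Here two blendshapes share vertex 0 as their characteristic landmark, so the merge step collapses the three per-blendshape landmarks into two, mirroring how the paper trades layout compactness (larger eps) against capture detail.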
1. Introduction
Over the last two decades, facial animation capture evolved from a research topic relevant only for high-end VFX applications to a widely accessible technology, and is nowadays even integrated in smartphones. Current applications span from highly detailed captures of digital doubles to simple emoji animation, and from highly actor-specific solutions to a nearly unlimited user base. Besides capturing technology, best practices evolved for character creation pipelines, paving the way for parametric character configurators like Epic's MetaHuman, Daz3D Genesis or Polywink. The originally linear workflow, starting with motion capture and moving afterwards to character creation and animation retargeting, became more and more non-linear due to the convenient access and compelling prices of pre-built characters. But if the character to animate exists before the actual capturing, is it possible to limit the captured data and minimize data and processing time? We investigate the question of how to distinguish between relevant and non-relevant infor-
© 2022 The Author(s)
Computer Graphics Forum © 2022 The Eurographics Association and John
Wiley & Sons Ltd. Published by John Wiley & Sons Ltd.
DOI: 10.1111/cgf.14463