HEFES: a Hybrid Engine for Facial Expressions Synthesis to control human-like androids and avatars
Daniele Mazzei, Nicole Lazzeri, David Hanson and Danilo De Rossi
Abstract— Nowadays, advances in robotics and computer science have made possible the development of sociable and attractive robots. A challenging objective in humanoid robotics is to make robots able to interact with people in a believable way. Recent studies have demonstrated that human-like robots which closely resemble human beings do not necessarily generate the sense of unease typically associated with them. For this reason, the design of aesthetically appealing and socially attractive robots becomes necessary for realistic human-robot interaction.
In this paper we describe HEFES (Hybrid Engine for Facial Expressions Synthesis), an engine for generating and controlling facial expressions both on physical androids and on 3D avatars. HEFES is part of the software library that controls a humanoid robot called FACE (Facial Automaton for Conveying Emotions). HEFES was designed to allow users to create facial expressions without requiring artistic or animatronics skills, and it is able to animate both FACE and its 3D replica.
The system was tested in human-robot interaction studies aimed at helping children with autism interpret their interlocutors' mood through the understanding of facial expressions.
In recent years, more and more social robots have been developed thanks to rapid advances in hardware performance, computer graphics, robotics technology and Artificial Intelligence (AI).
There are various examples of social robots, but it is possible to roughly classify them by appearance into two main categories: human-like and non-human-like. Human-like social robots are usually associated with the pernicious myth that robots should not look or act like human beings, in order to avoid the so-called 'Uncanny Valley' [1]. MacDorman and Ishiguro [2] explored observers' reactions to gradual morphing between pictures of robots and humans and found a peak in judgments of eeriness in the transition between pictures of robots and of human-like robots, in accordance with the Uncanny Valley hypothesis. Hanson [3] repeated this experiment morphing more attractive pictures and found that the peak of eeriness was much smoother, approaching a flat line, in the transition between pictures of human-like robots and of human beings. This indicates that the typical reactions attributed to the Uncanny Valley were present only in the transition between classic robots and cosmetically atypical human-like robots. Although several studies demonstrate the presence of the Uncanny Valley effect, it is possible to design and create human-like robots that are not uncanny by using innovative technologies that integrate movie and cinema animation with make-up techniques [4].

Daniele Mazzei, Nicole Lazzeri and Danilo De Rossi are with the Interdepartmental Research Center 'E. Piaggio', Faculty of Engineering, University of Pisa, Via Diotisalvi 2, 56126 Pisa, Italy.
David Hanson is with Hanson Robotics, Plano, TX, USA.
Enhancing the believability of human-like robots is not a purely aesthetic challenge. In order to create machines that look and act as humans do, it is necessary to improve the robot's social and expressive capabilities in addition to its appearance. Facial expressiveness is therefore one of the most important aspects to be analyzed in designing human-like robots, since it is the major emotional communication channel used in interpersonal relationships, together with facial and head micro-movements [5].
Since the early 1970s, facial synthesis and animation have raised great interest among computer graphics researchers, and numerous methods for modeling and animating human faces have been developed to reach increasingly realistic results.
One of the first models for the synthesis of faces was
developed by Parke [6], [7]. The Parke parametric model is
based on two groups of parameters: conformation parameters
which are related to the physical facial features, such as the
shape of the mouth, nose, eyes, etc., and expression parameters, which are related to facial actions such as wrinkling the forehead in anger or opening the eyes wide in surprise.
In contrast, physically based models directly manipulate the geometry of the face to approximate the real deformations caused by the muscles, including skin layers and bones. Waters [8], using vectors and radial functions, developed a parameterized model based on facial muscle dynamics and skin elasticity.
Another approach used for creating facial expressions is
based on interpolation methods. Interpolation-based engines
use a mathematical function to specify smooth transitions
between two or more basic facial positions in a defined time
interval [9]. One, two or three-dimensional interpolations
can be performed to create an optimized and realistic facial
morphing. Although interpolations are fast methods, they are
limited in the number of realistic facial configurations they
can generate.
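As a sketch of how an interpolation-based engine works, the snippet below (illustrative Python, not taken from the paper's implementation; all names and vector sizes are invented) blends two basic facial configurations over a normalized time interval, using linear interpolation with an ease-in/ease-out weighting:

```python
def smoothstep(t):
    """Ease-in/ease-out weighting for 0 <= t <= 1."""
    return t * t * (3.0 - 2.0 * t)

def interpolate_faces(start, end, t):
    """Blend two facial configurations (lists of normalized
    actuator/vertex positions) at normalized time t in [0, 1]."""
    w = smoothstep(t)
    return [(1.0 - w) * s + w * e for s, e in zip(start, end)]

# Hypothetical 4-parameter configurations: neutral face and a smile.
neutral = [0.0, 0.0, 0.0, 0.0]
smile = [0.8, 0.2, 0.0, 0.6]

# Halfway through the transition the weighting is exactly 0.5.
mid = interpolate_faces(neutral, smile, 0.5)
```

Sampling `t` over a fixed interval yields the smooth transition the text describes; the limitation is that every intermediate face lies on the straight path between the chosen basic configurations.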
All the geometrically-based methods described above can make it difficult to achieve realistic facial animations, since they require artistic skills. With an interpolation approach, on the other hand, animation skills are required only for creating a set of basic facial configurations, since an interpolation space can then be used to generate a wide set of new facial configurations starting from the basic ones.
In this work a facial animation engine called HEFES was implemented as the fusion of a muscle-based facial animator and an intuitive interpolation system. The facial animation system is based on the Facial Action Coding System (FACS), in order to make it compatible with both physical robots and 3D avatars and usable in different facial animation scenarios. The FACS is the most popular standard for describing facial behaviors in terms of muscular movements. It is based on a detailed study of the facial muscles carried out by Ekman and Friesen in 1976 [10] and is aimed at classifying facial muscular activity in terms of Action Units (AUs). AUs are defined as visually discernible components of facial movement which are generated through one or more underlying muscles, and they can be used to describe all the possible movements that a human face can express. An expression is therefore a combination of several AUs, each with its own intensity measured on 5 discrete levels (A: trace, B: slight, C: marked/pronounced, D: severe/extreme, E: maximum).
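As an illustration of this encoding (hypothetical Python, not the paper's code; the numeric mapping of the five levels is an assumption), an expression can be represented as a map from AUs to discrete FACS intensity levels, normalized into [0, 1]. AU6 (cheek raiser) combined with AU12 (lip corner puller) is the classic FACS coding of happiness:

```python
# Assumed mapping of the five FACS intensity levels to normalized values.
INTENSITY = {"A": 0.2, "B": 0.4, "C": 0.6, "D": 0.8, "E": 1.0}

def expression_to_aus(spec):
    """Convert e.g. {'AU6': 'C', 'AU12': 'D'} into normalized
    AU activations in [0, 1]."""
    return {au: INTENSITY[level] for au, level in spec.items()}

# AU6 + AU12 is the FACS combination for happiness.
happiness = expression_to_aus({"AU6": "C", "AU12": "D"})
```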
FACE is a robotic face used as an emotion-conveying system (Fig. 1). The artificial skull is covered by a porous elastomeric material called Frubber that requires less force to be stretched by servo motors than other solid materials [11]. FACE has 32 servo-actuated degrees of freedom which are mapped onto the major facial muscles, allowing FACE to simulate facial expressions.
Fig. 1. FACE and the motor actuation system
FACE's servo motors are positioned following the AU disposition defined by the FACS (Fig. 2), and its facial expressions consist of combinations of AU positions. Thanks to the fast response of the servo motors and the mechanical properties of the skin, FACE can generate realistic human expressions, engaging people in social interactions.
HEFES is the subsystem of the FACE control library devoted to the synthesis and animation of facial expressions, and it includes a set of tools for controlling FACE and its 3D avatar. HEFES comprises four modules: synthesis, morphing, animation and display.
Fig. 2. Mapping between servo motor positions and the Action Units of the FACS
The synthesis module is designed to allow users to manually create basic facial expressions, which are normalized and converted according to the FACS standard.
The morphing module takes the normalized FACS-based
expressions as input and generates an emotional interpolation
space where expressions can be selected. The animation
module merges concurrent requests from various control
subsystems and creates a unique motion request resolving
possible conflicts. Finally, the display module receives the
facial motion request and converts it into movements according to the selected output display.
1) The synthesis module allows users to generate new facial expressions through the control of the selected emotional display, i.e., the FACE robot or the 3D avatar. Both provide a graphical user interface (GUI) with as many slider controls as there are servo motors (FACE robot) or anchor points (3D avatar) in the corresponding emotional display.
In the Robot editor, each slider defines a normalized range between 0 and 1 for moving the corresponding servo motor, which is associated with an AU of the FACS. Using the Robot editor, the six basic expressions, i.e., happiness, sadness, anger, surprise, fear and disgust, defined as 'universally accepted' by Paul Ekman [12], [13], were manually created. According to the 'Circumplex Model of Affect' theory [14], [15], each generated expression was saved as an XML file including the set of AU values, the expression name and the corresponding pleasure and arousal coordinates. In the Circumplex Model of Affect, expressions are associated with pleasure, which indicates pleasant/unpleasant feelings, and with arousal, which is related to physiological activation.
The 3D virtual editor is a similar tool used to deform a facial mesh. It is based on a user interface in which a set of slider controls is used to actuate the various facial muscles. Expressions are directly rendered on the 3D avatar display and saved as XML files, as in the Robot editor.
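A minimal sketch of such an XML file is shown below (illustrative Python; the tag and attribute names are assumptions, not the paper's actual schema, which is not specified):

```python
import xml.etree.ElementTree as ET

def expression_to_xml(name, pleasure, arousal, au_values):
    """Serialize an expression: AU values plus the expression name
    and its pleasure/arousal coordinates. Schema names are assumed."""
    root = ET.Element("expression", name=name,
                      pleasure=str(pleasure), arousal=str(arousal))
    for au, value in sorted(au_values.items()):
        ET.SubElement(root, "au", id=str(au), value=str(value))
    return ET.tostring(root, encoding="unicode")

# Hypothetical expression with two active AUs.
xml = expression_to_xml("happiness", 0.9, 0.4, {1: 0.1, 12: 0.8})
```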
2) The morphing module generates, on the basis of Posner's theory, an emotional interpolation space called the Emotional Cartesian Space (ECS) [16]. In the ECS, the x coordinate represents valence and the y coordinate represents arousal. Each expression e(v, a) is consequently associated with a point in the valence-arousal plane, where the neutral expression e(0, 0) is placed at the origin (Fig. 3, Morphing module).
Fig. 3. The architecture of the facial animation system based on four main modules: synthesis, morphing, animation and display.
The morphing module takes the set of basic expressions as input and generates the ECS by applying a shape-preserving piecewise cubic interpolation algorithm implemented in Matlab. The output of the algorithm is a three-dimensional matrix composed of 32 planes corresponding to the 32 AUs. As shown in Fig. 4, each plane represents the space of the possible positions of a single AU, where each point is identified by two coordinates, valence and arousal. The coordinates of each plane range between -1 and 1 with a step of 0.1; the generated ECS therefore produces 21×21 new normalized FACS-based expressions that can be performed independently by the robot or the 3D avatar. Since the ECS is not a static space, each new expression manually created through the synthesis module can be used to refine the ECS by including it in the set of expressions used by the interpolation algorithm. The possibility of updating the ECS with additional expressions allows users to continuously adjust the ECS, covering zones in which the interpolation algorithm could require a more detailed description of the AUs (II-B.1).
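The construction of one ECS plane can be sketched as follows (illustrative Python; the paper uses Matlab's shape-preserving piecewise cubic scheme, and the inverse-distance weighting below is only a stdlib stand-in for it, with invented sample values):

```python
def idw(samples, v, a, power=2.0):
    """Inverse-distance-weighted estimate of one AU at (v, a).
    `samples` maps (valence, arousal) points to AU values."""
    num, den = 0.0, 0.0
    for (sv, sa), value in samples.items():
        d2 = (v - sv) ** 2 + (a - sa) ** 2
        if d2 == 0.0:
            return value  # exactly on a basic expression
        w = 1.0 / d2 ** (power / 2.0)
        num += w * value
        den += w
    return num / den

def build_ecs_plane(samples):
    """Return the 21x21 plane of one AU over the valence-arousal
    grid [-1, 1] x [-1, 1] with step 0.1, as in the ECS."""
    grid = [round(-1.0 + 0.1 * i, 1) for i in range(21)]
    return [[idw(samples, v, a) for v in grid] for a in grid]

# Hypothetical samples: neutral at the origin plus two basic expressions.
samples = {(0.0, 0.0): 0.0, (0.9, 0.4): 0.8, (-0.8, -0.5): 0.3}
plane = build_ecs_plane(samples)
```

The full ECS would stack 32 such planes, one per AU; adding a manually created expression to `samples` and rebuilding the planes is the refinement step described above.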
3) The animation module is designed to combine and merge multiple requests coming from the various modules that can run in parallel in the robot/avatar control library. The facial behavior of the robot or avatar is inherently concurrent, since parallel requests may involve the same AU, generating conflicts. The animation module is therefore responsible for mixing movements, such as eye blinking or head turning, with expression requests. For example, eye blinking conflicts with the expression of amazement, since amazed people normally react by opening their eyes wide.
The animation module receives as input a motion request, defined by a single AU or a combination of multiple AUs, with an associated priority. The animation engine is implemented as a Heap, a specialized tree-based data structure used to define a shared timer that is responsible for orchestrating the animation.
Fig. 4. The emotional Cartesian plane for the right eyebrow (motor #24, corresponding to AU 1 in Fig. 2).
The elements of the Heap, called Tasks, are ordered by their due time, so the root of the Heap contains the first task to be executed. The Heap can hold two types of task, Motion Tasks and Interpolator Tasks, which are handled in different ways. Both types of task are defined by their expiry time, the duration of the motion and the number of steps into which the task is divided. A Motion Task also includes 32 AUs, each with an associated value and a priority. When a movement request is generated, a Motion Task is sent to the animation engine and inserted into the Heap, which is then reordered according to due time. The animation engine continuously checks whether any tasks in the Heap have expired. Each expired task is removed from the Heap and executed. If the task is a Motion Task, the animation engine calculates the amount of movement to be performed at the current step, stores the result for the corresponding AU and reschedules the task into the Heap if the task is not yet completed. If the task is an Interpolator Task, the animation engine calculates the new animation state by blending all the previously calculated steps for each AU according to their priority. Finally, the Interpolator Task is automatically rescheduled into the Heap with an expiry time of 40 ms.
The output of the animation module is a motion task composed of 32 normalized AU values, which is sent to the emotional display module.
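The heap-based scheduling described above can be sketched with the stdlib `heapq` module (illustrative Python; task fields and the simplified single-value tasks are assumptions, not the paper's actual API):

```python
import heapq
import itertools

counter = itertools.count()  # tie-breaker for tasks with equal due times

def schedule(heap, due_time, task):
    """Insert a task keyed by its due time; the heap root is always
    the next task to execute."""
    heapq.heappush(heap, (due_time, next(counter), task))

def run_until(heap, now):
    """Execute every task whose due time has expired, rescheduling
    multi-step tasks the way a Motion Task is rescheduled."""
    executed = []
    while heap and heap[0][0] <= now:
        due, _, task = heapq.heappop(heap)
        executed.append(task["name"])
        task["steps_left"] -= 1
        if task["steps_left"] > 0:
            schedule(heap, due + task["period"], task)
    return executed

heap = []
blink = {"name": "blink", "steps_left": 2, "period": 40}
smile = {"name": "smile", "steps_left": 1, "period": 40}
schedule(heap, 0, blink)
schedule(heap, 40, smile)

first = run_until(heap, 0)    # only the first blink step is due
second = run_until(heap, 40)  # the smile plus the rescheduled blink step
```

A real Interpolator Task would, at each 40 ms expiry, blend the per-AU step results by priority instead of simply counting down steps.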
4) The display module represents the output of the system. We implemented two dedicated emotional displays: the FACE android and the 3D avatar. According to a calibration table, the FACE android display converts normalized AU values into servo motor positions, expressed as duty cycles in the range 500-2500. Each motor has a different range of movement due to its position inside FACE. For this reason, the display module includes a control layer that avoids exceeding the servo motor limits, according to the minimum and maximum values stored in the calibration table.
The 3D avatar display is a facial animation system based on a physical model, described in [17], that approximates the anatomy of the skin and the muscles. The model is based on a non-linear spring system which can simulate the dynamics of human face movements, with the muscles modeled as meshes of force-deformed springs. Each skin point of the mesh is connected to its neighbors by non-linear springs. The human face includes a wide range of muscle types, e.g., rectangular, triangular, sheet, linear and sphincter. Since servo motors act as linear forces, the muscle type satisfying this condition is the linear muscle, which is specified by two points: the attachment point, which is normally fixed, and the insertion point, which defines the area where the facial muscle performs its action. Facial muscle contractions pull the skin surface from the area of the muscle insertion point toward the area of the muscle attachment point. When a facial muscle contracts, the facial skin points in the influence area of the muscle change their position according to their distance from the muscle attachment point and the elastic properties of the mass-spring system. Facial skin points not directly influenced by the muscle contraction are left in an unbalanced state that is stabilized through the propagation of the remaining unbalanced elastic forces.
The elastic model of the skin and the mathematical implementation of the muscles have already been developed, while the manual mapping of the 3D mesh anchor points to the AUs is still under development.
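One relaxation step of such a spring mesh can be sketched, heavily simplified, as follows (illustrative Python reducing the mesh to a 1-D chain of skin points with linearized springs; all constants are invented, and the real model in [17] uses non-linear springs on a 2-D mesh):

```python
def relax_step(points, rest_len, stiffness, muscle_pull):
    """One explicit update of skin point positions on a 1-D chain.
    The muscle pulls the insertion-point end; unbalanced elastic
    forces then propagate along the chain on later steps."""
    forces = [0.0] * len(points)
    forces[-1] += muscle_pull  # muscle acts at the insertion end
    for i in range(len(points) - 1):
        stretch = (points[i + 1] - points[i]) - rest_len
        f = stiffness * stretch  # linearized spring force
        forces[i] += f           # pulled toward the stretched neighbor
        forces[i + 1] -= f
    # The attachment point (point 0) is fixed and never moves.
    return [points[0]] + [p + 0.05 * f for p, f in zip(points[1:], forces[1:])]

chain = [0.0, 1.0, 2.0, 3.0]  # rest state, rest length 1
moved = relax_step(chain, 1.0, 10.0, 2.0)
```

On the first step only the insertion end moves; iterating `relax_step` propagates the imbalance back along the chain, which is the stabilization process the paragraph describes.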
Facial animation software packages are generally tools that require a certain level of knowledge of design, animation and anatomy. Users often simply need to animate facial expressions without having these specific skills. The system was therefore designed to be usable both by experts in facial design and animation, who can directly create or modify expressions, and by users who are interested in quickly designing HRI experimental protocols by selecting a set of pre-configured expressions.
The ECS Timeline is a tool of the HEFES system intended to meet the needs of these different users. The timeline is a graphical user interface (GUI) with two use modalities: 'Auto Mode' and 'Advanced Mode'. In Auto Mode, users can create sequences of expressions by selecting the corresponding points in the ECS and dragging them into the timeline. Sequences can be saved, played and edited using the timeline control. When a sequence is reproduced, motion requests are sent to the animation module, which resolves conflicts and forwards them to the robot or the avatar display. The ECS Timeline GUI includes a chart that visualizes the motor positions during an animation, for a deeper understanding of the facial expression animation process (Fig. 5). In Advanced Mode, a sequence of expressions can be displayed as editable configurations of all AU values in a multitrack graph, where each AU is expressed as a motion track and can be manually edited. In Advanced Mode it is possible to use ECS expressions as starting points for creating more sophisticated animations in which single AUs can be adjusted in real time.
Fig. 5. The ECS Animation tool in the Auto Mode configuration.
HEFES was used as an emotion-conveying system within the IDIA (Inquiry into Disruption of Intersubjective equipment in Autism spectrum disorders in childhood) project, in collaboration with the IRCCS Stella Maris (Calambrone, Italy) [16], [18].
In particular, the ECS Animation tool was used by the psychologist in Auto Mode to easily design the therapeutic protocol, creating facial animation paths without requiring direct motor configuration and calibration of the FACE android. The tool does not require skills in facial animation or human anatomy, and it allowed the therapist to intuitively create therapeutic scenarios by adding expressions to the timeline, dragging them from the ECS. Moreover, the Advanced Mode configuration was used to create specific patterns of movement, such as turning the head. Head movements were oriented to watch a little robot used by the therapist to test children's shared-attention capabilities.
Fig. 6. The morphing module used for creating new 'mixed' expressions (right side) by selecting (V,A) points (red dots) from the ECS. The module takes as input a set of basic expressions (left side) with their (V,A) values (blue dots).
A recent study demonstrated that people with Autism Spectrum Disorders (ASDs) do not perceive robots as machines but as 'artificial partners' [19]. On the basis of this finding, the IDIA project aimed to study alternative ASD treatment protocols involving robots, avatars and other advanced technologies. One of the purposes of the protocol was to verify the capability of the FACE android to convey emotions to children with ASD. Figure 6 shows examples of expressions generated by the morphing module. It takes the six basic expressions as input (the expressions on the left side of the figure, corresponding to the blue dots in the ECS) and generates 'half-way' expressions (right side of the figure, corresponding to the red dots in the ECS) when the user clicks on the ECS. All the generated expressions are identified by their corresponding pleasure and arousal coordinates.
The FACE-based protocol was tested on a panel of normally developing children and children with Autism Spectrum Disorders (ASDs), aged 6-12 years. The test was conducted on a panel of 5 children with ASDs and 15 normally developing children, each interacting with the robot individually under therapist supervision. The protocol was divided into phases, one of which concerned evaluating the accuracy of emotion recognition and imitation skills. In this phase children were asked to recognize, label and then imitate a set of facial expressions performed by the robot and subsequently by the psychologist. The sequence of expressions included happiness, anger, sadness, disgust, fear and surprise. Moreover, the protocol included a phase called 'free play', in which the ECS tool was directly used by the psychologist to control the FACE android in real time.
The subjects' answers in labeling an expression were scored as correct or wrong by a therapist and used to calculate the percentage of correctly recognized expressions. As shown in Fig. 7, both children with ASDs and normally developing children were able to label happiness, anger and sadness, performed by FACE and by the psychologist, without errors. In contrast, fear, disgust and surprise performed by FACE and by the psychologist were not labeled correctly, especially by subjects with ASDs. Fear, disgust and surprise are emotions that convey empathy not only through stereotypical facial expressions but also through body movements and vocalizations. The affective content of these emotions is consequently dramatically reduced when expressed only through facial expressions.
Fig. 7. Results of the labeling phase for ASD and control subjects observing
FACE and psychologist expressions.
In conclusion, HEFES allows operators and psychologists to easily model and generate expressions following the current standards of facial animation. The morphing module provides a continuous emotional space from which it is possible to select a wide range of expressions, most of them difficult to generate manually. The possibility of continuously adding new expressions to the ECS interpolator allows users to refine the expression generation system, reaching a high expressiveness level without requiring animation or artistic skills.
Through HEFES it is possible to control the robot or the avatar, creating affect-based human-robot interaction scenarios in which different emotions can be conveyed. Facial expressions performed by FACE and by the psychologist were labeled by children with ASDs and normally developing children with the same score. This analysis demonstrates that the system is able to correctly generate human-like facial expressions.
HEFES was designed to be used both with a physical robot and with a 3D avatar. The current state of the 3D editor includes the algorithm for animating the facial mesh according to the model described in Sec. II and the definition of some anchor points. In the future, all the AUs will be mapped onto the 3D avatar mesh for complete control of the avatar. HEFES will be used to study how human beings perceive facial expressions and emotions expressed by a physical robot in comparison with its 3D avatar, in order to understand whether physical appearance has an empathic component in conveying emotions.
Moreover, the synthesis module will include the control of facial micro-movements and head dynamics that are associated with human moods. For example, blinking frequency and head speed are considered to be indicators of discomfort. These micro-movements will be designed and controlled using an approach similar to the one designed for facial expressions. A set of basic head and facial micro-movements will be generated and associated with corresponding behaviors according to their pleasure and arousal coordinates. The set of basic behaviors will be used as input to the morphing module, which will generate a Behavioral Cartesian Space (BCS). Future experiments on emotion labeling and recognition will be conducted including the facial micro-movement generator and a face-tracking algorithm, in order to investigate the contribution of these affect-related activities to FACE's emotion-conveying capabilities.
[1] M. Mori, "Bukimi no tani (the uncanny valley)," Energy, 1970.
[2] K. F. MacDorman and H. Ishiguro, "The uncanny advantage of using androids in cognitive and social science research," Interaction Studies, vol. 7, no. 3, pp. 297–337, 2006.
[3] D. Hanson, "Exploring the aesthetic range for humanoid robots," in Proceedings of the ICCS/CogSci 2006 Symposium: Toward Social Mechanisms of Android Science. Citeseer, 2006, p. 1620.
[4] H. Ishiguro, "Android science: toward a new cross-interdisciplinary framework," Development, vol. 28, pp. 118–127, 2007.
[5] P. Ekman, "Facial expression and emotion," American Psychologist, pp. 384–392, 1993.
[6] F. I. Parke, "Computer generated animation of faces," in ACM '72: Proceedings of the ACM Annual Conference. New York, NY, USA: ACM, 1972, pp. 451–457.
[7] F. I. Parke, "A parametric model for human faces," Ph.D. dissertation, The University of Utah, 1974.
[8] K. Waters, "A muscle model for animating three-dimensional facial expression," SIGGRAPH Computer Graphics, vol. 21, pp. 17–24, August 1987.
[9] F. Pighin, J. Hecker, D. Lischinski, R. Szeliski, and D. H. Salesin, "Synthesizing realistic facial expressions from photographs," in Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, ser. SIGGRAPH '98. New York, NY, USA: ACM, 1998, pp. 75–84.
[10] P. Ekman and W. V. Friesen, "Measuring facial movement," Journal of Nonverbal Behavior, vol. 1, no. 1, pp. 56–75, Sep. 1976.
[11] D. Hanson, "Expanding the design domain of humanoid robots," in Proceedings of the ICCS/CogSci Conference, Special Session on Android Science, 2006.
[12] P. Ekman, "Are there basic emotions?" Psychological Review, vol. 99, no. 3, pp. 550–553, Jul. 1992.
[13] P. Ekman, "Basic emotions," in Handbook of Cognition and Emotion. New York: John Wiley & Sons Ltd, 1999, ch. 3, pp. 45–60.
[14] J. A. Russell, "A circumplex model of affect," Journal of Personality and Social Psychology, vol. 39, pp. 1161–1178, 1980.
[15] J. Posner, J. A. Russell, and B. S. Peterson, "The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology," Development and Psychopathology, vol. 17, no. 3, pp. 715–734, 2005.
[16] D. Mazzei, L. Billeci, A. Armato, N. Lazzeri, A. Cisternino, G. Pioggia, R. Igliozzi, F. Muratori, A. Ahluwalia, and D. De Rossi, "The FACE of autism," in RO-MAN 2010: The 19th IEEE International Symposium on Robot and Human Interactive Communication, 2010.
[17] Y. Zhang, E. C. Prakash, and E. Sung, "Real-time physically-based facial expression animation using mass-spring system," in Computer Graphics International 2001, ser. CGI '01. Washington, DC, USA: IEEE Computer Society, 2001, pp. 347–350.
[18] D. Mazzei, N. Lazzeri, L. Billeci, R. Igliozzi, A. Mancini, A. Ahluwalia, F. Muratori, and D. De Rossi, "Development and evaluation of a social robot platform for therapy in autism," in EMBC 2011: The 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2011, pp. 4515–4518.
[19] J. Scholtz, "Theory and evaluation of human robot interactions," in Proc. 36th Annual Hawaii International Conference on System Sciences, 2003.
... System architecture for emotional robots. In literature [61][62][63], such architecture involves two levels of processing,i.e., low level, and high level. The tasks of the low level involve scene analysis and emotion display; in contrast, the high level is responsible for processing an internal emotional state and defining the most appropriate response ...
... Act, the last module, shows the expressive result of an internal process; it processes the behavioral output and regulates the possible conflict in the elaboration of the expression. The core section of this module is HEFES (Hybrid Engine for Facial Expressions Synthesis) [63]. ...
... To display emotions, FACE has an artificial skull made by Frubber, an elastomer material, and 32 servo motors. The capacity to display emotion is implemented by HEFES (Hybrid Engine for Facial Expressions Synthesis), a sophisticated system implementing "the fusion of muscle-based facial animator and an intuitive interpolation system" [63]. HEFES therefore integrates the two face synthesis architectures. ...
Full-text available
One of the most challenging goals in social robotics is implementing emotional skills. Making robots capable of expressing and deciphering emotions is considered crucial for allowing humans to socially interact with them. In addition to presenting technical challenges, the implementation of artificial emotions in artificial systems raises intriguing ethical issues. In this paper, moving from the case study of a human android, we present a relational perspective on human–robot interaction, claiming that, since robots are material objects not endowed with subjectivity, only an asymmetrical relationship can be established between robots and humans. Based on this claim, we deal with some of the most relevant issues in roboethics, such as transparency, trust, and authenticity. We conclude suggesting that a machine-centered approach to ethics should be abandoned in favor of a relational approach, which revalues the centrality of human being in the Human–Robot Interaction.
... The synergetic effects of realistic appearance and complex humanlike behavior, i.e., gaze, expressions, and motor abilities, have been identified as essential factors (Minato et al., 2004). Hence, novel robots with expressive capabilities have facilitated research on mimicking, synthesizing, and modelling of robotic face movements (Wu et al., 2009;Magtanong et al., 2012;Mazzei et al., 2012;Meghdari et al., 2016). Furthermore, researchers aim at providing insights on how we evaluate, recognize, respond, react, and interact with such social and emotional humanlike robots (Hofree et al., 2018;Hortensius et al., 2018;Jung and Hinds, 2018;Tian et al., 2021). ...
... For example, to control using machine vision software and AI to recognize specific human expressions and mirroring, recreating, or reacting using a robotic agent Silva et al., 2016;Todo, 2018). Similarly, AI applications have been utilized to analyze robots' facial capabilities and automatically learn various expressions (Wu et al., 2009;Mazzei et al., 2012;Meghdari et al., 2016;Chen et al., 2021;Rawal et al., 2022). For automated control of robotic faces, using AUs is valuable as it becomes a transferal unit of facial movement, representing both the human action and the robot's actuation capabilities (Lin et al., 2011;Lazzeri et al., 2015;Faraj et al., 2020). ...
Full-text available
This paper presents a new approach for evaluating and controlling expressive humanoid robotic faces using open-source computer vision and machine learning methods. Existing research in Human-Robot Interaction lacks flexible and simple tools that are scalable for evaluating and controlling various robotic faces; thus, our goal is to demonstrate the use of readily available AI-based solutions to support the process. We use a newly developed humanoid robot prototype intended for medical training applications as a case example. The approach automatically captures the robot's facial action units through a webcam during random motion, which are components traditionally used to describe facial muscle movements in humans. Instead of manipulating the actuators individually or training the robot to express specific emotions, we propose using action units as a means for controlling the robotic face, which enables a multitude of ways to generate dynamic motion, expressions, and behavior. The range of action units achieved by the robot is thus analyzed to discover its expressive capabilities and limitations and to develop a control model by correlating action units to actuation parameters. Because the approach is not dependent on specific facial attributes or actuation capabilities, it can be used for different designs and continuously inform the development process. In healthcare training applications, our goal is to establish a prerequisite of expressive capabilities of humanoid robots bounded by industrial and medical design constraints. Furthermore, to mediate human interpretation and thus enable decision-making based on observed cognitive, emotional, and expressive cues, our approach aims to find the minimum viable expressive capabilities of the robot without having to optimize for realism. The results from our case example demonstrate the flexibility and efficiency of the presented AI-based solutions to support the development of humanoid facial robots.
... Image taken from [77]. More details on single services reported in the figure can be found in [79][80][81]. ...
... Such information detected by the perception system can lead to an immediate physical reaction, like a movement or a facial expression, in the body of the robot [81] (i.e., the reactive path) and/or constitute a trigger for emotional or reasoning processes that will lead to more complex behavior (i.e., the deliberative path). ...
Humanoids have been created for assisting or replacing humans in many applications, providing encouraging results in contexts where social and emotional interaction is required, such as healthcare, education, and therapy. Bioinspiration, which has often guided the design of their bodies and minds, has also made them excellent research tools, probably the best platform by which we can model, test, and understand the human mind and behavior. Driven by the aim of creating a believable robot for interactive applications, as well as a research platform for investigating human cognition and emotion, we are constructing a new humanoid social robot: Abel. In this paper, we discuss three of the fundamental principles that motivated the design of Abel and its cognitive and emotional system: hyper-realistic humanoid aesthetics, human-inspired emotion processing, and human-like perception of time. After a brief review of the state of the art on the related topics, we present the robot at its current stage of development, the perspectives for its application, and how it could meet expectations as a tool to investigate the human mind, behavior, and consciousness.
... However, although numerous studies have developed androids for emotional interactions (Kobayashi and Hara, 1993; Kobayashi et al., 2000; Minato et al., 2004, 2006, 2007; Weiguo et al., 2004; Ishihara et al., 2005; Matsui et al., 2005; Berns and Hirth, 2006; Blow et al., 2006; Hashimoto et al., 2006, 2008; Oh et al., 2006; Sakamoto et al., 2007; Lee et al., 2008; Takeno et al., 2008; Allison et al., 2009; Lin et al., 2009, 2016; Kaneko et al., 2010; Becker-Asano and Ishiguro, 2011; Ahn et al., 2012; Mazzei et al., 2012; Tadesse and Priya, 2012; Cheng et al., 2013; Habib et al., 2014; Yu et al., 2014; Asheber et al., 2016; Glas et al., 2016; Marcos et al., 2016; Faraj et al., 2021; Nakata et al., 2021; Table 1), few have empirically validated the androids that were developed. First, no study validated androids' AUs coded using FACS (Ekman and Friesen, 1978; Ekman et al., 2002). ...
Android robots capable of emotional interactions with humans have considerable potential for application to research. While several studies developed androids that can exhibit human-like emotional facial expressions, few have empirically validated androids’ facial expressions. To investigate this issue, we developed an android head called Nikola based on human psychology and conducted three studies to test the validity of its facial expressions. In Study 1, Nikola produced single facial actions, which were evaluated in accordance with the Facial Action Coding System. The results showed that 17 action units were appropriately produced. In Study 2, Nikola produced the prototypical facial expressions for six basic emotions (anger, disgust, fear, happiness, sadness, and surprise), and naïve participants labeled photographs of the expressions. The recognition accuracy of all emotions was higher than chance level. In Study 3, Nikola produced dynamic facial expressions for six basic emotions at four different speeds, and naïve participants evaluated the naturalness of the speed of each expression. The effect of speed differed across emotions, as in previous studies of human expressions. These data validate the spatial and temporal patterns of Nikola’s emotional facial expressions, and suggest that it may be useful for future psychological studies and real-life applications.
... The following paragraphs summarize some of the relevant developed works involving a humanoid robot capable of displaying facial expression interacting with children with ASD. FACE [28,29] is a female android built to allow children with ASD to deal with expressive and emotional information. The system was tested with five children with ASD and fifteen typically developing children. ...
Facial expressions are of utmost importance in social interactions, allowing communicative prompts for a speaking turn and feedback. Nevertheless, not all have the ability to express themselves socially and emotionally in verbal and non-verbal communication. In particular, individuals with Autism Spectrum Disorder (ASD) are characterized by impairments in social communication, repetitive patterns of behaviour, and restricted activities or interests. In the literature, the use of robotic tools is reported to promote social interaction with children with ASD. The main goal of this work is to develop a system capable of automatically detecting emotions through facial expressions and interfacing them with a robotic platform (Zeno R50 Robokind® robotic platform, named ZECA) in order to allow social interaction with children with ASD. ZECA was used as a mediator in social communication activities. The experimental setup and methodology for a real-time facial expression (happiness, sadness, anger, surprise, fear, and neutral) recognition system were based on the Intel® RealSense™ 3D sensor, facial feature extraction, and a multiclass Support Vector Machine classifier. The results obtained allow us to infer that the proposed system is adequate for support sessions with children with ASD, giving a strong indication that it may be used to foster emotion recognition and imitation skills.
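The pipeline described above (geometric facial features fed to a multiclass Support Vector Machine) can be sketched in a few lines with scikit-learn. This is a toy reconstruction under stated assumptions: the RealSense feature extraction is replaced by synthetic class-separated vectors standing in for landmark distances, and all names and dimensions are hypothetical:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
labels = ["happiness", "sadness", "anger", "surprise", "fear", "neutral"]

# Stand-in for facial geometry features (e.g. landmark distances):
# each class gets a different mean so the toy problem is separable.
X = np.vstack([rng.normal(loc=i, scale=0.3, size=(50, 8))
               for i in range(len(labels))])
y = np.repeat(labels, 50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Multiclass SVM (one-vs-one under the hood) with feature standardization.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"accuracy: {accuracy:.2f}")
```

In a live system the same `clf.predict` call would run per frame on features extracted from the depth sensor, with the predicted label driving the robot's response.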
... Moreover, various researchers created social robots to be used as research platforms for them or other colleagues and labs. This is the case of Kismet [18], iCub [19], Kaspar [20,21], FACE [22][23][24], Geminoid [25], and many others. ...
The fast growth of social robotics (SR) has not been unidirectional, but rather toward a multidisciplinary scenario, creating a need for collaboration between different fields. This divergent expansion calls for a clear analysis of the field aimed at better orienting the research, thus shaping the future of social robotics. This paper aims at understanding how the SR research field evolved in the last two decades by analyzing academic publications in SR and human–robot interaction using natural language processing (NLP) techniques. The analysis identified an overlap between the SR and human–robot interaction research fields, which was disambiguated using a data-driven approach that led to the identification of a new group of papers we clustered under the concept of "soft HRI." This research topic has been analyzed by extracting trends and insights. Finally, another topic modelling step was applied to identify seven sub-topics, which are discussed and analyzed to picture the current state of the art of SR. The paper reports a complete overview of the SR research field, identifying various topics and sub-topics and helping researchers understand the evolution of this field, thus supporting the strategic placing and evolution of their research activities.
Clinical educators have used robotic and virtual patient simulator (RPS) systems for decades to help clinical learners (CL) gain key skills that help avoid future patient harm. These systems can simulate human physiological traits; however, they have static faces and lack the realistic depiction of facial cues, which limits CL engagement and immersion. In this article, we provide a detailed review of existing systems in use, and describe the possibilities for new technologies from the human–robot interaction and intelligent virtual agents communities to push forward the state of the art. We also discuss our own work in this area, including new approaches for facial recognition and synthesis on RPS systems, including the ability to realistically display patient facial cues such as pain and stroke. Finally, we discuss future research directions for the field.
The proposed mental-immersion virtual reality avatar interacts with children with autism to improve their social communication, including visual object recognition, speaking (speech gestures), and aural response to the avatar. Children with autism need to understand the basic activities of day-to-day life. Using a 3D modeling tool such as Blender, animated 3D avatar models were created to interact with the children and improve their social communication. Interaction with the avatar helps the children connect emotionally and reduces their unusual or repetitive gestures during the continuous rehabilitation process. Five children with mild autism and two children with a medium level of autism were observed, and interaction with the avatar improved their social communication. Simple objects that the children use day to day were modeled (referred from a special-school curriculum book), and the avatar interacts with the children about these objects. While interacting with the avatar, the children's eye contact, gestures, and expressions were promising compared with traditional therapy using physical objects. The avatar walks in the virtual environment while interacting. Two children with medium autism found it difficult to use a head-mounted display (HMD); we therefore replaced the HMD with a Leap Motion device, so that free hand and head movement lets them see and interact with the virtual environment. The day-to-day objects seen during rehabilitation were modeled realistically, so the children can connect with them easily. For the next level, we created a virtual classroom in which the avatar engages the children with rhymes and music; this stimulates the children's mental state to connect to music in the virtual world, which further enhances their social communication.
Understanding human trust in machine partners has become imperative due to the widespread use of intelligent machines in a variety of applications and contexts. The aim of this paper is to investigate whether human beings trust a social robot, i.e., a human-like robot that embodies emotional states, empathy, and non-verbal communication, differently than other types of agents. To do so, we adapt the well-known economic trust game proposed by Charness and Dufwenberg (2006) to assess whether receiving a promise from a robot increases human trust in it. We find that receiving a promise from the robot increases the trust of the human in it, but only for individuals who perceive the robot as very similar to a human being. Importantly, we observe a similar pattern in choices when we replace the humanoid counterpart with a real human, but not when it is replaced by a computer box. Additionally, we investigate participants' psychophysiological reactions in terms of cardiovascular and electrodermal activity. Our results highlight increased psychophysiological arousal when the game is played with the social robot compared to the computer box. Taken together, these results strongly support the development of technologies enhancing the humanity of robots.
More than 40 years ago, Masahiro Mori, a robotics professor at the Tokyo Institute of Technology, wrote an essay [1] on how he envisioned people's reactions to robots that looked and acted almost like a human. In particular, he hypothesized that a person's response to a humanlike robot would abruptly shift from empathy to revulsion as it approached, but failed to attain, a lifelike appearance. This descent into eeriness is known as the uncanny valley. The essay appeared in an obscure Japanese journal called Energy in 1970, and in subsequent years, it received almost no attention. However, more recently, the concept of the uncanny valley has rapidly attracted interest in robotics and other scientific circles as well as in popular culture. Some researchers have explored its implications for human-robot interaction and computer-graphics animation, whereas others have investigated its biological and social roots. Now interest in the uncanny valley should only intensify, as technology evolves and researchers build robots that look human. Although copies of Mori's essay have circulated among researchers, a complete version hasn't been widely available. The following is the first publication of an English translation that has been authorized and reviewed by Mori. (See “Turning Point” in this issue for an interview with Mori.).
Factor-analytic evidence has led most psychologists to describe affect as a set of dimensions, such as displeasure, distress, depression, excitement, and so on, with each dimension varying independently of the others. However, there is other evidence that rather than being independent, these affective dimensions are interrelated in a highly systematic fashion. The evidence suggests that these interrelationships can be represented by a spatial model in which affective concepts fall in a circle in the following order: pleasure (0°), excitement (45°), arousal (90°), distress (135°), displeasure (180°), depression (225°), sleepiness (270°), and relaxation (315°). This model was offered both as a way psychologists can represent the structure of affective experience, as assessed through self-report, and as a representation of the cognitive structure that laymen utilize in conceptualizing affect. Supportive evidence was obtained by scaling 28 emotion-denoting adjectives in 4 different ways: R. T. Ross's (1938) technique for a circular ordering of variables, a multidimensional scaling procedure based on perceived similarity among the terms, a unidimensional scaling on hypothesized pleasure–displeasure and degree-of-arousal dimensions, and a principal-components analysis of 343 Ss' self-reports of their current affective states.
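Russell's circumplex places the eight affect concepts at fixed angles on a circle, which makes the model directly computable: any affect state is a point in the valence-arousal plane, and labeling it means finding the nearest angle. A small sketch of that geometry (the angles come from the abstract above; the function names are hypothetical):

```python
import math

# The eight affect concepts and their angles (degrees) on the circumplex.
CIRCUMPLEX = {
    "pleasure": 0, "excitement": 45, "arousal": 90, "distress": 135,
    "displeasure": 180, "depression": 225, "sleepiness": 270, "relaxation": 315,
}

def affect_coordinates(term):
    """Return (valence, arousal) coordinates on the unit circle."""
    theta = math.radians(CIRCUMPLEX[term])
    return math.cos(theta), math.sin(theta)

def nearest_affect(valence, arousal):
    """Label an arbitrary affect point with the closest circumplex term."""
    angle = math.degrees(math.atan2(arousal, valence)) % 360
    def angular_distance(term):
        d = abs(CIRCUMPLEX[term] - angle)
        return min(d, 360 - d)  # wrap around the circle
    return min(CIRCUMPLEX, key=angular_distance)

print(affect_coordinates("excitement"))  # positive valence, positive arousal
print(nearest_affect(0.5, 0.5))          # 'excitement'
```

Engines like HEFES exploit exactly this property: facial expressions anchored at the circumplex angles can be blended to synthesize any intermediate point on the circle.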
The development of robots that closely resemble human beings can contribute to cognitive research. An android provides an experimental apparatus that has the potential to be controlled more precisely than any human actor. However, preliminary results indicate that only very humanlike devices can elicit the broad range of responses that people typically direct toward each other. Conversely, to build androids capable of emulating human behavior, it is necessary to investigate social activity in detail and to develop models of the cognitive mechanisms that support this activity. Because of the reciprocal relationship between android development and the exploration of social mechanisms, it is necessary to establish the field of android science. Androids could be a key testing ground for social, cognitive, and neuroscientific theories as well as platform for their eventual unification. Nevertheless, subtle flaws in appearance and movement can be more apparent and eerie in very humanlike robots. This uncanny phenomenon may be symptomatic of entities that elicit our model of human other but do not measure up to it. If so, very humanlike robots may provide the best means of pinpointing what kinds of behavior are perceived as human, since deviations from human norms are more obvious in them than in more mechanical-looking robots. In pursuing this line of inquiry, it is essential to identify the mechanisms involved in evaluations of human likeness. One hypothesis is that, by playing on an innate fear of death, an uncanny robot elicits culturally-supported defense responses for coping with death's inevitability. An experiment, which borrows from methods used in terror management research, was performed to test this hypothesis.
This paper describes the representation, animation and data collection techniques that have been used to produce "realistic" computer generated half-tone animated sequences of a human face changing expression. It was determined that approximating the surface of a face with a polygonal skin containing approximately 250 polygons defined by about 400 vertices is sufficient to achieve a realistic face. Animation was accomplished using a cosine interpolation scheme to fill in the intermediate frames between expressions. This approach is good enough to produce realistic facial motion. The three-dimensional data used to describe the expressions of the face was obtained photogrammetrically using pairs of photographs.
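The cosine interpolation scheme described above, used to fill in intermediate frames between two facial expressions, is easy to state concretely. A minimal sketch, assuming expressions are stored as vertex arrays (the tiny 3-vertex "faces" and all names here are hypothetical):

```python
import numpy as np

def cosine_interpolate(start, end, t):
    """Blend two expression keyframes (vertex arrays) with cosine easing.

    t runs from 0 to 1; the cosine profile gives zero velocity at both
    endpoints, which reads as more natural facial motion than linear
    interpolation.
    """
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    weight = (1.0 - np.cos(np.pi * t)) / 2.0
    return (1.0 - weight) * start + weight * end

# Hypothetical 3-vertex keyframes: a neutral face and a smile.
neutral = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
smile = np.array([[0.0, 0.2], [1.0, -0.1], [2.0, 0.2]])

# Five frames spanning the transition between the two expressions.
frames = [cosine_interpolate(neutral, smile, t) for t in np.linspace(0, 1, 5)]
print(frames[0])   # equals the neutral keyframe
print(frames[-1])  # equals the smile keyframe
```

The same easing function applies unchanged whether the keyframes are ~400 photogrammetric vertices, as in the paper, or servo positions on a physical face.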
First description of the uncanny valley theory
Although the uncanny exists, the inherent, unavoidable dip (or valley) may be an illusion. Extremely abstract robots can be uncanny if the aesthetic is off, as can cosmetically atypical humans. Thus, the uncanny occupies a continuum ranging from the abstract to the real, although norms of acceptability may narrow as one approaches human likeness. However, if the aesthetic is right, any level of realism or abstraction can be appealing. If so, then avoiding or creating an uncanny effect just depends on the quality of the aesthetic design, regardless of the level of realism. The author's preliminary experiments on human reaction to near-realistic androids appear to support this hypothesis.
A procedure has been developed for measuring visibly different facial movements. The Facial Action Code was derived from an analysis of the anatomical basis of facial movement. The method can be used to describe any facial movement (observed in photographs, motion picture film or videotape) in terms of anatomically based action units. The development of the method is explained, contrasting it to other methods of measuring facial behavior. An example of how facial behavior is measured is provided, and ideas about research applications are discussed.