© 2025 IEEE. This is the author’s version of the article that has been published in the proceedings of
IEEE Virtual Reality and 3D User Interfaces Abstracts and Workshops. The final version is available at:
10.1109/VRW66409.2025.00394
When Faces Are Masked: Exploring Emotional Expression Through Body
Postures in Virtual Reality
Inas Redjem*
Univ Rennes, LP3C,
F-35000 Rennes, France
Julien Cagnoncle
Univ Rennes, Inserm, LTSI -
UMR S1099,
F-35000 Rennes, France
Arnaud Huaulmé
Univ Rennes, Inserm, LTSI -
UMR S1099,
F-35000 Rennes, France
Alexandre Audinot
Univ Rennes, CNRS, Inria,
IRISA - UMR 6074,
F-35000 Rennes, France
Florian Nouviale
Univ Rennes, CNRS, Inria,
IRISA - UMR 6074,
F-35000 Rennes, France
Mathieu Risy
Univ Rennes, CNRS, Inria,
IRISA - UMR 6074,
F-35000 Rennes, France
Valérie Gouranton
Univ Rennes, CNRS, Inria,
IRISA - UMR 6074,
F-35000 Rennes, France
Estelle Michinov
Univ Rennes, LP3C,
F-35000 Rennes, France
Pierre Jannin
Univ Rennes, Inserm, LTSI -
UMR S1099,
F-35000 Rennes, France
ABSTRACT
As simulation advances in healthcare training, understanding how body-only signals convey emotions in virtual environments is crucial, particularly with masked virtual agents. This study involved 41 nursing students evaluating 16 faceless fear and surprise postures to assess their realism and the emotion conveyed. While well-recognized in 2D human representations, only three of 16 postures were correctly identified by more than 50% of participants in a 3D virtual agent. These results highlight the impact of virtual agent design on emotional recognition and the need for rigorous testing and refinement to improve emotional expressiveness and realism.
Index Terms: Human-centered computing—Human computer interaction (HCI)—Virtual Reality—Empirical studies in HCI
1 INTRODUCTION
Simulation is a key tool for team training in high-stakes environments [2]. Effective teamwork relies on emotional communication, which fosters collaboration, especially during crises [7]. Virtual agents in simulation often use facial expression as the primary channel for communicating emotions [9]. Studies show that virtual characters' emotional expressions are better recognized when facial and body cues are combined [5]. However, in operating room teams, masks hinder facial expression, making the body the main expressive channel [4]. This study aims to simulate faceless emotional postures using 3D masked virtual agents and to assess their recognition and perception by nursing students.
2 METHOD
2.1 Participants
We recruited 41 nursing students (Mage = 23.18, SD = 6.46), 93% of whom were female. Among participants, 63% had prior experience with virtual reality, while 73% indicated they either never engage in video gaming or do so only a few times per year.
2.2 Procedure
After providing informed consent, participants completed a pre-questionnaire measuring their anxiety levels, followed by the virtual reality (VR) phase, where they observed an artificial agent adopting various emotional postures. For each posture, they identified the expressed emotion and rated its realism. Finally, after viewing all stimuli, participants completed post-questionnaires to assess social presence and their impressions of the virtual agent.

*e-mail: inas.redjem@univ-rennes2.fr
e-mail: estelle.michinov@univ-rennes2.fr
2.3 Material
We developed a virtual operating room in Unity, displayed via an HTC Vive Pro, featuring a virtual agent representing an orthopedic surgeon in full surgical attire, including a helmet, hood, gown, and mask. The emotional postures were derived from the BESST dataset [11], which includes 565 frontal-view images of real bodies with blurred facial expressions depicting six emotions (happiness, sadness, fear, anger, disgust, and surprise). We focused on fear and surprise body postures and selected the 16 most accurately recognized and most highly rated for realism from the original article's data. These postures were recreated on a 3D model using a motion capture system (Xsens motion capture suit by Movella) and integrated into the operating room in Unity.
2.4 Measures
First, participants' pre-VR anxiety levels were measured using the 6-item State-Trait Anxiety Inventory [8] on a 4-point Likert scale (1 = "Not at all", 4 = "Very much"). Then, participants evaluated the virtual agent's emotional expressiveness by identifying the emotion conveyed by each posture, choosing from six options: fear, sadness, surprise, happiness, anger, and disgust (Figure 1). They also rated each posture's realism on a 7-point Likert scale ranging from "Not at all realistic" to "Completely realistic".
After viewing the 16 postures, social presence was assessed using a 5-item scale developed by Bailenson and colleagues [1], with responses on a 5-point Likert scale from 1 ("Not at all") to 5 ("Totally"). An example item is: "The idea that the person isn't a real person has often crossed my mind." Perception of the virtual agent was assessed using three items from Ho and MacDorman's uncanny valley scale [6], evaluating attractiveness (repulsive, agreeable), eeriness (ordinary, weird), and humanness (natural, real) on a 7-point Likert scale.
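For illustration, scoring these self-report measures reduces to simple aggregation. The Python sketch below uses made-up responses and a hypothetical data layout; the paper does not specify the actual data format or scoring conventions, so treat the variable names and values as assumptions:

```python
import statistics

# Hypothetical record for one participant (values are invented).
stai_items = [2, 3, 1, 2, 3, 2]           # six STAI items, 1-4 scale
identified = {"A": "fear", "B": "fear"}   # emotion chosen per posture
intended = {"A": "fear", "B": "surprise"} # emotion each posture depicts

# STAI short-form score: here, the mean of the six items. Scoring
# conventions vary; some analyses reverse-code anxiety-absent items first.
stai_score = statistics.mean(stai_items)

# Per-participant recognition accuracy across the viewed postures.
accuracy = sum(identified[p] == intended[p] for p in intended) / len(intended)

print(stai_score)  # mean of the six items, about 2.17
print(accuracy)    # 0.5: one of two postures identified correctly
```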
3 RESULTS
The results showed that only three of the 16 postures were correctly identified by more than 50% of participants, all of which were fear postures. Posture A was recognized by 76% of participants, posture B by 58% (Figure 2), and the third posture by only 51%.
Figure 1: Emotion recognition task interface
Figure 2: Fear Body Posture A (left panel) and B (right panel)
In contrast, the highest recognition rate for surprise postures was 49%, suggesting that surprise is more difficult to convey through body-only cues. Interestingly, among the two most recognized fear postures, a difference in perceived realism emerged: posture A was rated as more realistic (M = 5.17, SD = 1.73) than posture B (M = 3.97, SD = 1.68). The perception of the virtual agent was ambivalent, being rated as synthetic (M = 3.22, SD = 1.86) and ordinary (M = 3.68, SD = 1.69), with a neutral judgment on repulsiveness (M = 3.98, SD = 1.30). Social presence was moderate (M = 2.57, SD = 1.30), suggesting the interaction did not create a fully immersive or credible social experience. Social presence was positively correlated with perceiving the virtual agent as more agreeable (r = .31, p < .05), ordinary (r = .52, p < .001), and real (r = .61, p < .001). Anxiety showed no correlation with emotion recognition, but correct identification was negatively correlated with perceiving the virtual agent as more real than synthetic (r = −.33, p < .05).
4 CONCLUSION
This study highlights the challenges of recognizing emotions through body-only cues on 3D virtual agents. Despite selecting 16 highly recognizable fear and surprise postures from a validated dataset [11], recognition rates were lower on virtual agents, suggesting that additional factors may impact emotional interpretation in virtual environments. One possible explanation is that animation complexity in virtual agents may hinder emotion recognition by introducing subtle movements that complicate interpretation. In addition, the uncanny valley effect could contribute [6]: participants perceived the virtual agent as synthetic, limiting their ability to connect with it and accurately interpret its emotional expressions. This is supported by the virtual agent's lack of the credibility needed to establish a strong sense of social presence.

Although fear and surprise postures had similar recognition rates in 2D images, surprise was harder to recognize in 3D, possibly due to the complexity of surprise itself. Unlike fear, which is a basic emotion with universal and recognizable cues, surprise may be a context-dependent emotion or mental state [10] and may require more subtle visual information to be conveyed [3]. Prior research [5] also highlighted that surprise relies heavily on facial expressions, supporting the idea that body-only cues are insufficient.
This study has limitations: while frontal postures were selected from the original dataset, participants in the virtual reality environment viewed the virtual agent from a three-quarter profile perspective. Also, no multimodal cues (voice or text) were used. Emotional recognition is inherently multimodal, and the absence of these signals may have reduced recognition accuracy. Future studies should integrate other cues to enhance emotional conveyance. These findings emphasize the need for a VR-specific body-language lexicon, as body-only emotional communication is inherently more challenging to convey in virtual environments. Improving this aspect is crucial for applications such as collaborative virtual environments and medical training scenarios where facial cues are masked. Future research should focus on improving the design of virtual agents to mitigate the uncanny valley effect and on incorporating voice-based interactions or dynamic gestures to strengthen the social presence of virtual agents.
ACKNOWLEDGMENTS
This work was supported by state aid managed by the French National Research Agency under the France 2030 program, bearing the reference ANR-21-DMES-0001.
REFERENCES
[1] J. N. Bailenson, J. Blascovich, A. C. Beall, and J. M. Loomis. Interpersonal distance in immersive virtual environments. Personality & Social Psychology Bulletin, 29(7):819–833, July 2003. doi: 10.1177/0146167203029007002
[2] A. J. Carpenter. Simulation is a valuable tool for team training. The Journal of Thoracic and Cardiovascular Surgery, 155(6):2525, June 2018. doi: 10.1016/j.jtcvs.2018.01.046
[3] J. L. Cheal and M. D. Rutherford. Context-dependent categorical perception of surprise. Perception, 42(3):294–301, Mar. 2013.
[4] B. de Gelder, A. W. de Borst, and R. Watson. The perception of emotion in body expressions. Wiley Interdisciplinary Reviews: Cognitive Science, 6(2):149–158, Jan. 2015. doi: 10.1002/wcs.1335
[5] C. Ennis, L. Hoyet, A. Egges, and R. McDonnell. Emotion Capture: Emotionally expressive characters for games. In Proceedings of Motion on Games, MIG '13, pp. 53–60. Association for Computing Machinery, New York, NY, USA, Nov. 2013. doi: 10.1145/2522628.2522633
[6] C.-C. Ho and K. F. MacDorman. Measuring the uncanny valley effect. International Journal of Social Robotics, 9(1):129–139, Jan. 2017. doi: 10.1007/s12369-016-0380-9
[7] S. Kaplan, K. LaPort, and M. J. Waller. The role of positive affectivity in team effectiveness during crises. Journal of Organizational Behavior, 34(4):473–491, 2013. doi: 10.1002/job.1817
[8] T. M. Marteau and H. Bekker. The development of a six-item short-form of the state scale of the Spielberger State-Trait Anxiety Inventory (STAI). The British Journal of Clinical Psychology, 31(3):301–306, Sept. 1992. doi: 10.1111/j.2044-8260.1992.tb00997.x
[9] M. Ochs, R. Niewiadomski, and C. Pelachaud. Facial expressions of emotions for virtual characters. In The Oxford Handbook of Affective Computing, pp. 261–272. Oxford University Press, New York, NY, USA, 2015. doi: 10.1093/oxfordhb/9780199942237.001.0001
[10] A. Ortony. Are all "basic emotions" emotions? A problem for the (basic) emotions construct. Perspectives on Psychological Science, 17(1):41–61, Jan. 2022. doi: 10.1177/1745691620985415
[11] P. Thoma, D. Soria Bauser, and B. Suchan. BESST (Bochum Emotional Stimulus Set)–a pilot validation study of a stimulus set containing emotional bodies and faces from frontal and averted views. Psychiatry Research, 209(1):98–109, Aug. 2013. doi: 10.1016/j.psychres.2012.11.012