
Elevating Social Presence in Multi-User VR by Increasing Behavioral Realism



Simon Kimmel and Wilko Heuten
OFFIS - Institute for Information Technology, Oldenburg, Germany
Due to profound changes in our social structure, the exchange of social interaction via technology-mediated
communication channels is on the rise. In this regard, we argue that by creating an increased social presence
using multi-user VR applications, social connectedness among individuals can be elevated. Although several
factors influencing social presence positively were already identified, the impact of behavioral realism increased
by employing facial tracking technologies is yet unexplored. We hypothesize that social presence can be further
elevated by utilizing this new technology and propose a study design to evaluate this hypothesis.
Keywords— Social Presence, Social VR, Behavioral Realism, Facial Tracking
1 Motivation
Today’s society is facing tremendous changes. Receiving education, studying, and working in an increasingly globalized world raises the demand for mobility to an unprecedented extent [10]. Given its ecological and economic consequences, travel cannot remain a long-term solution to the resulting separation of families and friends [12]. This also became visible during the COVID-19 pandemic, in which the decrease in social connectedness was further exacerbated by enforced social distancing measures. To counteract these developments, the digitization of social exchanges accelerated during the pandemic, particularly through the increased usage of videoconferencing platforms [5, 6]. While this trend provides several potential benefits beyond the reduced risk of infection, such as increased time and resource efficiency, negative effects of digital communication at a distance are emerging at the same time. Among others, videoconferencing is perceived as fatiguing and often elicits low levels of perceived social connectedness and proximity among users, for instance because its usage elicits elevated cognitive load [7, 11]. Given these negative effects, the recent emergence of multi-user Virtual Reality applications (in the following referred to as Social VR) constitutes a promising alternative for technology-mediated communication over distance. One reason is that, in comparison to other communication technologies, Social VR applications often elicit increased levels of social presence in their users [22]. Social presence can be defined as the perceived “sense of being with another” [8] and is a predictor of a variety of positive communication outcomes including trust, enjoyment, and attractiveness [22]. Consequently, elevating perceived social presence to the largest extent possible ought to be a major focus when aiming to improve technology-mediated social experiences.
Even though social presence levels in Social VR applications are often already considered elevated, a variety of potentially influencing factors have not yet been adequately examined in existing research [22]. Among these is the rendering of facial expressions, tracked in real time, onto the communication partner’s avatar within Social VR applications. We hypothesize that employing recently developed facial tracking technology to render users’ facial expressions onto avatars in Social VR can increase elicited social presence levels.
2 Factors Influencing Social Presence
A range of research initiatives has already identified a variety of factors that can impact the level of perceived social presence when using communication technologies [22, 31]. These include, among others, demographic and personality-related characteristics. For instance, women and people with a greater desire for social interaction were found to generally perceive stronger social presence (see e.g. [13, 19, 20]). Beyond these psychological aspects, various technological characteristics have been identified that can demonstrably influence the social presence users perceive in a Social VR application [22, 31]. For example, the type of activity (e.g. independent vs. interdependent; collaborative vs. competitive) that users jointly engage in within a virtual environment can affect perceived social presence levels [22, 31]. Furthermore, while only little research has been conducted in the field, several contributions suggest that social presence levels can be impacted by the multi-modality of a Social VR application. For instance, research by Hoppe et al. exploring haptically enriched interactions with agents in VR environments indicates that haptic feedback can significantly elevate users’ perceived social presence [18]. One of the most discussed aspects potentially influencing social presence in the context of Social VR, however, is the visualization of users, both in terms of self-embodiment and embodiment of the communication partner. In this regard, research has suggested that embodiment reinforces the perception of social presence [27]. The results are less clear, however, regarding how this representation should appear in order to maximize social presence. For instance, existing research shows no conclusive trend for photographically realistic avatars to increase perceived social presence levels [22, 31]. This result is often attributed, among other things, to the Uncanny Valley effect [21]. However, it could also be due to the fact that users’ preferred avatar representations might depend strongly on the communicators’ relationships to one another [26].
In contrast to the results regarding photographic realism, data obtained in studies evaluating the impact of avatars’ behavioral realism clearly indicate that avatars with increased behavioral realism entail an intensified perception of social presence (see e.g. [17, 25, 27, 28, 30]). Most research on behavioral realism focuses on rather simple non-verbal cues such as eye gaze, nodding, blushing, or body movements. In recent years, novel methods for rendering facial expressions driven by facial motion tracking data have been developed on the basis of advances in depth-camera technology and machine learning algorithms [9, 32]. However, the impact on social presence of employing these novel approaches to enrich avatars in Social VR applications with more behaviorally realistic facial expressions has yet to be evaluated. This is particularly because research contributions evaluating said facial tracking technology have so far focused either on social interactions among computer users or on artificial scenarios in which one VR-HMD user communicated with another user who utilized a computer [15, 16, 23, 24]. Therefore, Yassien et al. conclude in an extensive review of social presence research that “the impact of facial expressions on social presence is an open research opportunity” [31]. As “social VR platforms may benefit from investing in technologies that can capture (or infer) and map facial expressions within avatar-mediated environments” [24], we intend to close this research gap by employing recently commercially released facial tracking hardware in an evaluation study. We hypothesize that employing facial tracking to render facial expressions can improve perceived social presence levels, even in comparison to frequently applied facial tracking simulation techniques.
3 Evaluating Impact of Behavioral Realism
By investigating the above-mentioned and thus far mostly unexplored design element of integrating real-time facial expression rendering within a Social VR application, we plan to evaluate whether employing this feature to increase avatars’ behavioral realism helps a system convey further elevated social presence over distance. For this evaluation, we have developed a Social VR application using the game engine Unity [29], which is presented to its users via an Oculus Quest 2 [3]. Within this application, two users can interact with each other to collaboratively solve a task, which consists of the users attempting to explain a fixed set of terms to each other using both verbal and non-verbal communication. The task is interdependent in that a solution can only be achieved if both players cooperate. This task was selected in particular because it obliges users to engage with each other and thereby requires them to pay a high level of attention to their communication partner. Users in the application are visually represented by avatars designed with the character creation tool ReadyPlayerMe [1]. The system can be utilized with three different degrees of integrated behavioral realism: (1) without the avatars rendering any facial expressions; (2) with the avatars simulating facial expressions based solely on analyzing the users’ voice input via Oculus Lipsync [2]; (3) with the avatars rendering realistic facial expressions based both on analyzing the users’ voice input via Oculus Lipsync and on real-time motion detection with the VIVE Facial Tracker [2, 4].
Figure 1: User (left) and its avatar (right) employing our
In order to evaluate the system as well as the impact the differing degrees of behavioral realism have on perceived social presence levels, we are currently planning a large-scale study with a within-subject design. We aim for approximately 50 participants (25 dyads) to perform the task described above in all three behavioral realism conditions. After completion of the task in each condition, we intend to measure perceived social presence levels quantitatively by asking participants to complete, among other measures, the Networked Minds Questionnaire [14]. Additionally, we plan to measure completion times in order to analyze and compare task performance. Early results of an initial pilot study (see Figure 1) with just 6 participants already showed a tendency for condition (3), which employs motion-detected facial rendering, to yield the highest social presence levels. Furthermore, variations in task performance depending on the employed condition were detectable.
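In a within-subject design, the order in which each dyad experiences the three conditions is commonly counterbalanced to limit learning and fatigue effects. The text does not state how condition order will be handled, so the sketch below is only one possible scheme under that assumption: it cycles the 25 dyads through all six possible orderings. The function name `assign_orders` is hypothetical.

```python
from collections import Counter
from itertools import permutations

CONDITIONS = (1, 2, 3)  # the three behavioral realism conditions


def assign_orders(n_dyads, conditions=CONDITIONS):
    """Assign each dyad one ordering of the conditions, round-robin.

    Assumption: full counterbalancing over all permutations; the study
    description itself does not specify an ordering scheme.
    """
    orders = list(permutations(conditions))  # 6 orderings for 3 conditions
    return {
        dyad: orders[(dyad - 1) % len(orders)]
        for dyad in range(1, n_dyads + 1)
    }


schedule = assign_orders(25)
usage = Counter(schedule.values())  # how often each ordering is used
```

Because 25 is not a multiple of 6, the orderings cannot be used equally often under this scheme; the counts differ by at most one, and 24 or 30 dyads would balance exactly.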
By conducting the proposed study, we will contribute to current trends in the HCI community, as we hope to identify the rendering of motion-detected facial expressions as a design element that developers can draw upon when building future Social VR applications. In this way, we believe that future Social VR applications will be able to foster the maintenance and strengthening of social connectedness within families and groups of friends, despite the changes that are currently prevalent in our society.
4 Acknowledgments
This project is funded under project number 16SV8712 by the German Federal Ministry for Education and Research. We would like to thank Eric Landwehr and Timo von Reeken for their assistance in creating the prototype and in conducting the pilot study.
References
[1] Metaverse Full-Body Online 3D Avatar Creator — Ready Player Me.
[2] Oculus Lipsync Unity. Oculus.
[3] Oculus Quest 2: Unser bisher bestes, neues all-in-one VR-Headset — Oculus.
2/?locale=de DE.
[4] VIVE Facial Tracker — VIVE Deutschland.
[5] Microsoft Teams usage growth surpasses Zoom.
Teams-usage-growth-surpasses-Zoom, 2020.
[6] Zoom Goes From Conferencing App to the Pandemic’s Social Network. (Apr. 2020).
[7] Bailenson, J. N. Nonverbal Overload: A Theoretical Argument for the Causes of Zoom Fatigue. Technology, Mind,
and Behavior 2, 1 (Feb. 2021).
[8] Biocca, F., Harms, C., and Burgoon, J. Towards A More Robust Theory and Measure of Social Presence:
Review and Suggested Criteria. Presence 12 (Oct. 2003), 456–480.
[9] Brito, C., and Mitchell, K. Repurposing Labeled Photographs for Facial Tracking with Alternative Camera
Intrinsics. In 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) (Osaka, Japan, Mar. 2019),
IEEE, pp. 864–865.
[10] Bundeszentrale für Politische Bildung, D., Statistisches Bundesamt, W. B. f. S., and Deutsches Institut für Wirtschaftsforschung, P. D. S.-o. P. Datenreport 2021 ein Sozialbericht für die Bundesrepublik Deutschland. 2021.
[11] Fauville, G., Luo, M., Queiroz, A. C. M., Bailenson, J. N., and Hancock, J. Zoom Exhaustion & Fatigue
Scale. Computers in Human Behavior Reports 4 (Aug. 2021), 100119.
[12] Generaldirektion Mobilität und Verkehr (Europäische Kommission). EU Transport in Figures: Statistical Pocketbook 2021. Amt für Veröffentlichungen der Europäischen Union, LU, 2021.
[13] Giannopoulos, E., Eslava, V., Oyarzabal, M., Hierro, T., Gonzalez, L., Ferre, M., and Slater, M.
The Effect of Haptic Feedback on Basic Social Interaction within Shared Virtual Environments. June 2008.
[14] Harms, C., and Biocca, F. Internal Consistency and Reliability of the Networked Minds Measure of Social
Presence. 7.
[15] Hart, J. D., Piumsomboon, T., Lawrence, L., Lee, G. A., Smith, R. T., and Billinghurst, M. Emotion
Sharing and Augmentation in Cooperative Virtual Reality Games. In Proceedings of the 2018 Annual Symposium on
Computer-Human Interaction in Play Companion Extended Abstracts (Melbourne VIC Australia, Oct. 2018), ACM,
pp. 453–460.
[16] Hart, J. D., Piumsomboon, T., Lee, G. A., Smith, R. T., and Billinghurst, M. Manipulating Avatars
for Enhanced Communication in Extended Reality. In 2021 IEEE International Conference on Intelligent Reality
(ICIR) (May 2021), pp. 9–16.
[17] Herrera, F., Oh, S. Y., and Bailenson, J. N. Effect of Behavioral Realism on Social Interactions Inside
Collaborative Virtual Environments. PRESENCE: Virtual and Augmented Reality 27, 2 (Feb. 2020), 163–182.
[18] Hoppe, M., Rossmy, B., Neumann, D. P., Streuber, S., Schmidt, A., and Machulla, T.-K. A Human
Touch : Social Touch Increases the Perceived Human-likeness of Agents in Virtual Reality. In 2020 CHI Conference
on Human Factors in Computing Systems (2020).
[19] Jin, S.-A. A. Parasocial Interaction with an Avatar in Second Life: A Typology of the Self and an Empirical Test
of the Mediating Role of Social Presence. Presence: Teleoperators and Virtual Environments 19, 4 (Aug. 2010),
[20] Johnson, R. D. Gender Differences in E-Learning: Communication, Social Presence, and Learning Outcomes.
Journal of Organizational and End User Computing (JOEUC) 23, 1 (2011), 79–94.
[21] Mori, M., MacDorman, K. F., and Kageki, N. The Uncanny Valley [From the Field]. IEEE Robotics Automation
Magazine 19, 2 (June 2012), 98–100.
[22] Oh, C. S., Bailenson, J. N., and Welch, G. F. A Systematic Review of Social Presence: Definition, Antecedents,
and Implications. Frontiers in Robotics and AI 5 (Oct. 2018), 114.
[23] Oh, S. Y., Bailenson, J., Krämer, N., and Li, B. Let the Avatar Brighten Your Smile: Effects of Enhancing
Facial Expressions in Virtual Environments. PLOS ONE 11, 9 (Sept. 2016), e0161794.
[24] Oh Kruzic, C., Kruzic, D., Herrera, F., and Bailenson, J. Facial expressions contribute more than body
movements to conversational outcomes in avatar-mediated virtual environments. Scientific Reports 10, 1 (Dec. 2020),
[25] Pan, X., Gillies, M., and Slater, M. The Impact of Avatar Blushing on the Duration of Interaction between a
Real and Virtual Person. 7.
[26] Praetorius, A. S., Krautmacher, L., Tullius, G., and Curio, C. User-Avatar Relationships in Various
Contexts: Does Context Influence a Users’ Perception and Choice of an Avatar? In Mensch Und Computer
2021 (New York, NY, USA, Sept. 2021), MuC ’21, Association for Computing Machinery, pp. 275–280.
[27] Smith, H. J., and Neff, M. Communication Behavior in Embodied Virtual Reality. In Proceedings of the 2018
CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY,
USA, Apr. 2018, pp. 1–12.
[28] Tang, T. Y., and Wang, Y. Alone Together: Multiplayer Online Ball Passing using Kinect - An Experimental
Study. In Proceedings of the 18th ACM Conference Companion on Computer Supported Cooperative Work & Social
Computing (New York, NY, USA, Feb. 2015), CSCW’15 Companion, Association for Computing Machinery, pp. 187–
[29] Technologies, U. Unity Real-Time Development Platform.
[30] von der Pütten, A. M., Krämer, N. C., Gratch, J., and Kang, S.-H. “It doesn’t matter what you are!”
Explaining social effects of agents and avatars. Computers in Human Behavior 26, 6 (Nov. 2010), 1641–1650.
[31] Yassien, A., ElAgroudy, P., Makled, E., and Abdennadher, S. A Design Space for Social Presence in VR. In
Proceedings of the 11th Nordic Conference on Human-Computer Interaction: Shaping Experiences, Shaping Society
(Tallinn Estonia, Oct. 2020), ACM, pp. 1–12.
[32] Yu, J., and Park, J. Real-time facial tracking in virtual reality. In SIGGRAPH ASIA 2016 VR Showcase (New
York, NY, USA, Nov. 2016), SA ’16, Association for Computing Machinery, p. 1.