Embodied online dance learning objectives of CAROUSEL+

Kenny Mitchell*
Edinburgh Napier University, Roblox, 3FINERY, Cobra Simulation
Babis Koniaris
Edinburgh Napier University
Monica Tamariz
Edinburgh Napier University, Heriot-Watt University
Jessie Kennedy
Edinburgh Napier University
Noshaba Cheema
DFKI, MPI-INF, Saarland University
Elisa Mekler
Aalto University
Pieter van der Linden
Erik Herrmann
DFKI, Saarland University
Perttu Hämäläinen
Aalto University
Iain McGregor
Edinburgh Napier University
Philipp Slusallek
DFKI, Saarland University
Carmen Mac Williams
Grassroots Arts
This is a position paper concerning the embodied dance learning
objectives of the CAROUSEL+
project, which aims to impact how
online immersive technologies influence multi-user interaction and
communication with a focus on dancing and learning dance together
online. We aim to enable shared online social immersive experiences
across the reality-virtuality continuum synchronizing audio, visual
and haptic rendering. In teaching and learning to dance remotely,
our models should support accessibility, style transfer and adaption
of multi-modal feedback according to skills, strength, flexibility
The Internet connects individuals to the world. Phones and com-
puters give unprecedented access not only to information but also
to countless other people in the world in virtual social networks.
However, a lot of online experience is passive, disconnected and
disembodied. Internet users are patently part of the digital world, but
often isolated from real-world sensations and feelings such as the
presence of others, their touch, or their movement. We find ourselves
isolated, instead of brought together, by the very technologies that
are designed to connect us.
Isolation and loneliness are distressing and debilitating and have
long-lasting consequences for health, including mental health, pro-
ductivity, and happiness [31]. The current COVID-19 crisis and
the consequent social distancing, confinement and lockdown have
made the problem more acute and visible for the whole of soci-
ety. The ways in which we stay in touch with friends, family and
colleagues and take part in social events will probably change forever.
CAROUSEL+ has the vision that, with the support of novel,
original and imaginative combinations of Artificial Intelligence and
immersive interaction technologies, people will be able to feel each
other’s presence, touch, and movement, even if they are physically
disconnected. These new developments will help overcome isolation
and loneliness and bring improvements to our health, work, and
wellbeing. Moreover, they will also generate the foundations for an
ecosystem of original, as yet unimagined forms of communication
and expression.
Dance is a profoundly human activity. We dance when we are
in love, we show we are happy with dance, and sharing a dance with
someone creates a deep connection. Together with language and
music, dance is one of the few behaviors that occurs naturally in
children and is attested universally across world cultures [25]. Dance
uniquely combines thinking, feeling, sensing, and doing. It has
strong effects on physiological and psychological well-being,
combining the benefits of physical exercise with heightened sensory
awareness, cognitive function, creativity, inter-personal contact and
emotional expression [11, 25]. This is why in CAROUSEL+ we
have chosen dance as the vehicle to implement our vision.
Figure 1: Imagine you feel lonely at home and wish to dance with
somebody. You invite both your real friends and artificial intelligence
(AI)-driven characters for a virtual house party in your living room to
dance together. You can feel their touch and dancing bodies next
to you and you are all enjoying this spontaneous party. You feel
connected to the others and not lonely anymore. Augmented remote
dance aims to overcome current challenges of learning practices
hitherto restricted to the same physical space.
Our studies include modern freeform dance styles in groups and
in pairs, folk dances, and partner dances such as tango. We believe
dance presents a special challenge due to the complex dependency
between the motions, the music, the tactile contact and the dancers’
feelings and sensations. The results learned from dance will open
up research avenues to many other use cases in the future, from
physical training to manual assembly, martial arts, companionship
etc., that require collaboration and synchronization of movements in
A person may feel lonely or isolated if they find themselves with
nobody to contact, temporarily confined, or in pandemic quarantine or
lockdown conditions. This person may open an interaction with a
digital character and start dancing together; other real people may
decide to join (Figure 2). In live events an icebreaker, for example
from a group animator, an inspiring dancer, musician, moderator,
trainer, or companion is often needed to kick-start the interaction.
We imagine that an autonomous digital character will be able to play
this role in the future. This is precisely the technical CAROUSEL+
breakthrough: to create AI-driven characters that will be able to
interact autonomously with a single person or a group of people in a
meaningful way, so that people can feel and dance with each other
even if they are not physically in the same space!
Therefore, CAROUSEL+ aims to deliver a fully immersive
shared performance environment through which participants may
visualize their interactions with each other and AI-characters in a
virtual space or in a mixed reality reconstruction of their rooms to
lay the foundation for networked synchronized virtual dancing as
well as other physical activities. CAROUSEL+ aims to lay the new
technical and scientific foundation for social and physical real-world
AI for enhanced social interaction with digital characters (Figure 2).
Figure 2: Levels of potential of real-world social and physical AI with
autonomous digital characters.
2021 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW)
978-0-7381-1367-8/21/$31.00 ©2021 IEEE
DOI 10.1109/VRW52623.2021.00062
Many interaction and movement synthesis tasks benefit from the
ability to predict the user’s movements and/or intention based
on past movements and sensor data. However, for many interaction
tasks, predicting poses and movement trajectories is not enough, and
the user’s movement needs to be understood in terms of the move-
ment vocabularies of particular dance styles, and ergonomic and
biomechanical properties such as energy expenditure and perceived
effort. Furthermore, for the AI system to adapt the difficulty or to
give constructive criticism on various poses or dance styles it is also
necessary to evaluate the user’s fatigue needs based on kinematic
data alone. However, a full-body model for real-time perceived
fatigue analysis and motion synthesis has yet to be developed.
This project involves technical, scientific, social science, innova-
tion, and artistic partners from the outset. The focus on dance is
the beginning of our research path to explain our research concepts.
The consortium follows the interdisciplinary co-design methodol-
ogy for the development of the scenarios, use cases, requirements
for the interactive formats and technologies, demonstrations, and
validations. From the start of the project, the consortium partners
are involving interdisciplinary co-designers in their cities, networks,
and campuses, holding focus groups with artists, musicians, design
students, technologists, and scientists.
In our research, we concentrate on the emergent method to study
novel forms of social interaction between groups of humans and dig-
ital characters. Emergent behavior of a system cannot be described
by the behavior of the components of the system and is therefore
unexpected to a designer or observer. The development of new
interactive technologies takes place within Research and Develop-
ment aspects of computer science and technology on one side and
artistic and social science on the other. These worlds are character-
ized by different thought processes, work techniques, strengths, and
needs. Unravelling the potential of all sides in order to find unusual
innovative solutions requires close interdisciplinary collaboration.
Our concept, scenarios and experiments are based on hypothe-
ses that emergent behavior will arise from the interactions between
members of a group, rather than from the behavior of the members
of the group in isolation. The idea is that unforeseen, as yet unknown
forms of expression and communication will develop between
humans and digital characters in our experiments. This should lead
to the emergence of a new generation of digital characters
capable of motivated, autonomous interaction with humans.
Two things are clear about artificial intelligence (AI) and extended
reality (XR). First, the technologies will be flooding the consumer,
enterprise, and education markets in the coming decades. Second,
researchers know little to nothing about how this novel technology
will change social interaction. In CAROUSEL+ we will run user
studies with different test user groups (ranging from experts to
beginner dance students of all ages) to explore how group interaction
will change by staging trials, experiments and live dance events
with XR and AI-characters. “Experts” in this regard include dance
experts, such as teachers and hobby dancers who may not have much
expertise in XR technologies, and XR experts, such as developers
and psychologists who may not have much experience in dancing,
or a combination of both expert types.
CAROUSEL+ aims to develop a complete solution for realistic
augmentation and interaction of people with a never-before-seen
level of quality and presence in the dynamic virtual environment.
This is expected to have a large impact on other uses and applications
of extended reality. The integration and experimental validation for
the dancing experience of the proposed visual, haptic, and aural
technologies providing consistent low-latency multi-modal feedback
is of great research interest and challenge. Whilst hardware improves
and develops, alternative combinations of devices will suit levels of
sensing and display with increasing freedom of movement towards
ideally fully unencumbered immersive dance in the presence of
virtual dance partners.
3.1 Audio
Audio is a critical component of rendering immersive feedback
for remote participants to feel tele-present in a virtual shared
environment, and is key to musical rhythmic dance. CAROUSEL+ will focus
upon binaural audio reproduction over headphones/earphones, in
line with industry practice for monitoring live performance and
production. By contrast, spatial audio reproduction over loudspeakers
tends to require large numbers of loudspeakers, careful positioning,
and calibration, and suffers limited isolation between personal sound
delivered to multiple listeners in the same room [9,28].
Traditionally, mixes for feedback during film/music production
are channel-based, whereas interactive media, such as games and
emerging broadcast technologies, tend to be object-based [13]. For an
auditory display to sound realistic, one must include both sound
sources and their reverberation within the (virtual) acoustical
environment [29]. Current research is exploring methods to avoid poor
localization, front-back confusions, and a lack of externalization
(i.e., failing to display the sound outside of the listener’s head), and
to account for the context of the real environment’s acoustic materials [10].
CAROUSEL+ will advance feedback methods to give an object-based
immersive experience, and will employ machine-learning
methods to select appropriate features from images and
anthropometric data to estimate a close match to each actor’s individual
head-related impulse response (HRIR), e.g., via sparse representation
[He15]. The novel contributions to audio rendering in CAROUSEL+
will be to demonstrate the use of object-based spatial audio for
virtual auditory displays in synchronized networked shared experiences,
to investigate the effect of virtual location acoustics within this
representation, and to adapt the rendered scenes according to the various
roles in the dance or setting. Key research questions to be addressed
with respect to immersive audio experiences in CAROUSEL+ are:
• How can object-based spatial audio techniques bring benefit
  to a person’s immersion via a virtual auditory display that is
  adapted to his/her anthropometric measurements and delivered
• What influence does the impression of the acoustics of the
  virtual environment have on the person’s immersion within
  that environment?
• How can auditory displays be designed to best scale to large
  groups and crowds?
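To make the role of the HRIR concrete, the following minimal sketch renders a mono sound object binaurally by convolving it with a left-ear and a right-ear impulse response. It only illustrates the general principle: the HRIR values are toy numbers invented for this example, the function names `convolve` and `render_binaural` are hypothetical, and in practice the per-listener HRIR would be estimated as described above rather than hard-coded.

```python
from typing import List, Sequence, Tuple


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Direct-form linear convolution (adequate for a sketch, slow for real audio)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(
    mono: Sequence[float],
    hrir_left: Sequence[float],
    hrir_right: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Filter one mono sound object with the HRIR pair for its direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy HRIR pair (assumed values) for a source on the listener's left:
# the right ear receives the sound two samples later and at half amplitude.
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.5]

click = [1.0, 0.0]  # a single-sample impulse as the sound object
left, right = render_binaural(click, hrir_left, hrir_right)
```

In an object-based scene, each sound object would be rendered this way with the HRIR pair for its current direction, and the per-ear results summed.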
3.2 Visual
CAROUSEL+ will deliver a combination of novel science (algorithms,
representations) and engineering contributions to extend
intelligent predictive motion sensing to real-time networked
co-motions. This will enable real-time immersive remote visualization
of partners’ virtual character performances. To address latency,
motion prediction schemes [1] will compensate for delays in body motion
reconstruction among remote participants.
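As a rough illustration of how prediction can hide network latency, the sketch below extrapolates each received joint position forward by the measured delay under a constant-velocity assumption. This is a deliberately simple stand-in, not the physics-based scheme of [1]; the function name `extrapolate_pose`, the joint name and the timing values are hypothetical.

```python
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]
Pose = Dict[str, Vec3]  # joint name -> position in metres (assumed layout)


def extrapolate_pose(prev: Pose, curr: Pose, dt: float, latency: float) -> Pose:
    """Predict where each joint will be `latency` seconds after `curr`.

    Assumes constant velocity between the two most recent samples,
    which were captured `dt` seconds apart.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    predicted: Pose = {}
    for joint, p1 in curr.items():
        p0 = prev.get(joint, p1)  # joint not seen before: treat it as at rest
        x, y, z = (c1 + (c1 - c0) / dt * latency for c0, c1 in zip(p0, p1))
        predicted[joint] = (x, y, z)
    return predicted


prev_pose = {"left_hand": (0.0, 1.0, 0.0)}
curr_pose = {"left_hand": (0.1, 1.0, 0.0)}
# Samples 0.1 s apart; we want to hide 0.05 s of network delay.
predicted_pose = extrapolate_pose(prev_pose, curr_pose, dt=0.1, latency=0.05)
```

A learned or physics-based predictor would replace the constant-velocity step, but the interface stays the same: recent samples and the delay go in, a pose to display comes out.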
Although finger tracking involves capturing mainly fine-scale
details and motion, it is nevertheless important due to the nuances
and subtleties that fingers exhibit, particularly in dance. Further, finger
tracking will be leveraged by the haptic feedback system to provide
the participants with unprecedented sensorimotor immersion in their
connected performance and play environment.
We propose to advance the state of the art using integrated camera
sensors with head-mounted sensors, coupled with hand-held
motion sensors for open, bespoke, easily deployable, new hardware
frameworks. To transmit facial avatar expressions in greater
detail than before, we will in the first instance address the eye and
mouth regions by extending methods of facial capture using cameras
mounted inside the immersive display headset [6,30]. In the second
instance, we will modify our solution to fuse camera sensor and
immersive headset tracking information, covering the volume of
each remote person, to specifically merge sensor data with as few
encumbrances as possible. The aim, in each instance, is to acquire
enough information to track key features by locating cameras in
the appropriate view for detailed facial, hand and body expression
reconstruction. Recent promising work of the consortium on
vision-based facial capture preserves image-based feature detail [20] and
employs deep learning with synthesis and large 2D and 3D training sets [8].
3.3 Haptics
Current haptic feedback displays are not well suited to broad
audience usage scenarios. Commercial devices, such as the CyberForce
system, offer impressive force feedback capabilities but are
cumbersome and adversely affect the range of motion and comfort
of the wearer. Vibrotactile displays have recently gained popularity
for gaming and industrial applications as they have a smaller form
factor and are therefore less intrusive. However, these solutions
are not designed with a dance group in mind. Furthermore, the
associated rendering methods are limited in their ability to provide rich,
perceptually plausible feedback, although recent work has begun to
focus on developing perceptual rendering methods for multi-grid
displays creating highly immersive effects [3,15,16,26].
Gaming vests provide real-time feedback about a player’s virtual
presence within a gaming environment (Tactsuit, KORFX), and the VYBE chair
adaption provides similar feedback for film viewing or racing game scenarios.
A tangible result of CAROUSEL+ will be a sparse targeted-body
haptic display, developed by consortium partners, which is capable
of displaying a range of physical and non-physical effects to a
participant, thus increasing their sense of presence in a virtual setting and
providing spatiotemporal cues through touch and temperature sensing
modalities. The haptic actuators will utilize an innovative design of
vibrotactile hardware, which allows direct integration onto textiles,
considerably reducing any obstruction of movement important
for dance. These will further provide optical tracking points for
use with the sensor hardware, providing a whole-body sensorimotor
solution. The haptic rendering methods will likewise leverage novel
retargeting methodologies of CAROUSEL+ for providing realistic
tactile stimulus to each person whose dance movements are being
used to control their virtual character, mapping tangible interactions
to each person’s body appropriately.
CAROUSEL+ develops technology for intelligent simulation and
control of physically plausible virtual characters, interpreting the
signals from multiple people in real time with the purpose of driving
high-quality animated content that underpins seamless and engaging
visual experiences in the context of dancing.
4.1 User Analysis
CAROUSEL+ tackles the many unsolved research questions related
to real-time embodied interaction: prediction of future movements
from partial sensor data and understanding of user intent in various
contexts. This requires generative and predictive movement models
that can be flexibly conditioned on subsets of observed variables,
preferably without retraining for each possible subset. The models
should also support accessibility, style transfer and adaptation of
multi-modal feedback according to observations of skill, strength and flexibility.
Objectives include:
• Further building on neural generative models (e.g., StyleGAN
  2 [18] and the DRMM [14]).
• Current approaches [2] in movement and dance style
  classification rely on a discretization of movement patterns, which
  restricts, for example, the number of dance styles that can be
  distinguished. We aim to extend such classifications to a continuous
  representation allowing a variety of styles.
• Increasing the value of feedback on posture or dance routine by
  incorporating models from the biomechanical literature based on
  computationally efficient torque and force analysis for kinematic
  fatigue measurements [21, 32].
• Incorporating current emotion recognition methods for
  explainable and valuable feedback.
A challenge that needs to be addressed is the adaptation of these
models to different users. To address this adaptation, alongside the various
body measurements described in the literature, CAROUSEL+ aims to
develop the first full-body perceived fatigue analysis that can also be
used for efficient motion synthesis with the optimization of rewards.
For this we hope to combine recent movement synthesis approaches
based on joint actuation torques [17] and our recently developed
cumulative fatigue model for reinforcement learning (RL) [7], which
uses cumulative fatigue to synthesize arm movements.
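For readers unfamiliar with compartment-based fatigue models, the sketch below integrates a three-compartment model in the spirit of [32], in which motor units move between resting, active and fatigued states. It is a minimal illustration rather than the cumulative fatigue model of [7]: the parameter values (`F`, `R`, `LD`, `LR`), the Euler integration and the names `MuscleState` and `step` are assumptions made for this example and are not calibrated to any joint.

```python
from dataclasses import dataclass


@dataclass
class MuscleState:
    resting: float = 1.0   # fraction of motor units at rest (M_R)
    active: float = 0.0    # fraction currently activated (M_A)
    fatigued: float = 0.0  # fraction currently fatigued (M_F)


def step(state: MuscleState, target_load: float, dt: float,
         F: float = 0.01, R: float = 0.002,
         LD: float = 10.0, LR: float = 10.0) -> MuscleState:
    """Advance the model by one explicit Euler step of `dt` seconds.

    F and R are the fatigue and recovery rates; LD and LR control how fast
    activation rises and falls. All values here are illustrative placeholders.
    """
    ma, mr, mf = state.active, state.resting, state.fatigued
    if ma < target_load:
        # Recruit more units, limited by how many are still resting.
        drive = LD * (target_load - ma) if mr > target_load - ma else LD * mr
    else:
        # Too much activation: release units back to rest.
        drive = LR * (target_load - ma)
    return MuscleState(
        resting=mr + dt * (-drive + R * mf),
        active=ma + dt * (drive - F * ma),
        fatigued=mf + dt * (F * ma - R * mf),
    )


state = MuscleState()
for _ in range(6000):  # hold 50% of maximum effort for 60 s at dt = 0.01 s
    state = step(state, target_load=0.5, dt=0.01)
```

The fatigued fraction grows while the load is held, and it could serve as a penalty term in a reward function or as a signal for adapting difficulty.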
Furthermore, the machine learning models providing the esti-
mates need to be explainable, for an AI movement partner such as
a virtual dance teacher to be able to provide effective corrective
feedback (“this is how you should correct your movement”) instead
of just summative feedback (“this movement is incorrect”). Explain-
ability is an active topic in machine learning, mostly pursued in the
domain of medical image analysis; we will review the literature,
apply, and extend the most promising approaches to movement. In
addition to movement classification and analysis there has already
been work done on emotion recognition from image data specifically
for VR, handling the partially occluded face due to the head-mounted
display [33]. If other sensors for face muscle activation [19, 22] or
EEG [23] are available, they can also be used for emotion detection.
4.2 Character Animation for Dancing
CAROUSEL+ will investigate how to compute synchronized dance
motions by combining physical simulation that incorporates captured
sensing performance data with anatomical knowledge. The algorithms
will further ensure that the dance motions can be augmented and
enhanced such that they conform to group constraints while not
encumbering group members in an undesired manner. For the
simulation of the dance use cases, we want to develop a framework
for dancing motions that can adapt to input signals derived from
real user motion, such as rhythm, and interact with the environment.
To our knowledge, no prior AI model exists that can interact and
dance with humans, handle all kinds of unpredictable behavior, give
feedback and simultaneously come up with new creative movements
and autotelic behavior in real time. While Granados et al. [12] have
developed a robot dance teacher, which provides haptic feedback to
the user, it is not able to come up with its own new dance moves,
nor to analyse the user’s intention and behavior.
We plan to apply reinforcement learning of physics-based
animation controllers. Increasing complexity in dance motions during
training can be addressed by curriculum learning. Dancing requires
a lot of creative and autotelic behavior to emerge that is
intrinsically motivated. In artificial agents, intrinsic motivation
resembles curiosity and exploration. In reinforcement learning
such motivation can be modelled by intrinsic rewards [4,27].
Research [5, 24] suggests that such intrinsic rewards correspond to the
need for novel stimuli, i.e. rewarding actions that yield novel or
unpredicted observations.
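One simple way to reward novel observations is a count-based bonus that shrinks each time the same observation recurs. The sketch below is a simplified tabular variant for illustration only; it is neither the pseudo-count method of [5] nor the prediction-error curiosity of [24], and the class name `CountBasedBonus`, the weight `beta` and the discrete move labels are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable


class CountBasedBonus:
    """Adds an exploration bonus of beta / sqrt(visit count) to the task reward."""

    def __init__(self, beta: float = 0.1) -> None:
        self.beta = beta
        self.visits: Counter = Counter()

    def reward(self, observation: Hashable, extrinsic: float = 0.0) -> float:
        # Count this visit first, so the first encounter yields the full bonus.
        self.visits[observation] += 1
        bonus = self.beta / math.sqrt(self.visits[observation])
        return extrinsic + bonus


bonus = CountBasedBonus(beta=1.0)
first = bonus.reward("spin")   # a move seen for the first time
second = bonus.reward("spin")  # the same move repeated
novel = bonus.reward("dip")    # a different, previously unseen move
```

Repeating a move earns a smaller bonus than trying a new one, which nudges a learning agent toward varied movement. Continuous pose data would need discretisation or a density model before counts like these make sense.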
To satisfy dancers’ quest for creativity, we want to be able to
“teach” custom motion to the characters by copying emergent
behavior directly from the online dancers. This will lead to the
development of spontaneous dance improvisational skills.
Rigid body simulations only approximate the complex
musculoskeletal system of a human and can therefore produce
unrealistic motions. Because increasing the complexity of the model
would reduce performance, we intend to apply
fatigue models to simulate the effect of muscle systems.
To avoid having to retrain each individual character separately, and
thus to save time and space, we plan to introduce reusable controllers
for different AI-characters.
The innovation perspective of this new branch “Real-World Social
and Physical AI” is tremendous as it can be applied to many applica-
tions in many areas, for physically interacting in a meaningful social
way. In particular digital characters in the future could act as physical
trainers, dancers, entertainers, actors, coworkers, health assistants,
guides, educators, spectators, physical therapists and companions.
Beyond social, entertainment, health and educational applications,
CAROUSEL+ learning and understanding of body language and
group dynamics could be deployed in many other areas, including
security, peace-making, emergency handling and autonomous
In our networked dance learning framework we can foresee taking
advantage of regular rhythmic patterns, as exemplified by video
games such as Just Dance, Beat Saber and Dance Central VR, but
extended to guide cooperative, tactile and responsive motions
with partners, going beyond learning by copying to learning by
leading and being led, with anticipation of and spontaneous response to
the others’ moves. Further, live motion analysis in a body-tracked
XR session provides a data stream that can be adapted to deliver
encouraging corrections, enhancements and amplification of dance
CAROUSEL+ is research and innovation project number 101017779,
funded under the European Horizon 2020 FET Proactive program.
[1] S. Andrews, I. Huerta, T. Komura, L. Sigal, and K. Mitchell. Real-time
Physics-based Motion Capture with Sparse Sensors. In Proceedings
of the 13th European Conference on Visual Media Production (CVMP
2016), pp. 1–10. Association for Computing Machinery,
New York, NY, USA, Dec. 2016. doi: 10.1145/2998559.2998564
[2] A. Aristidou, E. Stavrakis, M. Papaefthimiou, G. Papagiannakis, and
Y. Chrysanthou. Style-based motion analysis for dance composition.
The Visual Computer, 34(12):1725–1737, Dec. 2018. doi: 10.1007/
[3] P. Bach-Y-Rita, C. C. Collins, F. A. Saunders, B. White, and
L. Scadden. Vision Substitution by Tactile Image Projection. Nature,
221(5184):963–964, Mar. 1969. doi: 10.1038/221963a0
[4] G. Baldassarre and M. Mirolli. Intrinsically Motivated Learning
Systems: An Overview. In G. Baldassarre and M. Mirolli, eds.,
Intrinsically Motivated Learning in Natural and Artificial Systems, pp. 1–14.
Springer, Berlin, Heidelberg, 2013. doi: 10.1007/978-3-642-32375-1_1
[5] M. Bellemare, S. Srinivasan, G. Ostrovski, T. Schaul, D. Saxton, and
R. Munos. Unifying Count-Based Exploration and Intrinsic Motivation.
Advances in Neural Information Processing Systems, 29:1471–1479,
2016.
[6] C. J. D. S. Brito and K. Mitchell. Recycling a Landmark Dataset
for Real-time Facial Capture and Animation with Low Cost HMD
Integrated Cameras. In The 17th International Conference on
Virtual-Reality Continuum and its Applications in Industry, VRCAI ’19, pp.
1–10. Association for Computing Machinery, New York, NY, USA,
Nov. 2019. doi: 10.1145/3359997.3365690
[7] N. Cheema, L. A. Frey-Law, K. Naderi, J. Lehtinen, P. Slusallek, and
P. Hämäläinen. Predicting Mid-Air Interaction Movements and Fatigue
Using Deep Reinforcement Learning. In Proceedings of the 2020
CHI Conference on Human Factors in Computing Systems, pp. 1–13.
Association for Computing Machinery, New York, NY, USA, Apr.
2020.
[8] A. Chen, Z. Chen, G. Zhang, K. Mitchell, and J. Yu. Photo-Realistic
Facial Details Synthesis From Single Image. pp. 9428–9438. IEEE
Computer Society, Oct. 2019. doi: 10.1109/ICCV.2019.00952
[9] P. Coleman, P. J. B. Jackson, M. Olik, M. Møller, M. Olsen, and
J. Abildgaard Pedersen. Acoustic contrast, planarity and robustness of
sound zone methods using a circular loudspeaker array. The Journal
of the Acoustical Society of America, 135(4):1929–1940, Apr. 2014.
doi: 10.1121/1.4866442
[10] D. W. Crawford, A. Samdahl, J. Voris, I. B. Kadar, and K. Mitchell.
Augmented reality (AR) audio with position and action triggered virtual
sound effects, Sept. 2014.
[11] R. Elliott. The Use of Dance in Child Psychiatry. Clinical Child
Psychology and Psychiatry, 3(2):251–265, Apr. 1998.
doi: 10.1177/1359104598032008
[12] D. F. P. Granados, B. A. Yamamoto, H. Kamide, J. Kinugawa, and
K. Kosuge. Dance Teaching by a Robot: Combining Cognitive and
Physical Human-Robot Interaction for Supporting the Skill Learning
Process. IEEE Robotics and Automation Letters, 2(3):1452–1459, July
2017. doi: 10.1109/LRA.2017.2671428
[13] J. Herre, J. Hilpert, A. Kuntz, and J. Plogsties. MPEG-H Audio—The
New Standard for Universal Spatial/3D Audio Coding. Journal of
the Audio Engineering Society, 62(12):821–830, Jan. 2015.
[14] P. Hämäläinen, T. Saloheimo, and A. Solin. Deep Residual Mixture
Models. arXiv e-prints, arXiv:2006.12063, June 2020.
[15] A. Israr and I. Poupyrev. Control space of apparent haptic motion. In
2011 IEEE World Haptics Conference, pp. 457–462, June 2011. doi:
10.1109/WHC.2011.5945529
[16] A. Israr, I. Poupyrev, C. Ioffreda, J. Cox, N. Gouveia, H. Bowles,
A. Brakis, B. Knight, K. Mitchell, and T. Williams. Surround Haptics:
sending shivers down your spine. In ACM SIGGRAPH 2011
Emerging Technologies, SIGGRAPH ’11, p. 1. Association for Computing
Machinery, New York, NY, USA, Aug. 2011. doi: 10.1145/2048259.
[17] Y. Jiang, T. Van Wouwe, F. De Groote, and C. K. Liu. Synthesis
of biologically realistic human motion using joint torque actuation.
ACM Transactions on Graphics, 38(4):72:1–72:12, July 2019. doi: 10.
[18] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila.
Analyzing and Improving the Image Quality of StyleGAN. pp.
8110–8119, 2020.
[19] H. Li, L. Trutoiu, K. Olszewski, L. Wei, T. Trutna, P.-L. Hsieh,
A. Nicholls, and C. Ma. Facial performance sensing head-mounted
display. ACM Transactions on Graphics, 34(4):47:1–47:9, July 2015.
doi: 10.1145/2766939
[20] Y. Li, L. Ma, H. Fan, and K. Mitchell. Feature-preserving detailed 3D
face reconstruction from a single image. In Proceedings of the 15th
ACM SIGGRAPH European Conference on Visual Media Production,
CVMP ’18, pp. 1–9. Association for Computing Machinery, New York,
NY, USA, Dec. 2018. doi: 10.1145/3278471.3278473
[21] J. M. Looft, N. Herkert, and L. Frey-Law. Modification of a
three-compartment muscle fatigue model to predict peak torque decline
during intermittent tasks. Journal of Biomechanics, 77:16–25, Aug.
2018. doi: 10.1016/j.jbiomech.2018.06.005
[22] I. Mavridou, J. T. McGhee, M. Hamedi, M. Fatoorechi, A. Cleal,
E. Balaguer-Ballester, E. Seiss, G. Cox, and C. Nduka. FACETEQ: A
novel platform for measuring emotion in VR. In Proceedings of the
Virtual Reality International Conference - Laval Virtual 2017, VRIC
’17, pp. 1–3. Association for Computing Machinery, New York, NY,
USA, Mar. 2017. doi: 10.1145/3110292.3110302
[23] J. Nam, H. Chung, Y. a. Seong, and H. Lee. A New Terrain in HCI:
Emotion Recognition Interface using Biometric Data for an Immersive
VR Experience. arXiv e-prints, arXiv:1912.01177, Dec. 2019.
[24] D. Pathak, P. Agrawal, A. A. Efros, and T. Darrell. Curiosity-driven
exploration by self-supervised prediction. In Proceedings of the 34th
International Conference on Machine Learning - Volume 70, ICML’17,
pp. 2778–2787. Sydney, NSW, Australia, Aug. 2017.
[25] A. Pickard and D. Risner. Dance, health and wellbeing special issue.
Research in Dance Education, 21(2):225–227, 2020.
[26] P. Preechayasomboon, A. Israr, and M. Samad. Chasm: A Screw
Based Expressive Compact Haptic Actuator. In Proceedings of the
2020 CHI Conference on Human Factors in Computing Systems, pp.
1–13. Association for Computing Machinery, New York, NY, USA,
Apr. 2020.
[27] S. Roohi, J. Takatalo, C. Guckelsberger, and P. Hämäläinen. Review of
Intrinsic Motivation in Simulation-based Game Testing. In Proceedings
of the 2018 CHI Conference on Human Factors in Computing Systems,
CHI ’18, pp. 1–13. Association for Computing Machinery, New York,
NY, USA, Apr. 2018. doi: 10.1145/3173574.3173921
[28] S. Spors, H. Wierstorf, A. Raake, F. Melchior, M. Frank, and F. Zotter.
Spatial Sound With Loudspeakers and Its Perception: A Review of the
Current State. Proceedings of the IEEE, 101(9):1920–1938, Sept. 2013.
doi: 10.1109/JPROC.
[29] K. Sunder, J. He, E. L. Tan, and W. Gan. Natural Sound Rendering for
Headphones: Integration of signal processing techniques. IEEE Signal
Processing Magazine, 32(2):100–113, Mar. 2015.
doi: 10.1109/MSP.2014.2372062
[30] S.-E. Wei, J. Saragih, T. Simon, A. W. Harley, S. Lombardi, M. Perdoch,
A. Hypes, D. Wang, H. Badino, and Y. Sheikh. VR facial animation
via multiview image translation. ACM Transactions on Graphics,
38(4):67:1–67:16, July 2019. doi: 10.1145/3306346.3323030
[31] N. Weinstein and T.-V. Nguyen. Motivation and preference in isolation:
a test of their different influences on responses to self-isolation during
the covid-19 outbreak. Royal Society Open Science, 7(5):200458, 2020.
[32] T. Xia and L. A. Frey Law. A theoretical approach for modeling
peripheral muscle fatigue and recovery. Journal of Biomechanics,
41(14):3046–3052, Oct. 2008. doi: 10.1016/j.jbiomech.2008.07.013
[33] H. Yong, J. Lee, and J. Choi. Emotion Recognition in Gamers Wearing
Head-mounted Display. In 2019 IEEE Conference on Virtual Reality
and 3D User Interfaces (VR), pp. 1251–1252, Mar. 2019.
doi: 10.1109/VR.2019.8797736
The performative installation DeviceD utilizes a network of systems toward facilitating interaction between dancer, digital media, and audience. Central to the work is a wearable haptic feedback system able to wirelessly deliver vibrotactile stimuli, with the latter initiated by the audience through posting on Twitter social media platform; the system in use searches for specific mentions, hashtags, and keywords, with positive results causing the system to trigger patterns of haptic biofeedback across the wearable’s four actuator motors. The system acts as the intermediator between the audience’s online actions and the dancer receiving physical stimuli; the dancer interprets these biofeedback signals according to Laban’s Effort movement qualities, with the interpretation informing different states of habitual and conscious choreographic performance. In this article, the authors reflect on their collaborative process while developing DeviceD alongside a multidisciplinary team of technologists, detailing their experience of refining the technology and methodology behind the work while presenting it in three different settings. A literature review is used to situate the work among contemporary research on interaction over internet and haptics in performance practice; haptic feedback devices have been widely used within artistic work for the past 25 years, with more recent practice and research outputs suggesting an increased interest for haptics in the field of dance research. The authors detail both technological and performative elements making up the work, and provide a transparent evaluation of the system, as means of providing a foundation for further research on wearable haptic devices.
This multi-wave study examined the extent to which both preference and motivation for time alone shape ill-being during self-isolation. Individuals in the USA and the UK are self-isolating in response to the COVID-19 outbreak. Different motivations may drive their self-isolation: some might see value in it (understood as the identified form of autonomous motivation), while others might feel forced into it by authorities or close others (family, friends, neighbourhoods, doctors; the external form of controlled motivation). People who typically prefer company will find themselves spending more time alone, and may experience ill-being uniformly, or as a function of their identified or external motivations for self-isolation. Self-isolation, therefore, offers a unique opportunity to distinguish two constructs coming from disparate literatures. This project examined preference and motivation (identified and external) for solitude, and tested their independent and interacting contributions to ill-being (loneliness, depression and anxiety during the time spent alone) across two weeks. Confirmatory hypotheses regarding preference and motivation were not supported by the data. A statistically significant effect of controlled motivation on change in ill-being was observed one week later, and preference predicted ill-being across two weeks. However, effect sizes for both were below our minimum threshold of interest.
We present a compact broadband linear actuator, Chasm, that renders expressive haptic feedback on wearable and handheld devices. Unlike typical motor-based haptic devices with integrated gearheads, Chasm utilizes a miniature leadscrew coupled to a motor shaft, thereby directly translating the high-speed rotation of the motor to the linear motion of a nut carriage without an additional transmission. Due to this simplicity, Chasm can render low-frequency skin-stretch and high-frequency vibrations, simultaneously and independently. We present the design of the actuator assembly and validate its electromechanical and perceptual performance. We then explore use cases and show design solutions for embedding Chasm in device prototypes. Finally, we report investigations with Chasm in two VR embodiments, i.e., in a headgear band to induce locomotion cues and in a handheld pointer to enhance dynamic manual interactions. Our explorations show wide use for Chasm in enhancing user interactions and experience in virtual and augmented settings. (Video available here:
Preparing datasets for use in the training of real-time face tracking algorithms for HMDs is costly. Manually annotated facial landmarks are accessible for regular photography datasets, but introspectively mounted cameras for VR face tracking have incompatible requirements with these existing datasets. Such requirements include operating ergonomically at close range with wide angle lenses, low-latency short exposures, and near infrared sensors. In order to train a suitable face solver without the costs of producing new training data, we automatically repurpose an existing landmark dataset to these specialist HMD camera intrinsics with a radial warp reprojection. Our method separates training into local regions of the source photos, i.e., mouth and eyes for more accurate local correspondence to the mounted camera locations underneath and inside the fully functioning HMD. We combine per-camera solved landmarks to yield a live animated avatar driven from the user’s face expressions. Critical robustness is achieved with measures for mouth region segmentation, blink detection and pupil tracking. We quantify results against the unprocessed training dataset and provide empirical comparisons with commercial face trackers.
A key promise of Virtual Reality (VR) is the possibility of remote social interaction that is more immersive than any prior telecommunication media. However, existing social VR experiences are mediated by inauthentic digital representations of the user (i.e., stylized avatars). These stylized representations have limited the adoption of social VR applications in precisely those cases where immersion is most necessary (e.g., professional interactions and intimate conversations). In this work, we present a bidirectional system that can animate avatar heads of both users' full likeness using consumer-friendly headset mounted cameras (HMC). There are two main challenges in doing this: unaccommodating camera views and the image-to-avatar domain gap. We address both challenges by leveraging constraints imposed by multiview geometry to establish precise image-to-avatar correspondence, which are then used to learn an end-to-end model for real-time tracking. We present designs for a training HMC, aimed at data-collection and model building, and a tracking HMC for use during interactions in VR. Correspondence between the avatar and the HMC-acquired images are automatically found through self-supervised multiview image translation, which does not require manual annotation or one-to-one correspondence between domains. We evaluate the system on a variety of users and demonstrate significant improvements over prior work.
Using joint actuators to drive the skeletal movements is a common practice in character animation, but the resultant torque patterns are often unnatural or infeasible for real humans to achieve. On the other hand, physiologically-based models explicitly simulate muscles and tendons and thus produce more human-like movements and torque patterns. This paper introduces a technique to transform an optimal control problem formulated in the muscle-actuation space to an equivalent problem in the joint-actuation space, such that the solutions to both problems have the same optimal value. By solving the equivalent problem in the joint-actuation space, we can generate human-like motions comparable to those generated by musculotendon models, while retaining the benefit of simple modeling and fast computation offered by joint-actuation models. Our method transforms constant bounds on muscle activations to nonlinear, state-dependent torque limits in the joint-actuation space. In addition, the metabolic energy function on muscle activations is transformed to a nonlinear function of joint torques, joint configuration and joint velocity. Our technique can also benefit policy optimization using a deep reinforcement learning approach, by providing a more anatomically realistic action space for the agent to explore during the learning process. We take advantage of the physiologically-based simulator, OpenSim, to provide training data for learning the torque limits and the metabolic energy function. Once trained, the same torque limits and the energy function can be applied to drastically different motor tasks formulated as either trajectory optimization or policy learning.