
A Literature Review on Collaboration in Mixed Reality


Philipp Ladwig and Christian Geiger
University of Applied Sciences, 40476 Düsseldorf, Germany
Abstract. Mixed Reality is defined as a combination of Reality, Aug-
mented Reality, Augmented Virtuality and Virtual Reality. This inno-
vative technology can aid with the transition between these stages. The
enhancement of reality with synthetic images allows us to perform tasks
more easily, such as the collaboration between people who are at differ-
ent locations. Collaborative manufacturing, assembly tasks or education
can be conducted remotely, even if the collaborators do not physically
meet. This paper reviews both past and recent research, identifies ben-
efits and limitations, and extracts design guidelines for the creation of
collaborative Mixed Reality applications in technical settings.
1 Introduction
With the advent of affordable tracking and display technologies, Mixed Reality
(MR) has recently gained increased media attention and has ignited the imag-
inations of many prospective users. Considering the progress of research and
enhancement of electronics over recent years, we inevitably will move closer to
the ultimate device which will make it difficult to distinguish between the virtual
world and reality. Star Trek’s Holodeck can be considered as an ultimate display
in which even death can take place. Such a system would provide realistic and
complete embodied experiences incorporating human senses including haptic,
sound or even smell and taste. If it were possible to send this information over
a network and recreate it at another place, this would allow for collaboration as
if the other person were physically at the place where the help is needed.
As of today, technology has not yet been developed to the level of Star Trek’s
Holodeck. Olson and Olson [24, 25] summarized that our technology is not yet
mature enough, and that “distance matters” for remote collaboration. Yet many
institutes and companies have branches at different locations, which implies that
experts in different technical fields are often distributed around a country or
even around the world. The foundation of a company lies in the expertise
of its employees, and in order to be successful, it is critical that the company
or institute shares and exchanges knowledge among colleagues and customers.
Remote collaboration is possible via tools such as Skype, Dropbox or Evernote,
but these forms of remote collaboration usually consist of “downgraded packets
of communication” such as text, images or video. However, machines, assembly
tasks and 3D CAD data are becoming increasingly complex. Exchanging 3D
data is possible, but interacting remotely in real time on real or virtual spatial
data is still difficult [24, 25]. This is where MR comes into play: it has the
potential to ease many of the problems of today’s remote collaboration.
Fig. 1. Reality-Virtuality Continuum by Milgram and Kishino [19].
Milgram and Kishino [19] defined the Reality-Virtuality Continuum, as
depicted in Fig. 1, which distinguishes between four different stages: Reality is the
perception of the real environment without any technology. Augmented Reality
(AR) overlays virtual objects and supplemental information onto the real world.
An example of an AR device is the Microsoft HoloLens. Augmented Virtuality (AV)
captures real objects and superimposes them onto a virtual scene. A video of a
real person, shown in a virtual environment, is an example of AV. Virtual
Reality (VR) entirely eliminates the real world and shows only computer-generated
graphics. Head Mounted Displays (HMD) such as the HTC Vive or Oculus Rift
are current examples of VR devices. This paper focuses on Mixed Reality, which
is defined by Milgram and Kishino as a blend between AR and AV technology.
In the last three decades, research has shown a large number of use cases for
collaboration in MR: Supporting assembly tasks over the Internet [2, 3, 7, 23, 34],
conducting design reviews of a car by experts who are distributed geographically
[12, 22] and the remote investigation of a crime scene [5, 30] are only a few
examples of collaborative applications in MR. The domain of Remote
Engineering and Virtual Instrumentation in particular can benefit from remote
guidance in MR. For example, many specialized, expensive and recent machines
can only be maintained by highly qualified staff, who are often not available
on site when a machine becomes inoperative.
Furthermore, remote education could help to prevent such emergencies
and to spread specialized knowledge more easily.
The following sections chronologically describe the progress of research over
recent decades. A predominant scenario can be observed in user studies: A
remote user helps a local user to complete a task. Although different authors use
different terms for the participants of a remote session, we will use the
abbreviations RU for remote user and LU for local user.
2 Research until the year 2012
A basic function of collaboration in every study examined in this paper is the
bidirectional transmission of speech. Every application uses speech as a foundation
for communication. However, language can be ambiguous or vague if it describes
spatial locations and actions in space. Collaborative task performance increases
significantly when speech is combined with physical pointing, as Heiser et al.
[8] state. Some of the first collaborative systems that used MR were
video-mediated applications as presented by Ishii et al. [9, 10]. A video camera, which
was mounted above the participant’s workplace, captured the work on the table
and transmitted it to other meeting participants on a monitor. A similar system
was developed by Kirk and Fraser [11]. They conducted a user study in which
the participants had to perform a Lego assembly task. They found that AR not
only speeds up the collaboration task, but also makes it easier for the participants
(in terms of time and errors) to recall the construction steps in a self-assembly
task 24 hours later when they were supported by MR technology instead of only
listening to voice commands.
Baird and Barfield [2] and Tang et al. [34] show that AR reduces the
mental workload for assembly tasks. Billinghurst and Kato [4] reviewed the state
of research on collaborative MR in the late 1990s and concluded that there are
promising applications and ideas, but that they only scratch the surface
of the possibilities. It must be further determined in which areas MR can be
effectively used. Furthermore, Billinghurst and Kato mention that the traditional
WIMP interface (Windows-Icons-Menus-Pointer) is not appropriate for such a
platform and must be reinvented for MR.
Klinker et al. [12] created the system Fata Morgana, which allows for
collaborative design reviews of cars and is capable of focusing on details as well as
comparing different designs.
Monahan, McArdle and Bertolotto [20] emphasize the potential of
Gamification for educational purposes: “Computer games have always been successful at
capturing peoples imagination, the most popular of which utilize an immersive
3D environment where gamers take on the role of a character.” [20] Li, Yue and
Jauregui [14] developed a VR e-Learning system and summarize that virtual
“e-Learning environments can maintain students interest and keep them engaged
and motivated in their learning.” [14]
Gurevich, Lanir and Cohen [7] developed a remote-controlled robot with
wheels, named TeleAdvisor, which carries a camera and projector on a movable
arm. The RU sees the camera image, can remotely adjust the position of the
robot and its arm from a desktop PC and is able to project drawings and
visual cues onto a surface with the projector. A robot that carries a camera
has the advantage of delivering a steady image to the RU, whereas a camera
worn on the LU’s head leads to jittery recordings, which can cause discomfort for the
RU. Furthermore, a system controlled by the RU allows mobility and flexibility and
eases the cognitive overhead for the LU, since the LU does not need to maintain
the Point-of-View (PoV) for the RU.
To summarize this section, the transmission of information was often
restricted until the year 2012 due to limited sensors, displays, network bandwidth
and processing power. Many systems relied on video transfer and were not capable
of transmitting the sense of “being there”, which restricts the mutual
understanding of the problem and the awareness of spatial information.
3 New Technology introduces a sustainable change
After the year 2012, more data became available for MR collaboration due to
new technology. The acquisition and triangulation of 3D point clouds of the
environment became affordable and feasible in real time. Better understanding
of the environment results in more robust tracking of MR devices. Furthermore,
display technology was enhanced and enabled the development of inexpensive
HMDs. Tecchia, Alem and Huang [35] created one of the first systems that
records the workplace as well as the arms and hands of the RU and LU with
a 3D camera and lets users enter the triangulated and textured virtual
scene with a head-tracked HMD. The system revealed improvements in
performance over a 2D-based gesture system. Sodhi et al. [31] combined the
Microsoft Kinect and a short-range depth sensor, achieved 3D reconstruction
of a desktop-sized workplace and implemented the transmission of a hand avatar to
the remote participant. Instead of a simple pointing ray, a hand avatar allows for
the execution of more complex gestures and therefore delivers more information
to the participants, creating a better mutual understanding.
Moreover, the system by Sodhi et al. [31] is capable of recognizing real
surfaces. Understanding surfaces of the real environment allows for realistic physical
interactions such as the collision of the hand avatar with real objects such as a table.
If the positions of real surfaces are available within the virtual world, snapping
virtual objects to real surfaces is possible as well. This decreases the
time needed to place virtual objects such as furniture or assembly parts in the scene.
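The snapping of virtual objects to real surfaces can be illustrated with a minimal sketch. It assumes that each detected surface is available as a plane given by a point and a unit normal, and it projects an object's position onto the nearest plane once the object comes within a threshold distance. The names Plane, signed_distance and snap_to_surface, as well as the 5 cm default threshold, are hypothetical and are not taken from any of the systems cited above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Plane:
    """A detected real surface (hypothetical type): a point on it and a unit normal."""
    point: Vec3
    normal: Vec3  # assumed to have unit length


def signed_distance(position: Vec3, plane: Plane) -> float:
    """Signed distance from position to the plane, measured along the plane normal."""
    return sum((p - q) * n for p, q, n in zip(position, plane.point, plane.normal))


def snap_to_surface(position: Vec3, planes: Sequence[Plane],
                    threshold: float = 0.05) -> Vec3:
    """Project position onto the nearest plane that is at most `threshold` away.

    If no plane is close enough, the position is returned unchanged.
    """
    best_plane: Optional[Plane] = None
    best_distance = 0.0
    for plane in planes:
        distance = signed_distance(position, plane)
        is_closer = best_plane is None or abs(distance) < abs(best_distance)
        if abs(distance) <= threshold and is_closer:
            best_plane, best_distance = plane, distance
    if best_plane is None:
        return position
    # Move the point back along the normal so that it lies on the plane.
    return tuple(p - best_distance * n for p, n in zip(position, best_plane.normal))
```

For a table top at a height of 0.75 m, an object released 3 cm above the table would be placed on the table surface, while an object released 25 cm above it would stay where it is. A real system would additionally have to respect the finite extent of the surface and the orientation of the object, both of which are omitted here.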
If the environment is available as a textured 3D geometry, it can be freely
explored by the RU. Tait and Billinghurst [33] created a system which incorpo-
rates a textured 3D scan of a workplace. It allows the RU to explore the scene
with keyboard and mouse on a monoscopic monitor and allows the selection
of spatial annotations. It was found that increasing view independence (a fully
independent view vs. fixed or frozen views of the scene) leads to faster
completion of collaborative tasks and a decrease in time spent on communication
during the task. Lanir et al. [1] found similar results and explain: “A
remote assistance task is not symmetrical. The helper (RU) usually has most
of the knowledge on how to complete the task, while the worker (LU) has the
physical hands and tools as well as a better overall view of the environment.
Ownership of the PoV (Point-of-View), therefore, does not need to be
symmetrical either. It seems that for helper-driven (RU-driven) construction tasks there
is more benefit in providing control (of the PoV) to the helper (the RU)” [1].
Oda et al. [23] use Virtual Replicas for assembly tasks. A Virtual Replica is a
virtual copy of a real, tracked assembly part. It exists in real life for the
LU and is rendered as a 3D model in VR for the RU. The position of the virtual
model is constantly synchronized with the real environment. Many assembly
parts of machines have complex forms, and in some cases it is difficult for the LU
to follow the instructions of the RU in order to achieve the correct rotation and
placement of such complex objects. Therefore, Virtual Replicas, controlled by the
RU, can be superimposed in AR for the LU, which eases the mental workload of
the task. Oda et al. found that simply demonstrating how to physically
align the Virtual Replica with another machine part is faster than making
spatial annotations on the Virtual Replicas as visual guidance to ease placement
for the LU. Like Sodhi et al., Oda et al. employ physical constraints such as
snapping of objects to speed up the task.
Poelman et al. [30] developed a system which is also capable of building a 3D
map of the environment in real time and which focuses on tackling
issues in remote collaborative crime scene investigation. Datcu et al. [5] use the
system of Poelman et al. and show that MR supports the Situational Awareness of
the RU. Situational Awareness is defined as the perception of a given situation,
its comprehension and the prediction of its future state, as described by Endsley [6].
Pejsa et al. [26] created a life-size, AR-based telepresence projection system
which employs the Microsoft Kinect 2 for capturing the remote scene and recreates
it with the aid of a projector on the other participant’s side. A benefit of such
a system is that nonverbal communication cues, such as facial expressions, can
be better perceived compared to systems where the participants wear HMDs,
which cover parts of the face.
Müller et al. [21] state that the completion time of remote collaborative
tasks, such as finding certain virtual objects in a virtual room, benefits from
simple Shared Virtual Landmarks. Shared Virtual Landmarks are objects,
such as virtual furniture, which help users to understand deictic expressions such as
“under the ceiling lamp” or “behind the floating cube”.
Piumsomboon et al. [28, 29] developed a system which combines AR and VR.
The system scans and textures a real room with a Microsoft HoloLens and shares
the copy of the real environment with a remote user, who can enter this copy with
an HTC Vive. The hands, fingers, head gaze, eye gaze and Field-of-View (FoV)
are tracked and visualized for both users. Piumsomboon et al. reveal that
rendering the eye gaze and FoV as additional awareness cues in collaborative
tasks can decrease the physical load (measured as the distance traveled by users)
and make the task easier (as subjectively rated by the users). Furthermore, the
system of Piumsomboon et al. offers different scalings of the virtual environment.
Shrinking the virtual copy of the real environment allows for better orientation
and path planning with the help of a miniature model in the user’s hand, similar
to the approach shown by Stoakley, Conway and Pausch [32].
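An FoV awareness cue of the kind described above requires a test of whether an object lies inside a collaborator's view. The following minimal sketch approximates the FoV as a cone around the gaze direction, which is a simplifying assumption, since real devices have a rectangular view frustum. The function name is_in_field_of_view and the default opening angle of 40 degrees are hypothetical and are not taken from the cited systems.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def is_in_field_of_view(head_position: Vec3, gaze_direction: Vec3,
                        target_position: Vec3, fov_degrees: float = 40.0) -> bool:
    """Return True if the target lies inside a cone of full angle fov_degrees
    whose apex is the head position and whose axis is the gaze direction."""
    to_target = tuple(t - h for t, h in zip(target_position, head_position))
    target_distance = math.sqrt(sum(c * c for c in to_target))
    gaze_length = math.sqrt(sum(c * c for c in gaze_direction))
    if gaze_length == 0.0:
        raise ValueError("gaze_direction must not be the zero vector")
    if target_distance == 0.0:
        return True  # the target coincides with the head position
    dot = sum(a * b for a, b in zip(to_target, gaze_direction))
    # Clamp to [-1, 1] to guard against rounding errors before calling acos.
    cos_angle = max(-1.0, min(1.0, dot / (target_distance * gaze_length)))
    angle = math.degrees(math.acos(cos_angle))
    return angle <= fov_degrees / 2.0
```

A collaborative application could evaluate such a test for every shared object and, for example, highlight the objects that the partner currently cannot see.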
In summary, since technology has become advanced enough to scan and
understand the surfaces of the environment in real time, important enhancements for
collaboration tasks have been achieved and shown to be important for efficient remote
work. 3D reconstruction of the participants’ body parts and the environment
allows for 1) better spatial understanding of the remote location (free PoV),
2) better communication through the transmission of nonverbal cues (gaze,
gestures) and 3) the combination of real surfaces with virtual objects
(virtual collision, snapping). Furthermore, the 3D reconstruction of the
environment implies a better understanding of the environment which, in turn, leads to
4) more robust tracking of devices (phones, tablets, HMDs, Virtual Replicas).
Finally, 5) new display technologies enable more immersive experiences, which lead
to better spatial understanding and problem awareness for both users.
Fig. 2. a) View of a third collaborator through his HoloLens: Users design a sailing ship
in a local collaboration scenario. One user is immersed with a VR HMD (HTC Vive)
while his collaborators use AR devices (HoloLens). b) VR view of the Vive user:
The sailing ship in the middle and the Vive controller at the bottom can be seen.
4 Insights from a Development of a Collaborative Mixed
Reality Application
We have developed an application in order to apply recent research outcomes,
and we want to share the lessons we learned from combining two tracking systems. Our
application is an immersive 3D mesh modeling tool which we have developed and
evaluated previously [13]. Our tool allows creating 3D meshes with the aid of
an HMD and two 6-Degree-of-Freedom controllers and is inspired by common
desktop modeling applications such as Blender and Autodesk Maya. We have
extended our system with client-server communication, which enables users
with different MR devices to join a modeling session. Our tool can simulate how
colleagues can collaboratively develop, review and discuss ideas, machine parts
or designs.
It was created with the intent to be as flexible as possible. This includes: First,
the users are free to choose an AR or VR device such as the HTC Vive or Microsoft
HoloLens. Second, the user can work with real objects, virtual replicas or entirely
virtual items. Third, the system can be used locally in the same room,
as depicted in Figure 2a, or remotely at different places.
A use case demonstrates how our system works and gives insights into connecting
and merging two different MR systems: An LU, using an HTC Vive, starts the
modeling application and hosts a session. An RU scans a fiducial marker with his
HoloLens in order to join the session. The marker has two purposes. First, it
contains a QR code with connection details such as the IP address of the server.
Second, it represents the origin of the tracking space of the remote Vive system.
This allows the HoloLens user to place the virtual content of the server (the content
of the HTC Vive side) anywhere in his real environment. Additionally, this
approach also enables the users to synchronize the tracking spaces in the same
room by placing the marker on the origin of the Vive tracking system, as shown
in Figure 2a.
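The role of the marker as the shared origin can be illustrated with a minimal sketch of the coordinate conversion. It is not the implementation of our tool. It assumes that the HoloLens reports the pose of the marker in its own world frame, that the marker lies flat so that its orientation reduces to a rotation about the vertical axis, and that both systems use the same units and handedness. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Pose = List[List[float]]  # 4x4 row-major homogeneous transform


def marker_pose(position: Vec3, yaw_degrees: float) -> Pose:
    """Pose of the marker in the HoloLens world frame (assumption: the marker
    lies flat, so its rotation is a yaw about the vertical y axis)."""
    c = math.cos(math.radians(yaw_degrees))
    s = math.sin(math.radians(yaw_degrees))
    x, y, z = position
    return [
        [c, 0.0, s, x],
        [0.0, 1.0, 0.0, y],
        [-s, 0.0, c, z],
        [0.0, 0.0, 0.0, 1.0],
    ]


def vive_to_hololens(pose: Pose, point: Vec3) -> Vec3:
    """Map a point from Vive tracking space (origin at the marker) to HoloLens space."""
    px, py, pz = point
    return tuple(row[0] * px + row[1] * py + row[2] * pz + row[3] for row in pose[:3])


def hololens_to_vive(pose: Pose, point: Vec3) -> Vec3:
    """Inverse mapping: subtract the translation, then apply the transposed rotation."""
    d = tuple(point[i] - pose[i][3] for i in range(3))
    return tuple(sum(pose[j][i] * d[j] for j in range(3)) for i in range(3))
```

With such a conversion, content authored on the Vive side can be rendered relative to the marker by the HoloLens, and the pose of the HoloLens user can be sent back in Vive coordinates. A complete system would use the full 6-Degree-of-Freedom marker orientation instead of a single yaw angle.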
Our first tests showed that we can successfully merge two different tracking
systems, such as those of the HTC Vive and the HoloLens, but we experienced some
issues: The tracking system of the Vive interferes with the tracking system of the
HoloLens as soon as the users come within one meter of each other. This
led to tracking errors for the HTC Vive. Furthermore, the
HoloLens’ processing power is limited due to its relatively low technical specifications
compared to a workstation, which limits the complexity of the rendered scene.
Moreover, we found that even the local network connection in our
collaboration scenario in the same room introduces noticeable delays which
could interfere with natural interaction, nonverbal cues and gestures.
5 Research Agenda, Technology Trends and Outlook
This paper has shown examples of remote collaboration which prove the per-
formance and potential of MR. Although important enhancements and research
results have been discovered in recent years, we still have a long way to go until
we have achieved the ultimate display for collaboration - Star Trek’s Holodeck.
A major concern of research, which up to this point has been scarcely
investigated, is the collaboration between multiple teams. Past research
has focused mainly on collaboration between two persons, but how to
exchange complex data and interact between multiple groups has yet to be
researched further. Lukosch et al. [17] have taken the first steps in this direction but
stated that further research is necessary. Piirainen, Kolfschoten and Lukosch [27]
mention that a difficulty of collaborative remote work in teams is developing a
consensus about the nature of the problem and its specification. Situational
Awareness cues and Team Awareness cues need to be outlined.
Another important point on the agenda is how to direct the users’ focus
to certain events and parts of the environment. Awareness cues are in general an
ongoing topic of research and must be investigated further. Müller, Rädle and
Reiterer [21] ascertain that a technique is needed to bring events, collaborators or
objects that are not in the field of view into the users’ focus. Pejsa et al. [26] and
Masai et al. [18] emphasize the importance of nonverbal communication cues such
as facial expression, posture and proxemics, which are important contributors to
empathy, but these cues are still difficult to transmit with today’s hardware.
A relatively rarely investigated field of research is comfort in MR, though it is
an important area for the usage of an application over a long period of time. Up
to this point, a real use case could look like this: A worker conducts a demanding
assembly task for hours on an expensive machine under remote guidance. But the
weight of the HMD, the usability of the application and the fatigue in his arms
from making gestures for interacting with the device lead to growing frustration
for the worker, which leads, in turn, to assembly errors. Piirainen et al. [27]
advise not to underestimate user needs and human factors: “From a practical
perspective the challenges show that the usability of systems is a key.” Today, a
general problem and consideration for every MR application is comfort for the
user. Only a few years ago, VR and AR hardware was bulky and
heavy, so research on comfort would have been largely in vain. Research in
MR is mainly focused on technical feasibility and compares productivity between
non-MR and MR applications. However, comfort and usability are important if
long-term use is required, yet research on comfort is scarce. Ladwig,
Herder and Geiger [13] consider and evaluate comfort for an MR application. Lubos
et al. [15] revealed important outcomes for comfortable interaction and took first
steps in this direction.
Moreover, perceiving virtual haptics is a largely unresolved problem in MR,
and researchers try to substitute it with the aid of constraints such as virtual
collisions and snapping, as Oda et al. show [23]. Furthermore, Lukosch et al. [16]
and Billinghurst and Kato [4] mention that further research is needed on which
particular tasks can be effectively solved and managed with MR.
Better tracking technologies, faster networks, enhanced sensors and faster
processing will move us to the Holodeck and maybe even beyond. Further areas of
research will arise with the advent of new technologies such as machine learning
for object detection and recognition. MR devices of the future will not only
recognize surfaces of the environment, but also detect objects such as machine
parts, tools and humans.
6 Design Guidelines
Past research and our lessons learned revealed many issues, which can be
condensed into design guidelines for the development of MR applications:
Provide as much information about the remote environment as possible.
Video is a minimum requirement. A 3D mesh of the environment is better
[5, 23, 28–31]. A 3D mesh updated in real time seems to be the best case [35].
Provide an independent PoV for investigating the remote scenery. It
allows better spatial perception and problem understanding [1, 28, 29, 33, 35].
Provide as many Awareness Cues as possible. Transmitting speech is
fundamental. Information about the posture of collaborators such as head position, head
gaze, eye gaze and FoV [28, 29] is beneficial. For pointing by hand, a virtual ray is
sufficient, but a static hand model [31] or even a fully tracked hand model is
better and conveys more information such as natural gestures [28, 29]. Provide
cues for events happening outside the FoV of the users and provide Shared Virtual
Landmarks [21]. To avoid cluttering the users’ view, allow awareness cues to
be turned on and off [21].
Consider usability and comfort. If long-term usage is desired, provide a
comfortable interface for the user and consider human factors [13, 15, 27].
References
1. Ownership and control of point of view in remote assistance. p. 2243. ACM Press,
2. K. M. Baird and W. Barfield. Evaluating the effectiveness of augmented reality
displays for a manual assembly task. Virtual Reality, 4(4):250–259, 1999.
3. M. Billinghurst, A. Clark, and G. Lee. A Survey of Augmented Reality.
Foundations and Trends in Human-Computer Interaction, 8(2-3):73–272,
4. M. Billinghurst and H. Kato. Collaborative Mixed Reality. In Mixed Reality, pp.
261–284. Springer Berlin Heidelberg, 1999.
5. D. Datcu, M. Cidota, H. Lukosch, and S. Lukosch. On the usability of augmented
reality for information exchange in teams from the security domain. In Proceedings
- 2014 IEEE Joint Intelligence and Security Informatics Conference, JISIC 2014,
pp. 160–167. IEEE, 2014.
6. M. R. Endsley. Toward a Theory of Situation Awareness in Dynamic Systems.
Human Factors: The Journal of the Human Factors and Ergonomics Society,
37(1):32–64, 1995.
7. P. Gurevich, J. Lanir, and B. Cohen. Design and Implementation of TeleAdvisor: a
Projection-Based Augmented Reality System for Remote Collaboration. Computer
Supported Cooperative Work (CSCW), 24(6):527–562, 2015.
8. J. Heiser, B. Tversky, and M. I. A. Silverman. Sketches for and from collaboration.
Visual and Spatial Reasoning in Design III, pp. 69–78, 2004.
9. H. Ishii, M. Kobayashi, and J. Grudin. Integration of inter-personal space and
shared workspace. In Proceedings of the 1992 ACM conference on Computer-
supported cooperative work - CSCW ’92, pp. 33–42. ACM Press, 1992.
10. H. Ishii and N. Miyake. Toward an open shared workspace: computer and video
fusion approach of TeamWorkStation. ACM, 34(12):37–50, 1991.
11. D. Kirk and D. Fraser. The effects of remote gesturing on distance instruction.
Lawrence Erlbaum Associates, 2005.
12. G. Klinker, A. H. Dutoit, M. Bauer, J. Bayer, V. Novak, and D. Matzke. Fata Mor-
gana - A presentation system for product design. In Proceedings - International
Symposium on Mixed and Augmented Reality, ISMAR 2002, pp. 76–85. IEEE Com-
put. Soc, 2002.
13. P. Ladwig, J. Herder, and C. Geiger. Towards Precise, Fast and Comfortable
Immersive Polygon Mesh Modelling. In ICAT-EGVE 2017 - International Confer-
ence on Artificial Reality and Telexistence and Eurographics Symposium on Virtual
Environments. The Eurographics Association, 2017.
14. Z. Li, J. Yue, and D. A. G. Jauregui. A new virtual reality environment used for e-
Learning. In 2009 IEEE International Symposium on IT in Medicine & Education,
pp. 445–449. IEEE, 2009.
15. P. Lubos, G. Bruder, O. Ariza, and F. Steinicke. Touching the Sphere: Leveraging
Joint-Centered Kinespheres for Spatial User Interaction. In Proceedings of the 2016
Symposium on Spatial User Interaction, SUI ’16, pp. 13–22. ACM, 2016.
16. S. Lukosch, M. Billinghurst, L. Alem, and K. Kiyokawa. Collaboration in Aug-
mented Reality. Computer Supported Cooperative Work: CSCW: An International
Journal, 24(6):515–525, 2015.
17. S. Lukosch, H. Lukosch, D. Datcu, and M. Cidota. Providing Information on the
Spot: Using Augmented Reality for Situational Awareness in the Security Domain.
Computer Supported Cooperative Work (CSCW), 24(6):613–664, 2015.
18. K. Masai, K. Kunze, M. Sugimoto, and M. Billinghurst. Empathy Glasses. In
Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in
Computing Systems - CHI EA ’16, pp. 1257–1263. ACM Press, New York, New
York, USA, 2016.
19. P. Milgram and F. Kishino. A Taxonomy of Mixed Reality Visual Displays. IEICE
Transactions on Information Systems, (12), 1994.
20. T. Monahan, G. McArdle, and M. Bertolotto. Virtual reality for collaborative
e-learning. Computers and Education, 50(4):1339–1353, 2008.
21. J. Müller, R. Rädle, and H. Reiterer. Remote Collaboration With Mixed Reality
Displays. In Proceedings of the 2017 CHI Conference on Human Factors in
Computing Systems - CHI ’17, pp. 6481–6486. ACM Press, 2017.
22. Nvidia. Nvidia Holodeck.
visualization/technologies/holodeck/ Accessed on 2018-01-25.
23. O. Oda, C. Elvezio, M. Sukan, S. Feiner, and B. Tversky. Virtual Replicas for
Remote Assistance in Virtual and Augmented Reality. In Proceedings of the 28th
Annual ACM Symposium on User Interface Software & Technology - UIST ’15,
pp. 405–415. ACM Press, 2015.
24. G. M. Olson and J. S. Olson. Distance Matters. Human-Computer Interaction,
15(2-3):139–178, 2000.
25. J. S. Olson and G. M. Olson. How to make distance work work. 2014.
26. T. Pejsa, J. Kantor, H. Benko, E. Ofek, and A. D. Wilson. Room2Room: Enabling
Life-Size Telepresence in a Projected Augmented Reality Environment. In Pro-
ceedings of the 19th ACM Conference on Computer-Supported Cooperative Work
& Social Computing - CSCW ’16, pp. 1714–1723. ACM Press, 2016.
27. K. A. Piirainen, G. L. Kolfschoten, and S. Lukosch. The Joint Struggle of Complex
Engineering: A Study of the Challenges of Collaborative Design. International
Journal of Information Technology & Decision Making, 11(6):1087–1125, 2012.
28. T. Piumsomboon, A. Day, B. Ens, Y. Lee, G. Lee, and M. Billinghurst. Exploring
Enhancements for Remote Mixed Reality Collaboration. pp. 1–5, 2017.
29. T. Piumsomboon, Y. Lee, G. Lee, and M. Billinghurst. CoVAR: a collaborative
virtual and augmented reality system for remote collaboration. In SIGGRAPH
Asia 2017 Emerging Technologies on - SA ’17, pp. 1–2. ACM Press, 2017.
30. R. Poelman, O. Akman, S. Lukosch, and P. Jonker. As if being there. Proceedings
of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW
’12, (5):1267, 2012.
31. R. S. Sodhi, B. R. Jones, D. Forsyth, B. P. Bailey, and G. Maciocci. BeThere: 3D
Mobile Collaboration with Spatial Input. Proceedings of the SIGCHI Conference
on Human Factors in Computing Systems - CHI ’13, pp. 179–188, 2013.
32. R. Stoakley, M. J. Conway, and R. Pausch. Virtual reality on a WIM. In Proceedings
of the SIGCHI conference on Human factors in computing systems - CHI ’95, pp.
265–272. ACM Press, New York, New York, USA, 1995.
33. M. Tait and M. Billinghurst. The Effect of View Independence in a Collaborative
AR System. Computer Supported Cooperative Work: CSCW: An International
Journal, 24(6):563–589, 2015.
34. A. Tang, C. Owen, F. Biocca, and W. Mou. Comparative effectiveness of aug-
mented reality in object assembly. In Proceedings of the conference on Human
factors in computing systems - CHI ’03, p. 73, 2003.
35. F. Tecchia, L. Alem, and W. Huang. 3D Helping Hands : a Gesture Based MR
System for Remote Collaboration. VRCAI - Virtual Reality Continuum and its
Applications in Industry, 1(212):323–328, 2012.
... These cues can be especially useful especially if a digital twin is employed. Given the realistic representations of 3D objects and possibilities to interact with them, a collaborative XR environment should deliver such awareness cues much more naturally than current state-of-art digital collaboration tools, and unlock exciting new possibilities, especially for remote teams (Bai et al., 2020;Ladwig and Geiger, 2018;Orts-Escolano et al., 2016). However, despite its promise, there are still many challenges attached to collaboration using XR. ...
Full-text available
Various forms of extended reality might empower remote collaboration in ways that the current de facto standards cannot facilitate. Especially when combined by a digital twin of the remote physical object, mixed reality (MR) opens up interesting new ways to support spatial communication. In this study, we explore the use of a digital twin to facilitate visuospatial communication in an expert-guided repair and maintenance operation scenario, supported by visual annotations. We developed two MR prototypes, one with a digital twin of the object of interest, and another where a first-person camera view was shown additionally. We tested these prototypes in a study with 19 participants (9 pairs) against a state-of-the art solution as a baseline and measured their usability, and obtained qualitative user feedback. Our findings suggest that digital twin supported mixed reality enriched with real time visual annotations can potentially improve remote collaboration tasks.
... Mixed and augmented reality finally occupy significant space in our daily routine. The achievement of several historical milestones, from routing [5] to entertainment [6], and from social media [7] to engineering and remote collaboration [8], showcases the promising future of AR and MR. ...
“Interaction” represents a critical term in the augmented and mixed reality ecosystem. Today, in mixed reality environments and applications, interaction occupies the joint space between any combination of humans, physical environment, and computers. Although interaction methods and techniques have been extensively examined in recent decades in the field of human-computer interaction, they still should be reidentified in the context of immersive realities. The latest technological advancements in sensors, processing power and technologies, including the internet of things and the 5G GSM network, led to innovative and advanced input methods and enforced computer environmental perception. For example, ubiquitous sensors under a high-speed GSM network may enhance mobile users’ interactions with physical or virtual objects. As technological advancements emerge, researchers create umbrella terms to define their work, such as multimodal, tangible, and collaborative interactions. However, although they serve their purpose, various naming trends overlap in terminology, diverge in definitions, and lack modality and conceptual framework classifications. This paper presents a modality-based interaction-oriented diagram for researchers to position their work and defines taxonomy ground rules to expand and adjust this diagram when novel interaction approaches emerge.
... Two main approaches to avatar representation have been presented in the past: i) 3D point cloud reconstruction based avatar representations of users using RGB-D sensing (e.g., [5,16,37,40,44]), ii) 3D virtual character based representations that are animated based on motion tracking of users (e.g., [2,7,49,53,57,64]). Further research proposed a combination of both techniques (e.g., [63,69]); for a review and summaries see [32,54]. ...
A 3D Telepresence system allows users to interact with each other in a virtual, mixed, or augmented reality (VR, MR, AR) environment, creating a shared space for collaboration and communication. There are two main methods for representing users within these 3D environments. Users can be represented either as point cloud reconstruction-based avatars that resemble a physical user or as virtual character-based avatars controlled by tracking the users' body motion. This work compares both techniques to identify the differences between user representations and their fit in the reconstructed environments regarding the perceived presence, uncanny valley factors, and behavior impression. Our study uses an asymmetric VR/AR teleconsultation system that allows a remote user to join a local scene using VR. The local user observes the remote user with an AR head-mounted display, leading to facial occlusions in the 3D reconstruction. Participants perform a warm-up interaction task followed by a goal-directed collaborative puzzle task, pursuing a common goal. The local user was represented either as a point cloud reconstruction or as a virtual character-based avatar, in which case the point cloud reconstruction of the local user was masked. Our results show that the point cloud reconstruction-based avatar was superior to the virtual character avatar regarding perceived co-presence, social presence, behavioral impression, and humanness. Further, we found that the task type partly affected the perception. The point cloud reconstruction-based approach led to higher usability ratings, while objective performance measures showed no significant difference. We conclude that despite partly missing facial information, the point cloud-based reconstruction resulted in better conveyance of the user behavior and a more coherent fit into the simulation context.
... Such collaborative environments can consist of purely virtual environments (e.g., [9,10,11,12]), augmented environments (e.g., [1,13]) or asymmetric combinations that merge virtual as well as augmented reality aspects (e.g., [14,15,16,8,17,18,19], see [20,21] for further systematic reviews). One of the applications of the latter class is telepresence and, more specifically, teleconsultation [2,18,22], in which two or more users, physically apart from each other, can interact and guide another through a specific procedure. ...
When users create hand-drawn annotations in Virtual Reality they often reach their physical limits in terms of precision, especially if the region to be annotated is small. One intuitive solution employs magnification beyond natural scale. However, scaling the whole environment results in wrong assumptions about the coherence between physical and virtual space. In this paper, we introduce Magnoramas, a novel interaction method for selecting and extracting a region of interest that the user can subsequently scale and transform inside the virtual space. Our technique enhances the user’s capabilities to perform supernaturally precise virtual annotations on virtual objects. We explored our technique in a user study within a simplified clinical scenario of a teleconsultation-supported craniectomy procedure that requires accurate annotations on a human head. Teleconsultation was performed asymmetrically between a remote expert in Virtual Reality who collaborated with a local user through Augmented Reality. The remote expert operates inside a reconstructed environment, captured from RGB-D sensors at the local site, and is embodied by an avatar to establish co-presence. The results show that Magnoramas significantly improve the precision of annotations while preserving usability and perceived presence measures compared to the baseline method. By hiding the 3D reconstruction while keeping the Magnorama, users can intentionally choose to lower their perceived social presence and focus on their tasks.
... The final great advantage that is brought by SAR is a more natural way for promoting multi-user collaborations with its ubiquitous projected displays. A successful collaborative medium design needs to advocate a more natural and efficient way to connect tasks and collaborators [56,70,12]. In OST and VST based AR, the tasks and work spaces are often cloned and shared to collaborators through networks in co-located collaborations, where the collaborators are in same space. ...
Accessibility is about creating interfaces and products that can be used by everyone and in all contexts, which is a key methodology in human-centered design. Upgrading and redesigning existing interfaces would require additional time, effort and cost, leading to practical difficulties. One solution is Spatial Augmented Reality (SAR), a specific kind of Augmented Reality (AR) approach that synthesizes digital elements on top of the physical environment by directly projecting computer-generated graphics onto the surfaces of physical objects. Over the past decade, an explosion of application-oriented SAR research has aimed to address the accessibility issues of existing user interfaces without renovating those interfaces permanently, though these efforts are highly disconnected and fragmented. To unify these existing efforts and inform future research paradigms, we paint the picture of methods and challenges of state-of-the-art SAR designs, with a particular focus on addressing accessibility issues of existing user interfaces. To this end, we discuss and unveil potential future research opportunities of SAR design research for solving these accessibility issues.
... These technologies give Computer-Supported Collaborative Work (CSCW) the possibility to integrate various elements into a shared world, including heterogeneous user interfaces, data structures, information models, and graphical representations of users themselves. For instance, several overviews on collaboration in MR can be found in [12,61]. The integration of multiple devices and interaction modalities has largely changed how users interact with data and with other users. ...
Recent trends in Extended Reality technologies, including Virtual Reality and Mixed Reality, indicate that the future infrastructure will be distributed and collaborative, where end-users as well as experts meet, communicate, learn, interact with each other, and coordinate their activities using a globally shared network and mediated environments. The integration of new display devices has largely changed how users interact with the system and how those activities, in turn, change their perception and experience. Although a considerable amount of research has already been done in the fields of computer-supported collaborative work, human-computer interaction, extended reality, cognitive psychology, perception, and social sciences, there is still no in-depth review to determine the current state of research on multiple-user-experience-centred design at the intersection of these domains. This paper aims to present an overview of research work on coexperience and analyses important aspects of human factors to be considered to enhance collaboration and user interaction in collaborative extended reality platforms, including: (i) presence-related factors, (ii) group dynamics and collaboration patterns, (iii) avatars and embodied agents, (iv) nonverbal communication, (v) group size, and (vi) awareness of physical and virtual world. Finally, this paper identifies research gaps and suggests key directions for future research considerations in this multidisciplinary research domain.
The 8th annual International Conference of the Immersive Learning Research Network (iLRN2022) was the first iLRN event to offer a hybrid experience, with two days of presentations and activities on the iLRN Virtual Campus (powered by ©Virbela), followed by three days on location at the FH University of Applied Sciences BFI in Vienna, Austria.
When two or more users attempt to collaborate in the same space with Augmented Reality, they often encounter conflicting intentions about occupying the same working area and positioning themselves around it without mutual interference. Augmented Reality is a powerful tool for communicating ideas and intentions during a co-assisting task that requires multi-disciplinary expertise. To relax the constraint of physical co-location, we propose the concept of Duplicated Reality, where a digital copy of a 3D region of interest of the users’ environment is reconstructed in real-time and visualized in-situ through an Augmented Reality user interface. This enables users to remotely annotate the region of interest while being co-located with others in Augmented Reality. We perform a user study to gain an in-depth understanding of the proposed method compared to an in-situ augmentation, including collaboration, effort, awareness, usability, and the quality of the task. The results indicate almost identical objective and subjective outcomes, except for a decrease in the consulting user’s awareness of co-located users when using our method. The added benefit from duplicating the working area into a designated consulting area opens up new interaction paradigms to be further investigated for future co-located Augmented Reality collaboration systems.
The haptic illusion literature suggests that individuals’ haptic experiences rely not only on tactile signals, but on visual and auditory stimulation as well. Given that mixed reality (MR) technologies often harness realistic visual and auditory but not haptic stimulation, it is important to understand how visual and auditory signals shape haptic experiences. We examined whether visual and auditory cues in MR can enhance tactile sensation, including perceptions of roughness and stiffness on virtual gadgets, in the absence of haptic signals. A laboratory experiment showed that tactile perception of mobile gadgets in motion was influenced by both visual and auditory cues when the motion of the gadget was generated by participants (i.e., active touch), but only by auditory cues when participants observed the gadget moving (i.e., passive touch). The results indicate that visual and auditory cues can be employed to facilitate haptic experiences of virtual objects in virtual reality, with auditory cues being potentially applicable to a broader context than visual cues.
Mixed Reality (MR) applications are widely considered to be effective educational tools. Yet, the use of MR alone cannot ensure learning, and studies even suggest that the affordances of this technology could decrease the mental processes required for the acquisition of new knowledge. As with any other technological innovation, the educational possibilities of MR are closely related to the design of its contents. Despite this, there are no design recommendations for MR focused on learning. Educational psychology presents a range of empirically proven design guidelines for multimedia learning environments. This chapter reviews existing guidelines and categorizes them into principles related to the perception of information and the related essential information processing (Design Principles) and principles aiming at promoting generative learning (Activating Principles). Finally, these principles are translated to MR learning environments.
In this paper, we explore techniques for enhancing remote Mixed Reality (MR) collaboration in terms of communication and interaction. We created CoVAR, an MR system for remote collaboration between Augmented Reality (AR) and Augmented Virtuality (AV) users. Awareness cues and an AV-Snap-to-AR interface were proposed for enhancing communication. Collaborative natural interaction and AV-User-Body-Scaling were implemented for enhancing interaction. We conducted an exploratory study examining the awareness cues and the collaborative gaze, and the results showed the benefits of the proposed techniques for enhancing communication and interaction.
We present CoVAR, a novel remote collaborative system combining Augmented Reality (AR), Virtual Reality (VR) and natural communication cues to create new types of collaboration. An AR user can capture and share their local environment with a remote user in VR to collaborate on spatial tasks in a shared space. CoVAR supports various interaction methods to enrich collaboration, including gestures, head gaze, and eye gaze input, and provides virtual cues to improve awareness of a remote collaborator. We also demonstrate collaborative enhancements such as scaling the VR user's body and snapping to the AR perspective.
HCI research has demonstrated Mixed Reality (MR) as being beneficial for co-located collaborative work. For remote collaboration, however, the collaborators' visual contexts do not coincide due to their individual physical environments. The problem becomes apparent when collaborators refer to physical landmarks in their individual environments to guide each other's attention. In an experimental study with 16 dyads, we investigated how the provisioning of shared virtual landmarks (SVLs) influences communication behavior and user experience. A quantitative analysis revealed that participants used significantly less ambiguous spatial expressions and reported an improved user experience when SVLs were provided. Based on these findings and a qualitative video analysis we provide implications for the design of MRs to facilitate remote collaboration.
In many complex tasks, a remote subject-matter expert may need to assist a local user to guide actions on objects in the local user's environment. However, effective spatial referencing and action demonstration in a remote physical environment can be challenging. We introduce two approaches that use Virtual Reality (VR) or Augmented Reality (AR) for the remote expert, and AR for the local user, each wearing a stereo head-worn display. Both approaches allow the expert to create and manipulate virtual replicas of physical objects in the local environment to refer to parts of those physical objects and to indicate actions on them. This can be especially useful for parts that are occluded or difficult to access. In one approach, the expert points in 3D to portions of virtual replicas to annotate them. In another approach, the expert demonstrates actions in 3D by manipulating virtual replicas, supported by constraints and annotations. We performed a user study of a 6DOF alignment task, a key operation in many physical task domains, comparing both approaches to an approach in which the expert uses a 2D tablet-based drawing system similar to ones developed for prior work on remote assistance. The study showed the 3D demonstration approach to be faster than the others. In addition, the 3D pointing approach was faster than the 2D tablet in the case of a highly trained expert.
For operational units in the security domain that work together in teams it is important to quickly and adequately exchange context-related information. This extended abstract investigates the potential of augmented reality (AR) techniques to facilitate information exchange and situational awareness of teams from the security domain. First, different scenarios from the security domain that have been elicited using an end-user oriented design approach are described. Second, a usability study is briefly presented based on an experiment with experts from operational security units. The results of the study show that the scenarios are well-defined and the AR environment can successfully support information exchange in teams operating in the security domain.
More than three decades of ongoing research in immersive modelling has revealed many advantages of creating objects in virtual environments. Even though there are many benefits, the potential of immersive modelling has only been partly exploited due to unresolved problems such as ergonomic problems, numerous challenges with user interaction and the inability to perform exact, fast and progressive refinements. This paper explores past research, shows alternative approaches and proposes novel interaction tools for pending problems. An immersive modelling application for polygon meshes is created from scratch and tested by professional users of desktop modelling tools, such as Autodesk Maya, in order to assess the efficiency, comfort and speed of the proposed application with direct comparison to professional desktop modelling tools.
Designing spatial user interfaces for virtual reality (VR) applications that are intuitive, comfortable and easy to use while at the same time providing high task performance is a challenging task. This challenge is even harder to solve since perception and action in immersive virtual environments differ significantly from the real world, causing natural user interfaces to elicit a dissociation of perceptual and motor space as well as levels of discomfort and fatigue unknown in the real world. In this paper, we present and evaluate a novel method to leverage joint-centered kinespheres for interactive spatial applications. We introduce kinespheres within arm's reach that envelop the reachable space for each joint such as shoulder, elbow or wrist, thus defining 3D interactive volumes with the boundaries given by 2D manifolds. We present a Fitts' Law experiment in which we evaluated the spatial touch performance on the inside and on the boundary of the main joint-centered kinespheres. Moreover, we present a confirmatory experiment in which we compared joint-centered interaction with traditional spatial head-centered menus. Finally, we discuss the advantages and limitations of placing interactive graphical elements relative to joint positions and, in particular, on the boundaries of kinespheres.
Room2Room is a life-size telepresence system that leverages projected augmented reality to enable co-present interaction between two remote participants. Our solution recreates the experience of a face-to-face conversation by performing 3D capture of the local user with color + depth cameras and projecting their virtual copy into the remote space at life-size scale. This creates an illusion of the remote person’s physical presence in the local space, as well as a shared understanding of verbal and non-verbal cues (e.g., gaze, pointing) as if they were there. In addition to the technical details of our two prototype implementations, we contribute strategies for projecting remote participants onto physically plausible seating or standing locations, such that they form a natural and consistent conversational formation with the local participant. We also present observations and feedback from an evaluation with 7 pairs of participants on the usability of our solution for solving a collaborative, physical task.
In this paper, we describe Empathy Glasses, a head worn prototype designed to create an empathic connection between remote collaborators. The main novelty of our system is that it is the first to combine the following technologies together: (1) wearable facial expression capture hardware, (2) eye tracking, (3) a head worn camera, and (4) a see-through head mounted display, with a focus on remote collaboration. Using the system, a local user can send their information and a view of their environment to a remote helper who can send back visual cues on the local user's see-through display to help them perform a real world task. A pilot user study was conducted to explore how effective the Empathy Glasses were at supporting remote collaboration. We describe the implications that can be drawn from this user study.