The Effectiveness of Multimodal Sensory Feedback on VR Users' Behavior in an L-Collision Problem
Sumin Kim¹, Krzysztof Izdebski³, and Peter König¹,²
¹ Institute of Cognitive Science, Universität Osnabrück, Osnabrück, Germany
sumkim@uni-osnabrueck.de
² Institut für Neurophysiologie und Pathophysiologie, Universitätsklinikum Hamburg-Eppendorf, Hamburg, Germany
³ SALT AND PEPPER Software GmbH & Co. KG, Osnabrück, Germany
Abstract. Virtual Reality (VR) offers multimodal channels for sensory feedback, yet applications depend heavily on visual information. In this study, we compared the effectiveness of different sensory modalities in the context of collision avoidance in an industrial manufacturing process. Participants performed a pick-and-place task with L-shaped objects on a virtual workstation. In a between-subjects design, each participant performed under one of four conditions: baseline, auditory, haptic, or visual. We measured the timing and accuracy of the performed actions. An ANOVA showed a significant main effect, i.e. a difference between the conditions. We observed the fewest collisions in the auditory condition, followed by the haptic, baseline, and visual conditions. Post-hoc tests revealed a significant difference between the auditory condition, the most accurate, and the visual condition, the least accurate. This implies that providing additional feedback through the visual modality is not optimal and that a fully multimodal interface increases effectiveness.
Keywords: VR · Multisensory feedback · Collision · Simulation
1 Introduction
Virtual Reality (VR) has evolved quickly in the last decade. Its technical basis and critical performance criteria have improved considerably and now allow for the development of virtual environments with high immersion. Simultaneously, due to the rising number of applications and enthusiasts, the price tag has dropped considerably. This opens up new domains such as entertainment, science, and industry. The industrial use of VR puts its emphasis on simulating and prototyping production processes in virtual environments. Notable features are the realistic rendering of the environment including multimodal features, naturalistic behavior in VR by participants, and dynamic feedback contingent on task performance. Thus, VR tries to combine the best of both worlds and triggers a quantitative and qualitative change in prototyping production processes. Boxplan, an example of industrial VR software, is a virtual space where users plan their assembly stations at scale, create 3D mock-ups, and experience the assembly workflow. Thus, they can faithfully test the
layout concept in industrial and economic contexts. Importantly, feedback from the later users is quickly incorporated, leading to short turnaround times and a reduction in costs. For such applications, users require more than a realistic visualization: they need direct feedback that guides their realistic physical behavior and their interaction with virtual objects [1]. The goal of such a VR application can therefore be achieved only through realistic experience, as the realism of a training simulation influences training effectiveness.
Training in a virtual environment can take different forms. For complex movements, such as those in many sports disciplines, naturalistic feedback on desired movement trajectories might be given. Other applications, where no single optimal behavior is defined, might limit themselves to alarms when an error occurs. Specifically, a collision, one of the most common physical interactions, can trigger such an alarm through multiple modalities. In a natural environment, a collision might be seen, heard, or felt. However, basic virtual environments confine themselves to the visual modality, i.e. when a collision occurs, the visible movement is stopped. This establishes a baseline condition. Additionally, signals in other modalities, either in naturalistic form or as standardized alarms, may be supplied [2]. For instance, the realistic sound of the collision could give feedback on the erroneous movement. This, however, would require simulating the material properties, which is well beyond the scope of typical virtual environments. Therefore, standardized acoustic alarms are often used. Similar concerns apply to feedback through the tactile modality [3]: implementing natural force feedback for free movements is much more demanding than a simple vibration alarm. Still, modern technology provides many choices for multimodal feedback, and multimodal feedback has accordingly been investigated in different scenarios [4]. The effectiveness of tactile, visual, and auditory warnings for rear-end collision prevention in simulated driving has already been demonstrated [5]: warning systems based on different sensory feedback reliably affect users' behavior and influence the number of collisions made [5]. Several studies have revealed that multimodal feedback design can enhance motor learning and reduce workload by taking advantage of each modality, which is especially beneficial for complex tasks and production processes in industry [2].
The importance of haptic feedback in VR is growing quickly (e.g., HaptX), and it has received considerable attention since the earliest studies of VR [6–8]. However, although the relative merits of the sensory modalities are actively debated within the framework of VR and the effectiveness of their functions has been explored in different contexts, no general understanding has been reached [4]. Hence, the goal of our study is to compare different sensory modalities regarding their effectiveness in collision avoidance in VR.
2 Method
To analyze the effectiveness of different sensory-feedback modalities, we set up a study comparing users' behavior in four sensory-feedback conditions: a baseline condition with naturalistic visual feedback, an auditory condition with an additional auditory alarm, a haptic condition with an additional tactile alarm, and a visual
condition with an additional color-changing visual alarm. We compare these conditions on the L-collision problem, which describes a collision of an L-shaped object with other obstacles: when the head of the L-shaped object is behind an obstacle, pulling the object out easily causes collisions. The task therefore provides a suitable setting for the question of interest.
2.1 Participants
In our experiment, 65 volunteers (21 female, 44 male), aged between 19 and 35, participated. All participants had normal or corrected-to-normal vision and no known neurological conditions. Due to a misunderstanding of the task instructions, the data from two participants were excluded. In total, we measured 15 participants in the baseline condition, 15 in the auditory condition, 17 in the haptic condition, and 16 in the visual condition. Each participant experienced only one of the four conditions, and the condition was chosen at random before the participant was known to the experimenter.
2.2 Apparatus
For our study, we used a VR-ready PC with an Nvidia 1070 GPU, Unity3D 5.6.3p2, NewtonVR, and an HTC Vive HMD (110-degree field of view, 90 Hz, resolution 1080 × 1200 px per eye). As we used the NewtonVR environment in our study, we chose a pure NewtonVR condition with only its natural visual cue as our control (baseline) condition.
Fig. 1. (a) The experimental scene with the shelf and the box: a participant picks up an L-shaped object, pulls it out of the shelf, and places it into the box to complete a trial (top). (b) The first shelf with the L-shaped objects (five per story) and the obstacles (bottom). (c) Participants were given enough time to adapt to the VR environment (right).
2.3 Task
The task employed in this study was the L-collision problem (Fig. 1). A two-story shelf with different-sized L-shaped objects was positioned in front of the user. The obstacles were mainly of two types: first, obstacles with a minimal gap to the ceiling of the shelf, so that the user could only rotate the L-shaped object or pull it to the side; second, obstacles with a gap to the ceiling large enough that the user could apply any movement, for example merely lifting the L-shaped object and pulling it out directly. The participants were instructed to pull ten L-shaped objects out from behind the obstacles on the two-story shelf under one of the four feedback conditions. A trial was considered complete when the selected L-shaped object was placed into the box behind the user. When users missed or dropped the L-shaped object before placing it into the box, the trial was recorded as a failure. Avoiding collisions with the obstacles was not mentioned explicitly, so the participants had to recognize this requirement themselves from the feedback they received. However, because the interacting object was L-shaped, it was technically difficult to pull it out and complete the task successfully when it got stuck on or collided with an obstacle. Therefore, merely by being instructed to pull the L-shaped objects out from behind the obstacles, the participants had to try to avoid collisions. The removal task was explained
individually to each participant. Each participant was assigned one of the four feedback conditions: baseline, auditory, haptic, or visual. In the baseline condition, which served as the control, participants received no feedback other than the natural visual cue given by the default physics setup of NewtonVR: the interacting object did not pass through the obstacles, and participants could not complete the task without finding a way around them. All other conditions also included this natural NewtonVR visual cue; on top of it, the auditory, haptic, and visual conditions employed an additional feedback modality, making them multimodal feedback conditions. In the auditory condition, an alarm sound played whenever the object touched an obstacle. In the haptic condition, the controller performing the grabbing motion vibrated to indicate a collision. In the visual condition, the L-shaped object's material changed to black whenever it touched an obstacle and reverted to its original color once it was moved away from the obstacle.
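To make the condition logic concrete, the following minimal, engine-agnostic sketch shows how such feedback dispatch can be structured. The study itself was implemented in Unity3D with NewtonVR; this Python sketch is purely illustrative, and all names in it (Condition, CollisionFeedback, the effector callables) are our own assumptions rather than the authors' code.

```python
from enum import Enum

class Condition(Enum):
    BASELINE = "baseline"  # natural NewtonVR visual cue only
    AUDITORY = "auditory"  # + fixed alarm beep
    HAPTIC   = "haptic"    # + fixed-frequency controller vibration
    VISUAL   = "visual"    # + object turns black while touching

class CollisionFeedback:
    """Dispatches the condition-specific alarm on collision events.

    In every condition the physics engine already stops the object at the
    obstacle (the natural visual cue); only the extra alarm channel differs.
    """

    def __init__(self, condition, play_beep, vibrate, set_color):
        self.condition = condition
        self.play_beep = play_beep  # callable: play the alarm sound
        self.vibrate = vibrate      # callable: pulse the grabbing controller
        self.set_color = set_color  # callable(str): recolor the L-shaped object

    def on_collision_enter(self, obstacle):
        if self.condition is Condition.AUDITORY:
            self.play_beep()
        elif self.condition is Condition.HAPTIC:
            self.vibrate()
        elif self.condition is Condition.VISUAL:
            self.set_color("black")

    def on_collision_exit(self, obstacle):
        # Only the visual alarm has an "off" state to restore.
        if self.condition is Condition.VISUAL:
            self.set_color("original")

# Usage with stub effectors:
fb = CollisionFeedback(Condition.VISUAL,
                       play_beep=lambda: print("beep"),
                       vibrate=lambda: print("vibrate"),
                       set_color=lambda c: print("color ->", c))
fb.on_collision_enter("obstacle_3")  # color -> black
fb.on_collision_exit("obstacle_3")   # color -> original
```

The design point is that the collision event itself is shared across groups; exactly one alarm channel is switched on per participant, which is what makes the conditions comparable.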
2.4 Procedure and Analysis
We recorded the number and timing of collisions. Each collision was labeled with the
trial number and the index of the specific L-shaped object involved in the collision.
A trial was counted as complete when the L-shaped object made contact with the collider of the box placed behind the user. Successful completions were recorded, as well as the number of failures to complete the task. To make collisions comparable across conditions, consistent feedback parameters were used throughout
each multimodal condition: the same feedback color in the visual condition, the same beeping sound in the auditory condition, and the same vibration frequency in the haptic condition. The analysis was performed using a one-way ANOVA, with Tukey tests for post-hoc comparisons. For outlier treatment, observations were capped before the parametric ANOVA: values below the lower limit were replaced with the 5th percentile, and values above the upper limit with the 95th percentile.
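A minimal sketch of this analysis pipeline, assuming one collision count per participant per condition (the data below are synthetic placeholders, not the study's data; scipy's f_oneway and statsmodels' pairwise_tukeyhsd implement the tests named above):

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def cap_outliers(x, low=5.0, high=95.0):
    """Cap observations at the 5th and 95th percentiles before testing."""
    lo, hi = np.percentile(x, [low, high])
    return np.clip(x, lo, hi)

# Synthetic per-participant collision counts (group sizes as in the study).
rng = np.random.default_rng(0)
groups = {
    "baseline": rng.poisson(120, 15).astype(float),
    "auditory": rng.poisson(95, 15).astype(float),
    "haptic":   rng.poisson(110, 17).astype(float),
    "visual":   rng.poisson(150, 16).astype(float),
}
capped = {name: cap_outliers(x) for name, x in groups.items()}

# One-way between-subjects ANOVA on the capped counts.
f_stat, p_val = stats.f_oneway(*capped.values())
print(f"ANOVA: F = {f_stat:.2f}, p = {p_val:.4f}")

# Tukey HSD post-hoc test over all pairwise condition differences.
values = np.concatenate(list(capped.values()))
labels = np.concatenate([[name] * len(x) for name, x in capped.items()])
print(pairwise_tukeyhsd(values, labels, alpha=0.05))
```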
3 Results
As a first step, we performed two control analyses of factors that could potentially influence the interpretation of our results. Specifically, we examined learning effects and compared the level of difficulty of the different objects. For this purpose, two sequences of collision counts were visualized before analyzing the number of collisions across the different sensory-feedback types.
Fig. 2. (a) The number of collisions across trials; (b) the number of collisions for the different L-shaped objects; (c) boxplots of the number of collisions in the four conditions, for comparison of their means.
As each participant performed the task in a pseudo-random sequence, we checked whether a learning effect occurred over the trials. Figure 2a shows that most users made more collisions on their first trial than on any other trial, and the fewest collisions on their last trial. However, a higher trial index did not always lead to fewer collisions: trials 2 to 9 vary in the number of collisions. In Fig. 2b, the fifth object (L5) appears to be the most challenging object for the users, causing more than 20 collisions on average. In comparison, the fourth object (L4) was the easiest, with fewer than five collisions on average. However, apart from these two extreme cases, the variation in the number of collisions across objects was moderate and thus added a limited amount of variance to each task. Hence, we concluded that no strong learning effect occurred after the first trial in this task. Also, as all subjects handled all objects and performed an identical number of trials, the data can reasonably be averaged over these two variables.
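As a sketch of how these control visualizations can be produced from the collision log, assuming a long-format table with one row per collision and hypothetical column names:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical collision log: one row per collision event.
log = pd.DataFrame({
    "trial":  [1, 1, 1, 2, 3, 5, 10],
    "object": ["L1", "L5", "L5", "L5", "L4", "L5", "L2"],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3))
log["trial"].value_counts().sort_index().plot.bar(ax=ax1)
ax1.set(xlabel="Trial", ylabel="Collisions", title="Per trial (cf. Fig. 2a)")
log["object"].value_counts().sort_index().plot.bar(ax=ax2)
ax2.set(xlabel="L-shaped object", ylabel="Collisions", title="Per object (cf. Fig. 2b)")
plt.tight_layout()
plt.show()
```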
For the next step, we focused on the differences between the conditions to explore the effectiveness of multimodal feedback. The auditory condition resulted in the fewest collisions, followed by the haptic condition, the baseline condition, and finally the visual condition. For statistical analysis, a one-way between-subjects ANOVA was conducted to compare the number of collisions between the virtual L-shaped objects and the obstacles across the conditions. There was a significant difference in the number of collisions between the four conditions [F(3) = 2.9, p = 0.0424]. As we found a statistically significant main effect of condition, we computed a Tukey post-hoc test. The Tukey HSD test indicated that the mean score for the auditory condition (M = 94.6, SD = 58.98) differed significantly from that of the visual condition (M = 151.03, SD = 88.32) (p < 0.05). No other significant pairwise differences were found. Thus, we observe significantly different numbers of collisions as a function of condition, and specifically that the number of collisions in the auditory feedback condition is reduced in the pairwise comparison with the visual condition.
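As a rough plausibility check of this pairwise difference, the reported summary statistics can be plugged into a Welch two-sample t-test (an approximation; the paper's actual post-hoc test was Tukey HSD):

```python
from scipy import stats

# Reported summary statistics: auditory (n = 15) vs. visual (n = 16).
t, p = stats.ttest_ind_from_stats(
    mean1=94.6, std1=58.98, nobs1=15,    # auditory condition
    mean2=151.03, std2=88.32, nobs2=16,  # visual condition
    equal_var=False,                     # Welch's t-test
)
print(f"Welch t = {t:.2f}, p = {p:.4f}")  # t ≈ -2.10, p ≈ 0.045
```

Consistent with the Tukey result, this check also lands just below the 0.05 level.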
4 Discussion
With this experiment, we could demonstrate that the choice of modality influences the effectiveness of multimodal feedback. Specifically, supplying the additional feedback through another modality appears more effective than using the visual modality for both the natural feedback and the alarm signal at the same time.
A prior study comparing the effectiveness of visual-auditory and visual-tactile multimodal feedback in real-world task setups suggested that multimodal feedback is advantageous compared to single modalities [4]. Specifically, it showed that visual-auditory feedback is most effective when a single task is being performed [4]. Another prior study made a convincing case for including multimodal feedback in common direct manipulations such as drag-and-drop and showed that the inclusion of auditory feedback was common to the conditions that improved performance [9]. These results, obtained in real-world setups, match the observations of the present study in a VR setup, where the auditory feedback condition, which includes the natural visual feedback, showed the best performance.
Furthermore, the visual feedback condition, which contained a visual alarm on top of the natural visual feedback, performed significantly worse in our study. We speculate that the central focus induced by the task in VR reduced attention to peripheral vision and made the visual alarm less effective [10]. Detection of the visual alarm might therefore not function as effectively as it would under unconstrained real-world conditions. Another finding was that the baseline control condition with only the natural visual cue did not differ significantly from the visual feedback condition; if anything, it was slightly better. In other words, a large number of collisions was detected in the visual feedback condition, particularly compared to the other conditions, including the baseline condition. This suggests that the natural visual cue and the additional visual information convey a similar type of information, and that combining them does not improve the effectiveness of feedback.
These results are compatible with studies investigating the attentional bottleneck of multiple modalities [11–13], which report that in a dual-task setup interference is reduced when multiple modalities are involved. Further studies have demonstrated that multimodal feedback is advantageous compared to single modalities in a variety of task setups [4, 9, 14]. In this respect, the natural visual cue and the additionally designed color-change feedback are both mediated by the visual modality. This explains why the visual condition, which is effectively a single-modality feedback condition, performed worse than the other, genuinely multimodal feedback conditions.
A further reason for the increased effectiveness of auditory feedback might be that participants are adapted only to realistic feedback. In the real world, when two objects collide, auditory and haptic feedback are naturally generated, as are force feedback and the natural visual cue. A change of color of the colliding object, in contrast, is rather artificial feedback. As our motor control is influenced by internal representations of the actual and predicted states of our body and the external environment [15], such an artificial cue would not be predicted in the case of an error and is therefore harder to interpret. We speculate that the lack of natural internal representations for unrealistic feedback, such as the color change of the object, can reduce the effectiveness of task performance in VR.
In line with our study, a study of haptic feedback in a telepresence assembly task, a setup comparable to virtual environments, emphasized the utilization of haptic feedback but highlighted that a more realistic presence under haptic feedback was achieved through other modalities, such as a visual bar graph or an auditory stimulus [8], supporting the effectiveness of auditory feedback.
Our results also accord with findings showing the efficacy of multimodal feedback in general [4, 9, 14], reached by several studies comparing the effect of different modalities on users' performance. One study specifically found that the multimodal combination of visual-auditory feedback yields more favorable performance than visual feedback alone in single-task scenarios under normal workload conditions [4]. Also, our study extends the result of a joint-task study in a non-VR condition [13] to VR, showing that auditory displays are a viable option for receiving task-related information in virtual reality as well.
Taken together, our study demonstrates that different types of feedback should be considered depending on the context of VR applications in order to optimize
their effectiveness. Notably, our results cast new light on the role of sensory modalities other than vision in VR. However, as pointed out in other similar studies, with varying workloads different modalities could offer additional advantages. Hence, our research suggests further studies investigating these findings in more specific contexts or with different tasks, in order to apply them to particular practical cases.
Acknowledgments. We gratefully acknowledge the support of the project ErgoVR (BMBF, KMU Innovativ V5KMU17/221) and of SALT AND PEPPER Software GmbH & Co. KG.
References
1. Ragan, E.D., Bowman, D.A., Kopper, R., Stinson, C., Scerbo, S., McMahan, R.P.: Effects of
field of view and visual complexity on virtual reality training effectiveness for a visual
scanning task. IEEE Trans. Vis. Comput. Graph. 21(7), 794–807 (2015)
2. Sigrist, R., Rauter, G., Riener, R., Wolf, P.: Augmented visual, auditory, haptic, and
multimodal feedback in motor learning: a review. Psychon. Bull. Rev. 20(1), 21–53 (2013)
3. Hayward, V., Astley, O.R., Cruz-Hernandez, M., Grant, D., Robles-De-La-Torre, G.: Haptic
interfaces and devices. Sens. Rev. 24(1), 16–29 (2004)
4. Burke, J.L., Prewett, M.S., Gray, A.A., Yang, L., Stilson, F.R., Coovert, M.D., Elliot, L.R.,
Redden, E.: Comparing the effects of visual-auditory and visual-tactile feedback on user
performance: a meta-analysis. In: Proceedings of the 8th International Conference on
Multimodal Interfaces, pp. 108–117. ACM (2006)
5. Scott, J.J., Gray, R.: A comparison of tactile, visual, and auditory warnings for rear-end
collision prevention in simulated driving. Hum. Factors 50(2), 264–275 (2008)
6. Burdea, G.C.: Keynote address: haptics feedback for virtual reality. In: Proceedings of
International Workshop on Virtual Prototyping, Laval, France, pp. 87–96 (1999)
7. Srinivasan, M.A., Basdogan, C.: Haptics in virtual environments: taxonomy, research status,
and challenges. Comput. Graph. 21(4), 393–404 (1997)
8. Petzold, B., Zaeh, M.F., Faerber, B., Deml, B., Egermeier, H., Schilp, J., Clarke, S.: A study
on visual, auditory, and haptic feedback for assembly tasks. Presence: Teleoper. Virtual
Environ. 13(1), 16–21 (2004)
9. Jacko, J.A., Scott, I.U., Sainfort, F., Barnard, L., Edwards, P.J., Emery, V.K., Kongnakorn,
T., Moloney, K.P., Zorich, B.S.: Older adults and visual impairment: what do exposure times
and accuracy tell us about performance gains associated with multimodal feedback? In:
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 33–40. ACM (2003)
10. Khan, A.Z., Blohm, G., McPeek, R.M., Lefevre, P.: Differential influence of attention on
gaze and head movements. J. Neurophysiol. 101(1), 198–206 (2009)
11. Wahn, B., König, P.: Can limitations of visuospatial attention be circumvented? A review.
Front. Psychol. 8, 1896 (2017)
12. Wahn, B., König, P.: Is attentional resource allocation across sensory modalities task-
dependent? Adv. Cogn. Psychol. 13(1), 83 (2017)
13. Wahn, B., Schwandt, J., Krüger, M., Crafa, D., Nunnendorf, V., König, P.: Multisensory
teamwork: using a tactile or an auditory display to exchange gaze information improves
performance in joint visual search. Ergonomics 59(6), 781–795 (2016)
14. Lee, J.H., Spence, C.: Assessing the benefits of multimodal feedback on dual-task
performance under demanding conditions. In: Proceedings of the 22nd British HCI Group
Annual Conference on People and Computers: Culture, Creativity, Interaction-Volume 1.
British Computer Society, pp. 185–192 (2008)
15. Frith, C.D., Blakemore, S.J., Wolpert, D.M.: Abnormalities in the awareness and control of action. Philos. Trans. R. Soc. Lond. B 355(1404), 1771–1788 (2000)