Multimodal Integration, Attention, and Sensory
Augmentation
Basil Wahn1
1Institute of Cognitive Science
University of Osnabrück
49076 Osnabrück, Germany
Peter König1,2
2Department of Neurophysiology and Pathophysiology
University Medical Center Hamburg-Eppendorf
20246 Hamburg, Germany
Abstract— Human information processing is limited in capacity.
Here, we investigated under which circumstances humans can
process information better if they receive task-relevant sensory
input via several sensory modalities rather than via only one
sensory modality (i.e., vision). We found that the benefits of
distributing information processing across sensory modalities
critically depend on task demands. That is, when a task requires
only spatial processing, distributing information across several
sensory modalities does not lead to any performance benefits
compared to receiving the same information via the visual
modality alone. When the task additionally involves the
discrimination of stimulus attributes, distributing information
processing across several sensory modalities effectively
circumvents processing limitations within the visual modality.
Crucially, these
performance benefits generalize to settings using sensory
augmentation as well as a collaborative setting. Findings are
potentially applicable to visually taxing real-world tasks that are
either performed alone or in a group.
Keywords: multisensory processing; visuospatial attention; joint
action; attentional resources; multisensory integration; sensory
augmentation.
I. INTRODUCTION
Human information processing is limited in attentional
capacity [1-3]. However, in many demanding real-world
collaborative tasks such as air-traffic control, humans need to
process more information than they can effectively attend to. As
a consequence, task-relevant information is missed and the
risk of accidents is greatly increased. Here, we investigated
how to effectively circumvent attentional limitations in
collaborative tasks by distributing information processing
across several sensory modalities. For this purpose, we first
tested under which circumstances humans performing tasks
alone can process more information when the information is
received via several sensory modalities (e.g., vision and
audition) than when the same information is received via only
one sensory modality (e.g., vision) [4-6]. In addition, we
tested whether receiving redundant information via several
sensory modalities leads to processing benefits (e.g., a more
reliable and accurate performance) and whether these benefits
are robust against additional processing load [4-6]. That is, we
asked whether the process of multisensory integration and its
associated benefits are susceptible to the overall processing
load of the tasks. These studies were intended to inform us about
how to effectively exchange task-relevant information in
collaborative tasks. Finally, we tested whether joint
performance in a visual collaborative task is improved by
receiving task-relevant information about the co-actor (i.e.,
where the co-actor is looking) via a different sensory modality
than vision (i.e., audition and touch) [7].
II. EXPERIMENTAL PARADIGMS AND RESULTS
We assessed the effectiveness of tactile and auditory displays
for circumventing limitations in visuospatial processing in a
dual task paradigm [4-6]. Here, participants performed two
spatial tasks, i.e., a multiple object tracking task and a
localization task, either both in the visual modality or in
separate sensory modalities. In the localization task,
participants were required to localize visual, auditory, tactile,
audiovisual, or visuotactile spatial cues. In the multiple object
tracking task, participants were required to track the
movements of target objects among several moving distractor
objects on a computer screen.
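To make the structure of this dual-task paradigm concrete, the following minimal sketch (in Python) enumerates how the localization-cue modality could be crossed with single- and dual-task conditions; the condition names and fields are illustrative assumptions rather than the actual implementation used in [4-6].

```python
from itertools import product

# Illustrative condition structure for the dual-task paradigm:
# a localization task (with varying cue modality) is combined with a
# visual multiple object tracking (MOT) task. All names are assumptions.
CUE_MODALITIES = ["visual", "auditory", "tactile", "audiovisual", "visuotactile"]
TASK_LOADS = ["localization_only", "localization_plus_MOT"]

def build_conditions():
    """Cross cue modality with task load to list all experimental conditions."""
    return [
        {"cue_modality": modality, "task_load": load}
        for modality, load in product(CUE_MODALITIES, TASK_LOADS)
    ]

if __name__ == "__main__":
    for condition in build_conditions():
        print(condition)
```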
We found that the interference between tasks was equal
regardless of the sensory modalities in which tasks were
performed. In particular, performance in the tasks was equally
affected in the dual task conditions regardless of whether the
two tasks were performed in the same or in separate sensory
modalities. These findings suggest that visuospatial processing
limitations cannot be circumvented by distributing spatial
information across several sensory modalities [4,5].
In a follow-up study, we replaced the multiple object tracking
task with a visual search task [6], a task that additionally
requires the discrimination of stimulus attributes. We
found that participants searched faster when simultaneously
performing the tactile localization task in comparison to the
visual localization task. Crucially, these findings suggest that
limitations in visuospatial processing can only be
circumvented by distributing spatial information processing
across several sensory modalities if visuospatial processing
involves the discrimination of stimulus attributes [6].
In the experiments reported above [4-6], we additionally tested
whether multisensory integration in a localization task is
robust against additional processing load. In particular, we
tested whether audiovisual or visuotactile integration was
affected by additionally performing either a multiple object
tracking task or a visual search task. In all cases, we found that
the benefits of multisensory integration were not affected by
the additional task. That is, participants consistently made
fewer errors in the localization task when they received
redundant spatial information via several sensory modalities
than when they received the same information via only one of
the two sensory modalities [4-6].
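This pattern is what the standard maximum-likelihood (reliability-weighted) model of cue integration predicts: combining two independent estimates yields a lower variance than either estimate alone. The minimal sketch below (in Python) illustrates this prediction with made-up numbers; it shows the textbook model, not necessarily the exact analysis used in [4-6].

```python
def combine_cues(est_visual, var_visual, est_tactile, var_tactile):
    """Reliability-weighted (maximum-likelihood) combination of two
    independent, unbiased location estimates; weights are inverse variances."""
    w_visual = (1 / var_visual) / (1 / var_visual + 1 / var_tactile)
    w_tactile = 1.0 - w_visual
    combined_estimate = w_visual * est_visual + w_tactile * est_tactile
    combined_variance = (var_visual * var_tactile) / (var_visual + var_tactile)
    return combined_estimate, combined_variance

# Hypothetical numbers (in degrees of visual angle):
estimate, variance = combine_cues(est_visual=2.0, var_visual=1.0,
                                  est_tactile=3.0, var_tactile=4.0)
print(estimate, variance)  # 2.2 0.8 -- lower variance than either cue alone
```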
We then tested whether the findings above would generalize to
a collaborative task. In particular, we tested whether
visuospatial limitations in a collaborative task can be
effectively circumvented by distributing information
processing across several sensory modalities [7]. For this
purpose, pairs of participants performed a visual search task
together (see Figure 1). While they were searching, they
received information about where their search partner was
currently looking on the screen. Pairs received this
information either via a visual, auditory, or tactile display, on
which each location on the display mapped to a location on the
screen.
For instance, a vibration of the upper right vibromotor on the
vibrotactile belt indicated that the search partner was looking
at the upper right corner of the screen.
Prior to performing the collaborative search task together, we
tested whether participants could map the locations indicated
on the displays to locations on the screen and found that they
performed well above chance [7].
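As a concrete illustration of such a display mapping, the sketch below (in Python) converts a normalized gaze position on the partner's screen into one cell of a grid of vibromotors. The 3 x 3 grid and the indexing scheme are assumptions made for illustration, not the actual belt layout used in [7].

```python
def gaze_to_tactor(gaze_x, gaze_y, grid_cols=3, grid_rows=3):
    """Map a normalized gaze position (0..1, origin at the lower-left corner
    of the screen) to a (column, row) cell in a grid of vibromotors.
    The 3x3 layout is an illustrative assumption, not the belt used in [7]."""
    # Clamp so that positions on the screen border map to the outermost cells.
    gaze_x = min(max(gaze_x, 0.0), 0.999)
    gaze_y = min(max(gaze_y, 0.0), 0.999)
    column = int(gaze_x * grid_cols)
    row = int(gaze_y * grid_rows)
    return column, row

# A gaze position near the upper right corner of the screen activates
# the upper right vibromotor (column 2, row 2 in a 3x3 grid).
print(gaze_to_tactor(0.95, 0.90))  # -> (2, 2)
```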
The task demands in this collaborative visual search task
closely resemble those of the study reported above [6], in
which participants needed to perform a visual search task and a
tactile localization task at the same time. In that study [6],
distributing information across sensory modalities proved
beneficial for these task demands, suggesting that the benefits
could also generalize to a collaborative task with similar
demands. However, the demands of the collaborative visual
search task also differed slightly, as participants were required
not only to localize where their search partner was looking but
also to incorporate this information into their search behavior.
Nonetheless, we found that pairs used the partner's gaze
information to divide the search space efficiently. Specifically,
pairs searched faster when localizing the gaze information via
the auditory or tactile display than via the visual display (see
Figure 2), suggesting that the findings above [6] generalize to
a collaborative context. These results were also supported by
subjective ratings: Pairs rated performing the visual search
task as less difficult when they received the gaze information
via the tactile modality in comparison to the visual modality.
Figure 1: Joint visual search task. Partners jointly searched for
a target among distractors on two computer screens separated
by a curtain. Each partner received the other partner's current
viewing location either via visual signals on a miniature box
placed below the currently viewed location (a), via vibrations
on a vibrotactile belt (b), or via acoustic signals transmitted
through headphones (c). In the no communication condition
(d), the partner's viewing location was not transmitted.
Participants' head movements were tracked with a
head-mounted head tracker. This figure was adapted from
Figure 4 in [7].
Figure 2: Joint visual search task results: Search time refers to
the time until one of the participants had found the target.
Search time is shown as a function of the visual, tactile,
auditory, and no communication conditions (abbreviated as ‘no
comm’). Error bars denote the standard error of the mean.
Significant comparisons are indicated with ‘*’, and comparisons
approaching significance with ‘as’. Individual pairs’
performances are shown in light gray. This figure is taken from
Figure 5 in [7].
III. CONCLUSIONS AND DISCUSSION
Overall, we found that attentional limitations within one
sensory modality can be circumvented by distributing
information processing across several sensory modalities [6,7].
However, we also found that these multisensory benefits are
specific to the task demands. In particular, when information
processing requires only spatial processing, distributing the
information across sensory modalities does not lead to any
performance benefits in comparison to receiving the same
spatial information via only one sensory modality [4,5]. Yet,
when visuospatial processing additionally involves the
discrimination of stimulus attributes, distributing information
processing across several sensory modalities effectively
circumvents processing limitations within the visual sensory
modality [6]. Crucially, these performance
benefits generalize to a collaborative setting: A collaborative
visual search task is performed faster when receiving gaze
information about a co-actor via the tactile or auditory
modality in comparison to receiving it via the visual modality
[7].
Our findings are applicable to tasks that place high demands on
visuospatial attention and that are performed either alone or in
a group. For tasks performed in a group, task-relevant
information could be exchanged between co-actors via a
different sensory modality than vision, thereby circumventing
visuospatial limitations. For instance, in a collaborative object
control task requiring visuospatial attention, it has been shown
that receiving task-relevant information via the auditory
modality facilitates joint performance [8]. Given these and the
presented findings, joint performance in other collaborative
object control tasks (e.g., [9]) could be similarly facilitated.
Moreover, future studies investigating collaborative tasks
could make use of the robust benefits of multisensory
integration by exchanging task-relevant information between
co-actors via multiple sensory modalities simultaneously [4-6].
For tasks performed alone, performance in many real-world
scenarios that place high demands on visuospatial attention,
e.g., driving a car, controlling air traffic, or flying an airplane,
could be facilitated if information is received via several
sensory modalities. A particularly interesting question is
whether performance in navigational tasks could be facilitated
if navigation-relevant information is received via a newly
learned modality through sensory augmentation, and how such
an augmented modality interacts with the native sensory
modalities [10,11].
ACKNOWLEDGMENT
We gratefully acknowledge the support by H2020 – H2020-
FETPROACT-2014 641321 – socSMCs (for BW) and ERC-
2010-AdG #269716 – MULTISENSE (for PK).
REFERENCES
[1] R. Marois and J. Ivanoff, “Capacity limits of information processing in
the brain,” Trends in Cognitive Sciences, vol. 9, no. 6, pp. 296–305, 2005.
[2] M. M. Chun, J. D. Golomb, and N. B. Turk-Browne, “A taxonomy of
external and internal attention,” Annual Review of Psychology, vol. 62,
pp. 73–101, 2011.
[3] B. Wahn, D. P. Ferris, W. D. Hairston, and P. König, “Pupil sizes scale
with attentional load and task experience in a multiple object tracking
task,” PLoS ONE, in press.
[4] B. Wahn and P. König, “Vision and haptics share spatial attentional
resources and visuotactile integration is not affected by high attentional
load,” Multisensory Research, vol. 28, no. 3-4, pp. 371–392, 2015.
[5] B. Wahn and P. König, “Audition and vision share spatial attentional
resources, yet attentional load does not disrupt audiovisual integration,”
Frontiers in Psychology, vol. 6, pp. 1–12, 2015.
[6] B. Wahn and P. König, “Attentional resource allocation in visuotactile
processing depends on the task, but optimal visuotactile integration does
not depend on attentional resources,” Frontiers in Integrative
Neuroscience, vol. 10, pp. 1–13, 2016.
[7] B. Wahn, J. Schwandt, M. Krüger, D. Crafa, V. Nunnendorf, and P.
König, “Multisensory teamwork: using a tactile or an auditory display to
exchange gaze information improves performance in joint visual
search,” Ergonomics, vol. 59, no. 6, pp. 781–795, 2016.
[8] G. Knoblich and S. Jordan, “Action coordination in individuals and
groups: Learning anticipatory control,” Journal of Experimental
Psychology: Learning, Memory, & Cognition, vol. 29, pp. 1006–1016,
2003.
[9] B. Wahn, L. Schmitz, P. König, and G. Knoblich, “Benefiting from
being alike: Interindividual skill differences predict collective benefit in
joint object control,” in A. Papafragou, D. Grodner, D. Mirman, and
J. C. Trueswell (Eds.), Proceedings of the 38th Annual Conference of the
Cognitive Science Society, pp. 2747–2752, 2016.
[10] S. U. König, F. Schumann, J. Keyser, C. Goeke, C. Krause, S. Wache,
A. Lytochkin, M. Ebert, V. Brunsch, B. Wahn, K. Kaspar, S. K. Nagel,
T. Meilinger, H. Bülthoff, T. Wolbers, C. Büchel, and P. König,
“Learning new sensorimotor contingencies: Effects of long-term use of
sensory augmentation on the brain and conscious perception,” PLoS
ONE, in press.
[11] C. M. Goeke, S. Planera, H. Finger, and P. König, “Bayesian alternation
during tactile augmentation,” Frontiers in Behavioral Neuroscience,
vol. 10, p. 187, 2016.