
Guiding Visual Attention on 2D Screens: Effects of Gaze Cues from Avatars and Humans

Authors:
Julius Albiz
jalbiz@kth.se
KTH Royal Institute of Technology
Stockholm, Sweden
Olga Viberg
oviberg@kth.se
KTH Royal Institute of Technology
Stockholm, Sweden
Andrii Matviienko
andriim@kth.se
KTH Royal Institute of Technology
Stockholm, Sweden
ABSTRACT
Guiding visual attention to specific parts of an interface is essential. One powerful tool for guiding attention is gaze cues, which direct visual attention in the same direction as a presented gaze. In this paper, we explored how to direct users' visual attention on 2D screens using gaze cues from avatars and humans. For this, we conducted a lab experiment (N = 30) based on three independent variables: (1) stimulus, shown either as an avatar or a human face, (2) target direction, with a target appearing left or right of a stimulus, and (3) gaze validity, indicating whether a stimulus' gaze was directed towards a target (valid gaze) or not (invalid gaze). Our results show that participants' total and average fixation on a target lasted longer in the presence of the human image than the avatar stimulus when a target appeared on the right side and when a stimulus' gaze was directed towards the target. Moreover, participants' average fixation was longer on the human than the avatar stimulus when it gazed in the opposite direction from a target than towards it.
CCS CONCEPTS
Human-centered computing → User studies; Empirical studies in HCI.
KEYWORDS
visual attention, gaze cues, attention guidance, virtual avatars
ACM Reference Format:
Julius Albiz, Olga Viberg, and Andrii Matviienko. 2023. Guiding Visual Attention on 2D Screens: Effects of Gaze Cues from Avatars and Humans. In The 2023 ACM Symposium on Spatial User Interaction (SUI '23), October 13–15, 2023, Sydney, NSW, Australia. ACM, New York, NY, USA, 9 pages.
https://doi.org/10.1145/3607822.3614529
1 INTRODUCTION
Within User Experience Design, effectively guiding users' attention in an interface is essential [9, 22, 37]. Visual Hierarchy is one of the established methods, defined as the order in which the human eye is guided to consume each design element of an interface in the intended way [9]. To facilitate a specific order of visual guidance, researchers have introduced different visual cues such as arrows,
words, and gaze to encourage users to interact more with an interface to complete tasks [31]. Following another human's gaze is a clear example of Cialdini's Social Proof [8], which says that if an object has caught the attention of others, it will probably be of interest to us as well. However, with the constant growth of attention-hungry user interfaces on 2D screens, e.g., websites and smartphone applications, guiding visual attention via gaze cues becomes even more complex and requires a better understanding of efficient strategies.
Gaze cues are social cues used to direct visual attention based on the direction of other people's gaze. They are often used on websites to allocate users' visual attention to the parts of the website that are important to interact with [25]. Websites typically use gaze cues by displaying images of real-life people with a certain gaze direction to guide visual attention to call-to-action objects such as buttons. With the emergence and fast expansion of technology, more and more people interact daily with computers and virtual environments that include virtual characters, i.e., "avatars". Therefore, previous research has focused on making these avatars more human-like by adding gaze cues [3, 18]. When researching gaze cues and how they affect users, eye tracking is often used as an evaluation method, as it provides designers a more psychological approach to usability testing [35]. Expanding the usage of gaze cues to avatars can be beneficial for settings where avatars need to communicate, e.g., video games [16] or immersive learning settings [20]. As the goal of avatars in these settings is to make interactions as natural as possible, the ability to use human-like social patterns such as gaze cues helps fulfill this goal. Therefore, this study evaluates the effects of gaze cues, i.e., eye movements indicating a certain direction, provided by avatars compared to humans.
In this paper, we explored how to direct users' visual attention on 2D screens using gaze cues from avatars and humans. For this, we conducted a controlled lab experiment (N = 30) based on three independent variables: (1) stimulus, shown either as an avatar or a human face, (2) target direction, with a target appearing either on the left or right side of a stimulus, and (3) gaze validity, indicating whether a stimulus' gaze was directed towards a target (valid gaze) or in the opposite direction from it (invalid gaze). Our results show that participants' total and average fixation on a target lasted longer in the presence of the human than the avatar stimulus when a target appeared on the right side and when a stimulus' gaze was looking in the target's direction. Moreover, participants' average fixation was longer on a human than on the avatar stimulus with an eye gaze in the opposite direction from a target than in the target's direction. Our research contribution includes an empirical evaluation and design guidelines for directing visual attention on 2D screens using gaze cues from avatars and humans.
2 RELATED WORK
This section provides an overview of three main pillars of related work: (1) visual hierarchy in web design, (2) eye tracking and visual attention, and (3) gaze cues on avatars and humans.
2.1 Visual hierarchy in web design
Since the birth of the internet, websites have been a medium for communicating information and have found various areas of use [22]. Because websites are often task-oriented, how a website communicates information to users in a way beneficial for the task at hand is essential for web designers [22]. Within a website, designers present the visual elements to the users, and the users mentally assemble the elements to uncover the meaning behind them. Good web design particularly focuses on how efficiently it guides users' visual attention from one element to another, and does so in the correct order [22]. This is also known as Visual Hierarchy and is an essential concept in the advertisement field [9]. The main goals of visual advertisement design are (1) visual communication and (2) visual attention allocation. Several layout patterns take advantage of people's scanning and reading patterns on a visual interface [9]. The three main ones are the Gutenberg diagram, the Z-pattern layout, and the F-pattern layout. The Z-pattern and F-pattern layouts facilitate interaction with elements in a pattern that follows the letters Z or F. In contrast, the Gutenberg diagram assumes an even distribution of information and suggests that the user's attention sweeps across the interface in a series of horizontal movements called axes of orientation, each starting progressively lower on the left edge and moving towards the right edge. This pattern, therefore, suggests that users pay the least attention to the bottom-left part of the interface [9].
2.2 Eye tracking and visual attention
Eye tracking is typically employed to assess users' cognitive processes and the distribution of their visual attention [24, 35]. Eye tracking is an experimental method that records users' eye movements and gaze locations, such as fixations and saccades, across time and task [6]. For example, Eraslan et al. [10] used eye tracking to investigate whether participants with ASD had different strategies for processing information on websites compared to participants with no neurological disabilities. Results showed that the participants with ASD tended to look more at irrelevant visual elements, had shorter fixation durations, and had longer scan paths. Eye tracking also allows designers to assess users' visual attention to understand why certain elements are not interacted with. For example, Boardman and Mccormick [5] used eye tracking to understand consumer viewing patterns on shopping websites. The results showed that users' attention was directed in a different pattern than in product listing or information, which is supported by another study about viewing strategies on a Facebook website [29]. This indicates that users' viewing strategy depends on motivation, unlike previous research claiming that people use specific patterns to scan textual websites [9, 23]. Another example includes the assessment of visual attention to website advertisements [21, 36]. One way of advertising focuses on banner ads, rectangular displays embedded into a website that redirect users to the sponsor's website when clicked. Previous research has focused on the design of banner ads to avoid banner blindness, i.e., when users ignore banner ads consciously or unconsciously, and to increase the effectiveness of advertisements [21, 36]. The results indicate that ads should have visual elements that stand out to grab visual attention [36]. Therefore, in this work, we employed eye tracking to better understand the allocation of users' visual attention in the presence of virtual and human-like avatars and directional cues, which we outline in the following subsection.
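Since fixations are the unit of analysis throughout this paper, it may help to see how fixations are typically derived from raw gaze samples. Below is a minimal sketch of a dispersion-threshold (I-DT) detector in the spirit of classical event-detection algorithms; the GazeSample record and the threshold values are illustrative assumptions, not the filter used in this study (Tobii Pro Lab ships its own fixation filter).

```python
from dataclasses import dataclass

@dataclass
class GazeSample:
    t_ms: float  # timestamp in milliseconds
    x: float     # horizontal gaze position, px
    y: float     # vertical gaze position, px

def idt_fixations(samples, max_dispersion_px=35.0, min_duration_ms=100.0):
    """Dispersion-threshold (I-DT) detection: a run of samples is a fixation
    if it lasts at least min_duration_ms while its bounding box
    (width + height) stays below max_dispersion_px."""
    fixations, i, n = [], 0, len(samples)
    while i < n:
        # Take the shortest window that satisfies the duration threshold.
        j = i
        while j + 1 < n and samples[j].t_ms - samples[i].t_ms < min_duration_ms:
            j += 1
        xs = [s.x for s in samples[i:j + 1]]
        ys = [s.y for s in samples[i:j + 1]]
        if (max(xs) - min(xs)) + (max(ys) - min(ys)) <= max_dispersion_px:
            # Grow the window while the dispersion stays under the threshold.
            while j + 1 < n:
                xs.append(samples[j + 1].x)
                ys.append(samples[j + 1].y)
                if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion_px:
                    xs.pop(); ys.pop()
                    break
                j += 1
            if samples[j].t_ms - samples[i].t_ms >= min_duration_ms:
                fixations.append({
                    "start_ms": samples[i].t_ms,
                    "duration_ms": samples[j].t_ms - samples[i].t_ms,
                    "cx": sum(xs) / len(xs),  # fixation centroid
                    "cy": sum(ys) / len(ys),
                })
            i = j + 1
        else:
            i += 1  # slide the window by one sample and try again
    return fixations
```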
2.3 Gaze cues on avatars and humans
A gaze cue is a visual cue that can be a visual display of a human or a virtual avatar [25] to provide social information and influence human behavior [8]. As discussed in the previous subsection, eye tracking helps in understanding the effectiveness of advertisements, and one way of guiding users' attention to ads is by using gaze cues [12, 32]. For instance, Sajjacholapunt and Ball [28] measured the effectiveness of banner ads in three conditions: no face, mutual gaze (the gaze direction is towards the users), and averted gaze (a gaze cue is applied). Their results showed that dwell time on vertical banner ads was higher with averted gaze, and both averted and mutual gaze led to higher dwell times compared to no face, with the averted gaze accumulating the highest dwell time. Another study explored the influence of gaze direction on food preferences [17]. Participants had to write down their willingness to pay, taste, and health preferences before the test, in which they looked at images of people with food and varying gaze directions. The results showed participants' higher willingness to pay for the images that utilized gaze cues in the direction of the food and lower preference for the food in the images with gaze cues in the opposite direction from the food. Arrow cues are another visual cue to allocate visual attention on websites [7, 13, 19]. For example, Joseph et al. [13] used fMRI scans to measure brain activity when presenting gaze and arrow cues and found that humans direct visual attention more unconsciously when presented with a gaze cue than with an arrow cue. Although humans might direct attention more unconsciously with gaze cues, the overall cueing effect is similar to that of arrow cues [7, 19]. Moreover, arrow cues efficiently direct attention to a group of objects, whereas gaze cues allocate attention to a specific object [7]. These findings show no significant differences in the cueing effect, but gaze cues are more efficient in providing social aspects compared to arrows [17]. Thus, this work investigates gaze cues on human-like and virtual avatars to better understand how they affect visual attention allocation.
Social signals, e.g., eye gaze, play an important role in human-to-human communication [3, 15, 18], which led to the implementation of social signals on virtual avatars [3]. In the presence of social cues on virtual avatars, users showed faster task-completion times [2, 14, 18], higher preferences for virtual avatars [3], and lower error rates [2]. Realistic and engaging avatars increase immersiveness, learning among users, and overall enjoyment [27]. For instance, Khoramshahi et al. [14] employed gaze behaviors to make avatars realistic and engaging. In an experiment where participants completed a simple task of mirroring the avatar, two conditions were set: one in which the avatar used gaze cues and one in which it did not. Results showed that gaze cues significantly improved participants' reaction time to the avatar's movements, made the task feel less difficult, and showed that the avatar's gaze movements
were cooperative, human-like, and realistic. In summary, previous research has shown that gaze cues are important in guiding users' visual attention. However, there is a limited understanding of their effectiveness in directing visual attention using avatars and humans as stimuli, since previous research shows that gaze cues are effective with both humans and avatars. In this paper, we compare these types of stimuli, particularly in the presence of valid and invalid gaze guidance, which we describe in detail in the following evaluation section.

Figure 1: Overview of eight experimental conditions: four conditions (left) included a virtual avatar and four conditions (right) a human-like avatar. They were further split by target direction, i.e., left or right of the avatar, and validity, i.e., the eyes looking in the same direction where a target appears or the opposite.
3 EVALUATION
We conducted a controlled lab experiment to assess the influence of virtual and human-like avatars on users' visual attention guidance. The research question for this experiment is: How can we direct users' visual attention using virtual and human-like stimuli for 2D interaction?
3.1 Participants
We recruited 30 participants (15 male, 15 female) aged between 21 and 50 (M = 34.1, SD = 7.9). Participants had previous experience with virtual avatars from series and movies (N = 13), social media (N = 12), video games (N = 10), commercials (N = 3), internet browsing (N = 3), and work (N = 3). We recruited the participants through the advertising channels of our institution. We excluded participants who had eye surgery, wore glasses with more than one power, or had any eye movement or alignment abnormalities, such as lazy eye, strabismus, or nystagmus. Participants did not receive any compensation for their participation.
3.2 Study design
The study had a within-subject design with three independent variables: stimulus, target direction, and gaze validity. The stimulus consisted of two levels: (1) a human face and (2) a virtual avatar face. Since daily interaction with computers shifts towards virtual environments that include virtual characters, i.e., "avatars" [2, 14, 15, 18, 26, 30], we explore human and virtual faces to better understand their influence on guiding visual attention. The target direction consisted of two levels: (1) left, with a target appearing on the left, and (2) right, with a target on the right. The selection of target appearance on the left and right sides was based on people's scanning and reading patterns on visual interfaces, which typically follow side-wise movements rather than up-down or in-between directions [9]. Lastly, the gaze validity also had two levels: (1) valid, with the eyes of the stimulus moving in the same direction where a target appeared, and (2) invalid, with the eyes of the stimulus moving in the opposite direction. We explore gaze validity to investigate the distraction and focus introduced by gaze cues, which are often employed to direct visual attention based on the direction of people's gaze, e.g., to allocate users' visual attention to content important to interact with [25]. To explore all levels of the independent variables, we created eight experimental conditions (2 stimuli x 2 target directions x 2 gaze validities) (Figure 1). The sequence of eight conditions was counterbalanced using a Balanced Latin square, as sketched below.
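A Balanced Latin square ensures that every condition appears in each serial position equally often and that each condition precedes every other condition equally often across orders. The following is a minimal sketch of the standard construction for an even number of conditions; the condition labels are our own shorthand (stimulus-direction-validity), not identifiers from the study software.

```python
def balanced_latin_square(n: int) -> list[list[int]]:
    """Standard construction for even n: first row 0, 1, n-1, 2, n-2, ...;
    every following row shifts all entries by one (mod n)."""
    first = [0 if k == 0 else (k + 1) // 2 if k % 2 else n - k // 2
             for k in range(n)]
    return [[(c + r) % n for c in first] for r in range(n)]

conditions = [f"{s}-{d}-{v}"
              for s in ("avatar", "human")
              for d in ("left", "right")
              for v in ("valid", "invalid")]

square = balanced_latin_square(len(conditions))
for participant in range(30):
    order = square[participant % len(square)]  # cycle the 8 orders over 30 people
    print(participant + 1, [conditions[i] for i in order])
```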
At the beginning of each condition, a fixation cross appeared in the middle of the screen, which was replaced by an avatar (300x455 pixels) after 670 ms. The avatar's gaze was initially directed toward the participant. After 900 ms, the stimulus' gaze changed to either left or right. After another 300 ms, a target shaped as a circle (610 pix) appeared 100 pixels away from the edge of the screen, either to the left or right, depending on the experimental condition. Each trial could therefore present a valid gaze cue, in which the target appears in the gaze direction of the stimulus, or an invalid one, in which the target appears in the opposite gaze direction. The subsequent condition started after 2000 ms with a fixation cross to bring participants' gaze back to the middle of the screen. Participants sat in a chair 60-65 cm away from the screen.
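For clarity, the trial structure above can be summarized as a fixed event sequence. The sketch below encodes the reported durations; the TrialEvent record and event names are illustrative, not taken from the study software.

```python
from dataclasses import dataclass

@dataclass
class TrialEvent:
    name: str
    duration_ms: int

def build_trial(stimulus: str, target_side: str, valid: bool) -> list[TrialEvent]:
    # Invalid trials cue the side opposite to where the target will appear.
    gaze_side = target_side if valid else ("left" if target_side == "right" else "right")
    return [
        TrialEvent("fixation_cross", 670),                # replaced by the avatar
        TrialEvent(f"{stimulus}_mutual_gaze", 900),       # gaze toward participant
        TrialEvent(f"{stimulus}_gaze_{gaze_side}", 300),  # gaze shift before target
        TrialEvent(f"target_{target_side}", 2000),        # target until next trial
    ]

# Example: human stimulus, target on the right, invalid cue (gaze goes left).
for event in build_trial("human", "right", valid=False):
    print(f"{event.name:>22}  {event.duration_ms:>5} ms")
```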
3.3 Apparatus
We employed the screen-based eye tracker Tobii Pro Nano (https://www.tobii.com/products/eye-trackers/screen-based/tobii-pro-nano) to assess participants' attention. We chose a screen-based eye tracker to create natural interaction for the participants and remove possible distractions caused by wearable eye-tracking glasses. The study
was conducted using a Dell XPS 15 laptop with a screen size of 15.6 inches and the screen-based eye tracker placed at the bottom of the screen (Figure 2). The study design followed three earlier developed approaches [1, 4, 25], which have been used to measure the gaze cueing effect. The virtual avatar was created using the Unity asset UMA 2 Multipurpose Avatar (https://assetstore.unity.com/packages/3d/characters/uma-2-unity-multipurpose-avatar-35611), which was customized to look similar to the human one. Before the experiment, we defined Areas of Interest (AOI) to create areas for collecting data on participants' gaze. We analyzed the data gathered from the eye-tracking sessions using the Analyze tab of Tobii Pro Lab.

Figure 2: Study setup: (a) a participant looking at the fixation cross in the middle of the screen before the target appears, and (b) a participant looking at the target circle on the left side of the screen with a human avatar in the middle of the screen.
3.4 Measurements
To compare participants' attention based on the different stimuli, target directions, and gaze validity, we measured the following dependent variables using the AOI tool in Tobii Pro Lab. By creating an AOI, researchers ensure that all gaze data within the AOI is registered and available for analysis. Metrics such as fixations, saccades, and visits are registered within the AOI. In this study, fixations are the main metric, and to measure them, Tobii Pro Lab provides metrics such as total duration, average duration, time to first fixation, and frequency of fixations. The following measures were gathered in this study (see the sketch after this list for an equivalent computation):
- Duration of fixations on a target (in ms): we measured how long participants fixated on a target in total and on average.
- Duration of fixations on a stimulus (in ms): we measured how long participants fixated on a stimulus in total and on average.
- Frequency of fixations: we measured how often participants fixated on a target and a stimulus on average.
- Time to the first fixation on a target (in ms): we measured the time between the target's appearance and participants' gaze landing on it.
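Tobii Pro Lab computes these metrics internally; an equivalent computation over exported fixation events reduces to filtering by AOI membership. A minimal sketch, assuming rectangular AOIs and reusing the fixation-dictionary shape of the I-DT sketch in Section 2.2 (both are our illustrative assumptions, not the tool's API):

```python
def in_aoi(fix: dict, aoi: tuple[float, float, float, float]) -> bool:
    """aoi = (left, top, width, height) in screen pixels."""
    left, top, width, height = aoi
    return left <= fix["cx"] <= left + width and top <= fix["cy"] <= top + height

def aoi_metrics(fixations: list[dict], aoi, target_onset_ms: float) -> dict:
    hits = [f for f in fixations if in_aoi(f, aoi)]
    total = sum(f["duration_ms"] for f in hits)
    return {
        "total_fixation_ms": total,                             # total duration
        "avg_fixation_ms": total / len(hits) if hits else 0.0,  # average duration
        "fixation_count": len(hits),                            # frequency
        "time_to_first_fixation_ms":                            # onset latency
            min((f["start_ms"] - target_onset_ms for f in hits), default=None),
    }
```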
3.5 Procedure
After obtaining informed consent, we explained the experiment's goal, allowed the participant to test the setup for familiarization purposes, and calibrated the eye tracker. Their task was to look at the middle of the screen at the beginning of each trial and at a circular target upon its appearance. At the end of the study, the participants reflected on their experience of the stimuli and the gaze cues. The entire study lasted approximately 15 minutes per participant.
4 RESULTS
We discovered that participants' total and average fixation on a target lasted longer with a human image than an avatar stimulus when a target appeared on the right side and when a stimulus' gaze was looking toward the target. Moreover, participants' average fixation was longer on a human than an avatar stimulus with an eye gaze in the opposite direction from a target than toward it. Lastly, participants glanced more often at a stimulus when the stimulus was looking toward the target, and it took longer for participants to glance at a target for the first time if the stimulus' gaze was not looking toward the target.
4.1 Duration of fixations on a target
4.1.1 Total duration. We discovered that participants' total fixation on a target lasted longer in the presence of a human stimulus (Md = 1630 ms, IQR = 305) than a virtual avatar (Md = 1529 ms, IQR = 992). As for the direction, participants' total fixation on a target lasted longer when it appeared on the right (Md = 1647 ms, IQR = 262) than on the left (Md = 1475 ms, IQR = 999). Lastly, participants' total fixation on a target lasted longer when the stimulus' eye gaze was looking in a target's direction (Md = 1658 ms, IQR = 308) than in the opposite (Md = 1496 ms, IQR = 983). These findings were supported by the statistically significant main effects for the stimulus type (F(1, 29) = 165, p < 0.001, η² = 0.85), target direction (F(1, 29) = 99, p < 0.001, η² = 0.77), and gaze validity (F(1, 29) = 96, p < 0.001, η² = 0.77). The post-hoc analysis showed statistically significant differences between all pairs (p < 0.001) for all independent variables.
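For readers who want to reproduce this style of analysis, the sketch below runs a three-way repeated-measures ANOVA on per-participant, per-condition aggregates with statsmodels. The CSV file and column names are hypothetical: the paper does not publish its analysis pipeline, and given that medians and IQRs are reported, the authors may have transformed the data before the F-tests.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long-format table: one row per participant x condition (hypothetical export).
df = pd.read_csv("fixation_totals.csv")

res = AnovaRM(
    data=df,
    depvar="total_fixation_ms",  # dependent variable
    subject="participant",       # 30 subjects -> denominator df = 29
    within=["stimulus", "target_direction", "gaze_validity"],
).fit()

tbl = res.anova_table  # columns: F Value, Num DF, Den DF, Pr > F
# AnovaRM does not report partial eta-squared; derive it from F and the dfs.
tbl["partial_eta_sq"] = (tbl["F Value"] * tbl["Num DF"]) / (
    tbl["F Value"] * tbl["Num DF"] + tbl["Den DF"]
)
print(tbl)
```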
Our statistical analysis revealed three statistically significant interaction effects: stimulus × target direction (F(1, 29) = 82.7, p < 0.001, η² = 0.74), stimulus × gaze validity (F(1, 29) = 46.7, p < 0.001, η² = 0.61), and target direction × gaze validity (F(1, 29) = 88.7, p < 0.001, η² = 0.75). As for the first interaction effect, the post-hoc analysis showed that participants' total fixation on a target lasted longer when it appeared on the right than on the left with a virtual avatar (p < 0.001) and a human stimulus (p < 0.001). The total fixation on a target was also longer in the presence of a human stimulus than a virtual avatar when a target appeared on the left (p < 0.001) and on the right (p < 0.001). As for the second interaction effect, participants' total fixation on a target lasted longer with a valid than an invalid gaze for a virtual avatar (p < 0.001) and a human stimulus (p < 0.001). The total fixation on a target was also longer with a human stimulus than a virtual avatar when a gaze was valid (p < 0.001) and invalid (p < 0.001). As for the third interaction effect, participants' total fixation on a target lasted longer with a valid gaze than with an invalid one (p < 0.001) when it appeared on the left, but when it appeared on the right, it lasted longer for the invalid than the valid gaze (p < 0.001). The total fixation on a target was also longer with an invalid gaze when a target appeared on the right than left (p < 0.001) and with a valid gaze when it appeared on the left than right (p < 0.001). The remaining pairwise comparisons were not statistically significant (p > 0.05).

Figure 3: Overview of the results: total and average fixations on a target and stimulus for the combination of independent variables: stimulus (avatar/human), target direction (left/right), and gaze validity (valid/invalid).
4.1.2 Average duration. Participants' average fixation on a target lasted longer in the presence of a human stimulus (Md = 1486 ms, IQR = 893) than a virtual avatar (Md = 893 ms, IQR = 1059). As for the direction, participants' average fixation on a target lasted longer when it appeared on the right (Md = 1516 ms, IQR = 904) than on the left (Md = 858 ms, IQR = 1077). Lastly, participants' average fixation on a target lasted longer when the stimulus' eye gaze was looking in a target's direction (Md = 1516 ms, IQR = 917) than in the opposite (Md = 841 ms, IQR = 1015). These findings were supported by the statistically significant main effects for the stimulus type (F(1, 29) = 28, p < 0.001, η² = 0.49), target direction (F(1, 29) = 20, p < 0.001, η² = 0.41), and gaze validity (F(1, 29) = 36, p < 0.001, η² = 0.56). The post-hoc analysis showed statistically significant differences between all pairs (p < 0.001) for all independent variables.
Our statistical analysis revealed three statistically significant interaction effects: stimulus × target direction (F(1, 29) = 16.9, p < 0.001, η² = 0.37), stimulus × gaze validity (F(1, 29) = 7.9, p < 0.001, η² = 0.21), and target direction × gaze validity (F(1, 29) = 19, p < 0.001, η² = 0.4). As for the first interaction effect, the post-hoc analysis showed that participants' average fixation on a target lasted longer when it appeared on the right than on the left in the presence of a virtual avatar (p < 0.05) and a human stimulus (p < 0.05). The average fixation on a target was also longer in the presence of a human stimulus than a virtual avatar when a target appeared on the left (p < 0.05) and on the right (p < 0.05). As for the second interaction effect, participants' average fixation on a target was comparable for all pairs (p > 0.05). As for the third interaction effect, participants' average fixation on a target lasted longer with a valid gaze than with an invalid one (p < 0.05) when it appeared on the left, but when it appeared on the right, it lasted longer for the invalid than the valid gaze (p < 0.05). The average fixation on a target was also longer with an invalid gaze when a target appeared on the right than left (p < 0.05) and with a valid gaze when it appeared on the left than right (p < 0.05). The remaining pairwise comparisons were not statistically significant (p > 0.05). Figure 3 provides a detailed overview of the results.
4.2 Duration of fixations on a stimulus
4.2.1 Total duration. We discovered that participants' total fixation on a stimulus was comparable for a human stimulus (Md = 221 ms, IQR = 172) and a virtual avatar (Md = 148 ms, IQR = 236). As for the direction, participants' total fixation on a stimulus was comparable when a target appeared on the right (Md = 231 ms, IQR = 168) and on the left (Md = 232 ms, IQR = 159). Lastly, participants' total fixation on a stimulus was comparable when the stimulus' eye gaze was looking in a target's direction (Md = 226 ms, IQR = 176) and in the opposite direction (Md = 236 ms, IQR = 149). These findings were supported by the statistically non-significant main effects for the stimulus type (F(1, 29) = 1.6, p > 0.05, η² = 0.05), target direction (F(1, 29) = 0.27, p > 0.05, η² = 0.001), and gaze validity (F(1, 29) = 0.11, p > 0.05, η² = 0.003). None of the interaction effects were statistically significant (p > 0.05).
4.2.2 Average duration. We discovered that participants' average fixation on a stimulus was comparable for a human stimulus (Md = 198 ms, IQR = 129) and a virtual avatar (Md = 218 ms, IQR = 125). As for the direction, participants' average fixation on a stimulus was comparable when a target appeared on the right (Md = 209 ms, IQR = 127) and on the left (Md = 217 ms, IQR = 130). Lastly, participants' average fixation on a stimulus was comparable when the stimulus' eye gaze was looking in a target's direction (Md = 197 ms, IQR = 130) and in the opposite direction (Md = 228 ms, IQR = 119). These findings were supported by the statistically non-significant main effects for the stimulus type (F(1, 29) = 0.01, p > 0.05, η² = 0.0003), target direction (F(1, 29) = 0.7, p > 0.05, η² = 0.024), and gaze validity (F(1, 29) = 0.7, p > 0.05, η² = 0.024). Additionally, there was one statistically significant interaction effect for stimulus × gaze validity (F(1, 29) = 8.6, p < 0.001, η² = 0.23). The post-hoc analysis showed that participants' average fixation was longer on a human stimulus with an eye gaze in the opposite direction from a target than toward it (p < 0.05). The remaining interaction effects were not statistically significant (p > 0.05). Figure 3 provides a detailed overview of the results.

Figure 4: Overview of the results: frequency of fixations on a target and stimulus, and time to the first fixation on a target for the combination of independent variables: stimulus (avatar/human), target direction (left/right), and gaze validity (valid/invalid).
4.3 Frequency of fixations
4.3.1 On a target. We discovered that participants glanced at a target a comparable number of times in the presence of a human stimulus (Md = 1, IQR = 1) and a virtual avatar (Md = 1, IQR = 1). The same applies to the left (Md = 1, IQR = 1) and right (Md = 1, IQR = 1) direction of the target's appearance, and the valid (Md = 1, IQR = 1) and invalid gaze direction (Md = 1, IQR = 1). These findings were supported by the statistically non-significant main effects for the stimulus type (F(1, 29) = 1.1, p > 0.05, η² = 0.03), target direction (F(1, 29) = 1.2, p > 0.05, η² = 0.04), and gaze validity (F(1, 29) = 1.26, p > 0.05, η² = 0.04). None of the interaction effects were statistically significant (p > 0.05).
4.3.2 On a stimulus. We discovered that participants glanced at a stimulus a comparable number of times in the presence of a human stimulus (Md = 1, IQR = 1) and a virtual avatar (Md = 1, IQR = 1). The same applies to the left (Md = 1, IQR = 1) and right (Md = 1, IQR = 1) direction of the target's appearance, and the valid (Md = 1, IQR = 1) and invalid gaze direction (Md = 1, IQR = 1). However, the main effect for the gaze validity was statistically significant (F(1, 29) = 0.04, p < 0.05, η² = 0.14), indicating that participants glanced more often at a stimulus looking in the direction of the target's appearance (p < 0.05). The remaining main and interaction effects were not statistically significant (p > 0.05). Figure 4 provides a detailed overview of the results.
4.4 Time to the first fixation on a target
We discovered that it took a comparable time to glance at a target after its appearance in the presence of a human stimulus (Md = 315 ms, IQR = 172) and a virtual avatar (Md = 300 ms, IQR = 157). The same applies to the left (Md = 317 ms, IQR = 174) and right (Md = 301 ms, IQR = 145) direction of the target's appearance, and the valid (Md = 273 ms, IQR = 159) and invalid gaze direction (Md = 324 ms, IQR = 143). However, the main effect for the gaze validity was statistically significant (F(1, 29) = 16.7, p < 0.001, η² = 0.37), indicating that it took longer for participants to glance at a target for the first time if a stimulus' gaze was not looking toward the target's appearance (p < 0.05). The remaining main and interaction effects were not statistically significant (p > 0.05). Figure 4 provides a detailed overview of the results.
4.5 Qualitative feedback
Regarding the gaze cues provided by the avatar, eleven (out of 30) participants mentioned that the cue was similar to the human gaze cue, as following the gaze direction was a similar experience. As P4 noted: "The animated figure seemed more tired, but it was equally simple to follow the gaze direction of them both." Additionally, participants commented that even though the avatar seemed a little more tired, following the gaze direction of both stimuli was equally simple. For example, P28 mentioned that "it felt natural when the avatar glanced at the object and therefore felt natural to follow the gaze direction." However, other participants commented that "I thought they were somewhat neutral in relative to one another. They did not stand out against each other. Same kind of facial expressions." [P7] and "I thought the avatar had a very unclear facial expression" [P3]. Two participants were curious about why they were tricked when the gaze was in the wrong direction. For example, some participants noted "The avatar looked diagonally to the side, which made me want to look there instead. I was more interested in seeing what it was looking at" [P8] and "It was harder to be 'prepared', you always wanted to glance in the direction the person was looking in" [P20].
Three participants (P2, P7 & P17) underlined that after some time, they stopped focusing as much on the eyes and used their peripheral vision to quickly identify where the object would appear. P7 further explained that at first, they trusted the human stimulus more and were more fooled by it, but after a while, even the trust aspect disappeared, and they learned not to always focus on the eyes. Trust was a factor that many participants stressed in different ways. For example, P19 and P29 thought both stimuli aimed to trick them: "It felt like they were trying to trick me, I was tricked the first time, but after that, I did not look where they looked on purpose. I did not trust them." [P29], while P23 differentiated the stimuli: "I trusted the human face more; when the animated face came I first thought it would look in the right direction. I had more trust in the human face and was tricked more by it." This opinion of being tricked by the human stimulus was also shared by P14, who explained it in the following way: "You look at the eyes more on the human as you recognize it to be more human eyes. With the avatar, it was easier to look away and at the object."
5 DISCUSSION AND FUTURE WORK
By exploring visual attention guidance using humans and avatars under valid and invalid gaze directions on 2D screens, we have shown conceptual differences and derived a set of design guidelines and practical implications, discussed in the following.
5.1 Human to focus and avatar to distract
With this experiment, we have shown fundamental differences in guiding visual attention using avatars and humans. We discovered that human gaze cues facilitate focus on a target, while avatars are better at distracting users' attention. This outcome is based on the result that participants' total and average fixation on a target lasted longer with human images than with an avatar stimulus. This applies to the situations when a target appears on the right side relative to the avatar and when a stimulus' gaze is looking in the target's direction, which is in line with previous work on gender differences in gaze cueing [1, 4]. These implications can go beyond interaction on laptops to smartphones as well. For example, during the interviews, participants mentioned Snapchat as a social media platform where they interact with avatars, since every person has a small avatar representing themselves. When interacting with other people on Snapchat or browsing the map, one can see other people's avatars. Therefore, participants may deem this an interaction with avatars. Still, it might be a completely different type of interaction compared to watching avatars in animated series/movies or video games, and it can provide more or less experience of interacting with avatars.
As for the xations on a stimulus, our ndings indicate that
participants xated on them for a comparable amount of time. This
highlights that the focus on a target was stronger and marginally
aected by the appearance of stimuli, e.g., participants tended to
focus better on human images than virtual avatars. Similarly, x-
ations on a target were comparable between human images and
virtual avatars, indicating one glance on average. This implies that
participants’ gaze did not tend to jump between a stimulus and a
target. However, we observed that participants glanced more often
at a stimulus when their gaze was toward a target. Most likely, after
quickly and successfully spotting a target, participants were look-
ing for a conrmation of their selection by looking at a stimulus
again. Lastly, participants did not feel that the gaze cues provided
by the stimuli diered greatly, which is in line with previously
mentioned studies [
2
,
14
,
18
] on decreased task completion time
with avatar gaze cues. Our evaluation aimed to add to the research
by taking a well-known gaze cue method and exploring its eect
on humans and avatars. Results thus show dierences and avatars’
gaze cues can allocate visual attention dierently to human gaze
cues, depending on the situation. Even though previous research,
such as [
2
,
14
,
18
,
26
,
30
], have proven that gaze cues from avatars
work, this study compares the gaze cues from the two stimuli. It
shows that there is no statistically signicant dierence in task
completion. This can lead to designers having avatars as a choice
of providing gaze cues, given that it is viable in the specic setting.
Therefore, we derived a set of design guidelines, which we list in
the following subsection.
As for the target direction, our results indicated that participants fixated longer on targets when they appeared on the right side, regardless of stimulus type. This could be explained by the cultural background of our sample, which consisted entirely of participants who read and write from left to right. Thus, moving their eyes from left to right could have been a natural and habitual movement, which introduced a directional bias in their spatial cognition [11]. This finding and explanation also align with our other results demonstrating that participants fixate more on a target on the left in the presence of a valid gaze, i.e., a stimulus looking toward a target, while the opposite effect occurs when a target appears on the right. This implies that participants' reading and writing direction possibly overrides the gaze cues presented by avatars, also known as script directionality effects that arise from left-to-right reading and writing habits [33].
5.2 Design guidelines
Based on the results of our study and the discussion above, we derived the following design guidelines (GL) for guiding visual attention on 2D screens using gaze cues with avatars and humans:
GL1: Human images are better suited to facilitate focus on a target, and avatars to guide focus away.
GL2: Gaze cues of a stimulus looking toward a target lead to better focus on it.
GL3: Using human images increases the speed of the first fixation.
GL4: The frequency of glances at a target is not affected by the avatar and human images or the direction of a target's appearance.
GL5: Target appearance on the right side leads to a longer total and average duration of fixations on it, independently of stimulus type.
GL6: Target appearance on the left combined with the stimulus' eyes looking in the target's direction leads to a longer total and average duration of fixations on it.
GL7: Target appearance on the right combined with the stimulus' eyes looking in the target's direction leads to a shorter total and average duration of fixations on it.
5.3 Practical implications
Gaze cues with avatars can find use in settings where avatars are used to communicate, such as video games or immersive learning, and not only in 2D interfaces such as websites and smartphone apps. Being able to use gaze cues in these settings can help make interactions feel more human-like while also decreasing task-completion times [2, 14, 15, 18] and error rates [2]. With participants sharing that following the gaze direction of the avatar felt natural, this is a good sign that avatars can appear more realistic and engaging with gaze cues. Therefore, this can lead to increased immersiveness when presented in a 3D space, e.g., in virtual reality, increased user learning, and overall enjoyment [27].
Gaze cues are often used in advertisement settings, where the goal is to allocate visual attention to the desired parts of the advertisements, which often is the product or brand [28, 32]. Based on our findings, game designers could, for example, use their avatars in their advertisements to provide gaze cues, increasing exposure to their brand and product. At the same time, using their already branded avatars instead of unknown humans may also strengthen their branding, as the avatars may pique the interest of the people looking at the advertisements. Similarly, animated movies or series may also use their already-created avatars instead of images of unknown humans for marketing purposes. It is also important to distinguish the settings where gaze cues with avatars might produce different results than with humans. If avatars are used in settings where they feel misplaced, they may allocate attention to the avatar itself instead of in the gaze direction and, therefore, not produce the intended results, which they might in a relevant setting. In the after-study interviews, some participants mentioned that they stopped looking at the stimulus' eyes after several trials. Running the same test twice, but with different time pauses, would affect the viability of the gaze cues and lead to more participants strictly using their peripheral vision to locate the object instead of following the gaze direction.
Previous work [1, 4] also included a practice task meant to serve as a test run and, combined with the instructions, prepare the participants for the test. Because our test ran only eight trials, compared to other studies that had significantly more, we chose not to include practice tasks, to minimize the learning effect on the participants. Instead, the instructions were tested in a pilot study and refined to better explain the test and how the trials would work. As for the first trial of every participant, there was no clear difference in the time to first fixation, and the conditions that were randomly ordered first were evenly distributed across the eight trials, meaning that roughly the same number of invalid and valid trials appeared as the first trial participants saw. For future studies, a practice task might still be viable to prepare the participants better and avoid large differences in time to the first fixation.
5.4 Future work
Further research on whether participants' previous experience with avatars affects their interaction with them would be an interesting continuation of this line of research. In this study, there was no formal questionnaire to measure previous experience and no way of determining how each medium influences the experience gained. Finding a way of ranking experiences and having a validated method to measure them would be a valuable contribution. In this study, we used only a male avatar and a male human image to provide the gaze cues, leaving the gender aspect out of scope. However, there might be a correlation between a participant's gender and gaze behaviors [4]. Future work could expand the gender aspect to focus on the stimuli that present the gaze cues, to see if there is a difference in how people allocate visual attention based on the gender of the stimuli and whether the stimulus type affects this difference. Moreover, this evaluation focused on interaction on a single screen and focus on a target without an implicit interaction. Future work can further explore setups with multiple screens and use participants' gaze as input for entertainment, visual attention guidance, or eye-tracking calibration methods [34].
6 LIMITATIONS
The experiment presented in this work focused on systematically evaluating three aspects of visual attention guidance on a laptop screen, due to the availability of powerful eye-tracking systems for this setup. However, other types of devices with 2D user interfaces, e.g., smartphones and tablets, should be considered in future work. Another limitation of this work is that avatars and targets were presented to participants in isolation from a context, e.g., a website. The presence of other user interface elements would compete for users' attention even more and likely lead to a lower duration and frequency of fixations on a target. However, we aimed to provide an initial empirical evaluation of the effects of avatars and humans on visual attention guidance to create a baseline, and future work should consider more complicated and realistic scenarios for attention guidance. Within the scope of this study, we investigated one representation of an avatar and one representation of a human. Other representations might lead to different results and should be explored more systematically in future work, especially since some avatars might create better emotional connections with users than others, e.g., based on familiarity or cultural background. Moreover, only eye gaze played a guiding role in this experiment; adding facial movements, e.g., lip or eyebrow movements, might guide users' attention even better due to a higher level of avatar expressiveness.
7 CONCLUSION
In this work, we investigated how to direct users' visual attention on 2D screens using gaze stimuli from avatars and humans. We found that participants' total and average fixations on a target lasted longer when a human rather than an avatar stimulus was present, the target appeared on the right side, and the stimulus' gaze was directed toward the target. In addition, participants' average fixation was longer on a human than an avatar stimulus when the gaze was directed in the opposite direction of the target rather than toward it. Finally, participants glanced more frequently at a stimulus when the stimulus' gaze was directed toward the target's appearance, and it took longer for participants to first glance at a target when the stimulus' gaze was not directed toward the target's appearance.
ACKNOWLEDGMENTS
We would like to thank all participants who took part in our experiment.
REFERENCES
[1]
N. Alwall, D. Johansson, and S. Hansen. 2010. The gender dierence in gaze-
cueing: Associations with empathizing and systemizing. Personality and Individ-
ual Dierences 49, 7 (Nov. 2010), 729–732. https://doi.org/10.1016/j.paid.2010.06.
016
[2]
Sean Andrist, Michael Gleicher, and Bilge Mutlu. 2017. Looking Coordinated:
Bidirectional Gaze Mechanisms for Collaborative Interaction with Virtual Char-
acters. In Proceedings of the 2017 CHI Conference on Human Factors in Computing
Systems (CHI ’17). Association for Computing Machinery, New York, NY, USA,
2571–2582. https://doi.org/10.1145/3025453.3026033
[3]
Sean Andrist, Tomislav Pejsa, Bilge Mutlu, and Michael Gleicher. 2012. De-
signing eective gaze mechanisms for virtual agents. In Proceedings of the
SIGCHI Conference on Human Factors in Computing Systems (CHI ’12). As-
sociation for Computing Machinery, New York, NY, USA, 705–714. https:
//doi.org/10.1145/2207676.2207777
[4]
Andrew P. Bayliss, Giuseppe di Pellegrino, and Steven P. Tipper. 2005. Sex
dierences in eye gaze and symbolic cueing of attention. The Quarterly Journal
of Experimental Psychology Section A 58, 4 (May 2005), 631–650. https://doi.org/
10.1080/02724980443000124
[5]
Rosy Boardman and Helen Mccormick. 2021. Attention and behaviour on fashion
retail websites: an eye-tracking study. Information Technology & People 35, 7 (Jan.
2021), 2219–2240. https://doi.org/10.1108/ITP-08- 2020-0580
[6]
Benjamin T. Carter and Steven G. Luke. 2020. Best practices in eye tracking
research. International Journal of Psychophysiology 155 (Sept. 2020), 49–62. https:
//doi.org/10.1016/j.ijpsycho.2020.05.010
[7]
Jeanette A. Chacón-Candia, Juan Lupiáñez, Maria Casagrande, and Andrea
Marotta. 2023. Eye-Gaze direction triggers a more specic attentional ori-
enting compared to arrows. PLOS ONE 18, 1 (Jan. 2023), e0280955. https:
//doi.org/10.1371/journal.pone.0280955
[8]
Robert B. Cialdini and Robert B. Cialdini PH.D, PhD. 1993. Inuence (rev): The
Psychology of Persuasion. HarperCollins. Google-Books-ID: mTYj9XUlYvMC.
[9]
Doaa Farouk Badawy Eldesouky. 2013. Visual Hierarchy and Mind Motion
in Advertising Design. Journal of Arts and Humanities 2, 2 (2013), 148–162.
https://doi.org/10.18533/journal.v2i2.78
[10]
Sukru Eraslan, Victoria Yaneva, Yeliz Yesilada, and Simon Harper. 2019. Web
users with autism: eye tracking evidence for dierences. Behaviour & Information
Technology 38, 7 (July 2019), 678–700. https://doi.org/10.1080/0144929X.2018.
1551933
[11]
Naseh Faghihi and Jyotsna Vaid. 2023. Reading/writing direction as a source of
directional bias in spatial cognition: Possible mechanisms and scope. Psychonomic
Bulletin & Review 30, 3 (2023), 843–862. https://doi.org/10.3758/s13423- 022-
02239-1
[12]
Kassandra Friebe, Sabína Samporová, Kristína Malinovská, and Matej Homann.
2022. Gaze Cueing and the Role of Presence in Human-Robot Interaction. In
Social Robotics (Lecture Notes in Computer Science), Filippo Cavallo, John-John
Cabibihan, Laura Fiorini, Alessandra Sorrentino, Hongsheng He, Xiaorui Liu,
Yoshio Matsumoto, and Shuzhi Sam Ge (Eds.). Springer Nature Switzerland,
Cham, 402–414. https://doi.org/10.1007/978-3- 031-24667- 8_36
[13]
Robert M. Joseph, Zachary Fricker, and Brandon Keehn. 2015. Activation of
frontoparietal attention networks by non-predictive gaze and arrow cues. Social
Cognitive and Aective Neuroscience 10, 2 (Feb. 2015), 294–301. https://doi.org/
10.1093/scan/nsu054
[14]
Mahdi Khoramshahi, Ashwini Shukla, Stéphane Raard, Benoît G. Bardy, and
Aude Billard. 2016. Role of Gaze Cues in Interpersonal Motor Coordination:
Towards Higher Aliation in Human-Robot Interaction. PLOS ONE 11, 6 (June
2016), e0156874. https://doi.org/10.1371/journal.pone.0156874
[15]
Simon Kimmel, Frederike Jung, Andrii Matviienko, Wilko Heuten, and Susanne
Boll. 2023. Let’s Face It: Inuence of Facial Expressions on Social Presence
in Collaborative Virtual Reality. In Proceedings of the 2023 CHI Conference on
Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association
for Computing Machinery, New York, NY, USA, Article 429, 16 pages. https:
//doi.org/10.1145/3544548.3580707
[16]
Michael Lankes and Argenis Gomez. 2022. GazeCues: Exploring the Eects of
Gaze-based Visual Cues in Virtual Reality Exploration Games. Proceedings of
the ACM on Human-Computer Interaction 6 (Oct. 2022), 1–25. https://doi.org/10.
1145/3549500
[17]
Apoorva Rajiv Madipakkam, Gabriele Bellucci, Marcus Rothkirch, and Soyoung Q.
Park. 2019. The inuence of gaze direction on food preferences. Scientic Reports
9, 1 (April 2019), 5604. https://doi.org/10.1038/s41598-019- 41815-9
[18]
Santiago Martinez, Robin Sloan, Andrea Szymkowiak, and Kenneth Scott-Brown.
2011. Animated Virtual Agents to Cue User Attention. International Journal On
Advances in Intelligent Systems 4 (Jan. 2011), 299–308.
[19]
Takashi Mitsuda, Mio Otani, and Sayana Sugimoto. 2019. Gender and individual
dierences in cueing eects: Visuospatial attention and object likability. Attention,
Perception, & Psychophysics 81, 6 (Aug. 2019), 1890–1900. https://doi.org/10.3758/
s13414-019- 01743-2
[20]
Jewoong Moon and Jeeheon Ryu. 2021. The eects of social and cognitive
cues on learning comprehension, eye-gaze pattern, and cognitive load in video
instruction. Journal of Computing in Higher Education 33, 1 (April 2021), 39–63.
https://doi.org/10.1007/s12528-020- 09255-x
[21] Francisco Muñoz-Leiva, Janet Hernández-Méndez, and Diego Gómez-Carmona.
2019. Measuring advertising eectiveness in Travel 2.0 websites through eye-
tracking technology. Physiology & Behavior 200 (March 2019), 83–95. https:
//doi.org/10.1016/j.physbeh.2018.03.002
[22]
Xufang Pang, Ying Cao, Rynson W. H. Lau, and Antoni B. Chan. 2016. Directing
user attention via visual ow on web designs. ACM Transactions on Graphics 35,
6 (Dec. 2016), 240:1–240:11. https://doi.org/10.1145/2980179.2982422
[23]
Kara Pernice. 2019. Text Scanning Patterns: Eyetracking Evidence. https:
//www.nngroup.com/articles/text-scanning-patterns- eyetracking/
[24]
Kara Pernice and Jakob Nielsen. 2009. How to Conduct Eyetracking Studies |
Nielsen Norman Group Report. https://www.nngroup.com/reports/how-to-
conduct-eyetracking- studies/
[25]
Qian Qian, Miao Song, and Keizo Shinomori. 2013. Gaze cueing as a function of
perceived gaze direction. Japanese Psychological Research 55, 3 (2013), 264–272.
https://doi.org/10.1111/jpr.12001
[26]
Radiah Rivu, Ken Pfeuer, Philipp Müller, Yomna Abdelrahman, Andreas Bulling,
and Florian Alt. 2021. Altering Non-Verbal Cues to Implicitly Direct Atten-
tion&nbsp;in&nbsp;Social&nbsp;VR. In Proceedings of the 2021 ACM Sympo-
sium on Spatial User Interaction (Virtual Event, USA) (SUI ’21). Association
for Computing Machinery, New York, NY, USA, Article 18, 2 pages. https:
//doi.org/10.1145/3485279.3485309
[27]
K. Ruhland, C. E. Peters, S. Andrist, J. B. Badler, N. I. Badler, M. Gleicher, B. Mutlu,
and R. McDonnell. 2015. A Review of Eye Gaze in Virtual Agents, Social Robotics
and HCI: Behaviour Generation, User Interaction and Perception. Computer
Graphics Forum 34, 6 (2015), 299–326. https://doi.org/10.1111/cgf.12603
[28]
Pitch Sajjacholapunt and Linden Ball. 2014. The inuence of banner ad-
vertisements on attention and memory: human faces with averted gaze can
enhance advertising eectiveness. Frontiers in Psychology 5 (2014). https:
//www.frontiersin.org/articles/10.3389/fpsyg.2014.00166
[29]
Graham G. Scott and Christopher J. Hand. 2016. Motivation determines Facebook
viewing strategy: An eye movement analysis. Computers in Human Behavior 56
(March 2016), 267–280. https://doi.org/10.1016/j.chb.2015.11.029
[30]
William Steptoe, Robin Wol, Alessio Murgia, Estefania Guimaraes, John Rae,
Paul Sharkey, David Roberts, and Anthony Steed. 2008. Eye-Tracking for Avatar
Eye-Gaze and Interactional Analysis in Immersive Collaborative Virtual Environ-
ments. In Proceedings of the 2008 ACM Conference on Computer Supported Cooper-
ative Work (San Diego, CA, USA) (CSCW ’08). Association for Computing Ma-
chinery, New York, NY, USA, 197–200. https://doi.org/10.1145/1460563.1460593
[31]
David Stevens. 2022. Optimizing Visual Cues in Educational Software. In Aug-
mented Cognition: 16th International Conference, AC 2022, Held as Part of the 24th
HCI International Conference, HCII 2022, Virtual Event, June 26 July 1, 2022,
Proceedings. Springer-Verlag, Berlin, Heidelberg, 287–303. https://doi.org/10.
1007/978-3- 031-05457- 0_23
[32] Rita Ngoc To and Vanessa M. Patrick. 2021. How the Eyes Connect to the Heart: The Influence of Eye Gaze Direction on Advertising Effectiveness. Journal of Consumer Research 48, 1 (June 2021), 123–146. https://doi.org/10.1093/jcr/ucaa063
[33] Jyotsna Vaid. 2011. Asymmetries in representational drawing: Alternatives to a laterality account. Spatial dimensions of social thought 18 (2011), 231–255. https://doi.org/10.1515/9783110254310.231
[34] Simon Voelker, Andrii Matviienko, Johannes Schöning, and Jan Borchers. 2015. Combining Direct and Indirect Touch Input for Interactive Workspaces Using Gaze Input. In Proceedings of the 3rd ACM Symposium on Spatial User Interaction (Los Angeles, California, USA) (SUI ’15). Association for Computing Machinery, New York, NY, USA, 79–88. https://doi.org/10.1145/2788940.2788949
[35] Jiahui Wang, Pavlo Antonenko, Mehmet Celepkolu, Yerika Jimenez, Ethan Fieldman, and Ashley Fieldman. 2019. Exploring Relationships Between Eye Tracking and Traditional Usability Testing Data. International Journal of Human–Computer Interaction 35, 6 (April 2019), 483–494. https://doi.org/10.1080/10447318.2018.1464776
[36] Qiang Yang, Yuanjian Zhou, Yushi Jiang, and Jiale Huo. 2021. How to overcome online banner blindness? A study on the effects of creativity. Journal of Research in Interactive Marketing 15, 2 (Jan. 2021), 223–242. https://doi.org/10.1108/JRIM-12-2019-0212
[37] Yanxia Zhang, Ken Pfeuffer, Ming Ki Chong, Jason Alexander, Andreas Bulling, and Hans Gellersen. 2017. Look together: using gaze for assisting co-located collaborative search. Personal and Ubiquitous Computing 21 (2017), 173–186. https://doi.org/10.1007/s00779-016-0969-x
Purpose This study aims to explore whether creativity can overcome banner blindness in the viewing of web pages and demonstrate how visual saliency and banner-page congruity constitute the boundary conditions for creativity to improve memory for banner ads. Design/methodology/approach Three studies were conducted to understand the influence of advertising creativity and banner blindness on recognition of banner ads, which were assessed using questionnaires and bias adjustment. The roles of online user tasks (goal-directed vs free-viewing), visual saliency (high vs low) and banner-page congruity (congruent vs incongruent) were considered. Findings The findings suggest that creativity alone is not sufficient to overcome the banner blindness phenomenon. Specifically, in goal-directed tasks, the effect of creativity on recognition of banner ads is dependent on banner ads’ visual saliency and banner-page congruity. Creative banners are high on visual saliency, and banner-page congruity yields higher recognition rates. Practical implications Creativity matters for attracting consumer attention. And in a web page context, where banner blindness prevails, the design of banners becomes even more important in this respect. Given the prominence of banners in online marketing, it is also necessary to tap the potential of creativity of banner ads. Originality/value First, focusing on how creativity influences memory for banner ads across distinct online user tasks not just provides promising theoretical insight on the tackling of banner blindness but also enriches research on advertising creativity. Second, contrary to the popular belief of extant literature, the findings suggest that, in a web page context, improvement in memory for banner ads via creativity is subject to certain boundary conditions. Third, a computational neuroscience software program was used in this study to assess the visual saliency of banner ads, whereas signal detection theory was used for adjustment of recognition scores. This interdisciplinary examination combining the two perspectives sheds new light on online advertising research.