Validating the Benefits of Glanceable and Context-Aware Augmented
Reality for Everyday Information Access Tasks
Shakiba Davari, Feiyu Lu, Doug A. Bowman
Center for Human-Computer Interaction
Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
(e-mail: sdavari@vt.edu, feiyulu@vt.edu, dbowman@vt.edu)
Figure 1: We propose a context-aware augmented reality interface which minimizes intrusiveness while providing fast and easy
information access during both solo and social contexts: (a) In a solo context, the virtual content is glanceable and opaque when not
occluding any object of importance in the real world; (b) In a social context, the real-world is prioritized, keeping the virtual content
transparent when not needed, avoiding occlusion, distraction, and visual clutter; (c) In a social context, a piece of virtual content
relevant to the ongoing conversation moves up and turns opaque to provide fast and easy access without blocking the conversation
partner’s face.
ABSTRACT
Glanceable Augmented Reality interfaces have the potential to pro-
vide fast and efficient information access for the user. However,
where to place the virtual content and how to access it depend
on the user context. We designed a Context-Aware AR interface that
can intelligently adapt for two different contexts: solo and social. We
evaluated information access using Context-Aware AR compared to
current mobile phones and non-adaptive Glanceable AR interfaces.
We found that in a solo scenario, compared to a mobile phone, the
Context-Aware AR interface was preferred, easier, and significantly
faster; it improved the user experience; and it allowed the user to bet-
ter focus on their primary task. In the social scenario, we discovered
that the mobile phone was slower, more intrusive, and perceived as
the most difficult. Meanwhile, Context-Aware AR was faster for
responding to information needs triggered by the conversation; it
was preferred and perceived as the easiest for resuming conversation
after information access; and it improved the user’s awareness of the
other person’s facial expressions.
Index Terms: Human-centered computing—Mixed / augmented re-
ality; Human-centered computing—Interaction techniques; Human-
centered computing—Empirical Studies in HCI
1 INTRODUCTION
In modern life people rely on digital information to perform a variety
of daily tasks [4, 7]. This leads to near-constant information acquisition
in different situations and contexts [32]. For example, digital
information can be needed when you are alone reading a book and
would like to check your emails, when you are walking down the
street and need to check the map, or when you are planning a hike
with a friend and would like to check the weather for that weekend.
Currently, mobile phones are the most pervasive personal com-
puting devices that enable convenient information access [7,32].
However, acquiring information via mobile phones requires physical
interaction which occupies the user’s hands, and it can distract the
user’s attention from their environment. Phones also have limited
display size, so they typically only allow access to one app at a time,
and thus require cognitive effort to find and open different apps when
seeking multiple types of information.
For example, if someone is in a conversation about their availabil-
ity to grab a coffee, they may need to: (1) pull out their phone from
their pocket, (2) unlock the screen, (3) find their calendar app and
open it, and then (4) check for available times. Performing all these
steps takes time and induces mental workload, which can distract
the user from their on-going conversation. In addition, during this
process the user mainly has their eyes on the device and pays less
attention to the other person involved in the conversation. Finally,
the other party may be unaware of the motivation behind this in-
formation access, which can increase social awkwardness. While a
wearable smartwatch can address some of these issues by providing
more instant and discreet access to certain information, its display
size limitations restrict the ability to perform many tasks.
Recent advances in the quality and affordability of augmented
reality (AR) head-worn displays (HWDs) improve the chances that
ubiquitous lightweight and powerful AR glasses may replace our
smartphones as the primary tools for all-day information access.
Feiner envisioned the future of AR HWDs as “much like telephones
and PCs” and displaying information “that we expect to see both
at work and at play” [11]. These future AR systems will have the
potential to support continuous hands-free access to multiple pieces
of information without requiring the user to close and open appli-
cations each time. These glasses are anticipated to make possible
“Glanceable AR Interfaces” in which virtual content can be displayed
anywhere and anytime without the need for physical displays, and
information acquisition can be as effortless as a glance [6,23,25].
While Glanceable AR information access may be highly conve-
nient and efficient, there is also a tradeoff between the persistent
accessibility of information and the visual clutter, occlusion, and
distraction that it can cause. However, the balance of concerns in this
tradeoff depends on the user’s context. Each context presents its own
challenges and criteria for success. An interface that is desirable
for one context can be completely inappropriate for another. For
example, mobile contexts present challenges such as social interac-
tions, restricted time for information access [32], where to present
the virtual content and how it should follow the user [18]. Thus, the
challenge for Glanceable AR is “providing the right information, at
the right time, and in the right form for the current context” [32].
For example, when using an AR interface for information access,
while alone in the static and familiar environment of one’s own room,
the constant presence of multiple glanceable virtual apps is desirable
since information access convenience would be prioritized. However,
when walking down a busy street, the same interface would create
challenges such as occlusion and visual clutter. In this context, the
user’s safety and awareness of their ever-changing environment is
more crucial than the ease and efficiency of information access.
In social contexts, Glanceable AR interfaces should prioritize a
clear view of the other person’s face to avoid interrupting the social
interactions and creating socially awkward situations.
In this research, we consider how Glanceable AR systems can
sense the user’s context and adapt to it, in order to manage the
tradeoff between efficient information access and ability to view and
interact with the real world. We propose Context-Aware Glance-
able AR interfaces that intelligently adapt to multiple contexts and
scenarios. The contributions of this work include: (1) design of a
Context-Aware AR system to differentiate between solo and social
contexts and adapt the virtual content display and information access
technique to it; (2) evidence that both non-adaptive and Context-
Aware Glanceable AR provide more efficient and less intrusive
information access and a better user experience than current mobile
phones in two different contexts; and (3) validation of the effective-
ness of Context-Aware AR, compared to non-adaptive Glanceable
AR, in a social context.
2 RELATED WORK
2.1 Pervasive and Everyday AR
In 2016, Grubert et al. proposed that future everyday AR interfaces
will be pervasive, universal, and omnipresent, enabling continuous
augmentation to everyday scenarios and tasks [13]. One benefit
of such an interface is to fulfill the ubiquitous information needs
of users [33]. Similar to how people rely on mobile phones and
wearables today, users could obtain information directly through the
AR glasses without pulling out a separate display. For example, Di
Verdi et al. proposed ARWin, a desktop augmented by everyday
information such as weather and clock [9]. Langlotz et al. proposed
ARBrowser, a mobile application that augmented the real world
with relevant information anytime [20]. Lu & Bowman explored
applications of AR HWDs to display personal apps in everyday sce-
narios [23]. Using an AR device’s sensors, Morrison et al. designed
an open-ended AI system to extend a blind child’s capabilities in
social situations [26]. Their results in a real-world deployment study
shed light on the potential of AR HWDs to support the everyday
general-purpose information needs of users.
However, to fulfill the pervasive and everyday AR vision, two
research gaps need to be filled. First, it remains unclear how AR
compares to existing devices in terms of obtaining daily information
in different scenarios. Prior work has indicated that users believe AR
might be more convenient than smartphones for information access
[23], but we are not aware of any direct empirical comparisons.
Second, most existing studies on AR information access focus on
single-user scenarios. How do users perceive using AR HWDs
for information access not only in situations when they are alone,
but also during conversations which involve active participation of
co-present others? In this research, we aimed to fill both gaps by
running a systematic comparison between AR and mobile phones in
both single-user and social contexts.
2.2 Context-Aware Adaptive AR
To be prepared for a pervasive AR future, as Grubert et al. suggested,
it is necessary that an “AR system can adapt to the current, changing
requirements and constraints of the user’s context and thus allows
for a continuous usage” [13]. To reach this design goal, the interface
must be adaptive both spatially (e.g., layouts, and frame of reference)
and visually (e.g., size, transparency, and level of detail). Existing
research has extensively explored user-triggered adaptation of AR
interfaces. For example, Lages & Bowman explored adaptation of
the layout and frame of reference of virtual windows via button
presses on a handheld controller [18,19]. The use of gaze, hand, and
head-based inputs for varying the transparency and level of detail of
virtual information has been extensively explored [24, 28-30]. Ens
et al. explored adaptation of size and fixation of the virtual windows
in AR HWDs through a finger tap [10].
Although user-triggered adaptations could lead to good control-
lability and predictability [31], they also require users to determine
what, when, and how to adapt, increasing the user workload. Davari
et al. found that automatic adaptation of AR interfaces could lead
to an increased level of efficiency in information access tasks as
compared to user-triggered adaptations [6]. As such, recent research
has also looked into “context-aware AR interfaces” [13], in which
the interface adapts automatically based on context changes.
The term “context” refers to characteristics of people, places, or
objects that are considered relevant to the interaction between the
user and the interface [8]. For example, Lindlbauer et al. explored
context-aware adaptation of frame of reference, level of detail, and
spatial location of the AR content based on real-time estimated
cognitive load and task environment [22]. Cheng et al. explored
layout adaptation of AR windows based on semantic changes when
users move between multiple different locations [3]. In this study,
we mainly looked into the context of conversation, in which users
are having interpersonal interactions while using AR interfaces.
2.3 AR UIs in Social Situations
If AR interfaces are to be used in everyday situations, social sce-
narios would be one of the most frequent use cases. Research has
found that users are more likely to acquire certain information dur-
ing social conversations, which were found to be key contextual
factors in prompting information needs of the users [5,32]. However,
one major issue is the social acceptance of AR HWDs. HWDs are
inherently designed for only one user and may lead to isolation of
the wearer and exclusion of others [15]. Everyday AR users will
be immersed in private and personalized virtual experiences, which
could be invisible to collocated others due to privacy concerns [14].
Frequent use of mobile phones during social scenarios can be
considered impolite because it indicates a lack of attention to the
speaker [1,16]. Similarly, during conversation, virtual content in AR
HWDs might take the user’s attention away from the interlocutor,
yielding uncomfortable social interactions.
Twenty years ago, Höllerer et al. explored an AR interface that
intelligently moved away when it occluded other people’s heads in
a social space [17]. Similarly, recent work by Orlosky et al. pro-
posed a non-invasive view management AR system that changed
its interface layouts automatically to avoid interfering with interper-
sonal interactions [27]. However, in some cases information could
be needed for smooth progression of the conversation, so simply
moving or removing the virtual content is not sufficient. While
pulling out a mobile phone could be considered socially intrusive
and rude, intelligently designed AR interfaces have the potential
to offer an alternative that supports users' socially prompted
information needs in a quick, convenient, and unobtrusive manner,
one that may be perceived as less intrusive and more appealing in social
contexts. Our proposed context-aware Glanceable AR solution aims
to address these challenges in social contexts with AR HWDs.
3 CONTEXT-AWARE GLANCEABLE AR
For Glanceable AR interfaces to be accepted as a replacement for
proven and familiar technologies like smartphones, it is not suf-
ficient to simply provide more efficient and reliable information
access. They must also do so in a wide variety of real-world usage
contexts, without causing additional problems such as distraction,
safety hazards, or obtrusiveness. At first glance, these design goals
may appear to be incompatible. Increased efficiency and produc-
tivity can be achieved by using the expansive/immersive display
space of AR to display more information, but this approach would
inevitably lead to more obtrusiveness, clutter, and occlusion.
Intelligent AR interfaces that are aware of the user’s context pro-
vide the potential to achieve both these goals. Such interfaces can
automatically adapt to the user’s context based on different contex-
tual cues, of which the user may or may not be aware. Considering
the context allows the interface to display the right information at
the right time in the right place, while also supporting the user’s
awareness of and interaction with the real world.
We have designed and prototyped a Context-Aware Glanceable
AR interface that can adapt to provide a balance of efficiency and
unobtrusiveness in multiple contexts. As a proof-of-concept, our
interface adapts between two common usage contexts: a static, solo
context and a dynamic social context. To our knowledge, our work is
the first to prototype a socially aware and supportive Context-Aware
AR interface that facilitates conversation between two users and
to evaluate it in a highly ecologically valid conversation.
In the solo context, the user is seated, stationary, alone, and
focused only on a single object (e.g., a book or laptop) in the real-
world. The real-world object of interest is somewhere below the
user’s eye level. Therefore, virtual content that is at the user’s
eye level will be unobtrusive and will not occlude the object of
interest. The interface can also prioritize virtual content (by making
it opaque) without distracting the user. This interface for the solo
context (Fig. 1a) allows efficient information access (via a quick
upward glance) without causing any obstruction or visual clutter.
The interface for the solo context would be completely inappro-
priate for a social context, however. In the social context, the user is
standing and conversing with one or more others. The conversation
partners may move, leave the conversation, or join the conversation
in progress. Furthermore, some conversation topics may necessi-
tate information access. Thus, the context-aware interface for this
context should also be dynamic and intelligent, adapting to both
the presence of conversation partners and the topics of conversation.
To design a Context-Aware Glanceable AR interface for the social
context, we propose three design considerations (DCs). Using the
user context, DC1 and DC2 aim to address the existing challenges
of AR (i.e., visual clutter and obtrusiveness), while DC3 aims at
improving information access efficiency.
DC1. Real-World prioritization. Constantly visible virtual content
produces visual clutter and distraction when engaged in dynamic
real-world tasks [6]. Our design prioritizes the real-world by keeping
the virtual content’s level of transparency high (Fig. 2a). This allows
the user to see the outline of the content and know where to find the
information when needed, but ignore it otherwise.
DC2. Support for viewing social cues. Facial expressions play
Figure 2: Screenshots of our prototype in the social context. (a)
When in conversation, real-world is prioritized by keeping the virtual
content’s level of transparency high. (b) The desired virtual content
is provided automatically based on the content of the conversation
without occluding the conversation partner’s faces.
an important role in social interactions [2]. For example, ostensive
gestures such as the eyebrow flash can indicate the intention to
communicate and are of great importance to the receiver [12]. If
virtual content is blocking the interlocutor’s face, the user will miss
their facial expressions, just as they would if they were looking down
at their mobile phone. Similar to Höllerer et al. [17], we propose that
in conversation, the AR interface should not occlude other people’s
faces, leaving space for sufficient attention and social awareness
(Fig. 2b). However, our interface moves the virtual content away
only when it occludes the faces of people who are actively engaged
in a conversation with the user. This reduces over-generalization of
the no-occluded-faces rule and enables the user to place their virtual
content anywhere they want without undesired adaptations.
DC3. Support for socially relevant information access. Since a
significant number of information inquiries arise from something
mentioned by others, conversation is a context that prompts many
information acquisitions [32]. Based on the content of the con-
versation, an intelligent interface can support the conversation by
automatically providing the desired information, leading to a smooth
and less interrupted conversation experience (Fig. 2b). Our interface
supports this by automatically making relevant apps opaque based
on speech recognition during conversation, and by placing those
apps near the interlocutor's face, but without occluding it.
Fig. 1 illustrates our proof-of-concept context-aware Glanceable
AR interface in a social and a solo context. Our system can adapt
virtual content placement and transparency automatically to reflect
both the static and dynamic aspects of the user’s current context.
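To make this policy concrete, the sketch below illustrates one way the two-context adaptation could be expressed in code. It is a simplified illustration rather than the implementation used in our prototype: the class, function, and placement names are hypothetical, and only the solo/social switch and the DC1-DC3 behaviors described above are assumed.

```python
# Hypothetical sketch of the two-context adaptation policy; not the prototype's actual code.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Context:
    faces_detected: bool        # face detector sees someone in front of the user
    speech_detected: bool       # speech recognizer hears an active conversation
    keyword_app: Optional[str]  # app matching a recognized conversation keyword, if any

def is_social(ctx: Context) -> bool:
    # Social context = at least one face AND ongoing speech.
    return ctx.faces_detected and ctx.speech_detected

def adapt(apps, ctx: Context):
    """Return a display state (opacity, placement) for each glanceable app."""
    layout = {}
    for app in apps:
        if not is_social(ctx):
            # Solo context: virtual content is prioritized -- opaque, at eye level (Fig. 1a).
            layout[app] = {"opaque": True, "placement": "eye_level"}
        elif app == ctx.keyword_app:
            # DC3: the conversation-relevant app turns opaque, near but not over the face (Fig. 1c).
            layout[app] = {"opaque": True, "placement": "above_face"}
        else:
            # DC1: real world prioritized -- other apps stay translucent (Fig. 1b).
            layout[app] = {"opaque": False, "placement": "eye_level"}
    return layout

print(adapt(["Email", "Exercise", "Weather"],
            Context(faces_detected=True, speech_detected=True, keyword_app="Weather")))
```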
4 EXPERIMENT
To explore the benefits and limitations of Context-Aware Glanceable
AR, we evaluated our prototype interface in both solo and social con-
texts. We also evaluated a naive Glanceable AR interface (without
context-awareness), and a mobile phone interface was included as a
baseline since it is the most pervasive tool for information access.
4.1 Research Questions
Our study aimed to address three research questions. The first
two questions address the benefits of Glanceable AR over mobile
phones for information access, while the third question addresses
the benefits of context-awareness.
1. Do Glanceable AR interfaces improve the speed and con-
venience of information access in different contexts, compared to
mobile phones? The transition to Glanceable AR interfaces is jus-
tifiable only if these interfaces offer more efficient and appealing
information access compared to the current gold-standard of mobile
phones. Therefore, we designed the study to compare the speed and
convenience of information access using Glanceable AR and mobile
phones, and to explore the effect of user context on these measures.
Is one interface always more efficient and convenient, regardless of
the user’s context, or does the context influence the benefits of one
interface over the other?
2. Compared to mobile phones, does information access using
Glanceable AR improve the user experience and focus on the primary
task in a solo context? The physical interaction requirements and
time-consuming process of information access using mobile phones
can distract the user’s focus from their primary task. We compared
Glanceable AR and mobile phones in terms of user experience and
distraction from the primary task when accessing information.
3. Do the automatic adaptations in Context-Aware Glanceable AR
improve the user experience and reduce social intrusiveness during
information access in a social context? In a dynamic social context,
naive Glanceable AR may result in occlusion, distraction, and loss
of social awareness, and its advantages compared to mobile phones
may be lost. We designed our study to compare user experience
and social awareness with three interfaces (Context-Aware Glance-
able AR, naive Glanceable AR, and mobile phone) for information
access during a social conversation, in order to evaluate whether
Context-Aware AR can balance efficiency and awareness, retaining
the advantages of Glanceable AR while mitigating its weaknesses.
4.2 Experimental Design
In this within-subjects study, each participant completed five different
sessions. The first two sessions were in a solo context, and the next
three sessions were conducted in a social context.
4.2.1 Solo Context
The solo session was designed to simulate situations in which a user
is doing some real-world activity alone in a static environment, and
is motivated internally to access digital information. For example,
the user may be reading a book when she thinks to check the current
weather. We refer to such questions as Self-Triggered Questions
(STQs). In our experiment, the user was asked to read an article.
While reading, at random times they were asked to use the avail-
able interface to answer an STQ (played from a computer). The
participants were asked to say the answer and immediately resume
reading. Each user performed the session twice: once using the
mobile phone and once using Context-Aware Glanceable AR. The
order of interfaces was counterbalanced. With both interfaces, the
user had access to three different applications: Email, Exercise and
Weather. The answers to the STQs were different each time (i.e., the
content in the apps changed throughout the session), but the question
from each application was always the same (“How many VIP emails
do you have?”, “How many zone-minutes of exercise have you had
today?”, and “What is the high temperature on Sunday?”). Each
question was asked twice, with a total of six questions per session.
4.2.2 Social Context
The social context was designed to represent an authentic social
conversation. In each session, the experimenter would start a conver-
sation with the participant about a specific topic (“Hiking places in
the area,” “Local food and restaurants,” or “Weather”). Throughout,
the experimenter asked them questions to keep them engaged in
a two-way conversation. Each participant performed the session
three times: one time with each of three different interfaces: mobile
phone, non-adaptive Glanceable AR and Context-Aware AR. The
order of interfaces was counterbalanced with a Latin Square design.
In this context the participants were asked to perform three tasks:
Task 1: Answer self-triggered questions. During the conversa-
tion at random times the participant answered an STQ, played from
a computer, just as they did in the solo context. The participants
were asked to say only the answer to the question and immediately
resume the conversation without interruption. Each STQ was asked
twice, with a total of six STQs per session.
Task 2: Answer conversation-triggered questions. During
the conversation, the experimenter would ask the participant to
answer specific questions, which we call Conversation-Triggered
Questions (CTQs), using the interface. The answers to these CTQs
were different each time but the question from each application was
always the same: (“How many unread messages do you have on
your email?”, “How many steps have you walked so far today?”, and
“Do you know by what percentage it is going to rain on Saturday?”).
Each CTQ was asked once, for a total of three CTQs per session.
Task 3: Maintain awareness of the conversation partner’s
facial expression. At certain times during the conversation, the
experimenter would wink at the participant. The participants were
asked to raise their hand to indicate that they saw the wink. During
each session, the experimenter winked at the participant seven times
total, three of which were right after a question was asked. This task
allowed us to evaluate how the interface affected social awareness.
4.3 Interfaces
To answer STQs and CTQs during the experiment, participants used
three interfaces for information access.
Mobile Phone Interface: The participant was given a mobile phone
that was placed face down on a table in front of them. To answer a
question, participants had to pick up the phone and turn the screen
on by swiping up on the display (Fig. 3a). Once turned on, the
phone displayed the icons of three apps. Users then tapped on the
appropriate app to display a screen from which they could read the
answer to the question. To simulate today’s most commonly used
information access method for answering an unexpected question
and to keep conditions similar for answering all questions, we asked
participants to go back to the home screen, turn the mobile phone’s
screen off, and place it face down on the table after answering
each question. In our experiment the user only had access to three
applications, and was not required to use any password, fingerprint,
or face recognition to unlock the mobile phone, making it faster than
similar real-world situations.
Glanceable AR Interface: Glanceable AR interfaces are “sec-
ondary, concise and Multi-tasking AR interfaces that are user fixed
and temporary” [6]. The Glanceable AR interface continuously dis-
played the three applications at the eye-level of the user. The apps
had an identical appearance to those displayed on the mobile phone,
but an AR HWD rendered the apps, and all three apps were visible at
once. In the social context, since we were interested in investigating
the worst-case scenario of using Glanceable AR in such a setting,
the scene was set so that the virtual content would block the con-
versation partner’s face. Previous work has explored different input
modalities, for example, gaze, hand, or head-based interactions to
toggle content visibility in the interfaces for a glance [24, 25]. To
allow users to decide whether to view the face or the content, users
could use a hand-based input (i.e., “air-tap”) on individual apps to
toggle them between transparent and opaque modes (Fig. 3b).
Context-Aware Glanceable AR Interface: Our context-aware inter-
face actively detects whether the user is in a social context using face
and speech detection. Specifically, if our interface detects speech
and one or more faces in front of the AR user, this is interpreted as
a social context. In the solo context, when no social conversation
is detected, the Context-Aware AR interface prioritizes the virtual
content (i.e., displays it as opaque, at the user’s eye-level), so it is
identical to the non-adaptive Glanceable AR interface (Fig. 1a).
Thus, there were only two interface conditions in the solo context
sessions.
When the social context is detected, the Context-Aware AR inter-
face automatically switches to real-world prioritized mode (i.e., the
apps become translucent when not needed; Fig. 1b). Based on the
content of the conversation, detected by the speech recognition sys-
tem, the system identifies when the user needs to access information
from one of the apps (i.e., when a CTQ is asked) and automatically
makes the app opaque. Since the STQs simulated questions in the
user’s head, the interface has no knowledge of the user’s need for
that information. Thus, the user must manually click on the intended
Figure 3: Conditions compared to Context-Aware Glanceable AR
in the experiment. (a) Mobile phone interface. (b) Non-adaptive
Glanceable AR interface, in which the user can air-tap an app to turn
it opaque or transparent.
app to answer STQs. To simulate these questions in our experiment,
the wording of our STQs was designed not to be recognized as as-
sociated with any of the available apps. If a piece of virtual content
is opaque and blocks the conversation partner’s face (the position
of the face is given by the face detection algorithm), the interface
automatically moves it above the face (Fig. 1c). Finally, all opaque
apps automatically turn transparent seven seconds after the last time
they were activated.
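In effect, each app in this condition follows a small visibility state machine. The sketch below illustrates how keyword- or air-tap-triggered activation, the no-occluded-faces rule, and the seven-second timeout could interact; it is an illustration with hypothetical names, not the HoloLens implementation itself.

```python
# Illustrative per-app visibility logic for the Context-Aware condition; names are hypothetical.
import time

OPAQUE_TIMEOUT_S = 7.0  # opaque apps fade back to transparent 7 s after their last activation

class GlanceableApp:
    def __init__(self, name):
        self.name = name
        self.opaque = False
        self.last_activated = 0.0

    def activate(self):
        # Triggered by a recognized conversation keyword (CTQ) or a manual air-tap (STQ).
        self.opaque = True
        self.last_activated = time.time()

    def update(self, occludes_partner_face):
        # Timeout rule: return to transparent when the content has not been needed for a while.
        if self.opaque and time.time() - self.last_activated > OPAQUE_TIMEOUT_S:
            self.opaque = False
        # No-occluded-faces rule: relocate only when opaque and covering an engaged partner's face.
        if self.opaque and occludes_partner_face:
            return "move_above_face"
        return "keep_position"

weather = GlanceableApp("Weather")
weather.activate()                                   # e.g., the word "rain" was recognized
print(weather.update(occludes_partner_face=True))    # -> "move_above_face"
```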
4.4 Metrics
For each context, we assessed the interfaces based on Information
Access Efficiency, Intrusiveness to the Primary Task, and Overall
User Experience & Preference.
Information Access Efficiency: We measured information access
time and perceived ease of information access. We used the video-
recording of the sessions to measure information access time (T-
Answer), based on how long it took participants to answer a question
(from the time the question was asked verbally). Perceived ease of
information access was evaluated through participant responses to
the post-session survey question (seven-point rating scale): “When
using the current interface (while reading an article / during the
conversation), how easy it was to access information from the apps?”
Intrusiveness to the Primary Task: We measured time away from
the primary task (T-Away) and ease of resuming the primary task af-
ter information access in both contexts. We used the video-recording
of the sessions to measure the time that the user was away from
the primary task (reading or conversation) and interacting with the
interface (i.e., holding the mobile device, clicking on the interface,
or looking at it). After finishing all sessions for each context, the
participants were interviewed and asked to “rank the interfaces from
the easiest to the most difficult for resuming (reading / conversation)
after information access.”
For the solo scenario, we measured participants’ self-reported
focus on reading, by asking them in the post-context interview if
they felt that they read more of the article using a specific interface.
In the social context, we evaluated social intrusiveness by mea-
suring participants’ awareness of others’ facial expressions and
their perception of the interface’s social intrusiveness. We used
the video-recording of the sessions to measure the number of times
the participant missed the investigator’s wink during the conver-
sation. Users’ perception of the interface’s social intrusiveness was
evaluated through participant responses to a post-session survey
question (seven-point rating scale): “when using the current inter-
face during the conversation, how intrusive was the interface to your
awareness of the social cues happening around you?”
User Experience & Preference: To assess the user experience (UX),
we used the standard User Experience Questionnaire (UEQ) after
each session [21]. After finishing all sessions for each context, the
participants were interviewed and asked to: (1) rank the interfaces
based on their preference for that context, and (2) describe the
advantages and disadvantages of each interface in that context.
Participants’ tendency to adopt AR for their daily information
access was evaluated through their responses to a five-point rating-
scale question asked before and after the study: “If you had access
to lightweight AR eyeglasses, how likely do you find it to use them
instead of your mobile phone for your daily information access?
Please explain why.”
4.5 Hypotheses
We developed and tested the following hypotheses.
Information Access Efficiency:
H1. Since using the mobile phone involves physical interaction,
more steps, and increased workload, the mobile phone will be the
least efficient interface for information access in both contexts.
H2. In the social context, the Context-Aware AR interface will
take significantly less time for answering CTQs than the other two
interfaces, because it automatically makes the required content avail-
able to the user.
Intrusiveness to the Primary Task:
H3. In the solo context, Context-Aware AR will be less intrusive
to the user’s primary task of reading than the mobile phone, since in-
formation acquisition will be faster and less cognitively demanding.
H4. In the social context, Context-Aware AR will be the least
intrusive interface to the conversation, since the virtual content
automatically defaults to transparent when not needed, and the con-
versation partner’s face is never occluded.
H5. In the social context, the non-adaptive Glanceable AR inter-
face will be more socially intrusive than the mobile phone, since
the virtual content in Glanceable AR can occlude the conversation
partner’s face.
User Experience & Preference:
H6. In both contexts, Context-Aware AR will be the best interface
in terms of user experience and overall ranking.
H7. In the social context, mobile phone will be the worst interface
in terms of user experience and overall ranking.
H8. Users’ reported tendency to adopt AR as their primary in-
formation access tool will increase from the beginning of the study
to the end of the study, due to their exposure to Context-Aware
Glanceable AR.
4.6 Participants
We gathered data from 36 participants (14 female), between 18 and
45 years of age (M = 26.53, SD = 6.06), recruited from our local
university community. Nearly half of the participants (17) had little
to no experience with AR/VR, and only two had used AR/VR more
than 10 times.
4.7 Apparatus
For the mobile phone interface, we used a Samsung Galaxy S7
device. We designed a web app for the mobile phone interface in
Python 3 using the Flask package. The web app was running on a
Virtual Machine (VM) Server at our university (12 Core, ARCH:
x86 64, RAM 32 GB, OS: Ubuntu 20.04).
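For illustration, a phone-side web app of this kind can be served with a few Flask routes, roughly as sketched below; the routes and placeholder content shown are hypothetical and not the study's actual pages.

```python
# Hypothetical Flask sketch of a three-app information page; not the study's actual web app.
from flask import Flask

app = Flask(__name__)

# Placeholder content; in the study, the answers in each app changed between questions.
CONTENT = {
    "email":    "VIP emails: 3, unread messages: 7",
    "exercise": "Zone minutes today: 24, steps: 5412",
    "weather":  "Sunday high: 61 F, chance of rain Saturday: 40%",
}

@app.route("/")
def home():
    # Home screen: one link (icon) per app.
    return "".join(f'<p><a href="/{name}">{name.title()}</a></p>' for name in CONTENT)

@app.route("/<name>")
def show(name):
    return f"<h1>{name.title()}</h1><p>{CONTENT.get(name, 'Unknown app')}</p>"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```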
For the AR interfaces, the experiment used a Microsoft HoloLens
(1st gen) AR HWD. This device is wireless and has a resolution of
1268 x 720 per eye and 30 deg x 17 deg FoV. The software was
developed via Unity 2018.4.22f1 using the MRTK toolkit provided
by Microsoft.
For the Context-Aware AR interface we developed a face detec-
tion system using Python 3. We used the OpenCV Deep Neural
Network (DNN) Module, with caffemodel as our pre-trained neural
network. We ran the face-detection on our VM Server. Using Python
3 sockets, we made a low-level network connection to the HoloLens
device to receive the image from the HoloLens camera and send
back the bounding-box around each detected face. On the HoloLens,
we used UnityEngine.XR.WSA.WebCam to capture a photo from
the HoloLens camera. The UnityEngine.Networking module was
used to send the image to the server and receive the face detection
results. By transforming the bounding box of the detected face from
the camera space to the screen space, we detected whether any of
the virtual apps was blocking a detected face. Using MRTK speech
recognition and Dictation Recognizer, we detected whether some-
one was speaking, and if so, whether the content of the conversation
contained keywords (e.g., “rain” or “exercise”) related to one of the
available apps. This fully implemented context-aware interface was
used during the user study, without any manual intervention by the
experimenter.
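A simplified sketch of this server-side pipeline is shown below. It is illustrative rather than our exact implementation: the model file names and the screen-space occlusion test are stand-ins, although the detection step follows the standard OpenCV DNN face-detection recipe.

```python
# Sketch of a server-side face detector with OpenCV's DNN module; file names and the
# occlusion test are illustrative stand-ins, not the exact code used in the study.
import cv2
import numpy as np

# Pre-trained SSD face detector distributed with OpenCV samples (paths are placeholders).
net = cv2.dnn.readNetFromCaffe("deploy.prototxt",
                               "res10_300x300_ssd_iter_140000.caffemodel")

def detect_faces(frame, conf_threshold=0.5):
    """Return face bounding boxes (x1, y1, x2, y2) in pixel coordinates."""
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
                                 (300, 300), (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()  # shape (1, 1, N, 7): [_, _, confidence, x1, y1, x2, y2]
    boxes = []
    for i in range(detections.shape[2]):
        if detections[0, 0, i, 2] > conf_threshold:
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            boxes.append(tuple(box.astype(int)))
    return boxes

def app_occludes_face(app_rect, face_rect):
    # Both rectangles are (x1, y1, x2, y2), expressed in the same (screen) space
    # after the face box has been transformed from camera space.
    ax1, ay1, ax2, ay2 = app_rect
    fx1, fy1, fx2, fy2 = face_rect
    return ax1 < fx2 and fx1 < ax2 and ay1 < fy2 and fy1 < ay2
```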
All the sessions were also video-recorded by a Logitech C615
HD Webcam with 1080p resolution for comprehensive observation
of user behaviors.
4.8 Procedure
The experiment was divided into two parts. In the first part, partici-
pants completed two sessions, one with each interface, in the solo
context. In the second part, participants completed three sessions,
one with each interface, in the social context.
This study was approved by the Institutional Review Board of
our university. Upon arrival we welcomed participants and asked
them to read the consent form, if they had not already, and sign it.
We then asked them to fill out a pre-study questionnaire to collect
demographic information and prior experience with AR, and to
answer one question regarding their tendency to adopt AR for
daily information access. Next, the participants were introduced to
the experiment background, hardware, and the sessions and contexts
involved in the study. When participants had no further questions, we
introduced the first part of the study, the solo context, and the tasks
involved in this part to the participant. Afterwards, the participant
experienced each interface (Context-Aware AR and mobile phone)
in separate sessions. After the first part, the second part of the study,
the social context, and the tasks involved in this part were introduced
to the participant. Afterwards, the participant experienced each
interface (Context-Aware AR, non-adaptive Glanceable AR, and
mobile phone) in separate sessions.
Before each of the five sessions, the participant completed training
to get familiar with the tasks and interface. After each session,
participants were asked to fill out a custom-designed questionnaire
and the UEQ on a computer. In our post-context interview, after
each part of the study, the participant was asked about the interfaces
they used in that context, their preferences, and the advantages and
disadvantages of each interface.
After finishing both parts (all five sessions), participants were
interviewed about their overall experience in both contexts, the
benefits and disadvantages of each interface in each context, and
their tendency to adopt AR for daily information access. The entire
experiment took about 75 minutes. Participants were allowed to take
a break anytime in between sessions.
5 RESULTS
We conducted a series of analyses to test our hypotheses and explore
the trade-offs between the interfaces for each context individually.
For all the analyses, therefore, we separated the data based on the
context. Shapiro-Wilk tests found that our data were not normally
distributed. As such, we ran non-parametric Friedman tests to test
for significant effects of the independent variables. Wilcoxon signed-
rank tests were conducted for pairwise comparisons, with Bonferroni
corrections applied. We also used Wilcoxon signed-rank tests for
the qualitative data collected from the questionnaire and interview.
We used an α level of 0.05 in all significance tests.
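This analysis pipeline maps directly onto standard SciPy routines. The sketch below shows the call pattern on placeholder data (randomly generated; these are not our study measurements).

```python
# Sketch of the nonparametric analysis pipeline on placeholder data (randomly generated;
# not the study's measurements).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# One column of T-Answer values per interface: phone, Glanceable AR, Context-Aware AR.
t_answer = rng.gamma(shape=2.0, scale=2.0, size=(36, 3))

# Friedman test: does interface have a significant overall effect?
chi2, p = stats.friedmanchisquare(t_answer[:, 0], t_answer[:, 1], t_answer[:, 2])
print(f"Friedman: chi2(2) = {chi2:.2f}, p = {p:.4f}")

# Pairwise Wilcoxon signed-rank tests with a Bonferroni-corrected alpha (3 comparisons).
pairs = [(0, 1), (0, 2), (1, 2)]
alpha = 0.05 / len(pairs)
for i, j in pairs:
    stat, p = stats.wilcoxon(t_answer[:, i], t_answer[:, j])
    verdict = "significant" if p < alpha else "n.s."
    print(f"interfaces {i} vs {j}: W = {stat:.1f}, p = {p:.4f} ({verdict})")
```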
Figure 4: Time to answer questions (left) and time away from the
reading task (right) in the solo context.
Figure 5: Time to answer questions (left) and time away from the
conversation (right) for both question types in the social context.
5.1 Information Access Efficiency
We evaluated the effect of interface on information access time
(T-Answer) in both contexts. In the solo context, a Wilcoxon signed-rank
test (Z = 17.62, p < 0.001) revealed that Context-Aware AR (M = 2.83)
was significantly faster than the mobile phone (M = 7.31) (Fig. 4, left).
In the social context, a Friedman test on T-Answer for CTQs
(χ2(2) = 162.66, p < 0.001) and STQs (χ2(2) = 226.8, p < 0.001)
revealed that the type of interface had a significant effect on
information access time for both question types. Pairwise Wilcoxon
signed-rank tests showed that answering both question types took
significantly longer using the mobile phone. For CTQs, the mobile
phone (M = 7.44) took significantly longer than both Glanceable AR
(M = 3.19) (Z = 12.16, p < 0.001) and Context-Aware AR (M = 2.78)
(Z = 12.09, p < 0.001). For STQs, the mobile phone (M = 8.46) took
significantly longer than both Glanceable AR (M = 3.83)
(Z = 15.34, p < 0.001) and Context-Aware AR (M = 4.78)
(Z = 13.07, p < 0.001). These pairwise tests also revealed that,
compared to Glanceable AR, Context-Aware AR significantly reduced
T-Answer for CTQs (Z = 2.98, p = 0.009), but it significantly
increased T-Answer for STQs (Z = 5.33, p < 0.001) (Fig. 5, left).
We also evaluated the effect of interface on perceived ease of
information access. In the solo context, a Wilcoxon signed-rank
test revealed that Context-Aware AR (M = 6.11) was perceived as
significantly easier compared to the mobile phone (M = 4.69)
(Z = 3.85, p < 0.001).
In the social context, a Friedman test showed a significant effect
of interface (χ2(2) = 27.469, p < 0.001) on perceived ease of
information access. Pairwise Wilcoxon signed-rank tests revealed that
the mobile phone (M = 3.83) was perceived as significantly more
difficult than both Glanceable AR (M = 5.42) (Z = 4.14, p < 0.001)
and Context-Aware AR (M = 5.64) (Z = 4.78, p < 0.001).
Figure 6: Missed social cues overall (left) and during information
access (right) in the social context.
5.2 Intrusiveness to the Primary Task
We evaluated the effect of interface on the user’s time away from
their primary task when accessing information. In the solo context, a
Wilcoxon signed-rank test on T-Away revealed that Context-Aware AR
(M = 3.64) significantly reduced time away from reading, compared
to the mobile phone (M = 11.35) (Z = 17.98, p < 0.001) (Fig. 4, right).
In the social context, a Friedman test for CTQs (χ2(2) = 181.45,
p < 0.001) and STQs (χ2(2) = 268.6, p < 0.001) revealed a significant
effect of interface on the user’s time away from the conversation. For
each question type, we ran Wilcoxon signed-rank tests on each interface
pair. The test results showed that time away from the conversation
was significantly longer with the mobile phone. For CTQs, T-Away for
the mobile phone (M = 11.84) was significantly higher than for both
Glanceable AR (M = 4.59) (Z = 11.92, p < 0.001) and Context-Aware AR
(M = 2.8) (Z = 12.39, p < 0.001). For STQs, T-Away when using the
mobile phone (M = 11.97) was significantly higher than for both
Glanceable AR (M = 4.83) (Z = 16.02, p < 0.001) and Context-Aware AR
(M = 5.02) (Z = 16.01, p < 0.001). We also found that, compared to
Glanceable AR, Context-Aware AR significantly reduced T-Away for
CTQs (Z = 6.99, p < 0.001) (Fig. 5, right).
We asked participants to rank the interfaces based on ease of
resuming the primary task after information access. In both contexts,
most participants (97% in the solo context and 67% in the social
context) considered Context-Aware AR as the easiest to resume the
primary task. In the social context, 72% ranked mobile phone as
the most difficult.
In the solo context, the participants were given the same amount of
time to read the article in both sessions. However, their self-reported
focus on reading indicates that 75% read more and understood the
article better when they were using Context-Aware AR.
We also evaluated the social intrusiveness of the interfaces. A
Friedman test on the total number of missed winks during the
conversation (χ2(2) = 36.24, p < .001) and the number of missed winks
during information access (χ2(2) = 25.35, p < .001) revealed
significant effects of interface on awareness of others’ facial
expressions (Fig. 6). Wilcoxon signed-rank tests for each interface
pair showed that users missed a significantly smaller number of winks
overall, and even during information access, when using Context-Aware
AR. Using Context-Aware AR (M = 1.42) significantly reduced the total
number of missed winks compared to both Glanceable AR (M = 2.89)
(Z = 4.36, p < 0.001) and mobile phone (M = 2.97) (Z = 4.66, p < 0.001).
During information access, Context-Aware AR (M = 1.28) significantly
reduced the number of missed winks compared to both Glanceable AR
(M = 2.28) (Z = 4.4, p < 0.001) and mobile phone (M = 2.28)
(Z = 4.3, p < 0.001).
A Friedman test showed a significant effect of interface on
participants’ self-reported perception of the interface’s social
intrusiveness (χ2(2) = 19.451, p < 0.001). Wilcoxon signed-rank tests
for each pair showed that the mobile phone (M = 5.44) was perceived as
significantly more intrusive than both Glanceable AR (M = 3.64)
(Z = 3.64, p < 0.001) and Context-Aware AR (M = 4.14)
(Z = 3.88, p < 0.001).
Figure 7: UEQ results in the solo context.
Figure 8: UEQ results in the social context.
5.3 User Experience & Preference
Context-Aware AR was ranked first for preference by the majority of
participants (89% in the solo context and 64% in the social context).
In the social context, 64% of our participants ranked mobile phone
as their least preferred.
We performed Wilcoxon signed-rank tests for each interface pair on
our UEQ results from both contexts to compare UX aspects. In the solo
context, Context-Aware AR was significantly higher in attractiveness,
efficiency, stimulation, and novelty (Fig. 7). For the social context,
compared to mobile phone, both Context-Aware AR and Glanceable AR had
significantly higher attractiveness, stimulation, and novelty and
lower dependability and perspicuity (Fig. 8).
Finally, a Wilcoxon signed-rank test (Z = 2.44, p < 0.01) on the
participants’ tendency to adopt AR, comparing their responses before
the study with those afterwards, showed that participating in the
study significantly increased their tendency to adopt AR for their
daily information access (M = 3.28 before the study, M = 3.89 after).
6 DISCUSSION
6.1 Information Access Efficiency
In H1 we hypothesized that the mobile phone would be the least
efficient interface for information access in both contexts. Our
results support this hypothesis by showing that, compared to the
other two interfaces, the mobile phone increased information access
time, regardless of the context or question type, and was perceived
as more difficult for information access in both contexts. The study
also showed that Context-Aware AR took less time for answering
CTQs than the other two interfaces, supporting H2.
Together, this constitutes strong evidence that Glanceable AR can
provide rapid information access with a quick glance, and that the
expansive and persistent display provided by Glanceable AR has im-
portant benefits over mobile phones. Moreover, in the social context,
we showed that automatic activation of apps based on conversational
content can be a powerful way to support information access in a
social situation, without requiring any manual interaction.
6.2 Intrusiveness to the Primary Task
When using the Context-Aware AR interface in the solo context, our
participants spent less time away from reading, felt it was easier to
resume reading after information access, and reported higher focus
on reading compared to the mobile phone. This supports H3, that
Context-Aware AR is less intrusive to the user’s primary task.
Meanwhile, H4 was partially supported by our data. When us-
ing Context-Aware AR, participants took the least time away for
information access, but only when answering CTQs. They also per-
ceived the interface as the easiest for resuming the conversation after
information access and less socially intrusive compared to mobile
phones. Finally, they had the most awareness of the interlocutor’s
facial expressions in the Context-Aware condition.
We also noticed that participants occasionally missed the STQ,
either completely ignoring the question or forgetting it. Only 20% of
these missed questions occurred with Context-Aware AR, compared
to 54% with the mobile phone, which indicates a greater ability to
effectively multi-task with Context-Aware AR.
Together, the results from H1-H4 demonstrate that our Context-
Aware Glanceable AR prototype achieved its goals of balancing
rapid information access with minimal intrusion on the primary task.
In the solo context and in the case of CTQs in the social context, the
Context-Aware condition automatically both ensures the visibility of
needed content at a glance, and avoids occlusion of the primary task.
In the worst case (STQs during a conversation), the user still has
to manually interact to activate an app, but this is still significantly
faster than the mobile phone and results in less time away from
the conversation. We speculate that the non-adaptive Glanceable
AR had less time away than the Context-Aware condition for STQs
because participants waited for the app to automatically become
opaque before deciding to click on it themselves.
H5 hypothesized higher social intrusiveness of the non-adaptive
Glanceable AR compared to the mobile phone, but our results do not
support this hypothesis. We found no significant difference in partic-
ipants’ awareness of the other person’s facial expressions, or in their
self-reported perception of the interface’s social intrusiveness. We
can explain this finding by observing that all participants manually
made the apps translucent during the conversation, only activating
them during information access. When using the mobile phone, time
away from the conversation was longer due to the multiple steps
of interacting with the device, and participants found it difficult to
focus on the conversation or detect the social cues during this time
of interaction with the device (T-Away).
6.3 User Experience & Preference
Our results partially support H6 and H7. In both contexts, Context-
Aware AR was ranked as the most preferred, and mobile phone as the
least preferred interface. Context-Aware was also rated significantly
higher than mobile phone on the UEQ factors of attractiveness,
stimulation and novelty in both contexts. On the other hand, mobile
phone had significantly higher ratings for perspicuity (i.e., ease of
understanding) and dependability in the social context, which is
reasonable since users have high familiarity and experience with
mobile phone interfaces. In addition, we did not find any significant
advantages of Context-Aware over non-adaptive Glanceable AR
on any of the UEQ factors. We suggest that these results may be
explained by general uneasiness about automatic adaptations in the
Context-Aware condition. Many users expressed concerns of this
sort, such as: “What if I don’t need something and it keeps opening
the content?” and “What if I need other applications?”
Our results on users’ reported tendency to adopt AR as their
primary information access tool indicated an increase from the be-
ginning of the study to the end of the study, supporting H8. This
suggests that experience with our Context-Aware Glanceable AR
prototype helped participants envision the potential benefits of some-
day replacing their smartphones with intelligent AR glasses.
7 LIMITATIONS AND FUTURE WORK
This study aimed to compare the most commonly used interface for
daily information access against a glanceable AR interface acces-
sible through all-day wearable eyeglasses. We assume that future
lightweight AR glasses will stay unlocked as long as they are worn
after being “logged in” once the user puts them on. We also as-
sume such devices will become commonplace and socially accepted.
However, these assumptions might not prove to be fully accurate,
potentially limiting the validity of our results. Our prototype only
supported two contexts. However, a Context-Aware Glanceable
AR interface must differentiate and adapt to various daily contexts.
This raises research questions regarding the definition of AR user
contexts and proper adaptations to them.
Design Limitations: We limited our prototype to three Glanceable
apps and used air-tap interactions for both contexts. Our interface
focused on occlusion and intrusiveness challenges in AR and im-
proving information access (DC1, DC2, DC3), adapting the content
prioritization, virtual content’s translucency and placement. Fu-
ture studies should identify and address other challenges of AR
and investigate Context-Aware interfaces that intelligently adapt the
interaction technique and the group of available apps.
User Study Limitations: Our participants’ familiarity and expe-
rience with the three interfaces was not equal, which could have
biased against the AR conditions, and the Context-Aware AR con-
dition in particular. This actually makes our positive results about
AR even stronger, but it also helps to explain why Context-Aware
AR did not completely live up to our expectations. Furthermore,
the Context-Aware AR condition had a significant computational
load, even though the computer vision and speech recognition were
done on a remote server. This made performance of the Context-
Aware condition less than ideal, especially when users needed to
manually air tap on apps. This also helps explain some of the missed
expectations. Future studies should also examine the effects of
Context-Aware AR use on both sides of a social situation and on
larger social situations with more than two participants, since we did
not evaluate how Context-Aware AR or the other conditions affect
the perceptions or social acceptance of the interlocutor.
8 CONCLUSIONS
In this work we designed a Context-Aware AR interface to intel-
ligently adapt to changes in the user’s context. The interface is
designed for a static solo context and a dynamic social context. In
the solo context the interface prioritizes the virtual content and places
it at the user’s eye-level, while in the social context, the interface
prioritizes the real world, and based on the conversation’s content,
automatically makes the related virtual content available in a non-
occluding manner. Our results largely validated our hypothesized
benefits of the Context-Aware Glanceable AR approach, showing
its advantages for information access efficiency, avoiding negative
effects on primary tasks or social interactions, and overall user ex-
perience. We conclude that for Glanceable AR apps to be a viable
replacement for smartphones to support everyday information access
needs in a variety of real-world scenarios, an intelligent, adaptive,
context-aware approach will be critical.
ACKNOWLEDGMENTS
The authors wish to thank the Virginia Tech community members
who participated in this study and the members of the 3DI Group at
Virginia Tech for their continuous support.