Re-examining Whether, Why, and How Human-AI Interaction Is Uniquely Difficult to Design
Qian Yang
HCI Institute
Carnegie Mellon University
Aaron Steinfeld
Robotics Institute
Carnegie Mellon University
Carolyn Rosé
HCI Institute
Carnegie Mellon University
John Zimmerman
HCI Institute
Carnegie Mellon University
Articial Intelligence (AI) plays an increasingly important role in
improving HCI and user experience. Yet many challenges persist in
designing and innovating valuable human-AI interactions. For ex-
ample, AI systems can make unpredictable errors, and these errors
damage UX and even lead to undesired societal impact. However,
HCI routinely grapples with complex technologies and mitigates
their unintended consequences. What makes AI dierent? What
makes human-AI interaction appear particularly dicult to de-
sign? This paper investigates these questions. We synthesize prior
research, our own design and research experience, and our observa-
tions when teaching human-AI interaction. We identify two sources
of AI’s distinctive design challenges: 1) uncertainty surrounding
AI’s capabilities, 2) AI’s output complexity, spanning from simple
to adaptive complex. We identify four levels of AI systems. On
each level, designers encounter a dierent subset of the design
challenges. We demonstrate how these ndings reveal new insights
for designers, researchers, and design tool makers in productively
addressing the challenges of human-AI interaction going forward.
User experience, articial intelligence, sketching, prototyping.
ACM Reference Format:
Qian Yang, Aaron Steinfeld, Carolyn Rosé, and John Zimmerman. 2020. Re-examining Whether, Why, and How Human-AI Interaction Is Uniquely Difficult to Design. In CHI Conference on Human Factors in Computing Systems (CHI '20), April 25–30, 2020, Honolulu, HI, USA. ACM, New York, NY, USA, 12 pages.
Advances in Articial Intelligence (AI) have produced exciting op-
portunities for human-computer interaction (HCI). From mundane
spam lters to autonomous driving, AI holds many promises for
improved user experiences (UX), and it enables otherwise impos-
sible forms of interaction. This trend has led to the idea of AI as a
design material in the research community, with the hope that HCI
researchers and designers can eectively envision and rene new
uses for AI that have yet to be imagined [13, 26, 58].
The growing interest in how AI can improve UX implies that HCI designers have become skilled at integrating AI's capabilities into their practices. Interestingly, the research shows something else: HCI designers largely struggle to envision and prototype AI systems [ ]. For example, even simple AI applications can make inference errors that are difficult to anticipate. These errors impact the intended user experience, and can sometimes raise serious ethical concerns or result in societal-level consequences. However, current HCI design methods meant to mitigate unintended consequences (i.e., sketching and prototyping) can seem ill-fitted for AI. HCI professionals cannot easily sketch the numerous ways an AI system might adapt to different users in different contexts [ ]. Nor can they easily prototype the types of inference errors a not-yet-developed AI system might make [29, 42, 49].
Existing research frequently attributes these challenges to AI's technical complexity, demand for data, and unpredictable interactions [ ]. Less discussed is that HCI routinely grapples with complex, resource-intensive technologies using simple, inexpensive artifacts, e.g., paper prototypes and Wizard of Oz systems. What makes AI different and distinctly difficult to prototype? Equally important, designers routinely choreograph complex, dynamic, sometimes unpredictable interactions, with a focus on mitigating technologies' unintended consequences (e.g., [ ]). What makes AI interactions particularly difficult to sketch? A critical first step in designing valuable human-AI interactions is to articulate the unique qualities of AI that make it so difficult to design.
The goal of this paper is to delineate whether, why, and how human-AI interaction is distinctly difficult to design and innovate.
The paper has four parts:
• We set the stage by cataloging the many human-AI interaction design challenges that the literature has reported, as well as the solutions proposed.
• We ask three provocative questions as a critique of current related work. These questions serve as a springboard for rethinking how to approach the question of why human-AI interaction appears so difficult to design.
• We synthesize our own research, including studies on the challenges HCI practitioners faced when working with AI, our insights from making AI things via research through design, and our insights from teaching students human-AI interaction design and innovation. This synthesis identified two sources of AI's design complexity, and a framework that unravels their effects on design processes.
• We demonstrate the usefulness of the framework; specifically, how it is useful to those who design human-AI interactions, to those who research particular HCI issues of AI, and to those who innovate AI design methods, tools, and processes.
Figure 1: Mapping the human-AI interaction design challenges in the literature [13, 26, 53, 58] onto a user-centered design process (Double Diamond [10])
This paper makes three contributions. First, it provides a synthesis of many human-AI interaction design challenges and emergent solutions in the literature. Second, the provocation questions offer an alternative lens for approaching the human-AI interaction design challenge: we draw attention to AI's design complexity rather than its technical complexity, and to how AI hinders the interaction design process rather than the end product. Finally, our framework gives structure to the currently fuzzy problem space of human-AI interaction design. This provides a first step towards systematically understanding how HCI research might best choose to approach these problems going forward.
Recent research has become increasingly interested in the opportunities and challenges AI brings to HCI and UX design. As researchers produced a wealth of valuable, novel designs, they also reported encountering many challenges in the process [ ]. Some research has investigated challenges faced by UX practitioners who do not specialize in AI but who desire to integrate it into their practice [ ]. Research has chosen a number of different frames for investigating these challenges, including human-AI interaction design, AI/machine learning as a design material, the design of intelligent systems, designing for/with data [ ], and many more [33, 42, 43].
To better unpack what is known about the challenges HCI researchers and practitioners face when working with AI, we cataloged these challenges and their emergent solutions. To gain a new perspective on this work, we mapped the challenges and solutions to the familiar double diamond design process used to describe user-centered design (Figure 1) and to a diagram displaying a lean startup process with its focus on producing a minimum viable product (MVP) (Figure 2), a design approach becoming more popular with the growing use of agile software development.
2.1 UX Design Challenges of AI
Across HCI and UX communities, researchers and practitioners have reported challenges in working with AI at almost every step of a user-centered design process, in both divergent and convergent stages. From left to right in Figure 1, they reported:
Challenges in understanding AI capabilities (first divergent thinking stage): Designers frequently report that it is difficult to grasp what AI can or cannot do. This hampers designers' brainstorming and sketching processes from the start [13, 26, 51, 56].
Challenges in envisioning many novel, implementable AI things for a given UX problem (in both divergent thinking stages): AI-powered interactions can adapt to different users and use contexts, and they can evolve over time. Even when designers understand how AI works, they often found it difficult to ideate many possible new interactions and novel experiences with much fluidity [13, 58].
Challenges in iterative prototyping and testing human-AI interaction (in both convergent thinking stages): One core practice of HCI design and innovation is rapid prototyping: assessing the human consequences of a design and iteratively improving on it. HCI practitioners cannot meaningfully do this when working with AI. As a result, AI's UX and societal consequences can seem impossible to fully anticipate. Its breakdowns can be especially harmful for under-served user populations, including people with disabilities [45].
HCI researchers have tried two approaches to addressing this challenge. One approach is to create Wizard of Oz systems or rule-based simulators as an early-stage interactive AI prototype (e.g., as in [ ]). This approach enables HCI professionals to rapidly explore many design possibilities and probe user behaviors. However, this approach fails to address any of the UX issues that will come from AI inference errors, because there is no way to simulate these errors [ ]. The second approach is to create a functioning AI system, and deploy it among real users for a period of time [ ]. This time-consuming, field-trial prototyping process enables designers to fully understand AI's intended and unintended consequences. However, it loses the value that comes from rapid and iterative prototyping. This approach does not protect teams from over-investing in ideas that will not work. It does not allow them to fail early and often.
Challenges in crafting thoughtful interactions (in the last convergent thinking stage): Designers struggled to set user expectations appropriately for AI's sometimes unpredictable outputs [ ]. They also worried about the envisioned designs' ethics, fairness, and other societal consequences [13, 26].
Challenges in collaborating with AI engineers (throughout the design process): For many UX design teams, AI technical experts can be a scarce resource [ ]. Some designers also found it challenging to effectively collaborate with AI engineers, because they lacked a shared workflow, boundary objects, or a common language for scaffolding the collaboration [19, 28, 52].
Propelled by these challenges, a few researchers speculated that, when working with AI, designers should start with an elaborate matching process that pairs existing datasets or AI systems with the users and situations that are most likely to benefit from the pairing [ ]. This approach deviates from more traditional user-centered design in that the target user or UX problem is less fixed. It is more similar to customer discovery in an agile development process that focuses on the creation and continual evaluation of a minimum viable product (MVP) [ ]. In this light, we decided to also map the human-AI interaction design challenges onto an MVP innovation process. However, it seems a similar set of design challenges that curbed user-centered design also thwarted technology-driven design innovations (Figure 2, from left to right), for example:
• Challenges in understanding AI capabilities;
• Challenges in mapping out the right user stories and use cases of a "minimum viable" AI system, or envisioning how it can be applied in less obvious ways [13];
• Challenges in collaborating with AI engineers.
We found no agreed-upon set of root causes or themes around which we can easily summarize these challenges. Some researchers suggested that AI systems' technical complexity causes the interaction design problems [ ]. Some considered the unpredictable system behaviors as the cause [ ]. Some argued that AI appeared to be difficult to design because AI is just "a new and difficult design material," suggesting that over time, known HCI methods will likely address these challenges [ ]. Others argued that user-centered design needs to change in order to work for AI [ ]. These proposals rarely share key citations or cross-citations that indicate emerging agreements.
2.2 Facilitating Human-AI Interaction Design
HCI researchers have started to investigate how to make it easier to design and innovate human-AI interactions. We identify five broad themes in this body of work:
Improving designers' technical literacy. An emerging consensus holds that HCI and UX designers need some technical understanding of AI to productively work with it. Designer-facing AI education materials have become available to help (e.g., [ ]). However, substantial disagreement remains over what kinds of AI knowledge are relevant to UX design, and how advanced a technical understanding is good enough for designers [9, 48, 54].
Facilitating design-oriented data exploration. This body of work encourages designers to investigate the lived-life of data and discover AI design opportunities [ ]. For example, [ ] investigated users' music app metadata as a material for designing contemplative music experiences; [ ] explored the design opportunities around intimate somatic data. Notably, this body of work often used terms like data-driven or smart systems; it was not always clear when the authors specifically aimed at AI.
Enabling designers to more easily "play with" AI in support of design ideation, so as to gain a felt sense of what AI can do. This work created interactive machine learning (iML) tools and rule-based simulators as AI prototyping tools, for example, Wekinator for gesture-based interactions [ ] and the Delft AI Toolkit for tangible interactions [ ]. Currently, almost all iML tools are application-domain-specific. In order to make the systems accessible to designers and maximally automate data preprocessing and model training, these systems had to limit the range of possible in/outputs, and therefore focused on particular application domains [38, 39].
Aiding designers in evaluating AI outputs. In recent years, technology companies have proposed more than a dozen human-AI interaction principles and guidelines (see a review in [ ]). These guidelines covered a comprehensive list of design considerations, such as "make clear how well the system can do what it can do" [4] and "design graceful failure recovery" [21].
Creating AI-specic design processes. Some researchers have pro-
posed that AI may require design processes less focused on one
group of users, and instead on many user groups and stakehold-
ers [
]; processes focused less on fast, iterative prototyping,
and instead on existing datasets and functioning AI systems
]; or processes focused less on one design as the nal de-
liverable to engineers, and instead on closer, more frequent
collaborations [19].
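The iML theme above centers on a tight add-example/observe loop. As an illustrative sketch (not from the paper, and far simpler than tools like Wekinator), the core of such a loop can be a nearest-neighbor model that updates the instant a designer adds a labeled example; the feature vectors and gesture labels below are hypothetical.

```python
import math

class NearestNeighborModel:
    """A 1-nearest-neighbor classifier that 'retrains' instantly:
    adding an example IS the training step."""

    def __init__(self):
        self.examples = []  # list of (feature_vector, label) pairs

    def add_example(self, features, label):
        self.examples.append((list(features), label))

    def predict(self, features):
        if not self.examples:
            return None  # nothing demonstrated yet
        # Return the label of the closest demonstrated example.
        closest = min(self.examples,
                      key=lambda ex: math.dist(ex[0], features))
        return closest[1]

# A designer demonstrates two gestures, then probes the model.
model = NearestNeighborModel()
model.add_example([0.1, 0.1], "swipe_left")
model.add_example([0.9, 0.8], "swipe_right")
print(model.predict([0.2, 0.15]))  # → swipe_left
```

The instant feedback is what gives designers a felt sense of an AI capability; the sketch also hints at why such tools end up domain-specific, since the feature extraction upstream of `predict` must be fixed in advance.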
These themes demonstrated the remarkable heterogeneity of approaches researchers have taken to address the challenges around human-AI interaction design. Similar to most design methods published within HCI research, we found no empirical evaluations of the proposed design tools, guidelines, or workflows. It is difficult to control for and measure improvements in a design process to show that a method is producing better designs. Throwing AI into the mix only seems to increase this challenge.
We wanted to articulate whether, why, and how AI is distinctly difficult to design. The preceding review of related work revealed a remarkable set of insights and approaches to this complex problem space. Now we step back and critically examine this rich body of research in order to more holistically understand AI's resistance to design innovation. What has research missed? Can we see gaps or emerging opportunities across this work? Our reflection on the related work led to three provocative questions. These questions served as a springboard for rethinking how we might further advance our understanding of AI's design challenges.
3.1 What is AI?
One critical question has been absent from the research discourse around human-AI interaction: What is AI? Or rather, what should count as AI as it relates to HCI and UX design? Prior literature has used a range of poorly-defined terms, such as machine learning systems, intelligent/smart systems, AI-infused systems, and more. The research discourse on understanding machine intelligence as a technological material is sometimes mixed with intelligence as an experiential factor.
Locating the elusive concept of AI is difficult. What is commonly referred to as AI encompasses many disconnected technologies (e.g., decision trees, Bayesian classifiers, computer vision, etc.). The technical boundary of AI, even in AI research communities, is disputed and continuously evolving [ ]. More importantly, an actionable, designerly understanding of AI is likely to be very different from a technical definition that guides algorithmic advances.
Yet discussing AI's design challenges without any bounding seems problematic. What makes a challenge distinctly AI and not a part of the many challenges designers regularly face in HCI and UX work? Current research does not always make this distinction. For example, Amershi et al. systematically evaluated 20 popular AI products and proposed a set of guidelines for designing human-AI interactions [ ]. These guidelines include "make clear what the system can do" and "support efficient error correction". These seem important to AI systems, but they also seem to be issues that designers must address in systems with no AI. What is less clear is if AI requires specific considerations in these areas.
3.2 What Are AI's Capabilities and Limits?
Designers need to understand the capabilities and limitations of a technology in order to know the possibilities it offers for design [ ]. Engineers create new technological capabilities; designers create new, valuable products and services with existing technological capabilities and limitations [34].
Interestingly, AI's capabilities and limitations have not been the focus of current research. Instead, most work has focused on getting designers to understand how AI functions (2.1, theme 1). This is quite different from the traditional ways of moving new technology from research to design practice, which assume designers do not need to understand the technical specifics of the technology. In addition, research has produced many rule-based and Wizard of Oz simulators to help designers better understand AI's design opportunities (themes 2 and 3). Little is known about whether these systems can realistically sensitize designers to AI's limitations. This motivates the question: Can an articulation of AI's capabilities foster a more incisive examination of its design challenge?
3.3 Why Is Prototyping AI Difficult?
AI brings challenges to almost all stages of a typical design process. However, the proposed AI design methods and tools have mostly focused on the two ends of this creative process (Figures 1 and 2): either helping designers to understand what AI is and can do generally, or enhancing the evaluation of the final design. The central activities of an interaction design process, i.e., sketching and prototyping, are under-explored. Research through Design (RtD) projects are rare when it comes to designing and innovating human-AI interaction [51].
Sketching and prototyping may constitute a fruitful lens for understanding AI's design challenges. They are cornerstones of any design process. It is through sketching and prototyping that designers understand what a technology is and can do, engage in creative thinking, and assess and improve on their designs. Interrogating why it is difficult to abstract AI-powered interactions into sketches and prototypes may shed light on how the other tangled design challenges relate to each other.
We set out to investigate whether, why, and how human-AI interaction is uniquely difficult to design and innovate. We want to construct a framework that provides meaningful structure to the many tangled challenges identified in prior research. The preceding provocation questions informed how we advanced towards this goal: we first worked to identify an operational bounding of AI. Within this bounding, we curated a set of human-AI interaction sketching and prototyping processes as case studies, including our firsthand experiences and the experiences of other practitioners and students whom we observed. We synthesized these case studies in search of a coherent, useful framework.
One limitation of this work is that the case studies are mainly from our own research/design/teaching experiences. This is neither a representative sample nor a comprehensive one. The meta-analysis nature of our research goal calls for an extensive collection of AI design projects, ideally covering all kinds of AI systems for all kinds of design contexts. This is beyond what one paper can achieve. The synthesis of our experience and the resulting framework are intended to serve as a moderate first step in this direction.
4.1 An Operational Bounding of AI
The definitions of AI generally fall into two camps [ ]. One describes AI as computers that perform tasks typically associated with the human mind ("AI is whatever machines haven't done yet" [ ]). The other defines AI in relation to certain computational mechanisms. We chose a widely-adopted definition from the latter camp, because our focus is AI the technology, rather than what people perceive as "intelligent".
In this work, AI refers to computational systems that interpret external data, learn from such data, and use those learnings to achieve specific goals and tasks through flexible adaptation. [27]
Importantly, we did not intend to draw a technical boundary of what counts as AI here. Nor do we consider this definition valuable for HCI practitioners working with AI. Instead, we used this definition only to examine whether the systems that are technically considered AI indeed require new HCI design methods. For example, this definition describes AI as "learning" from data, yet does not specify what counts as "learning." (This remains an issue of debate in technical AI communities.) Therefore, in our synthesis, we considered the challenges designers reported in working with a full range of data-driven systems, including machine learning, classic expert systems, crowdsourcing, etc. We then examined whether the challenges differ across the spectrum, from systems that we all agree "learned" from data to those that we all agree did not. This way, we started to separate the design challenges that are unique to AI from those HCI routinely copes with.
4.2 UX Design Processes as Data
Within this bounding, we curated a set of AI design processes from our own research, design, and teaching experience. All projects described below, except teaching, have been published at DIS and CHI. We re-analyzed the data collected across these projects for the purpose of this work. Below is a brief overview of these projects.
4.2.1 Designing the UX of AI Applications. First, we draw on our many years of experience in designing a wide range of AI systems, from simple adaptive user interfaces [ ] to large-scale crowd-sourced transportation systems [ ], and from clinical decision supports [ ] to natural language productivity tools [ ]. These experiences enabled us to give a firsthand account of the design challenges of AI, as well as a felt understanding of the solutions that naturally emerged from the process.
4.2.2 Studying UX Practitioners. We have studied HCI/UX practitioners and their AI engineer collaborators in two projects. The first project focused on novice AI product designers [ ]. We interviewed 14 product designers/managers and surveyed 98 more to understand how they incorporated, or failed to incorporate, AI in their products. We also interviewed the 10 professional AI engineers they hired to better understand where and how designers sought help. The second project focused on experienced UX practitioners [ ]. We interviewed 13 designers who had designed AI applications for many years, in order to understand how they work with AI differently compared to working with other interactive technologies. Synthesizing and contrasting the findings across these two studies, we were able to see how novice and expert designers approached the design challenges of AI differently.
4.2.3 Teaching UX Design of AI Applications. Another set of observations comes from our teaching. We hosted a series of Designing AI workshops. Each workshop lasted for a day, with one instructor working with 2-3 students. The instructor first gave a half-hour introduction to AI, and then provided students with a dataset and a demonstration AI system. Students were asked to design new products/services with these materials for an enterprise client. 26 HCI Master's students from two universities attended the workshops. All of them had little to no technical AI background. Throughout the series, we experimented with different ways of introducing AI. We observed how students used the AI technical knowledge in their designs, where and how they struggled, and which challenges they were able to resolve with known design methods.
We also taught a one-semester design studio course: Designing AI Products and Services. Approximately 40 undergraduate and master's students took the course. About half of them had a computer science or data science background. In comparison to the workshops, the course allowed us to observe students working with a more diverse set of AI systems and design tasks, e.g., designing crowds as a proxy for AI, designing simple UI adaptations, and designing natural language processing/generation applications.
4.3 Data Analysis
With this diverse set of design processes and observations, we synthesized a framework meant to give structure to the many challenges around human-AI interaction design. We started by proposing many themes that might summarize these challenges. We then analyzed the emergent themes via affinity diagramming, with a focus on the characteristics of AI that may scaffold a full range of design challenges. Specifically, we critiqued these frameworks based on three criteria:
Analytical leverage: The framework should effectively scaffold a wide range of AI's design opportunities and challenges. It should help separate design challenges unique to AI from others;
Explanatory power: The framework should help researchers articulate how a proposed design method/tool/workflow contributes to addressing the challenges of human-AI interaction design, and the limits of its generalizability;
Constructive potential: The framework should not only serve as a holder of AI's known challenges and solutions; it should also provide new insights for future research.
We proposed and discussed more than 50 thematic constructs and frameworks. The three authors, an external faculty member, and an industry researcher participated in this process. All have spent at least 5 years researching AI and HCI. We also presented this work to two research groups. One included about 40 HCI researchers. The other included 12 machine learning researchers. They provided additional valuable critiques and helped us refine the framework.
Our synthesis identied two attributes of AI that are central to the
struggles of human-AI interaction design: capability uncertainty
(uncertainties surrounding what the system can do and how well it
performs) and output complexity (complexity of the outputs that
the system might generate). Both dimensions function along a con-
tinuum. Together they form a valuable framework for articulating
the challenges of human-AI interaction. This section describes the
framework. In the next section, we demonstrate its usefulness.
5.1 Two Sources of AI Design Complexity
5.1.1 Capability Uncertainty. When speaking of the capabilities of AI, we broadly refer to the functionality AI systems can afford (e.g., detect spam emails, rank news feeds, find optimal driving routes), how well the system performs, and the kinds of errors it produces. The capabilities of AI are highly uncertain. We illustrate this by walking through the lifetime of an AI system, moving from an emergent algorithmic capability in AI research labs to situated user experience in the wild (Figure 3, left to right).
AI's capability uncertainty is at its peak in the early design ideation stage, when designers work to understand what design possibilities AI can offer generally. This is not easy because there exists no catalog of available AI capabilities. What might seem like a blue-sky AI design idea may suddenly become possible because of a newly available dataset. The performance of a deployed AI system can constantly fluctuate and diverge as it gains new data to improve its learning. This great uncertainty in AI's capabilities makes it difficult for designers to evaluate the feasibility of their emergent ideas, thereby hindering their creative processes.
The range of AI’s available capabilities includes more than the
capabilities of existing AI systems. It includes any AI things that are
technically feasible. When envisioning AI systems that do not yet
exist, designers face additional capability uncertainties. For example,
designers may choose to harvest their own dataset from users’
traces of interaction. This approach gives designers a relatively
high degree of control over the data they will eventually work with.
However, it is often very difficult to estimate how long it might take
to collect enough high-quality data and to achieve the intended
functionality. Designers frequently worked with user-generated
data in order to understand available AI capabilities. To understand
AI’s capabilities, to a great extent, is to understand this gap between
what the data appear to promise and what the AI system built from
that data can concretely achieve. As one expert designer we interviewed
describes: To understand what AI can do is to conceptualize “a
funnel of what (data and/or system) exists and what is possible." [ ]
Figure 3: The conceptual pathway translating between AI’s capabilities and thoughtful designs of human-AI interaction. AI’s
capability uncertainty and output complexity add additional steps (the colored segments) to a typical HCI pathway, making some
systems distinctly difficult to design. Designers encounter these challenges from left to right when taking a technology-driven
innovation approach; right to left when following a user-centered design process.
Re-examining Whether, Why, and How
Human-AI Interaction Is Uniquely Difficult to Design CHI ’20, April 25–30, 2020, Honolulu, HI, USA
Alternatively, designers may choose to leverage existing AI li-
braries or pre-built models to address their design problem at hand.
These systems free designers from the data troubles and allow them
to get a felt experience of the AI interactions. Unfortunately, these
toolkits represent a very narrow subset of the whole landscape of
AI capabilities.
What AI can do for a design problem at hand becomes clearer
once a functioning AI system is built. For most systems trained
on self-contained datasets, designers can measure and articulate
their performance and error modes. They then make design choices
accordingly. However, this performance is limited by any biases
present in a dataset and should only be viewed as an initial estimate
(labeled as “system performance” in Figure 3).
Some AI systems continue to learn from new data after deployment
(labeled as “deployed system performance over time” in Figure
3). In the ideal case, the system will “grow," integrating new insights
from new data and adapting flexibly to more varieties of users and
use contexts. Unfortunately, the new data might also drive system
performance in the wrong direction. Tay, the Twitter bot, provides
an extreme example [ ]. More typically, the system’s performance
improves for users and use contexts that have produced rich data.
It performs worse for less frequent users and less typical situations.
That the system capability can constantly evolve, fluctuate, and
diversify is another part of AI’s capability uncertainty.
Finally, user profiles and use contexts could also impact an AI
system’s capability. Many context-aware and personalization systems
fall into this category. Consider the social media news feed ranker,
Amazon shopping recommendations, and ride-hailing app’s driver-
rider matching as examples. It is not difficult to conceptualize what
these systems can do in general (e.g. ranking news, recommending
items); however, it is no trivial task to envision, for a particular user
in a particular use context, what error the AI system might make,
and how the user might perceive that error in-situ. Anticipating
the situated, user-encountered capability of AI is difficult, yet it is
fundamental to user experience design.
5.1.2 Output Complexity. The second source of human-AI interac-
tion challenges concerns what an AI system produces as a possible
output. While capability uncertainty is responsible for the HCI
design challenges around understanding what AI can do, AI’s out-
put complexity affects how designers conceptualize the system’s
behaviors in order to choreograph its interactions.
Many valuable AI systems generate a small set of possible out-
puts. Designing interactions for these systems is similar to design-
ing for non-AI systems that generate probabilistic outputs. A face
detection tool, for example, outputs either “face" or “not face." To
design its interactions, the designer considers four scenarios: when
a face is correctly detected (true positive), when there is no face and
no face is detected (true negative), when there is no face and a face
is mistakenly detected (false positive), and when the image contains
a face but the system fails to detect it (false negative). Designers
consider each condition and design accordingly.
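The enumeration above can be sketched in a few lines of code. The following is our own illustration of that four-condition design exercise; the function name and the scenario labels are hypothetical, not from any real face-detection API.

```python
# A minimal sketch of enumerating the interaction states of a binary
# (face / not-face) classifier. Because the output space is tiny, the
# full set of design scenarios is enumerable by hand or by code.
from itertools import product

def design_scenario(face_present: bool, face_detected: bool) -> str:
    """Map a (ground truth, system output) pair to the UX case to design for."""
    if face_present and face_detected:
        return "true positive"
    if not face_present and not face_detected:
        return "true negative"
    if not face_present and face_detected:
        return "false positive"
    return "false negative"

# Enumerate every combination of ground truth and prediction:
cases = {design_scenario(t, p) for t, p in product([True, False], repeat=2)}
assert cases == {"true positive", "true negative",
                 "false positive", "false negative"}
```

For such Level 1 systems, this exhaustive enumeration is exactly what makes traditional sketching and prototyping tractable; the sections that follow show where it breaks down.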
When designing systems that produce many possible outputs,
sketching and prototyping become more complex and cognitively
demanding. Imagine designing the interactions of a driving route
recommender. How many types of errors could the recommender
possibly produce? How might a user encounter, experience, and
interpret each kind of error, in various use contexts? How can
interaction design help the user to recover from each error
elegantly? Some simulation-based methods or iML tools can seem
necessary for prototyping and accounting for the route recommender’s
virtually infinite variability of outputs. The route recommender
exemplifies the many AI systems that produce open-ended, adaptive
outputs. The traditional, manual sketching and prototyping
methods struggle to fully capture the UX ramifications of such
systems.
The system outputs that entail the most design complexity are
those that cannot be simulated. Consider Siri as an example. Similar
to route recommenders, Siri can generate infinite possibilities
of outputs. Yet unlike route recommenders, the relationship between
Siri’s inputs and outputs follows complex patterns that cannot
be concisely described. As a result, rule-based simulators cannot
meaningfully simulate Siri’s utterances; nor can a human wizard.
We refer to such AI system outputs as “complex."
Notably, output complexity is not output unpredictability. While
prior research often viewed AI systems’ unpredictable errors as
causing UX troubles, we argue that AI’s output complexity is the
root cause. Let us illustrate this by considering how designers might
account for AI errors when designing two different conversational
systems. One is Siri. The other is a system that always replies to
user requests with a random word picked from a dictionary. While
highly unpredictable, the interactions of the latter system can be
easily simulated by a random word generator. Then, following a
traditional prototyping process, designers can start to identify and
mitigate the AI’s costly errors. In contrast, Siri’s outputs are only
quasi-random and therefore resist abstraction or simulation. To date, it
remains unclear how to systematically prototype the UX of such
systems in order to account for their breakdowns.
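The contrast drawn above can be made concrete in code. The sketch below implements the hypothetical random-word system; the word list is a stand-in for a full dictionary, and the function name is our own.

```python
# A sketch of the random-word conversational system described above.
# Its outputs are maximally unpredictable, yet its input-output mapping
# is a single line, so designers can fully simulate it for prototyping.
import random

DICTIONARY = ["apple", "nebula", "quorum", "velvet"]  # stand-in word list

def random_reply_bot(user_utterance: str, rng: random.Random) -> str:
    """Any input maps to a uniformly random dictionary word."""
    return rng.choice(DICTIONARY)

rng = random.Random(0)  # seeding makes prototype walkthroughs repeatable
assert random_reply_bot("What's the weather?", rng) in DICTIONARY
# No comparably concise simulator exists for a Siri-like system: its
# quasi-random input-output mapping resists this kind of abstraction.
```

The point is that a complete simulator for this maximally unpredictable system fits in one function, whereas no simple rule set captures Siri's behavior.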
5.2 Two Complexity Sources Taken Together
Prior research has identified a wealth of human-AI interaction
design challenges. These challenges stem from AI’s capability
uncertainty and output complexity. For instance, designers struggled
to understand what AI can and cannot do even when they understood
how AI works [ ]; this is because the capabilities of an AI
system can be inherently uncertain and constantly evolving.
Designers struggled to rapidly prototype human-AI interaction [ ]
because the interactions of two mutually adaptive agents resist
easy abstraction or simulation. Designers struggled to follow a
typical user-centered design workflow when designing human-AI
interactions [ ]. This is because the central point of a double
diamond process is to identify a preferred future, a defined design
goal that existing technologies can achieve. However, AI systems
have capabilities that do not fully take shape until after deployment,
so the preferred future can seem like “a funnel of what’s possible",
rather than what is concretely achievable.
Figure 3 maps the challenges onto the translation process be-
tween technological capabilities and user experience. When taking
a user-centered design approach, designers will encounter the chal-
lenges from the right to left. Taking a technology-driven design
innovation approach, from left to right. This diagram explains why
a similar set of design challenges appeared to have thwarted both
technology-driven and user-centered AI design processes.
AI’s evolving capabilities and adaptive behaviors have made
it a particularly powerful material for HCI and UX design. The
same qualities have also brought distinctive design challenges.
Human-AI interaction design and research, therefore, should not
simplistically reject AI’s capability uncertainty or output complex-
ity/unpredictability. Rather, it is important to understand how to
leverage these unique qualities of AI for desirable human ends,
while minimizing their unintended consequences.
In this section, we demonstrate the usefulness of the framework.
Specifically, its usefulness to those who design human-AI interactions
(section 6.1), to researchers of particular HCI issues of AI (6.2),
and to researchers who innovate AI design methods and tools (6.3).
6.1 Four Levels of AI Systems
The framework can help expose whether and how a given AI system
is difficult to design with traditional HCI design processes and
methods (Figure 5). Existing HCI sketching and prototyping methods
can sufficiently cover level one systems: systems with known
capabilities and few possible outputs. New challenges emerge when
designers work with systems that produce a broad set of possible
outputs, and when the deployed system continues to learn from
new user data. Figure 3 illustrates how designers are likely to encounter
the effects of these challenges in their design process.
For practitioners, the framework helps them identify the low-
hanging fruit in integrating AI into their practice. For HCI
researchers [ ], the framework helps to identify the unique
challenges of human-AI interaction design and make a targeted
contribution.
Figure 4: The AI design complexity map. The two dimen-
sions of this map – capability uncertainty and output com-
plexity – outline whether and why a particular AI system is
dicult to design.
To make the framework easier to use as an analytical tool, we
summarized four levels of AI systems according to their design
complexity (Figure 5). We demonstrate its usefulness using Levels
1 and 4 systems as examples since they represent the two extremes
of AI’s design complexity. The design challenges of Level 4 are also
a superset of issues encountered in Levels 2 and 3.
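As a rough operationalization of these four levels, each of the two dimensions can be treated as a binary flag. This simplification is our own illustration; the paper describes both dimensions as continua, and the names below are hypothetical.

```python
# A rough, illustrative operationalization of the four levels of AI
# design complexity, treating each dimension as a binary flag.
from dataclasses import dataclass

@dataclass
class AISystem:
    name: str
    evolving_capability: bool  # learns from new data post-deployment?
    complex_outputs: bool      # open-ended, adaptive outputs?

def design_complexity_level(s: AISystem) -> int:
    if not s.evolving_capability and not s.complex_outputs:
        return 1  # probabilistic system: bounded capability, simple outputs
    if not s.evolving_capability:
        return 2  # bounded capability, broad or complex outputs
    if not s.complex_outputs:
        return 3  # evolving capability, simple outputs
    return 4      # evolving adaptive system

assert design_complexity_level(AISystem("toxicity flagger", False, False)) == 1
assert design_complexity_level(AISystem("photo face tagger", True, True)) == 4
```

Such a classification makes explicit why Levels 1 and 4 bracket the design-complexity space: each dimension that is "high" adds a distinct subset of challenges.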
6.1.1 Level one: probabilistic systems (with bounded capabilities,
producing simple outputs.) Level one systems learn from a self-
contained dataset. They produce a small, fixed set of outputs. For
example: face detection in camera apps, adaptive menus that rank
which option the user is more likely to choose, clinical diagnostic
systems that make a diagnosis, and text toxicity detectors that classify
a sentence as profane, threat, or non-toxic.
Designers can design the UX of these systems in similar ways
as designing non-AI, probabilistic systems. They are unlikely to
encounter the distinctive challenges of human-AI interaction design.
Consider this design situation: a design team wants to help online
community moderators more easily promote civil discourse by
using an off-the-shelf text classifier that flags toxic comments.
No particular challenges in understanding AI capabilities: By playing
with the system, the designers can easily develop a felt
understanding of what the classifier can and cannot do. Because
the system will not learn from new data, this understanding will
remain valid post-deployment.
No particular challenges in envisioning novel and technically feasible
designs of the technology: Designers can easily imagine many
use scenarios in which the flagging-profane-text functionality
can provide value.
No particular challenges in iterative prototyping and testing: Because
the outputs of the system are limited (profane, not profane),
designers can simply enumerate all the ways in which the
interactions may unfold (false positive, false negative, etc.) and
make interactive prototypes accordingly.
No particular challenges in collaborating with engineers: Once
the designers understand the functionality and the likely performance
and errors of the classifier, they can design as usual and
provide wireframes as a deliverable to engineers at the end of
their design process.
Language toxicity detection is a complex technical problem at
the frontier of AI research. However, because the system’s capabilities
are bounded and the outputs are simple, existing HCI design
methods are sufficient in supporting designers in sketching, prototyping,
and assessing its interactions. Language toxicity detection exemplifies
level one systems; they are valuable, low-hanging fruits for HCI
practitioners to integrate into today’s products and services.
6.1.2 Level four: evolving adaptive systems (with evolving capabilities,
producing complex outputs.) Level four systems learn from new
data even after deployment. They also produce adaptive, open-
ended outputs that resist abstraction. Search engines, newsfeed
rankers, automated email replies, and recommender systems that
suggest “items you might like" would all fit in this category. In designing
such systems, designers can encounter a full range of human-AI
interaction design challenges. Consider the face recognition system
within a photos app. It learns from the photos the user uploaded,
Figure 5: Four levels of AI systems according to design complexity.
clusters similar faces across photos, and automatically tags the face
with the name inferred from the user’s previous manual tags.
Challenges in understanding AI capabilities: The system’s performance
and error modes are likely to change as it learns from
new images and tags. Therefore, prior to deployment, it is difficult
to anticipate what the system can reliably do, when and
how it is likely to fail. This, in turn, makes it difficult to design
appropriate interactions for these scenarios.
Challenges in envisioning novel and technically feasible designs of
the technology: Re-imagining many new uses of a face-recognition-
and-tagging tool – beyond tagging people in photos – can be
difficult. This is because its capabilities are highly evolved and
specialized for its intended functionality and interactions.
Challenges in iterative prototyping and testing: The system’s capabilities
evolve over time as users contribute more images and
manual tags, challenging the very idea of rapid prototyping.
Challenges in collaborating with engineers: The system requires
a closer and more complex HCI-AI collaboration than in a
traditional double-diamond process. Engineers and designers
need to collaborate on understanding how the face-recognition
performance will evolve with users’ newly uploaded photos and
tags, how to anticipate and mitigate the AI’s potential biases and
errors, as well as how to detect AI errors from user interactions
so as to improve system learning.
Face recognition and tagging is a relatively mature technology
that many people use every day. However, because its capabilities
are constantly evolving and its outputs are diverse, systematically
sketching, prototyping, and assessing the UX of face tagging remains
challenging. It exemplifies level four systems; these are
opportune areas for HCI and RtD researchers to study human-AI
interaction and design, without getting deeply involved in technological
complexities.
6.2 The Anatomy of AI’s HCI Issue
For researchers who study specific human-AI interaction design
issues (e.g. fairness, intelligibility, users’ sense of control, etc.), the
proposed framework gives a preliminary structure to these vast
issues. Take as an example the challenges surrounding accounting
for AI biases, a challenge that many critical AI systems face
across application domains such as healthcare and criminal justice.
Building a “fair" AI application is widely considered difficult,
due to the complexity both in defining fairness goals and in
algorithmically achieving the defined goals. Prior research has started
addressing these challenges and has contributed valuable guidelines
and evaluation metrics [4, 35].
Our framework provides a more holistic structure to the problem
space of “AI fairness” (Figure 6). It illustrates that current work
has mostly focused on building “a fair AI system pre-deployment”;
that algorithmic fairness is only part of the whole “AI fairness”
problem space. There is a real need for HCI and AI research in
collaboratively translating fairness as an optimization problem into
a feature of the AI socio-technical system (Figure 6, blue segment),
Figure 6: An example of the framework in use. Using the framework, researchers can easily outline the problem space of a
human-AI interaction issue of their interest, for example, the issue of AI fairness.
and into a situated user experience of fairness (yellow segment).
The framework suggests a tentative agenda for these important
future research topics.
6.3 Implications for Design Methods and Tools
Recent HCI research has started to propose new methods, tools,
and workflows for designing human-AI interaction. However,
empirically assessing the effectiveness of design tools is extremely
difficult. Assessing their generalizability to other AI systems or
across application domains only adds to the challenge.
The proposed framework intends to allow for a more principled
research discussion on how to support human-AI interaction design
practice. It helps to identify the unique challenges AI brings
to HCI practice across application domains. It can help researchers
to articulate the contribution of their emergent AI design
methods/tools/workflows to these challenges as well as their scope of
generalizability. Finally, it can provide new insights into how to
address the remaining challenges.
We consider UX prototyping methods of AI as an example.
1. Identifying root challenges. Current research typically attributes
the difficulty of prototyping AI to AI’s technical complexity
or reliance on big data. However, HCI routinely grapples with
complex, resource-intensive technologies using simple prototypes.
What makes AI unique? Our framework suggests that the root challenge
of prototyping AI instead lies in its “output complexity” as well
as its “capability uncertainty”: AI’s capabilities are adaptive and its
outputs can autonomously diverge at a massive scale. Such systems
problematize conventional HCI prototyping methods that treat
a technology’s affordances as bounded and its interactions as prescriptive.
These methods can work when prototyping AI as an optimization
system in the lab (level one). They can fail to fully address
AI’s ramifications over time as a real-world, sociotechnical system.
2. Articulating the contributions and limits of emergent design
methods/tools/processes. To make prototyping human-AI interaction
easier, researchers have created simple rule-based simulators
(e.g., [ ]) as AI prototyping tools. Mapping the characteristics of
rule-based interactions onto the AI design complexity map (Figure
4), it becomes evident that rule-based simulators are most effective
in prototyping level 1-2 systems. They can be particularly
valuable for systems that generate a broad set of outputs (level
2), where traditional, manual prototyping methods struggle. However,
rule-based simulators cannot easily prototype systems that
autonomously learn from user-generated data (levels 3-4). These are
living, sociotechnical systems; the rules that map their inputs to
outputs evolve in complex ways over time.
3. Providing new insights for future research. Framing level 3 and
4 AI systems as living, sociotechnical systems reveals new insights
into how we might more effectively prototype their interactions.
For example, CSCW research has investigated how to prototype
workplace knowledge sharing systems whose affordance co-evolves
with its users’ behaviors, the interactions among its users, and the
organizational contexts at large [ ]. These, too, are living, sociotechnical
systems with uncertain capabilities and complex outputs. This
body of work, though not typically considered as related to AI, could
offer a valuable starting place for considering how we might
prototype human-AI interactions in the wild, over time. In this light,
the proposed conceptual framework offers actionable insights for
addressing the challenges of prototyping AI methodologically.
AI plays an increasingly important role in improving HCI and user
experience. Today, designers face challenges in working with AI
at almost every step of their design processes. Prior research of-
ten attributed these challenges to AI’s algorithmic complexity and
unpredictable system behaviors. Our synthesis of these challenges
provided an alternative view to this common assumption. We drew
attention to the process of human-AI interaction design rather than
its end product; we drew attention to the challenges of sketching
and prototyping, rather than those of assessment; we drew atten-
tion to AI’s design complexities rather than the complexities of
its algorithms. These new perspectives helped us to identify an
initial framework that explains whether, why, and how human-AI
interaction is difficult to design. We outlined four levels of AI design
complexity that trace how designers may encounter the resulting
challenges in their design processes. We encourage fellow HCI and
UX researchers to critique, evaluate, and improve on this framework
based on their respective design and research experiences.
ACKNOWLEDGMENTS
The contents of this paper were partially developed under a grant
from the National Institute on Disability, Independent Living, and
Rehabilitation Research (NIDILRR grant number 90REGE0007). The
first author was also supported by the Center for Machine Learning
and Health (CMLH) Fellowships in Digital Health and the 2019
Microsoft Research Dissertation Grant. We thank Karey Helms,
Saleema Amershi, and other contributing researchers for providing
valuable input on the framework. We thank Eunki Chung and
Nikola Banovic for their support of the Designing AI workshops.
REFERENCES
2009. wekinator: Software for real-time, interactive machine learning. http:
2017. Designing the User Experience of Machine Learning Systems, Papers from
the 2017 AAAI Spring Symposium, Technical Report SS-17-04, Palo Alto, California,
USA, March 27-29, 2017. AAAI. ML/
2018. The Design of the User Experience for Artificial Intelligence (The UX of AI),
Papers from the 2018 AAAI Spring Symposium, Palo Alto, California, USA, March
26-28, 2018. AAAI. AI/
Saleema Amershi, Dan Weld, Mihaela Vorvoreanu, Adam Fourney, Besmira Nushi,
Penny Collisson, Jina Suh, Shamsi Iqbal, Paul Bennett, Kori Inkpen, Jaime Teevan,
Ruth Kikin-Gil, and Eric Horvitz. 2019. Guidelines for Human-AI Interaction.
human-ai- interaction/
Sara Bly and Elizabeth F Churchill. 1999. Design through matchmaking: technol-
ogy in search of users. interactions 6, 2 (1999), 23–31.
Kirsten Boehner and Carl DiSalvo. 2016. Data, Design and Civics: An Exploratory
Study of Civic Tech. In Proceedings of the 2016 CHI Conference on Human Factors
in Computing Systems (CHI ’16). ACM, New York, NY, USA, 2970–2981. https:
Sander Bogers, Joep Frens, Janne van Kollenburg, Eva Deckers, and Caroline
Hummels. 2016. Connected Baby Bottle: A Design Case Study Towards a Frame-
work for Data-Enabled Design. In Proceedings of the 2016 ACM Conference on
Designing Interactive Systems (DIS ’16). ACM, New York, NY, USA, 301–311.
Shan Carter and Michael Nielsen. 2017. Using Artificial Intelligence to Aug-
ment Human Intelligence. Distill (2017).
Amber Cartwright. 2016. Invisible Design: Co-Designing with Machines. http:
Design Council. 2005. The ‘double diamond’ design process model. Design
Council (2005).
Justin Cranshaw, Emad Elwany, Todd Newman, Rafal Kocielnik, Bowen Yu,
Sandeep Soni, Jaime Teevan, and Andrés Monroy-Hernández. 2017. Calendar.
help: Designing a workflow-based scheduling agent with humans in the loop. In
Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems.
ACM, 2382–2393.
Scott Davidoff, Brian D. Ziebart, John Zimmerman, and Anind K. Dey. 2011.
Learning Patterns of Pick-Ups and Drop-Offs to Support Busy Family Coordi-
nation. In Proceedings of the SIGCHI Conference on Human Factors in Computing
Systems (CHI ’11). Association for Computing Machinery, New York, NY, USA,
Graham Dove, Kim Halskov, Jodi Forlizzi, and John Zimmerman. 2017. UX
Design Innovation: Challenges for Working with Machine Learning as a Design
Material. In Proceedings of the 2017 CHI Conference on Human Factors in Computing
Systems - CHI ’17. ACM Press, New York, New York, USA, 278–288. https:
Melanie Feinberg. 2017. A Design Perspective on Data. In Proceedings of the 2017
CHI Conference on Human Factors in Computing Systems (CHI ’17). ACM, New
York, NY, USA, 2952–2963.
Jodi Forlizzi. 2018. Moving Beyond User-centered Design. Interactions 25, 5 (Aug.
2018), 22–23.
Michael Freed, Jaime G Carbonell, Geoffrey J Gordon, Jordan Hayes, Brad A Myers,
Daniel P Siewiorek, Stephen F Smith, Aaron Steinfeld, and Anthony Tomasic.
2008. RADAR: A Personal Assistant that Learns to Reduce Email Overload.. In
AAAI. 1287–1293.
William W. Gaver. 1991. Technology Affordances. In Proceedings of the SIGCHI
Conference on Human Factors in Computing Systems (CHI ’91). ACM, New York,
NY, USA, 79–84.
Marco Gillies, Bongshin Lee, Nicolas D’Alessandro, Joëlle Tilmanne, Todd
Kulesza, Baptiste Caramiaux, Rebecca Fiebrink, Atau Tanaka, Jérémie Gar-
cia, Frédéric Bevilacqua, Alexis Heloir, Fabrizio Nunnari, Wendy Mackay, and
Saleema Amershi. 2016. Human-Centred Machine Learning. In Proceedings
of the 2016 CHI Conference Extended Abstracts on Human Factors in Comput-
ing Systems - CHI EA ’16. ACM Press, New York, New York, USA, 3558–3565.
Fabien Girardin and Neal Lathia. 2017. When User Experience Designers Partner
with Data Scientists. In The AAAI Spring Symposium Series Technical Report:
Designing the User Experience of Machine Learning Systems. The AAAI Press, Palo
Alto, California.
Mayank Goel, Nils Hammerla, Thomas Ploetz, and Anind K. Dey. 2015. Bridging
the Gap: Machine Learning for Ubicomp - Tutorial @UbiComp 2015. https:
// gap/.
Google. 2019. People + AI Guidebook: Designing human-centered AI products.
[22] Patrick Hebron. 2016. Machine learning for designers. O’Reilly Media.
Patrick Hebron. 2016. New York University Tisch School of the Arts Course: Learning
Karey Helms. 2019. Do You Have to Pee?: A Design Space for Intimate and Somatic
Data. In Proceedings of the 2019 on Designing Interactive Systems Conference (DIS
’19). ACM, New York, NY, USA, 1209–1222.
Douglas R Hofstadter et al. 1979. Gödel, Escher, Bach: an eternal golden braid.
Vol. 20. Basic books New York.
Lars Erik Holmquist. 2017. Intelligence on tap: artificial intelligence as a new
design material. interactions 24, 4 (2017), 28–33.
Andreas Kaplan and Michael Haenlein. 2019. Siri, Siri, in my hand: Who’s the
fairest in the land? On the interpretations, illustrations, and implications of
articial intelligence. Business Horizons 62, 1 (2019), 15–25.
Claire Kayacik, Sherol Chen, Signe Noerly, Jess Holbrook, Adam Roberts, and
Douglas Eck. 2019. Identifying the Intersections: User Experience + Research
Scientist Collaboration in a Generative Machine Learning Interface. In Extended
Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems
(CHI EA ’19). ACM, New York, NY, USA, Article CS09, 8 pages.
Scott R. Klemmer, Anoop K. Sinha, Jack Chen, James A. Landay, Nadeem
Aboobaker, and Annie Wang. 2000. Suede: A Wizard of Oz Prototyping Tool
for Speech User Interfaces. In Proceedings of the 13th Annual ACM Symposium
on User Interface Software and Technology (UIST ’00). ACM, New York, NY, USA,
Esko Kurvinen, Ilpo Koskinen, and Katja Battarbee. 2008. Prototyping social
interaction. Design Issues 24, 3 (2008), 46–57.
Shane Legg, Marcus Hutter, et al. 2007. A collection of definitions of intelligence.
Brian Y Lim, Anind K Dey, and Daniel Avrahami. 2009. Why and why not
explanations improve the intelligibility of context-aware intelligent systems. In
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.
ACM, 2119–2128.
Panagiotis Louridas. 1999. Design as bricolage: anthropology meets design
thinking. Design Studies 20, 6 (1999), 517–535.
Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman,
Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019.
Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Ac-
countability, and Transparency (FAT* ’19). Association for Computing Machinery,
New York, NY, USA, 220–229.
Gina Neff and Peter Nagy. 2016. Talking to Bots: Symbiotic Agency and the
Case of Tay. International Journal of Communication (19328036) 10 (2016).
William Odom and Tijs Duel. 2018. On the Design of OLO Radio: Investigating
Metadata As a Design Material. In Proceedings of the 2018 CHI Conference on
Human Factors in Computing Systems (CHI ’18). ACM, New York, NY, USA, Article
104, 9 pages.
Kayur Patel, James Fogarty, James A. Landay, and Beverly Harrison. 2008. Ex-
amining difficulties software developers encounter in the adoption of statistical
machine learning. In 23rd AAAI Conference on Artificial Intelligence and the 20th
Innovative Applications of Artificial Intelligence Conference. Chicago, IL, United
States, 1563–1566.
Kayur Dushyant Patel. 2012. Lowering the Barrier to Applying Machine Learning.
Ph.D. Dissertation. University of Washington.
Laurel D. Riek. 2012. Wizard of Oz Studies in HRI: A Systematic Review and
New Reporting Guidelines. J. Hum.-Robot Interact. 1, 1 (July 2012), 119–136.
[41] Eric Ries. 2011. The lean startup: How today’s entrepreneurs use continuous inno-
vation to create radically successful businesses. Crown Books.
Antonio Rizzo, Francesco Montefoschi, Maurizio Caporali, Antonio Gisondi,
Giovanni Burresi, and Roberto Giorgi. 2017. Rapid Prototyping IoT Solutions
Based on Machine Learning. In Proceedings of the European Conference on
Cognitive Ergonomics 2017 (ECCE 2017). ACM, New York, NY, USA, 184–187.
Albrecht Schmidt. 2000. Implicit human computer interaction through context.
Personal technologies 4, 2-3 (2000), 191–199.
Lisa Stifelman, Adam Elman, and Anne Sullivan. 2013. Designing Natural Speech
Interactions for the Living Room. In CHI ’13 Extended Abstracts on Human Factors
in Computing Systems (CHI EA ’13). ACM, New York, NY, USA, 1215–1220. https:
Maria Stone, Frank Bentley, Brooke White, and Mike Shebanek. 2016. Embedding
User Understanding in the Corporate Culture: UX Research and Accessibility at
Yahoo. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human
Factors in Computing Systems. ACM, 823–832.
Peter Stone, Rodney Brooks, Erik Brynjolfsson, Ryan Calo, Oren Etzioni, Greg
Hager, Julia Hirschberg, Shivaram Kalyanakrishnan, Ece Kamar, Sarit Kraus, et al.
2016. Artificial intelligence and life in 2030. One Hundred Year Study on Artificial
Intelligence: Report of the 2015-2016 Study Panel (2016).
Jennifer Sukis. 2019. AI Design & Practices Guidelines (A Review). https:// design-guidelines- e06f7e92d864.
Mary Treseler. 2017. Designing with Data: Improving the User Experience with
A/B Testing. O’Reilly Media, Chapter Designers as data scientists. http://radar. scientists.html
Philip van Allen. 2018. Prototyping Ways of Prototyping AI. Interactions 25, 6
(Oct. 2018), 46–51.
Wikipedia contributors. 2019. Artificial intelligence — Wikipedia, The Free Encyclopedia.
Qian Yang, Nikola Banovic, and John Zimmerman. 2018. Mapping Machine
Learning Advances from HCI Research to Reveal Starting Places for Design
Research. In Proceedings of the SIGCHI Conference on Human Factors in Computing
Systems (CHI '18). ACM.
Qian Yang, Justin Cranshaw, Saleema Amershi, Shamsi T. Iqbal, and Jaime Teevan.
2019. Sketching NLP: A Case Study of Exploring the Right Things To Design
with Language Intelligence. In Proceedings of the 2019 CHI Conference on Human
Factors in Computing Systems (CHI ’19). ACM, New York, NY, USA, Article 185,
12 pages.
Qian Yang, Alex Scuito, John Zimmerman, Jodi Forlizzi, and Aaron Steinfeld. 2018.
Investigating How Experienced UX Designers Effectively Work with Machine
Learning. In Proceedings of the 2018 Designing Interactive Systems Conference (DIS
'18). ACM, New York, NY, USA, 585–596.
Qian Yang, Aaron Steinfeld, and John Zimmerman. 2019. Unremarkable AI: Fitting
Intelligent Decision Support into Critical, Clinical Decision-Making Processes. In
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
(CHI '19). ACM, New York, NY, USA, Article 238, 11 pages.
Qian Yang, Jina Suh, Nan-Chen Chen, and Gonzalo Ramos. 2018. Grounding
Interactive Machine Learning Tool Design in How Non-Experts Actually Build
Models. In Proceedings of the 2018 Designing Interactive Systems Conference (DIS
’18). ACM, New York, NY, USA, 573–584.
Qian Yang, John Zimmerman, Aaron Steinfeld, Lisa Carey, and James F Antaki.
2016. Investigating the Heart Pump Implant Decision Process: Opportunities
for Decision Support Tools to Help. In Proceedings of the 2016 CHI Conference on
Human Factors in Computing Systems. ACM, 4477–4488.
Qian Yang, John Zimmerman, Aaron Steinfeld, and Anthony Tomasic. 2016.
Planning Adaptive Mobile Experiences When Wireframing. In Proceedings of
the 2016 ACM Conference on Designing Interactive Systems (DIS '16). ACM Press,
Brisbane, QLD, Australia, 565–576.
John Zimmerman, Jodi Forlizzi, and Shelley Evenson. 2007. Research through
design as a method for interaction design research in HCI. In Proceedings of the
SIGCHI conference on Human factors in computing systems. ACM, 493–502.
John Zimmerman, Anthony Tomasic, Charles Garrod, Daisy Yoo, Chaya Hirun-
charoenvate, Rafae Aziz, Nikhil Ravi Thiruvengadam, Yun Huang, and Aaron
Steinfeld. 2011. Field Trial of Tiramisu: Crowd-Sourcing Bus Arrival Times to
Spur Co-Design. In Proceedings of the SIGCHI Conference on Human Factors in
Computing Systems (CHI ’11). Association for Computing Machinery, New York,
NY, USA, 1677–1686.
John Zimmerman, Anthony Tomasic, Isaac Simmons, Ian Hargraves, Ken
Mohnkern, Jason Cornwell, and Robert Martin McGuire. 2007. Vio: A Mixed-
Initiative Approach to Learning and Automating Procedural Update Tasks. In
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
(CHI ’07). Association for Computing Machinery, New York, NY, USA, 1445–1454.
Lamia Zouhaier, Yousra Bendaly Hlaoui, and Leila Jemni Ben Ayed. 2013. Building
adaptive accessible context-aware for user interface tailored to disable users. In
2013 IEEE 37th Annual Computer Software and Applications Conference Workshops.
IEEE, 157–162.