The Eect of Audiences on the User Experience with
Conversational Interfaces in Physical Spaces
Heloisa Candello
IBM Research
hcandello@br.ibm.com
Claudio Pinhanez
IBM Research
csantosp@br.ibm.com
Mauro Pichiliani
IBM Research
mpichi@br.ibm.com
Paulo Cavalin
IBM Research
pcavalin@br.ibm.com
Flavio Figueiredo
Fed. Univ. of Minas Gerais
flaviovdf@dcc.ufmg.br
Marisa Vasconcelos
IBM Research
marisaav@br.ibm.com
Haylla Do Carmo
IBM Research
hayllat@br.ibm.com
ABSTRACT
How does the presence of an audience influence the social interaction with a conversational system in a physical space? To answer this question, we analyzed data from an art exhibit where visitors interacted in natural language with three chatbots representing characters from a book. We performed two studies to explore the influence of audiences. In Study 1, we did fieldwork cross-analyzing the reported perception of the social interaction, the audience conditions (visitor is alone, visitor is observed by acquaintances and/or strangers), and control variables such as the visitor's familiarity with the book and gender. In Study 2, we analyzed over 5,000 conversation logs and video recordings, identifying dialogue patterns and how they correlated with the audience conditions. Some significant effects were found, suggesting that conversational systems in physical spaces should be designed based on whether other people observe the user or not.
CCS CONCEPTS
• Human-centered computing → Empirical studies in HCI; Empirical studies in interaction design.
KEYWORDS
Conversational interfaces, audience effects, chatbot design.
1 INTRODUCTION
Conversational machines are being increasingly employed in
physical spaces for both private and public usage. Examples
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
CHI 2019, May 4–9, 2019, Glasgow, Scotland, UK
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-5970-2/19/05...$15.00
https://doi.org/10.1145/3290605.3300320
include hotel lobbies and store showrooms [15], car dashboards [24], and home devices [40]. With such machines, or chatbots, human interactions may happen in the presence of an audience, be it friends, family (e.g., while on a road trip), or simply strangers and bystanders (e.g., in a hotel lobby).
Previous studies have found that humans tend to change their normal conversation behavior when in front of others. In such contexts, people sometimes resort to using long and complicated words, uttering jokes, quoting from obscure authors, and, in general, pretending to be smarter, wittier, or funnier than in private conversations [4, 35]. Also, when in the presence of others, some people may enhance the emission of dominant responses [8], according to the status of those in the audience [8, 10]. However, some people react in the opposite way, becoming more shy than normal, failing to complete sentences, getting nervous, or even stuttering.
Understanding such changes in behavior is important because it may be necessary to design machine conversation systems to handle those situations where people change their usual behavior to accommodate the presence of an audience. We were motivated to study this kind of behavior change by some initial observations we made of visitors experiencing an art exhibit where they interacted with a group of chatbots either alone or in front of acquaintances and/or strangers. For example, we observed some people trying to amuse their friends by trying to "break" the machine with impossible questions; asking questions related to local politics and sports to provoke the other visitors; and uttering deep and complicated questions to show off to others their knowledge about the subject of the artwork.
Figure 1: The physical setup of Coffee with the Santiagos.

In any of those cases, we found that the art exhibit could have been designed to better handle the presence of an audience. For example, the system could have a higher threshold for guessing the right answer to complex questions when an audience is present. It could, for instance, assume that a visitor asking a complex question in front of an audience is less likely to be looking for knowledge and more likely to be making fun of the system. In the former case the appropriate response could be trying as hard as possible to find a good answer, whereas in the group situation it could be simply deflecting the question.
Beyond art exhibits, as conversational systems become more ubiquitous, similar situations will be common in more down-to-earth scenarios. For instance, a conversational speaker (like Amazon's Echo or Google's Home) could benefit from adjusting its behavior to handle audience effects. It could be less prone to making jokes, to avoid making anyone in the audience uncomfortable or, even worse, feel ridiculed in front of acquaintances. In other words, by recognizing the audience context, the conversational system may be designed to answer in a form more appropriate for a situation of group social interaction, adapting to and enhancing the overall experience of users and their audiences.
However, such considerations and strategies only make sense if we understand whether and how users change their behavior when conversing with machines in front of other people. Do they feel more embarrassed, more powerful, or wittier in front of an audience when dialoguing with a machine instead of a person? Are the changes different if the audience is comprised of acquaintances or strangers? To shed light on such questions, we went further and performed two studies on the art exhibit and its visitors. This was a setting where single or multiple visitors freely conversed in a physical space with three text-based chatbots representing characters from a 19th-century book well known in Brazil. No control over how visitors interacted with the space or the chatbots was in place, except that they had to do it through a single tablet. Images from the exhibit are depicted in Figure 1.
To explore changes in the conversational behavior of people due to the presence of an audience, we investigated the visitors' perceptions of the three agents' social skills and the users' engagement with the agents and the artwork. In the majority of the situations the interaction happened in front of other visitors, some of them known to the users, but also often in front of strangers. In our first study, we conducted 92 semi-structured interviews with visitors, after observing their behavior at the exhibit. Analyzing this data, we were able to determine that, in some specific situations, it was very likely that the audience presence was affecting the user experience of the visitors. In a second study, we analyzed the conversation logs of more than 5,000 sessions. Coupled with a silent video of the audience interaction, which we used to manually determine the occurrence and type of audience, we were able to explore changes in conversation patterns which could be related to the presence of other people around the visitor. The two studies seem to provide evidence of audience effects, and indicate that designers should take audience effects into account in conversational systems in physical spaces. Moreover, our findings seem to indicate that those effects are modulated by many factors, including gender, knowledge about the content of the exhibit, and whether there were strangers in the audience.
The next sections describe in detail the related work, the experimental setup, the two studies, their findings, and our main conclusions. Finally, we discuss some design implications, indicating how our findings may guide the design of conversational systems in physical spaces.
2 RELATED WORK
In this section, we describe the previous work as background for our study, both in the scope of social interaction with chatbots and in the context of audience effects.
The Eect of Audiences on Conversational Interfaces CHI 2019, May 4–9, 2019, Glasgow, Scotland, UK
Social Interaction with Chatbots:
With the recent advances in conversational and natural language technologies, interest has increased in how humans interact with conversational systems, referred to here generically as chatbots, and in how social presence and context may play a key role in understanding the dynamics of the interaction [29, 37].
Social presence is described as the social connection and involvement between two or more people in an interaction, often developing and maintaining some sort of personal relationship [41]. The perception of social presence is sometimes connected to the anthropomorphism of physical robots, chatbots, and avatars. In particular, anthropomorphism is a prevailing topic of Embodied Conversational Agents (ECAs), a special case of embodied agents in which the agents provide human-like capabilities of face-to-face dialogue.
Studies with ECAs have provided evidence that they can induce social-emotional effects comparable to those in human-to-human interactions [38, 43]. Previous work found that people conversing with ECAs or interacting with robots show social reactions such as social facilitation or inhibition [3, 38, 50], a tendency toward socially desirable behavior [20, 39, 43], and increased cooperation [32]. For example, analyses of users' utterances while interacting with a museum agent [19, 20] showed semblance with human-to-human communication, with similarities in the amount of greetings and farewells, common phrases (such as "How are you?"), and human-likeness questions (e.g., "Do you have a girlfriend?"). In general, systems which exhibit human-like traits tend to improve the quality of the user experience with them. Cafaro et al. [6] found that smile, gaze, and proxemics are important for conversational museum guide agents, implying that they influenced the users' interpretation of the agents' extraversion and affiliation and impacted the users' decisions about further encounters.
Although the degree of veracity in the dialogue often improves the quality of the interaction, it might have the opposite effect: the uncanny valley effect [30, 44], where people are averse to a high degree of human similarity, has also been observed. Experiments, such as [18, 32], have validated this hypothesis by observing users' emotional engagement strategies towards agents of varying human likeness.
In this study, we contextualize our study object, the art exhibit, as containing three embodied chatbots. Even though the chatbots did not have a physical body, they had a clear physical presence provided by scenographic elements (see Figure 1): female and male hats hanging above chairs around a table unmistakably embodied the chatbots.
Audience Eects:
. Seminal work on drive-producing ef-
fects of the presence of an audience [
8
] uncovered specic
group interaction behaviors, which led to theories and de-
sign frameworks for spectatorship (e.g. [
4
,
35
]). Among the
implications and ndings of audience eects are the impact
of behavior and views of bystanders on the response to an
interaction, which has been known to inuence engagement,
either being related to attention, interest, or aective feelings
[4, 8, 35].
One of the early studies of audience eects concluded that
proximity and presence of audience enhance the emission of
dominant responses [
8
], i.e. responses governed by strong
verbal habits at the expense of responses governed by weaker
ones. Active audiences who looked and interacted with the
subjects directly aected individual performance measured
by the average number of responses in a word recognition
task. In 1982, Michaels et al
.
performed a classical study on
social facilitation showing that the performance of good pool
players improved 14% in front an audience while bad pool
players had a dramatic decrease of 30% [28].
Love and Perry [23] studied the behavior and views of bystanders in response to a proximal mobile telephone conversation held by a third party. In their experiments, subjects demonstrated noticeable changes in body posture when viewing and listening to a confederate attending a call. The influence of audiences has also been studied in video gaming, where researchers explored audience aspects including age [49], size and distance of the interactor [18], typologies of spectatorship [27], player performance and perceived game difficulty [49], co-located/remote and virtual/real audiences [11], cheering [16], supportiveness [5], activeness [21], and social aspects [9]. Overall, the findings report that different characteristics and behaviors of audiences have positive and negative impacts, sometimes affecting the entire gameplay experience.
Spectator experience design has been proposed by Reeves et al. [36], who produced a taxonomy that uncovers design strategies based on interface manipulations and their resulting outcomes. Audience participation in public spaces has also been studied from the point of view of interaction and engagement in many domains, such as education [47], sports [7], and arts [21]. One commonly observed practice which directly affects the experience is the honey-pot effect, where interaction with a screen in public can drive social clustering and further engagement [4]. Furthermore, group interaction helps to explain how users understand and react to displays in public settings.
Although previous works explored audience and spectatorship effects in games, sports, arts, and other domains, to the best of our knowledge no research efforts have been made to study the experience of audience effects in scenarios where the main interaction is conversing with chatbots in a physical space. Finally, given that our setting is an art exhibit, we use the terms visitor and user interchangeably. In our study, a person is both a visitor of the exhibit as well as a user of the physical chatbot architecture described next.
3 THE EXPERIMENTAL SETUP
In this section, we begin by describing the physical artwork
which was part of a large art exhibition of a Brazilian arts
center. Next, we discuss our experimental setup and the
research questions tackled using the data from the exhibit.
The Artwork: Coee with the Santiagos
The study reported in this paper was done in the context of
an artwork called Coee with the Santiagos by three Brazilian
artists. It recreated a dining room of the 19th century popu-
lated with physical representations of characters from one
of the most well known and acclaimed novels in Brazilian
literature: “Dom Casmurro” by Machado de Assis (the novel
was originally published on 1899).
The choice of the book was made to facilitate the visi-
tors experience by engaging them with familiar material and
characters. Also, the novel is known for not being clear about
some events of the story, specially whether the main charac-
ter, Bentinho, was betrayed by his wife Capitu with his best
friend Escobar.Bentinho is tormented with the resemblance
of his son with Escobar, but the actual betrayal is never de-
scribed in the book or admitted by Capitu. Therefore, we
expected that visitors could interact within the context of
the book from the start, and often they did so by asking the
Capitu chatbot whether she had betrayed or not her husband
(which she vehemently denied).
Figure 1 shows images from the exhibit. In the center of the space there was a large table with cups, saucers, and a teapot arranged as if the characters were having coffee. Around the table there were four chairs, three of which were "occupied" by the main characters of the book, represented by floating hats. Attached to the fourth chair there was a tablet and a headphone which allowed visitors to interact with the bots. Sounds of barking dogs, horse carriages, and singing birds were played as environmental sound.
The utterances from the characters and the visitor were seen as animated text projections on the table, as if departing from the cup in front of each character or the visitor (see right image of Figure 1), in an effect similar to White and Small's Poetic Garden [48]. The path of the text on the table allowed it to be read independently of the position of the visitor around the table. The projector also changed the color of the cup associated with the current speaker to help visitors understand from which character the utterance was coming, and varied the decorative motif of the saucers by projecting images of 19th-century watercolors on them.
Figure 2: Physical architecture of Coffee with the Santiagos.

The physical architecture of Coffee with the Santiagos is shown in Figure 2. The main interaction with the system was performed with a custom tablet application which captured the visitor's input (name, gender, and utterance/question) and sent it to a main computer. The IBM Watson Assistant API service was employed to obtain the character's response to the visitor's utterance from a set of pre-defined answers captured from dialogues of the original book. Other web API services were also called to generate the narrated audio of the questions (for visitors with special needs). Both the characters' and visitors' texts were displayed on the table by a projector and the generated audios were played on the headphone. The ambient soundtrack was reproduced on loop by a DVD player connected to sound speakers.
When visitors arrived at the exhibit they could see the characters conversing with each other and an inviting message on the tablet. If they decided to interact, they had to enter their names and gender, select the character to which they wanted to send a message, and then type the message (with the aid of auto-correction and completion). The visitor's message was projected on the table and was followed by a reply from the selected character. Replies from characters were mostly based on actual dialogue sentences from the book. If the utterance of the visitor was deemed to be beyond the scope of the book or was not recognized by the chatbot, the character tried to divert the question by asking the visitor "More coffee?".
For some randomly selected visitors, the replies would also contain a direct address to the visitor in the form of a vocative, such as "No such a thing, dear Maria." The use of direct address was intended to enhance the impression that the character was talking back to the visitor and was part of another study being conducted at the art exhibit.
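The vocative mechanism could be sketched as below; the appending rule and the selection probability are our assumptions for illustration, with the example phrase taken from the quote above.

```python
import random

def add_vocative(reply: str, visitor_name: str) -> str:
    """Append a direct address (vocative) to a character reply.
    The phrasing rule is an assumption, modeled on the example
    'No such a thing, dear Maria.' quoted in the text."""
    return f"{reply.rstrip('.!?')}, dear {visitor_name}."

def in_vocative_condition(p: float = 0.5) -> bool:
    """Randomly assign a visitor to the direct-address condition;
    the probability p is an assumption for illustration."""
    return random.random() < p
```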
Research estions
We carried out an exploratory study in the wild during the last three weeks of the artwork, in August of 2017. In this period, 5,100 people interacted with the exhibit. Those participants were videotaped (without sound) and close to 10,000 questions were logged. Out of these visitors, 92 were observed while interacting with the exhibit and asked to participate in a semi-structured interview (described later).
We performed two studies to understand the effect of audiences on the visitors' experience, and in particular on how it affects the visitors' utterances to the chatbots and their perceptions of the social interaction with them. As the interaction happens in a physical space, the users and the audience are always co-located. Formally, we pose our two research questions as:
RQ1: What are the effects of audiences on the users' perceptions of social interaction with chatbots?
RQ2: Does the presence of audiences influence the type and content of users' questions directed to chatbots?
We conducted the observations and semi-structured interviews to answer RQ1 in Study 1, and we analyzed the collected conversation logs and video recordings to answer RQ2 in Study 2. In both studies we classified the interaction sessions into four non-mutually exclusive audience conditions based on the videotape and notes from the observation studies. The four audience conditions are:
(A) No audience: the visitor was either alone or no one was observing her/his experience with the artwork.
(B) Observed by acquaintances: the visitor was accompanied by friends or acquaintances who sometimes also shared the tablet.
(C) Observed by strangers in the queue: the visitor, with acquaintances or not, had strangers observing her experience from the artwork queue.
(D) Observed by strangers standing around the table: the visitor, with acquaintances or not, had strangers observing her experience from around the table.
The (C) and (D) conditions were determined by considering the proximity of the audience to the users of the tablet. Condition (C) was annotated by researchers (authors) who were co-located with the exhibit while observing visitors. Condition (D) was analyzed from the video recordings, as was done in [23]. Unfortunately, condition (C) was not determinable from the video since the queue was out of the field of view of the camera, and therefore only the first study examines this condition.
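Because the four conditions are not mutually exclusive, each session can be annotated with independent flags, with condition (A) holding exactly when none of the others do. The encoding below is our illustration of that scheme, not the authors' actual annotation tooling.

```python
from dataclasses import dataclass

# Illustrative encoding of the four non-mutually exclusive audience
# conditions as per-session boolean flags.

@dataclass
class SessionAudience:
    acquaintances: bool    # condition (B)
    strangers_queue: bool  # condition (C), annotated live by researchers
    strangers_table: bool  # condition (D), annotated from the video

    @property
    def no_audience(self) -> bool:
        """Condition (A): no one observed the visitor."""
        return not (self.acquaintances or self.strangers_queue
                    or self.strangers_table)
```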
4 STUDY 1: AUDIENCE EFFECTS ON SOCIAL INTERACTION
The objective of the first study (addressing RQ1) is to understand the visitors' experience and examine whether or not the audience affects the perception of their social interaction with the art exhibit chatbots. For this study, we focus on the 92 users who had their interaction observed and completed semi-structured interviews.
Table 1: Self-metric answers of participants in Study 1.

                                                Dis.  Neu.  Agr.
Q5: I felt part of the conversation              18     9    65
Q6: The characters only talked to each other     64     5    23
Q7: The characters talked to me                  16     5    71
Q8: The characters answered my questions         30     3    59
Q9: The characters answer about any subject      36    23    33
Q10: I asked everything I wanted                 18     4    70
Procedure
During the last three weeks of the exhibition period, some of the authors of the paper randomly selected visitors from the art exhibit to observe. As soon as such visitors had finished their interaction with the artwork, they were invited to participate in an interview. In total, 92 participants agreed to participate in the study and signed a consent form before starting the interview. The semi-structured interview was designed to be short and took between 8 and 10 minutes to be completed. The questions in the interview were:
Q1. Are you familiar with the story of the book?
Q2. Please tell us how you would describe your experience
with this exhibit for a friend who will not be able to visit it.
Q3. How do you think the exhibit works?
Q4. Did you hear about this exhibit before visiting it today?
Participants also answered self-metric statements (Q5 to Q10 in Table 1) by choosing one option among five: totally disagree, partially disagree, neutral, partially agree, and totally agree. The perception of social interaction was measured using those self-reported metric questions, which explore how participants had a sense of belonging to the talk vs. a sense of being an outsider to the talk (Q5 to Q8); the perception of the chatbots' scope of knowledge (Q9); and the sense of satisfaction (Q10).
At the end of the interview, participants had the chance to share any other thoughts they wished. Additionally, participants answered demographic questions. Data collection was stopped when findings were becoming repetitive, achieving data saturation [45]. Interviews were audio-recorded, transcribed, and analyzed using a mixed-methods software package. The qualitative data gathered was coded into categories. The coding scheme emerged from the data by applying a grounded-theory-inspired approach [45]. A thematic network was applied as an analytical tool to better understand the themes that emerged from the conversation logs [1]. Additionally, we also collected the conversation logs of the 92 participants from the system log and integrated this data into the analysis, so we could better understand their experience with the chatbots.
Table 2: Demographics of the participants in Study 1.

Participants                        92
Age group         16-26             46
                  27-37             19
                  38-48             16
                  49-59              7
                  60-72              4
Gender            Female            50
                  Male              42
Familiarity       Familiar          56
with book plot    More or Less      20
                  Not Familiar      16
Table 3: Number and percentage of participants according to the audience condition in Study 1. Notice that a user observed by acquaintances may also be observed by strangers standing in the queue (B∩C) or around the table (B∩D).

condition     A     B     C     D   B∩C   B∩D   all
# of users    4    65    51    73    34    55    92
% of users   4%   71%   55%   79%   37%   60%  100%
Demographics of Participants
Our participants' demographics are shown in Table 2. 65 of our participants were aged between 16 and 37 years old (71%), with gender being roughly balanced. 76 of the participants had familiarity with the story (83%). The duration of the participants' interaction with the artwork was between 5 and 8 minutes. In Table 3 we present the audience types. Notice that 65 participants (71% of the total) were observed by acquaintances (B). Participants who were observed by strangers were in one or two conditions: either they were being watched from a queue of people waiting to use the tablet (C) or they were watched by other visitors from around the table (D). For the purposes of this analysis, we preferred to keep those conditions separated due to the different levels of audience effects that may arise in each condition [23].
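The percentages in Table 3 follow directly from the counts over the 92 interviewed participants; a quick check (the condition labels are ours):

```python
# Recomputing the Table 3 percentages from the reported counts over
# the 92 participants. The conditions overlap (a session can satisfy
# several at once), so the percentages do not sum to 100%.

counts = {"A": 4, "B": 65, "C": 51, "D": 73, "B&C": 34, "B&D": 55}
TOTAL = 92

pct = {cond: round(100 * n / TOTAL) for cond, n in counts.items()}
# pct == {"A": 4, "B": 71, "C": 55, "D": 79, "B&C": 37, "B&D": 60}
```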
In this sample only four participants were interacting alone and did not have any audience (A). For this reason, we focused the analysis in this study more on the (B), (C), and (D) conditions, and in particular on understanding whether being with acquaintances or not makes a difference in the social interaction with the chatbots. We also considered gender, age, and familiarity with the plot as independent variables.
Thematic Network Analysis
Initially, we describe here the trends and themes that emerged from the user experience of being in the context of the artwork. The thematic network approach was applied to organize the themes and understand the overall experience. The sources of this analysis were the open-ended questions (Q1 to Q4) from the 92 semi-structured interviews, notes from the observation studies and video recordings, and the rationale reported by the participants when they answered the self-reported questions (Q5 to Q10).
The semi-structured interviews and observation studies unveiled a set of emerging themes which helped to create a picture of the visitors' experience with the artwork. In total, 125 basic codes emerged from the analysis, which were grouped and organized into four main organizing themes, identified as important factors affecting the visitors' social interaction with the chatbots. The organizing themes are: curiosity and novelty; interest in the plot; expected chatbot answers; and audience effects.
Curiosity and novelty:
It seems that the scenographic elements created an atmosphere which attracted participants to interact (as in [31]). The decorative hats representing each chatbot as if seated at the table, and the 19th-century dining room complete with wallpaper, tapestries, and surrounding sound, created an atmosphere which invited interaction. When reporting their experience to the researcher (Q2) it was evident that several scenographic elements provoked their curiosity. (P27) "The soundtrack was flawless. [...] And I also noticed the animation of the words on the table and the lights when someone responds, and it lights up and highlights colors inside the cups, I found this to be very cool!". Participants also pointed to the utterances projected on the table and tried to figure out what the best question to ask was when they were accompanied by acquaintances. Others reported that the scenography and the empty space in the chairs reminded them of ghosts, similar to a table-turning séance. (P14) "I thought the visual was very interesting, it called my attention. You see those hats, it seems like you're talking to the beyond. It was intriguing... very curious."
Expected chatbot answers:
Most of the users seem to have had a mental model of how the chatbots were supposed to answer them. Some of the participants wanted only to test the technology behind the exhibit and were not focused on the content of the plot. (P4) "For me, I was more curious to see what happened then... I did not develop several questions, I was not too involved with the intention of having a conversation.". (P40) expressed his intention when he asked the chatbots about a topic which he presumed they would not be able to answer: "I asked when we will have peace in the world. I guess I was not too fair on him, I was very picky. I confess I did a very tricky question. I wanted to challenge the software.". However, for others the technology was invisible. (P42) "I enjoyed! [laughs], I didn't try to imagine how it was working." In those cases, participants reported as if they were immersed in a parasocial interaction [14], as highlighted by the language they used to describe their experience and the questions they asked. They also applied human social interaction rules [25, 34], using personal pronouns and asking questions related to the human nature of the chatbots. As was evident in the words of (P1): "Her (Capitu) response was genuine. The answers for the other two (Bentinho and Escobar) were neutral. As hers was more elaborated I felt that she had thought to answer that. It was not just a game". Participants also consulted the chatbots as oracles, (P35) "Do you believe in free love?", and reported their emotional states to the bots, (P23) "I'm scared. Are you someone who scares people?"
Interest in the plot and characters:
Most of the participants seem to have had only the betrayal question in mind and reported they were satisfied with the number of questions they asked. Participants who knew the story usually interacted for the same amount of time as the ones that did not know the plot, usually asking two to three questions. As mentioned before, some chatbot utterances projected on the table contained the name of the participant (e.g., "Dear Paulo"). In those situations, participants were engaged with the dialogue, such as (P43): "I wrote my name and the exhibit interacts with me, so to the point that as if I was there, because she (Capitu) speaks my name, she calls me lady and such. And it was very interesting because you can really ask any questions, I thought you had pre-selected questions. Do you know what I mean?". Participants also tested the chatbots to validate their answers when they knew the plot and had several opinions about which character should be blamed for the betrayal. Others had only one question they wanted answered: whether the betrayal had happened or not.
Audience effects:
We identified that people waiting in the
queue influenced the participants’ interaction. Participants
reported they would have asked more questions if there
had not been a queue waiting for the interaction. As described by
(P23): “[...] If I had known that I was not disturbing I would
have asked more questions, I had a lot of those thoughts: oh,
my God there’s someone else waiting to see. I already had
two interactions, so it was already ... it was already good”.
Being observed by acquaintances (B) tended
to enhance certain behaviors of appreciation, and visitors
appropriated the exhibit to communicate their feelings: (P29)
wrote to his sweetheart, “I love you, Mrs. Tatiana”. Finally, we
also saw participants asking questions not related to the plot
when in the presence of strangers (C and D), such as (P53):
“Why do you think people are so cold?”; and (P40): “When are
we going to have peace in the world?”.
Statistical Analysis of the Self-Reported Questions
We performed a statistical analysis based on the self-reported
questions (Q5 to Q10). To complement the qualitative findings,
the answers to these questions were collapsed from the original
five categories into three: D = totally disagree, partially disagree;
N = neutral; and A = agree, totally agree. Results are shown
in Table 1. Those questions served as response variables,
capturing the users’ perception of social interaction with
the chatbots. They were cross-sectioned with relation to age,
gender, familiarity with the plot, and the different audience
conditions (A to D) using contingency tables. To read
the table: each column represents the answers to Q5-10 (dis-
agree, neutral, and agree), or responses; the rows capture
different participant conditions (e.g., gender or familiarity);
and the cells count the intersections between rows and
columns (e.g., females who answered agree on Q6).

¹This “fundamental” question is never answered in the book Dom Casmurro.
Our statistical analysis is based on Fisher exact tests.
The null hypothesis of the test is that the responses are in-
dependent of the row conditions in the contingency table.
Rejecting this hypothesis serves as evidence that the different
audience conditions (A, B, C, D) or demographic variables
affected the participants’ reported experience of the social
interaction with the bots.
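As a concrete sketch of this procedure, the two-sided Fisher exact test for a 2×2 contingency table can be computed directly from the hypergeometric distribution. The counts below are illustrative only, loosely derived from the Q8 figures reported later for condition (B); they are not the study’s actual cross-tabulations.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be an audience condition (e.g., in B / not in B) and
    columns a collapsed response (e.g., agree / not agree on Q8).
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column margins held fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme
    # (i.e., no more probable) than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical cross-tabulation: 45 of 65 condition-B participants
# agreed with Q8, versus 21 of 27 participants outside condition B.
p_value = fisher_exact_2x2(45, 20, 21, 6)
```

`scipy.stats.fisher_exact` returns the same two-sided p-value for 2×2 tables; tables with more than two response levels (such as disagree/neutral/agree) require the Freeman–Halton generalization of the test.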
From our 92 participants, only 4 were in condition (A).
Because of this, our study mostly focuses on the other condi-
tions (B to D). Similarly, when isolating some of the variables
(e.g., females), the number of participants naturally reduced.
Nevertheless, for all of the cases where we found some sta-
tistical significance (p < 0.05), we had at least n = 32 partici-
pants. Before continuing, we point out that we did not find
any statistical significance in our sample when analyzing the
effects of audience by age, nor on (Q10): most participants
(77%) reported they had asked all the questions they wanted
and were therefore satisfied.
Observed by acquaintances (B):
In this condition partici-
pants were observed by acquaintances, sometimes sharing
the interaction experience with them (n = 65). Here, 59% (45
out of 65, p < 0.05) of participants felt that the characters
answered their questions (Q8). Among participants not in this
condition, that is, those who experienced the exhibit alone
or observed only by strangers, this percentage was higher
(78% of n = 27). This may indicate that participants in
condition (B) were less able to pay attention to the chatbot
answers or that they were less focused on the conversation
with them, perhaps distracted by the acquaintances. How-
ever, the same participants in condition (B) who had previous
knowledge of the plot (51 out of 65) significantly disagreed
(78.4%, p < 0.05) that the characters only talked to each other
(Q6), suggesting a higher degree of engagement with the
artwork when sharing the experience with acquaintances,
for users who better understood the context. Gender did not
have any effect in this category.
Observed by strangers in the queue (C):
In this condition
participants were observed by strangers in the exhibit space,
usually next to or behind them in a queue (n = 51), with or
without acquaintances. This group of users mostly indicated
(45 out of 51, 88.2%, p < 0.01) that the characters talked di-
rectly to them (Q7). This is an interesting effect: those partici-
pants, maybe inconvenienced by people waiting in the queue,
may have tried to focus more on the interaction with the chat-
bots. Participants in condition (C) who were also familiar
with the plot (n = 44) indicated that the chatbots talked to
them (Q7) (41 out of 44, 93.2%, p < 0.05). These same partic-
ipants agreed that they felt part of the conversation (Q5) (34
out of 44, 77.3%, p < 0.05), and that the characters answered
their questions (Q8) (35 out of 44, 79.5%, p < 0.05). When
cross-sectioning with gender variables, female participants
(n = 32) more often felt that the characters talked to them (Q7)
(25 out of 32, 87.5%, p < 0.05), though we did not find any
significant result for males.
Observed by strangers standing around the table (D):
In this condition participants were observed by strangers
while typing their questions on the tablet (n = 73), some also
observed by acquaintances. No statistical significance was
found within the distributions of the self-reported answers of
(D), or in comparison with participants not in this condition.
However, we found significant findings when we considered
some of the independent variables (gender and knowledge
of the plot). Of the users in this condition who also had
knowledge of the plot (n = 60), 40% (24) of them disagreed
that the characters answered about any subject (Q9) (p <
0.05). We looked in the conversation logs and found that most
of the questions asked here were out-of-scope questions, but
featuring a curiosity about the chatbots’ opinion of the story.
For instance, (P46) “Capitu, what do you think of Bentinho’s
brain sickness?” poses a question for which the book does
not have an answer. (P19) goes deeper into the plot: “Who is
the father of your son?”. (P24) humanizes the chatbots: “What
is your favorite color?”.
Similar results were observed when we examined (Q9) con-
sidering gender. Of the females in this condition of having
an audience of strangers around the table (n = 41), 43.9% (16)
disagreed that the characters answered about any subject
(p < 0.05). Many of the female questions relied on curiosity
about the relationships among characters. In the same ques-
tion (Q9), only 21.9% (7 out of n = 32) of the male participants
felt the same as females (p < 0.05). From the conversation
logs, we found that male questions were often in the first
person, giving opinions or showing off, such as making love
declarations or trying to be humorous: (P11) “Am I hand-
some?”; (P41) “Yes, I think.”; or (P59) “I only have coffee with
cigarettes, do you?”. Therefore, we found evidence that male
participants disagreed less than females that the chatbots
answered about any subject, but the type of out-of-scope
questions posed by each gender was not the same. This
suggests that the social interaction may have differed between
male and female participants in the condition of strangers
standing around the table.
5 STUDY 2: AUDIENCE EFFECTS ON USER
INTERACTION
In this second study, we tackle our second research question,
RQ2: Does the presence of audiences influence the type and
content of users’ questions directed to chatbots?, by exploring
a larger dataset consisting of approximately 5,000 utterances
logged during three weeks at the exhibition, combined with
information extracted from the silent video recordings of
the interactions. In this analysis, we employed both manual
coding and machine learning tools to analyze the dataset
using a semi-supervised approach.
Coding the Audience Conditions
Given that our ultimate goal is to understand audience ef-
fects, we filtered our initial dataset to consider only the user
interactions where we could reliably determine the audi-
ence conditions (A, B, and D) from the video. Unfortunately,
we could not determine condition (C), when users were ob-
served by strangers in a queue, because the queue was not
visible in the video recordings. In the previous study, this was
possible because researchers were observing the participants
in the field.
To code the different audience conditions (A, B, and D), in-
dependent human coders analyzed the video recordings from
the exhibit. A sample of 54 hours of video (four weekdays and
two weekends) was analyzed. This sample was chosen since
it captured different types of public (e.g., weekday visitors
and weekend visitors). Based on those recordings, sessions
were determined to be in one or more of the three settings
A, B, and D. Then three different human coders manually
observed and annotated features of the movement across the
room, communication cues, proximity, presence of a com-
panion, and closeness among the visitors while a participant
interacted with the tablet. In the end, we considered 633
sessions, which were subdivided into 1,542 same-user inter-
actions (each session sometimes included several consecutive
users). Of those user interactions, 240 were coded as (A),
no audience; 1,200 were coded as (B), observed by acquain-
tances; and 102 were coded as (D), observed by strangers
standing around the table. Considering only those interac-
tions, our resulting dataset contained approximately 5,000
user utterances. Since the majority of utterances were ques-
tions to the chatbots, from now on we use the term questions
instead of utterances. Each question was then coded in a
semi-supervised manner, as discussed next.
Clustering User Questions into Topics
After coding the audience condition from the videos, we
proceeded to cluster the topics of the questions from the
users. Given the rich and varied ways users asked ques-
tions, instead of manually coding each sentence individually,
we initially used a semi-supervised methodology to clus-
ter the full set of 5,000 user utterances into 32 clusters of
semantically similar sentences. In our approach, we first
employed a clustering algorithm which found an initial set
of meaningful topics. Next, a manual open coding of the
clusters was performed by two coders to validate the au-
tomatic clustering. We compared different methods such
as K-Means [22], Spectral Clustering [26], and Hierarchical
Dirichlet Process [46]. We proceeded by considering only the
approach which showed the best results according to the
two coders who inspected the results, which was Spectral
Clustering.
Clustering Methodology:
Given the set of user questions,
our clustering methodology is based on converting this set
into an affinity space S. That is, each utterance qj, where
1 ≤ j ≤ N, is represented as an N-dimensional vector, where
position i holds the similarity of question qj to question qi.
This process resulted in an N×N square matrix, denoted
S, where all the values on the diagonal are equal to 1. For
computing the similarity, we employed Spacy [42], a popu-
lar natural language processing tool which determines the
similarity between two sentences from the Euclidean distance
between the average word vectors [33] of each sentence. Word
vectors refers here to the known technique of finding seman-
tic similarity across words by embedding each word into a
low K-dimensional space (usually K = 100). The average
word vector of a sentence can thus be used as a proxy for
the semantics of that sentence. This technique is known to
have limitations for long texts. However, the sentences em-
ployed by users to interact with the chatbots had on average 4
words (with a standard deviation of 2 words). For such short
texts, average word vectors are known to perform well in
machine learning tasks, and particularly in clustering [17].
In our analysis, we employed GloVe Portuguese vectors [33].
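A minimal sketch of this construction is shown below, with a toy two-dimensional vocabulary standing in for the GloVe Portuguese vectors, and an inverse-distance transform standing in for the paper’s distance-to-similarity mapping, which is not spelled out in the text.

```python
import math

# Toy word vectors; the study used pre-trained GloVe Portuguese embeddings.
VEC = {
    "who": [0.1, 0.9], "is": [0.2, 0.8], "capitu": [0.9, 0.1],
    "bentinho": [0.85, 0.15], "hello": [0.5, 0.5],
}

def avg_vector(sentence):
    """Average word vector as a proxy for sentence semantics."""
    vs = [VEC[w] for w in sentence.lower().split() if w in VEC]
    return [sum(col) / len(vs) for col in zip(*vs)]

def similarity(u, v):
    # One plausible way to turn a Euclidean distance into a
    # similarity in (0, 1]; identical vectors map to exactly 1.
    return 1.0 / (1.0 + math.dist(u, v))

def affinity_matrix(questions):
    """N x N affinity matrix S with S[i][i] == 1."""
    vecs = [avg_vector(q) for q in questions]
    return [[similarity(vi, vj) for vj in vecs] for vi in vecs]

qs = ["who is capitu", "who is bentinho", "hello"]
S = affinity_matrix(qs)
```

The two character questions share most of their average vector and so receive a higher mutual affinity than either does with the greeting, which is the property the subsequent clustering relies on.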
After computing the affinity space S, clustering was con-
ducted in two steps. First, we filtered out rows in S
where all values were equal to zero, except the value 1 on the
diagonal. That is, all the questions which had null similarity
to the other questions were separated and put into a special,
single cluster denoted "Anything Else". Then, the remaining
rows in S were clustered with the Spectral Clustering algo-
rithm [26]. This process resulted in 32 clusters, which were
then manually corrected by two human coders.
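The filtering step can be sketched as follows, assuming the affinity matrix S is a list of lists (the helper name is ours):

```python
def split_anything_else(S):
    """Split row indices into clusterable questions and the
    'Anything Else' bucket: rows whose only nonzero entry is
    the 1 on the diagonal (null similarity to everything else)."""
    keep, anything_else = [], []
    for i, row in enumerate(S):
        if any(v != 0 for j, v in enumerate(row) if j != i):
            keep.append(i)
        else:
            anything_else.append(i)
    return keep, anything_else

S = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],  # similar to nothing else -> "Anything Else"
]
split = split_anything_else(S)
```

Only the `keep` rows would then be passed to the spectral clustering step.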
From Clusters of Questions to Topic Clusters:
To as-
sociate each cluster with a meaningful topic, the same two
human coders named each of the 32 clusters separately. Then
the coders met, discussed what they had found as mean-
ingful identifications for the clusters, and did some validation
steps (e.g., manually merging or splitting clusters which pre-
sented similar or different content, respectively).
After agreeing on the clustering, the coders worked together
on the filtered dataset of 1,542 interactions to cluster it
further. In the end, four main topic clusters were identified:
(S1) questions out of the scope of the original book; (S2) questions
about characters of the book; (S3) greetings, such as Hello and
Goodbye; and (S4) reactions to failure, which correspond to
the user reaction when the chatbots deflected questions they
did not know how to answer with “More coffee?” (for example,
a user replied that indeed she wanted some coffee). After the
coding was performed, we measured the Cohen Kappa agree-
ment [12] between the coders. Overall we found a
score of 0.78 (strong agreement), with p < 0.001.
The number of questions per topic cluster was: (S1) 271
out-of-scope questions; (S2) 978 questions about characters
of the book; (S3) 101 greetings; and (S4) 165 reactions to the
dialogue failure (that is, to the “More coffee?” utterance). The
coders also found 27 utterances where the text was gibberish
or isolated numbers and discarded them. We also note that
896 of the questions were from females while 646 were
from males. Unlike the field study, we did not gather an age
variable from the tablet. Understanding age effects is thus
left as future work.
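Cohen’s kappa corrects raw agreement for the agreement expected by chance given each coder’s label distribution. A stdlib sketch (the codings below are hypothetical, not the study’s data):

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders labeling the same items."""
    n = len(labels_a)
    # Observed agreement: fraction of items both coders labeled alike.
    po = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from the coders' marginal label frequencies.
    ca, cb = Counter(labels_a), Counter(labels_b)
    pe = sum(ca[k] * cb[k] for k in ca) / (n * n)
    return (po - pe) / (1 - pe)

# Hypothetical codings of six utterances into topic clusters S1-S4.
coder1 = ["S2", "S2", "S1", "S3", "S4", "S2"]
coder2 = ["S2", "S2", "S1", "S3", "S2", "S2"]
kappa = cohen_kappa(coder1, coder2)
```

A kappa of 0.78, as reported above, is conventionally read as strong agreement.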
Statistical Analysis
After the clustering into the four main topic clusters was per-
formed, we proceeded to determine whether there were
audience effects on the user interactions. To do so, we em-
ployed random effects logistic models [13] where the response
variable was the type of topic cluster (out of scope, about
characters, greetings, or reactions to failure). The explana-
tory variables were: direct address, gender, and audience
condition (A, B, and D). Recall that a direct address is de-
fined as a message from the chatbot to the user in the form
of a vocative, containing the user’s name, which was used
in some utterances from the chatbots. We fitted the logis-
tic regression models via a Bayesian MCMC sampler using
Bambi [2]. Every variable was coded as categorical. Given
the imbalance among topic clusters S1-S4, in every model we
added an intercept to capture the settings where an effect
is simply due to the number of interactions in the cluster.
Notice that by predicting the topic cluster of the utterance,
we unveil the factors that may have led users to choose a
topic of interaction.
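The paper fits these models with a Bayesian sampler via Bambi; purely as a self-contained illustration of the same model structure, the sketch below fits an ordinary maximum-likelihood logistic regression by batch gradient ascent on simulated data whose effect signs roughly mirror those reported in Table 4 for the reaction-to-failure cluster. All data, coefficients, and names here are hypothetical.

```python
import math
import random

def fit_logistic(X, y, lr=0.5, epochs=1500):
    """Batch gradient-ascent logistic regression (no regularization).
    Each row of X already includes a leading 1 for the intercept."""
    w = [0.0] * len(X[0])
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(xi):
                grad[j] += (yi - p) * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

# Simulated interactions: outcome 1 = utterance is a reaction to failure.
random.seed(0)
X, y = [], []
for _ in range(400):
    aud_b = random.random() < 0.5   # observed by acquaintances (B)?
    direct = random.random() < 0.5  # chatbot used a direct address?
    # Low base rate, raised by both predictors (signs as in Table 4).
    logit = -3.0 + 1.7 * aud_b + 1.9 * direct
    p = 1.0 / (1.0 + math.exp(-logit))
    X.append([1.0, float(aud_b), float(direct)])
    y.append(1.0 if random.random() < p else 0.0)

w_intercept, w_aud_b, w_direct = fit_logistic(X, y)
```

The fitted coefficients recover the simulated structure: a negative intercept (the rare-outcome imbalance) and positive effects for both audience (B) and direct address. The Bayesian fit differs in that it yields a posterior, and hence HPD intervals, rather than point estimates.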
In Table 4 we show the results of the model for each topic
cluster. The table displays only the statistically significant ex-
planatory variables. To determine those variables, we looked
at the highest posterior density (HPD) interval with a signifi-
cance level of 95%². Explanatory variables whose HPD con-
tained the value 0 were deemed insignificant; that is, their
effect cannot be determined to be either positive or negative.
The table shows the average value of each variable (µ) as
well as the lower (95%) and upper (95%) HPD bounds. One way to
²The HPD is the Bayesian analogue of the Confidence Interval.
CHI 2019, May 4–9, 2019, Glasgow, Scotland, UK H. Candello et al.
Table 4: Average value (µ) and the lower (95%) and upper
HPD (95%) values for significant explanatory variables in
each topic cluster.

                                 µ      95%     95%
(S1) Out of Scope
  Intercept                    -2.11   -3.04   -1.30
  Male Gender & Aud. D         -2.06   -3.98   -0.03
(S2) About Characters
  Intercept                     1.66    0.80    2.46
(S3) Greetings
  Intercept                    -3.32   -4.56   -2.01
(S4) Reaction to Failure
  Intercept                    -4.48   -6.26   -2.78
  Audience B                    1.74    0.18    3.60
  Direct Address                1.88    0.38    3.61
  Direct Address & Aud. B      -1.74   -3.35   -0.16
read the values from the table is to consider that the effect
varies from the lower to the upper HPD, with µ being the
expected value. Positive values indicate that the predictor
tends to increase the chance of the category; negative ones
show the opposite effect.
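This significance rule can be stated in two lines (the bounds are copied from Table 4; the helper name is ours):

```python
def hpd_significant(lower, upper):
    """An effect is deemed significant when its 95% HPD interval
    excludes zero, i.e., the sign of the effect is determined."""
    return lower > 0 or upper < 0

# Lower/upper HPD bounds for three significant rows of Table 4.
table4_rows = {
    "S1: Male Gender & Aud. D": (-3.98, -0.03),
    "S4: Audience B": (0.18, 3.60),
    "S4: Direct Address & Aud. B": (-3.35, -0.16),
}
```

An interval such as (-0.5, 0.5), which straddles zero, would be excluded from the table under this rule.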
From Table 4, we can initially see that S2 (about characters)
and S3 (greetings) have the intercept as the only statistically
significant explanatory variable. This indicates that no variable,
including the audience conditions, predicts the topic cluster
of those user utterances; we would expect a lower (for greet-
ings) or higher (for about characters) number of interactions
in each cluster simply due to the imbalances in interactions.
However, in the S1 topic cluster, corresponding to out-of-
scope questions, we can see that male users observed by
strangers around the table, condition (D), asked fewer out-of-
scope questions (the effect is negative). This may be related
to the dominance effect [8].
Next, for S4, reaction to failure, we can initially see that
most sentences are not of this topic (negative intercept),
which is expected due to the imbalances. However, here we
clearly saw an audience effect on the user interaction, since
there is a significant increase in reactions to failure when the
user is being observed by acquaintances (B). This perhaps
can be explained by the honey-pot effect [4]. Also, we can
observe that a direct address increases the chance of a reaction
to failure. This suggests that chatbot designers may choose
to add a direct address in situations where the chatbot does
not know the answer, as an attempt to further engage the
user.
However, and more interestingly, the effect changes direc-
tion when both conditions are true: when users are directly
addressed by failing chatbots in the presence of acquain-
tances, they tend to react less to failure. We hypothesize
that a direct address to a single audience member in such
situations may constrain the shared experience. However,
since there was a limited number of questions in this spe-
cific scenario, we believe these are interesting issues to be
explored by future work.
6 DISCUSSION
In both Study 1 and Study 2, we found evidence of audience
effects on how people interact with chatbots in public spaces,
addressing both of our research questions, RQ1: What are the
effects of co-located audiences on the users’ perceptions of social
interaction with chatbots? and RQ2: Does the presence of audiences
influence the type and content of users’ questions directed to
chatbots?
In Study 1, we saw that different audience conditions pro-
duced some significant effects on the users’ self-reported
experience as measured by our questionnaire. When users
were observed by acquaintances (B), they felt less that the
characters answered their questions (Q8) than users in the
other conditions, sometimes engaging in direct conversation
with the people they knew through the artwork. Conversely,
users observed by strangers waiting in the queue (C)
reported more often that the characters were talking to them
(Q7) than users in the other conditions. A possible explana-
tion is that those users may have tried to focus more on the
conversation with the chatbots because they felt pressured
by people waiting behind them.
We also saw in Study 1 that users who had previous knowl-
edge of the story depicted in the artwork (as measured
by Q1) seem to be more strongly affected by audience con-
ditions. Such users, when observed by acquaintances (B),
significantly disagreed that characters talked among them-
selves (Q6), showing an increased sense of belonging in the
talk. Similarly, users familiar with the plot, when observed
by strangers in the queue (C), were significantly more likely
to perceive the chatbots talking to them (Q7), to feel part of
the conversation (Q5), and to believe that the characters an-
swered their questions (Q8); and they tended to disagree that
the characters were talking to each other (Q6). When users
knew the context and were observed by strangers standing
around the table (D), there was a significant decrease in the
perception that the characters could talk about any subject
(Q9).
In some way, previous content knowledge seems to boost
the audience effects on the perception of social interactions,
both when users were surrounded by a familiar audience and
when observed by strangers. We also saw that the audience
effects on such users were more pronounced when considered
in conjunction with gender and the kind of questions males
and females posed to validate the bots’ scope of knowledge.
In Study 1 we also saw evidence that audience effects
differ according to the gender of the user, particularly
considering the presence of strangers. Female users observed
by strangers in the queue (C) felt more strongly that the
characters talked to them (Q7), while male users did not
report that. When observed by strangers around the table (D),
female users perceived that the chatbots did not talk about
other subjects (Q9) to a greater extent, and significantly more
than male users. Our evidence seems to point towards male
users ‘behaving better’ when not observed by acquaintances.
In contrast, the qualitative analysis of Study 1 shows that
when males ask out-of-scope questions, their sentences more
frequently intensify showing-off behavior. Similar results
exist in the literature [8].
Study 2 uncovered other types of audience effects, com-
plementary to the ones detected in Study 1. In particular,
user reactions to chatbot failures (the topic cluster reaction
to failure) were more common when users were observed by
acquaintances (B). We also saw that male participants had
a higher tendency to ask in-scope questions when they
were in an audience of strangers around the table (D).
Study 2 also showed enhanced audience effects when there
was a direct address in the chatbot response. In general, we
found that direct address increased the likelihood of the user
reacting to a chatbot failure. However, a shared experience
with acquaintances combined with direct address tended to
reduce the reaction to failure.
In summary, the two studies gathered enough evidence
to support positive answers to both our research questions,
RQ1 and RQ2, and point towards the need to consider au-
dience effects when designing conversational experiences
in physical spaces. As shown in the findings, those audience
effects can be modulated by many factors, including whether
the audience is composed of acquaintances or strangers;
the existence of a waiting queue; knowledge of the context
of the interaction; the gender of the users; and the use of
direct address.
7 RECOMMENDATIONS TO DESIGNERS
In this paper, we discussed the effects of co-located audi-
ences on chatbot interactions. The findings, discussed in the
previous section, suggest several design recommendations:
DR1:
Designers should consider the user’s previous knowl-
edge of the content, as it tends to affect the social interaction
with machines, in particular when users have audiences.
DR2:
Designers should consider that the presence of strangers
in a queue waiting to interact with a physical conversational
system may affect how users experience the system.
DR3:
Designers should consider gender effects when craft-
ing public interactions with conversational systems, includ-
ing how to handle answers to out-of-scope questions.
DR4:
Designers should consider tailoring the use of direct
address in chatbot utterances according to the presence of
an audience, respecting the users’ expectation of being
acknowledged by the bot. In general, chatbots should use
direct address, such as vocatives or pronouns, to acknowledge
either all the participants in the audience or none of them.
8 FUTURE WORK
The results and findings of our studies not only answered
some basic questions about audience effects in conversa-
tional systems but also uncovered new and exciting issues.
In particular, we believe it is necessary to investigate some of
the gender effects which arose in both of our studies, and also
whether other socio-demographic factors impact public
human experiences with chatbots. Also, investigating how
our findings transfer to other, non-physical settings, such
as online chatbots, and to more task-oriented contexts, such
as retail or hospitality, seems to be a promising line of future
work.
Moreover, researchers might also want to use and improve
our mixed-methods approach to understand the implications
of social interaction with chatbots and to analyze the conver-
sation spectrum. In Study 1, the use of thematic networks
helped unveil the main themes that emerged from the user
experience and, in conjunction with statistical analysis, iden-
tify the visitors’ perception of social interaction with chatbots.
In Study 2, the clustering approach with human coders added
value and precision in classifying the user question topics,
and the statistical log analysis determined the audience effects
on visitors’ interactions. We hope other designers and re-
searchers also find our methodological approach useful to
apply in similar projects.
9 LIMITATIONS OF THE WORK
We performed this study in Brazil, and thus our findings
may not transfer to other cultures. However, it is essential to
take into account that this initial study looked into variables
which exist in any culture, such as gender and social issues
in shared experiences. Another limitation is that our study
was conducted in an art exhibition context, and therefore the
findings may be different in task-oriented shared-experience
scenarios such as hospitality and retail.
ACKNOWLEDGEMENTS
We thank the participants from our fieldwork study for their
support in providing the information required to perform our
study. We also thank the anonymous reviewers and David R.
Millen for the insightful comments and reviews that led to
this paper. Flavio Figueiredo is sponsored by personal grants
from Brazil’s National Council for Scientific and Technologi-
cal Development (CNPq).
REFERENCES
[1]
Jennifer Attride-Stirling. 2001. Thematic networks: an ana-
lytic tool for qualitative research. Qualitative Research 1, 3
(2001), 385–405. https://doi.org/10.1177/146879410100100307
arXiv:http://qrj.sagepub.com/content/1/3/385.full.pdf+html
[2]
Bambi 2019. BAyesian Model-Building Interface (BAMBI) in Python.
Retrieved Jan 04, 2019 from https://github.com/bambinos/bambi
[3]
Jim Blascovich. 2002. The Social Life of Avatars. Springer-Verlag,
Berlin, Heidelberg, Chapter Social Inuence Within Immersive Virtual
Environments, 127–145. http://dl.acm.org/citation.cfm?id=505799.
505807
[4]
Harry Brignull and Yvonne Rogers. 2003. Enticing People to Interact
with Large Public Displays in Public Spaces. In INTERACT.
[5]
Jennifer L. Butler and Roy F. Baumeister. 1998. The trouble with
friendly faces: skilled performance with a supportive audience. Journal
of personality and social psychology 75 5 (1998), 1213–30.
[6]
Angelo Cafaro, Hannes Högni Vilhjálmsson, and Timothy Bickmore.
2016. First Impressions in Human–Agent Virtual Encounters. ACM
Trans. Comput.-Hum. Interact. 23, 4, Article 24 (Aug. 2016), 40 pages.
https://doi.org/10.1145/2940325
[7]
Alan D. Chatham and Florian Mueller. 2013. Adding an interactive
display to a public basketball hoop can motivate players and foster
community. In UbiComp.
[8]
N. B. Cottrell, D. L. Wack, G. J. Sekerak, and R. H Rittle. 1968. Social
facilitation of dominant responses by the presence of an audience
and the mere presence of others. Journal of Personality and Socia
Psychology 9, 3 (1968), 245–250.
[9]
Travis Cox, Marcus Carter, and Eduardo Velloso. 2016. Public DisPLAY:
Social Games on Interactive Public Screens. In Proceedings of the 28th
Australian Conference on Computer-Human Interaction (OzCHI ’16).
ACM, New York, NY, USA, 371–380. https://doi.org/10.1145/3010915.
3010917
[10]
Naomi Ellemers, Cathy Dyck, Steve Hinkle, and Annelieke Jacobs.
2000. Intergroup Dierentiation in Social Context: Identity Needs
versus Audience Constraints. Social Psychology Quarterly 63 (04 2000),
60–74. https://doi.org/10.2307/2695881
[11]
Katharina Emmerich and Maic Masuch. 2018. Watch Me Play: Does
Social Facilitation Apply to Digital Games?. In Proceedings of the 2018
CHI Conference on Human Factors in Computing Systems (CHI ’18).
ACM, New York, NY, USA, Article 100, 12 pages. https://doi.org/10.
1145/3173574.3173674
[12]
Joseph L Fleiss, Bruce Levin, and Myunghee Cho Paik. 2003. Statistical
methods for rates and proportions; 3rd ed. Wiley, Hoboken, NJ. https:
//cds.cern.ch/record/1254063
[13]
Andrew Gelman and Jennifer Hill. 2006. Data Analy-
sis Using Regression and Multilevel/Hierarchical Models (1
ed.). Cambridge University Press. http://www.amazon.
com/Analysis-Regression-Multilevel-Hierarchical-Models/dp/
052168689X/ref=sr_1_1?s=books&ie=UTF8&qid=1313405184&sr=
1-1
[14]
David C. Giles. 2002. Parasocial Interaction: A Review of the Literature
and a Model for Future Research. Media Psychology 4, 3 (aug 2002),
279–305. https://doi.org/10.1207/s1532785xmep0403_04
[15]
Shang Guo, Jonathan Lenchner, Jonathan Connell, Mishal Dholakia,
and Hidemasa Muta. 2017. Conversational Bootstrapping and Other
Tricks of a Concierge Robot. In Proceedings of the 2017 ACM/IEEE
International Conference on Human-Robot Interaction (HRI ’17). ACM,
New York, NY, USA, 73–81. https://doi.org/10.1145/2909824.3020232
[16]
Dennis L. Kappen, Pejman Mirza-Babaei, Jens Johannsmeier, Daniel
Buckstein, James Robb, and Lennart E. Nacke. 2014. Engaged by Boos
and Cheers: The Eect of Co-located Game Audiences on Social Player
Experience. In Proceedings of the First ACM SIGCHI Annual Symposium
on Computer-human Interaction in Play (CHI PLAY ’14). ACM, New
York, NY, USA, 151–160. https://doi.org/10.1145/2658537.2658687
[17]
Tom Kenter and Maarten de Rijke. 2015. Short Text Similarity with
Word Embeddings. In Proceedings of the 24th ACM International on
Conference on Information and Knowledge Management (CIKM ’15).
ACM, New York, NY, USA, 1411–1420. https://doi.org/10.1145/2806416.
2806475
[18]
E. S. Knowles. 1983. Social physics and the eects of others: Tests
of the eects of audience size and distance on social judgments and
behavior. J Pers Soc Psychol 45, 6 (1983), 1263–1279.
[19]
Stefan Kopp, Lars Gesellensetter, Nicole C. Krämer, and Ipke
Wachsmuth. 2005. A Conversational Agent as Museum Guide – Design
and Evaluation of a Real-World Application. In Intelligent Virtual
Agents, Themis Panayiotopoulos, Jonathan Gratch, Ruth Aylett, Daniel
Ballin, Patrick Olivier, and Thomas Rist (Eds.). Springer Berlin Heidel-
berg, Berlin, Heidelberg, 329–343.
[20]
Nicole Krämer, Gary Bente, and Jens Piesk. 2003. The ghost in the
machine. The inuence of Embodied Conversational Agents on user
expectations and user behaviour in a TV/VCR application1. IMC
Workshop 2003, Assistance, Mobility, Applications (01 2003).
[21]
Celine Latulipe, Erin A. Carroll, and Danielle Lottridge. 2011. Love,
Hate, Arousal and Engagement: Exploring Audience Responses to
Performing Arts. In Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems (CHI ’11). ACM, New York, NY, USA,
1845–1854. https://doi.org/10.1145/1978942.1979210
[22]
S. Lloyd. 1982. Least squares quantization in PCM. IEEE Transactions
on Information Theory 28, 2 (March 1982), 129–137. https://doi.org/10.
1109/TIT.1982.1056489
[23]
Steve Love and Mark Perry. 2004. Dealing with Mobile Conversations
in Public Places: Some Implications for the Design of Socially Intrusive
Technologies. In CHI ’04 Extended Abstracts on Human Factors in Com-
puting Systems (CHI EA ’04). ACM, New York, NY, USA, 1195–1198.
https://doi.org/10.1145/985921.986022
[24]
Josh Lowensohn. 2015. Elon Musk: cars you can drive will eventually
be outlawed. https://www.theverge.com/transportation/2015/3/17/
8232187/elon-musk-human-drivers-are-dangerous
[25]
Ewa Luger and Abigail Sellen. 2016. "Like Having a Really Bad PA":
The Gulf Between User Expectation and Experience of Conversational
Agents. In Proceedings of the 2016 CHI Conference on Human Factors in
Computing Systems (CHI ’16). ACM, New York, NY, USA, 5286–5297.
https://doi.org/10.1145/2858036.2858288
[26]
Ulrike Luxburg. 2007. A Tutorial on Spectral Clustering. Statistics
and Computing 17, 4 (Dec. 2007), 395–416. https://doi.org/10.1007/
s11222-007-9033-z
[27]
Bernhard Maurer, Ilhan Aslan, Martin Wuchse, Katja Neureiter, and
Manfred Tscheligi. 2015. Gaze-Based Onlooker Integration: Exploring
the In-Between of Active Player and Passive Spectator in Co-Located
Gaming. In Proceedings of the 2015 Annual Symposium on Computer-
Human Interaction in Play (CHI PLAY ’15). ACM, New York, NY, USA,
163–173. https://doi.org/10.1145/2793107.2793126
[28]
J.W. Michaels, J.M. Blommel, R.M. Brocato, R.A. Linkous, and J.S. Rowe.
1982. Social facilitation and inhibition in a natural setting. Replications
in social psychology 2 (1982), 21–24.
[29]
Robert J. Moore, Raphael Arar, Guang-Jie Ren, and Margaret H. Szy-
manski. 2017. Conversational UX Design. In Proceedings of the
2017 CHI Conference Extended Abstracts on Human Factors in Com-
puting Systems (CHI EA ’17). ACM, New York, NY, USA, 492–497.
https://doi.org/10.1145/3027063.3027077
[30]
Masahiro Mori, Karl F. MacDorman, and Norri Kageki. 2012. The
Uncanny Valley [From the Field]. IEEE Robot. Automat. Mag. 19 (2012),
98–100.
The Eect of Audiences on Conversational Interfaces CHI 2019, May 4–9, 2019, Glasgow, Scotland, UK
[31]
Heather L. O’Brien and Elaine G. Toms. 2008. What is User Engagement?
A Conceptual Framework for Defining User Engagement with
Technology. J. Am. Soc. Inf. Sci. Technol. 59, 6 (April 2008), 938–955.
https://doi.org/10.1002/asi.v59:6
[32]
S. Parise, S. Kiesler, L. Sproull, and K. Waters. 1999. Cooperating with
life-like interface agents. Computers in Human Behavior 15, 2 (1999),
123–142.
[33]
Jerey Pennington, Richard Socher, and Christopher D. Manning. 2014.
Glove: Global vectors for word representation. In In EMNLP.
[34]
Martin Porcheron, Joel E. Fischer, Moira McGregor, Barry Brown,
Ewa Luger, Heloisa Candello, and Kenton O’Hara. 2017. Talking with
Conversational Agents in Collaborative Action. In Companion of the
2017 ACM Conference on Computer Supported Cooperative Work and
Social Computing (CSCW ’17 Companion). ACM, New York, NY, USA,
431–436. https://doi.org/10.1145/3022198.3022666
[35]
Stuart Reeves. 2011. Designing Interfaces in Public Settings: Understanding
the Role of the Spectator in Human-Computer Interaction (1st ed.).
Springer Publishing Company, Incorporated.
[36]
Stuart Reeves, Steve Benford, Claire O’Malley, and Mike Fraser. 2005.
Designing the spectator experience. In CHI.
[37]
Stuart Reeves, Martin Porcheron, Joel E. Fischer, Heloisa Candello,
Donald McMillan, Moira McGregor, Robert J. Moore, Rein Sikveland,
Alex S. Taylor, Julia Velkovska, and Moustafa Zouinar. 2018. Voice-
based Conversational UX Studies and Design. In Extended Abstracts
of the 2018 CHI Conference on Human Factors in Computing Systems
(CHI EA ’18). ACM, New York, NY, USA, Article W38, 8 pages. https:
//doi.org/10.1145/3170427.3170619
[38]
Raoul Rickenberg and Byron Reeves. 2000. The Effects of Animated
Characters on Anxiety, Task Performance, and Evaluations of User
Interfaces. In Proceedings of the SIGCHI Conference on Human Factors
in Computing Systems (CHI ’00). ACM, New York, NY, USA, 49–56.
https://doi.org/10.1145/332040.332406
[39]
Paul W. Schermerhorn, Matthias Scheutz, and Charles R. Crowell. 2008.
Robot social presence and gender: Do females view robots differently
than males? 2008 3rd ACM/IEEE International Conference on Human-
Robot Interaction (HRI) (2008), 263–270.
[40]
Alex Sciuto, Arnita Saini, Jodi Forlizzi, and Jason I. Hong. 2018. "Hey
Alexa, What’s Up?": A Mixed-Methods Studies of In-Home Conversational
Agent Usage. In Proceedings of the 2018 Designing Interactive
Systems Conference (DIS ’18). ACM, New York, NY, USA, 857–868.
https://doi.org/10.1145/3196709.3196772
[41]
John Short, Ederyn Williams, and Bruce Christie. 1976. The Social
Psychology of Telecommunications. John Wiley and Sons Ltd.
[42] spaCy 2019. spaCy. Retrieved Jan 04, 2019 from http://spacy.io
[43]
Lee Sproull, Mani Subramani, Sara Kiesler, Janet H. Walker, and Keith
Waters. 1996. When the Interface is a Face. Hum.-Comput. Interact. 11,
2 (June 1996), 97–124. https://doi.org/10.1207/s15327051hci1102_1
[44]
Megan Strait, Lara Vujovic, Victoria Floerke, Matthias Scheutz, and
Heather L. Urry. 2015. Too Much Humanness for Human-Robot Interaction:
Exposure to Highly Humanlike Robots Elicits Aversive Responding
in Observers. In CHI.
[45]
Anselm Strauss and Juliet Corbin. 1994. Grounded theory methodology.
In Handbook of qualitative research. 273–285.
[46]
Yee Whye Teh, Michael I. Jordan, Matthew J. Beal, and David M. Blei.
2004. Sharing Clusters among Related Groups: Hierarchical Dirichlet
Processes. In NIPS.
[47]
Jane Webster and Hayes Ho. 1997. Audience Engagement in Multimedia
Presentations. SIGMIS Database 28, 2 (April 1997), 63–77.
https://doi.org/10.1145/264701.264706
[48]
Tom White and David Small. 1998. An Interactive Poetic Garden. In
CHI 98 Conference Summary on Human Factors in Computing Systems
(CHI ’98). ACM, New York, NY, USA, 335–336. https://doi.org/10.1145/
286498.286804
[49]
Laura K. Wolf, Narges Bazargani, Emma J. Kilford, Iroise Dumontheil,
and S. J. Blakemore. 2015. The audience effect in adolescence depends
on who’s looking over your shoulder. Journal of Adolescence (2015).
[50]
Sarah Woods, Kerstin Dautenhahn, and Christina Kaouri. 2005. Is
someone watching me? - consideration of social facilitation effects in
human-robot interaction experiments. In CIRA. IEEE, 53–60.
Article
In greeting encounters, first impressions of personality and attitude are quickly formed and might determine important relational decisions, such as the likelihood and frequency of subsequent encounters. An anthropomorphic user interface is not immune to these judgments, specifically when exhibiting social interaction skills in public spaces. A favorable impression may help engaging users in interaction and attaining acceptance for long-term interactions. We present three studies implementing a model of first impressions for initiating user interactions with an anthropomorphic museum guide agent with socio-relational skills. We focus on nonverbal behavior exhibiting personality and interpersonal attitude. In two laboratory studies, we demonstrate that impressions of an agent's personality are quickly formed based on proximity, whereas interpersonal attitude is conveyed through smile and gaze. We also found that interpersonal attitude has greater impact than personality on the user's decision to spend time with the agent. These findings are then applied to a museum guide agent exhibited at the Boston Museum of Science. In this field study, we show that employing our model increases the number of visitors engaging in interaction.