Discovering Social Interaction Strategies for Robots
from Restricted-Perception Wizard-of-Oz Studies
Pedro Sequeira, Patrícia Alves-Oliveira, Tiago Ribeiro, Eugenio Di Tullio, Sofia Petisca,
Francisco S. Melo, Ginevra Castellano and Ana Paiva
INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, Av. Prof. Cavaco Silva, 2744-016 Porto Salvo, Portugal
INESC-ID and Instituto Universitário de Lisboa (ISCTE-IUL), CIS-IUL, Lisboa, Portugal
Department of Information Technology, Uppsala University, Sweden
pedro.sequeira@gaips.inesc-id.pt
Abstract—In this paper we propose a methodology for the cre-
ation of social interaction strategies for human-robot interaction
based on restricted-perception Wizard-of-Oz studies (WoZ). This
novel experimental technique involves restricting the wizard’s
perceptions over the environment and the behaviors it controls
according to the robot’s inherent perceptual and acting limita-
tions. Within our methodology, the robot’s design lifecycle is
divided into three consecutive phases, namely data collection,
where we perform interaction studies to extract expert knowledge
and interaction data; strategy extraction, where a hybrid strategy
controller for the robot is learned based on the gathered data;
strategy refinement, where the controller is iteratively evaluated
and adjusted. We developed a fully-autonomous robotic tutor
based on the proposed approach in the context of a collaborative
learning scenario. The results of the evaluation study show that,
by performing restricted-perception WoZ studies, our robots are
able to engage in very natural and socially-aware interactions.
I. INTRODUCTION
Recent studies foresee a major involvement of robots in
our daily-life activities in a very near future [1]. Although
considerable advances have been made in the creation of be-
haviors for social robots in the field of human-robot interaction
(HRI), e.g., [2], we still need to further investigate how to
develop and sustain effective social interactions. Striving for
a successful design and implementation of social interaction
strategies involves the creation of more socially evocative,
situated and intelligent robots. Towards that goal, in this paper
we advance a principled manner to discover interaction strate-
gies for social robots from restricted-perception Wizard-of-Oz
(WoZ) studies. We formally define an interaction strategy as
a mapping between some robot’s perceptual state and some
interaction behavior it can perform in a given task of interest.
We argue that in order to extract meaningful information
that can be used to design the autonomous controller of the
robot, its perceptual and behavioral limitations have to be
taken into account during the interaction studies. As such,
we propose that the human expert, referred to as the wizard,
should be restricted from perceiving everything occurring
within the task. The idea is to provide the wizard solely the
information regarding the task that the robot can have access
to as supported by its sensing capabilities, and then let him
dynamically develop some interaction strategy. As a result, we
minimize the correspondence problem faced when learning
from human demonstrations that occurs due to the inherent
perceptual and behavioral differences between the robot and
the human expert [3]. By performing restricted-perception
WoZ studies we therefore even out the kind and amount of
information and the interaction behaviors available to both the
wizard and the robot—a central tenet of our approach.
A. Methodology Overview
Discovering social interaction strategies from restricted-
perception WoZ studies influences the whole design lifecycle
of the robot. Therefore, we divide our methodology in three
phases to ultimately build a robot that can act autonomously
within its social environment. The process is outlined in Fig. 1.
In the Data Collection phase, researchers gather data from
different types of interaction studies with the objective of
gaining insight on common human interaction strategies in the
desired task. Namely, mock-up studies are performed in which
possible end-users of the system interact with a human expert
performing the task in place of the robot. In turn, this data is
used to build a set of task-related artificial intelligence (AI)
modules—henceforth referred to as the Task AI. These mod-
ules are the building blocks of the robot’s interaction strategy,
i.e., modeling all the necessary perceptions and basic behaviors
for it to perform the task itself and interact with humans.
The idea is to alleviate the wizard’s decision-making when
performing restricted-perception WoZ studies, so he can focus
on the relevant aspects of the robot’s social interaction. After
performing the WoZ studies under the restricted-perception
setting we collect all the expert knowledge and interaction
data resulting thereof in order to build the robot’s controller.
In the Strategy Extraction phase we build a hybrid interac-
tion strategy controller for the robot. The controller includes
a rule-based module encoding task information, well-known
strategies and behavior patterns observed from the interaction
studies in the form of strategy rules. This module includes
hand-coded rules denoting common practices employed by the
wizard during the studies. In addition, the controller considers
strategies discovered using machine learning (ML) techniques,
allowing the identification of more complex situations. This
occurs within the controller’s ML-based module, where ML
algorithms learn to identify which and when to trigger an inter-
action behavior during the task. In that matter, collecting data
by performing restricted-perception WoZ studies is vital to
[Fig. 1 block diagram: Data Collection (Mock-Up Studies, Restricted-Perception WoZ Studies) feeds the Task AI, Expert Knowledge and Interaction Data into Strategy Extraction (Hybrid Controller with Rule-based and ML-based Modules); the resulting Interaction Strategy is tested in Strategy Refinement (Refinement & Evaluation Studies), whose Refinements feed back into the controller.]
Fig. 1: The different phases of the methodology for discovering interaction strategies from restricted-perception WoZ studies.
ensure that the robot has access to all the relevant information
to interact with humans in an appropriate manner—including
knowing when not to interact.
Finally, in the Strategy Refinement phase we conduct
evaluation studies to assess the performance of the robot
being autonomously controlled while interacting with others
within the given task. This phase allows HRI researchers to
iteratively refine the robot’s behaviors for situations that may
not have been properly learned or for which one could not
gather enough relevant information in the previous phases.
B. Related Work
To the best of our knowledge, no previous study has addressed a methodology to design and create interaction strategies for a robot that takes into account a restricted-perception WoZ technique. Research among the HRI community devoted
to the design of robot interaction strategies has revealed a wide
range of directions [4]. Many of the methods proposed resort
to the WoZ methodology in order to simulate future behaviors
of the robot and foresee how the implemented mechanisms
will perform in a desired environment and task. In a recent
review of WoZ studies, Riek [5] surveys different attempts by the research community to adapt the WoZ method to better simulate the target situation or environment for learning with the wizard. In addition, Knox et al. [6] proposed a model for learning
interaction behaviors from human users by teaching a robot
to properly behave during social interactions. Namely, the
subjects are led to believe they are teaching an autonomous
robot, when in fact the latter is being controlled by a human
expert. The objective is that the autonomous robot learns from
patterns generated by the wizard’s decisions. Furthermore,
Steinfeld et al. [7] make sensible differentiations between
several types of WoZ studies, highlighting the importance of a
rigorous distinction between human modeling and placeholder
simulations. Specifically, they argue that distinct models can
serve different research purposes and be part of several stages
of the robot developmental cycle.
C. Case-study
To illustrate the usefulness and applicability of the pro-
posed methodology, we built a fully-autonomous humanoid
robotic tutor capable of interacting with young learners in a
collaborative, empathic and social manner. Specifically, we
designed a scenario in the context of the EU FP7 EMOTE
project (http://www.emote-project.eu/) that aims to develop
novel artificial embodied tutors capable of engaging in em-
pathic interactions with students in a shared physical space
[8]. The scenario involves the interaction between a tutor
and two students playing MCEC—a multiplayer, collaborative
video game [9], which is a modified version of the serious
game EnerCities [10] that promotes strategies for building
sustainable cities. As illustrated in Fig. 2, the students and
the tutor make successive plays in the game via a touch
table.¹ Depicted are also the different entities playing the
role of the tutor throughout the several studies involved in
our methodology. Within the MCEC scenario, the students
correspond to prospective end-users of our system, while we
rely on the participation of human experts in pedagogy and
psychology to either play in the place of the robotic tutor or
remotely control its behavior. The realization of the proposed
methodology involves dealing with several practical problems,
ranging from the preparation of the multiple studies to the
implementation of a fully functional hybrid controller from
the collected data. As such, throughout the paper we use the
MCEC scenario as a case-study to analyze the challenges faced
when using the methodology in real-world settings. This paper is organized according to the phases of our methodology, as outlined in Fig. 1.
II. COLLECTING DATA FROM INTERACTION STUDIES
This section details the first phase of the methodology,
Data Collection, as depicted in Fig. 1. The main purpose
is to prepare and perform restricted-perception WoZ studies
by acquiring useful knowledge about appropriate interaction
strategies in the task being considered. The idea is to let humans that are experts on the given task perform several interaction sessions with prospective end-users of the system.
A. Building the Task AI from Mock-up Studies
Generally speaking, mock-up models are used to conduct
studies before hardware and software development with the
aim of abstracting the environment and the end-users in a set
of possible minimalist scenarios [11]. In our methodology we
use mock-up studies to prepare the WoZ studies and influence
and inspire the development and implementation of all the
system components controlling the robot’s interaction strategy,
as illustrated in Fig. 1. To achieve that, we conduct several
interaction sessions in which possible end-users of the system
interact with a human expert performing the desired task in
the place of the robot.
¹ In the case of the robotic tutor, its game-actions are performed “internally”
through direct communication with MCEC’s game engine.
[Fig. 2 panels: (a) Mock-up study: students with a classroom teacher as tutor; (b) Restricted-perception WoZ study: students with a robotic tutor controlled by a human expert (wizard); (c) Refinement & Evaluation studies: students with a fully autonomous robotic tutor.]
Fig. 2: The HRI scenario used as a case-study for our methodology. Depending on the phase, the role of tutor can be played
either by: (a) a classroom teacher; (b) a humanoid robot controlled by a human expert (wizard); (c) a fully autonomous robot.
After acquiring expert knowledge from the mock-up studies,
we are able to devise what the robot’s perceptions and actions
will be in the task. Given the challenges related with pro-
cessing all the robot’s input data and its low-level control, we
simplify many of the interaction aspects that are solely related
with the task itself, both in terms of perception and behavior.
For example, if the goal of the robot is to recognize faces
and address specific users during the interaction, one can
use specialized computer vision (CV) algorithms to perform
such task and use solely its output to control the robot’s
decisions. Likewise, if the task involves having the robot point
to someone or say something specific, one can create macro
operators that encode the necessary low-level behaviors. We
can also consider specialized planning and decision-making
algorithms that address specific problems, e.g., when the task
involves playing a game. We refer to the set of all such
perceptual and behavioral mechanisms as the Task AI. As
depicted in Fig. 1, these AI modules foster the realization of
restricted-perception WoZ studies by simplifying both what
the expert perceives and the decisions he has available. In
terms of perception, the Task AI yields a set of state features
that are accessible to the robot and are informative enough to
summarize the important aspects involved in the interaction. In
terms of behaviors, they allow the management of the robot’s
interactions with the humans in the desired task.
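To make this interface more concrete, the sketch below shows one possible shape for the Task AI's outputs: a compact set of state features plus a repertoire of high-level behavior macros. All names are hypothetical illustrations in the spirit of the MCEC scenario; the paper does not list the exact features or behaviors it used.

```python
# Minimal sketch (hypothetical names): the kind of high-level state features and
# behavior macros a Task AI layer could expose to both the wizard and the robot.
from dataclasses import dataclass


@dataclass
class PerceptualState:
    """Symbolic summary of what the robot can actually sense in the task."""
    active_speaker: str        # e.g. "student_1", "student_2" or "none"
    keyword_detected: bool     # task-related keyword picked up by the audio module
    student_smiling: bool      # output of the facial-expression classifier
    game_level_changed: bool   # critical game event reported by the game AI
    last_play_quality: str     # "good", "neutral" or "poor", from the game AI


# High-level behavior macros wrapping the robot's low-level animation and speech control.
BEHAVIORS = [
    "greet_students",
    "explain_rules",
    "suggest_alternative_move",
    "prompt_discussion",
    "gaze_at_active_speaker",
    "do_nothing",
]
```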
B. Performing Restricted-Perception WoZ Studies
Once the Task AI has been implemented based on the mock-
up studies, we can discover appropriate interaction strategies
for the robot by resorting to the proposed restricted-perception
WoZ technique, which is an extension of the standard WoZ
experimental framework [12] commonly used in HRI—from
now on referred to as the unrestricted WoZ. As illustrated
by Fig. 1, this is the pivotal phase in our methodology as
it allows the generation of interaction data that can later be
used to encode behavior rules and apply ML techniques to
automatically extract appropriate interaction strategies.
Given the impracticability of manually designing all be-
haviors for every predictable situation that the robot might
face beforehand, one of the most effective ways of devising
robot behaviors is to learn relevant interaction strategies given
expert demonstrations. Despite its success in helping identify
and address many challenges of HRI, the standard WoZ
technique bears some complications in later stages of the robot
design when we want to extract useful behaviors from the
wizard’s interactions. Specifically, many perceptual and acting
limitations of the robot are often disregarded by giving the
wizard complete access to observations over the interaction,
thus making it difficult for the robot to correctly interpret the
environment and properly act on its own [5]. For instance,
without a very accurate speech recognition and interpretation mechanism, which is still far from being commonly available, an autonomous robot simply cannot make sense of
what is being said during the interaction, and therefore its
responses will mostly be far from expected.
To address such problems we perform restricted-perception
WoZ studies by limiting what the wizard can observe from
the task’s environment. In that respect, the Task AI provides
amongst other things informative knowledge about the state of
the task in the form of perceptual features and a high-level be-
havioral repertoire. Within our methodology, this corresponds
to all the information and behaviors available to the human
expert during the WoZ studies so that he may dynamically
choose an appropriate interaction strategy. Consequently, this
will be all the perceptual and behavioral data available to
later build the robot’s interaction strategy controller. As we
will show further ahead, this allows that complex behavioral
patterns exhibited by the experts during the interaction may
be discovered using ML algorithms. In order to prepare a
WoZ study in the restricted-perception setting, possible wizards
should undergo a training phase to get accustomed to the
robot’s perceptual and behavioral capabilities in the context
of the desired task. The experts’ feedback may also be used
to iteratively refine the user interface prior to the studies.
C. Data Collection in the MCEC Scenario
In the context of EMOTE, we performed a mock-up study
to draw the requirements and specifications for the robot’s
intended behavior within the MCEC task. Specifically, we
conducted experimental sessions in a high-school classroom
involving the interaction between an actual school teacher
and several students playing a game of MCEC, as illustrated
in Fig. 2(a). A total of 31 students aged between 13 and 15 participated in this study and were randomly distributed across two study conditions: 1) one teacher and two students played a game of MCEC (5 sessions); 2) three students played the game without the presence of the teacher (7 sessions). The
purpose was to observe interaction differences to enable the
development of a robotic tutor able to act both as a companion
and as a tutor. We also interviewed the human experts to
understand their reasoning process and gather information
about interaction dynamics and common strategies used. This
allows us to gain insight on specific interaction strategies that
could be triggered by the wizards during the WoZ studies.
Depending on the specific context of the interaction task,
different perceptions and behaviors may be encoded by the
Task AI modules to be used during the WoZ studies. Within
the MCEC scenario, we derived a set of state features relating
the students’ expressive information, auditory features and
more task-oriented information from the data gathered during
the mock-up study. We used off-the-shelf CV software to
detect both students’ emotional facial expressions and gaze
information from a video camera, allowing the robot to react
in an empathic fashion. We also acquired data from different
microphones in the environment and used specialized software
to detect a restricted set of task-related keywords and iden-
tify the active speaking student. This allowed the restricted-perception wizard to infer possible discussions between the students regarding the task or, e.g., to be informed about intervention opportunities when no speaker is detected. In addition, and
because the robotic tutor is also an active player in MCEC,
we implemented a dedicated game-AI module capable of
autonomously performing actions in the game. It also adjusts
its game strategy according to the game’s status and adopts a
group strategy based on the students’ actions [9]. In terms of
features, the game-AI module adds information about critical
moments of the game, such as when a level changes.
Together with this perceptual information, the Task AI
included the implementation of the robot’s social behaviors,
i.e., all the animations, gaze functions and speech, that were
designed from the mock-up studies. In that regard, the ELAN
tool² was first used to annotate gestures and gaze of both the
students and the teacher from recorded video and audio data
[13]. We then distilled a set of social interaction behaviors for
the robot to perform within the task that emerged during all the
human-human interactions. Such social behaviors were coded
and analyzed in terms of dialog dimensions, each providing a
different interaction purpose [14]. For example, we organized
a set of pedagogical behaviors inspired by observed teacher-student interactions, such as prompting the students for more
information, promoting discussion on some task-related topic,
or managing the flow of the task. In addition, the non-verbal
behavior of the robot was also inspired by how the real teacher
and students interacted, e.g., by shifting the robot’s gaze
between the game and the players in order to drive their focus
of attention towards relevant aspects of the task.
We then performed a restricted-perception WoZ study to
acquire knowledge regarding possible pedagogical interaction
strategies for an autonomous robotic tutor. Particularly, we
trained one human expert in the MCEC task and collected data
about the specific communicative behaviors chosen and the
context in which they were triggered, i.e., to discover what the
² http://tla.mpi.nl/tools/tla-tools/elan/
robot should say and when to say it. A total of 56 students aged
between 14 and 16 participated in the study conducted in a school
classroom context. As illustrated in Fig. 2(b), each session
consisted of two students playing MCEC with the robotic tutor
being remotely-operated. For the purpose of data gathering, we
recorded all state features available to the wizard at specific
intervals and whenever some behavior was triggered.
III. LEARNING STRATEGIES FROM THE WIZARD
In this section we present design principles and methods
to build an interaction strategy controller for a robot based
on the collected data regarding the human expert interaction
strategies, corresponding to the Strategy Extraction phase of
our methodology, outlined in Fig. 1. The objective of this
phase is to try to “infer” the decision process used by the
human experts during the restricted-perception WoZ studies
and distill an interaction strategy controller from them.
A. The Correspondence Problem
Within the control theory literature, the aforementioned
procedure can be seen as that of learning a policy from
demonstrations, i.e., discovering a mapping between states
and actions given a set of demonstrative behaviors performed
by an expert in some task of interest [3]. Within robotics,
one of the most difficult challenges to overcome is the cor-
respondence one—inherently, the robot whose controller we
are trying to learn will not have the same acting and sensing
capabilities as those of the human experts, hence a direct
mapping is simply not possible [3]. Usually, such problem
is addressed by creating sensory-motor maps—linking each
observed state and action to the imitator’s embodiment—to
allow task transference between different bodies [15]. One of
the principles proposed by our methodology to mitigate the
correspondence problem is, as described in Sec. II, to restrict
both what the wizard can perceive from the task’s environment
and the type of interactions that he is able to control during
the WoZ. This significantly reduces the complexity in finding
a correspondence between the task’s “state” as observed by the
human and the information available to the robot. Still, there is
a great amount of implicit knowledge used by humans that is
very hard to observe directly from raw digital sensor data. For
example, in the task of MCEC, a wizard knows the rules of the
game and its pedagogical objective, and that he is interacting
with students that may be playing the game for the first time,
thus inferring their doubts and initial challenges. Because it
is unfeasible to model all the information that a wizard might
use during its decision-making process, the features available
to the robot may not be sufficient to act optimally regarding
the expert behavior [3].
B. Hybrid Interaction Strategy Controller
Taking into account the aforementioned problems, our
methodology proposes a hybrid solution for the control of
the robot’s interaction strategies. Specifically, this involves
the creation of a controller where a data-driven ML-based
module and an event-driven Rule-based module compete for
Fig. 3: The Rule-based module of the strategy interaction
controller. Offline data, shown in dotted lines, is collected
from mock-up and WoZ studies and the Task AI to encode
the interaction strategy rules. At run-time, the module checks
perceptual information and can consult the AI and trigger
specific rules to activate the robot’s interaction behaviors.
the guidance of the robot’s interaction behaviors. Each module
is designed according to distinct principles and based on
different data gathered during the Data Collection phase.
Naturally, we could try to learn interaction strategies by
using only ML algorithms directly from the collected data.
However, it is simply unfeasible to learn a controller from
all raw input data available from the robot’s sensors. Some
recent solutions within reinforcement learning [16], based on deep learning techniques, do propose to discover efficient representations from high-dimensional sensory data to allow the learning of complex tasks. However, they rely on some form of reward signal to guide the agent's decision process in well-defined tasks. Within HRI, robots are permanently interacting with human subjects, each with their own decision process and each intervening in the task in a distinct manner. Notably, the interaction task may not be well-defined and, as such, no reward function can be designed that guides the robot's performance in a desirable manner. As such,
the purpose behind the hybrid controller is to have a flexible
mechanism intervening at specific times depending on the task
and context, and a robust system by identifying and learning
from more complex situations for which there is no clear
interaction behavior to be performed.
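As a rough illustration of how the two modules might share control, the sketch below gives one plausible arbitration scheme in which the event-driven rule module takes precedence and the ML-based module is consulted otherwise. The paper does not prescribe this exact priority, so this is an assumption rather than the authors' implementation.

```python
# Minimal sketch, under assumptions: rule-based decisions take precedence; the
# ML-based module is only consulted when no rule fires, and the safe default is
# not to intervene at all.
def select_behavior(state, rule_module, ml_module):
    """Return the interaction behavior to trigger for the current perceptual state."""
    rule_behavior = rule_module.match(state)   # fires only in well-known contexts
    if rule_behavior is not None:
        return rule_behavior
    ml_behavior = ml_module.predict(state)     # learned from the WoZ demonstrations
    return ml_behavior or "do_nothing"         # default: do not intervene
```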
C. Rule-based Module
This module is responsible for modeling well-known strate-
gies in the form of behavior rules, i.e., “If <perceptual state> Then <interaction behavior>” rules that are automatically activated at specific times during the interaction. As illustrated
in Fig. 3, different types of rules may be encoded. Namely,
we use observation analysis performed in the mock-up studies
and information gathered from the interviews to derive ex-
plicit expert knowledge, i.e., manually designed rules denoting
consistent practices employed by the human experts during
the interactions. In addition, due to the restricted-perception
condition, the experts’ decisions are more likely to be based
on high-level, cognitive knowledge about the task and inter-
action scenario, than based on low-level, sensory information.
As such, we can take advantage of task-related information
provided by the Task AI to encode domain knowledge rules,
e.g., trigger some behavior whenever some task milestone
is reached, or activate some attention-calling function if the
system detects that the subjects are distracted from the task.
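A minimal sketch of how such If-Then rules could be encoded is shown below. The conditions and behavior names are hypothetical examples in the spirit of the MCEC scenario, not the rules actually used in EMOTE.

```python
# Minimal sketch (hypothetical rules): If-<perceptual state>-Then-<interaction behavior>
# pairs combining expert knowledge from the mock-up studies with Task AI events.
RULES = [
    # (condition over the perceptual state, behavior to trigger)
    (lambda s: s["game_started"] and not s["tutorial_given"],     "explain_rules"),
    (lambda s: s["game_level_changed"],                           "comment_level_change"),
    (lambda s: s["last_play_quality"] == "poor",                  "suggest_alternative_move"),
    (lambda s: s["no_speaker_detected"] and s["idle_time"] > 30,  "prompt_discussion"),
]


def match(state):
    """Return the behavior of the first rule whose condition holds, or None (no intervention)."""
    for condition, behavior in RULES:
        if condition(state):
            return behavior
    return None
```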
D. ML-based Module
This module is responsible for automatically discovering
complex situations that may have arisen during the interaction
sessions and for which it is hard to explicitly create behavior
rules. As discussed in Sec. III-A, this involves learning an
interaction strategy given a set of demonstrations provided
by the human experts. Given the complex nature of HRI, it
would be an exhausting endeavor to try to find a world model
given all the collected data—the multitude of possible task
states is usually sufficiently large that a massive amount of
demonstrations would be required to learn a suitable policy.
On the other hand, within our methodology we are already
modeling task-specific dynamics in the Rule-based module.
Therefore, this module aims at discovering complex situations
that triggered interaction behaviors during the WoZ studies—
by following this principle, the robot controller is able to
perform a much richer set of interaction behaviors without
having to manually design them for specific purposes.
Formally, in our methodology we learn interaction strategies
through a mapping function between the robot’s state features
and interaction behaviors given the wizard demonstrations in
the restricted-perception WoZ studies. Recall from Sec. II
that the wizards have the responsibility of choosing which
interaction behavior should be triggered and when to trigger it.
The same responsibility has to be transferred to the ML-based
module, notably that of also knowing when not to perform a
behavior—this is another crucial aspect of our methodology,
as intervening in an incorrect way at the wrong time can easily
lead to “breaks” in the interaction flow between humans and
the robot. To address such challenge, researchers may use
suitable ML algorithms to learn the mapping function, e.g., by
using classification or clustering algorithms. The process is
illustrated in Fig. 4. It starts with a Data Preparation phase
involving the transformation of the collected demonstrations
into a data-set of state features–behavior pairs referred to
as training instances. The Training phase learns a mapping
function encoding the observed interaction strategies from the
given data-set. After having learned the mapping, the ML-
based module may choose an appropriate interaction behavior
at run-time upon request, given the robot’s perceptual state.
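The sketch below illustrates this pipeline on toy data. The authors used WEKA; here an equivalent, hypothetical setup is shown in Python with scikit-learn, with a decision tree standing in for whichever classifier is chosen, binary state features as attribute rows, and the wizard's behaviors (including do_nothing) as class labels.

```python
# Minimal sketch of the Data Preparation -> Training -> Run-time Classification pipeline
# on toy data; the feature values and behavior labels are made up for illustration.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# X: one row of binary state features per training instance;
# y: the behavior triggered by the wizard in that state (or "do_nothing").
X = np.array([
    [0, 0, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 0, 1, 1, 0],
    [1, 0, 1, 1, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
])
y = np.array(["behavior_a", "do_nothing", "behavior_c", "behavior_b",
              "behavior_a", "do_nothing", "behavior_c", "behavior_b"])

# Training: learn the mapping function and estimate its accuracy via cross-validation.
model = DecisionTreeClassifier(random_state=0).fit(X, y)
print(cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=2))

# Run-time classification: map the robot's current perceptual state to a behavior.
current_state = np.array([[0, 0, 0, 0, 1, 0, 0, 1]])
print(model.predict(current_state))   # e.g. ['behavior_a']
```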
E. Strategy Extraction in the MCEC Scenario
In EMOTE we preprocessed all the log files generated
during the interaction studies to create a set of binary state
features that at some instant can either be active or inactive.
In the ML-based module, the features are used to learn the
interaction context upon which to activate some behavior. In
[Fig. 4 diagram: Data Preparation turns the WoZ log file (perceptual features and wizard behaviors) into a data-set of attribute/class training instances (e.g., a row of binary features labeled "behavior a" or "do_nothing"); Training learns the Mapping Function; at Run-time Classification the Mapping Function maps the current perceptual state to an interactive behavior.]
Fig. 4: A depiction of the ML-based module processing. A data-set is prepared from the WoZ data and then fed to some ML
algorithm to learn a mapping function that is used at run-time to select an appropriate robot interaction behavior.
the Rule-based module, they suggest that a certain behavior
should be triggered whenever a specific context is active.
Regarding the Rule-based module, we analyzed data
recorded from the mock-up and the sessions performed with
the teachers to create pedagogical strategies that enhanced
the pedagogical capabilities of the robotic tutor. For example,
whenever the game started, the robot would give a short
tutorial explaining the game rules, and when it finished it
would wrap-up by summarizing the main achievements and
analyzing the group’s performance. Also, upon each student’s
game play, the module communicates with the game AI
module to analyze the quality of his action and possibly
suggest an alternative, more suitable move. The rules also
encode interaction management functions like announcing the
next player or other game-related information. In the context of
EMOTE, such expert knowledge rules are especially important
for robotic tutors to improve the students’ comprehension of
the task and to understand their learning progress.
As for the ML-based module, we followed the procedure in
Fig. 4 to create a data-set from the restricted-perception WoZ
log files. Instead of sampling perceptual states at a constant
rate, we only recorded an instance whenever the value of
some binary state feature changed to avoid biasing towards
behaviors occurring during long, immutable states. For each
wizard-controlled behavior, we created a new training instance
mapping the respective perceptual state to the behavior’s label.
For states changing without the wizard’s intervention we cre-
ated instances labeled with DoNothing. In the training phase,
we used the WEKA software package³ to build several classifi-
cation models based on different ML algorithms, e.g., decision
trees, neural networks, clustering algorithms, etc. We then
tested the several approaches using cross-validation to obtain
accuracy scores for each model. A few considerations from our
ML experiments within the MCEC scenario are worth noting.
First, the data generated from the restricted WoZ study was
quite noisy—the attributes changed very often with no wizard
behavior being performed. The data was also inconsistent—
the same input features were mapped to distinct behaviors and
³ http://www.cs.waikato.ac.nz/ml/weka/
many behaviors were triggered in very different perceptual
states. Further, the data-set was quite unbalanced—behavior
DoNothing corresponded to about 90% of the whole data while
other behaviors had only a few demonstrations, resulting in
low accuracy scores across the models. Notwithstanding, this
is to be expected—the perceptual processing within the Task AI
may itself be very noisy, causing many instances of DoNothing
to be recorded. In addition, the infeasibility of determining the
exact features that the human experts were paying attention to
when triggering some behavior is naturally going to originate
the observed inconsistencies.
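For illustration, a minimal sketch of this data-preparation step is given below. It assumes a simplified log format in which each entry carries the binary feature vector and, optionally, the behavior the wizard triggered; only the sampling logic (a new instance whenever the features change, labeled do_nothing if no behavior was triggered) follows the description above.

```python
# Minimal sketch, assuming log entries of the form (feature_tuple, wizard_behavior_or_None).
def build_dataset(log_entries):
    """Turn a restricted-perception WoZ log into (features, label) training instances."""
    instances, previous_features = [], None
    for features, behavior in log_entries:
        if behavior is not None:
            # The wizard intervened: map the current state to the chosen behavior.
            instances.append((features, behavior))
        elif features != previous_features:
            # The state changed with no intervention: an explicit do_nothing example.
            instances.append((features, "do_nothing"))
        previous_features = features
    return instances
```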
Given that the robot should not intervene very often, and especially not in the wrong situations, all this posed some challenges
that had to be addressed in order to select appropriate be-
haviors during the interactions with humans. In particular, we
addressed the noisiness problem by applying filters to clean the
data from redundant attributes. We also trained the classifiers
using only the behaviors performed by the wizards. While this
naturally improved the accuracy in many behaviors it also led
to a new problem—the trained models always tried to classify
every given test instance with a known behavior, thus leading
to overfitting [17]. This problem is potentially more serious
when interacting at run-time, as plenty of novel perceptual states
may be experienced, leading the robot to intervene inappro-
priately very often. To address such challenges we resorted
to lazy learning, an ML technique that leverages local information to discriminate between classification labels [17].
Specifically, we developed a technique based on an associative
metric within frequent-pattern mining [18]. For each available
robot behavior the method builds two structures capturing the
context in which it should and should not be executed. The
idea is to evaluate the confidence of each classification and
learn when not to trigger incorrect robot behaviors. The result
was a model encoding a conservative interaction strategy that
is able to discover the behaviors performed during the WoZ
studies that seemed more consistent and reliable.
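To convey the flavour of such a conservative, instance-based strategy without reproducing the authors' associative metric from [18], the sketch below uses a generic confidence-gated nearest-neighbour rule: the robot only commits to a behavior when the local neighbourhood of demonstrations clearly agrees on it, and otherwise defaults to doing nothing. The distance measure and thresholds are illustrative assumptions.

```python
# Minimal sketch of the general idea (not the authors' method): a lazy,
# instance-based classifier that abstains unless the local evidence is consistent.
from collections import Counter


def predict_conservative(state, instances, k=5, min_confidence=0.8):
    """instances: list of (binary_feature_tuple, behavior_label) pairs from the WoZ data."""
    def distance(a, b):
        # Hamming distance between binary feature vectors.
        return sum(x != y for x, y in zip(a, b))

    neighbours = sorted(instances, key=lambda inst: distance(state, inst[0]))[:k]
    label, votes = Counter(behavior for _, behavior in neighbours).most_common(1)[0]
    confidence = votes / len(neighbours)
    # Only intervene when the neighbourhood clearly agrees on a single behavior.
    return label if confidence >= min_confidence else "do_nothing"
```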
IV. REFINING INTERACTION STRATEGIES
After creating the robot’s controller from the information
gathered during the HRI studies, we validate and refine the
Fig. 5: Two students interact with our autonomous tutor robot.
robot’s interaction model through an iterative process, as
illustrated in Fig. 1. To achieve that, in the Strategy Refinement
phase of our methodology we propose two mechanisms that HRI practitioners can use to refine the robot's strategies, namely
through active learning and by means of corrective feedback.
Before that, iterative evaluation studies can be performed to
identify e.g., for which behaviors we need to improve the
selection policy, or to identify novel interaction contexts that
did not occur during the initial studies or for which we could
not appropriately define a strategy. In our methodology these
studies are performed by testing the autonomous operation of
the hybrid strategy controller within the task of interest.
Regarding active learning within ML, this technique puts
on the learner the responsibility of querying an expert about
specific inputs for which it needs to learn an output or improve
its accuracy [19]. In that respect, one can create methods
that automatically identify areas of the robot’s perceptual state
space that were less explored during the WoZ studies. Such
data points can then be used to actively query human experts
about which interaction behaviors should be performed in
that specific context. Additionally, one can build mechanisms
within the ML-based module that identify data inconsistencies
like the ones that occurred in the MCEC task scenario.
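One simple way to realise such queries, sketched below under the assumption of binary feature vectors, is to rank candidate perceptual states by their distance to the nearest demonstrated state and hand the most novel ones back to the expert for labelling. This is only an illustrative selection criterion; the paper does not commit to a specific one.

```python
# Minimal sketch, under assumptions: pick the states least covered by the existing
# demonstrations as active-learning queries for the human expert.
def select_queries(candidate_states, demonstrations, n_queries=10):
    """Return the candidate perceptual states farthest from any demonstrated state."""
    def distance(a, b):
        # Hamming distance between binary feature vectors.
        return sum(x != y for x, y in zip(a, b))

    def novelty(state):
        return min(distance(state, demo_state) for demo_state, _ in demonstrations)

    # The most novel states are the ones the expert is asked to label with a behavior.
    return sorted(candidate_states, key=novelty, reverse=True)[:n_queries]
```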
Another possibility is to let a human expert provide correc-
tive feedback during the evaluation study itself, again applying
the restricted-perception condition. The idea is to have the
expert observe the output of the strategy controller online,
i.e., while the robot is interacting with the human subjects, and
give him the opportunity to either accept the selected behavior
or suggest a different one (see e.g., [19]). The rationale behind
this technique is that those interaction contexts in which the
expert overrode the controller are most likely the ones either
incorrectly encoded or learned during the Strategy Extraction
phase. We can later leverage the provided feedback to refine
some interaction strategies for later evaluation studies, e.g., by
re-training the ML functions in the ML-based module.
A. Strategy Refinement in the MCEC Scenario
We conducted several evaluation studies using a fully-
autonomous robotic tutor interacting with students in a class-
room in the MCEC scenario, as illustrated in Fig. 5. In
total, our implemented system interacted with 54 high-school
students, some of whom interacted four different times over a period of
one month. All studies allowed us to assess the applicability
of the methodology and also discover possible refinements to
the interaction strategies learned from the preliminary studies.
In terms of interaction strategy refinements, we noted during
the evaluation studies that the behavior being exhibited by
the autonomously-controlled tutor did not correctly capture
some empathic subtleties of the interaction. To improve the
tutor’s interaction strategy we created a module within the Task
AI detecting the interaction’s emotional climate (EC), based
on facial expression features of both students. As a result,
the controller selected a slightly different behavior according
to the EC being detected online [20]. We also performed
interaction sessions using the thinking-aloud technique [21]
that allowed us to identify aspects of the system that were
confusing and in need of refinement—e.g., we reformulated
the way the task tutorial was performed by the robot and
split some utterances so as to emphasize the message being
transmitted. Other kinds of refinements can be done at this
stage based on more quantitative data after each improvement
cycle. For example, recall from Sec. III-E that our experience
within the MCEC task scenario showed that the interaction
data gathered during the WoZ studies was quite noisy. To deal
with such problem, we used the confidence of all classifica-
tions to estimate how reliable a certain interaction behavior
was. We then selected the least reliable strategies to ask the
wizard about their complexity and how one could improve the
controller, e.g., by providing more training samples or creating
a new interaction strategy in the Rule-based module.
B. Evaluating the MCEC Scenario
In the context of EMOTE, the proposed methodology is
fundamental for the creation of robotic tutors capable of inter-
acting with real students in a collaborative fashion. To assess
the impact of the discovered pedagogical interaction strategies
in the student’s perception of the tutor, we evaluated the perfor-
mance of our robot using a fully-autonomous hybrid controller
against that of a restricted-perception WoZ and a baseline
condition using a standard unrestricted WoZ. Performance was
assessed according to several HRI metrics. Namely, we used
an adapted version of the interpersonal reactivity index (IRI)
[22] to measure the perceived empathy of students towards the
robot–perspective taking and empathic concern dimensions.
We applied the godspeed series to assess the perception of the
robot’s anthropomorphism, animacy, likeability, intelligence
and security [23]. We also measured the student’s engagement
levels using a task-specific questionnaire.
78 high-school students evaluated the interaction with the robotic tutor by answering the aforementioned questionnaires, rating their opinion on a 5-point Likert-type scale. The study results
are detailed in Table I. As we can see, in all studies the students attributed considerable empathic capabilities to the robotic tutor, with no significant differences between conditions.
Regarding the godspeed series, the results suggest that the
students preferred interaction strategies selected by the wiz-
ard during both WoZ studies, especially in the unrestricted
WoZ condition—specifically the perception of animacy and
TABLE I: Comparative results of the three evaluation studies according to the applied metrics. A * mark highlights statistically significant interaction effects between factors (p < .05).

Questionnaire dimension    | Restr. WoZ   | Autonomous   | Unres. WoZ
IRI: Persp. taking         | 3.60 ± 0.60  | 3.44 ± 0.41  | 3.60 ± 0.66
IRI: Emp. concern          | 3.71 ± 0.51  | 3.46 ± 0.51  | 3.78 ± 0.60
Godspeed: Anthropom.       | 3.37 ± 0.60  | 3.00 ± 0.73  | 3.23 ± 0.62
Godspeed: Animacy          | 3.78 ± 0.47* | 3.35 ± 0.56* | 3.76 ± 0.54*
Godspeed: Likeability      | 4.46 ± 0.46* | 4.23 ± 0.48* | 4.56 ± 0.43*
Godspeed: Perc. intellig.  | 4.35 ± 0.41  | 4.12 ± 0.61  | 4.28 ± 0.55
Godspeed: Perc. security   | 3.71 ± 0.54  | 3.97 ± 0.80* | 3.46 ± 0.59*
Engagement                 | 4.23 ± 0.47  | 3.90 ± 0.70  | 4.17 ± 0.50
likeability were statistically significantly higher in both these
conditions. In contrast, perceived security was significantly
higher in the autonomous condition compared to the unre-
stricted WoZ. This result is in line with our expectations since
interaction decisions are still more appropriate when directly
performed by a human expert compared to a fully-autonomous
controller. Nevertheless, results were not significant for the
other metrics, revealing positive results for the godspeed series
across all studies including the restricted-perception WoZ
condition—this shows that the proposed methodology led to
non-differentiable impressions by the students regarding the
robotic tutor. When analyzing the engagement, results showed
that in all conditions students had high engagement levels with
the robot revealing no significant differences.
V. CONCLUSION
In this paper we proposed a formal methodology for dis-
covering social interaction strategies for robots that takes into
account their inherent real-world constraints and limitations.
The key aspect of the methodology is the performance of
restricted-perception WoZ studies limiting what the human
expert can observe from the task to match all the perceptual
features that are available to the robot. In this way, we
can approach the expert’s demonstrated behavior by learning
directly from the interaction data gathered during the WoZ
studies. We implemented a fully-autonomous robotic tutor in
the context of EMOTE involving the MCEC scenario based
on the proposed methodology. The results of our evaluation
studies show that the generated interaction strategies allow the
students to be engaged in the social interaction with the robot
and to perceive it positively in terms of its empathic capabil-
ities. Besides the positive results in this case-study, we argue
that the proposed methodology is general enough to be used in
different HRI applications. By performing interaction studies
using the restricted-perception WoZ technique we facilitate the
process of learning from human expert’s behaviors. As a result,
the robot interaction strategies emerging from the process enable more natural social interactions.
ACKNOWLEDGMENTS
This work was supported by national funds through Fundação para a Ciência e a Tecnologia (FCT) with reference UID/CEC/50021/2013 and by the EU-FP7 project EMOTE under grant agreement no. 317923. P. Sequeira acknowledges a BDP grant from project INSIDE ref. CMUP-ERI/HCI/0051/2013, P. Alves-Oliveira FCT grant ref. SFRH/BD/110223/2015 and T. Ribeiro FCT grant ref. SFRH/BD/97150/2013. The authors are grateful to Externato Marista de Lisboa and Escola Quinta do Marquês for their involvement in the studies.
REFERENCES
[1] C. Breazeal, Designing Sociable Robots. MIT Press, 2004.
[2] C.-M. Huang and B. Mutlu, “Robot behavior toolkit: generating effective
social behaviors for robots,” in Proc. 7th annual ACM/IEEE Int. Conf.
on Human-Robot Interaction. ACM, 2012, pp. 25–32.
[3] B. D. Argall, S. Chernova, M. Veloso, and B. Browning, “A survey of
robot learning from demonstration,” Robotics and Autonomous Systems,
vol. 57, no. 5, pp. 469–483, May 2009.
[4] P. H. Kahn, N. G. Freier, T. Kanda, H. Ishiguro, J. H. Ruckert, R. L.
Severson, and S. K. Kane, “Design patterns for sociality in human-robot
interaction,” in Proc. 3rd Int. Conf. Human Robot Int., 2008, pp. 97–104.
[5] L. Riek, “Wizard of Oz Studies in HRI: A Systematic Review and
New Reporting Guidelines,” Journal of Human-Robot Interaction, vol. 1,
no. 1, pp. 119–136, 2012.
[6] W. B. Knox, S. Spaulding, and C. Breazeal, “Learning social interaction
from the wizard: A proposal,” in Workshops at 28th AAAI Conf. on
Artificial Intelligence, 2014.
[7] A. Steinfeld, O. C. Jenkins, and B. Scassellati, “The Oz of Wizard:
Simulating the Human for Interaction Research,” in Proc. 4th ACM/IEEE
Int. Conf. on Human-Robot Interaction, 2009, pp. 101–107.
[8] G. Castellano, A. Paiva, A. Kappas, R. Aylett, H. Hastie, W. Barendregt,
F. Nabais, and S. Bull, “Towards empathic virtual and robotic tutors,”
in Artificial Intelligence in Education, ser. Lecture Notes in Computer
Science. Springer Berlin Heidelberg, 2013, vol. 7926, pp. 733–736.
[9] P. Sequeira, F. S. Melo, and A. Paiva, ““Let’s Save Resources!”: A
Dynamic, Collaborative AI for a Multiplayer Environmental Awareness
Game,” in Proc. IEEE Conf. Comp. Intel. and Games, 2015, pp. 399–406.
[10] E. Knol and P. De Vries, “EnerCities: Educational Game about Energy,”
in Proc. Central Europe towards Sust. Building, 2010, pp. 12–15.
[11] C. Bartneck and J. Hu, “Rapid prototyping for interactive robots,” in The
8th Conf. on intelligent autonomous systems (IAS-8), 2004, pp. 136–145.
[12] J. F. Kelley, “An iterative design methodology for user-friendly natural
language office information applications,” ACM Transactions on Infor-
mation Systems, vol. 2, no. 1, pp. 26–41, Jan. 1984.
[13] P. Alves-Oliveira, E. Di Tullio, T. Ribeiro, and A. Paiva, “Meet me
halfway: Eye behaviour as an expression of robot’s language,” in 2014
AAAI Fall Symposium Series, 2014.
[14] P. Alves-Oliveira, S. Janarthanam, A. Candeias, A. Deshmukh,
T. Ribeiro, H. Hastie, A. Paiva, and R. Aylett, “Towards dialogue
dimensions for a robotic tutor in collaborative learning scenarios,” in
Proc. 23rd Int. Symp. Robot & Human Inter. Com., 2014, pp. 862–867.
[15] A. Billard and D. Grollman, “Robot learning by demonstration,” Schol-
arpedia, vol. 8, no. 12, p. 3824, 2013.
[16] V. Mnih, K. Kavukcuoglu, D. Silver, A. Rusu, J. Veness, M. Bellemare,
A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen,
C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra,
S. Legg, and D. Hassabis, “Human-level control through deep reinforce-
ment learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[17] T. Mitchell, Machine Learning. McGraw-Hill, 1997.
[18] P. Sequeira and C. Antunes, “Real-Time Sensory Pattern Mining for
Autonomous Agents,” in 6th Int. Workshop on Agents and Data Mining
Interaction, 2010, pp. 71–83.
[19] S. Chernova and M. Veloso, “Interactive Policy Learning Through
Confidence-based Autonomy,” Journal of Artificial Intelligence Re-
search, vol. 34, no. 1, pp. 1–25, Jan. 2009.
[20] P. Alves-Oliveira, P. Sequeira, E. D. Tullio, S. Petisca, C. Guerra, F. S.
Melo, and A. Paiva, ““It’s amazing, we are all feeling it!” Emotional
Climate as a Group-Level Emotional Expression in HRI,” in Art. Intel.
and Human-Robot Interaction, AAAI Fall Symposium Series, 2015.
[21] J. Nielsen, “Evaluating the thinking-aloud technique for use by computer
scientists,” Adv. in Human-Comp. Interaction, vol. 3, pp. 69–82, 1992.
[22] M. Davis, “Interpersonal reactivity index,” Empathy: A Social Psycho-
logical Approach, pp. 55–6, 1996.
[23] C. Bartneck, D. Kulić, E. Croft, and S. Zoghbi, “Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots,” Int. Journal of Social Robotics, vol. 1, no. 1, pp. 71–81, 2009.
... Previous work with such experts has demonstrated that a lot of the related expertise is intuitive and intangible, making it difficult to access in a way that can easily inform robot automation (Winkle et al., 2018). This is somewhat addressed by methods that capture domain expert operation of a robot directly, for example, end user programming tools (e.g., Leonardi et al., 2019) or learning from expert teleoperation of robots (e.g., Sequeira et al., 2016). However, these methods tend to focus on offline learning/ programming. ...
... Knox et al. (2014) proposed the "Learning from the Wizard" paradigm, whereby a robot would first be controlled in a WoZ experiment used to acquire the demonstrations and then machine learning would be applied offline to define a policy. Sequeira et al. (2016) extended and applied this Learning from Demonstration (LfD), with an emphasis on the concept of "restricted-perception WoZ", in which the wizard only has access to the same input space as the learning algorithm, thus reducing the problem of correspondence between the state and action spaces used by the wizard and the ones available to the robot controller. Both of these works could support a PD approach to robot automation, because they could be used to generate an autonomous robot action policy based on data from (non-roboticist) domain expert WoZ interactions in real-world environments. ...
... Requirement 3 ensures that the expert can create a mental model of the robot behaviour. This point is a key difference to non-interactive teaching methods such as the ones based on offline learning (e.g., Sequeira et al., 2016). With the feedback of the robot on its policy (through suggestions or visual behaviour representation), the expert can assess the (evolving) capabilities of the robot and decide what inputs would further improve the policy of the robot. ...
Article
Full-text available
Participatory design (PD) has been used to good success in human-robot interaction (HRI) but typically remains limited to the early phases of development, with subsequent robot behaviours then being hardcoded by engineers or utilised in Wizard-of-Oz (WoZ) systems that rarely achieve autonomy. In this article, we present LEADOR (Led-by-Experts Automation and Design Of Robots), an end-to-end PD methodology for domain expert co-design, automation, and evaluation of social robot behaviour. This method starts with typical PD, working with the domain expert(s) to co-design the interaction specifications and state and action space of the robot. It then replaces the traditional offline programming or WoZ phase by an in situ and online teaching phase where the domain expert can live-program or teach the robot how to behave whilst being embedded in the interaction context. We point out that this live teaching phase can be best achieved by adding a learning component to a WoZ setup, which captures implicit knowledge of experts, as they intuitively respond to the dynamics of the situation. The robot then progressively learns an appropriate, expert-approved policy, ultimately leading to full autonomy, even in sensitive and/or ill-defined environments. However, LEADOR is agnostic to the exact technical approach used to facilitate this learning process. The extensive inclusion of the domain expert(s) in robot design represents established responsible innovation practice, lending credibility to the system both during the teaching phase and when operating autonomously. The combination of this expert inclusion with the focus on in situ development also means that LEADOR supports a mutual shaping approach to social robotics. We draw on two previously published, foundational works from which this (generalisable) methodology has been derived to demonstrate the feasibility and worth of this approach, provide concrete examples in its application, and identify limitations and opportunities when applying this framework in new environments.
... Despite the assumed relevance of service robots in physical retail stores, there is a lack of knowledge in the academic and business domain how to set up and deploy service robots in instore service encounter settings [15]. This knowledge gap not only hampers the academic community to get a fuller understanding of the added value of service robots in retailing at large, it also prevents retail practitioners from optimizing the use of service robots. ...
... In the remainder of this paper, we first elaborate upon related work underlying our study. Then, using interaction design techniques and procedures suggested in previous study (e.g., [15]), we focus on the two Dutch fashion stores as specific use cases and report on the results of empirical exploration. We conclude with a discussion of our findings and look ahead by proposing future research directions. ...
Conference Paper
Full-text available
Service robots provide retailers with new opportunities to innovate their in-store service offerings. Despite advances made in the fields of human-robot interaction, information systems, and marketing, there is relatively little known about how to apply a service robot in retailing. In this paper we aim to shed light on this issue by exploring the added value, roles, and prototyping of a service robot in fashion retailing. Using two Dutch fashion stores as real-life settings, we apply different interaction techniques (observation, interview, survey, structured role play, prototyping) to generate first insights and obtain lessons learned. The results of our study suggest that fashion retailers would benefit most from using service robots for communication of promotions and provision of product information. When applying service robots to these use cases, customers seem to prefer briefly and clearly expressed information that is communicated in a style that matches (in-) store communications. Still, the lack of personal attention and social support associated with a service robot makes retailers and store personnel rather reluctant to use them for their service excellence-oriented stores.
... Leonardi et al. [21]) or learning from expert teleoperation of robots (e.g. Sequeira et al. [33]). ...
... Knox et al. [18] proposed the "Learning from the Wizard" paradigm, whereby a robot would first be controlled in a WoZ experiment used to acquire the demonstrations and then machine learning would be applied offline to define a policy. Sequeira et al. [33] extended and applied this Learning from Demonstration (LfD), with an emphasis on the concept of "Restricted-perception WoZ", in which the wizard only has access to the same input space as the learning algorithm, thus reducing the problem of correspondence between the state and action spaces used by the wizard and the ones available to the robot controller. Both of these works could support a PD approach to robot automation, as they could be used to generate an autonomous robot action policy based on data from (non-roboticist) domain expert WoZ interactions in real-world environments. ...
Preprint
Full-text available
Participatory Design (PD) in Human-Robot Interaction (HRI) typically remains limited to the early phases of development, with subsequent robot behaviours then being hardcoded by engineers or utilised in Wizard-of-Oz (WoZ) systems that rarely achieve autonomy. We present LEADOR (Led-by-Experts Automation and Design Of Robots) an end-to-end PD methodology for domain expert co-design, automation and evaluation of social robots. LEADOR starts with typical PD to co-design the interaction specifications and state and action space of the robot. It then replaces traditional offline programming or WoZ by an in-situ, online teaching phase where the domain expert can live-program or teach the robot how to behave while being embedded in the interaction context. We believe that this live teaching can be best achieved by adding a learning component to a WoZ setup, to capture experts' implicit knowledge, as they intuitively respond to the dynamics of the situation. The robot progressively learns an appropriate, expert-approved policy, ultimately leading to full autonomy, even in sensitive and/or ill-defined environments. However, LEADOR is agnostic to the exact technical approach used to facilitate this learning process. The extensive inclusion of the domain expert(s) in robot design represents established responsible innovation practice, lending credibility to the system both during the teaching phase and when operating autonomously. The combination of this expert inclusion with the focus on in-situ development also means LEADOR supports a mutual shaping approach to social robotics. We draw on two previously published, foundational works from which this (generalisable) methodology has been derived in order to demonstrate the feasibility and worth of this approach, provide concrete examples in its application and identify limitations and opportunities when applying this framework in new environments.
... It is also important to note that, although policy transfer can meaningfully improve the behaviour model, training the model is still a slow process, as the optimal policy seems to be reached only after 10-20 epochs. We believe that this effect can be alleviated by applying the learning from guidance concept [1], [23], [27], [28], which would also make the system safer and more applicable to real therapy. ...
Preprint
In robot-assisted therapy for individuals with Autism Spectrum Disorder, the workload of therapists during a therapeutic session is increased if they have to control the robot manually. To allow therapists to focus on the interaction with the person instead, the robot should be more autonomous; namely, it should be able to interpret the person's state and continuously adapt its actions according to their behaviour. In this paper, we develop a personalised robot behaviour model that can be used in the robot's decision-making process during an activity; this behaviour model is trained with the help of a user model that has been learned from real interaction data. We use Q-learning for this task; the results demonstrate that the policy requires about 10,000 iterations to converge. We thus investigate policy transfer for improving the convergence speed and show that this is a feasible solution, although an inappropriate initial policy can lead to a suboptimal final return.
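As a rough illustration of the approach summarised above, the sketch below shows tabular Q-learning in which the value table can be initialised from a previously learned table ("policy transfer") instead of zeros; the environment interface and all hyperparameters are assumptions, not the paper's setup.

```python
# Illustrative sketch only (not the paper's code): tabular Q-learning where the
# Q-table can start from a previously learned table ("policy transfer") rather
# than zeros, which is one way to speed up convergence as discussed above.
# The environment interface (reset()/step()) and table sizes are assumptions.
import random

def q_learning(env, n_states, n_actions, episodes=10_000,
               alpha=0.1, gamma=0.95, epsilon=0.1, source_q=None):
    # Transfer: start from a copy of the source table if one is provided.
    q = [row[:] for row in source_q] if source_q else \
        [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                action = random.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])
            next_state, reward, done = env.step(action)
            # standard Q-learning update towards the bootstrapped target
            q[state][action] += alpha * (reward + gamma * max(q[next_state]) - q[state][action])
            state = next_state
    return q
```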
... That is, existing computational robot models lack the functionality and flexibility for studying how members will be able to learn within a future human-robot team, as these models do not yet sufficiently incorporate the principles of interdependence and autonomy (Lematta et al., 2019). In this study we use the method of a restricted-perception wizard-of-Oz (WOz), which has been advocated for studying the design of interaction strategies in human-robot interaction research (Sequeira et al., 2016). Half of the human-robot teams engaged in the LDPs; the other teams did not. ...
Article
Full-text available
The rapid advancement of technology empowered by artificial intelligence is believed to intensify the collaboration between humans and AI as team partners. Successful collaboration requires partners to learn about each other and about the task. This human-AI co-learning can be achieved by presenting situations that enable partners to share knowledge and experiences. In this paper we describe the development and implementation of a task context and procedures for studying co-learning. More specifically, we designed specific sequences of interactions that aim to initiate and facilitate the co-learning process. The effects of these interventions on learning were evaluated in an experiment, using a simplified virtual urban search-and-rescue task for a human-robot team. The human participants performed a victim rescue and evacuation mission in collaboration with a wizard-of-Oz (i.e., a confederate of the experimenter who executed the robot behavior consistent with an ontology-based AI model). The designed interaction sequences, formulated as Learning Design Patterns (LDPs), were intended to bring about co-learning. Results show that LDPs support the humans' understanding and awareness of their robot partner and of the teamwork. No effects were found on collaboration fluency, nor on team performance. Results are used to discuss the importance of co-learning, the challenges of designing human-AI team tasks for research into this phenomenon, and the conditions under which co-learning is likely to be successful. The study contributes to our understanding of how humans learn with and from AI partners, and our propositions for designing intentional learning (LDPs) provide directions for applications in future human-AI teams.
... Results suggested that the proposed algorithm, based on max-margin IRL, was capable of learning the user's preferences during interactions with the robot. Sequeira et al. (2016) proposed a method for creating social interaction strategies for human-robot interaction based on WoZ studies. The robot's final behaviours went through three design stages: data collection, in which expert knowledge is gathered; strategy extraction, in which the robot's strategy is learned from this data; and finally strategy refinement, in which the robot's behaviour is iteratively refined during the interactions. ...
Article
Full-text available
Socially assistive robots have the potential to augment and enhance therapists' effectiveness in repetitive tasks such as cognitive therapies. However, their contribution has generally been limited because domain experts have not been fully involved in the entire pipeline of the design process, nor in the automatisation of the robots' behaviour. In this article, we present aCtive leARning agEnt aSsiStive bEhaviouR (CARESSER), a novel framework that actively learns robotic assistive behaviour by leveraging the therapist's expertise (knowledge-driven approach) and their demonstrations (data-driven approach). By exploiting this hybrid approach, the presented method enables fast, in-situ learning, in a fully autonomous fashion, of personalised patient-specific policies. To evaluate our framework, we conducted two user studies in a daily care centre in which older adults affected by mild dementia and mild cognitive impairment (N = 22) were asked to solve cognitive exercises with the support of a therapist and later of a robot endowed with CARESSER. Results showed that: (i) the robot managed to keep the patients' performance stable during the sessions, even more so than the therapist; (ii) the assistance offered by the robot during the sessions eventually matched the therapist's preferences. We conclude that CARESSER, with its stakeholder-centric design, can pave the way to new AI approaches that learn by leveraging human-human interactions along with human expertise, which has the benefits of speeding up the learning process, eliminating the need to design complex reward functions, and avoiding undesired states.
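The hybrid, knowledge-plus-data-driven idea described above can be pictured with a small sketch: therapist-authored rules take priority, and a policy learned from the therapist's demonstrations handles the remaining cases. This is only an illustration under assumed rule and state formats, not the CARESSER implementation.

```python
# Rough sketch of the hybrid idea described above (assumptions throughout; this is
# not the CARESSER implementation): therapist-authored rules take priority, and a
# policy learned from the therapist's demonstrations handles the remaining cases.
def hybrid_assistance(state, rules, learned_policy):
    """state: dict of features; rules: list of (predicate, action); learned_policy: callable."""
    for predicate, action in rules:          # knowledge-driven component
        if predicate(state):
            return action
    return learned_policy(state)             # data-driven fallback

# Example with illustrative rules and a trivial stand-in for the learned policy.
rules = [
    (lambda s: s["consecutive_failures"] >= 2, "give_specific_hint"),
    (lambda s: s["idle_seconds"] > 30, "encourage"),
]
learned_policy = lambda s: "wait"            # placeholder for a model fit on demonstrations
print(hybrid_assistance({"consecutive_failures": 2, "idle_seconds": 5},
                        rules, learned_policy))
```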
... One of the main characteristics of HRI using merging is that the device being shared is often limited to the robot(s) (e.g., [60,80,168,203]). Several cases exist in which participants share additional digital artefacts instead of, or besides, the robot, such as a connected tablet, screen, or interactive surfaces (e.g., [124,151,164]), see Figure 6. We could further observe that merging is often employed in public areas, such as guides in shopping malls (e.g., [13,80,81,167]). ...
Article
Full-text available
Going beyond dyadic (one-to-one) interaction has been increasingly explored in HRI. Yet we lack a comprehensive view of non-dyadic interaction research in HRI. To map out 15 years of work investigating non-dyadic interaction, and thereby identify trends in the field and future research areas, we performed a literature review covering all 164 publications (2006-2020) from the HRI conference investigating non-dyadic interaction. Our approach is inspired by the 4C framework, an interaction framework focused on understanding and categorising different types of interaction between humans and digital artefacts. The 4C framework consists of eight interaction principles for multi-user/multi-artefact interaction categorised into four broader themes. We modified the 4C framework to increase its applicability and relevance in the context of non-dyadic human-robot interaction. We identify an increasing tendency towards non-dyadic research (36% in 2020), as well as a focus on simultaneous studies (85% from 2006-2020) over sequential ones. We also articulate seven interaction principles utilised in non-dyadic HRI and provide specific examples. Last, based on our findings, we discuss several salient points of non-dyadic HRI, the applicability of the modified 4C framework to HRI, and potential future topics of interest as well as open questions for non-dyadic research.
... The computational model built to drive the behavior of the SIA (embodied as the NAO robot) was a "hybrid behavior controller" combining a rule-based component and a data-driven one. The data-driven component was built with a dataset created using a restricted-perception Wizard-of-Oz study [Sequeira et al. 2016]. The final system was tested on the robot, showing its capability to foster meaningful discussions among students interacting with the robot and among themselves. ...
... One aspect of this problem is related to the fact that the wizard often acts on perceptual information at a higher level than an autonomous robot would do. One approach to prevent this is to restrict the wizard's perception (Sequeira et al., 2016). Another aspect is investigated in (Schlögl et al., 2010), where reported experiments show how the wizard's behavior sometimes varies significantly, thereby potentially influencing the outcome of the experiment. ...
Article
Full-text available
Wizard-of-Oz experiments play a vital role in Human-Robot Interaction (HRI), as they allow for quick and simple hypothesis testing. Still, a general, publicly available tool to conduct such experiments does not currently exist in the research community, and researchers often develop and implement their own tools, customized for each individual experiment. Besides being inefficient in terms of programming effort, this also makes it harder for non-technical researchers to conduct Wizard-of-Oz experiments. In this paper, we present a general and easy-to-use tool for the Pepper robot, one of the most commonly used robots in this context. While we provide the concrete interface for Pepper robots only, the system architecture is independent of the type of robot and can be adapted for other robots. A configuration file, which saves experiment-specific parameters, enables a quick setup for reproducible and repeatable Wizard-of-Oz experiments. A central server provides a graphical interface via a browser while handling the mapping of user input to actions on the robot. In our interface, keyboard shortcuts may be assigned to phrases, gestures, and composite behaviors to simplify and speed up control of the robot. The interface is lightweight and independent of the operating system. Our initial tests confirm that the system is functional, flexible, and easy to use. The interface, including source code, is made publicly available, and we hope that it will be useful for researchers of any background who want to conduct HRI experiments.
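The configuration-driven mapping described above, keyboard shortcuts bound to phrases, gestures, or composite behaviours, can be sketched as follows; the config format and the send_to_robot() stub are assumptions for illustration, not the tool's actual file format or API.

```python
# Illustrative sketch of a configuration-driven WoZ mapping: keyboard shortcuts
# bound to phrases, gestures, or composite behaviours. The config format and the
# send_to_robot() stub are assumptions, not the cited tool's API.
import json

CONFIG = json.loads("""
{
  "1": {"say": "Hello, nice to meet you!"},
  "2": {"gesture": "wave"},
  "3": {"say": "Well done!", "gesture": "nod"}
}
""")

def send_to_robot(command):
    # Placeholder: a real tool would forward this to the robot's middleware.
    print("robot executes:", command)

def on_key(key):
    behaviour = CONFIG.get(key)
    if behaviour is None:
        print("no behaviour bound to key", key)
        return
    for channel, value in behaviour.items():   # composite behaviours trigger every channel
        send_to_robot({channel: value})

on_key("3")   # composite: speech plus gesture
```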
Technical Report
Full-text available
Eye contact is a crucial behaviour in human communication and therefore an essential feature in human-robot interaction. We present a study on the development of an eye behaviour model for a robotic tutor in a task-oriented environment, along with a description of how our proposed model is being used to implement an autonomous robot in the EMOTE project.
Article
Full-text available
The objective of the present study was to analyze the psychometric properties of the Interpersonal Reactivity Index (IRI) Spanish version. The scale was administered to one sample of parents (N = 692) and to two samples of college students (N = 1997 and N = 515). Results showed that the Spanish version of the IRI has psychometric properties similar to those of the original version and led to the conclusion that the IRI Spanish version is an adequate instrument for measuring empathy in Spain.
Technical Report
Full-text available
Emotions are a key element in all human interactions. It is well documented that individual- and group-level interactions have different emotional expressions, and humans are by nature extremely competent at perceiving, adapting, and reacting to them. However, when developing social robots, emotions are not so easy to cope with. In this paper we introduce the concept of emotional climate applied to human-robot interaction (HRI) to define a group-level emotional expression at a given time. By doing so, we move one step further in developing a new tool that deals with group emotions within HRI.
Conference Paper
Full-text available
There have been some studies applying robots to education, and recent research on socially intelligent robots shows robots as partners that collaborate with people. On the other hand, serious games and interaction technologies have also proved to be important pedagogical tools, enhancing collaboration and interest in the learning process. This paper relates to the collaborative scenario in the EMOTE EU FP7 project, and its main goal is to develop and present the dialogue dimensions for a robotic tutor in a collaborative learning scenario grounded in human studies. Overall, seven dialogue dimensions of teacher-student interaction were identified from data collected over 10 sessions of a collaborative serious game. Preliminary results regarding the teachers' perspective on the students' interaction suggest that student collaboration led to learning during the game. Moreover, students seem to have learned a number of concepts as they played the game. We also present the protocol that was followed for the purposes of future data collection in human-human and human-robot interaction in similar scenarios.
Article
The main aim of this study was to analyze the psychometric properties of the Interpersonal Reactivity Index (IRI) (Davis, 1980, 1983) in its Spanish adaptation. It is one of the most widely used questionnaires for evaluating empathy from a multidimensional perspective that includes two cognitive factors and two emotional ones. The Spanish adaptation was carried out with a large sample of students from different educational centers of the Valencian Community (1,285 adolescents, 698 males and 597 females, with an age range between 13 and 18 years). The results of this study indicate the validity of the instrument for evaluating the different components of empathy.
Article
The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems, the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms. While reinforcement learning agents have achieved some successes in a variety of domains, their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters. This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
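For concreteness, a minimal sketch of the deep Q-network update summarised in this abstract is given below; it is not DeepMind's code, and the network sizes, dimensions and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the deep Q-network update (not DeepMind's code): an online
# network is regressed towards targets computed with a slowly updated target
# network over minibatches drawn from a replay buffer. Sizes are illustrative.
import random
from collections import deque
import torch
import torch.nn as nn

obs_dim, n_actions, gamma = 8, 4, 0.99
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())     # target starts as a copy of the online net
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                       # filled with (s, a, r, s2, done) tuples elsewhere

def train_step(batch_size=32):
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    s, a, r, s2, done = (torch.as_tensor(x) for x in zip(*batch))
    q = q_net(s.float()).gather(1, a.long().unsqueeze(1)).squeeze(1)   # Q(s, a)
    with torch.no_grad():                                              # bootstrapped target
        target = r.float() + gamma * target_net(s2.float()).max(1).values * (1 - done.float())
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```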
Article
Many researchers use Wizard of Oz (WoZ) as an experimental technique, but there are methodological concerns over its use, and no comprehensive criteria on how to best employ it. We systematically review 54 WoZ experiments published in the primary HRI publication venues from 2001-2011. Using criteria proposed by Fraser and Gilbert (1991), Green et al. (2004), Steinfeld et al. (2009), and Kelley (1984), we analyzed how researchers conducted HRI WoZ experiments. Researchers mainly used WoZ for verbal (72.2%) and non-verbal (48.1%) processing. Most constrained wizard production (90.7%), but few constrained wizard recognition (11%). Few reported measuring wizard error (3.7%), and few reported pre-experiment wizard training (5.4%). Few reported using WoZ in an iterative manner (24.1%). Based on these results we propose new reporting guidelines to aid future research.
Conference Paper
Building on existing work on artificial tutors with human-like capabilities, we describe the EMOTE project's approach to harnessing the benefits of an artificial embodied tutor in a shared physical space. Embodied in robotic platforms or through virtual agents, EMOTE aims to capture some of the empathic and human elements characterising a traditional teacher. As such, empathy and engagement, abilities key to influencing student learning, are at the core of the EMOTE approach. We present non-verbal and adaptive dialogue challenges for such embodied tutors as a foundation for researchers investigating the potential for empathic tutors that will be accepted by students and teachers.