Robot Theory of Mind with Reverse Psychology
Chuang Yu
Cognitive Robotics Lab,
The University of Manchester, UK
chuang.yu@manchester.ac.uk
Baris Serhan
Cognitive Robotics Lab,
The University of Manchester, UK
baris.serhan@manchester.ac.uk
Marta Romeo
Cognitive Robotics Lab,
The University of Manchester, UK
marta.romeo@manchester.ac.uk
Angelo Cangelosi
Cognitive Robotics Lab,
The University of Manchester, UK
angelo.cangelosi@manchester.ac.uk
Figure 1: The pipeline of robot theory of mind with reverse psychology. Player P1 and the robot work together as a team while player P2 plays by itself. Based on reinforcement learning, the robot learns how to decide which suggestions to give for a better team performance. We assume that the robot knows the actions of P2. However, player P1 is unaware that the robot has this extra knowledge. When player P1 possesses a false belief about the robot's performance, corresponding to a low trust level, the optimal robot policy uses reverse psychology to give opposite advice to encourage P1 to do what the robot desires for a better team performance.
ABSTRACT
Theory of mind (ToM) corresponds to the human ability to infer
other people’s desires, beliefs, and intentions. Acquisition of ToM
skills is crucial to obtain a natural interaction between robots and
humans. A core component of ToM is the ability to attribute false
beliefs. In this paper, a collaborative robot tries to assist a human
partner who plays a trust-based card game against another human.
The robot infers its partner’s trust in the robot’s decision system via
reinforcement learning. Robot ToM refers to the ability to implicitly
anticipate the human collaborator’s strategy and inject the predic-
tion into its optimal decision model for a better team performance.
In our experiments, the robot learns when its human partner does
not trust the robot and consequently gives recommendations in its optimal policy to ensure the effectiveness of team performance. The interesting finding is that the optimal robotic policy attempts to use reverse psychology on its human collaborator when trust is low. This finding will provide guidance for the study of a trustworthy robot decision model with a human partner in the loop.
CCS CONCEPTS
• Human-centered computing → Interaction design; • Computer systems organization → Robotics.
KEYWORDS
robot theory of mind, reverse psychology, human-robot trust
ACM Reference Format:
Chuang Yu, Baris Serhan, Marta Romeo, and Angelo Cangelosi. 2023. Robot Theory of Mind with Reverse Psychology. In Companion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction (HRI '23 Companion), March 13–16, 2023, Stockholm, Sweden. ACM, 3 pages. https://doi.org/10.1145/3568294.3580144
1 INTRODUCTION
Theory of mind (ToM) relates to the ability to attribute mental states to ourselves and others. Human ToM plays a crucial role in cognitive development and natural social interaction [3]. To build advanced intelligent robot partners, researchers are trying to empower robots with this cognitive skill [1, 8, 9]. With the human in the loop, Romeo et al. [8] explored how robot ToM influences human decision-making in a challenging maze game. Without the human in the loop, Chen et al. [1] built an AI observer to study visual behavior modeling for robot ToM. Their AI observer could predict a robot actor's future trajectory given only one image frame showing the actor robot's initial scene. ToM can be formalized as inverse reinforcement learning [5]. Reinforcement learning (RL) tries to find the optimal policy with the guidance of a reward function. In contrast, based on the observed behavior history of the agent or its policy, inverse reinforcement learning (IRL) recovers the reward function of the agent. In [5], the reward function, policy, and world model in RL are mapped respectively to desires, intentions, and beliefs in the ToM model. Rabinowitz et al. [7] came up with an artificial ToM model, namely ToMnet, which uses meta learning to learn the policy of other agents through their behavior history. The ToMnet model implicitly revealed the agent's false beliefs (a vital component of human ToM) about the world.
Reverse psychology refers to a manipulative behavior through which one agent steers another individual's behavior toward what it desires. By using reverse psychology, an agent tries to make the interactor behave as wanted by advocating the opposite of the desired behavior. Guo et al. [4] show the phenomenon in which the optimal robot policy tries to exploit reverse psychology in a reconnaissance mission within a human-robot team. The paper also proposes two trust-behavior models and investigates how the different models affect a robot's optimal policy and HRI team performance.
In our paper, we explore robot ToM through reverse psychology in a multiagent "Cheat game" scenario. From our experimental results, we found that when player P1 has a false belief about the robot's performance, corresponding to a low trust level, the optimal robot policy uses reverse psychology, giving the opposite advice to encourage player P1 to do what the robot desires for a better team performance. This implicitly certifies that the robot successfully learned the false belief state of its human partner, a core component of ToM.
2 METHODS
The “Cheat game” experimental setting is shown in Fig. 1. The
“Cheat game” is a card game with multiple players who try to get
rid of all their cards. In each round, one player discards one or more
cards face down and declares the number and the rank of the cards.
They can either tell the truth or lie about the rank and number of
cards they are playing. At this point, the opponents can decide to
call “cheat” on the hand that was just played. If “cheat” is called,
the hand is revealed and, if the player who had declared it lied, they
collect all the cards on the table. If they did not lie, whoever called
"cheat" collects the cards on the table. The game ends when one player wins by getting rid of all of their cards.
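For concreteness, the following sketch simulates how a single "cheat" call could be resolved in one round; the data structures and function names are illustrative assumptions rather than the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Player:
    name: str
    hand: List[str] = field(default_factory=list)   # cards (ranks) held by the player

def resolve_call(declarer: Player, caller: Player, played: List[str],
                 declared_rank: str, table_pile: List[str]) -> None:
    """Resolve a 'cheat' call: the liar (or the mistaken caller) picks up the pile."""
    table_pile.extend(played)
    lied = any(card != declared_rank for card in played)  # did the declarer cheat?
    loser = declarer if lied else caller                   # the wrong party collects the pile
    loser.hand.extend(table_pile)
    table_pile.clear()

# Illustrative round: P2 declares two '7's but actually plays a '7' and a 'K'.
p1, p2 = Player("P1"), Player("P2", hand=["7", "K", "9"])
played = [p2.hand.pop(0), p2.hand.pop(0)]  # P2 discards two cards face down
resolve_call(declarer=p2, caller=p1, played=played, declared_rank="7", table_pile=[])
print(len(p2.hand))  # -> 3: P2 picks the pile back up because the declaration was a lie
```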
We present a modied version with one robot and two human
players,
𝑃1
(co-playing with the robot) and
𝑃2
. At the start, each
player has 12 cards randomly chosen from a whole deck of cards.
Player
𝑃1
collaborates with the robot as a team during the multia-
gent game. The robot will give its advice (whether to call cheat or
not on
𝑃2
) to the human player
𝑃1
, guided by the maximal collabora-
tion performance. We assume that the robot knows the actions of
𝑃2
but
𝑃1
does not know that the robot has this additional knowledge.
The robot policy can be modeled as a POMDP (Partially Observable Markov Decision Process). The states are: the card situation of P1's hand (for example, how many cards of the rank claimed by P2 are in P1's hand), the number of cards claimed by P2, the action of P2 (cheat or not cheat), and the belief over P1's trust in the robot's performance (the belief is modeled with the shape parameters of a beta distribution, as in [4]). P1's trust in the robot is continuous in the range 0 to 1. As P1's trust cannot be directly observed by the robot, the robot maintains a belief state over trust based on its performance. We define the world state s ∈ S, the robot action a_R ∈ A_R, and the P1 action a_P1 ∈ A_P1. The state changes with a transition probability P(s' | s, a_R, a_P1). After the state changes, a reward r(s, a_R, a_P1, s') is received. The robot reward function is based on the change in the number of cards in P1's and P2's hands in each round of the game.
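As a rough illustration of this formulation (not the exact model used in the experiments), the sketch below encodes the state components described above, a point estimate of P1's trust from the beta-distribution belief, and a per-round reward based on the change in hand sizes; all names, the sign convention, and the pseudo-count trust update are assumptions.

```python
from dataclasses import dataclass

@dataclass
class RobotState:
    matching_cards_in_p1_hand: int  # cards of the claimed rank that P1 holds
    num_claimed_by_p2: int          # how many cards P2 claims to have played
    p2_cheated: bool                # P2's true action, known to the robot only
    trust_alpha: float              # beta-distribution shape parameters encoding
    trust_beta: float               # the belief over P1's trust in the robot

    def expected_trust(self) -> float:
        # Mean of Beta(alpha, beta): point estimate of P1's trust in [0, 1]
        return self.trust_alpha / (self.trust_alpha + self.trust_beta)

def round_reward(p1_cards_before: int, p1_cards_after: int,
                 p2_cards_before: int, p2_cards_after: int) -> float:
    # Team-performance reward from the change in hand sizes:
    # P1 shedding cards is good, P2 gaining cards is good (assumed signs).
    return (p1_cards_before - p1_cards_after) + (p2_cards_after - p2_cards_before)

def update_trust_belief(state: RobotState, robot_was_right: bool) -> RobotState:
    # Assumed pseudo-count update: correct advice raises trust, wrong advice lowers it.
    if robot_was_right:
        state.trust_alpha += 1.0
    else:
        state.trust_beta += 1.0
    return state
```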
We developed a simulation of the multiagent Cheat game to record data so that the robot could learn the optimal policy for giving recommendations that maximise the team performance. In the simulation, P1 and P2 are simulated human players. The behavior of P1 is modeled with a trust dynamics model and a trust-based policy model [2, 4]. During the simulation for the data recording, the robot player always performed random actions (gave random recommendations). In total, 8000 games were recorded. The dataset was divided into 4800 games for training, 1600 for validation, and 1600 for testing. Offline RL algorithms can effectively learn a policy from previously collected static datasets without further interaction, which is convenient for RL models in human-robot interaction. Hence, this paper used an offline reinforcement learning model, namely Conservative Q-Learning (CQL) [6], to learn the optimal robot policy. The CQL model minimizes the action-values under the current policy and maximizes them under the data distribution to overcome the overestimation of out-of-distribution action values.
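A minimal sketch of this offline training step, assuming the logged games have already been flattened into transition arrays and using the d3rlpy 1.x-style API (class and argument names may differ across library versions; the file names are placeholders):

```python
import numpy as np
from d3rlpy.dataset import MDPDataset
from d3rlpy.algos import DiscreteCQL

# Logged transitions from the random-recommendation robot (shapes are illustrative).
observations = np.load("cheat_game_obs.npy")      # (N, state_dim) world-state features
actions = np.load("cheat_game_actions.npy")       # (N,) robot recommendations (0/1)
rewards = np.load("cheat_game_rewards.npy")       # (N,) per-round team reward
terminals = np.load("cheat_game_terminals.npy")   # (N,) 1 where a game ended

dataset = MDPDataset(observations, actions, rewards, terminals)

# Conservative Q-Learning for discrete actions: penalizes Q-values of
# out-of-distribution actions so the learned policy stays close to the data.
cql = DiscreteCQL(learning_rate=3e-4, batch_size=256)
cql.fit(dataset, n_epochs=100)

cql.save_model("robot_cql_policy.pt")
```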
3 RESULTS
Our oine RL model is completed with the
𝑑
3
𝑟𝑙 𝑝𝑦
library [
10
].
After training the oine reinforcement learning model, the robot
learns how to decide which suggestion to give to maximize the
team performance. We tested our trained RL model (trained for
3963 epoches) in the simulation.
𝑃2
plays the same actions as in
the random policy case, in order to compare the dierence of team
performance between a random policy and optimal policy in the
same setting. The results are shown in Figure 2. The results show
that the optimal robot decision model (robot CQL policy) indeed
uses reverse psychology for a better team performance. For team
performance, the robot CQL policy gets an accuracy of 77.3% while
the random model 63.3%. We dene as accuracy how many times
𝑃1
guesses correctly the actions of
𝑃2
, after hearing the robot rec-
ommendation. When
𝑃2
cheats and
𝑃1
has low trust in the robot
(trust < 0.5), the optimal robot model mostly suggests to
𝑃1
not to
call “cheat” and
𝑃1
does not follow the robot and nally get the
right guess on
𝑃2
action (nearly 400 times). However, the robot
Robot Theory of Mind with Reverse Psychology HRI ’23 Companion, March 13–16, 2023, Stockholm, Sweden
Figure 2: The testing results on trained RL policy and random policy.
𝐴𝑐𝑡𝑖𝑜𝑛_𝑃
2”cheat” or ”not cheat” refers to whether
𝑃2
cheats on
𝑃1
.
𝐴𝑐𝑡𝑖𝑜𝑛_𝑃
1”cheat” or ”not cheat” refers to whether
𝑃1
calls ”cheat” or not. The reverse psychology phenomenon in
RL model testing happens when
𝑃1
trust is low. For example, in the top left subgure, the robot mostly advises not to cheat, and
𝑃1chooses to cheat.
random policy (not trained on the aquired dataset) does not use
reverse psychology in the same situation. When
𝑃2
cheats and
𝑃1
has high trust in the robot (trust > 0.5), the robot does not use
reverse psychology and advises to call “cheat”,
𝑃1
mostly follows
the advice in this case (more than 400 times). Similar results are
reached when 𝑃2does not lie when playing their hand.
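To make the accuracy metric above concrete, a minimal sketch of how it could be computed from logged test rounds follows; the record format is an assumption for illustration only.

```python
from typing import Iterable, Tuple

def team_accuracy(rounds: Iterable[Tuple[bool, bool]]) -> float:
    """Each round is (p2_cheated, p1_called_cheat); P1 is correct when the call
    matches P2's true action, regardless of whether the robot's advice was followed."""
    rounds = list(rounds)
    correct = sum(1 for p2_cheated, p1_called in rounds if p1_called == p2_cheated)
    return correct / len(rounds) if rounds else 0.0

# Toy example: P1 guesses right in 3 of 4 rounds -> 0.75
print(team_accuracy([(True, True), (False, False), (True, False), (False, False)]))
```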
4 DISCUSSION AND FUTURE WORKS
In this paper, we certied the existence of reverse psychology in
an optimal robot policy during a multiagent trust-based card game.
When the human trust in the robot is low, the optimal robot policy
using ToM uses reverse psychology for a better team performance.
However, this might be hazardous to long-term human-robot inter-
action. If the human interactor realizes that reverse psychology is
being used trust will collapse. We should take trust into account in
the robot reward function to avoid this collapse, which in turn can
damage the team’s performance. Hence, the robot model should
aim to strike a balance between trust maintenance and team per-
formance. How to reach this balance will be explored in future
work. Finally, this paper’s limitation is that all results are based on
simulated data. We will explore human-joined experiments in the
future.
ACKNOWLEDGMENTS
This work was funded and supported by the UKRI TAS Node on
Trust (EP/V026682/1) and the project THRIVE/THRIVE++ (FA9550-
19-1-7002).
REFERENCES
[1] Boyuan Chen, Carl Vondrick, and Hod Lipson. 2021. Visual behavior modelling for robotic theory of mind. Scientific Reports 11, 1 (2021), 1–14.
[2] Min Chen, Stefanos Nikolaidis, Harold Soh, David Hsu, and Siddhartha Srinivasa. 2020. Trust-aware decision making for human-robot collaboration: Model learning and planning. ACM Transactions on Human-Robot Interaction (THRI) 9, 2 (2020), 1–23.
[3] Jerry A Fodor. 1992. A theory of the child's theory of mind. Cognition (1992).
[4] Yaohui Guo, Cong Shi, and Xi Jessie Yang. 2021. Reverse psychology in trust-aware human-robot interaction. IEEE Robotics and Automation Letters 6, 3 (2021), 4851–4858.
[5] Julian Jara-Ettinger. 2019. Theory of mind as inverse reinforcement learning. Current Opinion in Behavioral Sciences 29 (2019), 105–110.
[6] Aviral Kumar, Aurick Zhou, George Tucker, and Sergey Levine. 2020. Conservative Q-learning for offline reinforcement learning. Advances in Neural Information Processing Systems 33 (2020), 1179–1191.
[7] Neil Rabinowitz, Frank Perbet, Francis Song, Chiyuan Zhang, S. M. Ali Eslami, and Matthew Botvinick. 2018. Machine theory of mind. In International Conference on Machine Learning. PMLR, 4218–4227.
[8] Marta Romeo, Peter E McKenna, David A Robb, Gnanathusharan Rajendran, Birthe Nesset, Angelo Cangelosi, and Helen Hastie. 2022. Exploring theory of mind for human-robot collaboration. In 2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN). IEEE, 461–468.
[9] Brian Scassellati. 2002. Theory of mind for a humanoid robot. Autonomous Robots 12, 1 (2002), 13–24.
[10] Takuma Seno and Michita Imai. 2022. d3rlpy: An Offline Deep Reinforcement Learning Library. Journal of Machine Learning Research 23, 315 (2022), 1–20. http://jmlr.org/papers/v23/22-0017.html