Modeling Theory of Mind and Cognitive Appraisal with Decision-Theoretic Agents
ABSTRACT Agent-based simulation of human social behavior has become increasingly important as a basic research tool to further our understanding of social behavior, as well as to create virtual social worlds used to both entertain and educate. A key factor in human social interaction is our beliefs about others as intentional agents, a Theory of Mind. How we act depends not only on the immediate effect of our actions but also on how we believe others will react. In this paper, we discuss PsychSim, an implemented multiagent-based simulation tool for modeling social interaction and influence. While typical approaches to such modeling have used first-order logic, PsychSim agents have their own decision-theoretic models of the world, including beliefs about their environment and recursive models of other agents. Using these quantitative models of uncertainty and preferences, we have translated existing psychological theories into a decision-theoretic semantics that allow the agents to reason about degrees of believability in a novel way. We demonstrate the expressiveness of PsychSim's decision-theoretic implementation of Theory of Mind by presenting its use as the foundation for a domain-independent model of appraisal theory, the leading psychological theory of emotion. The model of appraisal within PsychSim demonstrates the key role of a Theory of Mind capacity in appraisal and social emotions, as well as arguing for a uniform process for emotion and cognition.
David V. Pynadath1, Mei Si2, and Stacy C. Marsella1
1Institute for Creative Technologies, University of Southern California
12015 Waterfront Drive, Playa Vista, CA 90094-2536 USA
2Cognitive Science Department, Rensselaer Polytechnic Institute
110 8th Street, Troy, NY 12180 USA
email@example.com, firstname.lastname@example.org, email@example.com
April 7, 2011
behavior and a range of applications. For example, computational models of psychological or sociological theories
promise to transform how theories of human behavior are formulated and evaluated. In addition, computational
models of human social interaction have become increasingly important as a means to create simulated social
environments used in a variety of training and entertainment applications. For example, many serious games have
We argue that such models of social interaction must address the fact that people interact within a complex social
framework. A central factor in social interaction is the beliefs we have about each other, a Theory of Mind.
Our choice of action is influenced by how we believe others will feel and react. Whether we believe what we are
told depends not only on the content of the communication but also on our model of the communicator. How we
emotionally react to another’s action is influenced by our beliefs as well, for example whether we believe he or she
intended to cause harm. The central goal of our research is to bring such Theory of Mind capacities to the design
of computational models of social interaction both as a basic research tool and as a framework for virtual character
Unfortunately, traditional artificial intelligence techniques are ill-suited for modeling Theory of Mind. Representa-
tions using first-order logic are often insensitive to the distinctions among conflicting goals that people must balance in
a social interaction. For example, psychological research has identified a range of goals that motivate classroom bullies
(e.g., peer approval, sadism, tangible rewards). Different bullies may share the same goals, but the relative priorities
that they place on them will lead to variations in their behavior. Resolving the ambiguity among equally possible,
but unequally plausible or preferred, options requires a quantitative model of uncertainty and preference. Unfortu-
nately, more quantitative frameworks, like decision theory and game theory, face their own difficulties in modeling
human psychology. Game-theoretic frameworks typically rely on concepts of equilibria that people rarely achieve
in an unstructured social setting like a classroom. Decision-theoretic frameworks typically rely on assumptions of
rationality that people violate.
We have developed a social simulation framework, PsychSim [20, 27], that operationalizes existing psychological
theories as boundedly rational computations to generate more plausibly human behavior. PsychSim allows a user to
quickly construct a social scenario where a diverse set of entities, groups or individuals, interact and communicate.
Each entity has its own preferences, relationships (e.g., friendship, hostility, authority) with other entities, private
beliefs, and mental models about other entities. The simulation tool generates the behavior for these entities and
provides explanations of the result in terms of each entity’s preferences and beliefs. The richness of the entity models
allows one to explore the potential consequences of minor variations on the scenario.
A central aspect of the PsychSim design is that agents have fully specified decision-theoretic models of others.
Such quantitative recursive models give PsychSim a powerful mechanism to model a range of factors in a principled
way. For instance, we exploit this recursive modeling to allow agents to form complex attributions about others,
send messages that include the beliefs and preferences of others, and use their observations of another’s behavior to
influence their model of that other.
In operationalizing psychological theories within PsychSim, we have taken a strong architectural stance. We
assume that decision-theoretic agents that incorporate a Theory of Mind provide a uniform, sufficient computational
core for modeling the factors relevant to human social interaction. While the sufficiency of our framework remains an
open question, such a strong stance yields the benefit of uniform processes and representations that cover a range of
phenomena. Our stance thus provides subsequent computational benefits, such as optimization and reuse of the core
algorithms that provide the agent’s decision-making and belief revision capacities.
More significantly, this uniformity begins to reveal common elements across apparently disparate psychological
phenomena that typically have different methodological histories. To illustrate such common elements, we have
demonstrated how a range of human psychological and social phenomena can be modeled within our framework,
including wishful thinking, influence factors, childhood aggression, and emotion.
In this article, we discuss two of those models. First, we use a model of childhood aggression to motivate the
discussion of the overall framework as well as to demonstrate its expressiveness. Second, in keeping with the theme
of this volume, we go into considerable detail on how PsychSim’s decision-theoretic agents with a Theory of Mind
provide a particularly effective basis for a computational model of emotion.
Computational models of emotion have largely been based on appraisal theory [5, 7, 18, 22, 6, 23, 28, 47], a leading
psychological theory of emotion. Appraisal theory argues that a person’s subjective assessment of their relationship
to the environment determines his or her emotional responses [8, 13, 14, 23, 29, 30, 41, 42]. This assessment occurs
along several dimensions, such as motivational congruence, accountability, novelty and control. For example, an
event that leads to a bad outcome for a person (motivationally incongruent) and is believed to be caused by others
(accountability) is likely to elicit an anger response. On the other hand, if the event is believed to be caused by the
person himself/herself, he/she is more likely to feel guilt or regret.
We approach the task of incorporating appraisal into the existing PsychSim multiagent framework as a form of
thought experiment: Can we leverage the existing processes and representations in PsychSim to model appraisal? The
motivations for this thought experiment are three-fold. First, we seek to demonstrate overlap between the theoretical
model of appraisal theory and decision-theoretic, social agents of PsychSim. Specifically, we are interested in whether
appraisal offers a possible blueprint, or requirements specification, for intelligent social agents by showing that an
existing framework not predesigned with emotion or appraisal in mind has, in fact, appraisal-like processes already
Conversely, we seek to illustrate the critical role that subjective beliefs about others play in allowing agents to
model social emotions. Because the agent’s representations and decision-making processes are rooted in a Theory
of Mind capacity, incorporating and maintaining beliefs about others, the appraisal process inherits this social frame,
allowing the agent to appraise events from its own perspective as well as from the perspectives of others. Thus, in keeping with the tenets of
social appraisal, the behaviors, thoughts and emotions of the other can also be appraised and thereby influence
Finally, we seek a design that is elegant, by reusing architectural features to realize new capabilities such as emo-
tion. Alternative approaches for creating embodied conversational agents and virtual agents often integrate separate
modules for emotion, decision-making, dialogue, etc., which leads to sophisticated but complex architectures.
The work here is an alternative minimalist agenda for agent design. In particular, based on the core theory of mind
reasoning processes, appraisal can be derived with few extensions.
We begin the paper with a demonstration of PsychSim’s application to a childhood aggression scenario. We then
discuss how PsychSim can represent appraisal theory and present a preliminary assessment of its implementation.
2 The Agent Models
This section describes PsychSim’s underlying architecture, using a school bully scenario for illustration. The agents
represent different people and groups in the school setting. The user can analyze the simulated behavior of the students
to explore the causes and cures for school violence. One agent represents a bully, and another represents the student
who is the target of the bully’s violence. A third agent represents the group of onlookers, who encourage the bully’s
exploits by, for example, laughing at the victim as he is beaten up. A final agent represents the class’s teacher trying
to maintain control of the classroom, for example by doling out punishment in response to the violence. We embed
PsychSim’s agents within a decision-theoretic framework for quantitative modeling of multiple agents. Each agent
maintains its independent beliefs about the world, has its own goals, and has its own policies for achieving those goals.
2.1 Model of the World
Each agent model starts with a representation of its current state and the Markovian process by which that state evolves
over time in response to the actions performed.
Each agent model includes several features representing its “true” state. This state consists of objective facts about
the world, some of which may be hidden from the agent itself. For our example bully domain, we included such
state features as power(agent), to represent the strength of an agent. trust(truster,trustee) represents
the degree of trust that the agent truster has in another agent trustee’s messages. support(supporter,
supportee) is the strength of support that an agent supporter has for another agent supportee. We represent
the state as a vector, $\vec{s}^t$, where each component corresponds to one of these state features and has a value in the range
Agents have a set of actions that they can choose to change the world. An action consists of an action type (e.g.,
punish), an agent performing the action (i.e., the actor), and possibly another agent who is the object of the action.
For example, the action laugh(onlooker, victim) represents the laughter of the onlooker directed at the victim.
The state of the world changes in response to the actions performed by the agents. We model these dynamics using a
transition probability function, $T(\vec{s}_i, \vec{a}, \vec{s}_f)$, to capture the possibly uncertain effects of these actions on the subsequent state:
$$\Pr(\vec{s}^{t+1} = \vec{s}_f \mid \vec{s}^t = \vec{s}_i, \vec{a}^t = \vec{a}) = T(\vec{s}_i, \vec{a}, \vec{s}_f)$$
For example, the bully’s attack on the victim impacts the power of the bully, the power of the victim, etc. The
distribution over the bully’s and victim’s changes in power is a function of the relative powers of the two—e.g., the
larger the power gap that the bully enjoys over the victim, the more likely the victim is to suffer a big loss in power.
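As a minimal sketch of such a transition function, the Python below samples the victim's power after an attack, with the chance of a big loss growing with the bully's power advantage. The names `big_loss_probability` and `attack_transition`, the two loss sizes, and the linear mapping from gap to probability are illustrative assumptions, not PsychSim's actual dynamics.

```python
import random

def big_loss_probability(power_gap):
    """Chance that the victim suffers a big loss (assumed linear in the gap)."""
    return min(1.0, max(0.0, 0.5 + 0.5 * power_gap))

def attack_transition(state, rng):
    """Sample one successor state for attack(bully, victim)."""
    gap = state["power(bully)"] - state["power(victim)"]
    # Hypothetical loss sizes: 0.4 for a big loss, 0.1 for a small one.
    loss = 0.4 if rng.random() < big_loss_probability(gap) else 0.1
    successor = dict(state)  # leave the current state untouched
    successor["power(victim)"] = state["power(victim)"] - loss
    return successor

state = {"power(bully)": 0.8, "power(victim)": 0.2}
successor = attack_transition(state, random.Random(0))
```

Sampling many successors from the same state would approximate the distribution that T assigns to that state and action.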
PsychSim’s decision-theoretic framework represents an agent’s incentives for behavior as a reward function that maps
the state of the world into a real-valued evaluation of benefit for the agent. We separate components of this reward
function into two types of subgoals. A goal of Minimize/maximize feature(agent) corresponds to a nega-
tive/positive reward proportional to the value of the given state feature. For example, an agent can have the goal of
maximizing its own power. A goal of Minimize/maximize action(actor, object) corresponds to a nega-
tive/positive reward proportional to the number of matching actions performed. For example, the teacher may have
the goal of minimizing the number of times any student teases any other.
We can represent the overall preferences of an agent, as well as the relative priority among them, as a vector of
weights, $\vec{g}$, so that the product, $\vec{g} \cdot \vec{s}^t$, quantifies the degree of satisfaction that the agent receives from the world, as
represented by the state vector, $\vec{s}^t$. For example, in the school violence simulation, the bully’s reward function consists
of goals of maximizing power(bully), minimizing power(victim), and maximizing laugh(onlookers,
victim). By modifying the weights on the different goals, we can alter the motivation of the agent and, thus, its
behavior in the simulation.
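The weighted-goal reward can be sketched in a few lines of Python. The feature names follow the bully example, while the numeric weights and the two candidate states are invented purely for illustration.

```python
def reward(goal_weights, state):
    """Dot product of goal weights and state features (missing features count as 0)."""
    return sum(w * state.get(feature, 0.0) for feature, w in goal_weights.items())

# Two hypothetical outcomes: a private attack versus teasing in front of onlookers.
after_attack = {"power(bully)": 0.75, "power(victim)": 0.25, "laugh(onlookers,victim)": 0.0}
after_tease = {"power(bully)": 0.5, "power(victim)": 0.5, "laugh(onlookers,victim)": 1.0}

# Same goals, different relative priorities (assumed weights).
dominance_seeking = {"power(bully)": 1.0, "power(victim)": -1.0, "laugh(onlookers,victim)": 0.25}
attention_seeking = {"power(bully)": 0.25, "power(victim)": -0.25, "laugh(onlookers,victim)": 1.0}

prefers_attack = reward(dominance_seeking, after_attack) > reward(dominance_seeking, after_tease)
prefers_tease = reward(attention_seeking, after_tease) > reward(attention_seeking, after_attack)
```

Under these assumed numbers the two bullies share the same goals yet rank the two outcomes differently, which is the effect of re-weighting described above.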
2.3 Beliefs about Others
As described by Sections 2.1 and 2.2, the overall decision problem facing a single agent maps easily into a partially
observable Markov decision problem (POMDP). Software agents can solve such a decision problem using existing
algorithms to form their beliefs and then determine the action that maximizes their reward given those beliefs.
However, we do not expect people to conform to such optimality in their behavior. Thus, we have taken the POMDP
algorithms as our starting point and modified them in a psychologically motivated manner to capture more human-like
behavior. This “bounded rationality” better captures the reasoning of people in the real world, as well as providing the
additional benefit of avoiding the computational complexity incurred by an assumption of perfect rationality.
The agents have only a subjective view of the world, where they form beliefs, $\vec{b}^t$, about what they think is the state of
the world, $\vec{s}^t$. Agent A’s beliefs about agent B have the same structure as the real agent B. Thus, our agent belief
models follow a recursive structure, similar to previous work on game-theoretic agents. Of course, the nesting
of these agent models is potentially unbounded. However, although infinite nesting is required for modeling optimal
behavior, people rarely use such deep models. In our school violence scenario, we found that 2-level nesting was
sufficiently rich to generate the desired behavior. Thus, the agents model each other as 1-level agents, who, in turn,
model each other as 0-level agents, who do not have any beliefs. Thus, there is an inherent loss of precision (but with
a gain in computational efficiency) as we move deeper into the belief structure.
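The bounded nesting can be pictured as a recursive data structure. The sketch below is an assumption-laden illustration: the class `AgentModel` and the helpers `build_model` and `nesting_depth` are hypothetical names, and the belief contents are left empty.

```python
from dataclasses import dataclass, field

@dataclass
class AgentModel:
    name: str
    beliefs: dict = field(default_factory=dict)           # subjective state features
    models_of_others: dict = field(default_factory=dict)  # other's name -> AgentModel

def build_model(name, others, depth):
    """Build a depth-level model; level-0 models hold no models of anyone."""
    model = AgentModel(name)
    if depth > 0:
        for other in others:
            # From the other's point of view, "others" is everyone except itself.
            rest = [n for n in others if n != other] + [name]
            model.models_of_others[other] = build_model(other, rest, depth - 1)
    return model

def nesting_depth(model):
    """Number of belief levels below this model."""
    if not model.models_of_others:
        return 0
    return 1 + max(nesting_depth(m) for m in model.models_of_others.values())

bully = build_model("bully", ["victim", "teacher"], depth=2)
```

Here `bully.models_of_others["teacher"].models_of_others["bully"]` is the bully's model of the teacher's model of the bully, which is a level-0 model and therefore carries no further nesting.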
For example, an agent’s beliefs may include its subjective view on states of the world: “The bully believes that the
teacher is weak”, “The onlookers believe that the teacher supports the victim”, or “The bully believes that he/she is
powerful.” These beliefs may also include its subjective view on beliefs of other agents: “The teacher believes that the
bully believes the teacher to be weak.” An agent may also have a subjective view of the preferences of other agents:
“The teacher believes that the bully has a goal to increase his power.” It is important to note that we also separate an
agent’s subjective view of itself from the real agent. We can thus represent errors that the agent has in its view of itself
(e.g., the bully believes himself to be stronger than he actually is).
Actions affect the beliefs of agents in several ways. For example, the bully’s attack may alter the beliefs that agents
have about the state of the world—such as beliefs about the bully’s power. Each agent updates its beliefs according to
its subjective beliefs about the world dynamics. It may also alter the beliefs about the bully’s preferences and policy.
We discuss the procedure of belief update in Section 2.4.
2.3.2 Policies of Behavior
Each agent’s policy is a function, $\pi(\vec{b})$, that represents the process by which it selects an action or message based on
its beliefs. An agent’s policy allows us to model critical psychological distinctions such as reactive vs. deliberative
simulating the behavior of the other agents and the dynamics of the world in response to the selected action/message.
Each agent i computes a quantitative value, $V_a(\vec{b}_i^t)$, of each possible action, a, given its beliefs, $\vec{b}_i^t$:
$$V_a(\vec{b}_i^t) = \vec{g}_i \cdot \vec{b}_i^t + \cdots$$
The agent computes the posterior probability of subsequent belief states ($\Pr(\vec{b}^{t+1})$) by using the transition function, T,
to project the immediate effect of the action, a, on its beliefs. It then projects another N steps into the future, weighing
each state against its goals, $\vec{g}$. At the first step, agent i uses its model of the policies of all of the other agents, $\pi_{\neg i}$,
and, in subsequent steps, it uses its model of the policies of all agents, including itself, $\pi$. Thus, the agent is seeking
to maximize the expected reward of its behavior as in a POMDP. However, PsychSim’s agents are only boundedly
rational, given that they are constrained, both by the finite horizon, N, of their lookahead and the possible error in
their belief state, $\vec{b}$. By varying N for different agents, we can model entities who display different degrees of reactive
vs. deliberative behavior in their thinking.
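A simplified sketch of bounded lookahead is given below. It makes several assumptions that the framework itself does not: the projection is deterministic (a single successor instead of a distribution over belief states), the modeled responses of the other agents are folded into a hypothetical `step` function, and the teacher's punishment, the numeric effects, and all function names are invented for illustration.

```python
def reward(goals, beliefs):
    """Weighted sum of the believed feature values named in the goals."""
    return sum(w * beliefs.get(f, 0.0) for f, w in goals.items())

def lookahead_value(beliefs, action, goals, step, self_policy, horizon):
    """Reward now, after `action`, and over `horizon` further projected steps."""
    total = reward(goals, beliefs)
    beliefs = step(beliefs, action)  # immediate effect of the candidate action
    total += reward(goals, beliefs)
    for _ in range(horizon):         # later steps follow the modeled policies
        beliefs = step(beliefs, self_policy(beliefs))
        total += reward(goals, beliefs)
    return total

def choose_action(beliefs, actions, goals, step, self_policy, horizon):
    return max(actions, key=lambda a: lookahead_value(
        beliefs, a, goals, step, self_policy, horizon))

def step(beliefs, action):
    """Toy dynamics: an attack pays off at once, punishment lands one step later."""
    nxt = dict(beliefs)
    if beliefs["pending_punishment"]:
        nxt["power(bully)"] -= 0.75
        nxt["pending_punishment"] = False
    if action == "attack":
        nxt["power(bully)"] += 0.25
        nxt["pending_punishment"] = True
    return nxt

start = {"power(bully)": 0.5, "pending_punishment": False}
goals = {"power(bully)": 1.0}
always_wait = lambda beliefs: "wait"

reactive_choice = choose_action(start, ["attack", "wait"], goals, step, always_wait, horizon=0)
deliberative_choice = choose_action(start, ["attack", "wait"], goals, step, always_wait, horizon=2)
```

With a horizon of 0 the toy bully sees only the immediate gain and attacks; with a horizon of 2 the delayed punishment enters the sum and it waits, which mirrors the reactive versus deliberative contrast obtained by varying N.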
2.3.3 Stereotypical Mental Models
If we applied this full lookahead policy within the nested models of the other agents, the computational complexity
of the top-level lookahead would quickly become infeasible as the number of agents grew. To simplify the agents’
reasoning, these mental models are realized as simplified stereotypes of the richer lookahead behavior models of the
agents themselves. For our simulation model of a bullying scenario, we have implemented mental models correspond-
ing to attention-seeking, sadistic, dominance-seeking, etc. For example, a model of an attention-seeking bully specifies
a high priority on increasing the approval (i.e., support) that the other agents have for it, a dominance-seeking bully
specifies a high priority on increasing its power as paramount, and a bully agent specifies a high priority on hurting
These simplified mental models also include potentially erroneous beliefs about the policies of other agents. Al-
though the real agents use lookahead exclusively when choosing their own actions (as described in Section 2.3.2),
the agents believe that the other agents follow much more reactive policies as part of their mental models of each
other. PsychSim models reactive policies as a table of “Condition⇒Action” rules. The left-hand side conditions may
trigger on an observation of some action or a belief of some agent (e.g., the bully believing himself as powerful). The
conditions may also be more complicated combinations of these basic triggers (e.g., a conjunction of conditions that
matches when each and every individual condition matches).
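A table of “Condition⇒Action” rules can be sketched as an ordered list of predicate and action pairs. The first-match semantics, the default action, the 0.5 threshold, and every name in the block below are assumptions chosen for illustration.

```python
def reactive_policy(rules, default):
    """Build a policy from an ordered table of (condition, action) rules."""
    def policy(beliefs, observation):
        for condition, action in rules:
            if condition(beliefs, observation):
                return action  # first matching rule wins (assumed semantics)
        return default
    return policy

def all_of(*conditions):
    """Conjunction: matches only when every individual condition matches."""
    return lambda beliefs, obs: all(c(beliefs, obs) for c in conditions)

# Basic triggers: one on a belief, one on an observed action.
believes_self_powerful = lambda beliefs, obs: beliefs.get("power(bully)", 0.0) > 0.5
saw_laughter = lambda beliefs, obs: obs == "laugh(onlookers, victim)"

bully_stereotype = reactive_policy(
    rules=[
        (all_of(believes_self_powerful, saw_laughter), "attack(bully, victim)"),
        (believes_self_powerful, "tease(bully, victim)"),
    ],
    default="wait",
)

action = bully_stereotype({"power(bully)": 0.8}, "laugh(onlookers, victim)")
```

Evaluating such a table costs one pass over the rules, in contrast to the projection over future states that lookahead requires.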
The use of these more reactive policies in the mental models that agents have of each other achieves two desirable
results. First, from a human modeling perspective, the agents perform a shallower reasoning that provides a more
accurate model of the real-world entities they represent. Second, from a computational perspective, the direct action
rules are cheap to execute, so the agents gain significant efficiency in their reasoning.
2.4 Modeling Influence and Belief Change
Messages are attempts by one agent to influence the beliefs of another. Messages have four components: source,
recipients, subject, and content. For example, the teacher (source) could tell the bully (recipient) that the principal
(subject of the message) will punish violence by the bully (content). Messages can refer to beliefs, preferences,
policies, or any other aspect of other agents. Thus, a message may make a claim about a state feature of the subject
(“the principal is powerful”), the beliefs of the subject (“the principal believes that he is powerful”), the preferences of
the subject (“the bully wants to increase his power”), the policy of the subject (“if the bully thinks the victim is weak,
he will pick on him”), or the stereotypical model of the subject (“the bully is selfish”).
A challenge in creating a social simulation is addressing how groups or individuals influence each other, how they
update their beliefs and alter behavior based on any partial observation of, as well as messages from, others. Although
many psychological results and theories must inform the modeling of such influence (e.g., [1, 3, 25]), they often suffer
from two shortcomings from a computational perspective. First, they identify factors that affect influence but do not
operationalize those factors. Second, they are rarely comprehensive and do not address the details of how various
factors relate to each other or can be composed. To provide a sufficient basis for our computational models, our
approach has been to distill key psychological factors and map those factors into our simulation framework. Here, our
decision-theoretic models are helpful in quantifying the impact of factors in such a way that they can be composed.
Specifically, a survey of the social psychology literature identified the following key factors:
Consistency: People expect, prefer, and are driven to maintain consistency, and avoid cognitive dissonance, be-
tween beliefs and behaviors.
Self-interest: The inferences we draw are biased by self-interest (e.g., motivated inference) and how deeply we
analyze information in general is biased by self-interest.
Speaker’s Self-interest: If the sender of a message benefits greatly if the recipient believes it, there is often a
tendency to be more critical and for influence to fail.
Trust, Likability, Affinity: The relation to the source of the message, whether we trust, like or have some group
affinity for him, all impact whether we are influenced by the message.
2.4.3 Computational Model of Influence
To model such factors in the simulation, one could specify them exogenously and make them explicit, user-specified
factors for a message. This tactic is often employed in social simulations where massive numbers of simpler, often
identical, agents are used to explore emergent social properties. However, providing each agent with quantitative
models of itself and, more importantly, of other agents gives us a powerful mechanism to model this range of factors
in a principled way. We model these factors by a few simple mechanisms in the simulation: consistency, self-interest,
and bias. We can render each as a quantitative function of beliefs that allows an agent to compare alternate candidate
belief states (e.g., an agent’s original $\vec{b}$ vs. the $\vec{b}'$ implied by a message).
Consistency is an evaluation of the degree to which a potential belief agrees with prior observations. In effect,
the agent asks itself, “If this belief holds, would it explain the past better than my current beliefs?” We use a
Bayesian definition of consistency based on the relative likelihood of past observations given the two candidate sets
of beliefs (e.g., my current beliefs with and without believing the message). An agent assesses the quality of the
competing explanations by a re-simulation of the past history. In other words, it starts at time 0 with the two worlds
implied by the two candidate sets of beliefs, projects each world forward up to the current point of time, and computes
the probability of the observation it received. The higher the value, the more likely that agent is to have chosen the
observed action, and, thus, the higher the degree of consistency.
In previous work, we have investigated multiple methods of converting such action values into a degree of consistency.
For the purposes of the current work, we use only one of those methods, defining the consistency of a
sequence of observations, $\omega^0, \omega^1, \ldots$, with a given belief state, $\vec{b}$, as follows:
$$\mathrm{consistency}(\vec{b}^t, \langle \omega^0, \omega^1, \ldots, \omega^{t-1} \rangle) = \Pr(\cdots)$$
The algorithm first ranks the utilities of the actor’s alternative actions in reverse order (rank(v)). The value function,
V, is computed with respect to the agent performing the action at time τ. Thus, the higher the rank of the likelihood
of the observation, the more consistent it is with the candidate belief state.
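The rank-based idea can be illustrated with a rough Python sketch. It is not the paper's formula: summing ranks over the history is an assumed aggregation, and the two candidate value tables stand in for the value function V evaluated under two competing mental models of the bully.

```python
def rank_of_observed(values, observed):
    """Rank of the observed action with alternatives sorted worst-first (0 = lowest value)."""
    ordered = sorted(values, key=values.get)
    return ordered.index(observed)

def consistency(candidate_value_fn, history):
    """Sum the ranks of the observed actions under one candidate belief state.

    history: (alternatives, observed_action) pairs for times 0 .. t-1.
    candidate_value_fn(action): value the actor would assign under the candidate.
    """
    score = 0
    for alternatives, observed in history:
        values = {a: candidate_value_fn(a) for a in alternatives}
        score += rank_of_observed(values, observed)
    return score

# Hypothetical action values under two candidate mental models of the bully.
dominance_model = {"attack": 0.9, "tease": 0.4, "wait": 0.1}.get
attention_model = {"attack": 0.2, "tease": 0.8, "wait": 0.1}.get

options = ["attack", "tease", "wait"]
history = [(options, "attack"), (options, "attack")]

dominance_score = consistency(dominance_model, history)
attention_score = consistency(attention_model, history)
```

Because the observed attacks rank highest under the dominance-seeking model, that candidate scores as the more consistent explanation of the history.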
Self-interest is similar to consistency, in that the agent compares two sets of beliefs, one which accepts the message
and one which rejects it. However, while consistency evaluates the past, we compute self-interest by evaluating
the future using Equation 3. An agent can perform an analogous computation using its beliefs about the sender’s
preferences to compute the sender’s self-interest in sending the message.
Bias factors represent subjective views of the message sender that influence the receiver’s acceptance/rejection of
the message. We treat support (or affinity) and trust as such a bias on message acceptance. Agents compute their
support and trust levels as a running history of their past interactions. In particular, one agent increases (decreases) its
trust in another, when the second sends a message that the first decides to accept (reject). Similarly, an agent increases
(decreases) its support for another, when the second selects an action that has a high (low) reward, with respect to the
preferences of the first. In other words, if an agent selects an action a, then the other agents modify their support level
for that agent by a value proportional to $\vec{g} \cdot \vec{b}$, where $\vec{g}$ corresponds to the goals and $\vec{b}$ the new beliefs of the agent
modifying its support.
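A small Python sketch of these two bias updates follows. The dictionary representation of goals and beliefs, the step sizes, and the clamping to [-1, 1] are all assumptions introduced for illustration; the text specifies only the direction of the trust update and the proportionality of the support update.

```python
def clamp(x, lo=-1.0, hi=1.0):
    """Keep a bias value inside an assumed [-1, 1] range."""
    return max(lo, min(hi, x))

def update_trust(trust, accepted, step=0.1):
    """Raise trust when the sender's message was accepted, lower it otherwise.
    The fixed step size is a hypothetical parameter."""
    return clamp(trust + (step if accepted else -step))

def update_support(support, goals, beliefs, rate=0.1):
    """Shift support by an amount proportional to the dot product of the
    observer's goal weights and its new beliefs after the observed action."""
    reward = sum(weight * beliefs.get(feature, 0.0)
                 for feature, weight in goals.items())
    return clamp(support + rate * reward)

# Hypothetical example: a victim values its own power, and believes an
# attack by the bully has lowered that power.
victim_goals = {"own_power": 1.0}
victim_beliefs = {"own_power": -0.4}
new_support = update_support(0.0, victim_goals, victim_beliefs)
new_trust = update_trust(0.0, accepted=False)
```

Here the victim's support for the bully drops after the attack, since the action has low reward with respect to the victim's own preferences.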
Upon receiving any information (whether message or observation), an agent must consider all of these various
factors in deciding whether to accept it and how to alter its beliefs (including its mental models of the other agents).
For a message, the agent determines acceptance using a weighted sum of the five components: consistency, self-
interest, speaker self-interest, trust and support. Whenever an agent observes an action by another, it checks whether
the observation is consistent with its current beliefs (including mental models). If so, no belief change is necessary. If
not, the agent evaluates alternate mental models as possible new beliefs to adopt in light of this inconsistent behavior.
Agents evaluate these possible belief changes using the same weighted sum as for messages.
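The weighted-sum decision described above can be sketched as follows. The factor names mirror the five components listed in the text, while the weights, the example factor values, and the zero acceptance threshold are hypothetical.

```python
FACTORS = (
    "consistency",
    "self_interest",
    "speaker_self_interest",
    "trust",
    "support",
)

def acceptance_score(factors, weights):
    """Weighted sum of the five components."""
    return sum(weights[name] * factors[name] for name in FACTORS)

def accepts(factors, weights, threshold=0.0):
    """Accept the message (or candidate belief change) when the weighted sum
    exceeds an assumed threshold."""
    return acceptance_score(factors, weights) > threshold

# Hypothetical values for one incoming message.
weights = {name: 1.0 for name in FACTORS}
factors = {
    "consistency": 0.6,
    "self_interest": 0.3,
    "speaker_self_interest": -0.2,
    "trust": 0.1,
    "support": -0.1,
}
score = acceptance_score(factors, weights)
decision = accepts(factors, weights)
```

Because the combination is linear, a factor with a large magnitude (or a large weight) can dominate the decision regardless of the others.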
Each agent’s decision-making procedure is sensitive to these changes that its actions may trigger in the beliefs of
others. Each agent accounts for the others’ belief update when doing its lookahead, as Equations 2 and 3 project the
future beliefs of the other agents in response to an agent’s selected action. Similar to work by  this mechanism
provides PsychSim agents with a potential incentive to deceive, if doing so leads the other agents to perform actions
that lead to a better state for the deceiving agent.
We see the computation of these factors as a toolkit for the user to explore the system’s behavior under existing
theories, which we can encode in PsychSim. For example, the elaboration likelihood model (ELM)  argues that
the way messages are processed differs according to the relevance of the message to the receiver. High relevance or
importance would lead to a deeper assessment of the message, which is consistent with the self-interest calculations
our model performs. PsychSim’s linear combination of factors is roughly in keeping with ELM because self-interest
values of high magnitude would tend to dominate.
3 Childhood Aggression Model
The research literature on childhood aggression provides interesting insight into the role that Theory of Mind plays
in human behavior. Investigations of bullying and victimization  have identified four types of children; we focus
here on nonvictimized aggressors, those who display proactive aggression due to positive outcome expectancies for
aggression. Children develop expectations on the likely outcomes of aggression based on past experiences (e.g., did
past acts of aggression lead to rewards or punishment). This section describes the results of our exploration of the
space of different nonvictimized aggressors and the effectiveness of possible intervention strategies in dealing with them.
3.1 Scenario Setup
The user sets up a simulation in PsychSim by selecting generic agent models that will play the roles of the various
groups or individuals to be simulated and specializing those models as needed. In our bullying scenario, we constructed
generic bully models that compute outcome expectancies as the expected value of actions (V_a from Equation 2). Thus,
when considering possible aggression, the agents consider the immediate effect of an act of violence, as well as the
possible consequences, including the change in the beliefs of the other agents. In our example scenario, a bully has
three subgoals that provide incentives to perform an act of aggression: (1) to change the power dynamic in the class
by making himself stronger, (2) to change the power dynamic by weakening his victim, and (3) to earn the approval
of his peers (as demonstrated by their response of laughter at the victim). Our bully agent models the first incentive as
a goal of maximizing power(bully) and the second as minimizing power(victim), both coupled with a belief
that an act of aggression will increase the former and decrease the latter. The third incentive seeks to maximize the
laugh actions directed at the victim, so it must consider the actions that the other agents may take in response.
For example, a bully motivated by the approval of his classmates would use his mental model of them to predict
whether they would laugh along with him. We implemented two possible mental models of the bully’s classmates:
encouraging, where the students will laugh at the victim, and scared, where the students will laugh only if the teacher
did not punish them for laughing last time. Similarly, the bully would use his mental model of the teacher to predict
whether he will be punished or not. We provide the bully with three possible mental models of the teacher: normal,
where the teacher will punish the bully in response to an act of violence; severe, where the teacher will more harshly
punish the bully than in the normal model; and weak, where the teacher never punishes the bully.
The relative priorities of these subgoals within the bully’s overall reward function provide a large space of possible
behavior. When creating a model of a specific bully, PsychSim uses a fitting algorithm to automatically determine the
appropriate weights for these goals to match observed behavior. For example, if the user wants the bully to initially
attack a victim and the teacher to threaten the bully with punishment, then the user specifies those behaviors and the
model parameters are fitted accordingly. This degree of automation significantly simplifies simulation setup. In
this experiment, we selected three specific bully models from the overall space: (1) dominance-seeking, (2) sadistic,
and (3) attention-seeking, each corresponding to a goal weighting that favors the corresponding subgoal.
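To make the role of the goal weights concrete, the sketch below writes the bully's reward as a weighted combination of the three subgoals. The specific weight values, the state features, and the function name are hypothetical; in PsychSim the weights would come from the fitting algorithm rather than being set by hand.

```python
def bully_reward(weights, state):
    """Reward rises with the bully's power and the laughter directed at the
    victim, and falls with the victim's power."""
    return (weights["power_bully"] * state["power_bully"]
            - weights["power_victim"] * state["power_victim"]
            + weights["laughs"] * state["laughs"])

# Hypothetical weightings, each favoring one subgoal.
BULLY_TYPES = {
    "dominance-seeking": {"power_bully": 0.8, "power_victim": 0.1, "laughs": 0.1},
    "sadistic": {"power_bully": 0.1, "power_victim": 0.8, "laughs": 0.1},
    "attention-seeking": {"power_bully": 0.1, "power_victim": 0.1, "laughs": 0.8},
}

# A state in which the class laughed but the power balance is even.
state = {"power_bully": 0.5, "power_victim": 0.5, "laughs": 1.0}
rewards = {name: bully_reward(w, state) for name, w in BULLY_TYPES.items()}
happiest = max(rewards, key=rewards.get)
```

In this state the attention-seeking bully is the most rewarded, since laughter is the only subgoal that has been advanced.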
PsychSim allows one to explore multiple tactics for dealing with a social issue and see the potential consequences.
Here, we examine a decision point for the teacher after the bully has attacked the victim, followed by laughter by the
rest of the class. At this point, the teacher can punish the bully, punish the whole class (including the victim), or do
nothing. We explore the impact of different types of proactive aggression by varying the type of the bully, the teacher’s
decision to punish the bully, the whole class, or no one, and the mental models that the bully has of the other students
and the teacher.
A successful outcome is when the bully does not choose to act out violently toward the victim the next time around.
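The space of conditions explored here can be enumerated directly. The sketch below only lists the combinations named in the text (bully type, teacher action, and the bully's mental models of the students and the teacher); it does not simulate outcomes, and the labels and the scenario_space helper are illustrative names rather than PsychSim identifiers.

```python
from itertools import product

BULLY_TYPES = ("dominance-seeking", "sadistic", "attention-seeking")
TEACHER_ACTIONS = ("punish bully", "punish class", "do nothing")
STUDENT_MODELS = ("encouraging", "scared")
TEACHER_MODELS = ("normal", "severe", "weak")

def scenario_space():
    """Yield every combination of the four experimental variables."""
    for bully, action, students, teacher in product(
            BULLY_TYPES, TEACHER_ACTIONS, STUDENT_MODELS, TEACHER_MODELS):
        yield {
            "bully_type": bully,
            "teacher_action": action,
            "model_of_students": students,
            "model_of_teacher": teacher,
        }

scenarios = list(scenario_space())
```

With three bully types, three teacher actions, two student models, and three teacher models, the list contains 3 × 3 × 2 × 3 = 54 scenarios, each of which would be run to see whether the bully attacks again.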
Table 1: Outcomes of intervention strategies
By examining the outcomes under these combinations, we can see the effects of intervention over the space of possible
classroom settings. Table 1 shows all of the outcomes, where we use the “*” wildcard symbol to collapse rows where
the outcome was the same. Similarly, a row with “¬severe” in the Teacher row spans the cases where the bully’s
mental model of the teacher is either normal or weak.
We first see that the PsychSim bully agent meets our intuitive expectations. For example, we see from Table 1 that
if the bully thinks that the teacher is too weak to ever punish, then no immediate action by the teacher will change the
bully from picking on the victim. Thus, it is critical for the teacher to avoid behavior that leads the bully to form such
mental models. Similarly, if the bully is of the attention-seeking variety, then punishment directed solely at himself
will not dissuade him, as he will still expect to gain peer approval. In such cases, the teacher is better off punishing the
whole class.
We can see more interesting cases as we delve deeper. For example, if we look at the case of a sadistic bully when
the teacher punishes the whole class, we see that the bully can be dissuaded only if he thinks that the other students will
approve of his act of violence. This outcome may seem counter-intuitive at first, but the sadistic bully is primarily
concerned with causing suffering for the victim, and thus does not mind being punished if the victim is punished as
well. However, if the bully thinks that the rest of the class is encouraging, then the teacher’s punishment of the whole
class costs him peer approval. On the other hand, if the bully thinks that the rest of the class is already scared, so that
they will not approve of his violence, then he has no peer approval to lose.
Such exploration can offer the user an understanding of the potential pitfalls in implementing an intervention
strategy. Rather than providing a simple prediction of whether a strategy will succeed or not, PsychSim maps out
the key conditions, in terms of the bully’s preferences and beliefs, on which a strategy’s success depends. PsychSim
provides a rich space of possible models that we can systematically explore to understand the social behavior that arises
out of different configurations of student psychologies. We are continuing to investigate more class configurations and
the effects of possible interventions as we expand our models to cover all of the factors in school aggression identified
in the literature.
4 A Model of Appraisal
To further illustrate the capacity of these decision-theoretic agents to model social interaction, we also used them to
implement a computational model of emotion, largely based on Smith and Lazarus’s theory of cognitive appraisal.
As noted in Section 1, our implementation was driven by a thought experiment: Can we leverage the existing processes
and representations in PsychSim to model appraisal? The motivations for this thought experiment are three-fold. First,
we seek to demonstrate an intrinsic coupling between the theoretical model of appraisal and decision-theoretic social
reasoning. Specifically, we wish to show that an existing social framework with no explicit emotion or appraisal
capabilities has, in fact, appraisal-like processes already present as part of its decision-making processes. Second, we
also seek to illustrate the critical role that subjective beliefs about others play in modeling social emotions. Finally, we
seek a minimalist design that elegantly reuses architectural features to model new social phenomena, like emotion.
This work on modeling appraisal is in the spirit of EMA [10, 19] (see also Chapter X), which defines appraisal
processes as operations over a plan-based representation (a causal interpretation) of the agent’s goals and how events
impact those goals. The agent’s existing cognitive processes maintain the causal interpretation, which appraisal-
specific processes leverage. Thus, in EMA, the cognitive processes for constructing the person-environment relation
representation are distinct from appraisal itself, which is reduced to simple and fast pattern matching over the plan
We seek to eliminate this distinction by demonstrating how the cognitive processes for decision-making can also
generate appraisals. While EMA uses a uniform representation across cognition and appraisal, we seek to additionally
establish uniformity over the algorithms underlying both cognition and appraisal. In so doing, we can identify that
appraisal itself is already an integral part of the cognitive processes that a social agent must perform to maintain its
beliefs about others and to inform its decision-making in a multiagent social context.
Specifically, we treat appraisal as leveraging component algorithms already present in a PsychSim agent by de-
riving key appraisal variables from the outputs of these algorithms. Furthermore, the appraisal process inherits the
intrinsic social nature of these algorithms, allowing the agent to appraise events from another’s perspective, as well as
from its own. Thus, in keeping with Manstead and Fischer’s concept of social appraisal, the behaviors, thoughts and
emotions of the other can also be appraised and thereby influence the agent.
We have modeled five appraisal dimensions so far: motivational relevance, motivational congruence, account-
ability, control and novelty. We adapted Smith and Lazarus’s  definitions for modeling motivational relevance,
motivational congruence and accountability. Our model of control is roughly equivalent to Smith and Lazarus’s 
definition of problem-focused coping potential. It is closer to Scherer’s  definition of control, because it accounts
for the overall changeability of the situation and not an individual agent’s power to make a change. Finally, we modeled
novelty based on Leventhal and Scherer’s [15, 30] definition of predictability-based novelty, as there is no equivalent
concept in Smith and Lazarus.
The model of appraisal is built within Thespian [33, 34, 35, 36, 37, 38, 39], which extends PsychSim for modeling
and simulating computer-aided interactive narratives. This computational model of appraisal is one of Thespian’s
extensions. We demonstrate the application of the appraisal model in three different scenarios: a simple conversation
between two people, a firing-squad scenario as modeled in , and a fairy tale, “The Little Red Riding Hood”. The
last scenario will be described in detail in the next section as it will be used as an example to motivate the discussion
of our appraisal model. The details of the other two scenarios are given in Section 5.
4.1 Little Red Riding Hood Domain
The story contains four main characters, Little Red Riding Hood (Red), Granny, the hunter and the wolf. The story
starts as Red and the wolf meet each other on the outskirts of a forest while Red is on her way to Granny’s house. The
wolf wants to eat Red, but it dares not because there is a wood-cutter close by. At this point, the wolf and Red can
either have a conversation or go their separate ways. The wolf may have future chances to eat Red if he finds her alone
at another location. Moreover, if the wolf hears about Granny from Red, it can even go to Granny’s house and eat her
as well. Meanwhile, the hunter is searching for the wolf to kill it. Once the wolf is killed, all of the wolf’s previous
victims can escape. Our model of this story builds upon the base PsychSim representation, as described in Section 2.
4.2 Modeling Appraisal in Thespian
Appraisal is a continuous process. People constantly reevaluate their situations and cope with unfavorable
situations, forming an “appraisal-coping-reappraisal” loop. In this section we illustrate how we can model this phenomenon
and derive appraisal dimensions by leveraging algorithms and information within a PsychSim agent’s belief revision
and decision-making processes.