
Generating Justifications for Norm-Related Agent Decisions



We present an approach to generating natural language justifications of decisions derived from norm-based reasoning. Assuming an agent which maximally satisfies a set of rules specified in an object-oriented temporal logic, the user can ask factual questions (about the agent's rules, actions, and the extent to which the agent violated the rules) as well as "why" questions that require the agent comparing actual behavior to counterfactual trajectories with respect to these rules. To produce natural-sounding explanations, we focus on the subproblem of producing natural language clauses from statements in a fragment of temporal logic, and then describe how to embed these clauses into explanatory sentences. We use a human judgment evaluation on a testbed task to compare our approach to variants in terms of intelligibility, mental model and perceived trust.
arXiv:1911.00226v1 [cs.CL] 1 Nov 2019
Daniel Kasenberg*, Antonio Roque, Ravenna Thielstrom,
Meia Chita-Tegmark, and Matthias Scheutz
Human-Robot Interaction Laboratory
Tufts University
1 Introduction
Recent research has enabled artificial agents (such
as robots) to work closely with humans, some-
times as team-mates, sometimes as independent
decision-makers. For these agents to be trusted by
the humans they interact with, they must be able
to follow human norms: expected standards of
behavior and social interaction. Crucially, agents
must be able to explain the norms they are follow-
ing and how those norms have guided their deci-
sions. Because these explanations may occur in
task settings, agents need to be able to make these
explanations in natural language dialogue that hu-
mans will easily understand.
The field of explainable planning (Fox et al.,
2017) emphasizes, as we do, rendering the be-
haviors of a particular agent explainable to hu-
man interactants. Our approach embodies a form
of Questions 1 to 3 as described by Fox et al.,
applying “why did you do that” (Question 1) to
general queries representable in temporal logic,
and applying “why is what you propose [supe-
rior] to something else” (Question 3) by appeal-
ing to the agent’s rules. Our approach opera-
tionalizes Langley (2019)’s definition of justified
agency for an intelligent system as “follow[ing]
society’s norms and explain[ing] its activities in
those terms” by providing natural language ex-
planations which appeal to temporal logic rules
which may represent moral or social norms. Fur-
ther relevant recent papers in explainable plan-
ning include Vasileiou et al. (2019), who formu-
late a logic-based approach; Krarup et al. (2019),
who like us employ contrastive explanations; and
Kim et al. (2019), who construct temporal logic
specifications which demonstrate the differences
between plans. While the latter work has in com-
mon with ours a focus on explainable planning and
temporal logic, we are interested in justifying the
agent’s choice of plan, whereas they seek to suc-
cinctly describe the difference between two plans.
None of the above approaches are concerned with
providing natural language explanations.
Natural language explanations are provided by
Chiyah Garcia et al. (2018), who develop an ap-
proach to providing system explainability by hav-
ing a human expert “speak aloud” while watching
system videos, and turning that explanation into
tree form. Our approach is similar in that we pro-
vide explanations of agent behavior in natural lan-
guage, but in our case the content is provided not
by human experts but by the agent’s reasoning.
In that respect, our approach is similar to the
tradition of generating natural language explana-
tions of mathematical proofs. Horacek (2007)
describes the differences between proof explana-
tions as required by humans, as opposed to proof
explanations as produced by Automated Theo-
rem Provers. Fiedler (2001b) provides a sur-
vey (pgs 10-12) of early research in develop-
ing explanations that allow the output of Auto-
mated Theorem Provers to be intelligible to hu-
mans, and describes a system that includes a user
model to guide the production of explanations
of more or less abstraction. This work is sum-
marized by Fiedler (2001a). Recent work fo-
cusing on the more general problem of gener-
ating text from formal or logical structures in-
cludes Manome et al. (2018)’s generation of sen-
tences from logical formulas using a sequence-
to-sequence approach, and Pourdamghani et al.
(2016)’s generation of sentences from Abstract
Meaning Representations by linearizing and using
Phrase Based Machine Translation approaches.
Where our approach differs is that rather than
aiming to justify logical conclusions via proofs
or specify natural language translations of arbi-
trary logical forms, our approach justifies the de-
cisions of autonomous agents (governed by princi-
ples specified in logic) in a way that is understand-
able to human users. More similar in this vein is
the work of Kutlak and van Deemter (2015), who
provide natural language descriptions of the pre-
and post-conditions of planner actions.
Our approach provides two main contributions.
First, we construct explanations for the behavior
of an agent governed by temporal logic rules acting
in a deterministic relational Markov decision
process (RMDP), including questions about the
agent’s rules and actions and “why” queries requiring
a contrastive explanation (Elzein, 2019)
appealing to the temporal logic rules (content gen-
eration; section 3). Second, we convert these
explanation structures into natural language, con-
structing natural language clauses from statements
belonging to a fragment of our temporal logic, and
embedding these into general response templates
(surface representation generation; section 4). We
evaluate the outputs of our approach in a testbed
domain against baselines, and find that our ap-
proach shows increased performance in terms of
the agent’s intelligibility, the user’s mental model
of the agent, and trust in the agent’s ability to obey
norms in a principled way (section 5). We con-
clude with a summary and discussion of our con-
tributions (section 6).
2 Generating agent behavior
2.1 Test scenario
To illustrate the workings of our approach, we will
allude to a test scenario chosen to be simple while
highlighting some of the virtues of our approach.
Our approach may be applied to other domains as
well; we describe the assumptions our approach
makes about the agent’s environment and norms
in section 4.1.1.
The scenario is as follows: a robot has just gone
shopping on behalf of a human user to a store that
sells a pair of glasses and a watch. The human
user wants the glasses and the watch, and the robot
has a rule for buying everything that the human
wants. However, the robot can only afford one of
these items. The robot is able to pick up items and
walk out of the store without paying for them,
but it also has a rule against doing so (i.e. against
stealing), and this rule is the stronger of the two.
After acting in this environment, the robot is
asked about its rules, its actions, and why it made
the decisions it did. The types of utterances that
are needed at this point are shown in Dialogues 1-5.
2.2 Agent environment
We consider agents operating in RMDPs, where an
RMDP assumes that each state of the world $s \in S$
can be decomposed into the states $s_{o_1}, \dots, s_{o_k}$
of a set of objects $o_1, o_2, \dots, o_k$, where each $o_i$
belongs to one of a finite set of object classes
$C_1, \dots, C_\ell$, as well as a “residual state” $s_{\setminus o}$ not
corresponding to any objects. Agents may perform
actions in the world, where each action may
be parameterized by zero or more of these objects.
Actions performed in the environment change the
state according to a transition function, which we
here assume to be deterministic. We will further
assume a set of atomic predicates $\Pi$ which take
zero or more objects as arguments, and which can
be true or false in a particular state.
In our example domain, the state consists of
the states of the objects glasses and watch, both
members of the object class ForSaleItem, as
well as separate environment variables such as
whether the agent is in the store. At each
time step the agent can perform the actions
pickup(o), putdown(o), or buy(o), where
$o \in \{\mathit{glasses}, \mathit{watch}\}$, or can perform the non-object
action leave. (The agent may only put down or buy
an object which it has picked up.) Predicates include
whether object o has been previously bought
(bought(o)), whether the agent is currently holding
o (holding(o)), and whether o is currently
on the shelf (onShelf(o)), as well as whether the
agent has left the store (leftStore). Each of
the agent actions also corresponds to a predicate
which indicates whether that action is performed
in the corresponding time step.
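The example domain described above can be sketched as a small deterministic transition system. This is only an illustrative encoding, not the authors' implementation: the `State` fields, the `applicable` and `step` helpers, the `MAX_PURCHASES` constant (standing in for the statement that the robot can only afford one item), and the choice to treat an object as on the shelf exactly when it is not held are all assumptions made for the sketch.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional, Tuple

ITEMS = ("glasses", "watch")   # the two ForSaleItem objects
MAX_PURCHASES = 1              # assumption: the budget covers exactly one item

# An action is a name plus an optional object, e.g. ("buy", "glasses") or ("leave", None).
Action = Tuple[str, Optional[str]]


@dataclass(frozen=True)
class State:
    holding: FrozenSet[str] = frozenset()
    bought: FrozenSet[str] = frozenset()
    left_store: bool = False

    def on_shelf(self, o: str) -> bool:
        # Assumption: an object is on the shelf exactly when it is not held.
        return o not in self.holding


def applicable(s: State, action: Action) -> bool:
    """Check the (assumed) preconditions of an action in state s."""
    name, o = action
    if s.left_store:
        return False            # nothing can be done after leaving
    if name == "leave":
        return True
    if o not in ITEMS:
        return False
    if name == "pickup":
        return o not in s.holding
    if name == "putdown":
        return o in s.holding   # only held objects can be put down
    if name == "buy":
        return (o in s.holding and o not in s.bought
                and len(s.bought) < MAX_PURCHASES)
    return False


def step(s: State, action: Action) -> State:
    """Deterministic transition function."""
    if not applicable(s, action):
        raise ValueError(f"action {action} is not applicable")
    name, o = action
    if name == "leave":
        return replace(s, left_store=True)
    if name == "pickup":
        return replace(s, holding=s.holding | {o})
    if name == "putdown":
        return replace(s, holding=s.holding - {o})
    return replace(s, bought=s.bought | {o})   # name == "buy"
```

Under these assumptions, applying `pickup(glasses)`, `buy(glasses)` and `leave` in sequence yields a state in which the glasses are both held and bought, and a second purchase is never applicable.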
2.3 Violation enumeration language (VEL)
We generate justifications for an agent acting with
respect to a set of rules expressible in temporal
logic. In particular we represent the rules that the
agent is to follow in an object-oriented temporal
logic fragment which we refer to as violation enu-
meration language (VEL). VEL is based on linear
temporal logic (LTL), and thus incorporates tem-
poral operators roughly encoding the concepts of
“always” (G), “eventually” (F), “in the next time
step” (X), and “until” (U) as well as the standard
operators in propositional logic (¬, ∧, ∨, →).
The main difference between VEL and LTL is
that in VEL atomic propositions have been re-
placed by atomic predicates of the sort found in
the RMDP environment. The arguments to these
predicates may be particular objects in the agent’s
environment, or may be object variables. Each object
variable is existentially (∃) or universally (∀)
quantified, or declared “costly”. Costly variables
are those for which the cost of violating the rule
depends on the number of bindings that violate
the rule; where a formula has multiple costly vari-
ables, the cost depends on the number of violating
tuples of bound variables. The costly variables of
a VEL formula are listed, enclosed in angle brackets
(⟨, ⟩), to the left of the formula.
In our example domain, we assume that the
agent must attempt to satisfy two VEL rules:
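$$\langle o \rangle.\ \mathbf{G}\,\neg(\mathit{leave} \wedge \mathit{holding}(o) \wedge \neg\mathit{bought}(o)) \tag{1}$$
$$\langle o \rangle.\ \mathbf{F}(\mathit{leave} \wedge \mathit{holding}(o)) \tag{2}$$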
These VEL rules correspond to the injunction
never leave the store while holding an object that
has not been bought (shoplifting; the agent is pe-
nalized for each such object) and to leave while
holding as many objects as possible.
2.4 Calculating and minimizing violation cost
The agent acts in its environment so as to maxi-
mally satisfy its VEL objectives, by minimizing a
violation cost. We define the violation cost of an
agent trajectory with respect to a VEL formula as
the number of bindings of the costly variables such
that the formula fails to hold for those bindings.
To compute an aggregate violation cost for a set
of rules, each rule is assigned a weight $w \in \mathbb{R}_{\geq 0}$
and a priority $z \in \mathbb{N}_0$. VEL rules with the same
priority may be traded off (the agent minimizes
the weighted sum of the violation costs for these
rules), while rules with different priorities cannot
(no amount of satisfying rules of lesser priority
may justify violating a rule of a greater priority).
HUMAN: What rules do you follow?
ROBOT: I must not leave the store while holding anything which I have not bought, and I must leave the store while holding everything.
HUMAN: What did you do?
ROBOT: I picked up the glasses, bought the glasses and left the store.
HUMAN: What rules did you break?
ROBOT: I did not leave the store while holding the watch.
Dialogue 1: Questions about rules, actions, and violations.
HUMAN: Why didn’t you buy anything?
ROBOT: I bought the glasses.
Dialogue 2: “Why” query with false premise.
Given a set of VEL rules and their correspond-
ing weight/priority values and an environment
(where we assume the agent knows the environ-
mental dynamics), the agent can compute an opti-
mal policy that minimizes the violation cost with
respect to those rules (e.g., using value iteration).
For the purposes of the paper, we assume that the
agent generating justifications has done precisely
this process and has executed this optimal policy.
In our example domain, we assume that rule 1
has a priority of 1 and that rule 2 has a priority
of 0, so that they are incomparable (and thus the
weights of each are irrelevant). In order to maximally
satisfy these rules, the agent picks up the
glasses, buys the glasses, and leaves the store.
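The violation cost and the priority ordering can be illustrated with a short sketch. It assumes finite trajectories represented as lists of sets of ground atoms that hold at each time step (action predicates included), hard-codes the two example rules as Python functions rather than parsing VEL, and uses placeholder weights of 1.0 since the weights are irrelevant in this example. The names `Rule`, `violation_cost` and `cost_vector` are hypothetical. Comparing cost tuples ordered from highest to lowest priority gives the lexicographic behaviour described above.

```python
from typing import Callable, FrozenSet, List, NamedTuple, Tuple

Atom = Tuple[str, ...]            # e.g. ("leave",) or ("holding", "glasses")
Step = FrozenSet[Atom]            # atoms true at one time step
Trajectory = List[Step]

ITEMS = ("glasses", "watch")      # possible bindings of the costly variable o


def never_shoplift(traj: Trajectory, o: str) -> bool:
    """Rule (1) for one binding: G not(leave and holding(o) and not bought(o))."""
    def bad(step: Step) -> bool:
        return {("leave",), ("holding", o)} <= step and ("bought", o) not in step
    return not any(bad(step) for step in traj)


def leave_holding(traj: Trajectory, o: str) -> bool:
    """Rule (2) for one binding: F(leave and holding(o))."""
    return any({("leave",), ("holding", o)} <= step for step in traj)


class Rule(NamedTuple):
    holds: Callable[[Trajectory, str], bool]
    weight: float     # placeholder value; irrelevant in the example domain
    priority: int


RULES = [Rule(never_shoplift, 1.0, 1), Rule(leave_holding, 1.0, 0)]


def violation_cost(rule: Rule, traj: Trajectory) -> int:
    """Number of bindings of the costly variable for which the rule fails."""
    return sum(1 for o in ITEMS if not rule.holds(traj, o))


def cost_vector(traj: Trajectory, rules: List[Rule] = RULES) -> Tuple[float, ...]:
    """Weighted cost per priority level, highest priority first.

    Python compares tuples lexicographically, so a smaller vector is preferred
    and no lower-priority saving can offset a higher-priority violation.
    """
    priorities = sorted({r.priority for r in rules}, reverse=True)
    return tuple(
        sum(r.weight * violation_cost(r, traj) for r in rules if r.priority == p)
        for p in priorities
    )


# The trajectory the agent actually follows: pick up, buy, and leave with the glasses.
actual = [
    frozenset({("pickup", "glasses")}),
    frozenset({("buy", "glasses"), ("holding", "glasses")}),
    frozenset({("leave",), ("holding", "glasses"), ("bought", "glasses")}),
]
# A counterfactual in which the agent leaves with both items but pays only for the watch.
shoplift = [
    frozenset({("pickup", "glasses")}),
    frozenset({("pickup", "watch"), ("holding", "glasses")}),
    frozenset({("buy", "watch"), ("holding", "glasses"), ("holding", "watch")}),
    frozenset({("leave",), ("holding", "glasses"), ("holding", "watch"),
               ("bought", "watch")}),
]
```

With these definitions, `cost_vector(actual)` is `(0.0, 1.0)` and `cost_vector(shoplift)` is `(1.0, 0.0)`, so the actual trajectory is preferred even though it violates rule 2 for the watch.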
3 Content Generation
This section describes how content of explanations
is generated; section 4 describes how utterances
are constructed from this content.¹
¹ Because our contribution relates to natural language generation, we assume the existence of a parser that processes input sentences such as the Human utterances in Dialogues 1-5. Such a parser is not one of this paper’s contributions.
We have developed an algorithm which produces
raw (non-NL) explanations from queries
which contain VEL statements. We do not discuss
this process in detail in this work; it is described
in a separate forthcoming paper. In this
work, we leverage this algorithm to support the
following types of queries from the user and their
corresponding responses:
1. The user may ask the agent for the contents of
its rules. The response will be a list of the VEL
rules that the agent attempts to follow.
2. The user may ask the agent for the sequence
of actions it actually performed. The response
will be a list of such actions.
3. The user may ask the agent which VEL rules it
violated in the observed trajectory. The result is
a list of such rules, which list will be non-empty
only if the rules are not all mutually satisfiable.
(This and the previous two query types are de-
picted in Dialogue 1).
4. The user may ask the agent “why φ”, where φ
is a VEL statement (where the statement may
involve quantification, but not costly variables):
why did the agent act in such a way as to make
φ true? The response to this question can take
one of three forms:
• The agent could determine that φ is not entailed
by the agent’s trajectory (the premise
of the question is false). Here the algorithm
simply returns ¬φ (with existentially quantified
variables bound to a particular counterexample
where appropriate). Dialogue 2
shows a question-response pair of this type.
• The agent may determine that φ holds over
the agent’s trajectory and over all other trajectories
(φ cannot be false in the given
RMDP). Here the generated response takes
the form “¬φ is impossible”. Dialogue 3
shows a question-response pair of this type.
• If there is an alternate (counterfactual) trajectory
over which φ is not satisfied, our system
constructs such a trajectory, and then considers
the relative preference of those trajectories
with respect to the VEL rules.
If the trajectories are equally preferable, then
we state that equivalence, as in Dialogue 4.
If the real trajectory
is preferable to the counterfactual, then we
produce a statement to that effect, as in Dialogue 5
(due to our assumption that the agent
behaves optimally with respect to its rules,
the agent does not consider the case that the
counterfactual trajectory is preferable).
As Dialogues 4 and 5 indicate, in either case
the user may ask follow-up questions
about the specific content of the counterfactual
trajectory, including the agent’s action
sequence and the rules it would have violated.
If the real trajectory is preferable,
the user may also ask the agent to elaborate
on why the counterfactual would have
been worse, to which the answer is a minimal
list of (bound) rules violated by the counterfactual
trajectory with violation cost exceeding
that of the actual trajectory.
HUMAN: Why didn’t you buy everything?
ROBOT: It was impossible for me to buy
Dialogue 3: “Why” query with impossibility.
HUMAN: Why did you buy the glasses?
ROBOT: I could have avoided buying the glasses and that would not have broken more important rules.
HUMAN: How would you have done that?
ROBOT: I would have picked up the watch, bought the watch, and left the store.
HUMAN: What rules would you have broken?
ROBOT: I would not have left the store while holding the glasses.
Dialogue 4: “Why” query for equally preferable counterfactual.
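The three response forms for a “why φ” query can be sketched as a simple case analysis. The paper's actual content-generation algorithm is described elsewhere, so the following is only a brute-force stand-in under strong assumptions: the set of feasible trajectories is small enough to enumerate, `phi` and `cost` are supplied as Python callables, the counterfactual chosen is the cheapest trajectory falsifying φ (a choice made for the sketch), and the actual trajectory is optimal. The function name `answer_why`, the response labels, and the hand-assigned cost vectors (rule 1 cost, rule 2 cost) are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

Trajectory = Tuple[str, ...]      # a trajectory as a tuple of action names


def answer_why(
    actual: Trajectory,
    feasible: Iterable[Trajectory],
    phi: Callable[[Trajectory], bool],
    cost: Callable[[Trajectory], Tuple[int, ...]],
) -> Tuple[str, Optional[Trajectory]]:
    """Classify a "why phi" query into one of the response forms."""
    if not phi(actual):
        return ("premise_false", None)          # respond with "not phi"
    counterfactuals = [t for t in feasible if not phi(t)]
    if not counterfactuals:
        return ("impossible", None)             # "not phi is impossible"
    best = min(counterfactuals, key=cost)       # assumption: cheapest counterfactual
    if cost(best) == cost(actual):
        return ("equally_preferable", best)
    return ("actual_preferable", best)          # actual trajectory assumed optimal


# Toy data mirroring the shopping example; costs are assigned by hand here.
ACTUAL = ("pickup glasses", "buy glasses", "leave")
WATCH = ("pickup watch", "buy watch", "leave")
SHOPLIFT = ("pickup glasses", "pickup watch", "buy watch", "leave")
COSTS = {ACTUAL: (0, 1), WATCH: (0, 1), SHOPLIFT: (1, 0)}


def bought_glasses(t: Trajectory) -> bool:
    return "buy glasses" in t


def did_not_leave_with_everything(t: Trajectory) -> bool:
    return not ("pickup glasses" in t and "pickup watch" in t and "leave" in t)


def bought_nothing(t: Trajectory) -> bool:
    return not any(a.startswith("buy") for a in t)


def did_not_buy_everything(t: Trajectory) -> bool:
    return not ("buy glasses" in t and "buy watch" in t)
```

For example, `answer_why(ACTUAL, COSTS, bought_glasses, COSTS.__getitem__)` selects the watch-buying trajectory as an equally preferable counterfactual, which corresponds to the situation in Dialogue 4.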
4 Surface Representation Generation
The primary challenge to constructing utterances
based on the output of our explanation generator
is the need to construct natural-sounding expres-
sions of VEL statements. Once this problem is
solved, we may use SimpleNLG (Gatt and Reiter,
2009) to plug the resulting clauses into template
sentences corresponding to each of the outputs we
are interested in producing.
4.1 Translating VEL to natural language
We here discuss how to construct clauses corre-
sponding to individual VEL formulae, a critical
subtask of generating natural language justifica-
tions for the behavior of agents with VEL rules.
Due to the difficulty of the task, rather than attempt
to handle every possible VEL formula, we
will work with a small fragment which nevertheless
can express a large number of the plausible
agent rules/queries. (We are confident that many
more sentences will ultimately be representable,
although we are not convinced that it is possible to
express every VEL formula in coherent English.)
HUMAN: Why did you not leave the store while holding everything?
ROBOT: I could have left the store while holding everything but that would have broken more important rules.
HUMAN: How would you have done that?
ROBOT: I would have picked up the glasses, picked up the watch, bought the watch and left the store.
HUMAN: What rules would you have broken?
ROBOT: I would have left the store while holding the glasses which I had not
HUMAN: How would that have been worse?
ROBOT: Leaving the store while holding the glasses which I have not bought is worse than not leaving the store while holding the watch.
Dialogue 5: “Why” query for “worse” counterfactual.
4.1.1 Key assumptions
Our assumptions about the structure of the VEL
statements we will convert are as follows:
• The statements have the form Gφ or Fφ, possibly
with quantification and costly variables,
where φ is a (possibly negated) conjunction of
(possibly negated) predicates with no temporal
operators (e.g. $p_1(o_1) \wedge p_2(o_1) \wedge \neg p_3$). Not all
predicates in the conjunction are negated.
• Each predicate, with the exception of those
corresponding to the agent’s action set,
corresponds to a parameterized English sentence
where either the subject is not the agent, or
the subject is the agent and the verb is in the
progressive or perfect tense (corresponding to
actions or processes currently in progress, or
which have finished in the past, respectively).
For example, in the shopping domain the predicate
bought(o) corresponds to “I have bought
o”, while holding(o) corresponds to “I am
holding o” and onShelf(o) corresponds to “o
is on the shelf”. Actions also correspond to
present-tense English sentences: buy(o)
corresponds to “I buy o”.
• While predicates may take multiple objects
as parameters, at most one of these may
be quantified in a rule/query (and the rest
must refer to specific objects within the domain).
The rule “$\forall x.\forall y.\neg\mathit{injures}(x, y)$” is
not permissible under this assumption, though
“$\forall x.\neg\mathit{injures}(\mathit{bob}, x)$” is if bob is a particular
object in the environment.
• Each specific object in the environment
corresponds to a particular referring expression in
English; e.g. glasses to “the glasses” and
watch to “the watch”. Each object class also
corresponds to an English referring expression;
e.g. ForSaleItem to “thing”.
4.1.2 VEL clause construction pipeline
The process of constructing a clause suitable for
embedding in a sentence from a VEL statement
(given the assumptions outlined in section 4.1.1) is
as follows. In particular, we construct a predicate
form based on the statement which is processed by
a separate NLG component within our robotic ar-
chitecture (which in turn calls SimpleNLG to per-
form realization). For simplicity, we will show
the realization of the predicate form instead of the
predicate representation itself.
1. If the formula has costly variables, these are
treated for the purposes of natural language
generation as universal quantifiers.²
2. If the conjunction of predicates is negated (e.g.
$\neg(p_1(o_1) \wedge \cdots)$), this negation is pushed
outward beyond the temporal operators and
quantifiers. This will negate the main verb of the
resulting clause.
3. Next the conjunction itself is processed into
a clause containing “which” and “while”
subclauses. Section 4.1.3 describes this process.
4. Existential and universal variable quantifiers
are removed and replaced by determiners on the
first instance of the variable in the main clause.
If the quantification is universal, the determiner
“every” is used; if existential, “a” is used (or
“any”, if the formula is negated). The names
of particular objects are replaced by the
corresponding referring expressions; the names of
object variables are replaced by the referring
expression of the corresponding object class.
5. Finally, if the formula contains F (“eventually”),
this is dropped in the clause representation
(since this is usually implicit in English;
e.g. “I did not buy the watch” generally means
“I did not eventually buy the watch”). If the
clause has a remaining negation (¬), the resulting
clause is negated.
² Costly variables could alternately be handled using some form of “as little as possible”, but we chose not to do this as it would make the resulting sentences needlessly complex.
Figure 1 outlines this process of constructing a
clause from the agent’s rule against shoplifting in
the example domain.
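The negation-pushing step can be made explicit with two standard dualities: the LTL identity between "always not" and "not eventually", and the quantifier identity between "for all, not" and "not exists". Writing $\psi(o)$ as shorthand (introduced here only for illustration) for the conjunction $\mathit{leave} \wedge \mathit{holding}(o) \wedge \neg\mathit{bought}(o)$ in the shoplifting rule, the rewrite proceeds as follows:

```latex
\begin{align*}
\forall o.\ \mathbf{G}\,\neg\psi(o)
  &\equiv \forall o.\ \neg\mathbf{F}\,\psi(o)
  && \text{since } \mathbf{G}\,\neg\psi \equiv \neg\mathbf{F}\,\psi \\
  &\equiv \neg\,\exists o.\ \mathbf{F}\,\psi(o)
  && \text{since } \forall o.\ \neg\chi \equiv \neg\,\exists o.\ \chi
\end{align*}
```

This is why a negated universal "always" rule ends up as a negated existential "eventually" clause, which is then realized with the determiner "any".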
4.1.3 Processing a conjunction of predicates
Because processing a conjunction of (possibly
negated) predicates is the least straightforward
part of our clause construction
process, we now explain how this is done.
Throughout the process, we maintain the list of
unused conjunction arguments unusedArgs; the
algorithm is finished when this list is empty.
We first sort predicates on three criteria, in
decreasing order of importance: (1) whether
the predicate is negated (non-negated first); (2)
whether the predicate corresponds to an action in
the RMDP, or a sentence with the agent as subject
and a verb in progressive (first) or perfect (second)
tense, or whether the predicate does not feature the
agent as subject; and (3) which objects and vari-
ables appear in the predicate (those without ob-
jects first; then those with variables; then those
with specific objects). Sorting predicates in this
order will help to ensure that the main verb of the
resulting clause is as similar to an action the agent
is performing as possible (“I leave the store while
having bought” instead of “I have bought while
leaving the store”), while minimizing the likelihood
of a double negation in the sentence (“I do
not not leave the store while holding...”).
Once predicates are sorted in this order, the
first predicate becomes the primary verb of the
clause (and is removed from unusedArgs). If this
predicate has an object/variable as an argument, a
“which” clause is constructed for that argument.³
³ If the argument in question is a universally-quantified object variable, we swap “which” for “all of which” because “which” is not semantically accurate.
A “which” clause is built for an object/variable
o by isolating each predicate in unusedArgs
which contains that object/variable as a parameter
(if o is the name of a particular object, all predicates
which also contain quantified object variables
are excluded from this list: they will be handled
in the “which” statement for that object variable).
These are sorted in a slightly different order:
predicates with the agent as subject
(“agent-subject”), first perfect then progressive tense;
then those in which the object/variable is subject
(“object-subject”) and finally those with another
subject entirely (“other-subject”). Within each of
these classes, the corresponding sentences are
conjoined by “and”, and in the “object-subject” class
the subject is elided. Then the individual “which”
statements are conjoined by “and”. Each predicate
in the “which” statement is removed from
unusedArgs. For example, if the main predicate
is putdown(t) and the other arguments are
¬bought(t), ¬onShelf(t), and holding(t), the
result is “put down t, which I have bought and I
am holding and which is not on the shelf”.
Once the first verb has been processed (and
potentially modified with its “which” clause),
remaining predicates are handled by adding “while”
statements. If these have the agent as subject,
the subject is elided and the verb conjugated
into present participle form (“I leave the
store while holding...”).⁴ Predicates without the
agent as subject are added in separate while
clauses, joined by “and”. For example, the clause
$\mathit{leave} \wedge \mathit{holding}(\mathit{glasses}) \wedge \mathit{bought}(\mathit{glasses}) \wedge \mathit{onShelf}(\mathit{watch})$
would correspond to the phrase
“I leave the store while holding the glasses, which
I have bought, and while the watch is on the shelf”.
Predicates handled in such a way are removed
from unusedArgs, and again “which” clauses
are constructed for them. Once unusedArgs is
empty, the algorithm terminates.
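A heavily simplified sketch of the “which”/“while” construction is given below. It is not the authors' implementation (which builds a predicate form realized through SimpleNLG); instead it fills hand-written string templates. The `LEX` lexicon, the `REF` referring expressions, and the function names are hypothetical. The sketch assumes that predicates arrive already sorted, that each predicate has at most one object argument, that only the first predicate is an action, and it omits the “other-subject” class, the “all of which” variant for universal variables, and the exclusion rule for quantified variables.

```python
from typing import List, NamedTuple, Optional


class Pred(NamedTuple):
    name: str
    obj: Optional[str] = None
    negated: bool = False


# Hypothetical lexicon: {x} is the referring expression, {n}/{p} are negation slots.
LEX = {
    "leave":   {"cls": "action", "main": "leave the store"},
    "putdown": {"cls": "action", "main": "put down {x}"},
    "bought":  {"cls": "agent", "tense": 0,          # perfect sorts first
                "which": "I have{n} bought", "part": "{p}having bought {x}"},
    "holding": {"cls": "agent", "tense": 1,          # progressive second
                "which": "I am{n} holding", "part": "{p}holding {x}"},
    "onShelf": {"cls": "object", "tense": 0,
                "which": "is{n} on the shelf", "full": "{x} is{n} on the shelf"},
}
REF = {"glasses": "the glasses", "watch": "the watch"}


def fill(template: str, q: Pred) -> str:
    return template.format(x=REF.get(q.obj, q.obj),
                           n=" not" if q.negated else "",
                           p="not " if q.negated else "")


def which_clause(obj: str, unused: List[Pred]) -> str:
    """Build ', which ...' for obj, consuming the predicates it uses."""
    related = [q for q in unused
               if q.obj == obj and LEX[q.name]["cls"] != "action"]
    for q in related:
        unused.remove(q)
    groups = []
    for cls in ("agent", "object"):          # agent-subject, then object-subject
        members = sorted((q for q in related if LEX[q.name]["cls"] == cls),
                         key=lambda q: LEX[q.name]["tense"])
        if members:
            groups.append("which " + " and ".join(
                fill(LEX[q.name]["which"], q) for q in members))
    return ", " + " and ".join(groups) if groups else ""


def build_clause(sorted_preds: List[Pred]) -> str:
    unused = list(sorted_preds)
    main = unused.pop(0)                     # first predicate: main verb
    which = which_clause(main.obj, unused) if main.obj is not None else ""
    text = "I " + fill(LEX[main.name]["main"], main) + which
    after_which, first = bool(which), True
    while unused:                            # remaining predicates: "while" parts
        q = unused.pop(0)
        entry = LEX[q.name]
        body = fill(entry["part"] if entry["cls"] == "agent" else entry["full"], q)
        which = which_clause(q.obj, unused) if q.obj is not None else ""
        if first:
            sep = ", " if after_which else " "
        else:
            sep = ", and " if after_which else " and "
        text += sep + "while " + body + which
        after_which, first = bool(which), False
    return text


print(build_clause([Pred("leave"), Pred("holding", "glasses"),
                    Pred("bought", "glasses"), Pred("onShelf", "watch")]))
# prints: I leave the store while holding the glasses, which I have bought, and while the watch is on the shelf
```

The list `unused` plays the role of unusedArgs: every predicate is consumed exactly once, either as the main verb, inside a “which” clause, or as a “while” statement.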
4.2 Embedding VEL clauses into response
Listing rules, actions, or violations: When the
user asks about the rules the agent follows or about
the complete list of actions performed or rules vio-
lated by either the actual or the counterfactual tra-
jectory, the response returned is a conjunction of
the converted clauses using “and”. In the case of
rules, each such clause is modalized with “must”
(and each rule for which the agent is not the sub-
ject of the clause is prefaced with “make sure
that”); for actions or violations, the sentences are
transformed into the past tense and (for the coun-
terfactual trajectory) modalized with “would”.
⁴ With perfect-tense predicates with the agent as subject the auxiliary ‘have’ is converted into participle form, e.g. “while having bought...”.
Input: $\langle o \rangle.\mathbf{G}\,\neg(\mathit{leave} \wedge \mathit{holding}(o) \wedge \neg\mathit{bought}(o))$
↓ Costly variables to universal quantification
$\forall o.\mathbf{G}\,\neg(\mathit{leave} \wedge \mathit{holding}(o) \wedge \neg\mathit{bought}(o))$
↓ Push negation outward
$\neg(\exists t.\mathbf{F}(\mathit{leave} \wedge \mathit{holding}(t) \wedge \neg\mathit{bought}(t)))$
↓ Process conjunction (see section 4.1.3)
$\neg\exists t.$ “I eventually leave the store while holding t which I have not bought”
↓ Replace quantifiers with determiners; add ref. expressions
$\neg$ “I eventually leave the store while holding any thing which I have not bought”
↓ Drop “eventually” and apply outermost “¬” to clause
Output: “I do not leave the store while holding any thing which I have not bought”
Figure 1: Converting a VEL statement into an English clause.
Rejecting the “why” premise: When responding
to “why φ?” by asserting ¬φ (perhaps with a
variable binding), the agent simply constructs the
VEL clause corresponding to ¬φ and converts it
into the past tense, e.g. “I did not φ”.
Query cannot be false: When the response to
“why φ?” is that it is not possible for it to be oth-
erwise, the VEL clause is converted into infinitive
form in a sentence of the form “it was impossible
for (subject) (not?) to (VP-infinitive)”.
Counterfactual explanations: When a coun-
terfactual trajectory is constructed to explain “why
φ?”, the VEL clause for ¬φis computed. If
this clause is negated (e.g. “I do not leave the
store”), then the negation is removed, and the verb
“avoid” added as an auxiliary, e.g. “I avoid leav-
ing the store”. Regardless, the clause is modal-
ized with “could” and put into the past tense, and
a canned subordinate clause is added depending on
whether the real trajectory was preferable (“...but
that would have violated more important rules”)
or equivalent (“...and that would not have broken
more important rules”).
Comparing real and counterfactual viola-
tions: When elaborating on the counterfactual ex-
planation for “why φ” by outputting a set of rules
violated by the counterfactual trajectory sufficient
to exceed the violation cost of the rules violated by
the actual trajectory, each such rule is negated and
converted into its corresponding VEL clause, each
of which is converted into gerund form. The resulting
sentence takes the form “X is worse than
Y” where X is the set of counterfactual violations
conjoined by “and”, and Y is the set of actual vio-
lations, also conjoined by “and”.
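The response templates in this section can be sketched as plain string functions. This is a minimal illustration that assumes the verb phrases have already been inflected (past participle, infinitive, or gerund) by a separate realization step such as SimpleNLG; the function names are hypothetical. The canned subordinate clause uses the wording “broken”, as in the system outputs shown in Dialogues 4 and 5.

```python
from typing import List


def _capitalize(sentence: str) -> str:
    return sentence[:1].upper() + sentence[1:]


def _conjoin(clauses: List[str]) -> str:
    return " and ".join(clauses)


def counterfactual_response(past_participle_vp: str, real_is_preferable: bool) -> str:
    """'I could have <VP> ...' followed by the canned subordinate clause."""
    tail = ("but that would have broken more important rules"
            if real_is_preferable
            else "and that would not have broken more important rules")
    return f"I could have {past_participle_vp} {tail}."


def impossibility_response(infinitive_vp: str, negated: bool = False) -> str:
    """'It was impossible for me (not) to <VP-infinitive>.'"""
    neg = "not " if negated else ""
    return f"It was impossible for me {neg}to {infinitive_vp}."


def comparison_response(counterfactual_gerunds: List[str],
                        actual_gerunds: List[str]) -> str:
    """'X is worse than Y', with each side conjoined by 'and'."""
    return _capitalize(
        f"{_conjoin(counterfactual_gerunds)} is worse than {_conjoin(actual_gerunds)}."
    )


print(counterfactual_response("left the store while holding everything", True))
# prints: I could have left the store while holding everything but that would have broken more important rules.
```

Keeping the canned text separate from the inflected clause mirrors the division of labour described above: the VEL-to-clause pipeline supplies the clause, and the template only decides mood, tense wrapper, and the trailing comparison.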
5 Evaluation
We conducted a preliminary evaluation to quan-
tify the human attitudes towards our approach in
terms of trust, mental model, and intelligibility of
We hypothesized that the explanations provided
by our approach would provide better performance
than both baselines in terms of mental model and
trust, and better than the surface representation
baseline in terms of intelligibility.
90 participants were recruited through Ama-
zon Mechanical Turk; 1 participant’s data was re-
moved because of technical failure delivering tran-
scripts. A total of 89 participants (Male: 54, Fe-
male: 32, Other: 2, No answer: 1) provided usable
data. The mean age was 35.4 years (SD: 11.4).
5.1 Conditions
Each participant was shown 5 text transcripts of
human-robot dialogue containing natural language
explanations, one at a time. Each explanation was
one of three conditions as described below. We
further discuss these conditions in section 6.
In the content generation baseline the system
produces a high-level statement indicating that it
compared the outcome to other reasons, without
fully explaining what alternatives it considered
and how they would be worse (and with no attempt
to translate the agent’s norms). These sentences
were hand-crafted by the authors. The purpose
of this baseline is to test for any improvements
that our content generation system produces.
Dialogue 6 is an example of this.
In the surface representation generation
baseline the system produces explanations with
the same content as our approach, but directly
translating VEL formulas (without our surface
representation) and inserting them into canned
sentences. The purpose of this baseline is to test
for any improvements that our surface generation
system produces. Dialogue 7 is an example of this.
Our experimental explanation baseline uses
our content generation and surface generation
approaches to provide a brief explanation
subdialogue. Dialogues 1 through 5 give the dialogues
appearing in this condition, where the Robot
utterances are the actual output of our system.
HUMAN: Why didn’t you buy any thing?
ROBOT: The assumption of the question is
HUMAN: Why didn’t you buy every thing?
ROBOT: The alternative was impossible.
HUMAN: Why did you buy the glasses?
ROBOT: For no rule-related reason; the alternative would have broken no more important rules.
HUMAN: Why did you not leave the store while holding every thing?
ROBOT: The alternative would have broken more important rules.
Dialogue 6: Examples of Content Generation Baseline.
5.2 Measures
Along with each dialogue interaction the subjects
were prompted to provide three ratings on a 5-
point Likert scale. In all three cases the labels of
the Likert scale were: Strongly agree, Agree, Neu-
tral, Disagree, Strongly disagree.
The first rating was designed to measure the
intelligibility of the system giving the explanation.
The wording of the statement was: “I understand
what the robot is trying to communicate.”
The second rating was designed to measure the
human subject’s perceived mental model of the
system giving the explanation. The wording of
the statement was: “I understand how the robot
makes decisions.”
The third rating was designed to measure the
subject’s trust in the system giving the explana-
tion. The wording was: “I trust this robot’s ability
to obey norms in a principled way.”
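As an illustration of how such ratings could be prepared for analysis, the sketch below maps the five labels to numbers and averages each measure over the dialogues a participant rated. The numeric coding (5 for "Strongly agree" down to 1 for "Strongly disagree"), the per-participant averaging, and the names `LIKERT_SCORES` and `participant_scores` are assumptions made for this example; the paper does not state how responses were coded or aggregated.

```python
from statistics import mean

# Assumed numeric coding: higher means more agreement. The paper lists the
# labels but does not state the numbers attached to them.
LIKERT_SCORES = {
    "Strongly agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly disagree": 1,
}

MEASURES = ("intelligibility", "mental model", "trust")


def participant_scores(ratings):
    """Average each measure over the dialogues one participant rated.

    `ratings` is a list of dicts, one per dialogue, mapping a measure name
    to the Likert label the participant chose.
    """
    return {
        m: mean(LIKERT_SCORES[r[m]] for r in ratings)
        for m in MEASURES
    }


# Made-up responses for two dialogues, for illustration only.
example = [
    {"intelligibility": "Agree", "mental model": "Neutral", "trust": "Agree"},
    {"intelligibility": "Strongly agree", "mental model": "Agree",
     "trust": "Neutral"},
]
print(participant_scores(example))
```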
HUMAN: Why did you not leave the store while holding every thing?
ROBOT: I could have made “For every thing, eventually leave the store and holding thing” false but that would have broken more important rules.
HUMAN: How would you have done that?
ROBOT: I would pick up the glasses, pick up the watch, buy the watch, and leave the store.
HUMAN: What rules would you have broken?
ROBOT: I would have broken the rule “For every thing, always not leave the store and holding thing and not bought
HUMAN: How would that have been worse?
ROBOT: Breaking the rule “always not leave the store and holding the glasses and not bought the glasses” is worse than breaking the rule “eventually leave the store and holding the
Dialogue 7: Examples of Surface Representation Baseline.
5.3 Results
To investigate whether the type of explanations
had an effect on people’s comprehension of what
the robot was trying to communicate (intelligibility),
people’s model of the robot’s decision-making
process (mental model) and how much the
robot was trusted (trust) we conducted three one-
way ANOVAs. For each model we used the fol-
lowing measures as dependent variables respec-
tively: a) intelligibility b) mental model and c)
trust. For all models we used condition (exper-
imental, content and surface) as the independent
variable. We found a main effect of condition
on intelligibility, F(2, 86) = 14.26, p < .001,
$\eta_p^2$ = .25, pairwise comparisons revealing that
people perceived explanations in the experimental
condition as more intelligible than in both the
surface (p < .001) and content (p < .001) conditions.
The condition variable also significantly
impacted people’s formation of a mental model,
F(2, 86) = 16.82, p < .001, $\eta_p^2$ = .28, the experimental
condition leading to better understanding
(more agreement with the statement) than both the
surface (p < .001) and content (p = .001) conditions.
Finally, we found a main effect of condition
on trust, F(2, 86) = 5.70, p = .005, $\eta_p^2$ = .12.
Pairwise comparisons showed that the robot was
trusted significantly more in the experimental con-
dition than in the surface condition (p=.004).
The comparison between the experimental condition
and the content condition with regard to trust
approached significance but did not reach the
.05 significance level (p = .060). Throughout, we found no
significant differences between the baseline condi-
tions, surface and content.
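As a consistency check on the effect sizes above, partial eta squared for a one-way design can be recovered from the F statistic and its degrees of freedom through the standard identity $\eta_p^2 = F \cdot df_1 / (F \cdot df_1 + df_2)$. The sketch below applies it to the reported F values; the function name `partial_eta_squared` is chosen for this example, and the check is not part of the authors' analysis.

```python
def partial_eta_squared(f_stat: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    numerator = f_stat * df_effect
    return numerator / (numerator + df_error)


# F values as reported in the Results paragraph, each with df = (2, 86).
reported = {
    "intelligibility": 14.26,
    "mental model": 16.82,
    "trust": 5.70,
}

for measure, f_stat in reported.items():
    print(measure, round(partial_eta_squared(f_stat, 2, 86), 2))
```

The identity reproduces the values of .25 and .28 reported for intelligibility and mental model, and gives approximately .12 for trust. The error degrees of freedom also match the sample: 89 participants minus 3 conditions is 86.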
6 Discussion and Conclusion
In terms of content generation, our primary con-
tribution is an algorithm that constructs explana-
tions for the behavior of an agent governed by
rules specified in violation enumeration language
(VEL) acting in an RMDP. The system can answer
queries about the rules themselves, how the
observed trajectory violates these rules, and “why”
queries which invite reasoning about counterfactual
trajectories. Here the assumption that the
environment is deterministic is restrictive.
Introducing nondeterministic environments raises the
possibility that not one but many counterfactual
trajectories would need to be generated in describing
why an agent made a particular decision.
Furthermore, how to construct reasonable explanations
when a bad outcome occurs due to environmental
stochasticity is a topic for empirical research.
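The comparison behind a “why” query in the deterministic case can be sketched roughly as follows: the rules violated by the actual trajectory are compared with those violated by a single counterfactual trajectory, and a high-level response is chosen. This is a simplified illustration and not the algorithm described in this paper. It assumes that each rule carries a numeric weight and that "more important" means a larger summed weight; the rule names and the functions `violation_cost` and `answer_why` are hypothetical. The response strings are those of Dialogue 6.

```python
from typing import Dict, FrozenSet, Optional


def violation_cost(violated: FrozenSet[str], weights: Dict[str, float]) -> float:
    """Sum the (assumed) numeric weights of the violated rules."""
    return sum(weights[rule] for rule in violated)


def answer_why(actual: FrozenSet[str],
               counterfactual: Optional[FrozenSet[str]],
               weights: Dict[str, float]) -> str:
    """Choose a high-level answer by comparing two sets of violated rules.

    `counterfactual` is None when no trajectory satisfies the user's premise.
    """
    if counterfactual is None:
        return "The alternative was impossible."
    if violation_cost(counterfactual, weights) > violation_cost(actual, weights):
        return "The alternative would have broken more important rules."
    return ("For no rule-related reason; the alternative would have "
            "broken no more important rules.")


# Hypothetical rule names and weights for the shopping domain.
weights = {"no_shoplifting": 10.0, "leave_with_everything": 1.0}
actual = frozenset({"leave_with_everything"})
steal_the_glasses = frozenset({"no_shoplifting"})

print(answer_why(actual, steal_the_glasses, weights))
print(answer_why(actual, None, weights))
```

In a nondeterministic environment, a single counterfactual set of violations would no longer suffice, which is one way to see why the determinism assumption is restrictive.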
From the perspective of generating surface rep-
resentations, our contribution is in a method for
constructing clauses corresponding to VEL state-
ments, which we then embed into response sen-
tence templates. One limitation is that we re-
strict the set of statements from which we can
construct clauses to a small fragment of VEL. We
note that the algorithmic (rule-based) approach we
employ to translating VEL statements to English
may require significant revision to be applicable
to broader categories of statements. Relaxing a
few of our assumptions (such as allowing disjunc-
tions) is likely fairly straightforward; others (such
as complex combinations of temporal operators)
would be significantly more involved even if it is
possible to express these sentences in a succinct
way that humans can understand.
As mentioned in section 2.1, we chose the shop-
ping robot domain for its simplicity rather than for
realism. In principle, the system may operate on
any RMDP and set of VEL norms that meet the
assumptions set out in section 4.1.1. Nevertheless,
implementing our approach on a physical robot
operating in a real environment is a topic for fu-
ture work.
The study confirmed our hypotheses that the
explanations provided by our approach would perform
better than both baselines we selected in terms of
intelligibility and mental model, and better than
the surface representation baseline in terms of
trust. Comparison to these baselines provides some
evidence for our approach’s value in terms of both
content generation and surface representation generation.
Our results corroborate Lim et al. (2009)’s finding
that explanations increase trust. Our results
also complement Chiyah Garcia et al. (2018)’s
finding that explanations can in some cases
improve the mental model of an agent: in their case
by varying the soundness and completeness, and
in our case through our approach to content
generation and surface representation generation.
Our evaluation demonstrates that explaining be-
havior in terms of norms translated from VEL to
English can facilitate trust and improve mental
models versus naive methods for explaining the
agent’s behavior that (a) do not directly reference
the agent’s norms, or (b) translate those norms in
the most naive possible way. We do not compare
our approach to other approaches to constructing
English text from formulae, e.g. in first-order
logic (Kutlak and van Deemter, 2015; Flickinger,
2016). These approaches solve a slightly different
problem than our approach does, and would likely
require significant adaptation to solve the prob-
lem of explaining norm-related agent decisions.
Nevertheless, comparison with these methods (and
with state-of-the-art deep learning methods such
as in Manome et al., 2018) is a fruitful topic for
future work.
By enabling agents to craft natural language
explanations of behavior governed by temporal
logic rules, our approach provides an early step
towards systems which can not only explain their
behavior, but also engage in model reconciliation
(Chakraborti et al., 2019), updating their
understanding of both the rules and their relative
importance and the dynamics of the environment by
interacting with human users while informing those
users about the way the system operates.
7 Acknowledgements
This project was supported in part by ONR
MURI grant N00014-16-1-2278 and NSF IIS
grant 1723963.
References
Tathagata Chakraborti, Sarath Sreedharan, Sachin
Grover, and Subbarao Kambhampati. 2019. Plan
explanations as model reconciliation. In 2019 14th
ACM/IEEE International Conference on Human-
Robot Interaction (HRI), pages 258–266. IEEE.
Francisco Javier Chiyah Garcia, David A. Robb,
Xingkun Liu, Atanas Laskov, Pedro Patron, and
Helen Hastie. 2018. Explainable autonomy: A study
of explanation styles for building clear mental
models. In Proceedings of the 11th International
Conference on Natural Language Generation, pages
99–108, Tilburg University, The Netherlands.
Association for Computational Linguistics.
Nadine Elzein. 2019. The demand for contrastive
explanations. Philosophical Studies, 176(5):1325–1339.
Armin Fiedler. 2001a. Dialog-driven adaptation of ex-
planations of proofs. In International Joint Con-
ference on Artificial Intelligence, volume 17, pages
Armin Fiedler. 2001b. User-adaptive proof
explanation. Ph.D. thesis, Universität des Saarlandes.
Dan Flickinger. 2016. Generating English paraphrases
from logic. In From Semantics to Dialectometry,
pages 99–107.
Maria Fox, Derek Long, and Daniele Magazzeni. 2017.
Explainable planning. In Proceedings of the IJCAI
2017 Workshop on Explainable AI.
Albert Gatt and Ehud Reiter. 2009. SimpleNLG: A re-
alisation engine for practical applications. In Pro-
ceedings of the 12th European Workshop on Natural
Language Generation (ENLG 2009), pages 90–93.
Helmut Horacek. 2007. How to build explanations of
automated proofs: A methodology and requirements
on domain representations. In Proceedings of AAAI
ExaCt: Workshop on Explanation-aware Comput-
ing, pages 34–41.
Joseph Kim, Christian Muise, Ankit Shah, Shubham
Agarwal, and Julie Shah. 2019. Bayesian inference
of temporal specifications to explain how plans dif-
fer. In Proceedings of the ICAPS 2019 Workshop on
Explainable Planning (XAIP).
Benjamin Krarup, Michael Cashmore, Daniele Mag-
azzeni, and Tim Miller. 2019. Model-based con-
trastive explanations for explainable planning. In
Proceedings of the ICAPS 2019 Workshop on Ex-
plainable Planning (XAIP).
Roman Kutlak and Kees van Deemter. 2015. Generat-
ing Succinct English Text from FOL Formulae. In
Procs. of First Scottish Workshop on Data-to-Text
Pat Langley. 2019. Explainable, normative, and jus-
tified agency. In Proceedings of the Thirty-Third
AAAI Conference on Artificial Intelligence.
Brian Y Lim, Anind K Dey, and Daniel Avrahami.
2009. Why and why not explanations improve the
intelligibility of context-aware intelligent systems.
In Proceedings of the SIGCHI Conference on Hu-
man Factors in Computing Systems, pages 2119–
2128. ACM.
Kana Manome, Masashi Yoshikawa, Hitomi Yanaka,
Pascual Martínez-Gómez, Koji Mineshima, and
Daisuke Bekki. 2018. Neural sentence generation
from formal semantics. In Proceedings of the
11th International Conference on Natural Language
Generation, pages 408–414, Tilburg University, The
Netherlands. Association for Computational Linguistics.
Nima Pourdamghani, Kevin Knight, and Ulf Herm-
jakob. 2016. Generating English from abstract
meaning representations. In Proceedings of the 9th
International Natural Language Generation confer-
ence, pages 21–25, Edinburgh, UK. Association for
Computational Linguistics.
Stylianos Loukas Vasileiou, William Yeoh, and
Tran Cao Son. 2019. A general logic-based ap-
proach for explanation generation. In Proceedings
of the ICAPS 2019 Workshop on Explainable Plan-
ning (XAIP).