International Conference on Naturalistic Decision Making 2019, San Francisco, California
Naturalistic Psychological Model of Explanatory
Reasoning: How People Explain Things to Others and to Themselves
Gary KLEINa, Robert HOFFMANb, and Shane MUELLERc
bInstitute for Human Machine Cognition
cMichigan Technological University
The process of explaining something to another person is more than offering a statement. Explaining
means taking the perspective and knowledge of the Learner into account, and determining whether
the Learner is satisfied. While the nature of explanation, conceived of as a set of statements, has
been explored philosophically, empirically, and experimentally, the process of explaining, as an
activity, has received less attention.
We conducted a naturalistic study, looking at 74 cases of explaining. The results imply three
models: local explaining about why a device (such as an intelligent system) acted in a surprising way,
global explaining about how a device works, and self-explaining in which people craft their own
understanding. The examination of the process of explaining as it occurs in natural settings helped
us to identify a number of mistaken beliefs about how explaining works.
Sensemaking; electronics; engineering
From time to time, we all need to help another person understand why something happened, why a machine
behaved in an unexpected manner, or even how a complex device works. When we explain, we don’t want to
fashion an explanation that is too detailed, or one that lacks the necessary detail. Our objective in the work reported
here was to explore the nature of explaining in various contexts. One of those contexts is the field of Artificial Intelligence (AI).
The challenge of explaining how AI systems think is long-standing, as seen in the pioneering work of William Swartout and William Clancey (McKeown & Swartout, 1987; Swartout & Moore, 1993; Clancey, 1983, 1987). They argued that the AI system needs to have within it an explainable model of the task, and also a model
of the user. In contrast to current Deep Net and Machine Learning systems, the early explainable systems had easy
access to symbolic notation of knowledge, possibly making it easier to create human-meaningful accounts of the
workings of the system. Yet many of those explanation systems failed to achieve what they had promised. Today’s
AI systems use calculational mechanisms that are much more opaque and may take substantial inferencing to be
made sense of, and so the challenge is even greater.
To some extent, the same argument could be made about how humans think about the reasoning of other
humans. Nevertheless, Pearl (2018) notes that even though we have such a meager understanding of how our
minds work, and how other people think, we can still communicate with each other, learn from each other, guide
each other and motivate each other. We can dialog in a language of cause and effect. In contrast, we cannot dialog
with intelligent machines and one reason is that they do not “speak” meaningfully about cause and effect. People
communicate via a language of reasons, which is different from the AI language of variables, weights, and connections.
We believe that today’s AI explanation systems are still in the formative stage and are just starting to
tackle the challenges of explanation that proved the undoing of earlier AI systems.
In their review of the literature, Hoffman et al. (2018) found that while there was ample material on the
process of generating explanations, the process of explaining has not been much studied. Explanations are
Gary K. et al – Explanatory Reasoning
typically taken to be statements and can be evaluated in terms of factors such as clarity, comprehensiveness, and
accuracy. In contrast, “explaining” is an interactive dialog that involves the Explainer and at least one Learner.
To be effective, the process of explaining needs to take the Learner's perspective into account.
As part of the DARPA Explainable Artificial Intelligence (XAI) program we conducted a naturalistic
study of what happens when a person tries to explain the reasons for a decision or action or the workings of a
device to another person. Our goal for this naturalistic study was to model how explanations are formed. Such a
model might help the researchers seeking to enhance the explainability of intelligent systems.
CORPUS OF EXAMPLES
We identified and examined 74 examples, some as complex as the Air France 447 disaster, others much simpler,
attempting to learn about the process of explaining from these examples. A number of the examples came from
Degani's (2004) descriptions of automation failures. Other cases came from news media, and from interviews we
had conducted for other projects. A number of examples came from the Reddit web site “Explain Like I’m Five,”
which attempts to explain complex phenomena for unsophisticated audiences. From this set of 74, we selected 26 examples for further examination based on their richness and their judged potential for constructing a model of explaining. The cases included intelligent systems (e.g., IBM's Watson playing Jeopardy, AlphaGo playing
Go), minimally intelligent systems (e.g., autopilots of commercial airlines and passenger ships, cruise controls for
automobiles), mechanical systems (e.g., ceiling fans, motel alarm clocks, blood pressure monitors), and some
decision making events that did not involve machines (e.g., Magnus Carlsen's winning move in a chess match).
Types of Explaining Activity
We divided the examples into two types: local and global. Local explaining seeks to justify why specific actions
were taken or decisions were made. In contrast, global explaining involves confusion or uncertainty, usually about
how a system works. The explaining in these instances is not tethered to any particular incident.
In addition to local and global examples, we investigated a third type of explaining activity, self-explaining, where
there is no Explainer. Instead, Learners have to gather available evidence and sort it out by themselves. We make
no claim for the exhaustiveness of these categories — the local/global distinction formed a part of the initial
rationale of the DARPA XAI program, and the addition of an orthogonal self-explaining category emerged
through observation of efforts of the eleven teams in the DARPA program.
RESULTS PART 1: LOCAL EXPLAINING
People request local explanations when they want to know why something happened. Events did not go as
expected, or a machine acted up for some reason. The process is fairly straightforward — a surprising event
engenders a need for something to be explained, leading to a diagnosis of this request, followed by a process of
building and then packaging the explanation in the context of the Learner’s background. We identified 28
examples of local explaining, out of our set of 74 cases, and we used them to create a generic script for the process
of explaining, as shown in Figure 1. In all of the examples we studied, the Explainer began by trying to diagnose
the reason for the inquiry. Since an assumption was violated, the Explainer tried to determine which assumption
was wrong, in order to correct it. This process is very focused and can be very brief, as opposed to presenting a
lengthy account of a system or a situation and then trying to extract the gist.
Figure 1: Model of the Process of Local Explaining.
One example was the Airbus A330 tragedy, in which there were 228 deaths. The airplane took off from
Rio de Janeiro, Brazil on 1 June 2009, on its way to Paris, France. However, after lifting off and climbing, the
airplane crashed into the Atlantic Ocean. Some wreckage and two bodies were recovered in a few days, but the
black boxes, the remaining bodies, and the bulk of the wreckage were not located for another two years. Why did
the airplane crash? The explanation that the aircraft stalled raises the further question of why it stalled. One obvious explanation is pilot error, but a more careful examination of the incident shows that the pilot error was itself caused by misrepresentations about the intelligent technology that supposedly made the airplane stall-proof. The
mechanisms that were designed to prevent stall were valid as long as the airplane sensors were working correctly,
but when they iced up (as in the incident), the airplane became vulnerable to stalling. A more detailed attempt to
explain the accident identified seven distinct causes that resulted in confusions that chained and intersected. An
event that first seemed to make no sense came into focus.
Reviewing the 28 incidents, we identified several criteria for what would count as a plausible cause for the surprising event.
In parallel with identifying the causes to go into a story is the process of building the story around the
causes. In building the story, plausibility comes into play. Each state transition in the chain has to plausibly follow
from the previous state. The Learner needs to imagine how he/she would make the transitions. If plausibility is
breaking down, the explanation is seen as problematic.
With more complex cases, the Explainer may shift from story building to a Causal Landscape (Klein, 2018) or some other representation of a larger number of causes that operate in parallel and also intersect.
Sometimes, the Explainer will draw a diagram to show the new belief/assumption. Explaining can take other
forms, such as using a contrast to illustrate how the current situation is different from one that seems similar, or
offering an analog to get the point across.
Next, the Explainer will give some thought to packaging the explanation. The context, along with the
Learner’s characteristics, will affect tradeoffs of effort, cost and time. Stories should not be too complicated —
perhaps invoking three causes or fewer, and no more than six transitions.
In story building there is a cognitive strain to provide appropriate complexity without being overwhelming. There
are several ways to reduce the number of causes so as to increase the Learner’s comprehension. One way to
maintain the constraint of three causes is to be selective about which causes to include, dropping the ones that are
less relevant. Another approach to keep things manageable is to lump several causes — to abstract them into a
more general cause.
When is the explaining process finished? The stopping point is when the Learner arrives at a perspective
shift, as a result of modifying or replacing a belief/assumption. This perspective shift enables the Learner to
appreciate that he/she would have made the decision or recommendation in question given what was known at the time.
Successful explaining also leads to performance outcomes: the Learner can now do a better job in
carrying out a task. The Learner can generate more accurate expectancies or predictions about a system or about
another person. The Learner will now be able to shift perspectives and see the tasks from the viewpoint of the
intelligent system or the other person. The Learner will do better at gauging trust, especially trust in smart machines.
RESULTS PART 2: GLOBAL EXPLAINING
Researchers have acknowledged since at least as far back as Clancey (1983) that global understanding is an
important goal of explanation. In contrast to local explaining, which focuses on what happened during a specific
incident, global explaining is about how things work generally — how a device works, how a strategy works, how
an organization works. Our model of global explaining is based on 46 of the 74 cases; we focused on 19 of those cases because of the completeness of their accounts of "How does x work?" The cases were most often
expressed as questions, such as: Why do some modern elevators not stop at the next floor? Why is it so hard to
set a digital watch? Why are we often confused by ceiling fans, by airplane reading lights, by the mute button on
a TV remote, by motel telephones and clock alarms?
For example, how does a ceiling fan work (and why do we sometimes get confused about operating
them)? Ceiling fans use a simple interface that doesn’t require a screen or anything fancy — just the cord and
the visual of the fan turning.
The source of confusion is that if the blades are rotating, it is easy to forget how many times you tugged
on the cord. And then you won’t know if the next tug will increase the rotation speed more or will turn it off. The
device does not display its history. All you know is whether or not it is rotating and how fast. Making it more
confusing, you don’t get instant feedback, as you would with a 3-way bulb. If you are already at the highest speed
the next tug will turn it off. But you wouldn’t know that because it continues to rotate. So you tug the cord again,
starting it up again. The delayed feedback can make it very difficult to control the fan. Therefore, the essence of
understanding the fan is to grasp the exception and the reason for it – lack of displayed history and current state,
plus delayed feedback.
In many ways our account of global explaining is similar to that of local explaining. However, we found
two fundamental differences. One difference is that local explanations typically assume that the Learner is familiar
with the set-up or with the device, hence the surprise when expectations are violated. Global explanations do not
assume familiarity. Therefore, global explanations cannot focus on the violated expectation.
A second difference is that with local explaining, the Explainer seeks to diagnose the confusion, typically
zeroing in on a flaw in the Learner's mental model. The Explainer then seeks to help the Learner revise his/her mental model. For global explaining, the Explainer has no reason to believe that the Learner's mental model is defective
and so the Explainer is not seeking to correct the Learner’s mental model — only to expand or enrich it. Figure 2
shows a script for global explaining, starting with the issue of how the device or computational system performs
a function of interest.
Figure 2: Model of the Process of Global Explaining.
All of the 46 cases of global explaining in our sample were triggered by ignorance and curiosity. The
Learner’s features are essentially the same as in local Explaining. The important factors are: sophistication of the
Learner’s mental model, the Learner’s goals, common ground issues, and time pressure.
Time pressure is usually less of an issue with global explaining than local explaining, which is triggered
by a need to understand and react to a surprise. In addition, the issue of situation understanding does not come up
for global Explaining because the explaining process is not tethered to a specific incident.
In our sample of 46 cases of global explaining, 37 appeared to involve some attempt to take the features of the Learner into account; these addressed the first feature, the imagined sophistication of the Learner's mental model. The remaining 9 cases showed no sign of gauging the Learner's sophistication.
For global explaining, the process of diagnosis is different than for local Explaining. The Explainer is
not diagnosing the Learner’s confusion or flaws in the Learner’s beliefs. Instead, the Diagnosis process in Figure
2 is primarily about the Explainer’s speculations about what the Learner is missing. The Learner may be missing
a framework if the system is sufficiently strange. Or the Learner may be missing some of the components or some
of the links. Or the Learner may be missing the causal information that makes the story or the diagram plausible.
The Explainer’s assessment will guide the way he/she describes the working of the system.
Figure 2 introduces the concept of an Explanatory Script. The script consists of the topics most frequently
used in explaining how something works. In reviewing the 19 cases of global explaining in our corpus that we
studied in greater depth, we identified several recurring elements.
Components: The components of the device or computational system.
Links: The causal links connecting these components.
Near neighbors: There often is a comparable device that can serve as an analog, and the explaining will also describe contrasts with this near neighbor.
Exceptions: The situations that the device doesn't handle well, plus an account of why they are so troublesome, such as the delayed feedback and lack of a history display with the ceiling fan. Thirteen of the 19 cases included an exception, often as the focus of the explanation. Tacit knowledge is often introduced here as the types of knowledge needed to operate the device when it encounters these exceptions.
We find that a preferred format for global explaining is a diagram, rather than a story, to portray components and causal linkages. Further, the diagram format is typically embellished with annotations in order to describe the challenges, the nearest neighbor, the contrasts to that neighbor, and the exceptions.
The Learner’s features come into play to select the level of detail the Explainer uses for the elements in
the Explanatory Script. The Learner's mental model affects the level of detail most heavily, but the set of goals
that motivated the Learner to seek an explanation would also impact the level of detail provided.
What is the stopping point? The Explainer and the Learner are both seeking an outcome in which the
Learner can mentally simulate the operation of the device. Each mental simulation is essentially a story, moving
from one state to another, with plausible transitions. The Learner is trying to imagine how these transitions work.
Learners will be satisfied to stop if they feel confident that in most cases they will be able to imagine the system
outputs if they are given the system inputs.
RESULTS PART 3: SELF-EXPLAINING
Our third path of investigation was to examine what happens when a person engages in self-explaining.
Chi and VanLehn have conducted extensive studies of the process of self-explaining (e.g., VanLehn et al., 1990; Chi & VanLehn, 1991; Chi & Wylie, 2014). One of their conclusions is that Learners retain more information and
achieve deeper understanding when they construct an explanation — when they fill in details and critical steps
rather than passively receiving information. We have attempted to build on this work by applying it to self-
explanation of Machine Learning systems. For this path, we were only interested in attempts to understand the
workings of an intelligent system. We suggest that in self-explaining, the Learner begins with a frame of some
sort. It may be a story, which is a causal chain, or a visualization such as a diagram, or an analog.
With self-explaining, no one is around to diagnose the problem: the reason why the Learner was surprised (for local Explaining), or why the Learner was uncertain about how the system works (for global Explaining). Learners
have to do it themselves — have to gain insight into the flaws in their mental model or the missing cues and the
missing connections, or the mistaken information, or the ways that context changes the situation.
The Learner is engaged in sensemaking; that is what self-explaining how a device works amounts to. Learners engage in a sensemaking activity to build or modify a frame for understanding the device. The Learner will typically start with some sort of frame. Often this frame takes the form of a story, a causal chain (which constitutes a mental model) that will let Learners achieve their goals. The causal chain is the most typical form, but there are other forms of self-explaining: creating a visualization such as a diagram, or invoking an analog plus a contrast that describes how the analog, although similar, differs in important ways; that comparison serves to deepen the self-explaining.
With local Explaining, the Learner is questioning a frame that has up to that point been used to understand
a situation. The frame can take several forms, e.g., a story, a visualization, an analog. As Klein, Hoffman and
Moon (2006) suggested, one tactic is to preserve the frame by explaining away the discrepancy or modifying the
frame in minor ways. Another tactic is to re-frame, either to discard the frame in favor of a better one or else to
use the available data to construct a frame from scratch. This re-framing or frame construction process is
particularly relevant for global Explaining, as the Learner is trying to build a mental model (which can be
considered as a type of frame).
Learners employ several different types of sensemaking tactics.
(a) They can manipulate the devices to explore how they work by changing inputs and seeing how the outputs vary. Of particular value are interventions to make the AI fail. If Learners know how the AI works, then they should be able to break it; by trying to break it, they are learning how it works. This is the concept of "explanation as exploration." Learners can do this via occlusions and masks, by altering the training set to see the impact, or by entering erroneous data to see what happens.
(b) Learners can diagnose failures, whether historical failures, failures they incur, or failures they deliberately manufacture. Learners can look for situations in which the AI device generated the wrong answers, even stupid answers. Some of these are available in the literature, and sometimes the developers provide these examples. These are the exceptions discussed earlier with regard to global Explaining.
(c) Learners can study examples and analogs for clues about how the device works; the analogs are a source of frames to try out. Analogs include similar cases and near neighbors; Learners can draw inferences about similarities and also search for contrasts.
(d) Learners can engage in a limited dialog with a device.
(e) Learners can examine the options considered by the AI system and the evaluation method used to select from the options. To support this type of probe, AI systems can show a list of the options generated by the system and a pre-determined set of evaluation features. Another tactic is for the AI system to show the Learner the dominant goal (from a pre-defined and limited menu of goals) for each segment of a mission.
(f) Attention. Learners may find it useful to determine what the system is attending to. Some devices present heat maps (a rough analog to attention) and saliency maps. An AI system can also provide a timeline to synchronize attention with events. Learners may get to see the components of an image that generated a label, or be shown visualizations of high-dimensional features.
(g) Representations of internal structure. Many kinds of diagrams and visualizations come into play here as support for self-explanation and for enabling the Learner to visualize the internal functionality of the AI. These include diagrams of algorithms, of a feature matrix, of an AND/OR graph, a decision tree, a causal network, modules in a reasoning chain, and tables listing features that justify a classification.
(h) Verbal labels for features.
(i) Training history. Learners can benefit from a description of how the system was trained.
We also offer the distinction between mindless self-explaining and meaningful self-explaining. Mindless
self-explaining might be thought of as a shotgun approach, scattering "clues" and assessing whether users find
these helpful. The tactic is to provide (mostly) true features of the AI system, leaving it to the user to make sense
of them. In contrast, meaningful self-explaining is guided by a concept of what the self-explaining should achieve.
It would be guided by the Chi and VanLehn findings about the importance of constructing an explanation as
opposed to passively receiving information. It would be guided by the work of Klein (2013) on various pathways
for making discoveries. It would be guided by the work of Klein, Borders, and Besuijen (2018) on mental models.
One primary goal of self-explaining is to form a better mental model of how a device works, but Klein, Borders
and Besuijen (2018) have argued that mental models have several facets — not just a knowledge of how a system
works but also its limitations and the ways it can fail. Additionally, mental models of devices include tactics for
making a system work — adapting it as necessary. And mental models can include a knowledge of the ways that
people can get confused and misled in operating a system. We suggest that attempts to use self-explaining to gain
better understanding of AI systems address all of these facets and not just the first one, the way the system works.
Of particular importance would be material about how a system might fail, and the diagnoses for failures. We
suggest that attempts to foster self-explaining formulate a richer concept of what the user needs to understand,
going beyond the notion of more specifics about how the system works.
CONCLUSIONS ABOUT LOCAL EXPLAINING
Surprise: The explaining process is triggered by a surprise, a violated expectation, as opposed to seeing
explanations as attempts to fill slots and gaps in knowledge.
Diagnosis: Diagnosis on the part of the Explainer is critical to pin down the violated expectation. In contrast, other accounts of explanation start with a complete explication of all the relevant causes and their connections, and view the challenge as trimming details and simplifying (gisting). We disagree. In our view, explaining is a very focused process of diagnosing a single flawed assumption, or a small set of flawed assumptions, rather than trimming details from a comprehensive account.
Perspective Mismatch: How do we diagnose the reason why another person or a mechanical device made a surprising decision? Our hypothesis is that there is a small set of possible reasons and these can help the Learner
make a perspective shift. The reasons we identified are: That person/device might have different knowledge than
I do, or different goals, or might be operating under different constraints, or using different reasoning tactics
(which is especially important in dealing with AI), or is aware of different affordances than I am, or has a different
mindset, or has sized up the situation differently than I did, or has a different value system than I do.
Stopping Rule: The stopping rule for explaining is based on a perspective shift in which the Learner gains the
ability to see the situation from the vantage point of the other person or the device, so that the “surprising” event
is no longer a surprise.
Language of Reasons: Explaining relies on a language of reasons. These reasons can be causes, analogs,
contrasts, confusions, and stories. The language of reasons, of causality, is different from the language of
correlation and the strengthening/weakening of connections between layers in a neural net.
Contrasts: Stories often explain by presenting contrasts. Our literature review (Hoffman et al., 2018) turned up
papers asserting that the Learner is not simply wondering why a device recommended course of action x, but
rather, why did it recommend x as opposed to y? Our naturalistic study showed that there are other contrasts of
interest besides alternative courses of action. There can be contrasts in beliefs, in goals, and in the way the situation is sized up.
CONCLUSIONS ABOUT GLOBAL EXPLAINING
Explanatory Script: We postulate an Explanatory Script which is a set of several items: components of the
system, the causal links between the components, the nearest neighbor along with contrasts to that analog, and the exceptions.
Exceptions: The last component of the Explanatory Script, the exceptions, is often the richest one for explaining
how the system works — or doesn’t work. These kinds of exceptions provide insight into the inner workings of a
program and serve an important function in reminding us that Machine Learning systems rely on very different
reasoning strategies than people do.
Diagrams: Global explaining typically depends on a diagram of internal structure, often with annotations, as
opposed to the story format for local Explaining.
Stopping Point: The stopping point for global Explaining is to get the Learner to be able to run a mental
simulation with standard starting conditions and be reasonably confident in the output of the mental simulation.
CONCLUSIONS (THUS FAR) ABOUT SELF-EXPLAINING
We view self-explaining as a type of sensemaking. We have also identified several sensemaking tactics that come
into play: manipulating the device, diagnosing the reasons for failures, studying examples, and studying analogs
(near neighbors). Another tactic for novice device users is to discard a mental model that assumes that the
advanced AI device thinks the way they do.
Acknowledgement and Disclaimer
This material is approved for public release. Distribution is unlimited. This material is based on research sponsored
by the Air Force Research Lab (AFRL) under agreement number FA8650-17-2-7711. The U.S. Government is
authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation
thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as
necessarily representing the official policies or endorsements, either expressed or implied, of AFRL or the U.S. Government.
REFERENCES
Chi, M.T.H., & VanLehn, K. (1991). The content of physics self-explanations. The Journal of the Learning Sciences, 1(1), 69-105.
Chi, M.T.H., & Wylie, R. (2014). The ICAP framework: Linking cognitive engagement to active learning outcomes. Educational Psychologist, 49(4), 219-243.
Clancey, W.J. (1983). The epistemology of a rule-based expert system: A framework for explanation. Artificial Intelligence, 20(3), 215-251.
Clancey, W.J. (1987). Knowledge-based tutoring: The GUIDON program. Cambridge, MA: MIT Press.
Degani, A. (2004). Taming HAL: Designing interfaces beyond 2001. New York: Palgrave/Macmillan.
Hoffman, R.R., Klein, G., and Mueller, S.T. (2018, January). "Literature Review and Integration of Key Ideas for
Explainable AI." Report prepared by Task Area 2, DARPA XAI Program.
Klein, G. (2013). Seeing what others don't: The remarkable ways we gain insights. New York: PublicAffairs.
Klein, G. (2018, March/April). Explaining explanation, Part 3: The causal landscape. IEEE Intelligent Systems.
Klein, G., Borders, J., & Besuijen, R. (2018). Mental models project report. Final report prepared for the Center
for Operator Performance, Dayton, Ohio.
McKeown, K.R., & Swartout, W.R. (1987). Language generation and explanation. Annual Review of Computer Science, 2(1), 401-449.
Pearl, J. (2018). The book of why: The new science of cause and effect. New York: Basic Books.
Swartout, W.R., & Moore, J.D. (1993). Explanation in second generation expert systems. In Second generation expert systems (pp. 543-585). Berlin: Springer-Verlag.
VanLehn, K., Ball, W., & Kowalski, B. (1990). Explanation-based learning of correctness: Towards a model of the self-explanation effect. In M. Piatelli-Palmarini (Ed.), Proceedings of the Twelfth Annual Conference of the Cognitive Science Society (pp. 717-724). Hillsdale, NJ: Erlbaum.