Robert R. Hoffman
Institute for Human and Machine Cognition
Pensacola, FL
Gary Klein
Macrocognition, LLC
Washington, DC
Shane T. Mueller
Michigan Technological University
Houghton, MI
What makes for an explanation of "black box" AI systems such as Deep Nets? We reviewed the
pertinent literatures on explanation and derived key ideas. This set the stage for our empirical
inquiries, which include conceptual cognitive modeling, the analysis of a corpus of cases of
"naturalistic explanation" of computational systems, computational cognitive modeling, and the
development of measures for performance evaluation. The purpose of our work is to contribute to
the program of research on “Explainable AI.” In this report we focus on our initial synthetic
modeling activities and the development of measures for the evaluation of explainability in
human-machine work systems.
The importance of explanation in AI has been
emphasized in the popular press, with considerable
discussion of the explainability of Deep Nets and
Machine Learning systems (e.g., Kuang, 2017). For
such “black box” systems, there is a need to explain
how they work so that users and decision makers can
develop appropriate trust and reliance. As an
example, referencing Figure 1, a Deep Net that we
created was trained to recognize types of tools.
Figure 1. Some examples of Deep Net classification.
Outlining the axe and overlaying bird silhouettes
on it resulted in a confident misclassification. While a
fuzzy hammer is correctly classified, an embossed
rendering is classified as a saw. Deep Nets can
classify with high hit rates for images that fall within
the variation of their training sets, but are nonetheless
easily spoofed using instances that humans find easy
to classify. Furthermore, Deep Nets have to provide
some classification for an input. Thus, a Volkswagen
might be classified as a tulip by a Deep Net trained to
recognize types of flowers. So, if Deep Nets do not
actually possess human-semantic concepts (e.g., that
axes have things that humans call "blades"), what do
the Deep Nets actually "see"? And more directly,
how can users be enabled to develop appropriate trust
and reliance on these AI systems?
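The forced-choice behavior just described can be sketched with a toy softmax output layer; the class names and logit values below are invented for illustration and are not taken from our network:

```python
import numpy as np

def softmax(logits):
    """Convert raw network scores into a probability distribution."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

# Hypothetical final-layer scores from a "flower" classifier shown a
# photograph of a car. There is no "none of the above" output, so the
# network must distribute its confidence across the flower classes.
classes = ["tulip", "rose", "daisy", "orchid"]
logits = np.array([3.1, 0.4, -0.2, 1.0])

probs = softmax(logits)
best = classes[int(np.argmax(probs))]
print(best, round(float(probs.max()), 2))  # the car comes out a confident "tulip"
```

Because every probability mass must go somewhere, the out-of-distribution input still receives a confident label.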
Articles in the popular press highlight the
successes of Deep Nets (e.g., the discovery of
planetary systems in Hubble Telescope data;
Temming 2018), and promise diverse applications "...
the recognition of faces, handwriting, speech...
navigation and control of autonomous vehicles... it
seems that neural networks are being used
everywhere" (Lucky, 2018, p. 24).
And yet "models are more complex and less
interpretable than ever... Justifying [their] decisions
will only become more crucial" (Biran and Cotton,
2017, p. 4). Indeed, a proposed regulation before the
European Union (Goodman and Flaxman, 2016)
asserts that users have the "right to an explanation.”
What form must an explanation for Deep Nets take?
This is a challenge in the DARPA "Explainable
AI" (XAI) Program: To develop AI systems that can
engage users in a process in which the mechanisms
and "decisions" of the AI are explained. Our tasks on
the Program are to:
(1). Integrate philosophical studies and psychological
research in order to identify consensus points, key
concepts and key variables of explanatory reasoning,
(2). Develop and validate measures of explanation
goodness, explanation satisfaction, mental models
and human-XAI performance,
(3) Develop and evaluate a computational model of how people understand computational devices, and then evaluate the model using the validated measures,
(4) Generate a corpus of cases in which people try to explain the workings of complex systems, especially computational systems, and
(5) From the case analysis, create a "naturalistic decision making" model of explanation that can guide the development of XAI systems by computer scientists.
Copyright 2018 by Human Factors and Ergonomics Society. DOI 10.1177/1541931218621047
Proceedings of the Human Factors and Ergonomics Society 2018 Annual Meeting 197
In this presentation we report progress on our
synthesis of ideas and concepts of explanation, the
development of models of the explanation process,
and the development of metrics.
A thorough analysis of the subject of explanation would have to cover literatures spanning the history of Western philosophy, in disciplines including the philosophy and psychology of science, cognitive psychology, psycholinguistics, and expert systems.
The archive we created includes over 700 papers.
The challenge of XAI entrains concepts of representation, modeling, language understanding, and learning, as well as abductive inference, causal reasoning, mental models, and self-explanation. Potentially measurable features of explanation include: the various forms of explanation (e.g., contrastive explanation, counterfactual reasoning, mechanistic explanation); the various utilities or uses of explanation (e.g., diagnosis, prediction); and the limitations or foibles of explanatory reasoning (e.g., people will believe explanations to be good even when they contain flaws or gaps in reasoning) (Lombrozo & Carey, 2006).
Many researchers present a list of the features
that are believed to characterize “good” explanations
(e.g., Brezillon and Pomerol, 1997). These include
context- or goal-relevance, reference to cause-effect
covariation and temporal contiguity, and plausibility.
There are also some contradictions in the literature:
Some assert that good explanations are simple; others
assert that good explanations are complete. Clearly,
good explanations fall in the sweet spot between
detail and comprehensibility.
A number of conceptual psychological models of
the explanation process have been presented in the
research literature. The first step in the model of
Krull and Anderson (1997), the noticing of an event,
is reminiscent of the first step in C.S. Peirce's model
of abduction (1891), that is, the observation of
something that is interesting or surprising.
Subsequent steps are Intuitive Explanation, Problem
Formulation and Problem Resolution. The model is
not specific about what is involved in these steps, but
is explicit about the role of motivation and effort.
Johnson and Johnson (1993) studied an
explanation process in which experts explained to
novices the processes of statistical data analysis.
Transcripts of explainer-learner dialogs were
analyzed. A key finding was that the explainer would
present additional declarative or procedural
knowledge at those points in the task tree where sub-
goals had been achieved. The Johnson and Johnson model is expressed as a chain of events in which the explainer provides analogies and instructions.
Artificial Intelligence
AI has a history of work on explanation. (A
review of the literature, with a bibliography, is
available from the authors.) Starting with the first
generation of expert systems, it has generally been
held that explanations must present easy-to-
understand coherent stories in order to ensure good
use of the AI or good performance of the human-
machine work system (Biran & Cotton, 2017;
Clancey, 1986).
Attempts to explain Deep Nets have often taken
contrastive approaches. These include occlusion (e.g.,
Zeiler & Fergus, 2014), which shows how
classifications differ as regions are removed from an
image, and counter-examples (e.g., Shafto, Goodman,
& Griffiths, 2014). A limitation of these approaches
is that they conflate explanation and justification. So,
for example, one team of computer scientists might
“explain” how their Deep Net works by showing a
matrix of node weights at the multiple layers within a
network. This works as a justification of the
architecture to computer scientists but does not work
for explaining the Deep Net to a human user who is
not a computer scientist. Furthermore, the focus of
the contrastive approaches is "local" explanation, that
is, explaining why the AI made a particular
determination for a particular case. An example
would be to show the user a heat map that highlights
the eyes and beak of a bird, accompanied by a brief
statement that the beak and eye features make this
bird a sparrow. This is different from "global”
explanation, which is aimed at explaining how an AI
system works in general (e.g., Doshi-Velez and Kim,
2017). Finally, explainability is often conflated with
interpretability, which is a formal/logical notion in
computer science. The fact that a computer system is
interpretable does not mean that it is human
understandable; the formal interpretation has
explanatory value only to computer scientists.
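The occlusion approach can be sketched for any black-box scoring function; the following is a minimal illustration (not a reconstruction of Zeiler and Fergus's implementation), using a stand-in "classifier" that simply scores the brightness of the image center:

```python
import numpy as np

def occlusion_map(image, score_fn, patch=4, fill=0.0):
    """Slide an occluding patch over the image and record how much the
    classifier's score drops at each location. `score_fn` is any callable
    mapping a 2-D image array to a scalar class score."""
    h, w = image.shape
    base = score_fn(image)
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = fill
            heat[i // patch, j // patch] = base - score_fn(occluded)
    return heat  # large values mark regions the classification depends on

# Stand-in "classifier": scores an image by the brightness of its center,
# so occluding the center should dominate the heat map.
score = lambda img: float(img[6:10, 6:10].sum())
img = np.ones((16, 16))
heat = occlusion_map(img, score)
print(heat)
```

The resulting heat map is the kind of "local" explanation discussed above: it shows which regions mattered for this case, not how the network works in general.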
From these literatures, we have identified some
key concepts that serve as guidelines to consider in
the development of XAI systems.
(1). Explaining is a Continuous Process.
Humans are motivated to “understand the goals,
intent, contextual awareness, task limitations, [and]
analytical underpinnings of the system in an attempt
to verify its trustworthiness” (Lyons, et al., 2017).
One of the consensus points coming from the
philosophy of science is that explanations have a
heuristic function: They guide further inquiry. The
delivery of an explanation is not always an end point.
Indeed, it must be thought of as a continuous process
since the XAI system that provides explanations must
enable the user to develop appropriate trust and
reliance in the AI system with continued experience.
The user must be able to actively explore the states
and choices of the AI, especially when the system is
operating close to its boundary conditions, including
when it makes errors (see Amershi, et al., 2015).
How can XAI work in concert with the AI to
empower learning-during-use?
(2). Explaining is a Co-adaptive Process. Many
conceptual models, such as that of Johnson and
Johnson (1993) assume that the explanation process
is a one-way street: The explainer presents
information and instruction to the explainee. In
addition, conceptual models typically assume that an
explanation can be “satisfying,” implying that it is a
process with clear-cut beginning and end points (the
delivery of instructional material that the user simply
assimilates). An alternative view is that explanation
is a collaboration or co-adaptive process involving, in
the case of XAI, the learner/user and the system.
“Explanations improve cooperation, cooperation
permits the production of relevant explanations”
(Brezillon and Pomerol, 1997, p. 7; Moore &
Swartout, 1991). This is the concept of “participatory
explanation,” similar to the notion of “recipient
design" in the conversation analysis literature, i.e.,
that messages must be composed so as to be sensitive
to what the recipient of the message is understanding
(Sacks & Schegloff, 1974). An assumption in some
of the first generation of AI-explanation and
intelligent tutoring systems was that it is only the
human who has to learn, or change, as a result of
explanations offered by the machine.
(3). Explanation Triggers. Not everything needs
to be explained, and explanations are quite often
triggered by violations of expectation. Explanations
among people serve the purpose of clarifying
unexpected behavior, and so a good explainable system may need to recognize the appropriate triggers of explanation.
(4). Self-explanation. Psychological research has
demonstrated that self-explanation improves learning
and understanding. This finding holds for both self-
motivated explanation and self-explanation that is
prompted by the instructor (Chi, Leeuw, Chiu, &
LaVancher, 1994).
(5). Explanation as Exploration. An important
mode of explanation is helping the user understand
the boundaries of the intelligent system (Mueller & Klein, 2011). System developers are often reluctant to tell users what the system cannot do, until the users misuse it. Famously, Tesla's Autopilot system is
touted as a self-driving car, except when accidents
occur and the user is blamed for operating it in
circumstances in which it was not intended to be
used. Clarifying boundary conditions can help
produce appropriate trust, so that the user knows
when to rely on the system, and when to take over.
(6). Contrast Cases. When forming explanations
of intelligent systems, it can be as important to tell
what is not being done as to tell what is being done.
Contrastive reasoning has been identified as central
to all explanation (e.g., Miller, Howe, & Sonenberg,
2017) and it can be an effective way to help the user
understand why an expectation was violated. For
example, an explainable GPS system might explain
why a turn was made by describing why a (normally
shorter) route was not taken.
One purpose of our cognitive modeling is to
highlight the key concepts that must be mated with
measures and metrics. The creation of AI systems
that can explain themselves will require a number of
types of measures.
Explanations generated by the AI can be evaluated in terms of goodness criteria, that is, what the research literature says makes an explanation good. From a roster of those criteria we developed an "Explanation Satisfaction Scale," which has been evaluated using the Content Validity Ratio method (Lawshe, 1975), followed by a test of discriminant validity, which yielded a high Cronbach's alpha (~.80). The final scale consists of seven
Likert items that reference understandability,
satisfyingness, detail, accuracy, completeness,
usability, usefulness, and trustworthiness. This scale
may be used by AI researchers in the XAI Program to
evaluate the explanations that their systems produce
but might be used in other applications as well.
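For reference, Lawshe's Content Validity Ratio has a simple closed form, CVR = (n_e - N/2) / (N/2), where n_e of N panelists rate an item "essential"; the item names and panel counts below are invented for illustration:

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's (1975) CVR: +1.0 when every panelist rates an item
    'essential', 0.0 when exactly half do, negative below that."""
    half = n_panelists / 2
    return (n_essential - half) / half

# Illustrative ratings for three candidate scale items (made-up numbers):
for item, n_e in [("understandability", 10), ("detail", 8), ("novelty", 4)]:
    print(item, round(content_validity_ratio(n_e, 10), 2))
```

Items whose CVR falls below Lawshe's critical value for the panel size are candidates for removal from the scale.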
Effective use of intelligent systems depends on
user mental models (Kass & Finn, 1988). These have
to be elicited and evaluated. In the XAI Program they
can be elicited using some forms of structured
interview in which users express their understanding
of the AI system, with the protocols compared for
their propositional concordance with explanations
provided by experts. Based on the literature, we have
developed a guidebook that details a variety of
methods for eliciting mental models.
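One simple way to score the propositional concordance mentioned above is a set-overlap (Jaccard) index over coded propositions; this is an illustrative sketch rather than the Program's prescribed measure, and the propositions below are invented:

```python
def concordance(user_props, expert_props):
    """Proportion of overlap between two sets of coded propositions
    (Jaccard index), as one crude concordance score."""
    user, expert = set(user_props), set(expert_props)
    return len(user & expert) / len(user | expert)

# Invented propositions about a hypothetical image classifier:
expert = {"uses edge features", "trained on labeled photos",
          "always outputs some class", "fails on occluded objects"}
user = {"uses edge features", "always outputs some class",
        "understands objects like a person"}
print(round(concordance(user, expert), 2))  # 0.4
```

In practice the interview protocols would first have to be segmented and coded into propositions before any such overlap score is meaningful.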
Finally, the evaluation of XAI systems will
measure the change in performance attributable to the
explaining process, via controlled experimentation.
Performance can be evaluated in a number of ways. Good explanations should enable the user to:
• Efficiently and effectively use the AI in their work, for the purposes that the AI is intended to serve.
• Correctly predict what the AI system will do for given cases, including cases that the AI gets right and cases that it gets wrong (e.g., failures).
• Explain how the AI works to other people.
• Correctly assess whether a system determination is correct, and thereby have appropriate trust.
• Judge when and how to rely on the AI, knowing the boundary conditions of the competence of the AI, and thereby have appropriate reliance.
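The prediction criterion above can be scored as simple agreement between user predictions and actual system outputs; the cases and labels below are invented for illustration:

```python
def prediction_accuracy(user_predictions, ai_outputs):
    """Fraction of cases where the user correctly anticipated the AI
    system's output, whether that output was right or wrong."""
    hits = sum(p == a for p, a in zip(user_predictions, ai_outputs))
    return hits / len(ai_outputs)

# What the AI actually output on five test cases vs. what a trained
# user predicted it would output (illustrative only):
ai = ["axe", "saw", "hammer", "saw", "tulip"]
user = ["axe", "saw", "hammer", "axe", "tulip"]
print(prediction_accuracy(user, ai))  # 0.8
```

Note that the score rewards anticipating the system's behavior, including its errors, which is the sense of "appropriate reliance" intended above.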
Experiments will have to evaluate the learning
that occurs during training as well as during
performance. These experiments will have to take
into account the difference between global and local
explanations. These key variables are modeled in Figure 2, which appears following the References.
Another aspect of our effort in XAI is to
develop a “naturalistic” model of explanation based
on the analysis of a corpus of cases in which people
create explanations of complex situations or systems.
First, the trigger for local explanations is typically a violated expectancy. "Why did it do that?" signifies a surprise, and calls for an account that revises the violated expectancy. This process requires the explainer to diagnose which user expectations need revision, that is, where the learner's mental model is flawed or incomplete. Second, many AI systems start
with a complete account and then try to whittle this
account down into something manageable, but if the
trigger for a local explanation is a violated
expectancy then the process of explaining is aimed at
the flawed expectancy, and no whittling down is
needed. Third, what is the stopping point for
explaining something? AI systems do not have a
clear stopping point whereas our initial review of
naturalistic cases suggests that the stopping point is a
perspective shift in which the user moves from “Why
did it do that?” to “Now I see that in this situation I
would have done the same.” The current state of the art for AI systems does not take perspective shifts into account.
This material is based on research sponsored by DARPA under agreement number FA8650-17-2-7711. The U.S.
Government is authorized to reproduce and distribute
reprints for Governmental purposes notwithstanding any
copyright notation thereon. The views and conclusions
contained herein are those of the authors and should not be
interpreted as necessarily representing the official policies
or endorsements, either expressed or implied, of DARPA,
AFRL or the U.S. Government.
Amershi, S., Chickering, M., Drucker, S.M., Lee, B., Simard, P., &
Suh, J. (2015). Modeltracker: Redesigning performance analysis
tools for machine learning. In Proceedings of the 33rd Annual
ACM Conference on Human Factors in Computing Systems (pp.
337–346). New York: Association for Computing Machinery.
Biran, O., & Cotton, C. (2017). Explanation and Justification in
Machine Learning: A Survey. IJCAI-17 Workshop on Explainable
Artificial Intelligence (XAI).
Brézillon, P., & Pomerol, J.-C. (1997). Joint cognitive systems,
cooperative systems and decision support systems: A cooperation
in context. In Proceedings of the European Conference on
Cognitive Science, Manchester (pp. 129–139).
Chi, M.T., Leeuw, N., Chiu, M.-H., & LaVancher, C. (1994).
Eliciting self explanations improves understanding. Cognitive
Science, 18(3), 439–477.
Clancey, W.J. (1986). From GUIDON to NEOMYCIN and
HERACLES in twenty short lessons. AI Magazine, 7(3), 40.
Doshi-Velez, F., & Kim, B. (2017). A Roadmap for a Rigorous
Science of Interpretability. ArXiv preprint arXiv:1702.08608.
Goodman, B., & Flaxman, S. (2016). European Union regulations
on algorithmic decision-making and a “right to explanation.”
Presented at the ICML Workshop on Human Interpretability in
Machine Learning, New York, NY.
Johnson, H., & Johnson, P. (1993). Explanation Facilities and
Interactive Systems. In Proceedings of the 1st International
Conference on Intelligent User Interfaces (pp. 159–166). New
York: Association for Computing Machinery.
Kass, R., & Finin, T. (1988). The Need for User Models in
Generating Expert System Explanation. International Journal of
Expert Systems, 1(4), 345–375.
Krull, D. S., & Anderson, C. A. (1997). The process of explanation.
Current Directions in Psychological Science, 6(1), 1–5.
Kuang, C. (2017, 21 November). Can A.I. be taught to explain
itself? The New York Times.
Lawshe, C. H. (1975). A quantitative approach to content validity.
Personnel Psychology, 28, 563–575.
Lombrozo, T., & Carey, S. (2006). Functional explanation and the
function of explanation. Cognition, 99, 167–204.
Lucky, R.W. (2018, January). The mind of neural networks. IEEE
Spectrum, p. 24.
Lyons, J.B., Clark, M.A., Wagner, A.R., & Schuelke, M.J. (2017).
Certifiable trust in autonomous systems: Making the intractable
tangible. AI Magazine, 38(3), 37–49.
Miller, T., Howe, P., & Sonenberg, L. (2017). Explainable AI:
Beware of Inmates Running the Asylum. In Proceedings of the
International joint Conference on Artificial Intelligence (IJCAI-17)
Workshop on Explainable Artificial Intelligence (XAI).
Moore, J. D., & Swartout, W. R. (1991). A reactive approach to
explanation: taking the user’s feedback into account. In C. Paris,
W.R. Swartyout, & W.C. Mann (Eds.), Natural language
generation in artificial intelligence and computational linguistics
(pp. 3–48). New York: Springer.
Mueller, S.T., & Klein, G. (2011, March-April). Improving users' mental models of intelligent software tools. IEEE Intelligent Systems, 26(2), 77–83.
Pirolli, P., & Card, S. (2005). The sensemaking process and
leverage points for analyst technology as identified through
cognitive task analysis. In Proceedings of International
Conference on Intelligence Analysis (pp. 2–4). Washington, DC:
Office of the Assistant Director of Central Intelligence for
Analysis and Production.
Sacks, H., & Schegloff, E. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50, 696–735.
Shafto, P., & Goodman, N. (2008). Teaching games: Statistical
sampling assumptions for learning in pedagogical situations. In
Proceedings of the 30th annual conference of the Cognitive
Science Society (pp. 1632–1637). Austin, TX: Cognitive Science
Society Austin.
Temming, M. (2018, 20 January). AI has found an 8-planet system like ours in Kepler data. Science News, p. 12.
Zeiler, M. D., Krishnan, D., Taylor, G. W., & Fergus, R. (2010).
Deconvolutional networks. In Proceedings of the 2010 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR)
(pp. 2528–2535). New York: IEEE.
Figure 2. Explanation spans the training and performance contexts, but in doing so requires different kinds of explanation.
Proceedings of the Human Factors and Ergonomics Society 2018 Annual Meeting 201
... Our results indicate, in agreement with previous work, that explanations both in excerpts and free-text hurt prediction performance. This stands in contrast with human learning, where self-explanation helps learning [15]. ...
... This is particularly true for commonsense reasoning tasks with distractor choices: explanations tend to highlight why these distractors are unreasonable alternatives [9]. Further, explanations are also a co-adaptive process in which the explainer and the explainee collaborate to obtain a satisfactory explanation [15]. Since humans have a limited capacity to process information, they tend to select simple explanations that are less specific and cite fewer causes over plausible ones [27]. ...
... It is well established that humans can improve their learning and understanding of a given context through self-explanation [15]. However, it is unclear if this holds for machine learning models. ...
Full-text available
The rise of explainable natural language processing spurred a bulk of work on datasets augmented with human explanations, as well as technical approaches to leverage them. Notably, generative large language models offer new possibilities, as they can output a prediction as well as an explanation in natural language. This work investigates the capabilities of fine-tuned text-to-text transfer Transformer (T5) models for commonsense reasoning and explanation generation. Our experiments suggest that while self-rationalizing models achieve interesting results, a significant gap remains: classifiers consistently outperformed self-rationalizing models, and a substantial fraction of model-generated explanations are not valid. Furthermore, training with expressive free-text explanations substantially altered the inner representation of the model, suggesting that they supplied additional information and may bridge the knowledge gap. Our code is publicly available, and the experiments were run on open-access datasets, hence allowing full reproducibility.
... While there seems to be a consensus in the XAI community that trust is a critical factor in human-AI interaction, researchers have identified challenges in measuring trust in the context of AI. For example, different conceptualizations of trust exist that are not clearly distinguished from one another (e.g., appropriate trust [15], calibrated trust [22], warranted trust [16], or reliance [33]). These various conceptualizations may lead to differences in the operationalization of trust. ...
Full-text available
Trust is a key motivation in developing explainable artificial intelligence (XAI). However, researchers attempting to measure trust in AI face numerous challenges, such as different trust conceptualizations, simplified experimental tasks that may not induce uncertainty as a prerequisite for trust, and the lack of validated trust questionnaires in the context of AI. While acknowledging these issues, we have identified a further challenge that currently seems underappreciated - the potential distinction between trust as one construct and \emph{distrust} as a second construct independent of trust. While there has been long-standing academic discourse for this distinction and arguments for both the one-dimensional and two-dimensional conceptualization of trust, distrust seems relatively understudied in XAI. In this position paper, we not only highlight the theoretical arguments for distrust as a distinct construct from trust but also contextualize psychometric evidence that likewise favors a distinction between trust and distrust. It remains to be investigated whether the available psychometric evidence is sufficient for the existence of distrust or whether distrust is merely a measurement artifact. Nevertheless, the XAI community should remain receptive to considering trust and distrust for a more comprehensive understanding of these two relevant constructs in XAI.
... Trust in AI more specifically has been viewed through the lens of trust being a function of explainability, performance (e.g. minimal false positives), and reliability (Hoffman et al., 2018;Lyons et al. 2016); however, recent research has shown that there are numerous factors affecting trust in AI (Dorton, & Harper, 2021;Dorton & Harper, 2022). ...
Full-text available
Artificial Intelligence (AI) is becoming ubiquitous in national security work (intelligence, defense, etc.); however, introducing AI into work systems is fraught with challenges. Trust is gained and lost through experiences, and there are many factors that affect trust in AI. Similarly, users adapt their workflows based on trust in these systems. We used a naturalistic approach to understand how intelligence professionals adapted their work practices after gaining or losing trust in AI. We found a variety of adaptations, which were characterized as either being task-based or frequency-based; where users added or removed tasks from their workflow or where they changed the frequency in which they used the AI in their workflow, respectively. We provide specific examples and quotes from participants along with findings, and discuss potential methodological implications for studying and designing AI-driven work systems.
... This study is a step forward in covering this gap. The work in [53] presented a scale to evaluate XIA model explanations in human-machine work systems, but they did not provide explanations. Other studies, such as [23], believed that the current XAI models do not satisfy the need to understand how an ML model works internally but rather give a shallow justification of how final results were extracted. ...
Full-text available
Artificial intelligence (AI) and machine learning (ML) models have become essential tools used in many critical systems to make significant decisions; the decisions taken by these models need to be trusted and explained on many occasions. On the other hand, the performance of different ML and AI models varies with the same used dataset. Sometimes, developers have tried to use multiple models before deciding which model should be used without understanding the reasons behind this variance in performance. Explainable artificial intelligence (XAI) models have presented an explanation for the models’ performance based on highlighting the features that the model considered necessary while making the decision. This work presents an analytical approach to studying the density functions for intrusion detection dataset features. The study explains how and why these features are essential during the XAI process. We aim, in this study, to explain XAI behavior to add an extra layer of explainability. The density function analysis presented in this paper adds a deeper understanding of the importance of features in different AI models. Specifically, we present a method to explain the results of SHAP (Shapley additive explanations) for different machine learning models based on the feature data’s KDE (kernel density estimation) plots. We also survey the specifications of dataset features that can perform better for convolutional neural networks (CNN) based models.
Modern artificial intelligence (AI) and machine learning (ML) systems have become more capable and more widely used, but often involve underlying processes their users do not understand and may not trust. Some researchers have addressed this by developing algorithms that help explain the workings of the system using ‘Explainable’ AI algorithms (XAI), but these have not always been successful in improving their understanding. Alternatively, collaborative user-driven explanations may address the needs of users, augmenting or replacing algorithmic explanations. We evaluate one such approach called “collaborative explainable AI” (CXAI). Across two experiments, we examined CXAI to assess whether users’ mental models, performance, and satisfaction improved with access to user-generated explanations. Results showed that collaborative explanations afforded users a better understanding of and satisfaction with the system than users without access to the explanations, suggesting that a CXAI system may provide a useful support that more dominant XAI approaches do not.
The field of explainable artificial intelligence (XAI) is gaining increasing importance in recent years. As a consequence, several surveys have been published to explore the current state of the art on this topic. One aspect that seems to be overlooked by these works is the applied presentation methods and, specifically, the role of natural language in generating the final explanations. This survey reviews 70 XAI papers published between 2006 and 2021 and evaluates their readiness with respect to natural language explanations. Thus, together with a set of hierarchical criteria, we define a multi-criteria decision-making model. Finally, we conclude that only a handful of recent XAI works either considered natural language explanations to approach final users (see, e.g.,(Bennetot et al., 2021)) or implemented a method capable of generating such explanations.
Full-text available
In his seminal book `The Inmates are Running the Asylum: Why High-Tech Products Drive Us Crazy And How To Restore The Sanity' [2004, Sams Indianapolis, IN, USA], Alan Cooper argues that a major reason why software is often poorly designed (from a user perspective) is that programmers are in charge of design decisions, rather than interaction designers. As a result, programmers design software for themselves, rather than for their target audience; a phenomenon he refers to as the `inmates running the asylum'. This paper argues that explainable AI risks a similar fate. While the re-emergence of explainable AI is positive, this paper argues most of us as AI researchers are building explanatory agents for ourselves, rather than for the intended users. But explainable AI is more likely to succeed if researchers and practitioners understand, adopt, implement, and improve models from the vast and valuable bodies of research in philosophy, psychology, and cognitive science; and if evaluation of these models is focused more on people than on technology. From a light scan of literature, we demonstrate that there is considerable scope to infuse more results from the social and behavioural sciences into explainable AI, and present some key results from these fields that are relevant to explainable AI.
Intelligent software algorithms are increasingly becoming tools in consumers' daily lives. Users understand the basic mechanics of the intelligent software systems they rely on, but novices often have no direct knowledge of their intelligent devices' algorithms, data requirements, limitations, and representations. Problems can go beyond those caused by a poor user interface design and a user's ability to understand a tool's simple components, which could be alleviated with proper instruction. This article describes the Experiential User Guide (EUG), a concept designed to address these challenges.
This article discusses verification and validation (V&V) of autonomous systems, a task that will prove difficult for systems designed to exercise decision initiative. V&V of such systems should include evaluations of the trustworthiness of the system based on transparency inputs and scenario-based training. Transparency facets should be used to establish shared awareness and shared intent between the designer, tester, and user of the system. The transparency facets allow the human to understand the goals, social intent, contextual awareness, task limitations, analytical underpinnings, and team-based orientation of the system in an attempt to verify its trustworthiness. Scenario-based training can then be used to validate that programming across a variety of situations that test the behavioral repertoire of the system. This novel method should be used to analyze behavioral adherence to a set of governing principles coded into the system.
As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanation for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite the interest in interpretability, there is very little consensus on what interpretable machine learning is and how it should be measured. In this position paper, we first define interpretability and describe when interpretability is needed (and when it is not). Next, we suggest a taxonomy for rigorous evaluation and expose open questions towards a more rigorous science of interpretable machine learning.
We summarize the potential impact that the European Union's new General Data Protection Regulation will have on the routine use of machine learning algorithms. Slated to take effect as law across the EU in 2018, it will restrict automated individual decision-making (that is, algorithms that make decisions based on user-level predictors) which "significantly affect" users. The law will also create a "right to explanation," whereby a user can ask for an explanation of an algorithmic decision that was made about them. We argue that while this law will pose large challenges for industry, it highlights opportunities for machine learning researchers to take the lead in designing algorithms and evaluation frameworks which avoid discrimination.
Model building in machine learning is an iterative process. The performance analysis and debugging step typically involves a disruptive cognitive switch from model building to error analysis, discouraging an informed approach to model building. We present ModelTracker, an interactive visualization that subsumes information contained in numerous traditional summary statistics and graphs while displaying example-level performance and enabling direct error examination and debugging. Usage analysis from machine learning practitioners building real models with ModelTracker over six months shows ModelTracker is used often and throughout model building. A controlled experiment focusing on ModelTracker's debugging capabilities shows participants prefer ModelTracker over traditional tools without a loss in model performance.
Much of learning and reasoning occurs in pedagogical situations - situations in which teachers choose examples with the goal of having a learner infer the concept the teacher has in mind. In this paper, we present a model of teaching and learning in pedagogical settings which predicts what examples teachers should choose and what learners should infer given a teacher's examples. We present two experiments using an experimental paradigm called the rectangle game. The first experiment compares people's inferences to qualitative model predictions. The second experiment tests people in a situation where pedagogical sampling is not appropriate, ruling out alternative explanations and suggesting that people use context-appropriate sampling assumptions. We conclude by discussing connections to broader work in inductive reasoning and cognitive development, and outline areas of future work.