Proceedings of the 1st Workshop on Interactive Natural Language Technology for Explainable Artificial Intelligence (NL4XAI 2019), pages 8–13, Tokyo, Japan, October 29, 2019. © 2019 Association for Computational Linguistics
A Survey of Explainable AI Terminology
Miruna A. Clinciu and Helen F. Hastie
Edinburgh Centre for Robotics
Heriot-Watt University, Edinburgh, EH14 4AS, UK
{mc191, H.Hastie}@hw.ac.uk
Abstract
The field of Explainable Artificial Intelligence attempts to solve the problem of algorithmic opacity. Many terms and notions have recently been introduced to define Explainable AI; however, these terms are often used interchangeably, leading to confusion in this rapidly expanding field. To address this problem, we present an analysis of the existing research literature and examine how key terms, such as transparency, intelligibility, interpretability, and explainability, are referred to and in what context. This paper thus moves towards a standard terminology for Explainable AI.
Keywords— Explainable AI, black-box, NLG, Theoretical Issues, Transparency, Intelligibility, Interpretability, Explainability
1 Introduction
In recent years, there has been increased interest in the field of Explainable Artificial Intelligence (XAI). However, the literature clearly shows that a variety of terms, such as transparency, intelligibility, interpretability, and explainability, are used interchangeably, which is leading to confusion. Establishing a set of standard terms for the community will become increasingly important as XAI is mandated by regulation, such as the GDPR, and as standards start to appear, such as the IEEE standard on transparency (P7001). This paper works towards this goal.
Explainable Artificial Intelligence is not a new area of research; the term explainable has existed since the mid-1970s (Moore and Swartout, 1988). However, XAI has come to the forefront in recent times due to the advent of deep machine learning and the lack of transparency of "black-box" models. We introduce below some descriptions of XAI collected from the literature:
"Explainable AI can present the user with an easily understood chain of reasoning from the user's order, through the AI's knowledge and inference, to the resulting behaviour" (van Lent et al., 2004).
"XAI is a research field that aims to make AI systems results more understandable to humans" (Adadi and Berrada, 2018).
Thus, we conclude that XAI is a research field that focuses on giving AI decision-making models the ability to be easily understood by humans. Natural language is an intuitive medium for such Explainable AI systems to provide explanations. Furthermore, XAI will be key for both expert and non-expert users, enabling them to gain a deeper understanding and an appropriate level of trust, which will hopefully lead to increased adoption of this vital technology.
This paper firstly examines the various notions that are frequently used in the field of Explainable Artificial Intelligence in Section 2 and attempts to organise them diagrammatically. We then discuss these terms with respect to Natural Language Generation in Section 3 and provide conclusions.
2 Terminology
In this section, we examine four key terms found frequently in the literature for describing various techniques for XAI. These terms are illustrated in Figure 1, where we organise them as a Venn diagram that describes how a transparent AI system has several facets, which include intelligibility, explainability, and interpretability. Below, we discuss how intelligibility can be framed in terms of explainability and/or interpretability. For each of these terms, we present dictionary definitions extracted from modern and notable English dictionaries, present quotes from the literature in tables, and discuss how they support the proposed structure given in Figure 1. In every table, we emphasise related words and context in order to connect ideas and build up coherent relationships within the text.
In this paper, the first phase of the selection of publications was based on the relevance of the paper and related keywords. The second phase was performed manually by choosing papers that define or describe the meaning of the specified terms, or that examine how those terms differ from, resemble, or relate to each other.
Figure 1: A Venn diagram of the relationship between frequently used terms, representing the authors' interpretation of the field, excluding post-hoc interpretation.
Transparency
Dictionary definitions: The word "transparent" refers to something that is "clear and easy to understand" (Cambridge Dictionary, 2019d); or "easily seen through, recognized, understood, detected; manifest, evident, obvious, clear" (Oxford English Dictionary, 2019d); or "language or information that is transparent is clear and easy to understand" (The Longman Dictionary of Contemporary English, 2019c).
Conversely, an opaque AI system is a system with the lowest level of transparency, known as a "black-box" model. A similar definition is given by Tomsett et al. (2018) in Table 1.
Tintarev and Masthoff (2007) state that transparency "explains how the system works" and consider it one of the possible explanation facilities that could influence good recommendations in recommender systems.
For Cramer et al. (2008), transparency aims to increase understanding and entails offering the user insight into how a system works, for example, by offering explanations for system choices and behaviour.
Transparency: "clearly describing the model structure, equations, parameter values, and assumptions to enable interested parties to understand the model" (Briggs et al., 2012).
Tomsett et al. (2018) defined transparency as the "level to which a system provides information about its internal workings or structure" and stated that both "explainability and transparency are important for improving creator-interpretability".
"Informally, transparency is the opposite of opacity or blackbox-ness. It connotes some sense of understanding the mechanism by which the model works. We consider transparency at the level of the model (simulatability), at the level of individual components (e.g. parameters) (decomposability), and at the level of the training algorithm (algorithmic transparency)" (Lipton, 2016).
Table 1: Various notions of Transparency presented in recent research papers.
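The following minimal sketch, added for illustration and not taken from any of the cited papers, contrasts a model that is transparent in Lipton's (2016) decomposability sense, whose individual parameters can be inspected directly, with an opaque ensemble whose internal workings are not readily understandable; it assumes scikit-learn is available, and the feature names are hypothetical.

# Illustrative sketch only (not from the cited papers): a linear model is
# transparent in the "decomposability" sense, since each learned parameter can
# be read off and related to a feature, whereas a boosted ensemble of trees
# acts as a black box. Assumes scikit-learn; feature names are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))        # hypothetical features: income, age, credit_history
y = 2.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

transparent_model = LinearRegression().fit(X, y)
opaque_model = GradientBoostingRegressor().fit(X, y)

# Transparent: the mechanism is visible as one weight per feature.
for name, coef in zip(["income", "age", "credit_history"], transparent_model.coef_):
    print(f"{name}: {coef:+.3f}")

# Opaque: predictions are available, but the mechanism (an ensemble of trees)
# is not directly understandable without further interpretation methods.
print("opaque prediction:", opaque_model.predict(X[:1])[0])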
Intelligibility
Dictionary definitions: An "intelligible" system should be "clear enough to be understood" according to the Cambridge Dictionary (2019b); or "capable of being understood; comprehensible" (Oxford English Dictionary, 2019b); or "easily understood" (The Longman Dictionary of Contemporary English, 2019d).
The concept of intelligibility was defined by Bellotti and Edwards (2001) from the perspective of context-aware systems: "context-aware systems that seek to act upon what they infer about the context must be able to represent to their users what they know, how they know it, and what they are doing about it".
As illustrated in Table 2, it is challenging to define how intelligible AI systems could be designed, as they would need to communicate very complex computational processes to various types of users (Weld and Bansal, 2018). Per the Venn diagram in Figure 1, we consider that an AI system could become intelligible in a number of ways, including through explanations (e.g. in natural language) and/or interpretations. We discuss both of these in turn below.
"It remains remarkably hard to specify what makes a system intelligible; The key challenge for designing intelligible AI is communicating a complex computational process to a human. Specifically, we say that a model is intelligible to the degree that a human user can predict how a change to a feature [...]" (Weld and Bansal, 2018).
"Intelligibility can help expose the inner workings and inputs of context-aware applications that tend to be opaque to users due to their implicit sensing and actions" (Lim and Dey, 2009).
Table 2: Various notions of Intelligibility presented in recent research papers.
Interpretability
Dictionary definitions: According to the Cambridge Dictionary (2019c), to "interpret" is "to decide what the intended meaning of something is"; or "to expound the meaning of (something abstruse or mysterious); to render (words, writings, an author, etc.) clear or explicit; to elucidate; to explain" (Oxford English Dictionary, 2019c); or "to explain the meaning of something" (The Longman Dictionary of Contemporary English, 2019b).
Considering a "black-box" model, we examine how users and developers could define the model's interpretability. A variety of definitions of the term interpretability have been suggested in recent research papers, as presented in Table 3.
Various techniques have been used to give insights into an AI model through interpretations, such as feature selection techniques (Kim et al., 2015) and Shapley values (Sundararajan and Najmi, 2019); interpretation of the AI model itself, e.g. hybrid AI models (Wang and Lin, 2019), which combine interpretable models with opaque models; and output interpretation, e.g. interpretation of evaluation metrics (Mohseni et al., 2018) and of visualisation techniques (Samek et al., 2017; Choo and Liu, 2018). Thus, in our model in Figure 1, we define interpretability as intersecting with explainability, as some models may be interpretable without needing explanations.
"In model-agnostic interpretability, the model is treated as a black-box. Interpretable models may also be more desirable when interpretability is much more important than accuracy, or when interpretable models trained on a small number of carefully engineered features are as accurate as black-box models" (Ribeiro et al., 2016).
"An explanation can be evaluated in two ways: according to its interpretability, and according to its completeness" (Gilpin et al., 2018).
"We define interpretable machine learning as the use of machine-learning models for the extraction of relevant knowledge about domain relationships contained in data..." (Murdoch et al., 2019).
Table 3: Various notions of Interpretability presented in recent research papers.
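To make the Shapley-value attribution mentioned above concrete, the following minimal sketch, added for illustration and not taken from the cited work, computes exact Shapley values for a toy value function by enumerating feature coalitions. The value function and feature names are hypothetical; real attribution libraries approximate this computation, since exact enumeration grows exponentially with the number of features.

# Illustrative sketch only: exact Shapley values for a hypothetical 3-feature
# value function. In practice, value(S) would be the model's expected output
# when only the features in S are known.
from itertools import combinations
from math import factorial

FEATURES = ["income", "age", "credit_history"]           # hypothetical features
SCORES = {"income": 2.0, "age": 1.0, "credit_history": 0.5}

def value(coalition):
    # Hypothetical additive payoff of a feature subset.
    return sum(SCORES[f] for f in coalition)

def shapley(target):
    n = len(FEATURES)
    others = [f for f in FEATURES if f != target]
    phi = 0.0
    for k in range(n):                                    # size of the coalition without the target
        for subset in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            phi += weight * (value(set(subset) | {target}) - value(subset))
    return phi

for f in FEATURES:
    print(f, round(shapley(f), 3))                        # attributions sum to value(FEATURES)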
Explainability
Dictionary definitions: For the word "explain", the following definitions were extracted: "to make something clear or easy to understand by describing or giving information about it" (Cambridge Dictionary, 2019a); or "to provide an explanation for something; to make plain or intelligible" (Oxford English Dictionary, 2019a); or "to tell someone about something in a way that is clear or easy to understand; to give a reason for something or to be a reason for something" (The Longman Dictionary of Contemporary English, 2019a).
Per these definitions, providing explanations is about improving the user's mental model of how a system works. Ribera and Lapedriza (2019) consider that we do not have a concrete definition of explanation in the literature; however, according to these authors, every definition relates "explanations with 'why' questions or causality reasonings". Given the nature of the explanations, Ribera and Lapedriza (2019) propose categorising explainees into three main groups, based on their goals, background, and relationship with the product, namely: developers and AI researchers, domain experts, and lay users. Various types of explanations have been presented in the literature, such as "why" and "why not" explanations (Kulesza et al., 2013) or Adadi and Berrada (2018)'s four types of explanations, used to "justify, control, discover and improve". While it is out of scope to go into detail here, what is clear is that, in most uses of the term explainability, it means providing a way to improve the understanding of the user, whoever they may be.
"Explanation is considered closely related to the concept of interpretability" (Biran and Cotton, 2017).
"Transparent design: model is inherently interpretable (globally or locally)" (Lucic et al., 2019).
"I equate interpretability with explainability" (Miller, 2018).
"Systems are interpretable if their operations can be understood by a human, either through introspection or through a produced explanation" (Biran and Cotton, 2017).
Poursabzi-Sangdeh et al. (2018) define interpretability as something "that cannot be manipulated or measured, and could be defined by people, not algorithms".
Table 4: Various notions of Explainability presented in recent research papers.
3 The Role of NLG in XAI
An intuitive medium for providing such explanations is natural language. The human-like capability of Natural Language Generation (NLG) has the potential to increase the intelligibility of an AI system and enable a system to provide explanations that are tailored to the end-user (Chiyah Garcia et al., 2019).
One can draw an analogy between natural language generation of explanations and Lacave and Díez's model of explanation generation for expert systems (Lacave and Díez, 2002), or Reiter and Dale's NLG pipeline (Reiter and Dale, 2000), with stages for determining "what" to say in an explanation (content selection) and "how" to say it (surface realisation). Lacave and Díez's model also emphasises the importance of adapting to the user, which is also a focus area in NLG (e.g. adapting styles (Dethlefs et al., 2014)).
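As a rough sketch of these two stages, and not Reiter and Dale's implementation, an explanation generator might first select content and then realise it as text; the attribution scores and template wording below are hypothetical.

# Illustrative sketch of the two NLG stages discussed above: content selection
# ("what" to say) followed by surface realisation ("how" to say it).
# Attribution scores and template wording are hypothetical.
def select_content(attributions, top_k=2):
    # Content selection: keep only the most influential features.
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_k]

def realise(decision, selected):
    # Surface realisation: render the selected content as natural language.
    reasons = " and ".join(f"{name} ({weight:+.2f})" for name, weight in selected)
    return f"The system decided to {decision} mainly because of {reasons}."

attributions = {"income": +0.42, "age": -0.05, "credit_history": +0.31}
print(realise("approve the loan", select_content(attributions)))

A user-adaptive generator, as Lacave and Díez's framework suggests, would additionally condition both stages on a model of the user.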
Other studies have looked at agents and robots providing a rationalisation of their behaviour (Ehsan et al., 2018) by providing a running commentary in language. Whilst this is not necessarily how humans behave, it is beneficial to be able to provide such rationalisation, especially in the face of unusual behaviour, and, again, natural language is one way to do this. Defined as the process of producing an explanation for agent or system behaviour as if a human had performed that behaviour, AI rationalisation has multiple advantages to be taken into consideration: it is "naturally accessible and intuitive to humans, especially non-experts, could increase the satisfaction, confidence, rapport, and willingness to use autonomous systems and could offer real-time response" (Ehsan et al., 2018).
4 Conclusions and Future Work
In this paper, we introduced various terms found in the field of Explainable AI, along with their definitions. In Figure 1, we have attempted to define the relationship between the main terms that define Explainable AI. Intelligibility could be achieved through explanations and interpretations, where the type of user, their background, goal, and current mental model are taken into consideration.
As mentioned previously, interpretability is defined as a concept close to explainability (Biran and Cotton, 2017). Our Venn diagram, given in Figure 1, illustrates that transparent systems could be, by their nature, interpretable without providing explanations, and that the activities of interpreting a model and explaining why a system behaves the way it does are fundamentally different. We posit, therefore, that the field moving forward should be wary of using such terms interchangeably. Natural Language Generation will be key to providing explanations, and rationalisation is one approach that we have discussed here.
Evaluation of NLG is a challenging area (Hastie and Belz, 2014), with objective measures such as BLEU having been shown not to reflect human ratings (Liu et al., 2016). In the near term at least, evaluation of natural language explanations will likely rely on subjective measures that assess whether an explanation improves a system's intelligibility, interpretability, and transparency, along with other typical metrics related to the quality and clarity of the language used (Curry et al., 2017).
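As a small illustration of why such objective measures can mislead, assuming NLTK is available and using invented sentences, a paraphrased but adequate explanation can score close to zero under BLEU simply because it shares few n-grams with the single reference:

# Illustrative only: BLEU rewards n-gram overlap with a reference, so an
# adequate paraphrase can score near zero. Sentences are invented.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "the loan was refused because the applicant's income is too low".split()
close_wording = "the loan was refused because the income is too low".split()
paraphrase = "we declined the application since the salary does not meet the threshold".split()

smooth = SmoothingFunction().method1
print(sentence_bleu([reference], close_wording, smoothing_function=smooth))
print(sentence_bleu([reference], paraphrase, smoothing_function=smooth))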
In future work, it would be advisable to perform an empirical analysis of research papers relating to the various terms and notions introduced here, as well as those continuously being added to the field of XAI.
Acknowledgements
The authors gratefully acknowledge the support of Dr. Inês Cecilio, Prof. Mike Chantler, and Dr. Vaishak Belle. This research was funded by the Schlumberger Cambridge Research Centre Doctoral programme.
References
Amina Adadi and Mohammed Berrada. 2018. Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access, 6:52138–52160.
Victoria Bellotti and Keith Edwards. 2001. Intelligibility and accountability: Human considerations in context-aware systems. Human-Computer Interaction, 16(2-4):193–212.
Or Biran and Courtenay Cotton. 2017. Explanation and Justification in Machine Learning: A Survey. In Proceedings of the 1st Workshop on Explainable Artificial Intelligence, IJCAI 2017.
Andrew H. Briggs, Milton C. Weinstein, Elisabeth A. L. Fenwick, Jonathan Karnon, Mark J. Sculpher, and A. David Paltiel. 2012. Model parameter estimation and uncertainty: A report of the ISPOR-SMDM Modeling Good Research Practices Task Force-6. Value in Health, 15(6):835–842.
Cambridge Dictionary. 2019a. Explain. Cambridge University Press. Accessed on 2019-08-25.
Cambridge Dictionary. 2019b. Intelligible. Cambridge University Press. Accessed on 2019-08-25.
Cambridge Dictionary. 2019c. Interpret. Cambridge University Press. Accessed on 2019-08-25.
Cambridge Dictionary. 2019d. Transparent. Cambridge University Press. Accessed on 2019-08-25.
Francisco Javier Chiyah Garcia, David A. Robb, Xingkun Liu, Atanas Laskov, Pedro Patron, and Helen Hastie. 2019. Explainable Autonomy: A Study of Explanation Styles for Building Clear Mental Models. In Proceedings of the International Conference on Natural Language Generation (INLG).
J. Choo and S. Liu. 2018. Visual analytics for explainable deep learning. IEEE Computer Graphics and Applications, 38(4):84–92.
Henriette Cramer, Vanessa Evers, Satyan Ramlal, Maarten van Someren, Lloyd Rutledge, Natalia Stash, Lora Aroyo, and Bob Wielinga. 2008. The effects of transparency on trust in and acceptance of a content-based art recommender. User Modeling and User-Adapted Interaction, 18(5):455.
Amanda Cercas Curry, Helen Hastie, and Verena Rieser. 2017. A review of evaluation techniques for social dialogue systems. In ISIAA 2017 - Proceedings of the 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, Co-located with ICMI 2017, pages 25–26. Association for Computing Machinery, Inc.
Nina Dethlefs, Heriberto Cuayáhuitl, Helen Hastie, Verena Rieser, and Oliver Lemon. 2014. Cluster-based prediction of user ratings for stylistic surface realisation. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014, pages 702–711. Association for Computational Linguistics (ACL).
Upol Ehsan, Brent Harrison, Larry Chan, and Mark O. Riedl. 2018. Rationalization: A Neural Machine Translation Approach to Generating Natural Language Explanations. In AIES 2018 - Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pages 81–87. Association for Computing Machinery, Inc.
L. H. Gilpin, D. Bau, B. Z. Yuan, A. Bajwa, M. Specter, and L. Kagal. 2018. Explaining explanations: An overview of interpretability of machine learning. In Proceedings of the 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018), pages 80–89. IEEE.
Helen Hastie and Anja Belz. 2014. A comparative evaluation methodology for NLG in interactive systems. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), Reykjavik, Iceland. European Language Resources Association (ELRA).
Been Kim, Julie A. Shah, and Finale Doshi-Velez. 2015. Mind the gap: A generative approach to interpretable feature selection and extraction. In Proceedings of the Twenty-ninth Conference on Neural Information Processing Systems, NeurIPS 2015, pages 2260–2268. Curran Associates, Inc.
T. Kulesza, S. Stumpf, M. Burnett, S. Yang, I. Kwan, and W. Wong. 2013. Too much, too little, or just right? Ways explanations impact end users' mental models. In Proceedings of the 2013 IEEE Symposium on Visual Languages and Human Centric Computing, pages 3–10. IEEE.
Carmen Lacave and Francisco J. Díez. 2002. A Review of Explanation Methods for Bayesian Networks. The Knowledge Engineering Review, 17(2):107–127.
Michael van Lent, William Fisher, and Michael Mancuso. 2004. An explainable artificial intelligence system for small-unit tactical behavior. In Proceedings of the 16th Conference on Innovative Applications of Artificial Intelligence, IAAI'04, pages 900–907. AAAI Press.
Brian Y. Lim and Anind K. Dey. 2009. Assessing demand for intelligibility in context-aware applications. In Proceedings of the 11th International Conference on Ubiquitous Computing, UbiComp '09, pages 195–204, New York, NY, USA. ACM.
Zachary Chase Lipton. 2016. The mythos of model interpretability. arXiv preprint arXiv:1606.03490.
Chia-Wei Liu, Ryan Lowe, Iulian V. Serban, Michael Noseworthy, Laurent Charlin, and Joelle Pineau. 2016. How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP 2016.
Ana Lucic, Hinda Haned, and Maarten de Rijke. 2019. Contrastive explanations for large errors in retail forecasting predictions through Monte Carlo simulations. arXiv preprint arXiv:1908.00085.
Tim Miller. 2018. Explanation in Artificial Intelligence: Insights from the Social Sciences. arXiv preprint arXiv:1706.07269.
Sina Mohseni, Niloofar Zarei, and Eric D. Ragan. 2018. A survey of evaluation methods and measures for interpretable machine learning. arXiv preprint arXiv:1811.11839.
J. D. Moore and W. R. Swartout. 1988. Explanation in Expert Systems: A Survey. Number 228. University of Southern California, Information Sciences Institute.
W. James Murdoch, Chandan Singh, Karl Kumbier, Reza Abbasi-Asl, and Bin Yu. 2019. Interpretable machine learning: definitions, methods, and applications. arXiv preprint arXiv:1901.04592.
Oxford English Dictionary. 2019a. explain, v. Oxford University Press. Accessed on 2019-11-10.
Oxford English Dictionary. 2019b. intelligible, adj. (and n.). Oxford University Press. Accessed on 2019-11-10.
Oxford English Dictionary. 2019c. interpret, v. Oxford University Press. Accessed on 2019-11-10.
Oxford English Dictionary. 2019d. transparent, adj. (and n.). Oxford University Press. Accessed on 2019-11-10.
Forough Poursabzi-Sangdeh, Daniel G. Goldstein, Jake M. Hofman, Jennifer Wortman Vaughan, and Hanna M. Wallach. 2018. Manipulating and measuring model interpretability. arXiv preprint arXiv:1802.07810.
Ehud Reiter and Robert Dale. 2000. Building Natural Language Generation Systems. Cambridge University Press, New York, NY, USA.
Marco Túlio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386.
Mireia Ribera and Agata Lapedriza. 2019. Can we do better explanations? A proposal of user-centered explainable AI. In Proceedings of the CEUR Workshop, volume 2327. CEUR-WS.
Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. 2017. Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. ITU Journal: ICT Discoveries - Special Issue 1 - The Impact of Artificial Intelligence (AI) on Communication Networks and Services, 1:1–10.
Mukund Sundararajan and Amir Najmi. 2019. The many Shapley values for model explanation. arXiv preprint arXiv:1908.08474.
The Longman Dictionary of Contemporary English. 2019a. explain. Pearson Longman. Accessed on 2019-11-10.
The Longman Dictionary of Contemporary English. 2019b. interpret. Pearson Longman. Accessed on 2019-11-10.
The Longman Dictionary of Contemporary English. 2019c. transparent. Pearson Longman. Accessed on 2019-11-10.
The Longman Dictionary of Contemporary English. 2019d. intelligible. Pearson Longman. Accessed on 2019-11-10.
Nava Tintarev and Judith Masthoff. 2007. Effective explanations of recommendations: User-centered design. In Proceedings of the 2007 ACM Conference on Recommender Systems, RecSys '07, pages 153–156, New York, NY, USA. ACM.
Richard Tomsett, Dave Braines, Dan Harborne, Alun D. Preece, and Supriyo Chakraborty. 2018. Interpretable to whom? A role-based model for analyzing interpretable machine learning systems. arXiv preprint arXiv:1806.07552.
Tong Wang and Qihang Lin. 2019. Hybrid predictive model: When an interpretable model collaborates with a black-box model. arXiv preprint arXiv:1905.04241.
Daniel S. Weld and Gagan Bansal. 2018. Intelligible artificial intelligence. arXiv preprint arXiv:1803.04263.