Figure 1 - uploaded by Vera Liao
XAI Question Bank, with leading questions in bold, and new questions identified from the interviews marked with *

Contexts in source publication

Context 1
... We present an extended XAI question bank (Figure 1) by combining algorithm-informed questions and user questions identified in the study. We discuss how it can be used as guidance and a tool to support the needs-specification work to create user-centered XAI applications. ...
Context 2
... each category, we created a leading question (e.g., "Why is this instance given this prediction?" for the Why category), and supplemented 2-3 additional example questions, inquiring about features and examples wherever applicable. The list of questions developed in this step is shown in Figure 1 without an asterisk. We do not claim the exhaustiveness of this list, but deem it sufficient as a study probe. ...
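As a concrete illustration of this structure, the following minimal Python sketch represents a question bank as categories that each pair a leading question with a few example questions; the category names and question wording are illustrative approximations, not the paper's exact list.

# Illustrative sketch of an XAI question bank: each needs category holds one
# leading question plus a few example questions (wording is approximate).
from dataclasses import dataclass, field
from typing import List

@dataclass
class QuestionCategory:
    name: str
    leading_question: str
    example_questions: List[str] = field(default_factory=list)

QUESTION_BANK = [
    QuestionCategory(
        name="Why",
        leading_question="Why is this instance given this prediction?",
        example_questions=[
            "What features of this instance lead to this prediction?",
            "Why are two instances given different predictions?",
        ],
    ),
    QuestionCategory(
        name="How (global)",
        leading_question="How does the model make predictions?",
        example_questions=["What features does the model consider?"],
    ),
]

def questions_for(category: str) -> List[str]:
    """Return the leading question followed by the example questions."""
    for cat in QUESTION_BANK:
        if cat.name == category:
            return [cat.leading_question, *cat.example_questions]
    return []

print(questions_for("Why"))

Used as a needs-specification probe, such a structure lets designers walk stakeholders through one category at a time, which is essentially what the question cards described in Context 3 do.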
Context 3
... MURAL, a visual collaboration tool, we created a card for each question category listed in Figure 1, with the leading and example questions (without an asterisk). Informants went through each card and discussed whether they encountered these questions from users; if not, we asked whether they saw that the questions would apply and in what situations. ...
Context 4
... perform gap analysis on the XAI question bank, we followed two steps. For the covered questions in each needs category, we identified new forms of questions that were not covered by the original example questions, as shown with asterisks in Figure 1. By forms, we mean that questions with the same intent but phrased differently were grouped together. ...
Context 5
... We first excluded 22 questions not generalizable to AI products, such as "what is the summary of the article?". We then iteratively grouped and coded the intent of the remaining 24 questions and identified 5 additional forms of questions in the Others category in Figure 1. Insights from the analysis will be discussed in the results section. ...
Context 6
... The potential gaps between algorithmic explanations and user needs, by examining passages coded as design challenge, and the additional questions identified in the gap analysis (Figure 1). To help answer the former, we first discuss key factors that may lead to the variability of explainability needs, which we identified by coding informants' reasons to include, exclude or prioritize a needs category. ...

Citations

... Doing this involves, first of all, a better understanding of the socio-technical constitution of the organisational environments within which such systems might be deployed and how trust in the accountable character of those environments is currently achieved. Exploring the sociotechnical challenges confronting the development of trustworthy AI in organisational settings is something that has so far been largely absent from the related literature (e.g., Yang et al. 2019, Liao et al. 2020, Ehsan et al. 2021, Glaser et al. 2021). ...
... Some studies, such as Wang et al. (2019), have approached these questions from a decision theory perspective. Other studies have followed an empirical approach based on interviews with clinicians (Liao et al. 2020) or scenario-based methods (Ehsan et al. 2021). What is lacking is an empirically-based exploration of what it means for AI to be accountable 'in the wild', and this is what we aim in this paper to begin to address by presenting evidence from our previous field studies of accounts and accountability practices in healthcare. ...
... As an aggregated and accumulating record of an AI system's behaviour (Ehsan et al. 2021) and the 'ground truth' of each case as it became available, this dataset would provide a more reliable system biography. This would help meet the need for an up-to-date, global account of AI performance (Liao et al. 2020) but in a form that is distinct from how global accounts have so far been conceived (Das & Rad 2020, Machlev et al. 2022). ...
Preprint
The need for AI systems to provide explanations for their behaviour is now widely recognised as key to their adoption. In this paper, we examine the problem of trustworthy AI and explore what delivering this means in practice, with a focus on healthcare applications. Work in this area typically treats trustworthy AI as a problem of Human-Computer Interaction involving the individual user and an AI system. However, we argue here that this overlooks the important part played by organisational accountability in how people reason about and trust AI in socio-technical settings. To illustrate the importance of organisational accountability, we present findings from ethnographic studies of breast cancer screening and cancer treatment planning in multidisciplinary team meetings to show how participants made themselves accountable both to each other and to the organisations of which they are members. We use these findings to enrich existing understandings of the requirements for trustworthy AI and to outline some candidate solutions to the problems of making AI accountable both to individual users and organisationally. We conclude by outlining the implications of this for future work on the development of trustworthy AI, including ways in which our proposed solutions may be re-used in different application settings.
... Our planned next steps are to analyse to what extent explanatory dialogue is present in these datasets and whether they constitute a suitable resource for our purpose. The proposed dialog system should also be able to recognize user intent, by matching a user query with an appropriate explanation method [50,100,102]. A query like "Which parts of the input contributed most to model output?" ...
Article
Artificial Intelligence (AI) systems are increasingly pervasive: Internet of Things, in-car intelligent devices, robots, and virtual assistants, and their large-scale adoption makes it necessary to explain their behaviour, for example to their users who are impacted by their decisions, or to their developers who need to ensure their functionality. This requires, on the one hand, to obtain an accurate representation of the chain of events that caused the system to behave in a certain way (e.g., to make a specific decision). On the other hand, this causal chain needs to be communicated to the users depending on their needs and expectations. In this phase of explanation delivery, allowing interaction between user and model has the potential to improve both model quality and user experience. The XAINES project investigates the explanation of AI systems through narratives targeted to the needs of a specific audience, focusing on two important aspects that are crucial for enabling successful explanation: generating and selecting appropriate explanation content, i.e. the information to be contained in the explanation, and delivering this information to the user in an appropriate way. In this article, we present the project’s roadmap towards enabling the explanation of AI with narratives.
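As a rough illustration of the intent-recognition step mentioned in the citing passage above, the sketch below matches a free-text user query to a question-bank category by simple keyword overlap; the category names and keyword sets are invented for illustration, and a real system would more likely rely on embeddings or a trained intent classifier.

# Toy sketch: map a user query to an XAI question category by keyword overlap.
# Category names and keyword sets are illustrative assumptions only.
import re

CATEGORY_KEYWORDS = {
    "Why": {"why", "reason", "contributed", "feature", "input", "parts"},
    "How (global)": {"how", "model", "work", "logic", "overall"},
    "What if": {"what", "if", "change", "would"},
    "Performance": {"accurate", "accuracy", "reliable", "wrong", "often"},
}

def tokenize(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))

def match_intent(query: str) -> str:
    """Return the category whose keywords overlap most with the query."""
    tokens = tokenize(query)
    return max(CATEGORY_KEYWORDS,
               key=lambda cat: len(tokens & CATEGORY_KEYWORDS[cat]))

print(match_intent("Which parts of the input contributed most to model output?"))
# -> "Why", i.e. a feature-attribution question under these toy keywords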
... We present the Multiverse, a research product that embodies this approach by facilitating the authoring of fictional discourse through Human-AI co-creation of literary artifacts by which the human user attends to an understanding of the AI mechanisms through interaction and creative co-writing process. That reflexive human understanding of the AI system is considered as a critical component in designing transparent, sustainable and explainable AI [33,48,56,61]. ...
... But AI technologies come with their own limitations, such as the lack of transparency and reproduction of already existing biases [2]. Explainable AI seeks to improve transparency and explainability of an AI system through facilitating forms of dialogue between humans and AI systems [22,48]. Collaborative creative activities are a form of dialogue, which go back and forth between humans and AI. ...
Conference Paper
Human creativity has often been aided and supported by artificial tools, spanning from traditional tools such as ideation cards, pens, and paper, to computers and software. Tools for creativity are increasingly using artificial intelligence to not only support the creative process, but also to act upon the creation with a higher level of agency. This paper focuses on writing fiction as a creative activity and explores human-AI co-writing through a research product, which employs a natural language processing model, the Generative Pre-trained Transformer 3 (GPT-3), to assist the co-authoring of narrative fiction. We report on two progressive – not comparative – autoethnographic studies to examine our own creative practices in light of our engagement with the research product: (1) a co-writing activity initiated by basic textual prompts using basic elements of narrative and (2) a co-writing activity initiated by more advanced textual prompts using elements of narrative, including dialects and metaphors, undertaken by one of the authors of this paper who has doctoral training in literature. In both studies, we quickly came up against the limitations of the system; then, we repositioned our goals and practices to maximize our chances of success. As a result, we discovered not only limitations but also hidden capabilities, which not only altered our creative practices and outcomes, but which began to change the ways we were relating to the AI as collaborator.
... Concerning the aspects that should be explained, we found the following options in the literature and validated them during the workshop with philosophers and psychologists: the system in general (e.g., global aspects of a system) [70], and, more specifically, its reasoning processes (e.g., inference processes for certain problems) [71], its inner logic (e.g., relationships between the inputs and outputs) [9], its model's internals (e.g., parameters and data structures) [72], its intention (e.g., pursued outcome of actions) [73], its behavior (e.g., real-world actions) [74], its decision (e.g., underlying criteria) [4], its performance (e.g., predictive accuracy) [75], and its knowledge about the user or the world (e.g., user preferences) [74]. ...
... Furthermore, explainability has a positive impact on the mental-model accuracy of involved parties. By giving explanations, it is possible to make users aware of the system's limitations [75], helping them to develop better mental models of it [94]. Explanations may also increase a user's ability to predict a decision and calibrate expectations with respect to what a system can or cannot do [75]. This can be attributed to an improved user awareness about a situation or about the system [12]. ...
Article
The growing complexity of software systems and the influence of software-supported decisions in our society sparked the need for software that is transparent, accountable, and trustworthy. Explainability has been identified as a means to achieve these qualities. It is recognized as an emerging non-functional requirement (NFR) that has a significant impact on system quality. Accordingly, software engineers need means to assist them in incorporating this NFR into systems. This requires an early analysis of the benefits and possible design issues that arise from interrelationships between different quality aspects. However, explainability is currently under-researched in the domain of requirements engineering, and there is a lack of artifacts that support the requirements engineering process and system design. In this work, we remedy this deficit by proposing four artifacts: a definition of explainability, a conceptual model, a knowledge catalogue, and a reference model for explainable systems. These artifacts should support software and requirements engineers in understanding the definition of explainability and how it interacts with other quality aspects. Besides that, they may be considered a starting point to provide practical value in the refinement of explainability from high-level requirements to concrete design choices, as well as on the identification of methods and metrics for the evaluation of the implemented requirements.
... This pressing need gives rise to the field of explainable AI (XAI) [29], which has made commendable progress in producing a growing collection of techniques to enable algorithmic explanations. While the technical landscape of XAI is increasingly broad (for an overview of XAI techniques, see [5,27,28,51]), these techniques typically aim to address user questions such as "how does the model make decisions" or "why does the model make a particular decision" [48] through means such as revealing how the model weighs different features, and what rules it follows. ...
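To make the "how the model weighs different features" style of explanation tangible, here is a minimal sketch using permutation importance on a toy model; it assumes scikit-learn is available and is not tied to any particular technique cited in the passage above.

# Minimal sketch of a "how does the model weigh different features" explanation
# using permutation importance (scikit-learn assumed to be installed).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure the drop in accuracy:
# a larger drop means the model relies more heavily on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")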
... Our work similarly adopts a sociotechnical perspective to expand the XAI design space and takes a goal-oriented stance. Focusing on AI supporting human decision-making (as opposed to complete automation), we aim to support the end-goal of actionability for decision-making [48] and contestability when AI is wrong [53]. ...
... In principle, RAI works call for explicit considerations of "how things can go wrong" (risks, harms, and ethical issues in general) and addressing these issues proactively during development, rather than reactively after deployment [4,65,72]. However, recent work studying how practitioners deal with RAI issues on the ground, including AI fairness [15,35,54,65], transparency [31,37,48], accountability [70], and overall harms mitigation [4,71], reports that practitioners grapple with tremendous challenges. Proactively anticipating potential harms for complex systems deployed in heterogeneous social contexts is inherently challenging. ...
Preprint
Mistakes in AI systems are inevitable, arising from both technical limitations and sociotechnical gaps. While black-boxing AI systems can make the user experience seamless, hiding the seams risks disempowering users to mitigate fallouts from AI mistakes. While Explainable AI (XAI) has predominantly tackled algorithmic opaqueness, we propose that seamful design can foster Human-centered XAI by strategically revealing sociotechnical and infrastructural mismatches. We introduce the notion of Seamful XAI by (1) conceptually transferring "seams" to the AI context and (2) developing a design process that helps stakeholders design with seams, thereby augmenting explainability and user agency. We explore this process with 43 AI practitioners and users, using a scenario-based co-design activity informed by real-world use cases. We share empirical insights, implications, and critical reflections on how this process can help practitioners anticipate and craft seams in AI, how seamfulness can improve explainability, empower end-users, and facilitate Responsible AI.
... Explanations should cater to different types of users (i.e. personas), and their explanation needs [6,7]. We refer to an AI system which is capable of catering to these user needs as an XAI system. All facets of XAI lead to the same conclusion that XAI is not one-shot, meaning an XAI system should be interactive, able to provide multiple explanations considering different personas and their needs during design [8,7]. We refer to a constellation of explainers curated to address such needs within an XAI system as an Explanation Strategy. ...
... Figure 8 illustrates a partial scenario where all three conditional navigations are used. Imagine the user and the chatbot are at a state where they enter into a disagreement (1), the user provides details and receives clarifications (2-3), the user indicates the need for more explanation (4-5), the user poses a new question and provides target details (6-9), the chatbot navigates the behaviour tree and provides an explanation that answers the new question (10), the user indicates they have other questions (11-12) and the BT repeats steps 6 to 10, the user indicates their satisfaction and that they are happy to move to evaluation (13), and the chatbot presents the evaluation questionnaire for the user to answer (14). Note that greet, persona and evaluation nodes are at an abstract level whereas explanation need, explanation strategy and disagreement sub-trees present fine-grained functionalities. ...
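The walk-through above can be pictured with a minimal behaviour-tree sketch in Python; the node and leaf names are invented for illustration and are not the paper's actual dialogue model. A sequence node runs its children in order, a fallback node tries children until one succeeds, and leaf actions stand in for chatbot behaviours such as clarifying or explaining.

# Minimal behaviour-tree sketch for a conversational XAI flow.
# Node and behaviour names are illustrative, not the paper's actual model.

class Node:
    def tick(self, ctx: dict) -> bool:
        raise NotImplementedError

class Sequence(Node):
    """Succeeds only if all children succeed, run in order."""
    def __init__(self, *children): self.children = children
    def tick(self, ctx): return all(child.tick(ctx) for child in self.children)

class Fallback(Node):
    """Tries children in order until one succeeds."""
    def __init__(self, *children): self.children = children
    def tick(self, ctx): return any(child.tick(ctx) for child in self.children)

class Action(Node):
    """Leaf behaviour wrapping a chatbot action."""
    def __init__(self, name, fn): self.name, self.fn = name, fn
    def tick(self, ctx):
        print(f"[{self.name}]")
        return self.fn(ctx)

# Illustrative leaf behaviours.
def needs_clarification(ctx): return ctx.get("disagreement", False)
def clarify(ctx): ctx["disagreement"] = False; return True
def explain(ctx): return True
def evaluate(ctx): return True

dialogue = Sequence(
    Fallback(                       # resolve a disagreement if there is one
        Sequence(Action("check disagreement", needs_clarification),
                 Action("clarify", clarify)),
        Action("no disagreement", lambda ctx: True),
    ),
    Action("explain", explain),     # deliver an explanation for the user's question
    Action("evaluate", evaluate),   # present the evaluation questionnaire
)

dialogue.tick({"disagreement": True})

Personalisation then amounts to extending or modifying sub-trees, which is the property the abstract below highlights.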
Preprint
Explainable AI (XAI) has the potential to make a significant impact on building trust and improving the satisfaction of users who interact with an AI system for decision-making. There is an abundance of explanation techniques in literature to address this need. Recently, it has been shown that a user is likely to have multiple explanation needs that should be addressed by a constellation of explanation techniques which we refer to as an explanation strategy. This paper focuses on how users interact with an XAI system to fulfil these multiple explanation needs satisfied by an explanation strategy. For this purpose, the paper introduces the concept of an "explanation experience" - as episodes of user interactions captured by the XAI system when explaining the decisions made by its AI system. In this paper, we explore how to enable and capture explanation experiences through conversational interactions. We model the interactive explanation experience as a dialogue model. Specifically, Behaviour Trees (BT) are used to model conversational pathways and chatbot behaviours. A BT dialogue model is easily personalised by dynamically extending or modifying it to attend to different user needs and explanation strategies. An evaluation with a real-world use case shows that BTs have a number of properties that lend naturally to modelling and capturing explanation experiences; as compared to traditionally used state transition models.
... Several studies have contributed to this subject by providing question banks (e.g. [36,37]) and design principles (e.g. [38]). ...
Article
Several studies have addressed the importance of context and users’ knowledge and experience in quantifying the usability and effectiveness of the explanations generated by explainable artificial intelligence (XAI) systems. However, to the best of our knowledge, no component-agnostic system that accounts for this need has yet been built. This paper describes an approach called ConvXAI, which can create a dialogical multimodal interface for any black-box explainer by considering the knowledge and experience of the user. First, we formally extend the state-of-the-art conversational explanation framework by introducing clarification dialogue as an additional dialogue type. We then implement our approach as an off-the-shelf Python tool. To evaluate our framework, we performed a user study including 45 participants divided into three groups based on their level of technology use and job function. Experimental results show that (i) different groups perceive explanations differently; (ii) all groups prefer textual explanations over graphical ones; and (iii) ConvXAI provides clarifications that enhance the usefulness of the original explanations.
... Although these guidelines are relevant and suitable for a wide range of common AI-enabled systems, more nuanced guidelines are desirable for domains where study participants cannot be recruited nor interviewed in abundance. Similarly, previous attempts to guide the design of effective transparency mechanisms acknowledge that real stakeholders involved should be considered and understood 17,32,37. Starting from the identification of diverse design goals according to users' needs and their level of expertise on AI technology, and a categorization of evaluation measures for Explainable Artificial Intelligence (XAI) systems 38, addressed the multidisciplinary efforts needed to build such systems. ...
Article
Transparency in Machine Learning (ML), often also referred to as interpretability or explainability, attempts to reveal the working mechanisms of complex models. From a human-centered design perspective, transparency is not a property of the ML model but an affordance, i.e., a relationship between algorithm and users. Thus, prototyping and user evaluations are critical to attaining solutions that afford transparency. Following human-centered design principles in highly specialized and high stakes domains, such as medical image analysis, is challenging due to the limited access to end users and the knowledge imbalance between those users and ML designers. To investigate the state of transparent ML in medical image analysis, we conducted a systematic review of the literature from 2012 to 2021 in PubMed, EMBASE, and Compendex databases. We identified 2508 records and 68 articles met the inclusion criteria. Current techniques in transparent ML are dominated by computational feasibility and barely consider end users, e.g. clinical stakeholders. Despite the different roles and knowledge of ML developers and end users, no study reported formative user research to inform the design and development of transparent ML models. Only a few studies validated transparency claims through empirical user evaluations. These shortcomings put contemporary research on transparent ML at risk of being incomprehensible to users, and thus, clinically irrelevant. To alleviate these shortcomings in forthcoming research, we introduce the INTRPRT guideline, a design directive for transparent ML systems in medical image analysis. The INTRPRT guideline suggests human-centered design principles, recommending formative user research as the first step to understand user needs and domain requirements. Following these guidelines increases the likelihood that the algorithms afford transparency and enable stakeholders to capitalize on the benefits of transparent ML.
... To surface a model's features, one can rely on a plethora of explainability methods [185]. Certain models are built with the idea of being explainable by design [216,274], while post-hoc interpretability methods are applied to others [18,176,211], with different properties (e.g., different natures of explanation, being correlation- or causation-based; different scopes, be it local or global; different mediums, be it visual or textual; etc.) [126,206]. It is now important to adapt such feature explanations to allow for checking their alignment with human-expected features. ...
Preprint
Despite the impressive performance of Artificial Intelligence (AI) systems, their robustness remains elusive and constitutes a key issue that impedes large-scale adoption. Robustness has been studied in many domains of AI, yet with different interpretations across domains and contexts. In this work, we systematically survey the recent progress to provide a reconciled terminology of concepts around AI robustness. We introduce three taxonomies to organize and describe the literature both from a fundamental and applied point of view: 1) robustness by methods and approaches in different phases of the machine learning pipeline; 2) robustness for specific model architectures, tasks, and systems; and in addition, 3) robustness assessment methodologies and insights, particularly the trade-offs with other trustworthiness properties. Finally, we identify and discuss research gaps and opportunities and give an outlook on the field. We highlight the central role of humans in evaluating and enhancing AI robustness, considering the necessary knowledge humans can provide, and discuss the need for better understanding practices and developing supportive tools in the future.
... Later, [20] suggests that XAI explanations answer specific questions about data, its processing and results in ML. They map existing XAI solutions to questions and create an XAI question bank that supports the design of user-centered XAI applications. ...
... To get (E, E′), we ask what the user wants in natural language with pre-written answers. To do so, AutoXAI uses the question bank from [20] for explanandum proposals, letting the user choose which question the XAI solution should answer. For the explanan, AutoXAI uses the list of explanation types from [7]. ...
... With this knowledge, a contextual prefiltering (see Section 2.3) is possible by selecting the candidate proposals associated with (E, E′). This is possible by tagging each of these proposals with a tuple from (E, E′) using the correspondence tables of [15,20]. The (E, E′) tags also serve for contextual modeling (see Section 2.3). ...
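The tagging and prefiltering described here can be pictured with a small, entirely illustrative correspondence table; the question wording, explanation types, and method names below are assumptions, not the actual tables of [15,20].

# Illustrative correspondence table mapping (explanandum, explanans) tuples
# to candidate XAI methods; entries are hypothetical, not from [15,20].
CORRESPONDENCE = {
    ("Why is this instance given this prediction?", "feature attribution"):
        ["LIME", "SHAP"],
    ("How does the model make predictions?", "global surrogate"):
        ["decision-tree surrogate", "rule extraction"],
    ("What if this feature changes?", "counterfactual"):
        ["counterfactual search"],
}

def prefilter(explanandum: str, explanans: str) -> list:
    """Contextual prefiltering: keep only XAI methods tagged with (E, E')."""
    return CORRESPONDENCE.get((explanandum, explanans), [])

print(prefilter("Why is this instance given this prediction?", "feature attribution"))
# -> ['LIME', 'SHAP']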