About
170 Publications · 73,577 Reads · 3,402 Citations
Introduction
Professor in Applied Cognitive Science, Cognitive Modeling, and Human Factors. Studies explainability in artificial intelligence, expert decision making, and knowledge representation. Developer of PEBL, the Psychology Experiment Building Language. http://shanetmueller.info
Additional affiliations
August 2015 - present
June 2006 - April 2015
August 2011 - August 2015
Publications (170)
This is the fourth in a series of essays about explainable AI. Previous essays laid out the theoretical and empirical foundations. This essay focuses on Deep Nets, and considers methods for allowing system users to generate self-explanations. This is accomplished by exploring how the Deep Net systems perform when they are operating at their bounda...
Deep Image classifiers have made amazing advances in both basic and applied problems in recent years. Nevertheless, they are still very limited and can be foiled by even simple image distortions. Importantly, the way they fail is often unexpected, and sometimes difficult to even understand. Thus, advances in image classifiers made to improve their...
This is an integrative review that addresses the question, "What makes for a good explanation?" with reference to AI systems. Pertinent literatures are vast. Thus, this review is necessarily selective. That said, most of the key concepts and issues are expressed in this Report. The Report encapsulates the history of computer science efforts to create...
The question addressed in this paper is: If we present to a user an AI system that explains how it works, how do we know whether the explanation works and the user has achieved a pragmatic understanding of the AI? In other words, how do we know that an explainable AI system (XAI) is any good? Our focus is on the key concepts of measurement. We di...
In the last decade, there has been a significant increase in the use of commercial semi-autonomous vehicles by consumers. This has led to a surge in concerns among users about the limitations of these systems, especially when it comes to safety. In order to address these concerns, users often seek out diverse educational resources to comprehend the...
https://www.teachingprofessor.com/topics/preparing-to-teach/assignments/concept-mapping-an-active-and-constructive-ai-proof-classroom-assignment/
Empathy is an essential part of communication in healthcare. For artificial intelligence (AI) to be successfully incorporated as an integral part of healthcare systems, it may need to incorporate cognitively empathetic interactions with patients, meaning it uses the reasoning, the perspectives, and the information of patients as much as possible. I...
In the past decade, consumer adoption of commercial semi-autonomous vehicles has increased, and along with it user concerns about the shortcomings of these systems, especially regarding safety. Users often turn to social media forums to discuss these shortcomings, find workarounds, and confirm their experience is common. We suggest that these forum...
Introduction
The purpose of the Stakeholder Playbook is to enable the developers of explainable AI systems to take into account the different ways in which different stakeholders or role-holders need to “look inside” the AI/XAI systems.
Method
We conducted structured cognitive interviews with senior and mid-career professionals who had direct expe...
The development of AI systems represents a significant investment of funds and time. Assessment is necessary in order to determine whether that investment has paid off. Empirical evaluation of systems in which humans and AI systems act interdependently to accomplish tasks must provide convincing empirical evidence that the work system is learnable...
This paper summarizes the psychological insights and related design challenges that have emerged in the field of Explainable AI (XAI). This summary is organized as a set of principles, some of which have recently been instantiated in XAI research. The primary aspects of implementation to which the principles refer are the design and evaluation stag...
When people make plausibility judgments about an assertion, an event, or a piece of evidence, they are gauging whether it makes sense that the event could transpire as it did. Therefore, we can treat plausibility judgments as a part of sensemaking. In this paper, we review the research literature, presenting the different ways that plausibility has...
This study: (a) compared executive functions between deficit (DS) and non-deficit schizophrenia (NDS) patients and healthy controls (HC), controlling premorbid IQ and level of education; (b) compared executive functions in DS and NDS patients, controlling premorbid IQ and psychopathological symptoms; and (c) estimated relationships between clinical...
If a user is presented an AI system that portends to explain how it works, how do we know whether the explanation works and the user has achieved a pragmatic understanding of the AI? This question entails some key concepts of measurement such as explanation goodness and trust. We present methods for enabling developers and researchers to: (1) Asses...
Modern artificial intelligence (AI) and machine learning (ML) systems have become more capable and more widely used, but often involve underlying processes their users do not understand and may not trust. Some researchers have addressed this by developing algorithms that help explain the workings of the system using ‘Explainable’ AI algorithms (XAI...
Research on empathy has traditionally focused on affective aspects, but cooperative work can also involve cognitive aspects of empathy in terms of common ground and perspective-taking. Many of the affective empathetic processes involve the same kinds of intuitive decision making processes invoked in naturalistic decision making research. Furthermor...
With the recent deployment of the latest generation of Tesla’s Full Self-Driving (FSD) mode, consumers are using semi-autonomous vehicles in both highway and residential driving for the first time. As a result, drivers are facing complex and unanticipated situations with an unproven technology, which is a central challenge for cooperative cognition...
Research has shown that university students are often miscalibrated in their judgments of learning (JOL) and are likely to use unreliable cues like recognition heuristics or retrieval fluency to assess their knowledge. These cues can be misleading so that students believe they understand the material when they do not, and consequently may not inves...
AI systems are increasingly being developed to provide the first point of contact for patients. These systems are typically focused on question-answering and integrating chat systems with diagnostic algorithms, but are likely to suffer from many of the same deficiencies in explanation that have plagued medical diagnostic systems since the 1970s ( S...
Abstract
Decisions made by Artificial Intelligence/Machine Learning (AI/ML) systems affect our daily lives. Therefore, it is important to be able to predict, and even to know, whether or when these systems might make a mistake.
Some previous training approaches show learners of these cognitively challenging systems examples of correct and incorrect cl...
The development of AI systems represents a significant investment. But to realize the promise of that investment, performance assessment is necessary. Empirical evaluation of Human-AI work systems must adduce convincing empirical evidence that the work method and its AI technology are learnable, usable, and useful. The theme of this Report is the n...
This Report is a companion to the Report titled "Requirements for the Evaluation of Human-AI Work Systems." Whereas that Report focused on the minimum necessary empirical requirements for the assessment of AI systems, this Report provides additional recommendations and technical details to assist the developers of AI systems. Recommendations are pr...
The purpose of the Stakeholder Playbook is to enable system developers to take into account the different ways in which stakeholders need to "look inside" of the AI/XAI systems. Recent work on Explainable AI has mapped stakeholder categories onto explanation requirements. While most of these mappings seem reasonable, they have been largely speculat...
An important subdomain in research on Human-Artificial Intelligence interaction is Explainable AI (XAI). XAI attempts to improve human understanding and trust in machine intelligence and automation by providing users with visualizations and other information that explain decisions, actions, and plans. XAI approaches have primarily used algorithmic...
This report describes a Self-Explaining Scorecard for appraising the self-explanatory support capabilities of XAI systems. The Scorecard might be useful in conceptualizing the various ways in which XAI system developers are supporting users, and might also help in comparing and contrasting the various approaches.
When people make plausibility judgments about an assertion, an event, or a piece of evidence, they are gauging whether it makes sense. Therefore, we can treat plausibility judgments as sensemaking activities. In this paper, we review the research literature, presenting the different ways that plausibility has been defined and measured. Then we desc...
This material is approved for public release. Distribution is unlimited. This material is based on research sponsored by the Air Force Research Lab (AFRL) under agreement number FA8650-17-2-7711. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views...
The Cognitive Tutorial concept is based on the view that the genuine cognitive challenges to forming functional and accurate mental models of AI systems can be formalized, documented, and "trained in." Its purpose is to serve as a means of global explanation of an AI or machine learning system. A Cognitive Tutorial is created specifically to accele...
The process of explaining something to another person is more than offering a statement. Explaining means taking the perspective and knowledge of the Learner into account and determining whether the Learner is satisfied. While the nature of explanation—conceived of as a set of statements—has been explored philosophically and empirically, the proces...
Explainable AI represents an increasingly important category of systems that attempt to support human understanding and trust in machine intelligence and automation. Typical systems rely on algorithms to help understand underlying information about decisions and establish justified trust and reliance. Researchers have proposed using goodness criter...
Human factors researchers often collect qualitative data that involve statements about a system or tool. Establishing consistent patterns in such data is important for making conclusions about the data. When a theoretically motivated coding scheme has not been established, one might use card sorting techniques to have independent raters generate a...
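For coding schemes like the one this entry describes, agreement between independent raters is often quantified with Cohen's kappa. A minimal sketch in Python, using hypothetical codes rather than data from the paper:

```python
# Cohen's kappa sketch for two raters coding the same statements into
# categories (hypothetical codes; not data from the paper).
from collections import Counter

rater_a = ["usability", "trust", "usability", "speed", "trust", "usability"]
rater_b = ["usability", "trust", "speed",     "speed", "trust", "usability"]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: product of each rater's marginal category proportions,
# summed over categories.
ca, cb = Counter(rater_a), Counter(rater_b)
expected = sum((ca[c] / n) * (cb[c] / n) for c in set(ca) | set(cb))

kappa = (observed - expected) / (1 - expected)
print(f"kappa = {kappa:.2f}")  # 0.75 for these hypothetical codes
```

Kappa corrects raw percent agreement for the agreement two raters would reach by chance given their marginal category proportions.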
An essential component to intelligence analysis is inferring an explanation for uncertain, contradictory, and incomplete data. In order to arrive at the best explanation, effective analysts in any discipline conduct an iterative, convergent broadening and narrowing hypothesis assessment using their own tradecraft. Based on this observation, we deve...
The field of Explainable AI (XAI) has focused primarily on algorithms that can help explain decisions and classification and help understand whether a particular action of an AI system is justified. These XAI algorithms provide a variety of means for answering a number of questions human users might have about an AI. However, explanation is...
Background
Artificial Intelligence has the potential to revolutionize healthcare, and it is increasingly being deployed to support and assist medical diagnosis. One potential application of AI is as the first point of contact for patients, replacing initial diagnoses prior to sending a patient to a specialist, allowing health care professionals to...
Explainable Artificial Intelligence (XAI) has re-emerged in response to the development of modern AI and ML systems. These systems are complex and sometimes biased, but they nevertheless make decisions that impact our lives. XAI systems are frequently algorithm-focused; starting and ending with an algorithm that implements a basic untested idea abo...
AI systems are increasingly being deployed to provide the first point of contact for patients. These systems are typically focused on question-answering, and suffer from many of the same deficiencies in explanation that have plagued medical diagnostic systems since the 1970s (Shortliffe, Buchanan, and Feigenbaum, 1979). They provide information tha...
The success of deep image classification networks has been met with enthusiasm and investment from both the academic community and industry. We hypothesize users will expect these systems to behave similarly to humans, and to succeed and fail in ways humans do. To investigate this, we tested six popular image classifiers on imagery from ten tool ca...
In many human performance tasks, researchers assess performance by measuring both accuracy and response time. A number of theoretical and practical approaches have been proposed to obtain a single performance value that combines these measures, with varying degrees of success. In this report, we examine data from a common paradigm used in applied h...
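Two commonly proposed single-value combinations in this literature are the inverse efficiency score and the rate-correct score. A minimal Python sketch with hypothetical numbers (not the report's data):

```python
# Two common single-value combinations of accuracy and response time:
# inverse efficiency score (IES) and rate-correct score (RCS).
# All numbers below are hypothetical, not the report's data.

def inverse_efficiency(mean_rt_s, prop_correct):
    """IES = mean correct RT / proportion correct (lower is better)."""
    return mean_rt_s / prop_correct

def rate_correct(n_correct, total_time_s):
    """RCS = correct responses per second of task time (higher is better)."""
    return n_correct / total_time_s

# Example: 0.6 s mean RT at 90% accuracy vs. 0.5 s at 75% accuracy.
ies_a = inverse_efficiency(0.6, 0.90)
ies_b = inverse_efficiency(0.5, 0.75)
# IES treats these two conditions as equivalent, illustrating how a
# combined score can mask a speed-accuracy trade-off.

rcs = rate_correct(45, 60.0)  # 45 correct responses in 60 s -> 0.75
```

The IES example shows why such composites need care: quite different speed-accuracy profiles can map onto identical scores.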
The success of deep image classification networks has been met with enthusiasm and investment from both the academic community and industry. We suggest that human users of these systems will expect AI image classifiers to understand visual concepts similarly to humans, and thus expect them to succeed and fail in ways humans do. To investigate this,...
Current discussions of "Explainable AI" (XAI) do not much consider the role of abduction in explanatory reasoning (see Mueller, et al., 2018). It might be worthwhile to pursue this, to develop intelligent systems that allow for the observation and analysis of abductive reasoning and the assessment of abductive reasoning as a learnable skill. Abduct...
The git repository (https://stmueller.github.io/epidemic-agents/) includes a lightweight agent-based epidemic simulation model implemented in R. It was developed to explore the cognitive and psychological impacts of various policy decisions. This git repository includes R markdown files producing simulation output, plus markdown-generated web pages...
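For readers unfamiliar with agent-based epidemic models, a minimal SIR-style sketch in Python illustrates the general approach; all parameters and dynamics here are illustrative assumptions, not taken from the R repository:

```python
# Minimal agent-based SIR epidemic sketch (illustrative only; the
# repository described above is implemented in R and differs in detail).
import random

random.seed(1)

N, P_TRANSMIT, P_RECOVER, CONTACTS = 200, 0.05, 0.1, 5
state = ["S"] * N          # S = susceptible, I = infected, R = recovered
state[0] = "I"             # seed a single infection

history = []
for day in range(100):
    new_state = state[:]   # synchronous update: read old state, write new
    for i, s in enumerate(state):
        if s == "I":
            # Each infected agent contacts a few random others per day.
            for j in random.sample(range(N), CONTACTS):
                if state[j] == "S" and random.random() < P_TRANSMIT:
                    new_state[j] = "I"
            if random.random() < P_RECOVER:
                new_state[i] = "R"
    state = new_state
    history.append(state.count("I"))

print("peak simultaneous infections:", max(history))
```

Policy questions (reducing `CONTACTS`, lowering `P_TRANSMIT`) can then be explored by rerunning the loop with different parameters.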
Modern artificial intelligence (AI) image classifiers have made impressive advances in recent years, but their performance often appears strange or violates expectations of users. This suggests that humans engage in cognitive anthropomorphism: expecting AI to have the same nature as human intelligence. This mismatch presents an obstacle to appropri...
Background
Evidence suggests that disruption in the cingulum bundle (CB) may influence executive dysfunctions in schizophrenia, but findings are still inconsistent. Using diffusion tensor imaging tractography, we investigated the differences in fiber integrity between schizophrenia patients and healthy controls together with the association between...
Modern AI image classifiers have made impressive advances in recent years, but their performance often appears strange or violates expectations of users. This suggests humans engage in cognitive anthropomorphism: expecting AI to have the same nature as human intelligence. This mismatch presents an obstacle to appropriate human-AI interaction. To de...
Medical diagnosis tends to follow most-likely-first strategy, where less likely causes are explored only once more likely causes are eliminated. When a patient does not understand this, the resulting errors may reduce trust and acceptance, even if the diagnostic policy were optimal. We hypothesize that explanations (either local justification or gl...
The exponential growth of data in many research fields means that revolutionary measures are needed for data management and accessibility. Both government regulations and scientific standards encourage open archival of research data, and the most popular avenue for sharing research data is online repositories. Most of the online data archives (e.g....
Scientific advances across a range of disciplines hinge on the ability to make inferences about unobservable theoretical entities on the basis of empirical data patterns. Accurate inferences rely on both discovering valid, replicable data patterns and accurately interpreting those patterns in terms of their implications for theoretical constructs....
Past research has established systematic effects of thermal stress on human comfort and cognitive performance. However, this research has primarily focused on extremes of temperature, ignoring moderate temperature ranges typically found in work environments and vehicles. Furthermore, models predicting the psychological impact of thermal environment...
Psychology Today blog post, Seeing What Others Don't.
Clustering analysis is a powerful tool (see Tan & Mueller, 2017) but it doesn’t get used as often as it should. It can provide us with different insights than factor analysis, even when applied to the same data.
https://www.psychologytoday.com/us/blog/seeing-what-others-dont/201908/how-many-persona...
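To illustrate the contrast the post draws, here is a minimal 1-D k-means sketch in Python on synthetic data (not code from the blog post): clustering groups cases into similar profiles, whereas factor analysis groups variables.

```python
# Minimal 1-D k-means sketch (k = 2) on synthetic two-group data,
# illustrating how clustering partitions cases rather than variables.
import random

random.seed(0)
data = [random.gauss(0, 1) for _ in range(50)] + \
       [random.gauss(5, 1) for _ in range(50)]

centers = [min(data), max(data)]            # simple deterministic init
for _ in range(20):
    # Assignment step: each point joins its nearest center's cluster.
    clusters = [[], []]
    for x in data:
        clusters[0 if abs(x - centers[0]) < abs(x - centers[1]) else 1].append(x)
    # Update step: each center moves to its cluster's mean.
    centers = [sum(c) / len(c) for c in clusters]

print(sorted(round(c, 1) for c in centers))  # centers near the two group means
```

With well-separated groups the two centers recover the generating means, a grouping of cases that a factor analysis of the same data would not produce.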
Explainability is assumed to be a key factor for the adoption of Artificial Intelligence systems in a wide range of contexts. The use of AI components in self-driving cars, medical diagnosis, or insurance and financial services has shown that when decisions are taken or suggested by automated systems it is essential for practical, social, and legal...
The process of explaining something to another person is more than offering a statement. Explaining means taking the perspective and knowledge of the Learner into account, and determining whether the Learner is satisfied. While the nature of explanation, conceived of as a set of statements, has been explored philosophically, empirically, and experim...
This study explored whether sensitivity to visuomotor discrepancies, specifically the ability to detect and respond to loss of control over a moving object, is associated with other psychological traits and abilities. College-aged adults performed a computerized tracking task which involved keeping a cursor centered on a moving target using keyboar...
In this chapter, word game expertise will be examined from a cognitive perspective. First, a general taxonomic space of word games where the primary organizing axis distinguishes letter versus meaning-centered games will be proposed. Next, crosswords and SCRABBLE will be the focus, and aspects of game expertise will be examined by summarizing past...
Slides from presentation panel with Robert Hoffman, Gary Klein, Tim Miller, and David Aha.
Word games are used extensively in STEM classrooms to help familiarize students with technical vocabulary and terminology. Despite their widely-documented use, only a handful of studies have tested whether they are effective at improving either retention or use of scientific concepts, and the results of these studies have been mixed. We report the combined res...
This paper introduces The Tracer Method that integrates two common approaches for understanding skilled performance: Cognitive Task Analysis (CTA) and Eye Tracking (ET). This combination has the potential to provide information for game designers and human computer interaction researchers that will guide feedback to areas with the greatest payoff....
Code for simulating interacting cognitive agents to produce accounts of knowledge polarization and consensus formation.
What makes for an explanation of “black box” AI systems such as Deep Nets? We reviewed the pertinent literatures on explanation and derived key ideas. This set the stage for our empirical inquiries, which include conceptual cognitive modeling, the analysis of a corpus of cases of "naturalistic explanation" of computational systems, computational co...
Image set implementing visual search problems with natural imagery.
https://zenodo.org/record/1219145
Sensemaking is described as how people make sense out of the world, and is an emergent process involving the interaction of low-level cognitive functions that have often been studied in isolation. If individuals who perform well in one aspect of sensemaking also excel in other aspects, this suggests it may be valuable to study sensemaking as an eme...
We describe the Eye Movement Minimal Model-Modified (EM4), a lightweight minimally-sufficient model of eye movements that accounts for visual search times in several distinct paradigms. The model allows visual search to be guided by probe-item similarity in different foveal zones, which enables the model to be used as a front-end for various models...
Word games such as crossword puzzles are widely used in education to help familiarize students with technical vocabulary. Despite an extensive literature discussing their use, few published research articles have established their effectiveness on memory retention and retrieval, especially in comparison to control study methods. We report two exper...
Two phenomena that are central to simulation research on opinion dynamics are opinion divergence—the result that individuals interacting in a group do not always collapse to a single viewpoint, and group polarization—the result that average group opinions can become more extreme after discussions than they were to begin with. Standard approaches to...
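A standard way such simulations produce opinion divergence is a bounded-confidence update in the style of the Hegselmann–Krause model; here is a minimal Python sketch with assumed parameters (not the paper's model):

```python
# Bounded-confidence opinion dynamics sketch: each agent averages only
# with peers whose opinions lie within a confidence threshold EPS, which
# can leave the group split into several stable camps (opinion divergence).
import random

random.seed(42)

N, EPS, STEPS = 40, 0.2, 50
opinions = [random.random() for _ in range(N)]   # opinions on [0, 1]

for _ in range(STEPS):
    # Synchronous update: every agent moves to the mean of its neighbors
    # (itself always included, so the denominator is never zero).
    opinions = [
        sum(o for o in opinions if abs(o - x) <= EPS) /
        len([o for o in opinions if abs(o - x) <= EPS])
        for x in opinions
    ]

camps = sorted(set(round(o, 3) for o in opinions))
print("surviving opinion camps:", camps)
```

Shrinking `EPS` tends to leave more surviving camps; widening it collapses the group toward consensus, which is why the standard model alone cannot also produce polarization beyond the initial opinion range.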
Previous research has demonstrated that adaptive training of working memory can substantially increase performance on the trained task. Such training effects have been reported for performance on simple span tasks, complex span tasks, and n-back tasks. Another task that has become a popular vehicle for studying working memory is the change-detectio...
Background
Situation awareness (SA) is defined in three levels: SA1 is the perception of the elements in a specific context, SA2 is the comprehension of their meaning, and SA3 is the projection of their future status.
Purpose
To analyze the possible association of a genetic polymorphism in the serotonin transporter ( SLC6A4) gene and performance on the S...
The integration of robotic systems into daily life is increasing, as technological advancements facilitate independent and interdependent decision-making by autonomous agents. Highly collaborative human-robot teams promise to maximize the capabilities of humans and machines. While a great deal of progress has been made toward developing efficient s...
Background. The Psychology Experiment Building Language (PEBL) test battery (http://pebl.sourceforge.net/) is a popular application for neurobehavioral investigations. This study evaluated the correspondence between the PEBL and the non-PEBL versions of four executive function tests.
Methods. In one cohort, young-adults (N = 44) completed both th...
Dataset for IGT and Digit Span
Scatterplot showing the association of variability of response times on the Conners' and the Psychology Experiment Building Language (PEBL) Continuous Performance Tests (CPT) among college student participants (N = 44).
Removal of one extreme score (upper-right) considerably reduced the proportion of variation accounted for (from R2 = .44, p < .0005 to R2 =...
Cohort1 dataset
Spreadsheet with CPT data.
Background. The Psychology Experiment Building Language (PEBL) software consists of over one-hundred computerized tests based on classic and novel cognitive neuropsychology and behavioral neurology measures. Although the PEBL tests are becoming more widely utilized, there is currently very limited information about the psychometric properties of th...
Raw data for the reliability data