Conference Paper

Designing for Responsible Trust in AI Systems: A Communication Perspective

... Software developers have long been considered an important and high-stakes type of knowledge worker [2,33] whose job requires years of training, is considerably stressful, and demands high mental load [26,29]. Recently, AI-powered code generation tools (GitHub Copilot, Tabnine, Kite, and Amazon CodeWhisperer, to name a few) have started to emerge and become revolutionary [38], which describes how trustworthiness is communicated in AI systems and how users make trust judgements. ...
... Trust has been highlighted as one of the most important factors in multiple design guidelines for human-AI collaboration [1,14]. Trust in AI is defined as the user's attitude or judgement about how the AI system can help when the user is in a situation of uncertainty or vulnerability [37,38,53], and it is therefore particularly important when users engage in high-stakes activities whose consequences can be severe [31]. Misalignment between users' trust and the AI's actual capability can lead to overreliance on and misuse of AI systems, resulting in undesired consequences [19,37,43]. ...
... For this reason, the HCI community has been calling for designing for calibrated trust or responsible trust [31,38]. ...
Preprint
Full-text available
Software developers commonly engage in online communities to learn about new technologies. As revolutionary AI-powered code generation tools such as GitHub Copilot emerge, many developers are uncertain about how to trust them. While we see the promise of online communities in helping developers build appropriate trust in AI tools, we know little about how communities shape developers' trust in AI tools and how community features can facilitate trust in the design of AI tools. We investigate these questions through a two-phase study. Through an interview study with 17 developers, we unpack how developers in online communities collectively make sense of AI code generation tools by developing proper expectations, understanding, strategies, and awareness of broader implications, as well as how they leverage community signals to evaluate AI suggestions. We then surface design opportunities and conduct 11 design probe sessions to explore the design space of integrating a user community into AI code generation systems. We conclude with a series of design recommendations.
... This double use of the term "trustworthiness" is reflected in seminal research on trust [70,79] that uses the term "trustworthiness" to refer to the subjective perception and the objective attributes of the trustee. This distinction is also reflected in the broader trust literature [e.g., 35,53,55,60,73,90,105,110,111]. To clearly distinguish between those two uses of the term trustworthiness, we thus propose that on the side of the trustee, there exists an actual trustworthiness (AT) reflecting the characteristics of the system. ...
... A cue is an "information element that can be used to make a trust assessment about an agent" ( [111], p. 253). Cues are pieces of information that presumably provide insights regarding the AT of a system [13,73,111]. Single cues may only provide narrow or even misleading insights regarding a system's AT, and each cue relates to a certain degree to the AT. ...
... Cancro, Pan, and Foulds [2022] surveyed and provided a taxonomy of system information (or cues) that can be used to calibrate trust. Liao and Sundar [2022] provide specific design ideas on which cues might improve trust calibration. Figure 1 shows the relations between the micro level components of the trustworthiness assessment model. ...
Preprint
Full-text available
Designing trustworthy algorithmic decision-making systems is a central goal in system design. Additionally, it is crucial that external parties can adequately assess the trustworthiness of systems. Ultimately, this should lead to calibrated trust: trustors adequately trust and distrust the system. But the process through which trustors assess actual trustworthiness of a system to end up at their perceived trustworthiness of a system remains underexplored. Transferring from psychological theory about interpersonal assessment of human characteristics, we outline a “trustworthiness assessment” model with two levels. On the micro level, trustors assess system trustworthiness utilizing cues. On the macro level, trustworthiness assessments proliferate between different trustors – one stakeholder’s trustworthiness assessment of a system affects others’ trustworthiness assessments of the same system. This paper contributes a theoretical model that advances understanding of trustworthiness assessment processes when confronted with algorithmic systems. It can be used to inspire system design, stakeholder training, and regulation.
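To make the micro-level, cue-based assessment concrete, the following toy sketch illustrates how a perceived trustworthiness judgment might be formed as a weighted aggregation of cue values. The cue names, values, and weights are hypothetical illustrations of the general idea, not the model proposed in the preprint.

```python
# Toy illustration of micro-level, cue-based trustworthiness assessment.
# Cue names, values, and weights are hypothetical; the preprint's model is
# conceptual and does not prescribe this arithmetic.

cues = {
    "documentation_quality": 0.8,   # how well the system is documented
    "third_party_audit":     1.0,   # an external audit exists
    "past_error_reports":    0.3,   # history of reported failures (low = many errors)
}

# How diagnostic the trustor believes each cue is for actual trustworthiness (AT).
weights = {
    "documentation_quality": 0.5,
    "third_party_audit":     0.3,
    "past_error_reports":    0.2,
}

# Perceived trustworthiness (PT) as a weighted average of cue values.
perceived_trustworthiness = sum(weights[c] * v for c, v in cues.items())
print(f"Perceived trustworthiness: {perceived_trustworthiness:.2f}")  # 0.76
```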
... Across the fields of HCI and AI, extensive research has studied how to calibrate clinicians' trust in AI. Much of this research can trace its origin to "explainable AI" literature that aims to make AI less like a "black box" [16,42,73]. As a result, when designing AI-powered DSTs, designers and researchers most often offered clinicians explanations of the AI's inner workings. ...
... Practitioner-facing tools (e.g., Microsoft's HAX "responsible AI" toolkit, data sheets [24], model cards [46]) further boosted this approach's real-world impact. Experiments showed that AI explanations improved clinicians' satisfaction with DSTs and increased the likelihood of them taking the AI's advice [9,42,51,60]. However, because AI suggestions are not always correct, "increasing the likelihood of clinicians taking the AI's advice" did not mean that clinicians made better decisions. ...
Conference Paper
Full-text available
Clinical decision support tools (DSTs), powered by Artificial Intelligence (AI), promise to improve clinicians' diagnostic and treatment decision-making. However, no AI model is always correct. DSTs must enable clinicians to validate each AI suggestion, convincing them to take the correct suggestions while rejecting its errors. While prior work often tried to do so by explaining AI's inner workings or performance, we chose a different approach: We investigated how clinicians validated each other's suggestions in practice (often by referencing scientific literature) and designed a new DST that embraces these naturalistic interactions. This design uses GPT-3 to draw literature evidence that shows the AI suggestions' robustness and applicability (or the lack thereof). A prototyping study with clinicians from three disease areas proved this approach promising. Clinicians' interactions with the prototype also revealed new design and research opportunities around (1) harnessing the complementary strengths of literature-based and predictive decision supports; (2) mitigating risks of de-skilling clinicians; and (3) offering low-data decision support with literature.
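The prototype described above uses GPT-3 to surface literature evidence; as a rough, hedged illustration of the underlying idea (retrieving evidence relevant to an AI suggestion), the sketch below substitutes simple TF-IDF retrieval over a small hypothetical corpus of abstracts. It is not the authors' pipeline, and the corpus and suggestion text are made up.

```python
# Minimal sketch of literature-backed decision support: given an AI suggestion,
# retrieve the most similar abstract from a (hypothetical) local corpus.
# The cited prototype uses GPT-3; this sketch substitutes simple TF-IDF retrieval.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [  # hypothetical abstracts standing in for a literature database
    "Beta blockers reduce mortality in patients with heart failure.",
    "ACE inhibitors are first-line therapy for hypertension in diabetic patients.",
    "Statin therapy lowers cardiovascular risk in high-risk adults.",
]

ai_suggestion = "Start an ACE inhibitor for this diabetic patient with hypertension."

vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(corpus + [ai_suggestion])
scores = cosine_similarity(doc_matrix[len(corpus)], doc_matrix[:len(corpus)]).ravel()

# Present the top-ranked abstract as candidate supporting (or contradicting) evidence.
best = scores.argmax()
print(f"Evidence (similarity {scores[best]:.2f}): {corpus[best]}")
```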
... The hypothesis is that participants (often recruited on crowdsourcing platforms) over-rely on the AI because they do not deeply engage with the explanations. The dual processing model has been cited as a useful framework [10,30,42,50,51]: instead of engaging in analytical reasoning with the explanations (system 2 thinking), people may invoke heuristics or rules-of-thumb to make a quick judgment (system 1 thinking). A recent study [27] suggested that people often invoke positive heuristics that superficially associate AI being explainable with being capable or trustworthy, which can lead to overreliance. ...
... To facilitate responsible design of AI support that prevents harmful overreliance [50], a fundamental understanding of people's decision-making process with AI is key. We believe the set of intuition-driven pathways identified in this work can be a useful tool to help understand why and even anticipate when inappropriate reliance may happen. ...
Preprint
Full-text available
AI explanations are often mentioned as a way to improve human-AI decision-making. Yet, empirical studies have not found consistent evidence of explanations' effectiveness and, on the contrary, suggest that they can increase overreliance when the AI system is wrong. While many factors may affect reliance on AI support, one important factor is how decision-makers reconcile their own intuition -- which may be based on domain knowledge, prior task experience, or pattern recognition -- with the information provided by the AI system to determine when to override AI predictions. We conduct a think-aloud, mixed-methods study with two explanation types (feature- and example-based) for two prediction tasks to explore how decision-makers' intuition affects their use of AI predictions and explanations, and ultimately their choice of when to rely on AI. Our results identify three types of intuition involved in reasoning about AI predictions and explanations: intuition about the task outcome, features, and AI limitations. Building on these, we summarize three observed pathways for decision-makers to apply their own intuition and override AI predictions. We use these pathways to explain why (1) the feature-based explanations we used did not improve participants' decision outcomes and increased their overreliance on AI, and (2) the example-based explanations we used improved decision-makers' performance over feature-based explanations and helped achieve complementary human-AI performance. Overall, our work identifies directions for further development of AI decision-support systems and explanation methods that help decision-makers effectively apply their intuition to achieve appropriate reliance on AI.
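For readers unfamiliar with the two explanation types compared in this study, the following minimal sketch contrasts a feature-based explanation (per-feature contributions of a linear model) with an example-based explanation (labels of the nearest training instances). The dataset and model are stand-ins, not the study's tasks or explanation implementations.

```python
# Contrast of feature-based vs. example-based explanations for one prediction.
# Toy data and model; not the study's stimuli or explanation methods.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

data = load_breast_cancer()
X, y = data.data, data.target
model = LogisticRegression(max_iter=5000).fit(X, y)
x = X[0]

# Feature-based: which features push this prediction most (coefficient * value).
contributions = model.coef_[0] * x
top = np.argsort(np.abs(contributions))[::-1][:3]
print("Top contributing features:",
      [(data.feature_names[i], round(float(contributions[i]), 2)) for i in top])

# Example-based: show the most similar training instances and their labels.
nn = NearestNeighbors(n_neighbors=3).fit(X)
_, idx = nn.kneighbors([x])
print("Labels of the 3 most similar training examples:", y[idx[0]])
```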
... As mentioned, AI is the summation of algorithmic processes. The creators of these algorithms are technologists and researchers in the field whose role is to develop and implement algorithms that are no less than perfect in dependability and accuracy [34]. However, human beings are flawed. ...
... Such impact will most likely be accidental, but that does not mean it can be ignored. The term stakeholder is commonly used when referencing all parties impacted by a decision or process [34]. The stakeholders of developed AI would be the users of such intelligence. ...
Preprint
The widespread adoption of Artificial Intelligence (AI) models by various industries in recent years has made Explainable Artificial Intelligence (XAI) an active field of research. This adoption can cause trust and effectiveness to suffer if the results of these models are not favorable in some way. XAI has advanced to the point where many metrics have been proposed as reasons for the outputs of many AI models. However, there is little consensus about which technical metrics are most important, nor is there a consensus on how best to analyze explainable methods and models. A discussion of varying attempts at this is brought forth, but the paper also goes into the ethics of AI and its societal impact. Given the modern ubiquity with which AI exists and the immensely multidisciplinary approach AI has evolved into, using only technical metrics cannot fully describe XAI's effectiveness. This paper explores several approaches to measuring the ethical effects of XAI, whether it has any bearing on modern research, as well as how the impacts of AI and XAI are measured on society. The full attempt at quantifying XAI models' effectiveness is explored from a technical and non-technical point of view.
... Within frameworks that build on business ethics, responsible innovation is often approached through three main dimensions: (1) avoiding harm, (2) doing good, usually via corporate social responsibility, and (3) implementing governance over the first two dimensions [92]. Some of these frameworks propose coarse-grained principles [86], while others focus on specific themes or aspects of innovation such as media [13], data [57], licensing [18], workflows [82], and trust [56]. ...
Preprint
Full-text available
Technology development practices in industry are often primarily focused on business results, which risks creating unbalanced power relations between corporate interests and the needs or concerns of people who are affected by technology implementation and use. These practices, and their associated cultural norms, may result in uses of technology that have direct, indirect, short-term, and even long-term negative effects on groups of people and/or the environment. This paper contributes a formative framework -- the Responsible and Inclusive Technology Framework -- that orients critical reflection around the social contexts of technology creation and use; the power dynamics between self, business, and societal stakeholders; the impacts of technology on various communities across past, present, and future dimensions; and the practical decisions that imbue technological artifacts with cultural values. We expect that the implementation of the Responsible and Inclusive Technology framework, especially in business-to-business industry settings, will serve as a catalyst for more intentional and socially-grounded practices, thus bridging the responsibility and principles-to-practice gap.
... For me, it's either 100% or zero." This potentially disagrees with prior work, which found numeric heuristics to be associated with algorithmic intelligence [51], although this may be due to our participant population, data scientists, who may have higher computational literacy than other groups. We suggest that effective code annotation in this context should either be ambient (ignorable unless needed) or provide value for active interaction (as a dedicated input mechanism). ...
Preprint
Full-text available
AI-powered code assistants, such as Copilot, are quickly becoming a ubiquitous component of contemporary coding contexts. Among these environments, computational notebooks, such as Jupyter, are of particular interest as they provide rich interface affordances that interleave code and output in a manner that allows for both exploratory and presentational work. Despite their popularity, little is known about the appropriate design of code assistants in notebooks. We investigate the potential of code assistants in computational notebooks by creating a design space (reified from a survey of extant tools) and through an interview-design study (with 15 practicing data scientists). Through this work, we identify challenges and opportunities for future systems in this space, such as the value of disambiguation for tasks like data visualization, the potential of tightly scoped domain-specific tools (like linters), and the importance of polite assistants.
... While the expectation is that explanations can help people detect flawed model reasoning and make better decisions, empirical studies either failed to observe this effect [96] or even found the opposite: explanations make people more likely to blindly follow the model when it is wrong, compared to showing only AI predictions [6,81,96,106]. Research has attributed this phenomenon to a lack of cognitive engagement with AI explanations [11,35,52,62]: when people lack either the motivation or ability to carefully analyze and reason about explanations, they make a quick heuristic judgment, which tends to superficially associate being explainable with being trustworthy [28,61]. A recent CSCW work by Vasconcelos et al. [91] further calls out that this lack of cognitive engagement will persist if XAI techniques remain hard to use, as people strategically choose to engage with explanations or simply defer to AI by weighing the cognitive costs. ...
Preprint
Full-text available
While a vast collection of explainable AI (XAI) algorithms has been developed in recent years, they are often criticized for significant gaps with how humans produce and consume explanations. As a result, current XAI techniques are often found to be hard to use and lack effectiveness. In this work, we attempt to close these gaps by making AI explanations selective -- a fundamental property of human explanations -- by selectively presenting a subset from a large set of model reasons based on what aligns with the recipient's preferences. We propose a general framework for generating selective explanations by leveraging human input on a small sample. This framework opens up a rich design space that accounts for different selectivity goals, types of input, and more. As a showcase, we use a decision-support task to explore selective explanations based on what the decision-maker would consider relevant to the decision task. We conducted two experimental studies to examine three out of a broader possible set of paradigms based on our proposed framework: in Study 1, we ask the participants to provide their own input to generate selective explanations, with either open-ended or critique-based input. In Study 2, we show participants selective explanations based on input from a panel of similar users (annotators). Our experiments demonstrate the promise of selective explanations in reducing over-reliance on AI and improving decision outcomes and subjective perceptions of the AI, but also paint a nuanced picture that attributes some of these positive effects to the opportunity to provide one's own input to augment AI explanations. Overall, our work proposes a novel XAI framework inspired by human communication behaviors and demonstrates its potential to encourage future work to better align AI explanations with human production and consumption of explanations.
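The core selectivity idea can be illustrated with a short sketch: keep only the model reasons that overlap with what the recipient reports as relevant, then show the strongest of those. The feature names, attribution scores, and elicited preferences below are hypothetical and are not taken from the paper's studies.

```python
# Minimal sketch of a "selective explanation": show only the model reasons that
# overlap with what the recipient says is relevant. Feature names, attributions,
# and the preference list are hypothetical.

attributions = {          # full set of model reasons (feature -> attribution score)
    "years_of_experience":  0.42,
    "education_level":      0.31,
    "zip_code":             0.18,
    "typing_speed":         0.05,
}

recipient_preferences = {"years_of_experience", "education_level"}  # elicited from the user

def selective_explanation(attributions, preferences, k=2):
    """Keep only preferred features, then return the k strongest of those."""
    relevant = {f: s for f, s in attributions.items() if f in preferences}
    return sorted(relevant.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]

print(selective_explanation(attributions, recipient_preferences))
```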
... Furthermore, as with other technology competencies [9], any complexities introduced by XAI will limit its accessibility and may exacerbate a digital divide in society. In some cases, we may aim to minimize or avoid explanations altogether at the end user level, instead validating the tools in advance through research [12] or independent evaluation [24], using XAI for post-hoc analysis [1,33], providing a more cue-like situational awareness of AI performance [16,23], or clarifying the conditions where performance is high [4,7]. ...
Conference Paper
Full-text available
Many opportunities and challenges accompany the use of AI in domains with complex human factors and risks. This paper proposes that in such domains the most advanced human-AI interactions will not arise from an emphasis on technical capabilities, but rather from an emphasis on understanding and applying existing human capabilities in new ways. A human-capabilities orientation is explored along with three aims for research and design.
Article
Full-text available
When forecasting events, multiple types of uncertainty are often inherently present in the modeling process. Various uncertainty typologies exist, and each type of uncertainty has different implications a scientist might want to convey. In this work, we focus on one type of distinction between direct quantitative uncertainty and indirect qualitative uncertainty. Direct quantitative uncertainty describes uncertainty about facts, numbers, and hypotheses that can be communicated in absolute quantitative forms such as probability distributions or confidence intervals. Indirect qualitative uncertainty describes the quality of knowledge concerning how effectively facts, numbers, or hypotheses represent reality, such as evidence confidence scales proposed by the Intergovernmental Panel on Climate Change. A large body of research demonstrates that both experts and novices have difficulty reasoning with quantitative uncertainty, and visualizations of uncertainty can help with such traditionally challenging concepts. However, the question of if, and how, people may reason with multiple types of uncertainty associated with a forecast remains largely unexplored. In this series of studies, we seek to understand if individuals can integrate indirect uncertainty about how “good” a model is (operationalized as a qualitative expression of forecaster confidence) with quantified uncertainty in a prediction (operationalized as a quantile dotplot visualization of a predicted distribution). Our first study results suggest that participants utilize both direct quantitative uncertainty and indirect qualitative uncertainty when conveyed as quantile dotplots and forecaster confidence. In manipulations where forecasters were less sure about their prediction, participants made more conservative judgments. In our second study, we varied the amount of quantified uncertainty (in the form of the SD of the visualized distributions) to examine how participants’ decisions changed under different combinations of quantified uncertainty (variance) and qualitative uncertainty (low, medium, and high forecaster confidence). The second study results suggest that participants updated their judgments in the direction predicted by both qualitative confidence information (e.g., becoming more conservative when the forecaster confidence is low) and quantitative uncertainty (e.g., becoming more conservative when the variance is increased). Based on the findings from both experiments, we recommend that forecasters present qualitative expressions of model confidence whenever possible alongside quantified uncertainty.
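As a hedged illustration of the quantified-uncertainty display used in these studies, the sketch below draws a 20-dot quantile dotplot for a made-up normal forecast with matplotlib; it does not reproduce the studies' stimuli or the forecaster-confidence manipulation.

```python
# Minimal sketch of a quantile dotplot: 20 equally likely quantiles of a
# forecast distribution, drawn as stacked dots. Forecast parameters are made up.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

mean, sd, n_dots = 30.0, 5.0, 20                 # hypothetical forecast
probs = (np.arange(n_dots) + 0.5) / n_dots       # 20 equally likely outcomes
quantiles = stats.norm.ppf(probs, loc=mean, scale=sd)

# Bin the quantiles and stack dots within each bin.
bins = np.round(quantiles / 2) * 2               # 2-unit-wide bins
counts, xs, ys = {}, [], []
for b in bins:
    counts[b] = counts.get(b, 0) + 1
    xs.append(b)
    ys.append(counts[b])

plt.scatter(xs, ys, s=200)
plt.yticks([])
plt.xlabel("Predicted value (each dot = 5% probability)")
plt.title("Quantile dotplot (toy forecast)")
plt.show()
```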
Conference Paper
Full-text available
As AI-powered systems increasingly mediate consequential decision-making, their explainability is critical for end-users to take informed and accountable actions. Explanations in human-human interactions are socially-situated. AI systems are often socio-organizationally embedded. However, Explainable AI (XAI) approaches have been predominantly algorithm-centered. We take a developmental step towards socially-situated XAI by introducing and exploring Social Transparency (ST), a sociotechnically informed perspective that incorporates the socio-organizational context into explaining AI-mediated decision-making. To explore ST conceptually, we conducted interviews with 29 AI users and practitioners grounded in a speculative design scenario. We suggested constitutive design elements of ST and developed a conceptual framework to unpack ST's effect and implications at the technical, decision-making, and organizational level. The framework showcases how ST can potentially calibrate trust in AI, improve decision-making, facilitate organizational collective actions, and cultivate holistic explainability. Our work contributes to the discourse of Human-Centered XAI by expanding the design space of XAI.
Article
Full-text available
Artificial intelligence (AI) ethics is now a global topic of discussion in academic and policy circles. At least 84 public–private initiatives have produced statements describing high-level principles, values and other tenets to guide the ethical development, deployment and governance of AI. According to recent meta-analyses, AI ethics has seemingly converged on a set of principles that closely resemble the four classic principles of medical ethics. Despite the initial credibility granted to a principled approach to AI ethics by the connection to principles in medical ethics, there are reasons to be concerned about its future impact on AI development and governance. Significant differences exist between medicine and AI development that suggest a principled approach for the latter may not enjoy success comparable to the former. Compared to medicine, AI development lacks (1) common aims and fiduciary duties, (2) professional history and norms, (3) proven methods to translate principles into practice, and (4) robust legal and professional accountability mechanisms. These differences suggest we should not yet celebrate consensus around high-level principles that hide deep political and normative disagreement.
Conference Paper
Full-text available
Trust in a Recommender System (RS) is crucial for its overall success. However, it remains underexplored whether users trust personal recommendation sources (i.e. other humans) more than impersonal sources (i.e. conventional RS), and, if they do, whether the perceived quality of explanation provided account for the difference. We conducted an empirical study in which we compared these two sources of recommendations and explanations. Human advisors were asked to explain movies they recommended in short texts while the RS created explanations based on item similarity. Our experiment comprised two rounds of recommending. Over both rounds the quality of explanations provided by users was assessed higher than the quality of the system's explanations. Moreover, explanation quality significantly influenced perceived recommendation quality as well as trust in the recommendation source. Consequently, we suggest that RS should provide richer explanations in order to increase their perceived recommendation quality and trustworthiness.
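A minimal sketch of the impersonal (system-generated) condition, in which explanations are based on item similarity, might look like the following; the rating matrix, movie titles, and wording of the explanation are illustrative assumptions, not the study's implementation.

```python
# Minimal sketch of an item-similarity explanation for a recommendation,
# in the spirit of the impersonal (system) condition. Ratings are hypothetical.
import numpy as np

movies = ["The Matrix", "Inception", "Titanic", "Interstellar"]
# User-item rating matrix (rows = users, columns = movies), 0 = unrated.
R = np.array([
    [5, 4, 1, 0],
    [4, 5, 2, 5],
    [1, 2, 5, 1],
], dtype=float)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# Item-item similarity computed from the rating columns.
target = movies.index("Interstellar")
liked = movies.index("Inception")          # an item the active user rated highly
sim = cosine(R[:, target], R[:, liked])

print(f"Recommended: {movies[target]}")
print(f"Explanation: because you liked {movies[liked]} (item similarity {sim:.2f}).")
```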
Article
Full-text available
At the dawn of the fourth industrial revolution, we are witnessing a fast and widespread adoption of Artificial Intelligence (AI) in our daily life, which contributes to accelerating the shift towards a more algorithmic society. However, even with such unprecedented advancements, a key impediment to the use of AI-based systems is that they often lack transparency. Indeed, the black box nature of these systems allows powerful predictions, but it cannot be directly explained. This issue has triggered a new debate on Explainable Artificial Intelligence. A research field that holds substantial promise for improving trust and transparency of AI-based systems. It is recognized as the sine qua non for AI to continue making steady progress without disruption. This survey provides an entry point for interested researchers and practitioners to learn key aspects of the young and rapidly growing body of research related to explainable AI. Through the lens of literature, we review existing approaches regarding the topic, we discuss trends surrounding its sphere and we present major research trajectories.
Article
Full-text available
Supervised machine-learning models boast remarkable predictive capabilities. But can you trust your model? Will it work in deployment? What else can it tell you about the world? Models should be not only good, but also interpretable, yet the task of interpretation appears underspecified. The academic literature has provided diverse and sometimes non-overlapping motivations for interpretability and has offered myriad techniques for rendering interpretable models. Despite this ambiguity, many authors proclaim their models to be interpretable axiomatically, absent further argument. Problematically, it is not clear what common properties unite these techniques. This article seeks to refine the discourse on interpretability. First it examines the objectives of previous papers addressing interpretability, finding them to be diverse and occasionally discordant. Then, it explores model properties and techniques thought to confer interpretability, identifying transparency to humans and post hoc explanations as competing concepts. Throughout, the feasibility and desirability of different notions of interpretability are discussed. The article questions the oft-made assertions that linear models are interpretable and that deep neural networks are not.
Article
Full-text available
In this article, we look at trust in artificial intelligence, machine learning (ML), and robotics. We first review the concept of trust in AI and examine how trust in AI may be different from trust in other technologies. We then discuss the differences between interpersonal trust and trust in technology and suggest factors that are crucial in building initial trust and developing continuous trust in artificial intelligence.
Article
Full-text available
In recent years many accurate decision support systems have been constructed as black boxes, that is, as systems that hide their internal logic from the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, delineates explicitly or implicitly its own definition of interpretability and explanation. The aim of this paper is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work. The proposed classification of approaches to open black box models should also be useful for putting the many research open questions in perspective.
Article
Full-text available
This chapter outlines the two basic routes to persuasion. One route is based on the thoughtful consideration of arguments central to the issue, whereas the other is based on the affective associations or simple inferences tied to peripheral cues in the persuasion context. This chapter discusses a wide variety of variables that proved instrumental in affecting the elaboration likelihood, and thus the route to persuasion. One of the basic postulates of the Elaboration Likelihood Model—that variables may affect persuasion by increasing or decreasing scrutiny of message arguments—has been highly useful in accounting for the effects of a seemingly diverse list of variables. The reviewers of the attitude change literature have been disappointed with the many conflicting effects observed, even for ostensibly simple variables. The Elaboration Likelihood Model (ELM) attempts to place these many conflicting results and theories under one conceptual umbrella by specifying the major processes underlying persuasion and indicating the way many of the traditionally studied variables and theories relate to these basic processes. The ELM may prove useful in providing a guiding set of postulates from which to interpret previous work and in suggesting new hypotheses to be explored in future research.
Article
Full-text available
The Elaboration Likelihood Model of persuasion (ELM) is discussed as it relates to source factors in persuasion. The ELM proposes that under low elaboration likelihood, source factors serve as simple acceptance or rejection cues; under moderate elaboration likelihood, source factors guide the extent of thinking; and under high elaboration likelihood, source factors are unimportant as cues or general motivators of thought but (if relevant) serve as persuasive arguments or help in interpreting arguments. Several experiments are described which provide empirical support for these propositions.
Article
Full-text available
As automated controllers supplant human intervention in controlling complex systems, the operators' role often changes from that of an active controller to that of a supervisory controller. Acting as supervisors, operators can choose between automatic and manual control. Improperly allocating function between automatic and manual control can have negative consequences for the performance of a system. Previous research suggests that the decision to perform the job manually or automatically depends, in part, upon the trust the operators invest in the automatic controllers. This paper reports an experiment to characterize the changes in operators' trust during an interaction with a semi-automatic pasteurization plant, and investigates the relationship between changes in operators' control strategies and trust. A regression model identifies the causes of changes in trust, and a 'trust transfer function' is developed using time series analysis to describe the dynamics of trust. Based on a detailed analysis of operators' strategies in response to system faults we suggest a model for the choice between manual and automatic control, based on trust in automatic controllers and self-confidence in the ability to control the system manually.
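As a rough illustration of modeling trust dynamics from lagged data (in the spirit of the "trust transfer function", though not the paper's actual model or data), the sketch below simulates a trust series that decays after faults and fits a simple lagged regression with ordinary least squares; all values are simulated.

```python
# Minimal sketch of a lagged model of trust dynamics: trust at time t regressed
# on trust at t-1 and a fault indicator. The data are simulated, not the
# experiment's pasteurization-plant measurements.
import numpy as np

rng = np.random.default_rng(0)
T = 200
fault = (rng.random(T) < 0.1).astype(float)      # occasional system faults
trust = np.zeros(T)
trust[0] = 0.8
for t in range(1, T):                            # simulate ground-truth dynamics
    trust[t] = 0.9 * trust[t - 1] - 0.3 * fault[t] + 0.05 + rng.normal(0, 0.02)

# Fit trust[t] = a * trust[t-1] + b * fault[t] + c with ordinary least squares.
X = np.column_stack([trust[:-1], fault[1:], np.ones(T - 1)])
coef, *_ = np.linalg.lstsq(X, trust[1:], rcond=None)
print("estimated a (inertia), b (fault effect), c:", np.round(coef, 3))
```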
Article
Full-text available
Signaling theory is useful for describing behavior when two parties (individuals or organizations) have access to different information. Typically, one party, the sender, must choose whether and how to communicate (or signal) that information, and the other party, the receiver, must choose how to interpret the signal. Accordingly, signaling theory holds a prominent position in a variety of management literatures, including strategic management, entrepreneurship, and human resource management. While the use of signaling theory has gained momentum in recent years, its central tenets have become blurred as it has been applied to organizational concerns. The authors, therefore, provide a concise synthesis of the theory and its key concepts, review its use in the management literature, and put forward directions for future research that will encourage scholars to use signaling theory in new ways and to develop more complex formulations and nuanced variations of the theory.
Article
Full-text available
Purpose The purpose of this paper is to examine the extent to which measures and operationalisations of intra‐organisational trust reflect the essential elements of the existing conceptualisation of trust inside the workplace. Design/methodology/approach The paper provides an overview of the essential points from the rich variety of competing conceptualisations and definitions in the management and organisational literatures. It draws on this overview to present a framework of issues for researchers to consider when designing research based on trust. This framework is then used to analyse the content of 14 recently published empirical measures of intra‐organisational trust. For each measure, the form that trust takes, the content, the sources of evidence and the identity of the recipient are noted, as well as matters related to the wording of items. Findings The paper highlights where existing measures match the theory, but also shows a number of "blind‐spots" or contradictions, particularly over the content of the trust belief, the selection of possible sources of evidence for trust, and inconsistencies in the identity of the referent. Research limitations/implications It offers researchers some recommendations for future research designed to capture trust among different parties in organisations, and contains an Appendix with 14 measures for intra‐organisational trust. Originality/value The value of the paper is twofold: it provides an overview of the conceptualisation literature, and a detailed content‐analysis of several different measures for trust. This should prove useful in helping researchers refine their research designs in the future.
Article
Full-text available
Signaling theory provides an opportunity to integrate an interactive theory of symbolic communication and social benefit with materialist theories of individual strategic action and adaptation. This article examines the potential explanatory value of signaling theory for a variety of anthropological topics, focusing on three social arenas in which signaling might plausibly be important: unconditional generosity, "wasteful" subsistence behavior, and artistic or craft traditions. In each case, it outlines the ways in which the phenomena correspond with the expectations of signaling theory by showing how a given pattern of action might signal particular hidden attributes, provide benefits to both signaler and observers, and meet the conditions for honest communication. The ethnographic evidence suggests that the fundamental conditions for reliable signaling of condition-dependent qualities may exist in many social domains. It appears that signaling theory has considerable promise for generating novel and powerful insights into the ethnographic realm.
Article
The spread of AI-embedded systems involved in human decision making makes studying human trust in these systems critical. However, empirically investigating trust is challenging. One reason is the lack of standard protocols to design trust experiments. In this paper, we present a survey of existing methods to empirically investigate trust in AI-assisted decision making and analyse the corpus along the constitutive elements of an experimental protocol. We find that the definition of trust is not commonly integrated in experimental protocols, which can lead to findings that are overclaimed or are hard to interpret and compare across studies. Drawing from empirical practices in social and cognitive studies on human-human trust, we provide practical guidelines to improve the methodology of studying Human-AI trust in decision-making contexts. In addition, we bring forward research opportunities of two types: one focusing on further investigation regarding trust methodologies and the other on factors that impact Human-AI trust.
Article
The wide adoption of Machine Learning (ML) technologies has created a growing demand for people who can train ML models. Some advocated the term "machine teacher'' to refer to the role of people who inject domain knowledge into ML models. This "teaching'' perspective emphasizes supporting the productivity and mental wellbeing of machine teachers through efficient learning algorithms and thoughtful design of human-AI interfaces. One promising learning paradigm is Active Learning (AL), by which the model intelligently selects instances to query a machine teacher for labels, so that the labeling workload could be largely reduced. However, in current AL settings, the human-AI interface remains minimal and opaque. A dearth of empirical studies further hinders us from developing teacher-friendly interfaces for AL algorithms. In this work, we begin considering AI explanations as a core element of the human-AI interface for teaching machines. When a human student learns, it is a common pattern to present one's own reasoning and solicit feedback from the teacher. When a ML model learns and still makes mistakes, the teacher ought to be able to understand the reasoning underlying its mistakes. When the model matures, the teacher should be able to recognize its progress in order to trust and feel confident about their teaching outcome. Toward this vision, we propose a novel paradigm of explainable active learning (XAL), by introducing techniques from the surging field of explainable AI (XAI) into an AL setting. We conducted an empirical study comparing the model learning outcomes, feedback content and experience with XAL, to that of traditional AL and coactive learning (providing the model's prediction without explanation). Our study shows benefits of AI explanation as interfaces for machine teaching--supporting trust calibration and enabling rich forms of teaching feedback, and potential drawbacks--anchoring effect with the model judgment and additional cognitive workload. Our study also reveals important individual factors that mediate a machine teacher's reception to AI explanations, including task knowledge, AI experience and Need for Cognition. By reflecting on the results, we suggest future directions and design implications for XAL, and more broadly, machine teaching through AI explanations.
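A bare-bones sketch of the active learning loop underlying XAL is shown below: the model queries the most uncertain pool instance and surfaces a simple coefficient-based rationale to the machine teacher. The data, model, and explanation format are toy assumptions and do not reflect the paper's interface or study design.

```python
# Minimal sketch of an active learning loop with a simple explanation shown at
# each query (coefficient * feature value), in the spirit of XAL. Toy data only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
labeled = list(range(10))                       # start with 10 labeled instances
pool = [i for i in range(len(X)) if i not in labeled]

for step in range(5):
    model = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    # Uncertainty sampling: query the pool instance closest to p = 0.5.
    probs = model.predict_proba(X[pool])[:, 1]
    query = pool[int(np.argmin(np.abs(probs - 0.5)))]
    # Show the machine teacher a simple rationale for this instance.
    contributions = model.coef_[0] * X[query]
    print(f"step {step}: query instance {query}, "
          f"most influential feature index {int(np.argmax(np.abs(contributions)))}")
    labeled.append(query)                       # oracle label (here: the known y)
    pool.remove(query)
```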
Article
This article attempts to bridge the gap between widely discussed ethical principles of Human-centered AI (HCAI) and practical steps for effective governance. Since HCAI systems are developed and implemented in multiple organizational structures, I propose 15 recommendations at three levels of governance: team, organization, and industry. The recommendations are intended to increase the reliability, safety, and trustworthiness of HCAI systems: (1) reliable systems based on sound software engineering practices, (2) safety culture through business management strategies, and (3) trustworthy certification by independent oversight. Software engineering practices within teams include audit trails to enable analysis of failures, software engineering workflows, verification and validation testing, bias testing to enhance fairness, and explainable user interfaces. The safety culture within organizations comes from management strategies that include leadership commitment to safety, hiring and training oriented to safety, extensive reporting of failures and near misses, internal review boards for problems and future plans, and alignment with industry standard practices. The trustworthiness certification comes from industry-wide efforts that include government interventions and regulation, accounting firms conducting external audits, insurance companies compensating for failures, non-governmental and civil society organizations advancing design principles, and professional organizations and research institutes developing standards, policies, and novel ideas. The larger goal of effective governance is to limit the dangers and increase the benefits of HCAI to individuals, organizations, and society.
Article
Accuracy is an important concern for suppliers of artificial intelligence (AI) services, but considerations beyond accuracy, such as safety (which includes fairness and explainability), security, and provenance, are also critical elements to engender consumers' trust in a service. Many industries use transparent, standardized, but often not legally required documents called supplier's declarations of conformity (SDoCs) to describe the lineage of a product along with the safety and performance testing it has undergone. SDoCs may be considered multi-dimensional fact sheets that capture and quantify various aspects of the product and its development to make it worthy of consumers' trust. In this paper, inspired by this practice, we propose FactSheets to help increase trust in AI services. We envision such documents to contain purpose, performance, safety, security, and provenance information to be completed by AI service providers for examination by consumers. We suggest a comprehensive set of declaration items tailored to AI in the appendix of the paper.
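As a hedged sketch of what a FactSheet-style declaration might look like in code, the structure below covers the categories named in the abstract (purpose, performance, safety, security, provenance); the specific field values are invented for illustration and are not the paper's proposed declaration items.

```python
# Minimal sketch of a FactSheet-style supplier's declaration for an AI service.
# Field names follow the categories named in the abstract; entries are made up.
factsheet = {
    "purpose": "Rank loan applications for manual review; not an approval decision.",
    "performance": {"dataset": "internal holdout 2023-Q4", "AUC": 0.87},
    "safety": {
        "fairness": "Equalized odds gap < 0.03 across reported gender.",
        "explainability": "Per-decision feature attributions available to reviewers.",
    },
    "security": "Model served behind an authenticated API; inputs logged and encrypted.",
    "provenance": "Trained 2024-01 on anonymized application data, pipeline v3.",
}

for section, details in factsheet.items():
    print(f"{section.upper()}: {details}")
```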
Conference Paper
Work in social psychology on interpersonal interaction has demonstrated that people are more likely to comply to a request if they are presented with a justification - even if this justification conveys no information. In the light of the many calls for explaining reasoning of interactive intelligent systems to users, we investigate whether this effect holds true for human-computer interaction. Using a prototype of a nutrition recommender, we conducted a lab study (N=30) between three groups (no explanation, placebic explanation, and real explanation). Our results indicate that placebic explanations for algorithmic decision-making may indeed invoke perceived levels of trust similar to real explanations. We discuss how placebic explanations could be considered in future work.
Conference Paper
In this day and age of identity theft, are we likely to trust machines more than humans for handling our personal information? We answer this question by invoking the concept of "machine heuristic," which is a rule of thumb that machines are more secure and trustworthy than humans. In an experiment (N = 160) that involved making airline reservations, users were more likely to reveal their credit card information to a machine agent than a human agent. We demonstrate that cues on the interface trigger the machine heuristic by showing that those with higher cognitive accessibility of the heuristic (i.e., stronger prior belief in the rule of thumb) were more likely than those with lower accessibility to disclose to a machine, but they did not differ in their disclosure to a human. These findings have implications for design of interface cues conveying machine vs. human sources of our online interactions.
Article
How can we add the most important ingredient to our relationship with machine learning?
Conference Paper
Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type [15]) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related artificial intelligence technology, increasing transparency into how well artificial intelligence technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.
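A minimal, illustrative rendering of a model card as a plain data structure is sketched below, including disaggregated evaluation across groups as the abstract describes; the model name, groups, and numbers are made up, and the paper's model cards are documents rather than code.

```python
# Minimal sketch of a model card as a data structure, with disaggregated
# evaluation across groups. Names, groups, and numbers are illustrative only.
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    model_name: str
    intended_use: str
    out_of_scope_use: str
    evaluation_data: str
    metrics_by_group: dict = field(default_factory=dict)  # group -> {metric: value}

card = ModelCard(
    model_name="smile-detector-v1",
    intended_use="Detect smiling faces in consumer photo apps.",
    out_of_scope_use="Emotion recognition or hiring decisions.",
    evaluation_data="Public benchmark, balanced across the listed groups.",
    metrics_by_group={
        "all":        {"accuracy": 0.91},
        "age: 18-30": {"accuracy": 0.93},
        "age: 60+":   {"accuracy": 0.86},
    },
)
print(card.metrics_by_group["age: 60+"])
```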
Article
Older adults are a notable group among the exponentially growing population of online health information consumers. In order to better support older adults' health-related information seeking on the Internet, it is important to understand how they judge the credibility of such information when compared to younger users. We conducted two laboratory studies to explore how the credibility cues in message contents, website features, and user-generated comments differentially impact younger (19 to 26 years of age) and older adults' (58 to 80 years of age) credibility judgments. Results from the first experiment showed that older adults were less sensitive to the credibility cues in message contents and those in website features than younger adults. Verbal protocol analysis revealed that these differences could be caused by the higher tendency of older adults to passively accept web information, and their lack of deliberation on its quality and attention towards contextual web features (e.g., design look, source identity). In the second experiment, we studied how credibility cues from user reviews might differentially impact older and younger adults' credibility judgments of online health information. Results showed that consistent credibility cues in user reviews and message contents could facilitate older adults' credibility judgments. When the two were inconsistent, older adults, as compared to younger ones, were less swayed by highly appraising user reviews given to low credibility information. These results provided important implications for designing health information technologies that better fit the older population.
Article
To better support older adults' consumption of high quality health information on the Internet, it is important to understand how older adults make credibility judgments with online health information. For this purpose, we conducted two laboratory studies to explore how the credibility cues in message contents, website features, and user reviews could differentially impact younger and older adults' credibility judgments. Results from the first experiment showed that older adults, compared to younger ones, were less sensitive to the credibility cues in message contents, as well as those in the website features. Results from the second experiment showed that user reviews that were consistent with the credibility cues in message contents could reinforce older adults' credibility judgments. Older adults, compared to younger adults, seemed to be less swayed by user reviews that were inconsistent with the message contents. These results provided implications for designing health information websites that better support older adults' credibility judgments.
Article
James J. Gibson introduced the word "affordances" for the first time in this 1977 paper.
Article
Data from 574 participants were used to assess perceptions of message, site, and sponsor credibility across four genres of websites; to explore the extent and effects of verifying web-based information; and to measure the relative influence of sponsor familiarity and site attributes on perceived credibility. The results show that perceptions of credibility differed, such that news organization websites were rated highest and personal websites lowest, in terms of message, sponsor, and overall site credibility, with e-commerce and special interest sites rated between these, for the most part. The results also indicated that credibility assessments appear to be primarily due to website attributes (e.g. design features, depth of content, site complexity) rather than to familiarity with website sponsors. Finally, there was a negative relationship between self-reported and observed information verification behavior and a positive relationship between self-reported verification and internet/web experience. The findings are used to inform the theoretical development of perceived web credibility.
Article
In this study 2,684 people evaluated the credibility of two live Web sites on a similar topic (such as health sites). We gathered the comments people wrote about each site's credibility and analyzed the comments to find out what features of a Web site get noticed when people evaluate credibility. We found that the "design look" of the site was mentioned most frequently, being present in 46.1% of the comments. Next most common were comments about information structure and information focus. In this paper we share sample participant comments in the top 18 areas that people noticed when evaluating Web site credibility. We discuss reasons for the prominence of design look, point out how future studies can build on what we have learned in this new line of research, and outline six design implications for human-computer interaction professionals.
Article
Scholars in various disciplines have considered the causes, nature, and effects of trust. Prior approaches to studying trust are considered, including characteristics of the trustor, the trustee, and the role of risk. A definition of trust and a model of its antecedents and outcomes are presented, which integrate research from multiple disciplines and differentiate trust from similar constructs. Several research propositions based on the model are presented.
Article
In Exp I, 183 undergraduates read a persuasive message from a likable or unlikable communicator who presented 6 or 2 arguments on 1 of 2 topics. High involvement (HI) Ss anticipated discussing the message topic at a future experimental session, whereas low-involvement (LI) Ss anticipated discussing a different topic. For HI Ss, opinion change was significantly greater given 6 arguments but was unaffected by communicator likability. For LI Ss, opinion change was significantly greater given a likable communicator but was unaffected by the argument's manipulation. In Exp II with 80 similar Ss, HI Ss showed slightly greater opinion change when exposed to 5 arguments from an unlikable (vs 1 argument from a likable) communicator, whereas LI Ss exhibited significantly greater persuasion in response to 1 argument from a likable (vs 5 arguments from an unlikable) communicator. Findings support the idea that HI leads message recipients to employ a systematic information processing strategy in which message-based cognitions mediate persuasion, whereas LI leads recipients to use a heuristic processing strategy in which simple decision rules mediate persuasion. Support was also obtained for the hypothesis that content- vs source-mediated opinion change would result in greater persistence.