Article

Algorithm appreciation: People prefer algorithmic to human judgment

Authors: Jennifer M. Logg, Julia A. Minson, Don A. Moore

Abstract

Even though computational algorithms often outperform human judgment, received wisdom suggests that people may be skeptical of relying on them (Dawes, 1979). Counter to this notion, results from six experiments show that lay people adhere more to advice when they think it comes from an algorithm than from a person. People showed this effect, which we call algorithm appreciation, when making numeric estimates about a visual stimulus (Experiment 1A) and forecasts about the popularity of songs and romantic attraction (Experiments 1B and 1C). Yet, researchers predicted the opposite result (Experiment 1D). Algorithm appreciation persisted when advice appeared jointly or separately (Experiment 2). However, algorithm appreciation waned when people chose between an algorithm’s estimate and their own (versus an external advisor’s; Experiment 3) and when they had expertise in forecasting (Experiment 4). Paradoxically, experienced professionals, who make forecasts on a regular basis, relied less on algorithmic advice than lay people did, which hurt their accuracy. These results shed light on the important question of when people rely on algorithmic advice over advice from people and have implications for the use of “big data” and the algorithmic advice it generates.


... The influence of trust in advice is repeatedly highlighted and is one of the most examined factors in the advice-taking literature (Gino & Schweitzer, 2008; Jung & Seiter, 2021; Logg et al., 2019; Önkal et al., 2009; Waern & Ramberg, 1996). People have to decide whether they trust an algorithmic forecast and to what extent (Shin, 2022b). ...
... Trust in AI advice may also be influenced by age (Thurman et al., 2019), but the research so far is ambiguous. Logg et al. (2019), for example, could not detect an influence of age on algorithm appreciation or aversion, while Thurman et al. (2019) report that higher age can lead to less reliance on algorithmic advice. ...
... Since 2009, digitalization has continued to advance rapidly, and people who experienced the introduction of the first commercial smartphone in 2007 are now increasingly joining the workforce. This should likewise lead to an increase in the acceptance of digital sources (Logg et al., 2019). Building on these considerations, it seems reasonable to assume an interaction between the forecast source and the complexity of a decision task (Schrah et al., 2006), as interactions of different task features are suggested to be particularly worthwhile to investigate (Appelt et al., 2011). ...
... The influence that algorithmic entities have on people depends on how people perceive the algorithm, for example, whether they attribute trustworthiness to its recommendations [50,76]. The influence of algorithms on individuals tends to increase as the environment becomes more uncertain and decisions become more difficult [20]. ...
... The influence of algorithms on individuals tends to increase as the environment becomes more uncertain and decisions become more difficult [20]. With the public's growing awareness of developments in artificial intelligence, people may regard smart algorithms as a source of authority [2,60,76]. There is recent evidence that people may accept algorithmic advice even in simple cases when it is clearly wrong [74]. ...
... The normative influence explanation is supported by the finding that participants in our experiment attributed a high degree of expertise to the assistant (see Figure 8). The wider literature similarly suggests that people may regard AI systems as authoritative sources [2,60,76]. However, our experimental design presented the language model as a support tool and did not personify the assistant. ...
Preprint
If large language models like GPT-3 preferably produce a particular point of view, they may influence people's opinions on an unknown scale. This study investigates whether a language-model-powered writing assistant that generates some opinions more often than others impacts what users write - and what they think. In an online experiment, we asked participants (N=1,506) to write a post discussing whether social media is good for society. Treatment group participants used a language-model-powered writing assistant configured to argue that social media is good or bad for society. Participants then completed a social media attitude survey, and independent judges (N=500) evaluated the opinions expressed in their writing. Using the opinionated language model affected the opinions expressed in participants' writing and shifted their opinions in the subsequent attitude survey. We discuss the wider implications of our results and argue that the opinions built into AI language technologies need to be monitored and engineered more carefully.
... Previous research has primarily focused on one-shot decisions and the results have been fairly inconsistent. While in some judgment tasks, such as shopping online, humans happily follow an algorithm's advice (e.g., Zhou et al., 2021), in others they exhibit considerable reluctance to do so (Logg et al., 2019;Dietvorst et al., 2015). Indeed, recent reviews suggest that whether and when people are willing to take advice is complex, depending on the characteristics of the individual, the algorithm, and the task (Kawaguchi, 2021;Mahmud et al., 2022). ...
... Previous research on taking advice from algorithms has shown large differences in individuals' willingness to follow the recommendations of an algorithm (e.g., Kawaguchi, 2021;Logg et al., 2019;Mahmud et al., 2022). In general, humans seem to be willing to accept the advice of an algorithm if it is perceived to be of high quality and the expertise of the human is low (e.g., Logg et al., 2019;Saragih and Morrison, 2022;Tauchert and Mesbah, 2019;Van Swol et al., 2018). ...
... Previous research on taking advice from algorithms has shown large differences in individuals' willingness to follow the recommendations of an algorithm (e.g., Kawaguchi, 2021;Logg et al., 2019;Mahmud et al., 2022). In general, humans seem to be willing to accept the advice of an algorithm if it is perceived to be of high quality and the expertise of the human is low (e.g., Logg et al., 2019;Saragih and Morrison, 2022;Tauchert and Mesbah, 2019;Van Swol et al., 2018). However, according to what Madhavan and Wiegmann (2007) refer to as the perfect automation schema, humans expect algorithms to work perfectly, unlike other humans, and adherence to an algorithm's recommendations decreases rapidly once the recommendations are perceived as imperfect. ...
... Readers can refer to two recent survey papers [6,40] for a comprehensive literature review. In contrast, Logg et al. [53] found that users were influenced more by algorithmic decisions than by human decisions, and they first coined the notion of "Algorithm Appreciation" to describe this phenomenon. Others revealed similar findings in contexts where tasks are perceived as being more objective [7], machines share rationale with humans [75] or with prior exposure to similar systems [42]. ...
... Others revealed similar findings in contexts where tasks are perceived as being more objective [7], machines share rationale with humans [75] or with prior exposure to similar systems [42]. Besides contradicting attitudes towards the use of AI systems, prior work has shown how different human factors such as algorithmic literacy [72], expertise [53], and cognitive load [84] can affect users' final adoption of algorithmic advice. For example, users' algorithmic literacy [71][72][73] about fairness, accountability, transparency, and explainability is found to greatly affect their trust and privacy concern in adopting the advice from AI systems [70,74]. ...
... For example, users' algorithmic literacy [71][72][73] about fairness, accountability, transparency, and explainability is found to greatly affect their trust and privacy concern in adopting the advice from AI systems [70,74]. Logg et al. [53] found that experts may even show a greater tendency to discount algorithmic advice compared with laypeople. Furthermore, these factors can also affect the extent to which users show algorithm aversion or algorithm appreciation. ...
Preprint
Full-text available
The dazzling promises of AI systems to augment humans in various tasks hinge on whether humans can appropriately rely on them. Recent research has shown that appropriate reliance is the key to achieving complementary team performance in AI-assisted decision making. This paper addresses an under-explored problem of whether the Dunning-Kruger Effect (DKE) among people can hinder their appropriate reliance on AI systems. DKE is a metacognitive bias due to which less-competent individuals overestimate their own skill and performance. Through an empirical study (N = 249), we explored the impact of DKE on human reliance on an AI system, and whether such effects can be mitigated using a tutorial intervention that reveals the fallibility of AI advice, and exploiting logic units-based explanations to improve user understanding of AI advice. We found that participants who overestimate their performance tend to exhibit under-reliance on AI systems, which hinders optimal team performance. Logic units-based explanations did not help users in either improving the calibration of their competence or facilitating appropriate reliance. While the tutorial intervention was highly effective in helping users calibrate their self-assessment and facilitating appropriate reliance among participants with overestimated self-assessment, we found that it can potentially hurt the appropriate reliance of participants with underestimated self-assessment. Our work has broad implications on the design of methods to tackle user cognitive biases while facilitating appropriate reliance on AI systems. Our findings advance the current understanding of the role of self-assessment in shaping trust and reliance in human-AI decision making. This lays out promising future directions for relevant HCI research in this community.
... From the algorithm-human binary decision-making perspective, previous empirical research contains mixed results regarding the effects of ADM on citizens' perceived fairness and acceptance behaviors. Following the assumption of algorithm appreciation, some research has found that ADM has higher perceived fairness and acceptance than HDM (Araujo, Helberger, Kruikemeier, & de Vreese, 2020;Logg, Minson, & Moore, 2019;Schlicker et al., 2021), while others have found the opposite based on the assumption of algorithm aversion (Acikgoz, Davison, Compagnone, & Laske, 2020;Burton, Stein, & Jensen, 2020;Dietvorst & Bharti, 2020). Other research also found no significant difference between ADM and HDM regarding perceived fairness and acceptance (Ötting & Maier, 2018;Prahl & Van Swol, 2017). ...
... Meanwhile, mixed evidence has been found in the research comparing citizens' acceptance of ADM and HDM. For example, after conducting multiple experiments across a variety of estimation and forecasting tasks, Logg et al. (2019) demonstrated that people prefer advice from algorithms rather than humans. However, Dietvorst and Bharti (2020) found that people rejected even the best possible algorithms in investing and medical decision-making domains alongside other inherently uncertain domains. ...
... These contribute to improving the perceptions of fairness and acceptance toward the AI model. Rule-driven ADM has a positive effect on perceived fairness and acceptance, which also supports the previous findings on algorithm appreciation (Logg et al., 2019;Schlicker et al., 2021). By contrast, data-driven ADM represents a black-box approach that is opaque for decision making (Jordan & Mitchell, 2015), whereas rule-driven ADM is a comparatively white-box approach that embeds a certain level of transparency (Dijkstra, Liebrand, & Timminga, 1998). ...
Article
Various types of algorithms are being increasingly used to support public decision-making, yet we do not know how these different algorithm types affect citizens' attitudes and behaviors in specific public affairs. Drawing on public value theory, this study uses a survey experiment to compare the effects of rule-driven versus data-driven algorithmic decision-making (ADM) on citizens' perceived fairness and acceptance. This study also examines the moderating role of familiarity with public affairs and the mediating role of perceived fairness on the relationship. The findings show that rule-driven ADM is generally perceived as fairer and more acceptable than data-driven ADM. Low familiarity with public affairs strengthens citizens' perceived fairness and acceptance of rule-driven ADM more than data-driven ADM, and citizens' perceived fairness plays a significant mediating role in the effect of rule-driven ADM on citizens' acceptance behaviors. These findings further imply that citizens' perceived fairness and acceptance of ADM is strongly shaped by how they perceive familiarity of the decision-making context. In high-familiarity AI application scenarios, the realization of public values may ultimately not be what matters for ADM acceptance among citizens.
... On the one hand, studies across different domains have demonstrated a preference for human over AI advice (e.g., Larkin et al., 2021;Will et al., 2022), i.e., people reject algorithmic advice more often than human advice, even when the human is obviously inferior to the algorithm (algorithmic aversion; e.g., Dietvorst et al., 2015). On the other hand, it has also been shown that individuals are more willing to adhere to algorithmic than human advice (algorithmic appreciation; e.g., Logg et al., 2019). The underlying mechanisms leading to either algorithmic aversion or appreciation are not yet fully understood, with several advice characteristics proposed as potentially relevant, including the quality of advice. ...
... These findings are somewhat inconsistent with other research, indicating that people with less task expertise might show algorithmic appreciation (Logg et al., 2019), or that task experts show algorithmic aversion (Gaube et al., 2021). ...
Preprint
Full-text available
Despite the rise of decision support systems enabled by artificial intelligence (AI) in personnel selection, their impact on decision-making processes is largely unknown. Consequently, we conducted five experiments (N = 1,403) investigating how people interact with AI-generated advice in a personnel selection task. In all pre-registered experiments, we presented correct and incorrect advice. In Experiments 1a and 1b, we manipulated the source of the advice (human vs. AI). In Experiments 2a, 2b, and 2c, we further manipulated the type of explainability of AI advice (2a and 2b: heatmaps and 2c: charts) to test if explainable advice improves decision-making. The independent variables were regressed on task performance, perceived advice quality and confidence ratings. The results consistently showed that incorrect advice negatively impacted performance, as people failed to dismiss it (i.e., overreliance). Additionally, we found that the effects of source and explainability of advice on the dependent variables were limited.
... Few studies, if any, focus exclusively on the impact of knowledge or experience, but many compare the attitudes of those with different levels of knowledge or experience (Zhang 2021;Starke et al. 2022). Most studies find that those with more knowledge or experience are more supportive of AI/ML (e.g., Thurman et al. 2019;Logg, Minson, and Moore 2019;Zhang and Dafoe 2019;Araujo et al. 2020;Arnesen and Johannesson 2022;Zhang 2021;Starke et al. 2022). Some results suggest that those with the most knowledge are again less supportive, suggesting a U-shaped relationship (Zhang 2021;cf. ...
... However, there are many limitations to the current evidence. First, how "knowledge or experience" is defined varies: math or computer programming skills (Logg, Minson, and Moore 2019;Lee and Baykal 2017), education (Thurman et al. 2019;Zhang and Dafoe 2019;Wang, Harper, and Zhu 2020), occupation, and respondents' self-assessed knowledge (Arnesen and Johannesson 2022;Araujo et al. 2020) have all been used. These can represent vastly different aspects of having knowledge, induce different impacts on opinion, and also vary in how relevant they actually are for understanding public opinion in this case. ...
Conference Paper
Full-text available
We are on the verge of a revolution in public sector decision-making processes, where computers will take over many of the governance tasks previously assigned to human bureaucrats. Governance decisions based on algorithmic information processing are increasing in numbers and scope, contributing to decisions that impact individual citizens on topics such as giving defendants parole, reallocating refugees, and determining eligibility for welfare programs. Increased capacity to process relevant information enhances the potential for making more accurate and efficient judgments. Yet, we also run the risk of creating a black box society where citizens are being kept in the dark about the decision-making processes that affect their lives, potentially undermining the legitimacy of governmental institutions among the citizens they serve. While significant attention in recent years has been devoted to normative discussions on fairness, accountability, and transparency related to algorithmic decision making, little is still known about citizens' views on this issue. A pressing concern is that citizens in general have little knowledge about artificial intelligence. In an effort to empower citizens with knowledge and information, we conducted an online deliberative poll on the topic of using artificial intelligence to aid decision-makers in the Norwegian public sector. It is to our knowledge the first deliberative event on this topic. Analyzing the pretest/post-test control group deliberation experiment, we find that citizens who have participated in the deliberative event overall take a more positive position towards the use of artificial intelligence in the public sector. The supposed mechanism is that citizens are less fearful when they become more knowledgeable about the topic through having discussions with fellow citizens and acquiring balanced information about the topic.
... To assess laypeople's expectations of GPT performance and assess how knowledge about actual performance impacts advice utilization, we used the Judge-Advisor System (Sniezek and Buckley, 1995;van Swol and Sniezek, 2005). While the Judge-Advisor System has been used in the past to assess the utilization of algorithmic advice (Logg et al., 2019), it has not yet, to the best of our knowledge, been used to assess advice-taking from GPT. ...
... WOA typically ranges between 0, indicating that the judge has entirely disregarded the advice, and 1, indicating that the judge used precisely the answer indicated by the advisor. Values outside of this range were winsorized, in line with previous research on the utilization of advice from machines (Logg et al., 2019). ...
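The weight-of-advice (WOA) measure discussed in this snippet comes from the Judge-Advisor System paradigm. Below is a minimal sketch of how WOA is typically computed and winsorized to the [0, 1] range; the function name, variable names, and example numbers are illustrative and not taken from the cited studies.

```python
# Minimal sketch of the weight-of-advice (WOA) measure used in Judge-Advisor
# System studies: WOA = (final - initial) / (advice - initial).
# Names and example values are illustrative, not from the cited papers.

def weight_of_advice(initial, advice, final):
    """Return WOA winsorized to the [0, 1] range; None if WOA is undefined."""
    if advice == initial:
        return None  # undefined when the advice equals the judge's initial estimate
    woa = (final - initial) / (advice - initial)
    return min(max(woa, 0.0), 1.0)  # winsorize values outside [0, 1]

# A judge first estimates 100, receives advice of 150, and revises to 130:
print(weight_of_advice(100, 150, 130))  # 0.6 -> the judge moved 60% toward the advice
```

A WOA of 0.65, as reported in the study below, would under this convention mean that, on average, judges moved roughly two thirds of the way from their own initial estimate toward the advisor's answer.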
Article
Full-text available
We assess the ability of GPT–a large language model–to serve as a financial robo-advisor for the masses, by combining a financial literacy test and an advice-utilization task (the Judge-Advisor System). Davinci and ChatGPT (variants of GPT) score 58% and 67% on the financial literacy test, respectively, compared to a baseline of 31%. However, people overestimated GPT's performance (79.3%), and in a savings dilemma, they relied heavily on advice from GPT (WOA = 0.65). Lower subjective financial knowledge increased advice-taking. We discuss the risk of overreliance on current large language models and how their utility to laypeople may change.
... The first and most commonly assumed is an increase in users' trust [12]. Schmidt et al. [32] indicated that the general perception in the literature is that transparency increases trust in AI systems and that system owners can enhance such trust by providing users with simple and easy-to-understand explanations of the system's output [27,40]. In the context of AI systems, trust is more important than in other traditional engineering systems because AI systems are based on induction, meaning that they make generalizations by learning from specific instances rather than applying general concepts or laws to specific applications. ...
... They also found that people lose confidence in algorithms more quickly than in humans when witnessing mistakes. Contrary to these findings, Logg, Minson, and Moore [27] found that people actually appreciate predictions and recommendations coming from algorithms more than from humans, even when they do not understand how the algorithms make the recommendations [35]. Given the aforementioned issues and limitations of transparency, many scholars argue that achieving full transparency is undesirable, if not impossible [12,13,24,30]. ...
Article
Full-text available
Recently, artificial intelligence (AI) systems have been widely used in different contexts and professions. However, with these systems developing and becoming more complex, they have transformed into black boxes that are difficult to interpret and explain. Therefore, urged by the wide media coverage of negative incidents involving AI, many scholars and practitioners have called for AI systems to be transparent and explainable. In this study, we examine transparency in AI-augmented settings, such as in workplaces, and perform a novel analysis of the different jobs and tasks that can be augmented by AI. Using more than 1000 job descriptions and 20,000 tasks from the O*NET database, we analyze the level of transparency required to augment these tasks by AI. Our findings indicate that the transparency requirements differ depending on the augmentation score and perceived risk category of each task. Furthermore, they suggest that it is important to be pragmatic about transparency, and they support the growing viewpoint regarding the impracticality of the notion of full transparency.
... Compared with algorithm aversion, we know little about when and why decision makers appreciate algorithms (Logg et al., 2019). ...
... We expected that obtaining a license and reading up on the superiority of algorithms in the academic literature should increase knowledge and hence algorithm use. Moreover, we also explored the relation between experience and algorithm use, since some evidence suggests that experience is negatively related to algorithm use and prediction accuracy (Arkes et al., 1986;Logg et al., 2019), likely due to the overconfidence of experienced decision makers (Arkes et al., 1986). So, given that practitioner insights regarding algorithm aversion and appreciation are largely lacking, compared with insights from top-down research based on theory and experimental designs, we had the following research question: ...
Article
Full-text available
Although mechanical combination results in more valid human performance predictions and decisions than holistic combination, existing publications suggest that mechanical combination is rarely used in practice. Yet, these publications are either descriptions of anecdotal experiences or outdated surveys. Therefore, in several Western countries, we conducted two surveys (total N = 323) and two focus groups to investigate (1) how decision makers in psychological assessment and human resource practice combine information, (2) why they do (not) use mechanical combination, and (3) what may be needed to increase its use in practice. Many participants reported mostly using holistic combination, usually in teams. The most common reasons for not using mechanical combination were that algorithms are unavailable in practice and that stakeholders do not accept their use. Furthermore, decision makers do not quantify information, do not believe in research findings on evidence-based decision making, and think that combining holistic and mechanical combination results in the best decisions. The most important reason why mechanical combination is used was to increase predictive validity. To stimulate the use of mechanical combination in practice, decision makers indicated that they should receive more training on evidence-based decision making and that decision aids supporting the use of mechanical combination should be developed.
Combining information with an algorithm (mechanical combination) results in more valid human performance predictions and decisions than combining information in the mind (holistic combination). Yet, decision makers rarely use mechanical combination in practice. To improve predictive validity, transparency, and the opportunity for learning, an algorithm should be used. Reasons reported by decision makers on why they rarely use mechanical combination were that they (1) do not and cannot quantify all available information, (2) do not believe in research findings on evidence-based decision making, and (3) think that a combination of mechanical and holistic combination results in the best predictions and decisions. Furthermore, decision makers indicated that they fear negative stakeholder evaluations when they would use algorithms. Decision makers showed many misunderstandings regarding holistic and mechanical combination, even after reading an elaborate explanation of the two methods. To improve decision making in practice, decision makers should be (1) trained in evidence-based decision making, (2) supported in designing evidence-based algorithms, and (3) encouraged to consult the academic literature on evidence-based decision making more regularly.
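To make the contrast above concrete, mechanical combination means applying a fixed, pre-specified rule to quantified predictor scores. The sketch below uses a unit-weighted sum of standardized scores, in the spirit of improper linear models (Dawes, 1979, cited in the abstract above); the predictors, candidates, and numbers are hypothetical illustrations, not material from the cited surveys.

```python
# Illustrative sketch of mechanical combination: a fixed, unit-weighted sum of
# standardized predictor scores. Holistic combination, by contrast, would weigh
# the same information intuitively "in the head". All data are hypothetical.
from statistics import mean, stdev

candidates = {
    "A": {"cognitive_test": 82, "structured_interview": 7.0, "work_sample": 65},
    "B": {"cognitive_test": 74, "structured_interview": 8.5, "work_sample": 80},
    "C": {"cognitive_test": 90, "structured_interview": 6.0, "work_sample": 70},
}
predictors = ["cognitive_test", "structured_interview", "work_sample"]

def mechanical_score(name):
    """Standardize each predictor across candidates and sum with unit weights."""
    total = 0.0
    for p in predictors:
        values = [candidates[c][p] for c in candidates]
        z = (candidates[name][p] - mean(values)) / stdev(values)
        total += z  # unit weights: every predictor counts equally
    return total

ranking = sorted(candidates, key=mechanical_score, reverse=True)
print(ranking)  # candidates ordered by the mechanical rule
```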
... When comparing different forms of advice, studies have found both algorithmic aversion (i.e., preferring human advice compared to an algorithm; e.g., 17) and algorithmic appreciation (i.e., preferring advice from an algorithm compared to human advice; e.g., 18). These varying observations might be due to several factors; for instance, it has been shown that people with high task expertise are more inclined to dismiss or devalue task-related advice from an AI system than are people with low task expertise (7,18). ...
... These varying observations might be due to several factors; for instance, it has been shown that people with high task expertise are more inclined to dismiss or devalue task-related advice from an AI system than are people with low task expertise (7,18). It has also been shown that even when participants rate the quality of AI advice as lower than human advice, they still follow both sources of advice to the same degree (7,9). ...
Article
Full-text available
Artificial intelligence (AI)-generated clinical advice is becoming more prevalent in healthcare. However, the impact of AI-generated advice on physicians’ decision-making is underexplored. In this study, physicians received X-rays with correct diagnostic advice and were asked to make a diagnosis, rate the advice’s quality, and judge their own confidence. We manipulated whether the advice came with or without a visual annotation on the X-rays, and whether it was labeled as coming from an AI or a human radiologist. Overall, receiving annotated advice from an AI resulted in the highest diagnostic accuracy. Physicians rated the quality of AI advice higher than human advice. We did not find a strong effect of either manipulation on participants’ confidence. The magnitude of the effects varied between task experts and non-task experts, with the latter benefiting considerably from correct explainable AI advice. These findings raise important considerations for the deployment of diagnostic advice in healthcare.
... Nevertheless, researchers have documented a strong aversion to relying on algorithms (Dietvorst, Simmons, and Massey 2015;Burton, Stein, and Jensen 2020). There is increasing evidence, however, that people are more open to input from algorithmic decision aids (Logg, Minson, and Moore 2019). This shift towards a more trusting stance aligns with intuition: combining the proliferation of algorithmic decision aids and the insight that repeated exposure to stimuli can alter the ways in which people respond to them (Zajonc 1968;Bornstein and D'agostino 1992) creates a scenario in which it would be surprising if people did not exhibit more trust. ...
... Aversion to algorithmic choice aids is a well-documented phenomenon (Dietvorst, Simmons, and Massey 2015;Burton, Stein, and Jensen 2020). The proliferation of both passive and adaptive algorithms in every corner of life, however, is leaving people increasingly accepting of them (Logg, Minson, and Moore 2019). Historically, the experimental methods used to study human-algorithm teams relied on reducing the complexity of choices or studying them in situ. ...
Preprint
Full-text available
Behavioral scientists have classically documented aversion to algorithmic decision aids, from simple linear models to AI. Sentiment, however, is changing and possibly accelerating AI helper usage. AI assistance is, arguably, most valuable when humans must make complex choices. We argue that classic experimental methods used to study heuristics and biases are insufficient for studying complex choices made with AI helpers. We adapted an experimental paradigm designed for studying complex choices in such contexts. We show that framing and anchoring effects impact how people work with an AI helper and are predictive of choice outcomes. The evidence suggests that some participants, particularly those in a loss frame, put too much faith in the AI helper and experienced worse choice outcomes by doing so. The paradigm also generates computational modeling-friendly data allowing future studies of human-AI decision making.
... People readily rely on AI in objective and technical domains (e.g., numeric estimation, data analysis, and giving directions, Castelo et al., 2019;Logg et al., 2019). However, they are reluctant to use AI for subjective decisions, especially with ethical implications (e.g., parole sentences, trolley-type dilemmas, Bigman & Gray, 2018;Castelo et al., 2019;Laakasuo et al., 2021). ...
... The current study tests how advice type (honesty-vs dishonesty-promoting), advice source (AI vs Human), and information about advice source (transparency vs opacity) shape humans' (un)ethical behaviour. Prior work has examined people's stated preferences about hypothetical scenarios describing AI advice (Bigman & Gray, 2018;Castelo et al., 2019;Kim & Duhachek, 2020;Logg et al., 2019). We supplement such work by adopting a machine behaviour approach (Rahwan et al., 2019) and examine people's behavioural reactions to actual AI-generated output. ...
Preprint
Full-text available
Artificial Intelligence (AI) increasingly becomes an indispensable advisor. New ethical concerns arise if AI persuades people to behave dishonestly. In an experiment, we study how AI advice (generated by a Natural-Language-Processing algorithm) affects (dis)honesty, compare it to equivalent human advice, and test whether transparency about advice source matters. We find that dishonesty-promoting advice increases dishonesty, whereas honesty-promoting advice does not increase honesty. This is the case for both AI- and human advice. Algorithmic transparency, a commonly proposed policy to mitigate AI risks, does not affect behaviour. The findings mark the first steps towards managing AI advice responsibly.
... A small number of studies examining decisional biases when using AI have identified that physicians across expertise levels often fail to dismiss inaccurate advice generated by computerized systems (automation bias [41][42][43][44][45] ), as well as by humans, indicating that people are generally susceptible to suggestions. The tendency to follow even bad advice appears to be even more prevalent among participants with less domain expertise 46,47 . Receiving such advice from AI systems can raise further dangers by potentially engaging other cognitive biases such as anchoring effects and confirmatory bias, in which users are primed towards a certain perspective and disproportionately orient their attention to information that confirms it 48 . ...
... Receiving such advice from AI systems can raise further dangers by potentially engaging other cognitive biases such as anchoring effects and confirmatory bias, in which users are primed towards a certain perspective and disproportionately orient their attention to information that confirms it 48 . Other studies have found that participants are averse to following algorithmic advice when making final decisions (algorithmic bias) [49][50][51] , but this result is inconsistent with other studies, which show people sometimes prefer algorithmic to human judgment 46,47,52 . ...
Article
Full-text available
As the use of artificial intelligence and machine learning (AI/ML) continues to expand in healthcare, much attention has been given to mitigating bias in algorithms to ensure they are employed fairly and transparently. Less attention has fallen to addressing potential bias among AI/ML’s human users or factors that influence user reliance. We argue for a systematic approach to identifying the existence and impacts of user biases while using AI/ML tools and call for the development of embedded interface design features, drawing on insights from decision science and behavioral economics, to nudge users towards more critical and reflective decision making using AI/ML.
... Some studies described preferences for individual decision-making and individual personal data management (Dietvorst et al., 2015). Other studies show users' preferences for algorithmic recommendations and choices (Logg et al., 2019;Dijkstra, 1999). It seems that the use of these tactics depends primarily on the level of media literacy, that is, on users' technical and legal competencies. ...
Article
Full-text available
The paper describes how Czech news consumers have adapted to the personalization of news. Our study is the first that seeks to deepen existing knowledge about Czech internet users' reflections on the personalization of news content. The goal of the research was to understand how news personalization affects consumers' trust in this technology and to describe what needs and competencies news consumers perceive as necessary for protecting their personal data. The qualitative research is based on three focus groups with 27 participants: two of ordinary users from the Czech Republic and one of experts from the same country. Qualitative analysis shows that respondents perceive the phenomenon of news personalization as a loss of control over search mechanisms and information delivery. They expressed doubts about the credibility of personalization and described the respective algorithms as a "black box." The research describes defensive tactics that users apply to resist news personalization algorithms. The data show that consumers of personalized news are concerned about personal data management. https://doi.org/10.15847/obsOBS17120232063
... With that said, these findings contribute to the existing literature on algorithmic aversion (e.g., Dietvorst et al. 2015, Logg et al. 2019). Extending research that documents differences in algorithmic aversion across tasks (e.g., Longoni and Cian 2022, Mahmud et al. 2022), we present preliminary evidence regarding when consumers may be more likely to have aversive reactions to AI-generated content. ...
Preprint
The emergence of generative AI technologies, such as OpenAI's ChatGPT chatbot, has expanded the scope of tasks that AI tools can accomplish and enabled AI-generated creative content. In this study, we explore how disclosure regarding the use of AI in the creation of creative content affects human evaluation of such content. In a series of pre-registered experimental studies, we show that AI disclosure has no meaningful effect on evaluation either for creative or descriptive short stories, but that AI disclosure has a negative effect on evaluations for emotionally evocative poems written in the first person. We interpret this result to suggest that reactions to AI-generated content may be negative when the content is viewed as distinctly "human." We discuss the implications of this work and outline planned pathways of research to better understand whether and when AI disclosure may affect the evaluation of creative content.
... However, other scholars consider algorithm aversion to be present as soon as subjects exhibit a fundamental disapproval of an algorithm in spite of its possible superiority (cf. [22][23][24][25][26][27][28]). ...
Article
Full-text available
Algorithms already carry out many tasks more reliably than human experts. Nevertheless, some subjects have an aversion towards algorithms. In some decision-making situations an error can have serious consequences, in others not. In the context of a framing experiment, we examine the connection between the consequences of a decision-making situation and the frequency of algorithm aversion. This shows that the more serious the consequences of a decision are, the more frequently algorithm aversion occurs. Particularly in the case of very important decisions, algorithm aversion thus leads to a reduction of the probability of success. This can be described as the tragedy of algorithm aversion.
... A common theme across this research is that these workers are often "left to their own devices" (Ticona, 2022) to create their own ways to fight back (Cameron & Rahman, 2022;Maffie, 2022;Shapiro, 2018), find meaning (Petriglieri et al., 2019;Cameron, 2022;Connelly et al., 2021;Kameswaran et al., 2018), and navigate the economic and physical challenges inherent in such precarious work (Caza et al., 2021;Cameron et al., 2021;Ravenelle, 2019) in an increasingly opaque and unfamiliar terrain. A related line of research, at the intersection of psychology and technology, examines how workers think about algorithms when these are embedded in the everyday decision making of their work, such as in forecasting, predicting, and evaluating performance (Dietvorst et al., 2018, 2020; Jago, 2019; Logg et al., 2019; Raveendhran & Fast, 2021). Taken together, these multiple perspectives have provided a fuller account of how algorithmic management has influenced organizations and organizing. ...
Article
In recent years, the topic of algorithmic management has received increasing attention in information systems (IS) research and beyond. As both emerging platform businesses and established companies rely on artificial intelligence and sophisticated software to automate tasks previously done by managers, important organizational, social, and ethical questions emerge. However, a cross-disciplinary approach to algorithmic management that brings together IS perspectives with other (sub-)disciplines such as macro-and micro-organizational behavior, business ethics, and digital sociology is missing, despite its usefulness for IS research. This article engages in cross-disciplinary agenda setting through an in-depth report of a professional development workshop (PDW) entitled "Algorithmic Management: Toward a Cross-Disciplinary Research Agenda" delivered at the 2021 Academy of Management Annual Meeting. Three leading experts (Mareike Möhlmann, Lindsey Cameron, and Laura Lamers) on the topic provide their insights on the current status of algorithmic management research, how their work contributes to this area, where the field is heading in the future, and what important questions should be answered going forward. These accounts are followed up by insights from the breakout group discussions at the PDW that provided further input. Overall, the experts and workshop participants highlighted that future research should examine both the desirable and undesirable outcomes of algorithmic management and should not shy away from posing ethical and normative questions.
... The reasoning of state-of-the-art black-box models is inherently opaque and, thus, more challenging to comprehend than that of models used in traditional decision support systems-even for domain experts [6]. This makes users reluctant to accept the recommendations of those models-a situation compounded by a general mistrust towards algorithmic decision-making [7]. Researchers increasingly argue that understanding "why a model makes a certain prediction can be as crucial as the prediction's accuracy itself" [6, p. 4766]. ...
Article
Human-AI collaboration has become common, integrating highly complex AI systems into the workplace. Still, it is often ineffective; impaired perceptions—such as low trust or limited understanding—reduce compliance with recommendations provided by the AI system. Drawing from cognitive load theory, we examine two techniques of human-AI collaboration as potential remedies. In three experimental studies, we grant users decision control by empowering them to adjust the system's recommendations, and we offer explanations for the system's reasoning. We find decision control positively affects user perceptions of trust and understanding, and improves user compliance with system recommendations. Next, we isolate different effects of providing explanations that may help explain inconsistent findings in recent literature: while explanations help reenact the system's reasoning, they also increase task complexity. Further, the effectiveness of providing an explanation depends on the specific user's cognitive ability to handle complex tasks. In summary, our study shows that users benefit from enhanced decision control, while explanations—unless appropriately designed for the specific user—may even harm user perceptions and compliance. This work bears both theoretical and practical implications for the management of human-AI collaboration.
... We chose these modalities to achieve multiple aims: First, showing the whole "global" decision-making process without requiring participants to have technical or specific domain knowledge, following the setups of Wang & Yin [62] and Logg et al. [34]. Second, giving learners the opportunity to close gaps in their understanding by asking questions ("inquiring") and interacting verbally in the dialogue modality, which can lead to more effective understanding [45,47,54]. ...
Preprint
Full-text available
Ethical principles for algorithms are gaining importance as more and more stakeholders are affected by "high-risk" algorithmic decision-making (ADM) systems. Understanding how these systems work enables stakeholders to make informed decisions and to assess the systems' adherence to ethical values. Explanations are a promising way to create understanding, but current explainable artificial intelligence (XAI) research does not always consider theories on how understanding is formed and evaluated. In this work, we aim to contribute to a better understanding of understanding by conducting a qualitative task-based study with 30 participants, including "users" and "affected stakeholders". We use three explanation modalities (textual, dialogue, and interactive) to explain a "high-risk" ADM system to participants and analyse their responses both inductively and deductively, using the "six facets of understanding" framework by Wiggins & McTighe. Our findings indicate that the "six facets" are a fruitful approach to analysing participants' understanding, highlighting processes such as "empathising" and "self-reflecting" as important parts of understanding. We further introduce the "dialogue" modality as a valid alternative to increase participant engagement in ADM explanations. Our analysis further suggests that individuality in understanding affects participants' perceptions of algorithmic fairness, confirming the link between understanding and ADM assessment that previous studies have outlined. We posit that drawing from theories on learning and understanding like the "six facets" and leveraging explanation modalities can guide XAI research to better suit explanations to learning processes of individuals and consequently enable their assessment of ethical values of ADM systems.
... despite the available and ongoing research that promises to explain the interpretability and fairness of AI systems, there is still no satisfactory answer to the problem of opacity (Samek & Müller, 2019). In such cases it is the responsibility of a manager to maintain procedural justice within the organization while building AI capabilities (Logg et al., 2019). ...
Chapter
Full-text available
Artificial intelligence (AI) and human resources (HR) are often presented as a dichotomy by researchers. Rhetoric at present posits AI as a potential threat to HR. AI is seen as the ability of a machine to perform cognitive tasks, such as perceiving, reasoning, and problem solving, which we often associate with human minds. However, studies have also found that emotions, tacit knowledge, ethics, common sense, and pro-social behavior are fine human virtues that cannot be replaced by any intelligent machine. Thus, this study opines that organizations cannot achieve desired effectiveness by putting AI and people at the extreme ends of a continuum. The literature has many cases of specific human-AI proposals that widely address several organizational needs. The reverse case, to use specific organizational needs as a basis for formulating general human-AI proposal, is less common. This chapter is a nascent attempt to propose a conceptual framework of human capabilities (artistry/soft and scientific/hard) that are used to characterize the required human-AI intervention.
... distribution, so it is more difficult to estimate what proportion of teachers might hold extreme distrust of observational validity (although not narrowly tailored to this topic, research by Logg et al. (2019) suggests that individuals are generally receptive to algorithmic predictions in the modern era). Use-value will be affected by the nature and context of its use (see the above discussion of variation in teacher response in Cheresaro et al., 2016), but I have also argued that the level of agnosticism in an observational system is a "fundamental" feature and may tend to influence teacher response, in general, through specific mechanisms (choice, etc.). ...
Article
Full-text available
Many instructional observation systems are designed to provide rough, qualitative, highly-evaluative assessments on numerous core dimensions of teaching. Such systems achieve comprehensive overviews of teaching but are poorly suited to answering many discovery-oriented research questions. In contrast, fine-grained agnostic systems are needed to pose and answer causal questions about instruction, and to fully understand instructional variation and change. More speculatively, I argue that the agnostic quality of fine-grained systems may also be useful in promoting teacher learning. Agnostic systems offer choice, withhold judgement, make room for locally-compensatory practices, and promote a greater locus of control. Instructional observation systems that carefully and agnostically quantify instructional processes may best help teachers leverage their professional judgment and invigorate their professional practice.
... Individuals might be, by disposition, less/more inclined to trust a system, independent of its capabilities. Similarly, following insights on algorithm aversion [30] and algorithm appreciation [80], individuals' attitudes towards automation likely influence trust levels. From a system perspective, adding explanations or increasing system transparency might allow individuals to better gauge the true capabilities of a system. ...
Preprint
Full-text available
Trust has been recognized as a central variable to explain the resistance to using automated systems (under-trust) and the overreliance on automated systems (over-trust). To achieve appropriate reliance, users’ trust should be calibrated to reflect a system’s capabilities. Studies from various disciplines have examined different interventions to attain such trust calibration. Based on a literature body of 1000+ papers, we identified 96 relevant publications which aimed to calibrate users’ trust in automated systems. To provide an in-depth overview of the state-of-the-art, we reviewed and summarized measurements of the trust calibration, interventions, and results of these efforts. For the numerous promising calibration interventions, we extract common design choices and structure these into four dimensions of trust calibration interventions to guide future studies. Our findings indicate that the measurement of the trust calibration often limits the interpretation of the effects of different interventions. We suggest future directions for this problem.
... Some research indicates that there is no general aversion against algorithmic decision support but that its adoption substantially depends on the respective application context [16]. To resolve this inconsistency, it recently has been proposed that aversion against the usage of AI/algorithmic advice depends on the identity relevance of the respective context [17], with some studies showing even an appreciation of algorithmic advice in contexts with low identity relevance (e.g., certain estimation tasks; [18]). ...
Preprint
Full-text available
The release of ChatGPT has received significant attention from both scientists and the public. Despite its acknowledged capabilities and potential applications, the perception and reaction of individuals to content generated by ChatGPT is not well understood. To address this, we focus on two important application domains: recommendations for (i) societal challenges and (ii) personal challenges. In two preregistered experimental studies, we investigate how individuals evaluate the author's competence, the quality of the content, and their intention to share or follow the recommendations provided. Study 1 (N = 1,003) demonstrates that when individuals are (vs. are not) aware of the author's identity, they devalue the author's competence but not the content or the intention to share the recommendation for societal challenges provided by ChatGPT (vs. a human expert). Study 2 (N = 501) replicates the devaluation of ChatGPT's competence when its identity is (vs. is not) known in the context of self-relevant personal challenges. It further suggests that more negative evaluations of the author do not negatively affect the likelihood of following recommendations by ChatGPT. Overall, these results provide insights into the potential acceptance of ChatGPT and have implications for the literature on algorithm aversion.
... Hence, consumers need to be convinced that it is prudent to entrust these tasks to the increasingly autonomous and intelligent decision support systems as a means of realizing efficiency gains. According to Logg et al. (2019), people prefer automated processes (algorithms) over humans for tasks that entail objective information processing such as investment management because algorithms are perceived to perform better in financial decisionmaking (Harvey et al., 2017). Similar conclusions were reached by Ruhr (2020) with respect to robo-advisors, as their findings indicated that user intervention with the task automation may lead to performance deterioration. ...
Article
Robo-advisory services are gaining traction and could usher in the next cycle of disruptive change in the financial services industry. Yet, many are reticent to embrace this service innovation for their wealth management. This study probes this phenomenon by examining the interplay among technology characteristics (i.e. performance expectancy, effort expectancy, and perceived security), human-like characteristics (i.e. perceived autonomy, perceived intelligence, and perceived anthropomorphism), and consumer characteristics (i.e. financial literacy and affinity for technology interaction) to explain the acceptance of robo-advisory services. For this purpose, a fuzzy set qualitative comparative analysis and an artificial neural network analysis were performed to uncover the interdependency and complexity of the proposed variables, based on 375 responses collected through a large consumer panel survey in China. The findings revealed the presence of six configurations conducive for high acceptance of robo-advisory services, with perceived anthropomorphism and a combination of perceived effort expectancy and perceived security identified as core conditions. Moreover, according to the artificial neural network analysis, perceived intelligence is the most important determinant of robo-advisory service acceptance. This study challenges the conventional linear and symmetric perspective adopted in prior research.
... First, it points to an overestimation of what ML systems can achieve, indicating automation bias. 24 However, it is essential to recognise these systems' limitations. ML systems are statistical systems that make predictions based on correlations established at the population level. ...
Article
Machine learning (ML) systems play an increasingly relevant role in medicine and healthcare. As their applications move ever closer to patient care and cure in clinical settings, ethical concerns about the responsibility of their use come to the fore. I analyse an aspect of responsible ML use that bears not only an ethical but also a significant epistemic dimension. I focus on ML systems’ role in mediating patient–physician relations. I thereby consider how ML systems may silence patients’ voices and relativise the credibility of their opinions, which undermines their overall credibility status without valid moral and epistemic justification. More specifically, I argue that withholding credibility due to how ML systems operate can be particularly harmful to patients and, apart from adverse outcomes, qualifies as a form of testimonial injustice. I make my case for testimonial injustice in medical ML by considering ML systems currently used in the USA to predict patients’ risk of misusing opioids (automated Prediction Drug Monitoring Programmes, PDMPs for short). I argue that the locus of testimonial injustice in ML-mediated medical encounters is found in the fact that these systems are treated as markers of trustworthiness on which patients’ credibility is assessed. I further show how ML-based PDMPs exacerbate and further propagate social inequalities at the expense of vulnerable social groups.
... One of the risks of the fast integration of MT as an accessibility tool is that it can cement the impression that translation is a simple one-to-one replacement process that current algorithms can execute to high standards. In this sense, general users of MT can be overconfident technology users who display a high level of algorithm appreciation, the human preference for the advice provided by an algorithm (Logg et al., 2019), to the detriment of human involvement. This type of belief could further contribute to the invisibility of translation, undermine language learning initiatives, and accentuate precarious working conditions for professional translators. ...
Article
Full-text available
Over the 20 years of Revista Tradumàtica, we have seen machine translation (MT) become part of the everyday life of its regular users. Drawing on 17 responses, this article reflects on the use of MT among people who are not translation professionals. After considering the use of MT as a dictionary, for reading news, for accessing information, or for producing texts in situations that users perceive as low or high stakes, the article examines users' awareness of MT accuracy and the need to engage with the output in order to improve translation quality. The results also indicate that the use of MT not only affects production in the target language but also influences how the source texts intended for translation are written. Based on the responses, the article analyses the impact of MT with respect to accessibility and democratization, reviewing how MT and AI have the potential to support social change but also to deepen inequality, reproduce biases, and reduce the agency of human actors. Finally, the article calls for a critical and conscious application of MT to support human-computer interaction as a tool for the development of society. https://doi.org/10.15847/obsOBS17120232063
... This psychological response to a collaborative decision may prove to be even stronger, when people rely on recommender systems, than if they rely on human advisors. Human decision makers seem to adhere more to algorithmic advice (Logg et al., 2019) and are reluctant to acknowledge how strongly the machine's advice influences them (Krügel et al., 2022). Secondly, society might encounter barriers to view human deciders, who follow the suggestion of a recommender system, as responsible as it would view a non- or merely human-advised decider (Braun et al., 2021; Nissenbaum, 1996). ...
Article
Full-text available
Widespread adoption of artificial intelligence (AI) technologies is substantially affecting the human condition in ways that are not yet well understood. Negative unintended consequences abound including the perpetuation and exacerbation of societal inequalities and divisions via algorithmic decision making. We present six grand challenges for the scientific community to create AI technologies that are human-centered, that is, ethical, fair, and enhance the human condition. These grand challenges are the result of an international collaboration across academia, industry and government and represent the consensus views of a group of 26 experts in the field of human-centered artificial intelligence (HCAI). In essence, these challenges advocate for a human-centered approach to AI that (1) is centered in human well-being, (2) is designed responsibly, (3) respects privacy, (4) follows human-centered design principles, (5) is subject to appropriate governance and oversight, and (6) interacts with individuals while respecting humans' cognitive capacities. We hope that these challenges and their associated research directions serve as a call for action to conduct research and development in AI that serves as a force multiplier towards more fair, equitable and sustainable societies.
... Researchers are now interested in the reasons why actors use the AI-based decision support made available to them (Burton et al., 2020). More specifically, a strand of research has taken up the specific question of individuals' reluctance to use automated systems (Dietvorst et al., 2015) or, conversely, of their positive appreciation of this technology (Logg et al., 2019). In the field of recruitment, little is yet known about how ADSS influence recruiters. ...
Conference Paper
Full-text available
CV pre-screening assisted by decision support systems that incorporate artificial intelligence is currently developing rapidly in many organizations, raising technical, managerial, legal, and ethical questions. The aim of this paper is to better understand recruiters' reactions when they are offered algorithm-based recommendations during CV pre-screening. Two main attitudes have been identified in the literature on users' reactions to algorithm-based recommendations: algorithm aversion, which reflects a general distrust of algorithms and a preference for human recommendations; and automation bias, which corresponds to excessive confidence in the decisions or recommendations made by algorithmic decision support systems (ADSS). Building on findings in the field of automated decision support, we make the general hypothesis that recruiters trust human experts more than algorithmic decision support systems, because they distrust algorithms for subjective decisions such as recruitment. A CV-screening experiment was conducted with a sample of professionals (N = 1,100) who were asked to study a job offer and then evaluate two fictitious CVs in a 2×2 factorial design manipulating the type of recommendation (no recommendation/algorithmic recommendation/human expert recommendation) and the relevance of the recommendation (relevant vs. irrelevant recommendation). Our results confirm the general hypothesis of a preference for human recommendations: recruiters show a higher level of trust in recommendations from human experts than in algorithmic recommendations. However, we also found that the relevance of the recommendation had a differential and unexpected impact on decisions: when faced with an irrelevant algorithmic recommendation, recruiters favored the less relevant CV over the better one. This gap between attitudes and behaviors suggests a possible automation bias. Our results also show that specific personality traits (extraversion, neuroticism, and self-confidence) are associated with differential use of algorithmic recommendations. Finally, implications for research and HR policy are discussed.
Article
Full-text available
The role of artificial intelligence (AI) in organizations has fundamentally changed from performing routine tasks to supervising human employees. While prior studies focused on normative perceptions of such AI supervisors, employees’ behavioral reactions towards them remained largely unexplored. We draw from theories on AI aversion and appreciation to tackle the ambiguity within this field and investigate if and why employees might adhere to unethical instructions either from a human or an AI supervisor. In addition, we identify employee characteristics affecting this relationship. To inform this debate, we conducted four experiments (total N = 1701) and used two state-of-the-art machine learning algorithms (causal forest and transformers). We consistently find that employees adhere less to unethical instructions from an AI than a human supervisor. Further, individual characteristics such as the tendency to comply without dissent or age constitute important boundary conditions. In addition, Study 1 identified that the perceived mind of the supervisors serves as an explanatory mechanism. We generate further insights on this mediator via experimental manipulations in two pre-registered studies by manipulating mind between two AI (Study 2) and two human supervisors (Study 3). In (pre-registered) Study 4, we replicate the resistance to unethical instructions from AI supervisors in an incentivized experimental setting. Our research generates insights into the ‘black box’ of human behavior toward AI supervisors, particularly in the moral domain, and showcases how organizational researchers can use machine learning methods as powerful tools to complement experimental research for the generation of more fine-grained insights.
Article
eXplainable AI (XAI) involves two intertwined but separate challenges: the development of techniques to extract explanations from black-box AI models, and the way such explanations are presented to users, i.e., the explanation user interface. Despite its importance, the second aspect has received limited attention so far in the literature. Effective AI explanation interfaces are fundamental for allowing human decision-makers to take advantage and oversee high-risk AI systems effectively. Following an iterative design approach, we present the first cycle of prototyping-testing-redesigning of an explainable AI technique, and its explanation user interface for clinical Decision Support Systems (DSS). We first present an XAI technique that meets the technical requirements of the healthcare domain: sequential, ontology-linked patient data, and multi-label classification tasks. We demonstrate its applicability to explain a clinical DSS, and we design a first prototype of an explanation user interface. Next, we test such a prototype with healthcare providers and collect their feedback, with a two-fold outcome: first, we obtain evidence that explanations increase users’ trust in the XAI system, and second, we obtain useful insights on the perceived deficiencies of their interaction with the system, so that we can re-design a better, more human-centered explanation interface.
Article
Human-AI collaboration for decision-making strives to achieve team performance that exceeds the performance of humans or AI alone. However, many factors can impact success of Human-AI teams, including a user’s domain expertise, mental models of an AI system, trust in recommendations, and more. This paper reports on a study that examines users’ interactions with three simulated algorithmic models, all with equivalent accuracy rates but each tuned differently in terms of true positive and true negative rates. Our study examined user performance in a non-trivial blood vessel labeling task where participants indicated whether a given blood vessel was flowing or stalled. Users completed 150 trials across multiple stages, first without an AI and then with recommendations from an AI-Assistant. Although all users had prior experience with the task, their levels of proficiency varied widely. Our results demonstrated that while recommendations from an AI-Assistant can aid in users’ decision making, several underlying factors, including user base expertise and complementary human-AI tuning, significantly impact the overall team performance. First, users’ base performance matters, particularly in comparison to the performance level of the AI. Novice users improved, but not to the accuracy level of the AI. Highly proficient users were generally able to discern when they should follow the AI recommendation and typically maintained or improved their performance. Mid-performers, who had a similar level of accuracy to the AI, were most variable in terms of whether the AI recommendations helped or hurt their performance. Second, tuning an AI algorithm to complement users’ strengths and weaknesses also significantly impacted users’ performance. For example, users in our study were better at detecting flowing blood vessels, so when the AI was tuned to reduce false negatives (at the expense of increasing false positives), users were able to reject those recommendations more easily and improve in accuracy. Finally, users’ perception of the AI’s performance relative to their own performance had an impact on whether users’ accuracy improved when given recommendations from the AI. Overall, this work reveals important insights on the complex interplay of factors influencing Human-AI collaboration and provides recommendations on how to design and tune AI algorithms to complement users in decision-making tasks.
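To make the kind of tuning described above concrete, here is a minimal sketch (Python, with invented confusion-matrix counts rather than the study's data) of how two classifiers can share the same overall accuracy while trading false negatives for false positives:

```python
# Two hypothetical classifiers for a binary labeling task (e.g., flowing vs. stalled).
# Counts are invented for illustration; both reach the same overall accuracy,
# but one is tuned to reduce false negatives at the cost of more false positives.

def rates(tp, fn, tn, fp):
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "true_positive_rate": tp / (tp + fn),   # sensitivity
        "true_negative_rate": tn / (tn + fp),   # specificity
    }

model_a = rates(tp=40, fn=10, tn=40, fp=10)   # balanced errors
model_b = rates(tp=48, fn=2,  tn=32, fp=18)   # fewer false negatives

print(model_a)  # accuracy 0.80, TPR 0.80, TNR 0.80
print(model_b)  # accuracy 0.80, TPR 0.96, TNR 0.64
```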
Article
Consistency in medical decision-making is ideally expected. This includes consistency between different clinicians, so that the same patient will receive the same diagnosis regardless of the assessing clinician. It also encompasses reliability as an individual clinician, meaning that at any given time or in any context we apply the same process and principles to ensure the decisions we make do not deviate significantly from those of our peers or indeed from our own past decisions. However, consistency in decision-making can be challenged when working within a busy healthcare system. We discuss the concept of 'noise' and explore how it affects decision-making in acute presentations of transient neurology where doctors can differ in terms of their diagnostic decisions.
Article
Full-text available
Although future regulations increasingly advocate that AI applications must be interpretable by users, we know little about how such explainability can affect human information processing. By conducting two experimental studies, we help to fill this gap. We show that explanations pave the way for AI systems to reshape users' understanding of the world around them. Specifically, state-of-the-art explainability methods evoke mental model adjustments that are subject to confirmation bias, allowing misconceptions and mental errors to persist and even accumulate. Moreover, mental model adjustments create spillover effects that alter users' behavior in related but distinct domains where they do not have access to an AI system. These spillover effects of mental model adjustments risk manipulating user behavior, promoting discriminatory biases, and biasing decision making. The reported findings serve as a warning that the indiscriminate use of modern explainability methods as an isolated measure to address AI systems' black-box problems can lead to unintended, unforeseen problems because it creates a new channel through which AI systems can influence human behavior in various domains.
Article
This qualitative study aims to illuminate the profound impacts of algorithmic processes on news audiences. The CAPI with 101 participants nationwide examines algorithmic news consumption from several interrelated perspectives, including personalization, news appreciation, echo chambers, algorithmic literacy, and news literacy. The study finds that users of AI-powered news apps often trust and even prefer algorithmic to human judgement but are concerned about missing out on important information and challenging viewpoints, as well as about their privacy. Many participants, while ambivalent about living with algorithm-based news recommendations, commonly exhibit a sense of comfort, appreciation, and gratification with personalized news feeds, calling the use of news apps a pleasant and satisfactory experience. While participants in general report being less active in searching information after using news apps, many participants believe that they are, ironically, more informed and knowledgeable now that they have the news apps.
Chapter
Corporate social responsibility (CSR) has presented a new set of challenges to managerial decision-makers and investors alike in the emerging era of artificial intelligence (AI) and big-data analytics. Access to, and statistical manipulation of, personally sensitive information and transaction datasets create an opportunity to revisit CSR and the social contract between society and corporate leaders, including the relationship of workers earning a living wage to automation. The initial cost of developing and deploying the appropriate hardware and software for automation may not justify labor replacement if it does not pass a cost-benefit test; operational benefits alone (e.g., higher output, improved quality, fewer errors, reduced administration and monitoring) may matter more than simply reducing labor costs. A discussion of the characteristics of automation in light of AI and big-data analytics in managerial decision making, and of its relationship to the tenets of a CSR that puts people before profits, highlights these opposing forces.
Article
Full-text available
The growing uses of algorithm-based decision-making in human resources management have drawn considerable attention from different stakeholders. While prior literature mainly focused on stakeholders directly related to HR decisions (e.g., employees), this paper pertained to a third-party observer perspective and investigated how consumers would respond to companies’ adoption of algorithm-based HR decision-making. Through five experimental studies, we showed that the adoption of algorithm-based (vs. human-based) HR decision-making could induce consumers’ unfavorable ethicality inferences of the company (study 1); because implementing a calculative and data-driven approach (i.e. algorithm-based) to make employee-related decisions violates the deontological principles of respectful employee treatment (study 2). However, this effect was attenuated when consumers had high (vs. low) power distance beliefs (study 3); the algorithm served as assistance (vs. replacement) for human decisions (study 4); or the adoption was framed as employee-oriented (vs. company-oriented) motivated (study 5). Our findings suggested that consumers are aversive to algorithm-based HR decision-making because it is deontologically problematic regardless of its decision quality (i.e. accuracy). This paper contributes to the extant understanding of stakeholders’ responses to algorithm-based HR decision-making and consumers’ attitudes toward algorithm users.
Preprint
Full-text available
Robots are transforming the nature of human work. Although human–robot collaborations can create new jobs and increase productivity, pundits often warn about how robots might replace humans at work and create mass unemployment. Despite these warnings, relatively little research has directly assessed how laypeople react to robots in the workplace. Drawing from cognitive appraisal theory of stress, we suggest that employees exposed to robots (either physically or psychologically) would report greater job insecurity. Six studies—including two pilot studies, an archival study across 185 U.S. metropolitan areas (Study 1), a preregistered experiment conducted in Singapore (Study 2), an experience-sampling study among engineers conducted in India (Study 3), and an online experiment (Study 4)—find that increased exposure to robots leads to increased job insecurity. Study 3 also reveals that this robot-related job insecurity is in turn positively associated with burnout and workplace incivility. Study 4 reveals that self-affirmation is a psychological intervention that might buffer the negative effects of robot-related job insecurity. Our findings hold across different cultures and industries, including industries not threatened by robots.
Article
Problem definition: We study the adherence to the recommendations of a decision support system (DSS) for clearance markdowns at Zara, the Spanish fast fashion retailer. Our focus is on behavioral drivers of the decision to deviate from the recommendation, and the magnitude of the deviation when it occurs. Academic/practical relevance: A major obstacle in the implementation of prescriptive analytics is users’ lack of trust in the tool, which leads to status quo bias. Understanding the behavioral aspects of managers’ usage of these tools, as well as the specific biases that affect managers in revenue management contexts, is paramount for a successful rollout. Methodology: We use data collected by Zara during seven clearance sales campaigns to analyze the drivers of managers’ adherence to the DSS. Results: Adherence to the DSS’s recommendations was higher, and deviations were smaller, when the products were predicted to run out before the end of the campaign, consistent with the fact that inventory and sales were more salient to managers than revenue. When there was a higher number of prices to set, managers of Zara’s own stores were more likely to deviate from the DSS’s recommendations, whereas franchise managers did the opposite and showed a weak tendency to adhere more often instead. Two interventions aimed at shifting salience from inventory and sales to revenue helped increase adherence and overall revenue. Managerial implications: Our findings provide insights on how to increase voluntary adherence that can be used in any context in which a company wants an analytical tool to be adopted organically by its users. We also shed light on two common biases that can affect managers in a revenue management context, namely salience of inventory and sales, and cognitive workload. Supplemental Material: The e-companion is available at https://doi.org/10.1287/msom.2022.1166 .
Article
Considering the growing acceptance of humanoid robots in the service industry, this study aimed to examine their negative impact on service evaluation, as well as the underlying mechanism of perceived effort and the moderating role of consumer mindset. Three experiments that used different service scenarios revealed that humanoid service robots negatively affected service evaluation compared to human employees, and this effect was mediated by decreased perceived effort. Furthermore, this negative impact was attenuated when consumers had a concrete rather than an abstract mindset. This work contributes to both consumer service and robot literature by elaborating on the possible adverse influence of replacing human employees with humanoid service robots. It also offers managerial implications for how and when to adopt a robot service in this machine age.
Chapter
Engineering design relies on the human ability to make complex decisions, but design activities are increasingly supported by computation. Although computation can help humans make decisions, over- or under-reliance on imperfect models can prevent successful outcomes. To investigate the effects of assistance from a computational agent on decision making, a behavioral experiment was conducted (N = 33). Participants chose between pairs of aircraft brackets while optimizing the design across competing objectives (mass and displacement). Participants received suggestions from a simulated model which suggested correct (i.e., better) and incorrect (i.e., worse) designs based on the global design space. In an uncertain case, both options were approximately equivalent but differed along the objectives. The results indicate that designers do not follow suggestions when the relative design performances are notably different, often underutilizing them to their detriment. However, they follow the suggestions more than expected when the better design choice is less clear.
Chapter
This paper investigates team psychological safety (N = 34 teams) in a synchronous online engineering design class spanning 4 weeks. While work in this field has suggested that psychological safety in virtual teams can facilitate knowledge-sharing, trust among teams, and overall performance, there have been limited investigations of the longitudinal trajectory of psychological safety, when the construct stabilizes in a virtual environment, and what factors impact the building of psychological safety in virtual teams.
Chapter
Artificial intelligence has long been ubiquitous at all levels of our society. AI-based algorithms get more and more entrusted with the task of deciding on the allocation of resources and access to social participation. Recent experience shows a high number of cases in which people experience systematic discrimination due to AI-based decision-making systems. This article takes up this observation and addresses questions about the significance of social science concepts of participation and discrimination in the context of algorithms, about the causes of and proposed solutions to algorithmic discrimination, and about the consequences of technological transformation processes in the age of artificial intelligence for social science discourses on social participation and justice.
Article
Full-text available
Although evidence-based algorithms consistently outperform human forecasters, people often fail to use them after learning that they are imperfect, a phenomenon known as algorithm aversion. In this paper, we present three studies investigating how to reduce algorithm aversion. In incentivized forecasting tasks, participants chose between using their own forecasts or those of an algorithm that was built by experts. Participants were considerably more likely to choose to use an imperfect algorithm when they could modify its forecasts, and they performed better as a result. Notably, the preference for modifiable algorithms held even when participants were severely restricted in the modifications they could make (Studies 1–3). In fact, our results suggest that participants’ preference for modifiable algorithms was indicative of a desire for some control over the forecasting outcome, and not for a desire for greater control over the forecasting outcome, as participants’ preference for modifiable algorithms was relatively insensitive to the magnitude of the modifications they were able to make (Study 2). Additionally, we found that giving participants the freedom to modify an imperfect algorithm made them feel more satisfied with the forecasting process, more likely to believe that the algorithm was superior, and more likely to choose to use an algorithm to make subsequent forecasts (Study 3). This research suggests that one can reduce algorithm aversion by giving people some control—even a slight amount—over an imperfect algorithm’s forecast. Data, as supplemental material, are available at https://doi.org/10.1287/mnsc.2016.2643 . This paper was accepted by Yuval Rottenstreich, judgment and decision making.
Article
Full-text available
Rationally, people should want to receive information that is costless and relevant for a decision. But people sometimes choose to remain ignorant. The current paper identifies intuitive-deliberative conflict as a driver of information avoidance. Moreover, we examine whether people avoid information not only to protect their feelings or experiences, but also to protect the decision itself. We predict that people avoid information that could encourage a more thoughtful, deliberative decision to make it easier to enact their intuitive preference. In Studies 1 and 2, people avoid learning the calories in a tempting dessert and compensation for a boring task to protect their preferences to eat the dessert and work on a more enjoyable task. The same people who want to avoid the information, however, use it when it is provided. In Studies 3-5, people decide whether to learn how much money they could earn by accepting an intuitively unappealing bet (that a sympathetic student performs poorly or that a hurricane hits a third-world country). Although intuitively unappealing, the bets are financially rational because they only have financial upside. If people avoid information in part to protect their intuitive preference, then avoidance should be greater when an intuitive preference is especially strong and when information could influence the decision. As predicted, avoidance is driven by the strength of the intuitive preference (Study 3) and, ironically, information avoidance is greater before a decision is made, when the information is decision relevant, than after, when the information is irrelevant for the decision (Studies 4 and 5).
Chapter
Full-text available
Overprecision in judgment is both the most durable and the least understood form of overconfidence. This chapter reviews the evidence on overprecision, highlighting its consequences in everyday life and for our understanding the psychology of uncertainty. There are some interesting explanations for overprecision, but none fully accounts for the diversity of the evidence. Overprecision remains an important phenomenon in search of a full explanation.
Article
Full-text available
The behaviour of poker players and sports gamblers has been shown to change after winning or losing a significant amount of money on a single hand. In this paper, we explore whether there are changes in experts’ behaviour when performing judgmental adjustments to statistical forecasts and, in particular, examine the impact of ‘big losses’. We define a big loss as a judgmental adjustment that significantly decreases the forecasting accuracy compared to the baseline statistical forecast. In essence, big losses are directly linked with wrong direction or highly overshooting judgmental overrides. Using relevant behavioural theories, we empirically examine the effect of such big losses on subsequent judgmental adjustments exploiting a large multinational data set containing statistical forecasts of demand for pharmaceutical products, expert adjustments and actual sales. We then discuss the implications of our findings for the effective design of forecasting support systems, focusing on the aspects of guidance and restrictiveness.
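As a rough illustration of the 'big loss' notion above, the sketch below uses invented demand figures and absolute error as a stand-in accuracy measure; the 50% threshold is an assumption for illustration, not the authors' exact operationalization:

```python
# Hypothetical statistical forecast, expert-adjusted forecast, and actual sales.
# Absolute error stands in for whatever accuracy metric a forecasting support
# system actually uses; the threshold below is an illustrative choice.

def is_big_loss(statistical, adjusted, actual, threshold=0.5):
    """Flag adjustments that worsen accuracy by more than `threshold` (here 50%)."""
    baseline_error = abs(statistical - actual)
    adjusted_error = abs(adjusted - actual)
    if baseline_error == 0:
        return adjusted_error > 0
    return (adjusted_error - baseline_error) / baseline_error > threshold

print(is_big_loss(statistical=100, adjusted=140, actual=105))  # True: large overshoot
print(is_big_loss(statistical=100, adjusted=104, actual=105))  # False: helpful nudge
```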
Article
Full-text available
Research shows that evidence-based algorithms more accurately predict the future than do human forecasters. Yet when forecasters are deciding whether to use a human forecaster or a statistical algorithm, they often choose the human forecaster. This phenomenon, which we call algorithm aversion, is costly, and it is important to understand its causes. We show that people are especially averse to algorithmic forecasters after seeing them perform, even when they see them outperform a human forecaster. This is because people more quickly lose confidence in algorithmic than human forecasters after seeing them make the same mistake. In 5 studies, participants either saw an algorithm make forecasts, a human make forecasts, both, or neither. They then decided whether to tie their incentives to the future predictions of the algorithm or the human. Participants who saw the algorithm perform were less confident in it, and less likely to choose it over an inferior human forecaster. This was true even among those who saw the algorithm outperform the human. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
Article
Full-text available
http://www.ijis.net/ijis9_1/ijis9_1_editorial_pre.html
Article
Full-text available
Five university-based research groups competed to recruit forecasters, elicit their predictions, and aggregate those predictions to assign the most accurate probabilities to events in a 2-year geopolitical forecasting tournament. Our group tested and found support for three psychological drivers of accuracy: training, teaming, and tracking. Probability training corrected cognitive biases, encouraged forecasters to use reference classes, and provided forecasters with heuristics, such as averaging when multiple estimates were available. Teaming allowed forecasters to share information and discuss the rationales behind their beliefs. Tracking placed the highest performers (top 2% from Year 1) in elite teams that worked together. Results showed that probability training, team collaboration, and tracking improved both calibration and resolution. Forecasting is often viewed as a statistical problem, but forecasts can be improved with behavioral interventions. Training, teaming, and tracking are psychological interventions that dramatically increased the accuracy of forecasts. Statistical algorithms (reported elsewhere) improved the accuracy of the aggregation. Putting both statistics and psychology to work produced the best forecasts 2 years in a row.
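Tournaments of this kind typically score probability judgments with a proper scoring rule such as the Brier score, whose standard decomposition yields the calibration and resolution terms mentioned above; the sketch below uses invented forecasts and outcomes purely for illustration:

```python
# Minimal Brier-score sketch for binary events (lower is better).
# Averaging two forecasters' probabilities before scoring mirrors the
# "averaging when multiple estimates were available" heuristic.

def brier(probabilities, outcomes):
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

forecaster_1 = [0.9, 0.2, 0.7]
forecaster_2 = [0.6, 0.4, 0.9]
outcomes     = [1,   0,   1]          # 1 = event occurred

averaged = [(a + b) / 2 for a, b in zip(forecaster_1, forecaster_2)]

print(brier(forecaster_1, outcomes))  # ~0.047
print(brier(forecaster_2, outcomes))  # ~0.110
print(brier(averaged, outcomes))      # ~0.064, never worse than the mean of the two scores
```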
Article
Full-text available
Sophisticated technology is increasingly replacing human minds to perform complicated tasks in domains ranging from medicine to education to transportation. We investigated an important theoretical determinant of people's willingness to trust such technology to perform competently—the extent to which a nonhuman agent is anthropomorphized with a humanlike mind—in a domain of practical importance, autonomous driving. Participants using a driving simulator drove either a normal car, an autonomous vehicle able to control steering and speed, or a comparable autonomous vehicle augmented with additional anthropomorphic features—name, gender, and voice. Behavioral, physiological, and self-report measures revealed that participants trusted that the vehicle would perform more competently as it acquired more anthropomorphic features. Technology appears better able to perform its intended design when it seems to have a humanlike mind. These results suggest meaningful consequences of humanizing technology, and also offer insights into the inverse process of objectifying humans.
Article
Full-text available
When perceiving, explaining, or criticizing human behavior, people distinguish between intentional and unintentional actions. To do so, they rely on a shared folk concept of intentionality. In contrast to past speculative models, this article provides an empirically based model of this concept. Study 1 demonstrates that people agree substantially in their judgments of intentionality, suggesting a shared underlying concept. Study 2 reveals that when asked to define directly the term "intentional," people mention four components of intentionality: desire, belief, intention, and awareness. Study 3 confirms the importance of a fifth component, namely skill. In light of these findings, the authors propose a model of the folk concept of intentionality and provide a further test in Study 4. The discussion compares the proposed model to past ones and examines its implications for social perception, attribution, and cognitive development.
Article
Full-text available
This study was based on the proposition that people attribute a disposition to an actor by evaluating its consistency with other information about the actor or the situation. This strategy was assumed to be accompanied by a cognitive set or bias to view ambiguous information as consistent with the hypothesized disposition. 256 undergraduates were told that a student had chosen or had been assigned to write a proabortion or antiabortion paper, and they were or were not given an ambiguous description of the author's personality. In support of predictions, under choice conditions attitudes were always attributed in accordance with the paper's position, but under assignment conditions such attributions occurred only when Ss received the ambiguous personality description. (23 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Arguably, all judgments and decisions are made in 1 (or some combination) of 2 basic evaluation modes: joint evaluation mode (JE), in which multiple options are presented simultaneously and evaluated comparatively, or separate evaluation mode (SE), in which options are presented in isolation and evaluated separately. This article reviews recent literature showing that people evaluate options differently and exhibit reversals of preferences for options between JE and SE. The authors propose an explanation for the JE/SE reversal based on a principle called the evaluability hypothesis. The hypothesis posits that it is more difficult to evaluate the desirability of values on some attributes than on others and that, compared with easy-to-evaluate attributes, difficult-to-evaluate attributes have a greater impact in JE than in SE.
Article
Full-text available
Numerous studies have demonstrated that theoretically equivalent measures of preference, such as choices and prices, can lead to systematically different preference orderings, known as preference reversals. Two major causes of preference reversals are the compatibility effect and the prominence effect. The present studies demonstrate that the combined effects of prominence and compatibility lead to predictable preference reversals in settings where improvements in air quality are compared with improvements in consumer commodities by two methods: willingness to pay for each improvement and choice (For which of the two improvements would you pay more? Which improvement is more valuable to you?). Willingness to pay leads to relatively greater preference for improved commodities; choice leads to relatively greater preference for improved air quality. These results extend the domain of preference reversals and pose a challenge to traditional theories of preference. At the applied level, these findings indicate the need to develop new methods for valuing environmental resources.
Article
Full-text available
Prior investigators have asserted that certain group characteristics cause group members to disregard outside information and that this behavior leads to diminished performance. We demonstrate that the very process of making a judgment collaboratively rather than individually also contributes to such myopic underweighting of external viewpoints. Dyad members exposed to numerical judgments made by peers gave significantly less weight to those judgments than did individuals working alone. This difference in willingness to use peer input was mediated by the greater confidence that the dyad members reported in the accuracy of their own estimates. Furthermore, dyads were no better at judging the relative accuracy of their own estimates and the advisor's estimates than individuals were. Our analyses demonstrate that, relative to individuals, dyads suffered an accuracy cost. Specifically, if dyad members had given as much weight to peer input as individuals working alone did, then their revised estimates would have been significantly more accurate.
Article
Full-text available
Over the last few years, microblogging has gained prominence as a form of personal broadcasting media where information and opinion are mixed together without an established order, usually tightly linked with current reality. Location awareness and promptness provide researchers using the Internet with the opportunity to create "psychological landscapes"--that is, to detect differences and changes in voiced (twittered) emotions, cognitions, and behaviors. In our article, we present iScience Maps, a free Web service for researchers, available from http://maps.iscience.deusto.es/ and http://tweetminer.eu/ . Technologically, the service is based on Twitter's streaming and search application programming interfaces (APIs), accessed through several PHP libraries, and a JavaScript frontend. This service allows researchers to assess via Twitter the effect of specific events in different places as they are happening and to make comparisons between cities, regions, or countries regarding psychological states and their evolution in the course of an event. In a step-by-step example, it is shown how to replicate a study on affective and personality characteristics inferred from first names (Mehrabian & Piercy, Personality and Social Psychology Bulletin, 19, 755-758, 1993) by mining Twitter data with iScience Maps. Results from the original study are replicated in both world regions we tested (the western U.S. and the U.K./Ireland); we also discover the base rate of names to be a confound that needs to be controlled for in future research.
Article
Full-text available
A basic issue in social influence is how best to change one's judgment in response to learning the opinions of others. This article examines the strategies that people use to revise their quantitative estimates on the basis of the estimates of another person. The authors note that people tend to use 2 basic strategies when revising estimates: choosing between the 2 estimates and averaging them. The authors developed the probability, accuracy, redundancy (PAR) model to examine the relative effectiveness of these two strategies across judgment environments. A surprising result was that averaging was the more effective strategy across a wide range of commonly encountered environments. The authors observed that despite this finding, people tend to favor the choosing strategy. Most participants in these studies would have achieved greater accuracy had they always averaged. The identification of intuitive strategies, along with a formal analysis of when they are accurate, provides a basis for examining how effectively people use the judgments of others. Although a portfolio of strategies that includes averaging and choosing can be highly effective, the authors argue that people are not generally well adapted to the environment in terms of strategy selection.
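A minimal simulation, assuming two equally accurate judges with independent Gaussian errors (an illustrative assumption, not the PAR model itself), shows why averaging tends to beat choosing one of the two estimates:

```python
import random

# Two judges estimate the same quantity with independent, equally noisy errors.
# "Choosing" picks one judge at random; "averaging" takes the mean of both.
random.seed(1)
truth, noise_sd, trials = 100.0, 10.0, 100_000

choose_err = avg_err = 0.0
for _ in range(trials):
    a = truth + random.gauss(0, noise_sd)
    b = truth + random.gauss(0, noise_sd)
    choose_err += abs(random.choice([a, b]) - truth)
    avg_err += abs((a + b) / 2 - truth)

print("mean error, choosing: ", choose_err / trials)  # ~8.0
print("mean error, averaging:", avg_err / trials)     # ~5.6, smaller by roughly a factor of sqrt(2)
```

When the judges' errors are correlated or one judge is far more accurate than the other, the advantage of averaging shrinks, which is the kind of environmental dependence the PAR model is meant to capture.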
Article
Full-text available
The process of making judgments and decisions requires a method for combining data. To compare the accuracy of clinical and mechanical (formal, statistical) data-combination techniques, we performed a meta-analysis on studies of human health and behavior. On average, mechanical-prediction techniques were about 10% more accurate than clinical predictions. Depending on the specific analysis, mechanical prediction substantially outperformed clinical prediction in 33%-47% of studies examined. Although clinical predictions were often as accurate as mechanical predictions, in only a few studies (6%-16%) were they substantially more accurate. Superiority for mechanical-prediction techniques was consistent, regardless of the judgment task, type of judges, judges' amounts of experience, or the types of data being combined. Clinical predictions performed relatively less well when predictors included clinical interview data. These data indicate that mechanical predictions of human behaviors are equal or superior to clinical prediction methods for a wide range of circumstances.
Article
Full-text available
Although increases in the use of automation have occurred across society, research has found that human operators often underutilize (disuse) and overly rely on (misuse) automated aids (R. Parasuraman & V. Riley, 1997). Nearly 275 Cameron University students participated in 1 of 3 experiments performed to examine the effects of perceived utility (M. T. Dzindolet, H. P. Beck, L. G. Pierce, & L. A. Dawe, 2001) on automation use in a visual detection task and to compare reliance on automated aids with reliance on humans. Results revealed a bias for human operators to rely on themselves. Although self-report data indicate a bias toward automated aids over human aids, performance data revealed that participants were more likely to disuse automated aids than to disuse human aids. This discrepancy was accounted for by assuming human operators have a "perfect automation" schema. Actual or potential applications of this research include the design of future automated decision aids and training procedures for operators relying on such aids.
Article
The authors propose that consumers’ preferences are systematically affected by whether they make direct comparisons between brands (e.g., a choice task) or evaluate brands individually (e.g., purchase likelihood ratings). In particular, “comparable” attributes, which produce precise and easy-to-compute comparisons (e.g., price), tend to be relatively more important in comparison-based tasks. Conversely, “enriched” attributes (e.g., brand name), which are more difficult to compare but are often more meaningful and informative when evaluated on their own, tend to receive relatively greater weight when preferences are formed on the basis of separate evaluations of individual options. Consistent with this analysis, systematic preference reversals were observed in a series of studies, which tested the proposed explanation on the basis of attribute-task compatibility, demonstrated that the findings generalize across preference elicitation tasks and attributes that have the characteristics prescribed by their theory, and examined rival accounts. The authors discuss the theoretical implications of this research and explore its consequences for the measurement of buyers’ preferences and for marketers’ pricing, merchandising, distribution, and communications strategies.
Article
Are overconfident beliefs driven by the motivation to view oneself positively? We test the relationship between motivation and overconfidence using two distinct, but often conflated measures: better-than-average (BTA) beliefs and overplacement. Our results suggest that motivation can indeed affect these faces of overconfidence, but only under limited conditions. Whereas BTA beliefs are inflated by motivation, introducing some specificity and clarity to the standards of assessment (Experiment 1) or to the trait’s definition (Experiments 2 and 3) reduces or eliminates this bias in judgment overall. We find stronger support for a cognitive explanation for overconfidence, which emphasizes the effect of task difficulty. The difficulty of possessing a desirable trait (Experiment 4) or succeeding on math and logic problems (Experiment 5) affects self-assessment more consistently than does motivation. Finally, we find the lack of an objective standard for vague traits allows people to create idiosyncratic definitions and view themselves as better than others in their own unique ways (Experiment 6). Overall, the results suggest motivation’s effect on BTA beliefs is driven more by idiosyncratic construals of assessment than by self-enhancing delusion. They also suggest that by focusing on vague measures (BTA rather than overplacement) and vague traits, prior research may have exaggerated the role of motivation in overconfidence.
Article
Forecasting advice from human advisors is often utilized more than advice from automation. There is little understanding of why “algorithm aversion” occurs, or specific conditions that may exaggerate it. This paper first reviews literature from two fields—interpersonal advice and human–automation trust—that can inform our understanding of the underlying causes of the phenomenon. Then, an experiment is conducted to search for these underlying causes. We do not replicate the finding that human advice is generally utilized more than automated advice. However, after receiving bad advice, utilization of automated advice decreased significantly more than advice from humans. We also find that decision makers describe themselves as having much more in common with human than automated advisors despite there being no interpersonal relationship in our study. Results are discussed in relation to other findings from the forecasting and human–automation trust fields and provide a new perspective on what causes and exaggerates algorithm aversion.
Article
For millennia humans have relied on one another to recall the minutiae of our daily goings-on. Now we rely on “the cloud”—and it is changing how we perceive and remember the world around us
Article
In the experiment presented in this paper the Elaboration Likelihood Model (ELM), a social psychological theory of persuasion, was applied to explain why users sometimes agree with the incorrect advice of an expert system. Subjects who always agreed with the expert system's incorrect advice (n = 36) experienced less mental effort, scored lower on recall questions, and evaluated the cases as being easier than subjects who disagreed once or more with the expert system (n = 35). These results show that subjects who agreed with the expert system hardly studied the advice but just trusted the expert system. This is in agreement with the ELM. The experiment also covers an investigation into the factors that moderate user agreement. The results have serious implications for the use of expert systems.
Article
Despite over 50 years of one-sided research favoring formal prediction rules over human judgment, the “clinical-statistical controversy,” as it has come to be known, remains something of a hot-button issue. Surveying the objections to the formal approach, it seems the strongest point of disagreement is that clinical expertise can be replaced by statistics. We review and expand upon an unfortunately obscured part of Meehl's book to try to reconcile the issue. Building on Meehl, we argue that the clinician provides information that cannot be captured in, or outperformed by, mere frequency tables. However, that information is still best harnessed by a mechanical prediction rule that makes the ultimate decision. Two original studies support our arguments. The first study shows that multivariate prediction models using no data other than clinical speculations can perform well against statistical regression models. Study 2, however, showed that holistic predictions were less accurate than predictions made by mechanically combining smaller judgments without input from the judge at the combination stage. While we agree that clinical expertise cannot be replaced or neglected, we see no ethical reason to resist using explicit, mechanical rules for socially important decisions. Copyright © 2006 John Wiley & Sons, Ltd.
Article
Three experiments were conducted within the framework of correspondent inference theory. In each of the experiments the subjects were instructed to estimate the “true” attitude of a target person after having either read or listened to a speech by him expressing opinions on a controversial topic. Independent variables included position of speech (pro, anti, or equivocal), choice of position vs. assignment of position, and reference group of target person. The major hypothesis (which was confirmed with varying strength in all three experiments) was that choice would make a greater difference when there was a low prior probability of someone taking the position expressed in the speech. Other findings of interest were: (1) a tendency to attribute attitude in line with behavior, even in no-choice conditions; (2) increased inter-individual variability in conditions where low probability opinions were expressed in a constraining context; (3) that this variability was partly a function of the subjects' own attitudes on the issue; (4) that equivocation in no-choice conditions leads to the attribution that the equivocator opposes the assigned position. The main conclusion suggested is that perceivers do take account of prior probabilities and situational constraints when attributing private attitude, but perhaps do not weight these factors as heavily as would be expected by a rational analysis.
Article
The general problem of forming composite variables from components is prevalent in many types of research. A major aspect of this problem is the weighting of components. Assuming that composites are a linear function of their components, composites formed by using standard linear regression are compared to those formed by simple unit weighting schemes, i.e., where predictor variables are weighted by 1.0. The degree of similarity between the two composites, expressed as the minimum possible correlation between them, is derived. This minimum correlation is found to be an increasing function of the intercorrelation of the components and a decreasing function of the number of predictors. Moreover, the minimum is fairly high for most applied situations. The predictive ability of the two methods is compared. For predictive purposes, unit weighting is a viable alternative to standard regression methods because unit weights: (1) are not estimated from the data and therefore do not “consume” degrees of freedom; (2) are “estimated” without error (i.e., they have no standard errors); (3) cannot reverse the “true” relative weights of the variables. Predictive ability of the two methods is examined as a function of sample size and number of predictors. It is shown that unit weighting will be superior to regression in certain situations and not greatly inferior in others. Various implications for using unit weighting are discussed and applications to several decision making situations are illustrated.
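A small numerical sketch of the comparison described above, using simulated equicorrelated standardized predictors (all values invented), estimates the correlation between a unit-weighted composite and a regression-weighted composite as the predictor intercorrelation varies:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 4                          # observations, predictors

def composite_correlation(rho):
    # Equicorrelated standardized predictors with pairwise correlation rho.
    cov = np.full((k, k), rho) + (1 - rho) * np.eye(k)
    x = rng.multivariate_normal(np.zeros(k), cov, size=n)
    y = x @ np.array([0.5, 0.3, 0.2, 0.1]) + rng.normal(0, 1, n)
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)   # estimated regression weights
    regression_composite = x @ beta
    unit_composite = x.sum(axis=1)                 # unit (equal) weights
    return np.corrcoef(regression_composite, unit_composite)[0, 1]

for rho in (0.1, 0.4, 0.7):
    print(rho, round(composite_correlation(rho), 2))
# The two composites are typically highly correlated, and the correlation
# tends to rise as the intercorrelation of the predictors increases.
```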
Article
Information displays influence decision processes by facilitating some decision strategies while hindering others. Component characteristics of displays, such as the form, organization, and sequence of information, influence decision processes through an adaptive mechanism whereby a decision maker balances the desire to maximize accuracy against the desire to minimize effort. Variations in the information display lead to changes in the anticipated effort and anticipated accuracy of each available strategy and, therefore, provide an incentive for decision makers to use different decision processes. Research in this area can provide guidance regarding the use of displays and other decision-aiding approaches.
Article
This paper identifies a systematic instability in the weight that people place on interpersonal comparisons of outcomes. When evaluating the desirability of a single outcome consisting of a payoff for oneself and another person, people display great concern for relative payoffs. However, when they choose between two or more outcomes, their choices reflect greater concern with their own payoffs and less concern for relative payoffs. Modal subjects in our experiments rated the outcome of $500 for self/$500 for other as more desirable than the outcome $600 for self/$800 for other when both were evaluated independently, but they chose the latter outcome over the former when presented with the two options simultaneously. We offer a theoretical explanation for this phenomenon and demonstrate its robustness.
Article
Two studies provided evidence for the role of naïve realism in the failure of individuals to give adequate weight to peer input, and explored two strategies for reducing the impact of this inferential bias. Study 1 demonstrated that dyad members see their own estimates as more “objective” than those of their partners and that this difference in perceived objectivity predicts the degree of underweighting. Compelling participants to assess their own versus their partners’ objectivity prior to revising estimates decreased underweighting, an effect that was mediated by differences in perceived objectivity. Study 2 showed that the increase in accuracy that results from requiring dyad members to offer joint estimates via discussion is largely retained in subsequent individual estimates. Both studies showed that underweighting is greater when dyad members disagree on the issue about which they are making consensus estimates—a finding that further supports a “naïve realism” interpretation of the phenomenon.
Article
Proper linear models are those in which predictor variables are given weights such that the resulting linear composite optimally predicts some criterion of interest; examples of proper linear models are standard regression analysis, discriminant function analysis, and ridge regression analysis. Research summarized in P. Meehl's (1954) book on clinical vs statistical prediction and research stimulated in part by that book indicate that when a numerical criterion variable (e.g., graduate GPA) is to be predicted from numerical predictor variables, proper linear models outperform clinical intuition. Improper linear models are those in which the weights of the predictor variables are obtained by some nonoptimal method. The present article presents evidence that even such improper linear models are superior to clinical intuition when predicting a numerical criterion from numerical predictors. In fact, unit (i.e., equal) weighting is quite robust for making such predictions. The application of unit weights to decide what bullet the Denver Police Department should use is described; some technical, psychological, and ethical resistances to using linear models in making social decisions are considered; and arguments that could weaken these resistances are presented. (50 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
A review of the literature indicates that linear models are frequently used in situations in which decisions are made on the basis of multiple codable inputs. These models are sometimes used (a) normatively to aid the decision maker, (b) as a contrast with the decision maker in the clinical vs statistical controversy, (c) to represent the decision maker "paramorphically" and (d) to "bootstrap" the decision maker by replacing him with his representation. Examination of the contexts in which linear models have been successfully employed indicates that the contexts have the following structural characteristics in common: each input variable has a conditionally monotone relationship with the output; there is error of measurement; and deviations from optimal weighting do not make much practical difference. These characteristics ensure the success of linear models, which are so appropriate in such contexts that random linear models (i.e., models whose weights are randomly chosen except for sign) may perform quite well. 4 examples involving the prediction of such codable output variables as GPA and psychiatric diagnosis are analyzed in detail. In all 4 examples, random linear models yield predictions that are superior to those of human judges. (52 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
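Under an assumed data-generating process (simulated data only, no human judges), the following sketch illustrates the claim that linear models whose weights are random apart from having the correct sign can approach the predictive validity of regression-estimated weights:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 1_000, 5
x = rng.normal(size=(n, k))
true_w = np.array([0.6, 0.5, 0.4, 0.3, 0.2])       # all-positive "true" weights
y = x @ true_w + rng.normal(0, 1.0, n)

train, test = slice(0, 500), slice(500, None)
beta, *_ = np.linalg.lstsq(x[train], y[train], rcond=None)   # regression weights

def validity(weights):
    """Correlation between the linear composite and the criterion on held-out data."""
    return np.corrcoef(x[test] @ weights, y[test])[0, 1]

print("regression weights:", round(validity(beta), 2))
for _ in range(3):
    random_w = rng.uniform(0.1, 1.0, k)            # random magnitudes, correct signs
    print("random weights:    ", round(validity(random_w), 2))
# Random positive weights typically come close to the regression weights' validity.
```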
Article
The statistical vs. clinical prediction issue as applied to daily clinical decisions. The problem of pragmatic decisions, the theoretical derivation of novel patterns, and the relationship of nonfrequentist probability and rational action are considered. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Decision makers and forecasters often receive advice from different sources including human experts and statistical methods. This research examines, in the context of stock price forecasting, how the apparent source of the advice affects the attention that is paid to it when the mode of delivery of the advice is identical for both sources. In Study 1, two groups of participants were given the same advised point and interval forecasts. One group was told that these were the advice of a human expert and the other that they were generated by a statistical forecasting method. The participants were then asked to adjust forecasts they had previously made in light of this advice. While in both cases the advice led to improved point forecast accuracy and better calibration of the prediction intervals, the advice which apparently emanated from a statistical method was discounted much more severely. In Study 2, participants were provided with advice from two sources. When the participants were told that both sources were either human experts or both were statistical methods, the apparent statistical-based advice had the same influence on the adjusted estimates as the advice that appeared to come from a human expert. However when the apparent sources of advice were different, much greater attention was paid to the advice that apparently came from a human expert. Theories of advice utilization are used to identify why the advice of a human expert is likely to be preferred to advice from a statistical method. Copyright © 2009 John Wiley & Sons, Ltd.
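The discounting reported here is commonly quantified in the advice-taking literature as the weight of advice: the share of the gap between the judge's initial estimate and the advisor's estimate that the judge closes. A minimal sketch, with invented stock-price forecasts:

```python
def weight_of_advice(initial, advice, final):
    """Share of the gap between own estimate and advice that the judge closes.
    0 = advice ignored, 1 = advice fully adopted, 0.5 = simple averaging."""
    if advice == initial:
        return None   # undefined when the advice matches the initial estimate
    return (final - initial) / (advice - initial)

# Invented forecasts: own estimate 100, advised estimate 120.
print(weight_of_advice(initial=100, advice=120, final=116))  # 0.8: heavy reliance
print(weight_of_advice(initial=100, advice=120, final=104))  # 0.2: advice heavily discounted
```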
Article
In two studies, we inquired whether patients accept medical recommendations that come from a computer program rather than from a physician. In Study 1, we found that subjects, when deciding whether to have an operation in different medical scenarios, were more likely to follow a recommendation that came from a physician than one that came from a computer program. Subjects stated that they would feel less responsible when following a recommendation than when deciding against it. Following a physician's recommendation reduced the feeling of responsibility more than following that of a computer program. The difference in feeling of responsibility when following versus not following a recommendation partly mediated subjects' inclination to follow the physician more. In Study 2, we found that subjects were more decision-seeking when they received a recommendation or decision from a computer program, and they were more decision-seeking when they had to accept a decision than when they received a recommendation. Subjects also trusted the physician more than the computer program to make a good recommendation or decision.
Article
The uncanny valley (the unnerving nature of humanlike robots) is an intriguing idea, but both its existence and its underlying cause are debated. We propose that humanlike robots are not only unnerving, but are so because their appearance prompts attributions of mind. In particular, we suggest that machines become unnerving when people ascribe to them experience (the capacity to feel and sense), rather than agency (the capacity to act and do). Experiment 1 examined whether a machine's humanlike appearance prompts both ascriptions of experience and feelings of unease. Experiment 2 tested whether a machine capable of experience remains unnerving, even without a humanlike appearance. Experiment 3 investigated whether the perceived lack of experience can also help explain the creepiness of unfeeling humans and philosophical zombies. These experiments demonstrate that feelings of uncanniness are tied to perceptions of experience, and also suggest that experience, but not agency, is seen as fundamental to humans, and fundamentally lacking in machines.
Article
When predicting potential jury verdicts, trial attorneys often seek second opinions from other attorneys. But how much weight do they give to these opinions, and how optimally do they use them? In a four-round estimation task developed by Liberman et al. (under review), pairs of law students and pairs of experienced trial attorneys estimated actual jury verdicts. When participants were given access to a partner's estimates, their accuracy improved in both groups. However, participants in both groups underweighted their partners' estimates relative to their own, with experienced attorneys giving less weight to their partners' opinions than did law students. In doing so, participants failed to reap the full benefits of statistical aggregation. In both groups, requiring partners to reach agreement on a joint estimate improved accuracy, and this benefit was largely retained when participants gave final individual estimates. In a further analysis, we randomly sampled estimates to form groups of various sizes. The accuracy of mean estimates substantially increased as group size increased, with the largest relative benefit coming from the first additional estimate. We discuss the implications of these findings for the legal profession and for the study of individual versus collective estimation.
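The aggregation point in that last analysis can be sketched with a quick simulation (assuming unbiased, independent individual estimates with identical noise; all numbers are arbitrary): the error of the group mean falls as group size grows, with the largest gain coming from adding the second estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
truth = 100.0                                                  # the quantity being estimated
estimates = truth + rng.normal(scale=20.0, size=(10_000, 8))   # noisy individual estimates

for group_size in range(1, 9):
    group_mean = estimates[:, :group_size].mean(axis=1)
    mae = np.abs(group_mean - truth).mean()
    print(f"group size {group_size}: mean absolute error ~ {mae:.1f}")
```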
Article
Although prior studies have found that people generally underweight advice from others, such discounting of advice is not universal. Two studies examined the impact of task difficulty on the use of advice. In both studies, the strategy participants used to weigh advice varied with task difficulty even when it should not have. In particular, the results show that people tend to overweight advice on difficult tasks and underweight advice on easy tasks. This pattern held regardless of whether advice was automatically provided or whether people had to seek it out. The paper discusses implications for the circumstances under which people will be open to influence by advisors.
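How heavily a judge weighs advice is commonly quantified in this literature with a weight-of-advice (WOA) index. The abstract does not spell out a formula, so the snippet below is a generic illustration of that measure rather than the authors' exact procedure:

```python
def weight_of_advice(initial: float, advice: float, final: float) -> float:
    """Generic weight-of-advice index used in Judge-Advisor studies:
    0 = advice ignored, 0.5 = equal averaging, 1 = advice fully adopted."""
    if advice == initial:
        return float("nan")   # undefined when the advice matches the initial estimate
    return (final - initial) / (advice - initial)

print(weight_of_advice(initial=50, advice=70, final=60))   # 0.5 -> equal weighting
print(weight_of_advice(initial=50, advice=70, final=52))   # 0.1 -> heavy discounting of advice
```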
Article
Decision makers ("Judges") often make decisions after obtaining advice from an Advisor. The two parties often share a psychological "contract" about what each contributes in expertise to the decision and receives in monetary outcomes from it. In a laboratory experiment, we varied Advisor Experitise and the opportunity for monetary rewards. As expected, these manipulations influenced advice quality, advice taking, and Judge post-advice decision quality. The main contribution of the study, however, was the manipulation of the timing of monetary rewards (before or after the advising interaction). We found, as predicted, that committing money for expert-but not novice-advice increases Judges' use of advice and their subsequent estimation accuracy. Implications for advice giving and taking are discussed. Copyright (C) 2004 John Wiley Sons, Ltd.
Article
The Judge-Advisor paradigm (Sniezek & Buckley, 1989) allows for the study of decision making by groups with differentiated roles. This paper reports a study using this approach to investigate the impact of advice vis-à-vis the judge's own initial choice. Teams consisting of one judge and two advisors (business students randomly assigned to these roles) were given a choice task (concerning business events) composed of 70 items with two alternatives each. The judges provided final team choices and confidence assessments under one of three conditions: Dependent (judge has no basis for own choice), Cued (judge chooses only after being advised), or Independent (judge makes own tentative choice prior to being advised, as well as a subsequent final choice). Results showed that this manipulation affected the judge's final choice accuracy and confidence, leading to the best performance by Independent judges and the poorest by Dependent judges. These data are discussed with respect to theory and data on the cueing effect (Sniezek, Paese, & Switzer, 1990) for individual choice. In addition, the effects of (a) advisors' confidence and (b) conflict between advisors' recommendations on the judge's decision-making process are examined in detail. Finally, issues concerning the potential contribution of the Judge-Advisor paradigm to the understanding of social decision making are addressed.
Article
We report the results of a novel experiment that addresses two unresolved questions in the judgmental forecasting literature. First, how does combining the estimates of others differ from revising one’s own estimate based on the judgment of another? The experiment found that participants often ignored advice when revising an estimate but averaged estimates when combining. This was true despite receiving identical feedback about the accuracy of past judgments. Second, why do people consistently tend to overweight their own opinions at the expense of profitable advice? We compared two prominent explanations for this, differential access to reasons and egocentric beliefs, and found that neither adequately accounts for the overweighting of the self. Finally, echoing past research, we find that averaging opinions is often advantageous, but that choosing a single judge can perform well in certain predictable situations.
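The closing point, that averaging is often advantageous while choosing a single judge can also do well in predictable situations, can be sketched with synthetic data (one judge is deliberately assumed to be more accurate and identifiable as such; all numbers are illustrative assumptions, not the authors' data):

```python
import numpy as np

rng = np.random.default_rng(3)
truth, n = 100.0, 10_000
judge_a = truth + rng.normal(scale=10.0, size=n)   # assumed more accurate judge
judge_b = truth + rng.normal(scale=25.0, size=n)   # assumed less accurate judge

err_avg    = np.abs((judge_a + judge_b) / 2 - truth).mean()
err_choose = np.abs(judge_a - truth).mean()        # choosing the better judge, if identifiable

print(f"averaging both judges:     MAE ~ {err_avg:.1f}")
print(f"choosing the better judge: MAE ~ {err_choose:.1f}")
```

With two judges of similar accuracy the comparison reverses and averaging wins, which matches the general pattern described above.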
Article
When facing a decision, people often rely on advice received from others. Previous studies have shown that people tend to discount others’ opinions. Yet, such discounting varies according to several factors. This paper isolates one of these factors: the cost of advice. Specifically, three experiments investigate whether the cost of advice, independent of its quality, affects how people use advice. The studies use the Judge–Advisor System (JAS) to investigate whether people value advice from others more when it costs money than when it is free, and examine the psychological processes that could account for this effect. The results show that people use paid advice significantly more than free advice and suggest that this effect is due to the same forces that have been documented in the literature to explain the sunk costs fallacy. Implications for circumstances under which people value others’ opinions are discussed.