David V. Budescu
Fordham University · Department of Psychology
PhD
243 publications
74,635 reads
14,328 citations
Additional affiliations
August 2008 - present
Publications (243)
High-stakes debates often pivot on clashing estimates of outcomes that one side sees as so improbable as not to deserve policy prioritization. These debates are especially intractable when they focus on rare events ranging from disasters (e.g., existential risks from Artificial Intelligence, nuclear war, or bioengineered pandemics) to surprising su...
Forecasting tournaments are a well-established method for assessing human forecasting skills. Most forecasting tournaments are based on a format where participants estimate the probabilities of discrete events. For predictions of continuous values, the possible range of outcome values is divided into mutually exclusive bins covering the entire outc...
The quality of information that informs decisions in expert domains such as law enforcement and national security often requires assessment based on meta-informational cues such as source reliability and information credibility. Across two experiments with intelligence analysts (n = 74) and non-experts (n = 175), participants rated the accuracy, in...
Crowdsourcing approaches accompanied by optimal aggregation algorithms of human forecasts are becoming increasingly popular and are used in many contexts. In situations where a large number of judges are offered the opportunity to predict multiple events, one often encounters large numbers of “missing” forecasts. This article proposes a new approach...
Selection decisions are often affected by irrelevant variables such as gender or race. People can discount this irrelevant information by adjusting their predictions accordingly, yet they fail to do so intuitively. In five online studies (N = 1,077), participants were asked to make selection decisions in which the selection test was affected by irr...
This correction pertains to an error in the copyright information of the original publication. There were no revisions to the content of the chapter itself.
The copyright holder of this chapter has been retrospectively corrected. The correct copyright holder name is:
© His Majesty the King in Right of Canada as represented by Department of Nation...
Is forecasting ability a stable trait? While domain knowledge and reasoning abilities are necessary for making accurate forecasts, research shows that knowing how accurate forecasters have been in the past is the best predictor of future accuracy. However, unlike the measurement of other traits, evaluating forecasting skill requires substantial tim...
In most forecasting contexts, each target event has a resolution time point at which the “ground truth” is revealed or determined. It is reasonable to expect that as time passes, and information relevant to the event resolution accrues, the accuracy of individual forecasts will improve. For example, we expect forecasts about stock prices on a given...
The benefits of judgment aggregation are intuitive and well-documented. By combining the input of several judges, practitioners may enhance information sharing and signal strength while cancelling out biases and noise. The resulting judgment is more accurate than the average accuracy of the individual judgments—a phenomenon known as the wisdom of c...
Sound decision‐making relies on accurate prediction for tangible outcomes ranging from military conflict to disease outbreaks. To improve crowdsourced forecasting accuracy, we developed SAGE, a hybrid forecasting system that combines human and machine generated forecasts. The system provides a platform where users can interact with machine models a...
This is a pre-print of a chapter intended for publication in the book "JUDGMENT IN PREDICTIVE ANALYTICS" (Ed. Matthias Seifert, Ph.D.). The chapter represents a primer and review of various performance-weighted aggregation techniques, including: history-based weighting methods, disposition-based weighting methods, and coherence-based weighting meth...
We propose a new method to facilitate comparison of aggregated forecasts based on different aggregation, elicitation and calibration methods. Aggregates are evaluated by their relative position on the cumulative distribution of the corresponding individual scores. This allows one to compare methods using different measures of quality that use diffe...
Meta‐information is information about information that can be used as cues to guide judgments and decisions. Three types of meta‐information that are routinely used in intelligence analysis are source reliability, information credibility, and classification level. The first two cues are intended to speak to information quality (in particular, the p...
Meta-information is information about information that can be used as cues to guide judgments and decisions. Three types of meta-information that are routinely used in intelligence analysis are source reliability, information credibility and classification level. The first two cues are intended to speak to information quality (in particular, the pr...
Past research has found that people treat advice differently depending on its source. In many cases, people seem to prefer human advice to algorithms, but in others, there is a reversal, and people seem to prefer algorithmic advice. Across two studies, we examine the persuasiveness of, and judges' preferences for, advice from different sources when...
In selection decisions, decision makers often struggle to ignore irrelevant information, such as candidates' age, gender and attractiveness, which can lead to suboptimal decisions. One way to correct the effects of these irrelevant attributes is to consider them as suppressor variables, and penalize individuals who unjustifiably benefit from them....
The accuracy of human forecasters is often reduced because of incomplete information and cognitive biases that affect the judges. One approach to improve the accuracy of the forecasts is to recalibrate them by means of non-linear transformations that are sensitive to the direction and the magnitude of the biases. Previous work on recalibration has...
A growing body of research indicates that forecasting skill is a unique and stable trait: forecasters with a track record of high accuracy tend to maintain this record. But how does one identify skilled forecasters effectively? We address this question using data collected during two seasons of a longitudinal geopolitical forecasting tournament. Ou...
This paper's top-level goal is to provide an overview of research conducted in the many academic domains concerned with forecasting. By providing a summary encompassing these domains, this survey connects them, establishing a common ground for future discussions. To this end, we survey literature on human judgement and quantitative forecasting as w...
Key Points: • Several lines of research evidence point to the ineffectiveness of numerically bounded linguistic probability schemes. • Methods for effective communication of probability information should aim to boost users' capability to process quantitative information. • Lewis et al.'s (2019) recommendation for a numerically bounded linguistic p...
This study investigates how individuals assess imprecise information. We focus on two essential dimensions of decision under uncertainty, outcomes and probabilities, and their respective precision. We believe the precision of information is highly relevant in the investment setting, as reflected in the well-known “home (familiarity) bias”, and the...
A crucial challenge for organizations is to pool and aggregate information effectively. Traditionally, organizations have relied on committees and teams, but recently many organizations have explored the use of information markets. In this paper, the authors compared groups and markets in their ability to pool and aggregate information in a hidden-...
In a recent issue of Earth’s Future [vol. 7, pp. 1020-1026], S. C. Lewis et al. recommended a numerically bounded linguistic probability (NBLP) scheme for communicating probabilistic information in extreme event attribution studies. We provide a critique of NBLP schemes in general and of Lewis et al.’s in particular, noting two key points. First, e...
Choosing between candidates for a position can be tricky, especially when the selection test is affected by irrelevant characteristics (e.g., reading speed). One can correct for this irrelevant attribute by penalizing individuals who have unjustifiably benefited from it. Statistical models do so by including the irrelevant attribute as a suppressor...
The consequences of global warming will be dire, but the full extent of these effects on society is unknown and includes uncertainties. Research now suggests that how scientists communicate about the uncertainty over such climate change impacts can influence the public’s trust and acceptance of this information.
The literature suggests that simple expert (mathematical) models can improve the quality of decisions, but people are not always eager to accept and endorse such models. We ran three online experiments to test the receptiveness to advice from computerized expert models. Middle‐ and high‐school teachers (N = 435) evaluated student profiles that vari...
Human forecasts and other probabilistic judgments can be improved by elicitation and aggregation methods. Recent work on elicitation shows that deriving probability estimates from relative judgments (the ratio method) is advantageous, whereas other recent work on aggregation shows that it is beneficial to transform probabilities into coherent sets...
Forecasting of geopolitical events is a notoriously difficult task, with experts failing to significantly outperform a random baseline across many types of forecasting events. One successful way to increase the performance of forecasting tasks is to turn to crowdsourcing: leveraging many forecasts from non-expert users. Simultaneously, advances in...
Climate researchers use carbon dioxide emission scenarios to explore alternative climate futures and potential impacts, as well as implications of mitigation and adaptation policies. Often, these scenarios are published without formal probabilistic interpretations, given the deep uncertainty related to future development. However, users often seek...
Two consistent findings from the study of the fit between judgment of performance and actual performance are general overconfidence and the hard–easy effect, with overconfidence being higher with more difficult stimuli. These findings are based on aggregated analyses of confidence and accuracy, despite the fact that confidence judgments are individ...
Background: Urine drug testing techniques have different rates of false-positive and false-negative test results. However, clinicians may have highly varying perceptions of test accuracy and may compensate for perceived inaccuracy by incorporating other factors into their interpretation of observed test results. Thus, there is the potential for adve...
Violations of the Reduction of Compound Lottery axiom (ROCL) were documented, but they are not fully understood, and only a few descriptive models were offered to model decision makers’ (DMs) decisions in such cases. This article comprehensively tests the effects of 6 factors that could influence DMs’ evaluations of compound lotteries, and models how...
Scientists agree that the climate is changing due to human activities, but there is less agreement about the specific consequences and their timeline. Disagreement among climate projections is attributable to the complexity of climate models that differ in their structure, parameters, initial conditions, etc. We examine how different sources of unc...
Extensive research has been devoted to the quality of analysts' earnings forecasts. The common finding is that analysts' forecasts are not very accurate. Prior studies have tended to focus on the mean of forecasts and measure accuracy using various summaries of forecast errors. The present study sheds new light on the accuracy of analysts' forecast...
This article proposes an Item Response Theoretical (IRT) forecasting model that incorporates proper scoring rules and provides evaluations of forecasters’ expertise in relation to the features of the specific questions they answer. We illustrate the model using geopolitical forecasts obtained by the Good Judgment Project (GJP) (see Mellers, Ungar,...
Policymakers involved in climate change negotiations are key users of climate science. It is therefore vital to understand how to communicate scientific information most effectively to this group. We tested how a unique sample of policymakers and negotiators at the Paris COP21 conference update their beliefs on year 2100 global mean temperature inc...
The terms global warming and climate change are often used interchangeably, but recent research finds “global warming” has become more emotive and more polarizing, resulting in less advocacy by some subpopulations. We explore the robustness of this framing effect based on the expectation that people with stronger partisan identities tend to have mo...
We use results from a multiyear, geopolitical forecasting tournament to highlight the ability of the contribution weighted model [Budescu DV, Chen E (2015) Identifying expertise to extract the wisdom of crowds. Management Sci. 61(2):267-280] to capture and exploit expertise. We show that the model performs better when judges gain expertise from man...
We propose and test a novel approach for eliciting subjective joint probabilities. In the proposed approach, judges compare pairs of possible outcomes and identify which of the two is more likely and by how much. These pair-wise comparative judgments create a matrix of ratio judgments from which the target probabilities are extracted using the rows...
Public policymakers routinely receive and communicate information characterized by uncertainty. Decisions based on such information can have important consequences, so it is imperative that uncertainties are communicated effectively. Many organizations have developed dictionaries, or lexicons, that contain specific language (e.g., very likely, almo...
We show that, under some circumstances, identification and differentiation in the form of competition and individual rewards may undermine, rather than improve, group performance. The key factor for successful group performance seems to be whether or not group members share common goals and whether or not they have aligned incentives.
We apply the principles of the "Wisdom of Crowds (WoC)" to improve the calibration of interval estimates. Previous research has documented the significant impact of the WoC on the accuracy of point estimates but only a few studies have examined its effectiveness in aggregating interval estimates. We demonstrate that collective probability intervals...
We investigate optimal group member configurations for producing a maximally accurate group forecast. We generalize the core results of Lamberson and Page (2012) to account for group members that may be biased in their forecasts and/or have errors that correlate with the criterion values being forecast. We show that for large forecasting groups, th...
People utilize advice often with limited knowledge of their advisers' expertise. We examine the effects of learning mode on giving and receiving advice in two separate but linked studies. Advisers learned about a choice between two lotteries from either description or experience before writing advice. The decision makers only read this advice befor...
Multiple-choice (MC) tests have been criticized for allowing guessing and the failure to credit partial knowledge, and alternative scoring methods and response formats (Ben-Simon et al., Appl Psychol Meas 21:65–88, 1997) have been proposed to address this problem. Modern test theory addresses these issues by using binary item response models (e.g.,...
We investigate the implications of penalizing incorrect answers to multiple-choice tests, from the perspective of both test-takers and test-makers. To do so, we use a model that combines a well-known item response theory model with prospect theory (Kahneman and Tversky, Prospect theory: An analysis of decision under risk, Econometrica 47:263-91, 19...
We report results of a series of experiments on decision-making in the presence of irreducibly imprecise probabilities of negative and undesirable outcomes. Subjects faced decisions among actions where the payoffs depend on the probability of drawing balls from an urn whose composition was not fully known. Consistent with the vagueness avoidance hy...
Statistical aggregation is often used to combine multiple opinions within a group. Such aggregates outperform individuals, including experts, in various prediction and estimation tasks. This result is attributed to the “wisdom of crowds.” We seek to improve the quality of such aggregates by eliminating poorly performing individuals from the crowd....
The Intergovernmental Panel on Climate Change (IPCC) uses verbal descriptions of uncertainty (for example, Unlikely) to convey imprecision in its forecasts and conclusions. Previous studies showed that the American public misinterprets these probabilistic statements. We report results from a multi-national study involving 25 samples in 24 countries...
Numerous studies and anecdotes demonstrate the "wisdom of the crowd," the surprising accuracy of a group's aggregated judgments. Less is known, however, about the generality of crowd wisdom. For example, are crowds wise even if their members have systematic judgmental biases, or can influence each other before members render their judgments? If so,...
Cohen's κ measures the improvement in classification above chance level and it is the most popular measure of interjudge agreement. Yet, there is considerable confusion about its interpretation. Specifically, researchers often ignore the fact that the observed level of matched agreement is bounded from above and below and the bounds are a function...
A fundamental assumption of prospect theory is gain–loss separability (GLS)—the assertion that the overall utility of a prospect can be expressed as a function of the utilities of its positive and negative components. Violations of GLS may potentially limit the generalization of results from studies of single-domain prospects to mixed prospects and...
The functional measurement approach (Anderson, 1977) technically is based on factorial designs, quantitative response measures, and monotonic transformations, with the analysis of variance (ANOVA) as the usual statistical tool. This note discusses problems that often occur when utilizing monotonic data transformations in this context. Illustrative...
It is known that the average of many forecasts about a future event tends to outperform the individual assessments. With the goal of further improving forecast performance, this paper develops and compares a number of models for calibrating and aggregating forecasts that exploit the well-known fact that individuals exhibit systematic biases during...
Many Web sites provide consumers with product recommendations, which are typically presented by a sequence of verbal reviews and numerical ratings. In three experiments, we demonstrate that when participants switch between formats (e.g., from verbal to numerical), they are more prone to preference inconsistencies than when they aggregate the recomm...
Many important decisions are routinely made by transient and temporary teams, which perform their duty and disperse. Team members often continue making similar decisions as individuals. We study how the experience of team decision making affects subsequent individual decisions in two seminal probability and reasoning tasks, the Monty Hall problem a...
The authors propose a new structural solution to the knowledge-sharing dilemma. They show that simple auction mechanisms, which impose a rigid set of rules designed to standardize interactions and communication among participants, can prevent some of the detrimental effects associated with conflict of interest in freely interacting groups. The auth...
This is the final issue under this Editor-in-Chief, so this column is fittingly coauthored with the associate editors, whose terms also end with this issue, to emphasize their major role in the leadership of the journal. We first introduce incoming Editor-in-Chief Rakesh K. Sarin, briefly review this year's operations, and thank our editorial board...
We investigate a new give-or-take-some (GOTS) dilemma paradigm that merges traditional give-some and take-some dilemmas. In this hybrid social dilemma, individuals can choose to give or to take resources from a shared resource pool. Previous empirical work by McCarter, Budescu, and Scheffran (2011) found that the composition of the group and the in...
The Intergovernmental Panel on Climate Change (IPCC) publishes periodical assessment reports informing policymakers and the public on issues relevant to the understanding of human-induced climate change. The IPCC uses a set of 7 verbal descriptions of uncertainty, such as unlikely and very likely, to convey the underlying imprecision of its forecast...
We use simulation to investigate the joint effects of materiality, evidence extent, evidence nature, and misstatement type on achieved audit risk, i.e., the risk of undetected material financial statement misstatement due to error or fraud. Our primary results are fourfold. First, contrary to conventional audit wisdom, we show that elevating the ex...
This paper focuses on decisions under ambiguity. Participants in a laboratory experiment made decisions in three different settings: (a) individually, (b) individually after discussing the decisions with others, and (c) in groups of three. We show that groups are more likely to make ambiguity-neutral decisions than individuals, and that individuals...
We describe the Aggregative Contingent Estimation System (http://www.forecastingace.com), which is designed to elicit and aggregate forecasts from large, diverse groups of individuals. The Aggregative Contingent Estimation System (ACES; see http://www.forecastingace.com) is a project funded by the Intelligence Advanced Research Projects Activity....
Racial/ethnic diversity has become an increasingly important variable in the social sciences. Research from multiple disciplines consistently demonstrates the tremendous impact of ethnic diversity on individuals and organizations. Investigators use a variety of measures, and their choices can affect the conclusions that can be drawn and limit the a...
Often research in judgment and decision making requires comparison of multiple competing models. Researchers invoke global measures such as the rate of correct predictions or the sum of squared (or absolute) deviations of the various models as part of this evaluation process. Reliance on such measures hides the (often very high) level of agreement...
The calibration of probability or confidence judgments concerns the association between the judgments and some estimate of the correct probabilities of events. Researchers rely on estimates using relative frequencies computed by aggregating data over observations. We show that this approach creates conceptual problems, and may result in the confoun...
This issue’s “From the Editors” column is coauthored with all the associate editors, to emphasize their major role in the leadership of the journal. We first review this year’s operations and thank our editorial board and referees. Our first article, by David J. Johnstone, Victor Richmond R. Jose, and Robert L. Winkler, presents “Tailored Scoring R...
One of the most widely used methods for probability encoding in decision analysis uses binary comparisons (choices) between two lotteries: one that depends on the values of the random variable of interest and another that is contingent on an external reference chance device (typically a probability wheel). This note investigates the degree to which...
Prior findings suggest managers often choose ranges to communicate uncertainty in future earnings. We analyzed earnings forecasts over 11 years and find higher earnings uncertainty firms are more likely to choose range estimates. We study investors' attitudes to forecast precision and argue investors' evaluations of forecasts can be explained by a...
A reanalysis of Budescu et al.'s (2009) data on numerical interpretations of the Intergovernmental Panel on Climate Change (IPCC 2007) fourth report's verbal probability expressions (PE's) revealed that negative wording has deleterious effects on lay judgements. Budescu et al. asked participants to interpret PE's in IPCC report sentences, by a...