Article

Analytic Confidence and Political Decision‐Making: Theoretical Principles and Experimental Evidence From National Security Professionals

Abstract

When making decisions under uncertainty, it is important to distinguish between the probability that a judgment is true and the confidence analysts possess in drawing their conclusions. Yet analysts and decision-makers often struggle to define “confidence” in this context, and many ways that scholars use this term do not necessarily facilitate decision-making under uncertainty. To help resolve this confusion, we argue for disaggregating analytic confidence along three dimensions: reliability of available evidence, range of reasonable opinion, and responsiveness to new information. After explaining how these attributes hold different implications for decision-making in principle, we present survey experiments examining how analysts and decision-makers employ these ideas in practice. Our first experiment found that each conception of confidence distinctively influenced national security professionals' evaluations of high-stakes decisions. Our second experiment showed that inexperienced assessors of uncertainty could consistently discriminate among our conceptions of confidence when making political forecasts. We focus on national security, where debates about defining “confidence levels” have clear practical implications. But our theoretical framework generalizes to nearly any area of political decision-making, and our empirical results provide encouraging evidence that analysts and decision-makers can grasp these abstract elements of uncertainty.

... This can take the form of a Bayesian credible interval around a probability estimate. Friedman and Zeckhauser (2018) argued that analytic confidence depends upon three factors: the reliability of available evidence, the range of reasonable opinions, and the responsiveness of the estimate to new information. The reliability of evidence determines the extent to which an analyst's prior probability distribution (i.e., their initial assumptions and state of knowledge) influences the credible interval. ...
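The role of evidence reliability in shaping a credible interval can be sketched numerically. The following is a hypothetical illustration, not the cited authors' model: with a uniform prior on a Bernoulli rate, abundant reliable evidence concentrates the posterior and narrows the credible interval, while sparse evidence leaves the prior's influence, and hence the interval, wide. All counts are invented.

```python
# Illustrative sketch: a central Bayesian credible interval for a
# probability, computed on a grid (uniform Beta(1,1) prior assumed).
def credible_interval(successes, trials, mass=0.95, grid=10_001):
    """Central credible interval for a Bernoulli rate under a uniform prior."""
    ps = [i / (grid - 1) for i in range(grid)]
    # Unnormalized posterior weights: p^s * (1-p)^(t-s).
    w = [p ** successes * (1 - p) ** (trials - successes) for p in ps]
    total = sum(w)
    cdf, acc = [], 0.0
    for x in w:
        acc += x / total
        cdf.append(acc)
    lo_q, hi_q = (1 - mass) / 2, 1 - (1 - mass) / 2
    lo = next(p for p, c in zip(ps, cdf) if c >= lo_q)
    hi = next(p for p, c in zip(ps, cdf) if c >= hi_q)
    return lo, hi

# Same observed rate (60%), different amounts of evidence:
narrow = credible_interval(60, 100)  # more evidence -> narrower interval
wide = credible_interval(6, 10)      # less evidence -> wider interval
```

With identical best estimates, only the amount of reliable evidence differs between the two calls, so only the interval width changes; this is one way to operationalize the "reliability of evidence" dimension of confidence.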
... Another purpose of Experiment 2 was to examine whether providing information about the source of the analyst's uncertainty improves interpretations of confidence. Friedman and Zeckhauser (2018) argued that information about the three factors informing analytic confidence (reliability of evidence, range of reasonable opinion, and responsiveness to new information) should be conveyed to decision-makers orthogonally and they found that decision-makers were generally able to discern between the three factors when making decisions. In Experiment 2, participants in the verbal condition ...
... Revealing the source of analytic uncertainty did not improve the interpretation of expressions of confidence, contrary to the findings of Friedman and Zeckhauser (2018). ...
Article
Intelligence agencies communicate uncertainty to decision-makers through verbal probability phrases that correspond to numerical ranges (i.e., probability lexicons) and ordinal levels of confidence. However, decision-makers may misinterpret the relationship between these concepts and form inappropriate interpretations of intelligence analysts' uncertainty. In two experiments, four ways of conveying second-order probability to decision-makers were compared: (a) probability and confidence phrases written in the text of a report, (b) the addition of a probability lexicon, (c) the addition of a probability lexicon that varied numerical ranges according to the level of confidence (i.e., revised lexicon), and (d) a probability phrase written in text followed by a numerical range that varied according to the level of confidence. The revised lexicon was expected to improve interpretations of second-order probability. The 275 participants in Experiment 1 and 796 participants in Experiment 2 provided numerical estimates corresponding to analytic judgments provided in descriptions of three overseas military operations and also indicated their support for approving or delaying the operations. The results demonstrated that providing the numerical range in the text of the report or providing a probability lexicon improved interpretations of probability phrases above the verbal phrase-only condition, but not interpretations of confidence. Participants were unable to correctly interpret confidence with respect to the precision of their estimate intervals and their decisions about the operations. However, in Experiments 2 and 3, the effects on these variables of providing decision-makers with information about the source of the analyst's uncertainty were examined. In Experiment 3 (n = 510), providing this information improved correspondence between confidence level and approval of the operation.
Recommendations are provided regarding additional methods of improving decision‐makers' interpretation of second‐order probability conveyed in intelligence reporting.
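The revised-lexicon manipulation described above can be sketched as a simple lookup in which the numeric range attached to a probability phrase widens as the stated confidence level drops. All ranges below are hypothetical placeholders, not the study's actual materials:

```python
# Hypothetical sketch of a "revised lexicon": the numeric range assigned
# to a probability phrase widens as analytic confidence decreases.
REVISED_LEXICON = {
    ("likely", "high"): (0.70, 0.80),
    ("likely", "moderate"): (0.65, 0.85),
    ("likely", "low"): (0.55, 0.95),
    ("unlikely", "high"): (0.20, 0.30),
    ("unlikely", "moderate"): (0.15, 0.35),
    ("unlikely", "low"): (0.05, 0.45),
}

def numeric_range(phrase, confidence):
    """Look up the numeric range for a phrase at a given confidence level."""
    return REVISED_LEXICON[(phrase, confidence)]
```

The design choice this encodes is that confidence acts on the precision of an estimate (interval width) rather than on its location, which is exactly the distinction participants in these experiments struggled to draw.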
... In intelligence production, NBLP schemes are often paired with a separate ordinal scale for communicating analytic confidence in a given judgment (e.g., low, moderate, high; Friedman & Zeckhauser, 2018; Mandel & Irwin, 2021c). These scales (and the concepts they represent) are often described as distinct from each other (Defense Intelligence Agency, 2015). ...
... Critically, we found that intelligence consumers do not treat confidence and probability as independent constructs. While previous research suggests that experts and non-experts can distinguish between probability and specific components of confidence (Friedman & Zeckhauser, 2018), our findings cohere with evidence from declassified intelligence products that analysts conflate ordinal confidence ratings with event probabilities (Friedman & Zeckhauser, 2012). Simply put, the present findings indicate that the intelligence community's efforts to disjoin confidence and probability are ineffective, much like efforts to impose semantic meaning on probability terms through NBLP schemes. ...
Article
Full-text available
Organizations in several domains including national security intelligence communicate judgments under uncertainty using verbal probabilities (e.g., likely) instead of numeric probabilities (e.g., 75% chance), despite research indicating that the former have variable meanings across individuals. In the intelligence domain, uncertainty is also communicated using terms such as low, moderate, or high to describe the analyst's confidence level. However, little research has examined how intelligence professionals interpret these terms and whether they prefer them to numeric uncertainty quantifiers. In two experiments (N = 481 and 624, respectively), uncertainty communication preferences of expert (n = 41 intelligence analysts in Experiment 1) and nonexpert intelligence consumers were elicited. We examined which format participants judged to be more informative and simpler to process. We further tested whether participants treated verbal probability and confidence terms as independent constructs and whether participants provided coherent numeric probability translations of verbal probabilities. Results showed that although most nonexperts favored the numeric format, experts were about equally split, and most participants in both samples regarded the numeric format as more informative. Experts and nonexperts consistently conflated probability and confidence. For instance, confidence intervals inferred from verbal confidence terms had a greater effect on the location of the estimate than the width of the estimate, contrary to normative expectation. Approximately one‐fourth of experts and over one‐half of nonexperts provided incoherent numeric probability translations for the terms likely and unlikely when the elicitation of best estimates and lower and upper bounds were briefly spaced by intervening tasks.
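The normative expectations this abstract invokes can be stated as a small coherence check: a numeric translation of a verbal probability is coherent only when the best estimate lies between its lower and upper bounds, and confidence should act on interval width rather than location. The example translations below are invented for illustration:

```python
# A minimal coherence check of the kind implied above: a numeric
# translation of a verbal probability term is coherent when
# 0 <= lower <= best <= upper <= 1.
def is_coherent(lower, best, upper):
    return 0.0 <= lower <= best <= upper <= 1.0

translations = [
    {"term": "likely", "lower": 0.6, "best": 0.75, "upper": 0.9},   # coherent
    {"term": "unlikely", "lower": 0.3, "best": 0.2, "upper": 0.4},  # incoherent
]
coherent = [is_coherent(t["lower"], t["best"], t["upper"]) for t in translations]
```

A check of this sort makes the reported finding concrete: roughly a quarter of experts and over half of nonexperts produced translations that fail it once the bound and best-estimate elicitations were spaced apart.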
... Within the diverse discourse on uncertainty and political decision-making, several studies focus on how political decision-makers perceive different types of uncertainty, how these perceptions influence their confidence and their assessment of their own decision-making capabilities (Friedman and Zeckhauser, 2018), and the strategies they employ to manage such uncertainties (Heazle, 2012). Research further explores how political decision-makers' acknowledgments of different kinds of uncertainty (such as external uncertainty, arising from the inherent complexity of situations, or internal uncertainty, reflecting ignorance or a lack of confidence in their own decisions) affect public perceptions of them (Løhre and Halvor Teigen, 2023). ...
Article
Full-text available
Political decision-making is often riddled with uncertainties, largely due to the complexities and fluid nature of contemporary societies, which make it difficult to predict the consequences of political decisions. Despite these challenges, political leaders cannot shy away from decision-making, even when faced with overwhelming uncertainties. Thankfully, there are tools that can help them manage these uncertainties and support their decisions. Among these tools, Artificial Intelligence (AI) has recently emerged. AI-systems promise to efficiently analyze complex situations, pinpoint critical factors, and thus reduce some of the prevailing uncertainties. Furthermore, some of them have the power to carry out in-depth simulations with varying parameters, predicting the consequences of various political decisions, and thereby providing new certainties. With these capabilities, AI-systems prove to be a valuable tool for supporting political decision-making. However, using such technologies for certainty purposes in political decision-making contexts also presents several challenges—and if these challenges are not addressed, the integration of AI in political decision-making could lead to adverse consequences. This paper seeks to identify these challenges through analyses of existing literature, conceptual considerations, and political-ethical-philosophical reasoning. The aim is to pave the way for proactively addressing these issues, facilitating the responsible use of AI for managing uncertainty and supporting political decision-making. The key challenges identified and discussed in this paper include: (1) potential algorithmic biases, (2) false illusions of certainty, (3) presumptions that there is no alternative to AI proposals, which can quickly lead to technocratic scenarios, and (4) concerns regarding human control.
... Reference [26] shows that highly confident people are more confident in making decisions and are more likely to take risks in decision-making. The study of reference [27] reveals that self-confidence affects reviewers' assessments and evaluation order. ...
... Analytic confidence by humans can be broken down along three dimensions: reliability of available evidence, range of reasonable opinion, and responsiveness to new information (Friedman & Zeckhauser, 2018). The 'grey box' process fosters trust and transparency from the human element in the output of the AI system, going some way to ensure that confidence is part of the decision process (Christensen & Lyons, 2017). ...
Technical Report
Full-text available
Recent developments in the field of artificial intelligence (AI) have highlighted the significant potential of the technology to increase Defence capability while reducing risk in military operations. However, it is clear that significant work also needs to be undertaken to ensure that introduction of the technology does not result in adverse outcomes. Defence's challenge is that failure to adopt the emerging technologies in a timely manner may result in a military disadvantage, while premature adoption without sufficient research and analysis may result in inadvertent harms. To explore how to achieve ethical AI in Defence, a workshop was held in Canberra from 30 July to 1 August 2019. 104 people from 45 organisations attended, including representatives from Defence, other Australian government agencies, the Trusted Autonomous Systems Defence Cooperative Research Centre (TASDCRC), civil society, universities and Defence industry. The outputs of the workshop represent a small part of a substantial and ongoing investment in appropriate methodologies, frameworks and theories to guide the development, evaluation, deployment and adaptation of ethical AI and autonomous systems across Defence and the TASDCRC. This report articulates the views of participants and outcomes of the workshop for further consideration and does not represent the views of the Australian Government. This report will be provided to support the development of Defence policy, doctrine, research and project management.
... On failures of intelligence, see Bar-Joseph and Kruglanski (2003), Jervis (2010), and Shlaim (1976). On this, see Zeckhauser (2012, 2018) and Friedman, Lerner, and Zeckhauser (2017). ...
Article
A study of the daily briefings of US presidents by the intelligence community offers a useful test of whether governments can surmount intragovernmental influences in the acquisition and processing of information. A finding that the briefs somehow anticipate events would suggest that governments—their leaders and organizations—rise above political incentives and institutional practices to approach the rationality that realist and liberal scholars attribute to states. This study, thus, examines which countries appear in (the now declassified) daily intelligence briefs of the 1961–(January)1977 period, covering the Kennedy, Johnson, Nixon, and Ford years. It not only finds evidence that the selection of countries for the briefs favors countries referenced in prior briefs (per the foreign-policy literature) but also finds significant evidence that the appearance of countries, in the briefs, anticipates their increased activity in the period to follow (per a rational model).
... I therefore focus on some of the key entry points for uncertainty in such modeling efforts because I believe the prospects for success of the simulation manifesto require that its proponents have a well-articulated plan for dealing with uncertainty as well as conflict. This is especially true if the aim of the approach is to provide intelligence to policymakers, since intelligence is effectively an exercise in judgment under uncertainty to improve decision-making by "elevating the debate" (Kent, 1955), a precondition of which is the careful estimation of uncertainty, both about future events that may be not only unknown but unknowable (Kent, 1964) and about the margins of error on those estimates, namely the analyst's confidence (Friedman & Zeckhauser, 2018; Mandel & Irwin, 2020). ...
Article
Lustick and Tetlock outline an intellectually ambitious approach to scoping the future. They are particularly interested in sectors of national security and foreign policy decision-making that require anticipatory strategic intelligence that is difficult to produce because there is insufficient data, even if relevant theories are available. They propose that in these theory-rich/data-impoverished cases, there can be great value in developing agent-based simulation models that incorporate probabilistic rules that cohere with postulates of the theory or theories that are brought to bear on the intelligence challenge. This is the gist of the "simulation manifesto." Such agent-based models, once constructed, could be run and rerun to examine the effect of counterfactual interventions on the simulated world histories. Since the rules comprising such models are probabilistic, rerunning history from a fixed set of initial conditions will produce a distribution of outcomes. This is precisely the sort of reference-class "outside-view" data that are missing in the type of problem that concerns Lustick and Tetlock. The approach that they propose is, in fact, of great interest to the U.S. intelligence community and it is currently supported by the Office of the Director of National Intelligence through the Intelligence Advanced Research Projects Activity (IARPA) FOCUS program (the acronym stands for Forecasting Counterfactuals in Uncontrolled Settings), which at least partially funds the authors' current work on this topic. Whether theory-driven counterfactual simulations will ultimately prove useful to intelligence organizations and policymakers remains to be seen. However, Lustick and Tetlock's preliminary results using this approach are promising, and serious attempts to go beyond the sort of "thoughtful impressionism" they describe are both encouraging and to be encouraged.
The aim of this commentary is to focus on the assessment and representation of key uncertainties in such models, a topic that Lustick and Tetlock allude to at points in their paper, but which they say little about in specific terms. This is surprising given their awareness of the fact that human reasoning and judgment are inherently noisy, a point they aptly make in their paper. Discussion of these issues may have been fleshed out in earlier papers on the Virtual Strategic Analysis and Forecasting Tool (VSAFT) meta-model, but it is useful to flag them here for readers (like myself) who are not well versed in the earlier papers on the VSAFT. I therefore focus on some of the key entry points for uncertainty in such modeling efforts because I believe the prospects for success of the simulation manifesto require that its proponents have a well-articulated plan for dealing with uncertainty as well as conflict. This is especially true if the aim of the approach is to provide intelligence to policymakers, since intelligence is effectively an exercise in judgment under uncertainty to improve decision-making by "elevating the debate" (Kent, 1955), a precondition of which is the careful estimation of uncertainty, both about future events that may be not only unknown but unknowable (Kent, 1964) and about the margins of error on those estimates, namely the analyst's confidence (Friedman & Zeckhauser, 2018; Mandel & Irwin, 2020). First, there must be a principled way to assess which theories are applicable to the intelligence task at hand. Even if theories can be recruited (allowing ascent from the dreaded D-cell to B-cell in Lustick and Tetlock's confusion matrix), it is no simple and objective matter to state which ones are relevant and, from a curated set of theories, how the contenders should be ranked. This is already a daunting (and, at best, inter-subjective) exercise in judgment under uncertainty, but there is still a long way to go.
From each theory, there will be a range of postulates, some of which will appear highly relevant and others of which may seem irrelevant to the intelligence
... Offering a useful template for intelligence narratives about analytic confidence, Friedman and Zeckhauser propose three central components: (1) the reliability of evidence supporting an assessment; (2) the range of reasonable opinion surrounding an assessment; and (3) the extent to which analysts believe an assessment could change given additional information. 45 Forecasters reliably discriminated between these components and decision-makers used them reliably. 46 Therefore, analysts might benefit from training that exposes them to frameworks for thinking systematically about these components of confidence. ...
Article
Uncertainty is both inherent in nature and endemic to national security decision-making. Intelligence communities throughout the Western world, however, rely on vague language to communicate uncertainty—both the probability of critical events and the confidence that analysts have in their assessments—to decision-makers. In this article, we review the status-quo approach taken by the intelligence community and, drawing on abundant research findings, we describe fundamental limitations with the approach, including the inherent vagueness, context-dependence, and implicit meanings that attend the use of verbal uncertainty expressions. We then present an alternative approach based on the use of imprecise numeric estimates supplemented by clear written rationales, highlighting the affordances of this alternative. Finally, we describe institutional barriers to reform and address common objections raised by practitioners. While we focus our discussion on the domain of national security intelligence, the case for numeric probabilities is relevant to any organizational field where high-stakes decisions are made under conditions of uncertainty.
... Offering a useful template for intelligence narratives about analytic confidence, Friedman and Zeckhauser propose three central components: (1) the reliability of evidence supporting an assessment; (2) the range of reasonable opinion surrounding an assessment; and (3) the extent to which analysts believe an assessment could change given additional information. 45 Forecasters reliably discriminated between these components and decision-makers used them reliably. 46 Therefore, analysts might benefit from training that exposes them to frameworks for thinking systematically about these components of confidence. ...
Preprint
Uncertainty is both inherent in nature and endemic to national security decision-making. Intelligence communities throughout the Western world, however, rely on vague language to communicate uncertainty—both the probability of critical events and the confidence that analysts have in their assessments—to decision-makers. In this article, we review the status-quo approach taken by the intelligence community and, drawing on abundant research findings, we describe fundamental limitations with the approach, including the inherent vagueness, context-dependence, and implicit meanings that attend the use of verbal uncertainty expressions. We then present an alternative approach based on the use of imprecise numeric estimates supplemented by clear written rationales, highlighting the affordances of this alternative. Finally, we describe institutional barriers to reform and address common objections raised by practitioners. While we focus our discussion on the domain of national security intelligence, the case for numeric probabilities is relevant to any organizational field where high-stakes decisions are made under conditions of uncertainty.
Chapter
This volume provides the first major study of worldviews in international relations. Worldviews are the unexamined, pre-theoretical foundations of the approaches with which we understand and navigate the world. Advances in twentieth century physics and cosmology and other intellectual developments questioning anthropocentrism have fostered the articulation of alternative worldviews that rival conventional Newtonian humanism and its assumption that the world is constituted by controllable risks. This matters for coming to terms with the uncertainties that are an indelible part of many spheres of life including public health, the environment, finance, security and politics – uncertainties that are concealed by the conventional presumption that the world is governed only by risk. The confluence of risk and uncertainty requires an awareness of alternative worldviews, alerts us to possible intersections between humanist Newtonianism and hyper-humanist Post-Newtonianism, and reminds us of the relevance of science, religion and moral values in world politics.
Article
Full-text available
Intelligence analysis is fundamentally an exercise in expert judgment made under conditions of uncertainty. These judgments are used to inform consequential decisions. Following the major intelligence failure that led to the 2003 war in Iraq, intelligence organizations implemented policies for communicating probability in their assessments. Virtually all chose to convey probability using standardized linguistic lexicons in which an ordered set of select probability terms (e.g., highly likely) is associated with numeric ranges (e.g., 80–90%). We review the benefits and drawbacks of this approach, drawing on psychological research on probability communication and studies that have examined the effectiveness of standardized lexicons. We further discuss how numeric probabilities can overcome many of the shortcomings of linguistic probabilities. Numeric probabilities are not without drawbacks (e.g., they are more difficult to elicit and may be misunderstood by receivers with poor numeracy). However, these drawbacks can be ameliorated with training and practice, whereas the pitfalls of linguistic probabilities are endemic to the approach. We propose that, on balance, the benefits of using numeric probabilities outweigh their drawbacks. Given the enormous costs associated with intelligence failure, the intelligence community should reconsider its reliance on using linguistic probabilities to convey probability in intelligence assessments. Our discussion also has implications for probability communication in other domains such as climate science.
Preprint
Full-text available
Intelligence analysis is fundamentally an exercise in expert judgment made under conditions of uncertainty. These judgments are used to inform consequential decisions. Following the major intelligence failure that led to the 2003 war in Iraq, intelligence organizations implemented policies for communicating probability in their assessments. Virtually all chose to convey probability using standardized linguistic lexicons in which an ordered set of select probability terms (e.g., highly likely) is associated with numeric ranges (e.g., 80-90%). We review the benefits and drawbacks of this approach, drawing on psychological research on probability communication and studies that have examined the effectiveness of standardized lexicons. We further discuss how numeric probabilities can overcome many of the shortcomings of linguistic probabilities. Numeric probabilities are not without drawbacks (e.g., they are more difficult to elicit and may be misunderstood by receivers with poor numeracy). However, these drawbacks can be ameliorated with training and practice, whereas the pitfalls of linguistic probabilities are endemic to the approach. We propose that, on balance, the benefits of using numeric probabilities outweigh their drawbacks. Given the enormous costs associated with intelligence failure, the intelligence community should reconsider its reliance on using linguistic probabilities to convey probability in intelligence assessments. Our discussion also has implications for probability communication in other domains such as climate science.
Article
Full-text available
How much mileage can we get out of prospect theory to explain foreign policy decision-making? To answer this question, we first argue that risk as outcome uncertainty is the appropriate definition in prospect-theoretical applications. Then, we indicate that probability weighting—a crucial component of prospect theory—is typically ignored in such applications. We argue why this is problematic and suggest how to move forward. Next, we discuss how to establish the reference point in the face of outcomes in multiple dimensions, as is typically the case in foreign policy decision-making. Finally, we discuss what we have learnt regarding prospect theory’s scope conditions and the differences across individuals in the theory’s applicability. Overall, our contribution lies in identifying several underexposed or neglected issues (e.g., the definition of risk and probability weighting), in examining the advancements regarding prospect theory’s scope conditions, and in discussing avenues for further research.
Article
Scholars, practitioners, and pundits often leave their assessments of uncertainty vague when debating foreign policy, arguing that clearer probability estimates would provide arbitrary detail instead of useful insight. We provide the first systematic test of this claim using a data set containing 888,328 geopolitical forecasts. We find that coarsening numeric probability assessments in a manner consistent with common qualitative expressions—including expressions currently recommended for use by intelligence analysts—consistently sacrifices predictive accuracy. This finding does not depend on extreme probability estimates, short time horizons, particular scoring rules, or individual attributes that are difficult to cultivate. At a practical level, our analysis indicates that it would be possible to make foreign policy discourse more informative by supplementing natural language-based descriptions of uncertainty with quantitative probability estimates. More broadly, our findings advance long-standing debates over the nature and limits of subjective judgment when assessing social phenomena, showing how explicit probability assessments are empirically justifiable even in domains as complex as world politics.
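The accuracy cost of coarsening reported above can be illustrated with a toy expected-Brier-score calculation. The bins and probabilities below are my own invented numbers, not the study's forecast data: for a calibrated forecaster, snapping a numeric estimate to the midpoint of a coarse qualitative bin can only raise expected error.

```python
# Hypothetical illustration of the coarsening result: replacing a
# calibrated numeric forecast with the nearest coarse bin midpoint
# increases the expected Brier score whenever the bin differs from
# the true probability.
def coarsen(p, bins=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Snap a numeric forecast to the nearest qualitative bin midpoint."""
    return min(bins, key=lambda b: abs(b - p))

def expected_brier(p_true, p_reported):
    """Expected Brier score when the event occurs with probability p_true."""
    return p_true * (1 - p_reported) ** 2 + (1 - p_true) * p_reported ** 2

probs = [0.02, 0.15, 0.4, 0.55, 0.8, 0.98]  # calibrated numeric forecasts
fine = sum(expected_brier(p, p) for p in probs) / len(probs)
coarse = sum(expected_brier(p, coarsen(p)) for p in probs) / len(probs)
```

Because expected_brier(p, q) is minimized at q = p, the coarsened score is never better than the numeric score, which is the mechanism behind the finding that coarsening consistently sacrifices predictive accuracy.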
Article
Full-text available
Public policymakers routinely receive and communicate information characterized by uncertainty. Decisions based on such information can have important consequences, so it is imperative that uncertainties are communicated effectively. Many organizations have developed dictionaries, or lexicons, that contain specific language (e.g., very likely, almost certain) to express uncertainty. But these lexicons vary greatly and only a few have been empirically tested. We have developed evidence-based methods to standardize the language of uncertainty so that it has clear meaning understood by all parties in a given communication. We tested these methods in two policy-relevant domains: climate science and intelligence analysis. In both, evidence-based lexicons were better understood than those now used by the Intergovernmental Panel on Climate Change, the U.S. National Intelligence Council, and the U.K. Defence Intelligence. A well-established behavioral science method for eliciting the terms’ full meaning was especially effective for deriving such lexicons.
Article
Full-text available
We examine the trade-offs associated with using Amazon.com's Mechanical Turk (MTurk) interface for subject recruitment. We first describe MTurk and its promise as a vehicle for performing low-cost and easy-to-field experiments. We then assess the internal and external validity of experiments performed using MTurk, employing a framework that can be used to evaluate other subject pools. We first investigate the characteristics of samples drawn from the MTurk population. We show that respondents recruited in this manner are often more representative of the U.S. population than in-person convenience samples—the modal sample in published experimental political science—but less representative than subjects in Internet-based panels or national probability samples. Finally, we replicate important published experimental work using MTurk samples.
Article
Full-text available
Risks tend to be judged lower by men than by women and by white people than by people of colour. Prior research by Flynn, Slovic and Mertz [Risk Analysis, 14, pp. 1101-1108] found that these race and gender differences in risk perception in the United States were primarily due to 30% of the white male population who judge risks to be extremely low. The specificity of this finding suggests an explanation in terms of sociopolitical factors rather than biological factors. The study reported here presents new data from a recent national survey conducted in the United States. Although white males again stood apart with respect to their judgements of risk and their attitudes concerning worldviews, trust, and risk-related stigma, the results showed that the distinction between white males and others is more complex than originally thought. Further investigation of sociopolitical factors in risk judgements is recommended to clarify gender and racial differences.
Article
Full-text available
Virtually all current theories of choice under risk or uncertainty are cognitive and consequentialist. They assume that people assess the desirability and likelihood of possible outcomes of choice alternatives and integrate this information through some type of expectation-based calculus to arrive at a decision. The authors propose an alternative theoretical perspective, the risk-as-feelings hypothesis, that highlights the role of affect experienced at the moment of decision making. Drawing on research from clinical, physiological, and other subfields of psychology, they show that emotional reactions to risky situations often diverge from cognitive assessments of those risks. When such divergence occurs, emotional reactions often drive behavior. The risk-as-feelings hypothesis is shown to explain a wide range of phenomena that have resisted interpretation in cognitive-consequentialist terms.
Article
Full-text available
Overconfidence has long been noted by historians and political scientists as a major cause of war. However, the origins of such overconfidence, and sources of variation, remain poorly understood. Mounting empirical studies now show that mentally healthy people tend to exhibit psychological biases that encourage optimism, collectively known as 'positive illusions'. Positive illusions are thought to have been adaptive in our evolutionary past because they served to cope with adversity, harden resolve, or bluff opponents. Today, however, positive illusions may contribute to costly conflicts and wars. Testosterone has been proposed as a proximate mediator of positive illusions, given its role in promoting dominance and challenge behaviour, particularly in men. To date, no studies have attempted to link overconfidence, decisions about war, gender, and testosterone. Here we report that, in experimental wargames: (i) people are overconfident about their expectations of success; (ii) those who are more overconfident are more likely to attack; (iii) overconfidence and attacks are more pronounced among males than females; and (iv) testosterone is related to expectations of success, but not within gender, so its influence on overconfidence cannot be distinguished from any other gender specific factor. Overall, these results constitute the first empirical support of recent theoretical work linking overconfidence and war.
Article
Full-text available
The authors present a reconciliation of 3 distinct ways in which the research literature has defined overconfidence: (a) overestimation of one's actual performance, (b) overplacement of one's performance relative to others, and (c) excessive precision in one's beliefs. Experimental evidence shows that reversals of the first 2 (apparent underconfidence), when they occur, tend to be on different types of tasks. On difficult tasks, people overestimate their actual performances but also mistakenly believe that they are worse than others; on easy tasks, people underestimate their actual performances but mistakenly believe they are better than others. The authors offer a straightforward theory that can explain these inconsistencies. Overprecision appears to be more persistent than either of the other 2 types of overconfidence, but its presence reduces the magnitude of both overestimation and overplacement.
Article
Conventional accounts of how democracy works are flawed on a fundamental level, argue Christopher Achen and Larry Bartels. By accounting for the ways social identities shape voting behaviour, they present a new model that not only offers greater intellectual clarity but could make genuine political change possible.
Article
Cable operators in various countries have adopted inter-operator VoIP peering agreements over Next-Generation Networks (NGN), which would eliminate termination fees and encourage operators to offload traffic onto other networks quickly. The European Regulators Group (ERG) has issued IP interconnection regulations for service providers. Operators that currently benefit from termination fees on their networks aim to use NGNs to ensure an efficient exchange of traffic between operators. NGNs are being designed to incorporate finer quality-of-service charging mechanisms for various classes of traffic. IP peering networks need to be evaluated against an economically efficient interconnection model in conjunction with a specific retail model.
Article
Recent studies suggest psychological differences between conservatives and liberals, including that conservatives are more overconfident. We use a behavioral political economy model to show that while this is undoubtedly true for election years in the current era, there is no reason to believe that conservative ideologies are intrinsically linked to overconfidence. Indeed, it appears that in 1980 and before, conservatives and liberals were equally overconfident.
Article
Policymakers need estimative intelligence to help them understand the more diffuse and ambiguous threats and opportunities of the post-Cold War world. Ideological divisions are less likely to obstruct analysis, but greater uncertainties make analysis more difficult. The greater the uncertainty, the greater the scope of and need for estimative intelligence. Rather than trying to predict the future, analysts should deal with heightened uncertainty by presenting alternative scenarios.
Article
At conferences, at seminars, and on political science blogs, the potential utility of experimental methods for international relations (IR) research continues to be a hotly contested topic. Given the recent rise in creative applications of experimental methods, now is a useful moment to reflect more generally on the potential value of experiments to study international affairs, how these inherently micro-level methods can shed light on bigger-picture questions, what has been learned already, what goals are probably out of reach, and how various research agendas in IR might productively incorporate experiments.
Article
This paper studies, theoretically and empirically, the role of overconfidence in political behavior. Our model of overconfidence in beliefs predicts that overconfidence leads to ideological extremeness, increased voter turnout, and stronger partisan identification. The model also makes nuanced predictions about the patterns of ideology in society. These predictions are tested using unique data that measure the overconfidence and standard political characteristics of a nationwide sample of over 3,000 adults. Our numerous predictions find strong support in these data. In particular, we document that overconfidence is a substantively and statistically important predictor of ideological extremeness, voter turnout, and partisan identification.
Article
In a series of reports and meetings in Spring 2011, intelligence analysts and officials debated the chances that Osama bin Laden was living in Abbottabad, Pakistan. Estimates ranged from a low of 30 or 40 per cent to a high of 95 per cent. President Obama stated that he found this discussion confusing, even misleading. Motivated by that experience, and by broader debates about intelligence analysis, this article examines the conceptual foundations of expressing and interpreting estimative probability. It explains why a range of probabilities can always be condensed into a single point estimate that is clearer (but logically no different) than standard intelligence reporting, and why assessments of confidence are most useful when they indicate the extent to which estimative probabilities might shift in response to newly gathered information.
Article
This essay calls both for much greater attention to a relatively neglected but critical issue-area in foreign policy analysis, that of risk, and for a departure from a rational choice approach in the direction of a sociocognitive approach to risk. The sociocognitive approach to risk and risk-taking introduced here represents a substantial departure from the rational choice approach that has dominated the study of risk in foreign policy decision-making. The essay redefines and reconceptualizes the nature of risk in international politics in a manner that is in our view more consistent with actual decision-makers' behavior. Unlike the "black boxing" of the concept of risk that is evident in the rational choice approach to studies of risk, this study opens the "black box" by first disaggregating it into actual risk, perceived risk, and acceptable risk, and then by detailing the attributes of risk. This, we believe, is an essential first step toward a more realistic, accurate, and policy-relevant analysis. The approach presented here views risk as a social construct. Therefore, in explaining diversity in risk preferences, we focus on the taste for risk as a source of variety in risk preference across individuals, and on the process of problem-framing and communication. How risks are framed, especially when problems are ill-defined, becomes highly consequential for how decision-makers will understand and respond to them, and how effectively risk will be communicated both within the decision-making system as well as to outsiders.
Article
Good survey and experimental research requires subjects to pay attention to questions and treatments, but many subjects do not. In this article, we discuss “Screeners” as a potential solution to this problem. We first demonstrate Screeners’ power to reveal inattentive respondents and reduce noise. We then examine important but understudied questions about Screeners. We show that using a single Screener is not the most effective way to improve data quality. Instead, we recommend using multiple items to measure attention. We also show that Screener passage correlates with politically relevant characteristics, which limits the generalizability of studies that exclude failers. We conclude that attention is best measured using multiple Screener questions and that studies using Screeners can balance the goals of internal and external validity by presenting results conditional on different levels of attention.
Article
List and endorsement experiments are becoming increasingly popular among social scientists as indirect survey techniques for sensitive questions. When studying issues such as racial prejudice and support for militant groups, these survey methodologies may improve the validity of measurements by reducing nonresponse and social desirability biases. We develop a statistical test and multivariate regression models for comparing and combining the results from list and endorsement experiments. We demonstrate that when carefully designed and analyzed, the two survey experiments can produce substantively similar empirical findings. Such agreement is shown to be possible even when these experiments are applied to one of the most challenging research environments: contemporary Afghanistan. We find that both experiments uncover similar patterns of support for the International Security Assistance Force (ISAF) among Pashtun respondents. Our findings suggest that multiple measurement strategies can enhance the credibility of empirical conclusions. Open-source software is available for implementing the proposed methods.
Article
Survey experiments help establish causality, but scholars do not know how closely the treatments mimic natural phenomena. This study compares survey experiments and a natural experiment on the same topic. In two survey experiments providing information about Medicare, we observe double-digit learning effects. In contrast, most respondents in our contemporaneous natural experiment show little evidence of learning. Consistent with our expectations, the only people who showed comparable levels of learning to respondents in our survey experiment were individuals exposed to Medicare facts in their media source of choice as well as people who were uncertain about the facts from the very beginning. Our conclusion is that survey experiments, at least on this topic, generate effects that are only observed among parts of the population who are likely to be exposed to treatment messages or predisposed to accept them.
Article
The intelligence failure concerning Iraqi weapons of mass destruction (WMD) has been the center of political controversy and official investigations in three countries. This article reviews the Report on the U.S. Intelligence Community's Prewar Intelligence Assessments on Iraq, Senate Select Committee on Intelligence, 7 July 2004, Review of Intelligence on Weapons of Mass Destruction, a Report of a Committee of Privy Councillors to the House of Commons, 14 July 2004 (the Butler Report), Report to the President of the United States, The Commission on the Intelligence Capabilities of the United States Regarding Weapons of Mass Destruction, 31 March 2005. It explores the reasons for their deficiencies and the failure itself. This case and the investigations of it are similar to many previous ones. The investigations are marred by political bias and excessive hindsight. Neither the investigations nor contemporary intelligence on Iraqi WMD followed good social science practices. The comparative method was not utilized, confirmation bias was rampant, alternative hypotheses were not tested, and negative evidence was ignored. Although the opportunities to do better are many, the prospects for adequate reform are dim.
Article
Richard K. Betts is Leo A. Shifrin Professor and Director of the Institute of War and Peace Studies at Columbia University, and Senior Fellow at the Council on Foreign Relations.
Article
Can international relations (IR) be studied productively with field experimental methods? The two most common existing empirical approaches in IR rely on cross-national data, detailed case studies, or a combination of the two. One as yet uncommon approach is the use of randomized field experiments to evaluate causal hypotheses. Applying such methods within IR complements other theoretical, case study, and observational research, and permits a productive research agenda to be built by testing the micro-foundations of theories within IR. This argument is illustrated by exploring how field experimental methods could be applied to two existing areas: how international institutions facilitate cooperation, and whether international actors can promote democracy in sovereign states.
Article
I. Are there uncertainties that are not risks? 643. — II. Uncertainties that are not risks, 647. — III. Why are some uncertainties not risks? 656.
Article
The force of uncertainty is central to every major research tradition in the study of international relations. Yet uncertainty has multiple meanings, and each paradigm has a somewhat unique understanding of it. More often than not, these meanings are implicit. I argue that realists define uncertainty as fear induced by anarchy and the possibility of predation; rationalists as ignorance (in a nonpejorative sense) endemic to bargaining games of incomplete information and enforcement; cognitivists as the confusion (again nonpejoratively) of decision making in a complex international environment; and constructivists as the indeterminacy of a largely socially constructed world that lacks meaning without norms and identities. I demonstrate how these different understandings are what provide the necessary microfoundations for the paradigms’ definitions of learning, their contrasting expectations about signaling, and the functions provided by international organizations. This has conceptual, methodological, and theoretical payoffs. Understanding uncertainty is necessary for grasping the logic of each paradigm, for distinguishing them from each other, and promoting interparadigmatic communication.
Article
Growing experimental evidence in cognitive psychology and behavioral economics is shaping the way political science scholars think about how humans make decisions in areas of high complexity, uncertainty, and risk. Nearly all those studies utilize convenience samples of university students, but insights from that work may not be directly applicable to decisions that are made by political elites. We survey the nascent empirical literature on elite decision-making and look at six areas where the insights of cognitive psychology and behavioral economics are particularly relevant for political behavior and where evidence suggests that experienced elites differ from convenience samples. These differences suggest testable implications for theories of political decision making, which we illustrate in one major area of political science theory—crisis bargaining.
Article
Researchers use survey experiments to establish causal effects in descriptively representative samples, but concerns remain regarding the strength of the stimuli and the lack of realism in experimental settings. We explore these issues by comparing three national survey experiments on Medicare and immigration with contemporaneous natural experiments on the same topics. The survey experiments reveal that providing information increases political knowledge and alters attitudes. In contrast, two real-world government announcements had no discernable effects, except among people who were exposed to the same facts publicized in the mass media. Even among this exposed subsample, treatment effects were smaller and sometimes pointed in the opposite direction. Methodologically, our results suggest the need for caution when extrapolating from survey experiments. Substantively, we find that many citizens are able to recall factual information appearing in the news but may not adjust their beliefs and opinions in response to this information.
Why intelligence fails
  • R. Jervis
The great war of our time
  • M. Morell
Enemies of intelligence
  • R. Betts
Rationality for mortals
  • G. Gigerenzer
Public policy in an uncertain world
  • C. Manski
The ambiguity aversion literature
  • Al-Najjar
Separating the shirkers from the workers?
  • Berinsky
Overconfidence in war
  • D. Johnson
Overconfidence in wargames
  • Johnson
Comparing list and endorsement experiments
  • Blair
  • Doyle
  • Achen
Evaluating online labor markets for experimental research
  • Berinsky
Experiments in international relations
  • Hyde