Article

Wise teamwork: Collective confidence calibration predicts the effectiveness of group discussion


Abstract

‘Crowd wisdom’ refers to the surprising accuracy that can be attained by averaging judgments from independent individuals. However, independence is unusual; people often discuss and collaborate in groups. When does group interaction improve vs. degrade judgment accuracy relative to averaging the group's initial, independent answers? Two large laboratory studies explored the effects of 969 face-to-face discussions on the judgment accuracy of 211 teams facing a range of numeric estimation problems from geographic distances to historical dates to stock prices. Although participants nearly always expected discussions to make their answers more accurate, the actual effects of group interaction on judgment accuracy were decidedly mixed. Importantly, a novel, group-level measure of collective confidence calibration robustly predicted when discussion helped or hurt accuracy relative to the group's initial independent estimates. When groups were collectively calibrated prior to discussion, with more accurate members being more confident in their own judgment and less accurate members less confident, subsequent group interactions were likelier to yield increased accuracy. We argue that collective calibration predicts improvement because groups typically listen to their most confident members. When confidence and knowledge are positively associated across group members, the group's most knowledgeable members are more likely to influence the group's answers.


... For example, post hoc analyses of a relationship comparison task at group size 30 revealed that the accuracy of group judgments in proportion "2:3:0" (.56) was no better than in proportion "3:2:0" (.56). As a recent study pointed out [27], these findings imply that to achieve the wisdom of crowds, it is important to consider not only how confident individuals feel but also what they know (and do not know). ...
... In this situation, although the group judgment is A under a simple majority rule, it is B under a confidence-weighted majority rule, because the sum of confidence ratings for B (280) is larger than that for A (70). In addition, some previous studies have considered a confidence threshold for accepting individuals' judgments [27], [30], [33], [34]: if a person's confidence is above a certain threshold, their judgment is accepted; otherwise, it is rejected. This procedure is repeated until all the group members' judgments have been evaluated. ...
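The two aggregation rules contrasted in the snippet above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are invented, and the toy votes are chosen to mirror the example in the text (seven members choosing A at confidence 10 apiece, one choosing B at confidence 280).

```python
def majority_vote(judgments):
    """Simple majority rule: pick the option chosen by the most members."""
    return max(set(judgments), key=judgments.count)

def confidence_weighted_vote(judgments, confidences):
    """Confidence-weighted majority rule: pick the option whose summed
    confidence ratings are largest."""
    totals = {}
    for option, conf in zip(judgments, confidences):
        totals[option] = totals.get(option, 0) + conf
    return max(totals, key=totals.get)

def threshold_filter(judgments, confidences, threshold):
    """Confidence threshold: keep only judgments whose confidence
    meets the threshold."""
    return [j for j, c in zip(judgments, confidences) if c >= threshold]

# Toy data mirroring the text: A's confidence sums to 70, B's to 280.
votes = ["A"] * 7 + ["B"]
confs = [10] * 7 + [280]
# majority_vote(votes) -> "A"
# confidence_weighted_vote(votes, confs) -> "B"
```

Under a simple majority the seven low-confidence A votes win; under the confidence-weighted rule the single high-confidence B vote prevails, exactly the divergence the snippet describes.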
Preprint
Full-text available
In group judgments on a binary choice task, the judgments of individuals with low confidence (i.e., those who feel their judgment was not correct) may be regarded as unreliable. Previous studies have shown that aggregating individuals' diverse judgments can lead to highly accurate group judgments, a phenomenon known as the wisdom of crowds. Therefore, if low-confidence individuals make diverse judgments and the mean accuracy of those judgments is above chance level (.50), they will not necessarily decrease the accuracy of group judgments. To investigate this issue, the present study conducted behavioral experiments using binary choice inferential tasks, along with computer simulations of group judgments that manipulated group size and individuals' confidence levels. Results revealed that (I) judgment patterns were highly similar across individuals regardless of their confidence levels; (II) as group size increased, the low-confidence group could make judgments as accurate as the high-confidence group; and (III) even when low-confidence individuals were present in a group, they generally did not inhibit group judgment accuracy. The results suggest the usefulness of low-confidence individuals' judgments in a group and provide practical implications for real-world group judgments.
... These results are highly relevant for studies on forecasting, and forecasting researchers have highlighted similar effects in forecasting tasks. Human forecasting improves when forecasters can benefit from each other's estimates, arguments, evidence, and signals of confidence [18][19][20][21]. More recently, research in collective intelligence has expanded its focus of analysis from teams to networks of problem-solvers [22][23][24][25]. ...
Article
Full-text available
As artificial intelligence becomes ubiquitous in our lives, so do the opportunities to combine machine and human intelligence to obtain more accurate and more resilient prediction models across a wide range of domains. Hybrid intelligence can be designed in many ways, depending on the role of the human and the algorithm in the hybrid system. This paper offers a brief taxonomy of hybrid intelligence, which describes possible relationships between human and machine intelligence for robust forecasting. In this taxonomy, biological intelligence represents one axis of variation, going from individual intelligence (one individual in isolation) to collective intelligence (several connected individuals). The second axis of variation represents increasingly sophisticated algorithms that can take into account more aspects of the forecasting system, from information to task to human problem-solvers. The novelty of the paper lies in the interpretation of recent studies in hybrid intelligence as precursors of a set of algorithms that are expected to be more prominent in the future. These algorithms promise to increase hybrid system’s resilience across a wide range of human errors and biases thanks to greater human-machine understanding. This work ends with a short overview for future research in this field.
Article
In humans and other gregarious animals, collective decision-making is a robust behavioural feature of groups. Pooling individual information is also fundamental for modern societies, in which digital technologies have exponentially increased the interdependence of individual group members. In this Review, we selectively discuss the recent human and animal literature, focusing on cognitive and behavioural mechanisms that can yield collective intelligence beyond the wisdom of crowds. We distinguish between two group decision-making situations: consensus decision-making, in which a group consensus is required, and combined decision-making, in which a group consensus is not required. We show that in both group decision-making situations, cognitive and behavioural algorithms that capitalize on individual heterogeneity are the key for collective intelligence to emerge. These algorithms include accuracy or expertise-weighted aggregation of individual inputs and implicit or explicit coordination of cognition and behaviour towards division of labour. These mechanisms can be implemented either as ‘cognitive algebra’, executed mainly within the mind of an individual or by some arbitrating system, or as a dynamic behavioural aggregation through social interaction of individual group members. Finally, we discuss implications for collective decision-making in modern societies characterized by a fluid but auto-correlated flow of information and outline some future directions. Collective intelligence emerges in group decision-making, whether it requires a consensus or not. In this Review, Kameda et al. describe cognitive and behavioural algorithms that capitalize on individual heterogeneity to yield gains in decision-making accuracy beyond the wisdom of crowds. View-only file is available. https://rdcu.be/cL3QB
Article
Full-text available
The aggregation of many independent estimates can outperform the most accurate individual judgement [1–3]. This centenarian finding [1, 2], popularly known as the 'wisdom of crowds' [3], has been applied to problems ranging from the diagnosis of cancer [4] to financial forecasting [5]. It is widely believed that social influence undermines collective wisdom by reducing the diversity of opinions within the crowd. Here, we show that if a large crowd is structured in small independent groups, deliberation and social influence within groups improve the crowd's collective accuracy. We asked a live crowd (N = 5,180) to respond to general-knowledge questions (for example, "What is the height of the Eiffel Tower?"). Participants first answered individually, then deliberated and made consensus decisions in groups of five, and finally provided revised individual estimates. We found that averaging consensus decisions was substantially more accurate than aggregating the initial independent opinions. Remarkably, combining as few as four consensus choices outperformed the wisdom of thousands of individuals. The collective wisdom of crowds often provides better answers to problems than individual judgements. Here, a large experiment that split a crowd into many small deliberative groups produced better estimates than the average of all answers in the crowd.
Article
Full-text available
Collective intelligence refers to the ability of groups to outperform individual decision makers when solving complex cognitive problems. Despite its potential to revolutionize decision making in a wide range of domains, including medical, economic, and political decision making, at present, little is known about the conditions underlying collective intelligence in real-world contexts. We here focus on two key areas of medical diagnostics, breast and skin cancer detection. Using a simulation study that draws on large real-world datasets, involving more than 140 doctors making more than 20,000 diagnoses, we investigate when combining the independent judgments of multiple doctors outperforms the best doctor in a group. We find that similarity in diagnostic accuracy is a key condition for collective intelligence: Aggregating the independent judgments of doctors outperforms the best doctor in a group whenever the diagnostic accuracy of doctors is relatively similar, but not when doctors' diagnostic accuracy differs too much. This intriguingly simple result is highly robust and holds across different group sizes, performance levels of the best doctor, and collective intelligence rules. The enabling role of similarity, in turn, is explained by its systematic effects on the number of correct and incorrect decisions of the best doctor that are overruled by the collective. By identifying a key factor underlying collective intelligence in two important real-world contexts, our findings pave the way for innovative and more effective approaches to complex real-world decision making, and to the scientific analyses of those approaches.
Article
Full-text available
This paper examines how consumers forecast their future spare money or “financial slack.” While consumers generally think that both their income and expenses will rise in the future, they underweight the extent to which their expected expenses will cut into their spare money, a phenomenon we term “expense neglect.” We test and rule out several possible explanations, and conclude that expense neglect is due in part to insufficient attention towards expectations about future expenses compared to future income. “Tightwad” consumers who are chronically attuned to expenses show less severe expense neglect than “spendthrifts” who are not. We further find that expectations regarding changes in income (and not changes in expenses) predict the Michigan Index of Consumer Sentiments—a leading macro-economic indicator. Finally, we conduct a meta-analysis of our entire file-drawer (27 studies, 8,418 participants) and find that, across studies, participants place 2.9 times the weight on income change as they do on expense change when forecasting changes in their financial slack, and that expense neglect is stronger for distant than near future forecasts.
Article
Full-text available
Errors in estimating and forecasting often result from the failure to collect and consider enough relevant information. We examine whether attributes associated with persistence in information acquisition can predict performance in an estimation task. We focus on actively open-minded thinking (AOT), need for cognition, grit, and the tendency to maximize or satisfice when making decisions. In three studies, participants made estimates and predictions of uncertain quantities, with varying levels of control over the amount of information they could collect before estimating. Only AOT predicted performance. This relationship was mediated by information acquisition: AOT predicted the tendency to collect information, and information acquisition predicted performance. To the extent that available information is predictive of future outcomes, actively open-minded thinkers are more likely than others to make accurate forecasts.
Article
Full-text available
Social psychologists have long recognized the power of statisticized groups. When individual judgments about some fact (e.g., the unemployment rate for next quarter) are averaged together, the average opinion is typically more accurate than most of the individual estimates, a pattern often referred to as the wisdom of crowds. The accuracy of averaging also often exceeds that of the individual perceived as most knowledgeable in the group. However, neither averaging nor relying on a single judge is a robust strategy; each performs well in some settings and poorly in others. As an alternative, we introduce the select-crowd strategy, which ranks judges based on a cue to ability (e.g., the accuracy of several recent judgments) and averages the opinions of the top judges, such as the top 5. Through both simulation and an analysis of 90 archival data sets, we show that select crowds of 5 knowledgeable judges yield very accurate judgments across a wide range of possible settings: the strategy is both accurate and robust. Following this, we examine how people prefer to use information from a crowd. Previous research suggests that people are distrustful of crowds and of mechanical processes such as averaging. We show in 3 experiments that, as expected, people are drawn to experts and dislike crowd averages, but, critically, they view the select-crowd strategy favorably and are willing to use it. The select-crowd strategy is thus accurate, robust, and appealing as a mechanism for helping individuals tap collective wisdom. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
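The select-crowd strategy described in this abstract lends itself to a short sketch. The following is a hypothetical illustration (the function name, estimates, and cue scores are invented), assuming the ability cue is each judge's recent absolute error, so that lower cue scores indicate better judges.

```python
def select_crowd(estimates, cue_scores, k=5):
    """Select-crowd strategy: rank judges by a cue to ability
    (here, past absolute error, lower is better) and average the
    estimates of the top k judges."""
    ranked = sorted(zip(cue_scores, estimates), key=lambda pair: pair[0])
    top = [est for _, est in ranked[:k]]
    return sum(top) / len(top)

# Seven judges: five with small recent errors, two erratic outliers.
estimates = [100, 102, 98, 150, 50, 101, 99]
past_errors = [1, 2, 3, 40, 50, 4, 5]
# select_crowd(estimates, past_errors, k=5) averages the five
# historically accurate judges and ignores the two outliers.
```

With these toy numbers, the top-5 average is 100.0, while the whole-crowd mean (100.0 here only by coincidence of the symmetric outliers) would in general be pulled around by erratic judges; the point of the strategy is that the cue screens them out before averaging.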
Article
Full-text available
Numerous studies and anecdotes demonstrate the "wisdom of the crowd," the surprising accuracy of a group's aggregated judgments. Less is known, however, about the generality of crowd wisdom. For example, are crowds wise even if their members have systematic judgmental biases, or can influence each other before members render their judgments? If so, are there situations in which we can expect a crowd to be less accurate than skilled individuals? We provide a precise but general definition of crowd wisdom: A crowd is wise if a linear aggregate, for example a mean, of its members' judgments is closer to the target value than a randomly, but not necessarily uniformly, sampled member of the crowd. Building on this definition, we develop a theoretical framework for examining, a priori, when and to what degree a crowd will be wise. We systematically investigate the boundary conditions for crowd wisdom within this framework and determine conditions under which the accuracy advantage for crowds is maximized. Our results demonstrate that crowd wisdom is highly robust: Even if judgments are biased and correlated, one would need to nearly deterministically select only a highly skilled judge before an individual's judgment could be expected to be more accurate than a simple averaging of the crowd. Our results also provide an accuracy rationale behind the need for diversity of judgments among group members. Contrary to folk explanations of crowd wisdom which hold that judgments should ideally be independent so that errors cancel out, we find that crowd wisdom is maximized when judgments systematically differ as much as possible. We re-analyze data from two published studies that confirm our theoretical results.
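The definition in this abstract can be operationalized in a few lines. The sketch below is a simplified special case, assuming uniform sampling of the comparison member (the paper's definition also allows non-uniform sampling); the function name and toy judgments are invented for illustration.

```python
def crowd_is_wise(judgments, target):
    """Per the definition above (uniform-sampling case): a crowd is
    wise if the mean of its members' judgments is closer to the
    target than the expected error of a randomly sampled member."""
    mean = sum(judgments) / len(judgments)
    crowd_error = abs(mean - target)
    expected_member_error = sum(abs(j - target) for j in judgments) / len(judgments)
    return crowd_error < expected_member_error

# Diverse errors that partially cancel: the crowd is wise.
# crowd_is_wise([80, 120, 95, 130, 60], target=100) -> True
# Fully shared bias in one direction: averaging cancels nothing.
# crowd_is_wise([150, 160, 170], target=100) -> False
```

The two toy cases track the abstract's claim that diversity of judgments, not independence per se, is what drives the averaging advantage: when all judgments err in the same direction, the mean inherits the full bias.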
Article
Full-text available
Five university-based research groups competed to recruit forecasters, elicit their predictions, and aggregate those predictions to assign the most accurate probabilities to events in a 2-year geopolitical forecasting tournament. Our group tested and found support for three psychological drivers of accuracy: training, teaming, and tracking. Probability training corrected cognitive biases, encouraged forecasters to use reference classes, and provided forecasters with heuristics, such as averaging when multiple estimates were available. Teaming allowed forecasters to share information and discuss the rationales behind their beliefs. Tracking placed the highest performers (top 2% from Year 1) in elite teams that worked together. Results showed that probability training, team collaboration, and tracking improved both calibration and resolution. Forecasting is often viewed as a statistical problem, but forecasts can be improved with behavioral interventions. Training, teaming, and tracking are psychological interventions that dramatically increased the accuracy of forecasts. Statistical algorithms (reported elsewhere) improved the accuracy of the aggregation. Putting both statistics and psychology to work produced the best forecasts 2 years in a row.
Article
Full-text available
The present experiments examined several strategies designed to reduce interval overconfidence in group judgments. Results consistently indicated that 3–4 person nominal groups (whose members made independent judgments and later combined the highest and lowest of these estimates into a single confidence interval) were better calibrated than individual judges and interactive groups. This pattern held even when participants were directly instructed to expand their interval estimates, or when interactive groups appointed a devil's advocate or explicitly considered reasons why their interval estimates might be too narrow. Interactive groups did not perform substantially better than individuals, although participants frequently had the impression that group judgments were far superior to individual judgments. This misperception resembles the "illusion of group effectivity" found in brainstorming research. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Examined the quality of group judgment in situations in which groups have to express an opinion in quantitative form. To provide a measure for evaluating the quality of group performance (which is defined as the absolute value of the discrepancy between the judgment and the true value), 4 baseline models are considered. These models provide a standard for evaluating how well groups perform. The 4 models are: (a) randomly picking a single individual; (b) weighting the judgments of the individual group members equally (the group mean); (c) weighting the 'best' group member (i.e., the one closest to the true value) totally, where the best is known, a priori, with certainty; (d) weighting the best member totally, where there is a given probability of misidentifying the best and getting the 2nd, 3rd, etc., best member. These 4 models are examined under varying conditions of group size and "bias." Bias is defined as the degree to which the expectation of the population of individual judgments does not equal the true value (i.e., there is systematic bias in individual judgments). A method is then developed to evaluate the accuracy of group judgment in terms of the 4 models. The method uses a Bayesian approach by estimating the probability that the accuracy of actual group judgment could have come from distributions generated by the 4 models. (25 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
When financial columnist James Surowiecki wrote The Wisdom of Crowds, he wished to explain the successes and failures of markets (an example of a "crowd") and to understand why the average opinion of a crowd is frequently more accurate than the opinions of most of its individual members. In this expanded review of the book, Scott Armstrong asks a question of immediate relevance to forecasters: Are the traditional face-to-face meetings an effective way to elicit forecasts from forecast crowds (i.e. teams)? Armstrong doesn't believe so. Quite the contrary, he explains why he considers face-to-face meetings a detriment to good forecasting practice, and he proposes several alternatives that have been tried successfully.
Article
Full-text available
Is it possible to increase one's influence simply by behaving more confidently? Prior research presents two competing hypotheses: (1) the confidence heuristic holds that more confidence increases credibility, and (2) the calibration hypothesis asserts that overconfidence will backfire when others find out. Study 1 reveals that, consistent with the calibration hypothesis, while accurate advisors benefit from displaying confidence, confident but inaccurate advisors received low credibility ratings. However, Study 2 shows that when feedback on advisor accuracy is unavailable or costly, confident advisors hold sway regardless of accuracy. People also made less effort to determine the accuracy of confident advisors; interest in buying advisor performance data decreased as the advisor’s confidence went up. These results add to our understanding of how advisor confidence, accuracy, and calibration influence others.
Article
Full-text available
Prior research has suggested that most people are seriously overconfident in their answers to general knowledge questions. We attempted to reduce overconfidence in each of two separate experiments. In Experiment 1 half of the subjects answered five practice questions which appeared to be difficult. The remaining subjects answered practice problems which appeared to be easy but were actually just as difficult as the other group's practice questions. Within each of these two groups, half of the subjects received feedback on the accuracy of their answers to the practice questions, while the other half received no feedback. All four groups then answered 30 additional questions and indicated their confidence in these answers. The group which had received five apparently "easy" practice questions and then had been given feedback on the accuracy of their answers was underconfident on the final 30 questions. In Experiment 2 subjects who anticipated a group discussion of their answers to general knowledge questions took longer to answer the questions and expressed less overconfidence in their answers than did a control group.
Article
Full-text available
The relative susceptibility of individuals and groups to systematic judgmental biases is considered. An overview of the relevant empirical literature reveals no clear or general pattern. However, a theoretical analysis employing J. H. Davis's (1973) social decision scheme (SDS) model reveals that the relative magnitude of individual and group bias depends upon several factors, including group size, initial individual judgment, the magnitude of bias among individuals, the type of bias, and most of all, the group-judgment process. It is concluded that there can be no simple answer to the question, "Which are more biased, individuals or groups?," but the SDS model offers a framework for specifying some of the conditions under which individuals are both more and less biased than groups. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Psychologists have repeatedly shown that a single statistical factor--often called "general intelligence"--emerges from the correlations among people's performance on a wide variety of cognitive tasks. But no one has systematically examined whether a similar kind of "collective intelligence" exists for groups of people. In two studies with 699 people, working in groups of two to five, we find converging evidence of a general collective intelligence factor that explains a group's performance on a wide variety of tasks. This "c factor" is not strongly correlated with the average or maximum individual intelligence of group members but is correlated with the average social sensitivity of group members, the equality in distribution of conversational turn-taking, and the proportion of females in the group.
Article
Full-text available
Herding is a form of convergent social behaviour that can be broadly defined as the alignment of the thoughts or behaviours of individuals in a group (herd) through local interaction and without centralized coordination. We suggest that herding has a broad application, from intellectual fashion to mob violence; and that understanding herding is particularly pertinent in an increasingly interconnected world. An integrated approach to herding is proposed, describing two key issues: mechanisms of transmission of thoughts or behaviour between agents, and patterns of connections between agents. We show how bringing together the diverse, often disconnected, theoretical and methodological approaches illuminates the applicability of herding to many domains of cognition and suggest that cognitive neuroscience offers a novel approach to its study.
Article
Full-text available
Consumer knowledge is seldom complete or errorless. Therefore, the self-assessed validity of knowledge and consequent knowledge calibration (i.e., the correspondence between self-assessed and actual validity) is an important issue for the study of consumer decision making. In this article we describe methods and models used in calibration research. We then review a wide variety of empirical results indicating that high levels of calibration are achieved rarely, moderate levels that include some degree of systematic bias are the norm, and confidence and accuracy are sometimes completely uncorrelated. Finally, we examine the explanations of miscalibration and offer suggestions for future research. Copyright 2000 by the University of Chicago.
Article
Full-text available
When students answer an in-class conceptual question individually using clickers, discuss it with their neighbors, and then revote on the same question, the percentage of correct answers typically increases. This outcome could result from gains in understanding during discussion, or simply from peer influence of knowledgeable students on their neighbors. To distinguish between these alternatives in an undergraduate genetics course, we followed the above exercise with a second, similar (isomorphic) question on the same concept that students answered individually. Our results indicate that peer discussion enhances understanding, even when none of the students in a discussion group originally knows the correct answer.
Article
Full-text available
This article reviews the now extensive research literature addressing the impact of accountability on a wide range of social judgments and choices. It focuses on 4 issues: (a) What impact do various accountability ground rules have on thoughts, feelings, and action? (b) Under what conditions will accountability attenuate, have no effect on, or amplify cognitive biases? (c) Does accountability alter how people think or merely what people say they think? and (d) What goals do accountable decision makers seek to achieve? In addition, this review explores the broader implications of accountability research. It highlights the utility of treating thought as a process of internalized dialogue; the importance of documenting social and institutional boundary conditions on putative cognitive biases; and the potential to craft empirical answers to such applied problems as how to structure accountability relationships in organizations.
Article
Full-text available
The authors present a reconciliation of 3 distinct ways in which the research literature has defined overconfidence: (a) overestimation of one's actual performance, (b) overplacement of one's performance relative to others, and (c) excessive precision in one's beliefs. Experimental evidence shows that reversals of the first 2 (apparent underconfidence), when they occur, tend to be on different types of tasks. On difficult tasks, people overestimate their actual performances but also mistakenly believe that they are worse than others; on easy tasks, people underestimate their actual performances but mistakenly believe they are better than others. The authors offer a straightforward theory that can explain these inconsistencies. Overprecision appears to be more persistent than either of the other 2 types of overconfidence, but its presence reduces the magnitude of both overestimation and overplacement.
Article
Full-text available
This paper introduces a three-item "Cognitive Reflection Test" (CRT) as a simple measure of one type of cognitive ability--the ability or disposition to reflect on a question and resist reporting the first response that comes to mind. The author will show that CRT scores are predictive of the types of choices that feature prominently in tests of decision-making theories, like expected utility theory and prospect theory. Indeed, the relation is sometimes so strong that the preferences themselves effectively function as expressions of cognitive ability--an empirical fact begging for a theoretical explanation. The author examines the relation between CRT scores and two important decision-making characteristics: time preference and risk preference. The CRT scores are then compared with other measures of cognitive ability or cognitive "style." The CRT scores exhibit considerable difference between men and women and the article explores how this relates to sex differences in time and risk preferences. The final section addresses the interpretation of correlations between cognitive abilities and decision-making characteristics.
Article
The wisdom of the crowd refers to the finding that judgments aggregated over individuals are typically more accurate than the average individual’s judgment. Here, we examine the potential for improving crowd judgments by allowing individuals to choose which of a set of queries to respond to. If individuals’ metacognitive assessments of what they know is accurate, allowing individuals to opt in to questions of interest or expertise has the potential to create a more informed knowledge base over which to aggregate. This prediction was confirmed: crowds composed of volunteered judgments were more accurate than crowds composed of forced judgments. Overall, allowing individuals to use private metacognitive knowledge holds much promise in enhancing judgments, including those of the crowd.
Article
We evaluate the effect of discussion on the accuracy of collaborative judgments. In contrast to prior research, we show that discussion can either aid or impede accuracy relative to the averaging of collaborators' independent judgments, as a systematic function of task type and interaction process. For estimation tasks with a wide range of potential estimates, discussion aided accuracy by helping participants prevent and eliminate egregious errors. For estimation tasks with a naturally bounded range, discussion following independent estimates performed on par with averaging. Importantly, if participants did not first make independent estimates, discussion greatly harmed accuracy by limiting the range of considered estimates, independent of task type. Our research shows that discussion can be a powerful tool for error reduction, but only when appropriately structured: Decision makers should form independent judgments to consider a wide range of possible answers, and then use discussion to eliminate extremely large errors.
Article
Once considered provocative, the notion that the wisdom of the crowd is superior to any individual has become itself a piece of crowd wisdom, leading to speculation that online voting may soon put credentialed experts out of business. Recent applications include political and economic forecasting, evaluating nuclear safety, public policy, the quality of chemical probes, and possible responses to a restless volcano. Algorithms for extracting wisdom from the crowd are typically based on a democratic voting procedure. They are simple to apply and preserve the independence of personal judgment. However, democratic methods have serious limitations. They are biased for shallow, lowest common denominator information, at the expense of novel or specialized knowledge that is not widely shared. Adjustments based on measuring confidence do not solve this problem reliably. Here we propose the following alternative to a democratic vote: select the answer that is more popular than people predict. We show that this principle yields the best answer under reasonable assumptions about voter behaviour, while the standard ‘most popular’ or ‘most confident’ principles fail under exactly those same assumptions. Like traditional voting, the principle accepts unique problems, such as panel decisions about scientific or artistic merit, and legal or historical disputes. The potential application domain is thus broader than that covered by machine learning and psychometric methods, which require data across multiple questions.
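The "select the answer that is more popular than people predict" principle described in this abstract can be sketched directly: for each option, compare its actual vote share with the share respondents predicted it would receive, and pick the option whose actual popularity most exceeds the prediction. The following is a minimal illustration, not the authors' implementation; all variable names and the toy data are made up.

```python
# Sketch of the "surprisingly popular" voting principle.
# votes: one chosen option per respondent.
# predicted_shares: one dict per respondent, mapping each option
#   to that respondent's predicted fraction of the crowd choosing it.

def surprisingly_popular(votes, predicted_shares):
    """Return the option whose actual vote share most exceeds
    the crowd's average predicted share for it."""
    options = set(votes)
    n = len(votes)
    best, best_surprise = None, float("-inf")
    for opt in options:
        actual = votes.count(opt) / n
        predicted = sum(p.get(opt, 0.0) for p in predicted_shares) / len(predicted_shares)
        surprise = actual - predicted
        if surprise > best_surprise:
            best, best_surprise = opt, surprise
    return best

# Toy example: "yes" wins the raw vote, but respondents predicted
# "yes" would be even more popular than it actually is, so "no"
# is the surprisingly popular answer.
votes = ["yes", "yes", "yes", "no", "no"]
preds = [{"yes": 0.9, "no": 0.1}] * 5
print(surprisingly_popular(votes, preds))  # -> "no"
```

Note how this departs from a democratic vote: the minority answer wins whenever the majority's own predictions reveal that its popularity was already priced in.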
Article
A ubiquitous finding in research on human judgment is that people are overconfident about their true predictive abilities. The goal of this study was to understand why overconfidence arises and how it can be reduced to improve the accuracy of predictions about future personal events. Subjects made predictions about the results of their job search efforts 9 months away (e.g., starting salary); all of the events involved positive outcomes, where unrealistic optimism was expected. These events were constructed to vary in their underlying base rate of occurrence. Some subjects generated pro and/or con reasons concerning event occurrence before making their predictions. At low to moderate base rates, predictive accuracy increased when subjects generated a con reason. However, at high base rates (events that occurred for a majority of the subjects), con reason generation had no effect on accuracy; all subjects were more accurate in predicting these events. Generation of pro reasons had no effect on accuracy, suggesting that subjects may have automatically generated supportive reasons as a by-product of the question-answering process. A substantive analysis of the reasons indicated that subjects attributed pro reasons to internal factors and con reasons to external factors. Moreover, subjects who generated internal pro reasons were less accurate than subjects generating external pro or either type of con reason.
Article
Although researchers have documented many instances of crowd wisdom, it is important to know whether some kinds of judgments may lead the crowd astray, whether crowds’ judgments improve with feedback over time, and whether crowds’ judgments can be improved by changing the way judgments are elicited. We investigated these questions in a sports gambling context (predictions against point spreads) believed to elicit crowd wisdom. In a season-long experiment, fans wagered over $20,000 on NFL football predictions. Contrary to the wisdom-of-crowds hypothesis, faulty intuitions led the crowd to predict “favorites” more than “underdogs” against point spreads that disadvantaged favorites, even when bettors knew that the spreads disadvantaged favorites. Moreover, the bias increased over time, a result consistent with attributions for success and failure that rewarded intuitive choosing. However, when the crowd predicted game outcomes by estimating point differentials rather than by predicting against point spreads, its predictions were unbiased and wiser.
Article
This research tests the hypothesis of Yates et al. (1996) that people prefer judgment producers who make extreme confidence judgments. In each of three experiments, college students evaluated two fictional financial advisors who judged the likelihood that each of several stocks would increase in value. One of the advisors (the moderate advisor) was reasonably well calibrated and the other (the extreme advisor) was overconfident. In all three experiments, participants tended to prefer the extreme advisor. Experiments 2 and 3 showed that the advisors' confidence influenced participants' perception of their knowledge, and Experiment 3 showed that it influenced their perception of the number of categorically correct judgments they made. Both of these variables were, in turn, related to participants' preferences. Experiment 3 also suggested that need for cognition and right-wing authoritarianism are positively related to preference for the extreme advisor. A quantitative model is presented, which captures the basic pattern of results. This model includes the assumption that people use a confidence heuristic; they assume that a more confident advisor makes more categorically correct judgments and is more knowledgeable.
Article
Decision makers have a strong tendency to consider problems as unique. They isolate the current choice from future opportunities and neglect the statistics of the past in evaluating current plans. Overly cautious attitudes to risk result from a failure to appreciate the effects of statistical aggregation in mitigating relative risk. Overly optimistic forecasts result from the adoption of an inside view of the problem, which anchors predictions on plans and scenarios. The conflicting biases are documented in psychological research. Possible implications for decision making in organizations are examined.
Article
The subjective confidence of individuals in groups can be a valid predictor of accuracy in decision-making tasks.
Article
We present three studies of interactive decision making, where decision makers interact with others before making a final decision alone. Because the theories of lay observers and social psychologists emphasize the role of information collection in interaction, we developed a series of tests of information collection. Two studies with sports predictions show that interaction does not increase decision accuracy or meta-knowledge (calibration or resolution). The simplest test of information collection is responsiveness - that people should respond to information against their position by modifying their choices or at least lowering their confidence. Studies using traditional scenarios from the group polarization literature show little responsiveness, and even "deviants," who interact with others who unanimously disagree with their choice, frequently fail to respond to the information they collect. The most consistent finding is that interaction increases people's confidence in their decisions in both sports predictions and risky shift dilemmas. For predictions, confidence increases are not justified by increased accuracy. These results question theories of interaction which assume that people collect information during interaction (e.g., Persuasive Arguments Theory). They also question the labeling of previous results as "shifts" or "polarization." We suggest that interaction is better understood as rationale construction than as information collection - interaction forces people to explain their choices to others, and a variety of previous research in social psychology has shown that explanation generation leads to increased confidence. In Study 3, we provide a preliminary test of rationale construction by showing that people increase in confidence when they construct a case for their position individually, without interaction.
Article
Considerable literature has accumulated over the years regarding the combination of forecasts. The primary conclusion of this line of research is that forecast accuracy can be substantially improved through the combination of multiple individual forecasts. Furthermore, simple combination methods often work reasonably well relative to more complex combinations. This paper provides a review and annotated bibliography of that literature, including contributions from the forecasting, psychology, statistics, and management science literatures. The objectives are to provide a guide to the literature for students and researchers and to help researchers locate contributions in specific areas, both theoretical and applied. Suggestions for future research directions include (1) examination of simple combining approaches to determine reasons for their robustness, (2) development of alternative uses of multiple forecasts in order to make better use of the information they contain, (3) use of combined forecasts as benchmarks for forecast evaluation, and (4) study of subjective combination procedures. Finally, combining forecasts should become part of the mainstream of forecasting practice. In order to achieve this, practitioners should be encouraged to combine forecasts, and software to produce combined forecasts easily should be made available.
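The central claim of the forecast-combination literature summarized above, that a simple equal-weight average of forecasts often beats the typical individual, can be shown in a few lines. This is a toy sketch with fabricated numbers, not code from the paper.

```python
# Equal-weight forecast combination: the simplest method the
# combining-forecasts literature finds surprisingly robust.

def combine(forecasts):
    """Unweighted average of a list of point forecasts."""
    return sum(forecasts) / len(forecasts)

# Fabricated example: four forecasters whose individual errors
# partly cancel when averaged.
truth = 100.0
forecasts = [80.0, 95.0, 120.0, 105.0]

combined_error = abs(combine(forecasts) - truth)
mean_individual_error = sum(abs(f - truth) for f in forecasts) / len(forecasts)

print(combined_error, mean_individual_error)  # 0.0 vs 12.5
```

The cancellation is the whole mechanism: individual errors with opposite signs offset one another in the average, so the combined forecast can never be worse than the worst individual and is typically better than the average individual.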
Article
Studied 4-member decision-making groups given information about 3 hypothetical candidates for student body president in unshared/consensus or shared or unshared/conflict conditions. 84 undergraduates participated in the unshared consensus condition, and 72 undergraduates participated in the other conditions. Results show that even though groups could have produced unbiased composites of the candidates through discussion, they decided in favor of the candidate initially preferred by a plurality rather than the most favorable candidate. Group members' pre- and postdiscussion recall of candidate attributes indicated that discussion tended to perpetuate, not to correct, members' distorted pictures of the candidates. It is suggested that unstructured discussion in the face of a consensus requirement may fail as a means of combining unique informational resources.
Article
This study investigates the relation between an individual's self-reported confidence and his or her influence within a freely interacting group. Each participant chose responses and provided confidence assessments for choice items of a variety of task types, first as an individual and a second time as a member of a pentad, a member of a dyad, or an individual. The influence of a particular faction within a group was greater if its members were more confident. A participant's response accuracy was related to both greater confidence and greater influence to the extent that the task fell on the intellective end of the intellective-judgmental continuum of task types. As a result, the extent to which group members' confidence predicted their influence was also greatest on intellective rather than judgmental tasks. Results further illustrate that adding group members to work on a problem may increase overconfidence on judgmental tasks but decrease overconfidence on intellective tasks.
Article
We introduce a general framework for modeling functionally diverse problem-solving agents. In this framework, problem-solving agents possess representations of problems and algorithms that they use to locate solutions. We use this framework to establish a result relevant to group composition. We find that when selecting a problem-solving team from a diverse population of intelligent agents, a team of randomly selected agents outperforms a team comprised of the best-performing agents. This result relies on the intuition that, as the initial pool of problem solvers becomes large, the best-performing agents necessarily become similar in the space of problem solvers. Their relatively greater ability is more than offset by their lack of problem-solving diversity.
Article
Researchers often conduct mediation analysis in order to indirectly assess the effect of a proposed cause on some outcome through a proposed mediator. The utility of mediation analysis stems from its ability to go beyond the merely descriptive to a more functional understanding of the relationships among variables. A necessary component of mediation is a statistically and practically significant indirect effect. Although mediation hypotheses are frequently explored in psychological research, formal significance tests of indirect effects are rarely conducted. After a brief overview of mediation, we argue the importance of directly testing the significance of indirect effects and provide SPSS and SAS macros that facilitate estimation of the indirect effect with a normal theory approach and a bootstrap approach to obtaining confidence intervals, as well as the traditional approach advocated by Baron and Kenny (1986). We hope that this discussion and the macros will enhance the frequency of formal mediation tests in the psychology literature. Electronic copies of these macros may be downloaded from the Psychonomic Society's Web archive at www.psychonomic.org/archive/.
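The bootstrap approach to testing an indirect effect that this abstract describes can be sketched for simple mediation (X → M → Y): estimate a (slope of M on X) and b (coefficient on M when Y is regressed on M and X), resample cases with replacement, and take percentiles of the resampled a*b values as a confidence interval. This is an illustrative pure-Python sketch on simulated data, not the SPSS/SAS macros themselves.

```python
# Percentile-bootstrap CI for the indirect effect a*b in
# simple mediation (X -> M -> Y). Pure-Python OLS; data simulated.
import random

def slope_simple(x, y):
    """OLS slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def slope_partial(m, x, y):
    """Coefficient on m when y is regressed on m and x (normal equations)."""
    mm, mx_, my = sum(m) / len(m), sum(x) / len(x), sum(y) / len(y)
    smm = sum((a - mm) ** 2 for a in m)
    sxx = sum((a - mx_) ** 2 for a in x)
    smx = sum((a - mm) * (b - mx_) for a, b in zip(m, x))
    smy = sum((a - mm) * (b - my) for a, b in zip(m, y))
    sxy = sum((a - mx_) * (b - my) for a, b in zip(x, y))
    return (smy * sxx - sxy * smx) / (smm * sxx - smx ** 2)

def indirect(x, m, y):
    """Indirect effect a*b."""
    return slope_simple(x, m) * slope_partial(m, x, y)

def bootstrap_ci(x, m, y, reps=1000, alpha=0.05, seed=1):
    """Percentile bootstrap CI by resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(indirect([x[i] for i in idx],
                              [m[i] for i in idx],
                              [y[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * reps)], stats[int((1 - alpha / 2) * reps) - 1]

# Simulated data in which X raises M and M raises Y.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * v + rng.gauss(0, 1) for v in x]
y = [0.4 * w + rng.gauss(0, 1) for w in m]
lo, hi = bootstrap_ci(x, m, y)
print(lo, hi)  # percentile CI for the indirect effect
```

An interval that excludes zero is the bootstrap analogue of a significant indirect effect; unlike the normal-theory (Sobel) test, it makes no symmetry assumption about the sampling distribution of a*b.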
Article
This paper provides a survey on studies that analyze the macroeconomic effects of intellectual property rights (IPR). The first part of this paper introduces different patent policy instruments and reviews their effects on R&D and economic growth. This part also discusses the distortionary effects and distributional consequences of IPR protection as well as empirical evidence on the effects of patent rights. Then, the second part considers the international aspects of IPR protection. In summary, this paper draws the following conclusions from the literature. Firstly, different patent policy instruments have different effects on R&D and growth. Secondly, there is empirical evidence supporting a positive relationship between IPR protection and innovation, but the evidence is stronger for developed countries than for developing countries. Thirdly, the optimal level of IPR protection should trade off the social benefits of enhanced innovation against the social costs of multiple distortions and income inequality. Finally, in an open economy, achieving the globally optimal level of protection requires an international coordination (rather than the harmonization) of IPR protection.
Article
This paper explores the consequences of cognitive dissonance, coupled with time-inconsistent preferences, in an intertemporal decision problem with two distinct goals: acting decisively on early information (vision) and adjusting flexibly to late information (flexibility). The decision maker considered here is capable of manipulating information to serve her self-interests, but a tradeoff between distorted beliefs and distorted actions constrains the extent of information manipulation. Building on this tradeoff, the present model provides a unified framework to account for the conformity bias (excessive reliance on precedents) and the confirmatory bias (excessive attachment to initial perceptions).
Article
Interacting groups fail to make judgments as accurate as those of their most capable members due to problems associated with both interaction processes and cognitive processing. Group process techniques and decision analytic tools have been used with groups to combat these problems. While such techniques and tools do improve the quality of group judgment, they have not enabled groups to make judgments more accurate than those of their most capable members on tasks that evoke a great deal of systematic bias. A new intervention procedure that integrates group facilitation, social judgment analysis, and information technology was developed to overcome more fully the problems typically associated with interaction processes and cognitive processing. The intervention was evaluated by testing the hypothesis that groups using this new procedure can establish judgment policies for cognitive conflict tasks that are more accurate than the ones produced by any of their members. An experiment involving 16 four- and five-member groups was conducted to compare the accuracy of group judgments with the accuracy of the judgments of the most capable group member. A total of 96 participants (48 males and 48 females) completed the individual part of the task; 71 of these participants worked in groups. Results indicated that the process intervention enabled small, interacting groups to perform significantly better than their most capable members on two cognitive conflict tasks (p < .05). The findings suggest that Group Decision Support Systems that integrate facilitation, social judgment analysis, and information technology should be used to improve the accuracy of group judgment.
Self-serving beliefs and the pleasure of outcomes
  • B Mellers
  • A P Mcgraw
Mellers, B., & McGraw, A. P. (2004). Self-serving beliefs and the pleasure of outcomes. The Psychology of Economic Decisions, 2, 31.
Bias in judgment: Comparing individuals and groups
  • Kerr
Cheap talk and credibility: The consequences of confidence and accuracy on advisor credibility and persuasiveness
  • Sah
Intuitive biases in choice versus estimation: Implications for the wisdom of crowds
  • Simmons