Pavel D. Atanasov

Verified
Pavel verified their affiliation via an institutional email.
IE University

PhD, Psychology & Decision Science, UPenn
Assistant Professor at IE Business School, Co-PI of Human Forest project

About

60
Publications
30,248
Reads
1,331
Citations
Introduction
Human Forest, clinical trial forecasting, crowdsourcing, identifying predictive skill, belief updating, forecast aggregation, human-machine hybrids, prediction markets.
Additional affiliations
Pytho LLC
Position
  • Co-founder
July 2012 - July 2015
University of Pennsylvania
Position
  • PostDoc Position

Publications (60)
Article
Full-text available
We report the results of the first large-scale, long-term, experimental test between two crowdsourcing methods: prediction markets and prediction polls. More than 2,400 participants made forecasts on 261 events over two seasons of a geopolitical prediction tournament. Forecasters were randomly assigned to either prediction markets (continuous doubl...
Conference Paper
Full-text available
Proper scoring rules can be used to incentivize a forecaster to truthfully report her private beliefs about the probabilities of future events and to evaluate the relative accuracy of forecasters. While standard scoring rules can score forecasts only once the associated events have been resolved, many applications would benefit from instant access...
Preprint
Full-text available
Laboratory research has shown that both underreaction and overreaction to new information pose threats to forecasting accuracy. This article explores how real-world forecasters who vary in skill attempt to balance these threats. We distinguish among three aspects of updating: frequency, magnitude, and confirmation propensity. Drawing on data from a...
Preprint
Full-text available
How do we effectively combine historical data and human insights to predict complex outcomes? How well do human crowds compete with predictive algorithms? We provide the first description of the Human Forest method, which enables forecasters to define custom reference classes, query a historical database and review base rates specific to their sele...
Preprint
Full-text available
What systems should we use to elicit and aggregate judgmental forecasts? Who should be asked to make such forecasts? We address these questions by assessing two widely-used crowd prediction systems: prediction markets and prediction polls. Our main test compares a prediction market against team-based prediction polls, using data from a large, multi...
Preprint
Full-text available
High-stakes debates often pivot on clashing estimates of outcomes that one side sees as so improbable as not to deserve policy prioritization. These debates are especially intractable when they focus on rare events ranging from disasters (e.g., existential risks from Artificial Intelligence, nuclear war, or bioengineered pandemics) to surprising su...
Preprint
Full-text available
Sound decision-making relies on accurate prediction for tangible outcomes ranging from military conflict to disease outbreaks. To improve crowdsourced forecasting accuracy, we developed SAGE, a hybrid forecasting system that combines human and machine generated forecasts. The system provides a platform where users can interact with machine models a...
Preprint
Full-text available
High-stakes debates often pivot on clashing estimates of outcomes that one side sees as so improbable as not to deserve policy prioritization. These debates are especially intractable when they focus on rare events ranging from disasters (e.g., existential risks from Artificial Intelligence, nuclear war, or bioengineered pandemics) to surprising su...
Article
Full-text available
Gender discrimination is present across various fields, but identifying the underlying mechanism is challenging. We demonstrate own-gender favouritism in a field setting that allows for clean identification of tastes versus beliefs: the One Bid game on the TV show The Price Is Right. Players must guess an item’s value without exceeding it, leaving...
Chapter
Who is good at prediction? Addressing this question is key to recruiting and cultivating accurate crowds and effectively aggregating their judgments. Recent research on superforecasting has demonstrated the importance of individual, persistent skill in crowd prediction. This chapter takes stock of skill identification measures in probability estima...
Article
Full-text available
Sound decision‐making relies on accurate prediction for tangible outcomes ranging from military conflict to disease outbreaks. To improve crowdsourced forecasting accuracy, we developed SAGE, a hybrid forecasting system that combines human and machine generated forecasts. The system provides a platform where users can interact with machine models a...
Article
Full-text available
Psychologists typically measure beliefs and preferences using self-reports, whereas economists are much more likely to infer them from behavior. Prediction markets appear to be a victory for the economic approach, having yielded more accurate probability estimates than opinion polls or experts for a wide variety of events, all without ever asking f...
Preprint
Full-text available
Who is good at prediction? Addressing this question is key to recruiting and cultivating accurate crowds and effectively aggregating their judgments. Recent research on superforecasting has demonstrated the importance of individual, persistent skill in crowd prediction. This chapter takes stock of skill identification measures in probability estima...
Preprint
Little is known about the extent to which medical expert communities can anticipate the outcomes of clinical trials. In this study, we collected 33 expert probability distribution forecasts for an ongoing precision medicine cancer trial (NSABP-B47 or NCT01275677) on the primary outcome (incidence of disease free survival) in study and comparator ar...
Article
Full-text available
Forecasting tournaments are misaligned with the goal of producing actionable forecasts of existential risk, an extreme-stakes domain with slow accuracy feedback and elusive proxies for long-run outcomes. We show how to improve alignment by measuring facets of human judgment that play central roles in policy debates but have long been dismissed as un...
Data
Supplementary analyses mentioned in Atanasov et al. (2020).
Article
Full-text available
A growing body of research indicates that forecasting skill is a unique and stable trait: forecasters with a track record of high accuracy tend to maintain this record. But how does one identify skilled forecasters effectively? We address this question using data collected during two seasons of a longitudinal geopolitical forecasting tournament. Ou...
Article
Full-text available
Laboratory research has shown that both underreaction and overreaction to new information pose threats to forecasting accuracy. This article explores how real-world forecasters who vary in skill attempt to balance these threats. We distinguish among three aspects of updating: frequency, magnitude, and confirmation propensity. Drawing on data from a...
Article
Objective: To explore the accuracy of combined neurology expert forecasts in predicting primary endpoints for trials. Methods: We identified one major randomized trial each in stroke, multiple sclerosis (MS), and amyotrophic lateral sclerosis (ALS) that was closing within 6 months. After recruiting a sample of neurology experts for each disease, we...
Preprint
Full-text available
Forecasting the future is a notoriously difficult task. To overcome this challenge, state-of-the-art forecasting platforms are "hybridized": they gather forecasts from a crowd of humans, as well as one or more machine models. However, an open challenge remains in how to optimally combine forecasts from these pools into a single forecast. We propose...
Conference Paper
Full-text available
Forecasting of geopolitical events is a notoriously difficult task, with experts failing to significantly outperform a random baseline across many types of forecasting events. One successful way to increase the performance of forecasting tasks is to turn to crowdsourcing: leveraging many forecasts from non-expert users. Simultaneously, advances in...
Article
Full-text available
Psychologists typically measure beliefs and preferences using self-reports, whereas economists are much more likely to infer them from behavior. Prediction markets appear to be a victory for the economic approach, having yielded more accurate probability estimates than opinion polls or experts for a wide variety of events, all without ever asking f...
Presentation
Full-text available
Preparatory document for working paper and conference presentation on IARPA HFC RCT-A training concept, development, expected effects and protocol results.
Presentation
Full-text available
Preparatory document for working paper and conference presentation on IARPA HFC RCT-A training protocol results, causality implications and possible selection biases at work.
Article
Full-text available
Accountability pressures are a ubiquitous feature of social systems: virtually everyone must answer to someone for something. Behavioral research has, however, warned that accountability, specifically a focus on being responsible for outcomes, tends to produce suboptimal judgments. We qualify this view by demonstrating the long-term adaptive benefi...
Article
e21011 Background: First line (1L) systemic combination (combo) therapies for treatment of metastatic melanoma (MM) include targeted combo therapies such as dabrafenib+trametinib (D+T) or vemurafenib+cobimetinib for patients (pts) with BRAF mutation (BRAF+), or immunotherapy combo ipilimumab+nivolumab (I+N) for pts irrespective of BRAF status. The...
Article
e21003 Background: Guidelines for metastatic melanoma (MM) recommend targeted therapy combination (combo) for patients (pts) with BRAF mutation (BRAF+) and immunotherapy for pts irrespective of BRAF status. The study objective was to describe real world characteristics and treatment patterns among MM pts treated with either dabrafenib+trametinib (D...
Article
We report the results of the first large-scale, long-term, experimental test between two crowdsourcing methods: prediction markets and prediction polls. More than 2,400 participants made forecasts on 261 events over two seasons of a geopolitical prediction tournament. Forecasters were randomly assigned to either prediction markets (continuous doubl...
Article
Full-text available
Proper scoring rules can be used to incentivize a forecaster to truthfully report her private beliefs about the probabilities of future events and to evaluate the relative accuracy of forecasters. While standard scoring rules can score forecasts only once the associated events have been resolved, many applications would benefit from instant access...
Article
Individuals often make decisions that affect groups, yet the propensities of group representatives are not as well understood as those of independent decision makers or deliberating groups. We ask how responsibility for group payoffs − in the absence of group deliberation − affects the choice. The experiment utilizes the Interdependent Security D...
Article
Full-text available
This article extends psychological methods and concepts into a domain that is as profoundly consequential as it is poorly understood: intelligence analysis. We report findings from a geopolitical forecasting tournament that assessed the accuracy of more than 150,000 forecasts of 743 participants on 199 events occurring over 2 years. Participants we...
Article
Full-text available
We introduce a new method for converting individual probability estimates (obtained through surveys) into market orders for use in a Continuous Double Auction prediction market. Our Survey-Powered Market Agent (SPMA) algorithm is based on actual forecaster behavior, and offers notable advantages over existing market agent algorithms such as Zero In...
Article
Full-text available
What are the barriers to voluntary take-up of high-deductible plans? We address this question using a large-scale employer survey conducted after an open-enrollment period in which a new high-deductible plan was first introduced. Only 3% of the employees chose this plan, despite the respondents’ recognition of its financial advantages. Employees wh...
Article
Full-text available
The CAD triad hypothesis (Rozin, Lowery, Imada, & Haidt, 1999) stipulates that, cross-culturally, people feel anger for violations of autonomy, contempt for violations of community, and disgust for violations of divinity. Although the disgust-divinity link has received some measure of empirical support, the results have been difficult to interpret...
Article
Full-text available
Five university-based research groups competed to recruit forecasters, elicit their predictions, and aggregate those predictions to assign the most accurate probabilities to events in a 2-year geopolitical forecasting tournament. Our group tested and found support for three psychological drivers of accuracy: training, teaming, and tracking. Probabi...
Article
Hypothetical choice studies suggest that physicians often take more risk for themselves than on their patient's behalf. To examine if physicians recommend more screening tests than they personally undergo in the real-world context of breast cancer screening. Within-subjects survey. A national sample of female obstetricians and gynecologists (N = 13...
Conference Paper
Full-text available
We describe a hybrid forecasting method called marketcast. Marketcasts are based on bid and ask orders from prediction markets, aggregated using techniques associated with survey methods, rather than market matching algorithms. We discuss the process of conversion from market orders to probability estimates, and simple aggregation methods. The perf...
Article
Claims of taste based discrimination are common but difficult to prove in the field. Furthermore, much of the research on discrimination focuses on evaluation. However, discriminatory patterns of competition, such as use of aggressive strategies based on opponents' gender, may produce similar discriminatory outcomes. We report evidence for discrimi...
Article
Full-text available
The hypothesis that psychometric instruments incorporating local idioms of distress predict functional impairment in a non-Western, war-affected population above and beyond translations of already established instruments was tested. Exploratory factor analysis was conducted on the War-Related Psychological and Behavioral Problems section of the Pen...
Article
Are we more inclined to take risks for ourselves than on someone else's behalf? Risk taking for self and others were compared across four studies. Study 1 was a meta-analysis of 28 effects from 18 studies. Overall, choices for others were significantly more risk-averse than choices for self. Two features of the choice environment moderated these ef...
Article
We examined the effects of framing and perceived vulnerability on dishonest behavior in competitive environments. Participants were randomly matched into pairs and took a short multiple-choice test, the relative score of which determined their merit-based payoffs. After learning about the test scores, participants were asked to report them, thus af...
Article
Full-text available
The majority of research in conflict management focuses on conflict resolution: the process of reaching a mutually beneficial solution for the negotiating parties. However, some negotiations impart substantial negative externalities onto third parties, so "conflict resolution" is socially suboptimal. Bribery is one such example: potential bribe-giv...
Article
Despite the variance in methods and results in the literature, the current review finds reliable evidence that choices for others tend to be more risk-averse than choices for the self. I term the tendency to avoid risks for others more than for one’s self double risk aversion. The term correctly implies that there is an increase in aversion to risk...
Article
To compare total costs and risk of hypoglycemia in patients with type 2 diabetes (T2D) initiated on NPH insulin versus glargine in a real-world setting. This study used claims data (10/2001 to 06/2005) from a privately insured U.S. population of adult T2D patients who were initiated on NPH or glargine following a 6-month insulin-free period. A samp...
Article
Compare treatment patterns for patients with schizophrenia treated with olanzapine versus quetiapine in the Pennsylvania Medicaid population. Patients (18-64 years) with a diagnosis of schizophrenia (ICD-9-CM: 295.xx) and treated with olanzapine or quetiapine were identified from the Pennsylvania Medicaid claims database (1999-2003). Patients were...
Article
Compare annual health-care costs and resource utilization associated with olanzapine versus quetiapine for treating schizophrenia in a Medicaid population. Adult schizophrenia patients were selected from deidentified Pennsylvania Medicaid claims database (1999–2003). Included patients were continuously enrolled and initiated with olanzapine or quet...
Article
Objective: To determine and compare the cost utilities of the tumour necrosis factor (TNF) antagonists adalimumab and infliximab as maintenance therapies for patients in the US with moderately to severely active Crohn's disease. Methods: Maintenance regimens of adalimumab (40 mg every other week) and infliximab (5 mg/kg) were compared using prim...
