Pavel D. Atanasov
Pytho LLC · Decision Science & Prediction

PhD, Psychology & Decision Science, UPenn
Co-PI of Human Forest project, Co-Founder of Pytho LLC

About

53
Publications
24,306
Reads
908
Citations
Citations since 2017
29 Research Items
725 Citations
Introduction
Human Forest, clinical trial forecasting, crowdsourcing, identifying predictive skill, belief updating, forecast aggregation, human-machine hybrids, prediction markets.
Additional affiliations
July 2012 - July 2015
University of Pennsylvania
Position
  • PostDoc Position

Publications (53)
Article
Full-text available
Psychologists typically measure beliefs and preferences using self-reports, whereas economists are much more likely to infer them from behavior. Prediction markets appear to be a victory for the economic approach, having yielded more accurate probability estimates than opinion polls or experts for a wide variety of events, all without ever asking f...
Preprint
Full-text available
How do we effectively combine historical data and human insights to predict complex outcomes? How well do human crowds compete with predictive algorithms? We provide the first description of the Human Forest method, which enables forecasters to define custom reference classes, query a historical database and review base rates specific to their sele...
Preprint
Full-text available
Problem definition: Accurate forecasts are a key ingredient of effective operations. In fast-changing environments with little historical data, organizations rely on judgmental forecasts for decision making. But how should these forecasts be elicited and aggregated, and who should be asked to provide these forecasts in the first place? Academic/pra...
Preprint
Full-text available
Who is good at prediction? Addressing this question is key to recruiting and cultivating accurate crowds and effectively aggregating their judgments. Recent research on superforecasting has demonstrated the importance of individual, persistent skill in crowd prediction. This chapter takes stock of skill identification measures in probability estima...
Preprint
Little is known about the extent to which medical expert communities can anticipate the outcomes of clinical trials. In this study, we collected 33 expert probability distribution forecasts for an ongoing precision medicine cancer trial (NSABP-B47 or NCT01275677) on the primary outcome (incidence of disease free survival) in study and comparator ar...
Article
Full-text available
Forecasting tournaments are misaligned with the goal of producing actionable forecasts of existential risk, an extreme-stakes domain with slow accuracy feedback and elusive proxies for long-run outcomes. We show how to improve alignment by measuring facets of human judgment that play central roles in policy debates but have long been dismissed as un...
Preprint
Full-text available
Evidence of gender discrimination has been found in a variety of contexts, but identifying the mechanism of discrimination is notoriously difficult. Discriminatory behavior could reflect a preference for treating one group differently than another, or it could reflect beliefs about average group differences. We identify a preference for costly gend...
Data
Supplementary analyses mentioned in Atanasov et al. (2020).
Article
Full-text available
A growing body of research indicates that forecasting skill is a unique and stable trait: forecasters with a track record of high accuracy tend to maintain this record. But how does one identify skilled forecasters effectively? We address this question using data collected during two seasons of a longitudinal geopolitical forecasting tournament. Ou...
Preprint
Full-text available
Laboratory research has shown that both underreaction and overreaction to new information pose threats to forecasting accuracy. This article explores how real-world forecasters who vary in skill attempt to balance these threats. We distinguish among three aspects of updating: frequency, magnitude, and confirmation propensity. Drawing on data from a...
Article
Full-text available
Laboratory research has shown that both underreaction and overreaction to new information pose threats to forecasting accuracy. This article explores how real-world forecasters who vary in skill attempt to balance these threats. We distinguish among three aspects of updating: frequency, magnitude, and confirmation propensity. Drawing on data from a...
Article
Objective To explore the accuracy of combined neurology expert forecasts in predicting primary endpoints for trials. Methods We identified one major randomized trial each in stroke, multiple sclerosis (MS), and amyotrophic lateral sclerosis (ALS) that was closing within 6 months. After recruiting a sample of neurology experts for each disease, we...
Preprint
Full-text available
Forecasting the future is a notoriously difficult task. To overcome this challenge, state-of-the-art forecasting platforms are "hybridized": they gather forecasts from a crowd of humans, as well as one or more machine models. However, an open challenge remains in how to optimally combine forecasts from these pools into a single forecast. We propose...
Conference Paper
Full-text available
Forecasting of geopolitical events is a notoriously difficult task, with experts failing to significantly outperform a random baseline across many types of forecasting events. One successful way to increase the performance of forecasting tasks is to turn to crowdsourcing: leveraging many forecasts from non-expert users. Simultaneously, advances in...
Article
Full-text available
Psychologists typically measure beliefs and preferences using self-reports, whereas economists are much more likely to infer them from behavior. Prediction markets appear to be a victory for the economic approach, having yielded more accurate probability estimates than opinion polls or experts for a wide variety of events, all without ever asking f...
Presentation
Full-text available
Preparatory document for working paper and conference presentation on IARPA HFC RCT-A training concept, development, expected effects and protocol results.
Presentation
Full-text available
Preparatory document for working paper and conference presentation on IARPA HFC RCT-A training protocol results, causality implications and possible selection biases at work.
Article
Full-text available
Accountability pressures are a ubiquitous feature of social systems: virtually everyone must answer to someone for something. Behavioral research has, however, warned that accountability, specifically a focus on being responsible for outcomes, tends to produce suboptimal judgments. We qualify this view by demonstrating the long-term adaptive benefi...
Article
e21011 Background: First line (1L) systemic combination (combo) therapies for treatment of metastatic melanoma (MM) include targeted combo therapies such as dabrafenib+trametinib (D+T) or vemurafenib+cobimetinib for patients (pts) with BRAF mutation (BRAF+), or immunotherapy combo ipilimumab+nivolumab (I+N) for pts irrespective of BRAF status. The...
Article
e21003 Background: Guidelines for metastatic melanoma (MM) recommend targeted therapy combination (combo) for patients (pts) with BRAF mutation (BRAF+) and immunotherapy for pts irrespective of BRAF status. The study objective was to describe real world characteristics and treatment patterns among MM pts treated with either dabrafenib+trametinib (D...
Article
Full-text available
We report the results of the first large-scale, long-term, experimental test between two crowdsourcing methods: prediction markets and prediction polls. More than 2,400 participants made forecasts on 261 events over two seasons of a geopolitical prediction tournament. Forecasters were randomly assigned to either prediction markets (continuous doubl...
Article
We report the results of the first large-scale, long-term, experimental test between two crowdsourcing methods: prediction markets and prediction polls. More than 2,400 participants made forecasts on 261 events over two seasons of a geopolitical prediction tournament. Forecasters were randomly assigned to either prediction markets (continuous doubl...
Article
Full-text available
Proper scoring rules can be used to incentivize a forecaster to truthfully report her private beliefs about the probabilities of future events and to evaluate the relative accuracy of forecasters. While standard scoring rules can score forecasts only once the associated events have been resolved, many applications would benefit from instant access...
Conference Paper
Full-text available
Proper scoring rules can be used to incentivize a forecaster to truthfully report her private beliefs about the probabilities of future events and to evaluate the relative accuracy of forecasters. While standard scoring rules can score forecasts only once the associated events have been resolved, many applications would benefit from instant access...
Article
Individuals often make decisions that affect groups, yet the propensities of group representatives are not as well understood as those of independent decision makers or deliberating groups. We ask how responsibility for group payoffs − in the absence of group deliberation − affects the choice. The experiment utilizes the Interdependent Security D...
Article
Full-text available
This article extends psychological methods and concepts into a domain that is as profoundly consequential as it is poorly understood: intelligence analysis. We report findings from a geopolitical forecasting tournament that assessed the accuracy of more than 150,000 forecasts of 743 participants on 199 events occurring over 2 years. Participants we...
Article
Full-text available
The CAD triad hypothesis (Rozin, Lowery, Imada, & Haidt, 1999) stipulates that, cross-culturally, people feel anger for violations of autonomy, contempt for violations of community, and disgust for violations of divinity. Although the disgust-divinity link has received some measure of empirical support, the results have been difficult to interpret...
Article
Full-text available
We introduce a new method for converting individual probability estimates (obtained through surveys) into market orders for use in a Continuous Double Auction prediction market. Our Survey-Powered Market Agent (SPMA) algorithm is based on actual forecaster behavior, and offers notable advantages over existing market agent algorithms such as Zero In...
Article
Full-text available
What are the barriers to voluntary take-up of high-deductible plans? We address this question using a large-scale employer survey conducted after an open-enrollment period in which a new high-deductible plan was first introduced. Only 3% of the employees chose this plan, despite the respondents’ recognition of its financial advantages. Employees wh...
Article
Full-text available
Five university-based research groups competed to recruit forecasters, elicit their predictions, and aggregate those predictions to assign the most accurate probabilities to events in a 2-year geopolitical forecasting tournament. Our group tested and found support for three psychological drivers of accuracy: training, teaming, and tracking. Probabi...
Article
Hypothetical choice studies suggest that physicians often take more risk for themselves than on their patient's behalf. To examine if physicians recommend more screening tests than they personally undergo in the real-world context of breast cancer screening. Within-subjects survey. A national sample of female obstetricians and gynecologists (N = 13...
Conference Paper
Full-text available
We describe a hybrid forecasting method called marketcast. Marketcasts are based on bid and ask orders from prediction markets, aggregated using techniques associated with survey methods, rather than market matching algorithms. We discuss the process of conversion from market orders to probability estimates, and simple aggregation methods. The perf...
Article
Claims of taste based discrimination are common but difficult to prove in the field. Furthermore, much of the research on discrimination focuses on evaluation. However, discriminatory patterns of competition, such as use of aggressive strategies based on opponents' gender, may produce similar discriminatory outcomes. We report evidence for discrimi...
Article
Full-text available
The hypothesis that psychometric instruments incorporating local idioms of distress predict functional impairment in a non-Western, war-affected population above and beyond translations of already established instruments was tested. Exploratory factor analysis was conducted on the War-Related Psychological and Behavioral Problems section of the Pen...
Article
Are we more inclined to take risks for ourselves than on someone else's behalf? Risk taking for self and others were compared across four studies. Study 1 was a meta-analysis of 28 effects from 18 studies. Overall, choices for others were significantly more risk-averse than choices for self. Two features of the choice environment moderated these ef...
Article
We examined the effects of framing and perceived vulnerability on dishonest behavior in competitive environments. Participants were randomly matched into pairs and took a short multiple-choice test, the relative score of which determined their merit-based payoffs. After learning about the test scores, participants were asked to report them, thus af...
Article
Full-text available
The majority of research in conflict management focuses on conflict resolution: the process of reaching a mutually beneficial solution for the negotiating parties. However, some negotiations impart substantial negative externalities onto third parties, so "conflict resolution" is socially suboptimal. Bribery is one such example: potential bribe-giv...
Article
Despite the variance in methods and results in the literature, the current review finds reliable evidence that choices for others tend to be more risk-averse than choices for the self. I term the tendency to avoid risks for others more than for one’s self double risk aversion. The term correctly implies that there is an increase in aversion to risk...
Article
To compare total costs and risk of hypoglycemia in patients with type 2 diabetes (T2D) initiated on NPH insulin versus glargine in a real-world setting. This study used claims data (10/2001 to 06/2005) from a privately insured U.S. population of adult T2D patients who were initiated on NPH or glargine following a 6-month insulin-free period. A samp...
Article
Compare treatment patterns for patients with schizophrenia treated with olanzapine versus quetiapine in the Pennsylvania Medicaid population. Patients (18-64 years) with a diagnosis of schizophrenia (ICD-9-CM: 295.xx) and treated with olanzapine or quetiapine were identified from the Pennsylvania Medicaid claims database (1999-2003). Patients were...
Article
Compare annual health-care costs and resource utilization associated with olanzapine versus quetiapine for treating schizophrenia in a Medicaid population. Adult schizophrenia patients were selected from deidentified Pennsylvania Medicaid claims database (1999–2003). Included patients were continuously enrolled and initiated with olanzapine or quet...
Article
Objective: To determine and compare the cost utilities of the tumour necrosis factor (TNF) antagonists adalimumab and infliximab as maintenance therapies for patients in the US with moderately to severely active Crohn's disease. Methods: Maintenance regimens of adalimumab (40 mg every other week) and infliximab (5 mg/kg) were compared using prim...

Projects (2)
Project
Accurate predictions are key to effective decision making under uncertainty. Psychology research has shown that predictive judgments can be improved by considering the outside view: placing a problem in the context of similar historical cases, rather than focusing on its unique features. But choosing the right comparison is difficult: statisticians have studied the so-called reference class problem since at least the 19th century. The main objective of this project is to assess the performance of a new method for crowdsourcing reference-class judgments and producing probability forecasts, relative to new and established machine learning models. The method, called human forest (HF), promotes outside-view thinking by enabling forecasters to construct reference classes from a database of historical cases. The human forest method shares a conceptual connection with random forest machine models. In both, predictions are based on frequencies assessed in classification trees. While random forest models use training data to build the trees, HF relies on forecasters' collective knowledge. The project will examine the relative strengths of both methods and explore combinations of the two. We will also assess methods for improving the accuracy of individual forecasters. The intellectual merit of the proposal resides in its promise to address the reference class problem through collective intelligence. The project will compare HF's accuracy, complemented with metacognitive training and statistical aggregation techniques, with that of random forest models, and a human-machine hybrid approach. The latter will use bi-level optimization, providing an advancement in the use of optimization in machine learning, with the aim of pushing the frontier of both machine learning and human capabilities. The core randomized experiments will focus on clinical trial forecasting, namely, predicting the probability of advancement for cancer treatments.
The study methods will utilize naturalistic, longitudinal, large-scale online experiments, and will compare the performance of subject-matter experts and generalists. The project will also provide training for researchers and students in machine learning and collective intelligence and develop materials for interactive exercises in high-school STEM classes, undergraduate and graduate courses in statistics and decision making. Assessing the relative importance of general forecasting skill versus subject matter expertise may help address skill scarcity problems in areas dependent exclusively on specialists. The research aims to improve the predictability of clinical trial outcomes and similarly complex activities. Accurate forecasts regarding the success of clinical trial programs may in turn improve risk management, resource allocation, and ultimately result in wider availability of life-saving treatments.
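The reference-class idea in the project description can be illustrated with a minimal sketch. It is not the project's implementation: the toy database, the field names, the functions `base_rate` and `crowd_forecast`, and the choice of a simple unweighted mean as the aggregation rule are all assumptions made for illustration. Each forecaster picks a reference class (a set of attribute filters), the base rate of advancement is computed among matching historical cases, and the per-forecaster base rates are averaged.

```python
from statistics import mean

# Hypothetical historical database (illustrative values only): each case is a
# past trial with a few attributes and a binary outcome.
HISTORICAL_CASES = [
    {"phase": 2, "area": "oncology", "sponsor": "industry", "advanced": True},
    {"phase": 2, "area": "oncology", "sponsor": "industry", "advanced": False},
    {"phase": 2, "area": "oncology", "sponsor": "academic", "advanced": False},
    {"phase": 2, "area": "neurology", "sponsor": "industry", "advanced": True},
    {"phase": 3, "area": "oncology", "sponsor": "industry", "advanced": True},
    {"phase": 3, "area": "oncology", "sponsor": "academic", "advanced": False},
]


def base_rate(cases, reference_class):
    """Share of cases matching every filter that advanced; None if no match."""
    matches = [
        case for case in cases
        if all(case[key] == value for key, value in reference_class.items())
    ]
    if not matches:
        return None
    return sum(case["advanced"] for case in matches) / len(matches)


def crowd_forecast(cases, reference_classes):
    """Unweighted mean of the base rates from each forecaster's reference class.

    The simple mean is an assumption of this sketch, not a rule taken from
    the project description.
    """
    rates = [base_rate(cases, rc) for rc in reference_classes]
    rates = [rate for rate in rates if rate is not None]
    return mean(rates) if rates else None


# Three hypothetical forecasters, each choosing a different reference class.
REFERENCE_CLASSES = [
    {"phase": 2, "area": "oncology"},
    {"phase": 2},
    {"area": "oncology", "sponsor": "industry"},
]

forecast = crowd_forecast(HISTORICAL_CASES, REFERENCE_CLASSES)
print(f"Crowd forecast of advancement: {forecast:.2f}")
```

Narrow reference classes match fewer cases and give noisier base rates, while broad ones may be less relevant to the case at hand; averaging across forecasters' choices is one simple way to trade these off.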