Article
... Performance was shown to compete with or exceed that of common approaches such as boosting, neural networks, and random forests. However, especially for binary outcomes, cross-validating BART models to choose the regularization priors is advantageous (Chipman et al. 2010; Dorie et al. 2019, cited by Hill et al. 2020). ...
... survival, multinomial logistic regressions) and especially well-equipped for causal inference tasks, since modelling complex response surfaces and controlling for confounding do not rely on parametric assumptions (Carnegie and Wu 2019; Hill et al. 2020). The resulting posterior distributions allow one to readily estimate individual average treatment effects or heterogeneous causal effects (Carnegie, Dorie, and Hill 2019; Hill et al. 2020). Moreover, the underlying likelihood framework delivers probabilistic statements about the outcome, including credible intervals, whereas identifying and quantifying the effect of individual variables on the outcome is more complicated (Carnegie and Wu 2019). ...
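As a hedged illustration of how such posterior distributions translate into effect estimates (a minimal sketch with simulated stand-in draws, not code from the cited works), posterior draws of the outcome surface under treatment and control yield individual and average effects, with credible intervals, directly from the draws:

    # Sketch: posterior draws of f(x_i, z=1) and f(x_i, z=0) are assumed given;
    # the arrays below are simulated stand-ins for draws from a fitted BART-type model.
    import numpy as np

    rng = np.random.default_rng(0)
    n_draws, n_units = 1000, 200                      # assumed dimensions for the sketch
    f1 = rng.normal(1.0, 0.5, (n_draws, n_units))     # stand-in draws of f(x_i, z=1)
    f0 = rng.normal(0.0, 0.5, (n_draws, n_units))     # stand-in draws of f(x_i, z=0)

    ite_draws = f1 - f0                               # individual treatment-effect draws
    ate_draws = ite_draws.mean(axis=1)                # sample-average effect per posterior draw

    ate_hat = ate_draws.mean()                        # posterior mean ATE
    ate_ci = np.percentile(ate_draws, [2.5, 97.5])    # 95% credible interval for the ATE
    ite_hat = ite_draws.mean(axis=0)                  # posterior mean effect for each unit
    print(ate_hat, ate_ci, ite_hat[:5])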
Preprint
Full-text available
The uptake of machine learning (ML) approaches in the social and health sciences has been rather slow, and research using ML for social and health research questions remains fragmented. This may be due to the separate development of research in the computational/data sciences versus the social and health sciences, as well as a lack of accessible overviews and adequate training in ML techniques for researchers outside data science. This paper provides a meta-mapping of research questions in the social and health sciences to appropriate ML approaches, incorporating the requirements of statistical analysis in these disciplines. We map the established classification into description, prediction, and causal inference to common research goals, such as estimating the prevalence of adverse health or social outcomes, predicting the risk of an event, and identifying risk factors or causes of adverse outcomes. This meta-mapping aims to overcome disciplinary barriers and to start a fluid dialogue between researchers from the social and health sciences and methodologically trained researchers. Such a mapping may also help to fully exploit the benefits of ML while considering domain-specific aspects relevant to the social and health sciences, and hopefully contribute to accelerating the uptake of ML applications to advance both basic and applied social and health sciences research.
... This idea originates from the principle of separating model fitting from interpretation, as first suggested by MacEachern (2001) and further expanded upon by Hahn and Carvalho (2015), rather than allowing the eventual goal of interpretation to drive the formulation of the model. Similar ideas applied to causal inference can be found in the work of Murray (2020b), Sivaganesan, Müller, and Huang (2017), and Carnegie et al. (2019). ...
Preprint
In estimating the causal effect of a continuous exposure or treatment, it is important to control for all confounding factors. However, most existing methods require a parametric specification of how the control variables influence the outcome or the generalized propensity score, and inference on treatment effects is usually sensitive to this choice. Additionally, it is often the goal to estimate how the treatment effect varies across observed units. To address this gap, we propose a semiparametric model using Bayesian tree ensembles for estimating the causal effect of a continuous treatment or exposure which (i) does not require a priori parametric specification of the influence of control variables, and (ii) allows for identification of effect modification by pre-specified moderators. The main parametric assumption we make is that the effect of the exposure on the outcome is linear, with the steepness of this relationship determined by a nonparametric function of the moderators, and we provide heuristics to diagnose the validity of this assumption. We apply our methods to revisit a 2001 study of how abortion rates affect the incidence of crime.
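In notation of our own choosing (a sketch, not necessarily the authors' exact formulation), the model described above can be written as

    y_i = \mu(x_i) + \tau(w_i)\, a_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2),

where a_i is the continuous exposure, \mu(\cdot) is a Bayesian tree ensemble over the control variables x_i, and \tau(\cdot) is a tree-ensemble function of the pre-specified moderators w_i that governs the slope of the (assumed linear) exposure-response relationship.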
Article
Full-text available
Statisticians have made great strides towards assumption-free estimation of causal estimands in the past few decades. However, this explosion of research has resulted in a breadth of inferential strategies that create opportunities for more reliable inference but also complicate the choices an applied researcher has to make and defend. Relatedly, researchers advocating for new methods typically compare their method to (at best) two or three other causal inference strategies and test using simulations that may or may not be designed to equally tease out flaws in all the competing methods. The causal inference data analysis challenge, "Is Your SATT Where It's At?", launched as part of the 2016 Atlantic Causal Inference Conference, sought to make progress with respect to both of these issues. The researchers creating the data testing grounds were distinct from the researchers submitting methods whose efficacy would be evaluated. Results from 30 competitors across the two versions of the competition (black-box algorithms and do-it-yourself analyses) are presented along with post-hoc analyses that reveal information about the characteristics of causal inference strategies and settings that affect performance. The most consistent conclusion was that methods that flexibly model the response surface perform better overall than methods that fail to do so.
Article
Full-text available
This paper develops a semi-parametric Bayesian regression model for estimating heterogeneous treatment effects from observational data. Standard nonlinear regression models, which may work quite well for prediction, can yield badly biased estimates of treatment effects when fit to data with strong confounding. Our Bayesian causal forests model avoids this problem by directly incorporating an estimate of the propensity function in the specification of the response model, implicitly inducing a covariate-dependent prior on the regression function. This new parametrization also allows treatment heterogeneity to be regularized separately from the prognostic effect of control variables, making it possible to informatively "shrink to homogeneity", in contrast to existing Bayesian non- and semi-parametric approaches.
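Schematically (in our notation), the parametrization described above takes the form

    y_i = \mu\big(x_i, \hat{\pi}(x_i)\big) + \tau(x_i)\, z_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2),

where \hat{\pi}(x_i) is the estimated propensity function included as a covariate in the prognostic term \mu, z_i is the treatment indicator, and \mu and \tau receive separate BART-type priors, with the prior on \tau regularizing treatment-effect heterogeneity toward homogeneity.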
Article
Full-text available
We introduce Bayesian additive regression trees (BART) for log-linear models including multinomial logistic regression and count regression with zero-inflation and overdispersion. BART has been applied to nonparametric mean regression and binary classification problems in a range of settings. However, existing applications of BART have been limited to models for Gaussian "data", either observed or latent. This is primarily because efficient MCMC algorithms are available for Gaussian likelihoods. But while many useful models are naturally cast in terms of latent Gaussian variables, many others are not -- including models considered in this paper. We develop new data augmentation strategies and carefully specified prior distributions for these new models. Like the original BART prior, the new prior distributions are carefully constructed and calibrated to be flexible while guarding against overfitting. Together the new priors and data augmentation schemes allow us to implement an efficient MCMC sampler outside the context of Gaussian models. The utility of these new methods is illustrated with examples and an application to a previously published dataset.
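For example, the multinomial logistic case can be written schematically (our notation) as

    P(y_i = k \mid x_i) = \frac{\exp\{f_k(x_i)\}}{\sum_{k'} \exp\{f_{k'}(x_i)\}},

with each f_k(\cdot) a sum of trees carrying a BART-type regularization prior; the data augmentation schemes developed in the paper are what make posterior sampling tractable for such non-Gaussian likelihoods.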
Article
Full-text available
When estimating causal effects, unmeasured confounding and model misspecification are both potential sources of bias. We propose a method to simultaneously address both issues in the form of a semi-parametric sensitivity analysis. In particular, our approach incorporates Bayesian Additive Regression Trees into a two-parameter sensitivity analysis strategy that assesses sensitivity of posterior distributions of treatment effects to choices of sensitivity parameters. This results in an easily interpretable framework for testing for the impact of an unmeasured confounder that also limits the number of modeling assumptions. We evaluate our approach in a large-scale simulation setting and with high blood pressure data taken from the Third National Health and Nutrition Examination Survey. The model is implemented as open-source software, integrated into the treatSens package for the R statistical programming language.
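A schematic version of a two-parameter setup of this kind (our notation, not necessarily the exact treatSens parametrization) augments the outcome and treatment models with an unmeasured confounder U:

    Y_i = f(X_i) + \tau Z_i + \zeta^{Y} U_i + \varepsilon_i, \qquad Z_i \sim \text{model}\big(g(X_i) + \zeta^{Z} U_i\big),

where f and g are fit flexibly (here with BART) and the posterior for \tau is re-examined over a grid of the sensitivity parameters (\zeta^{Z}, \zeta^{Y}).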
Article
Full-text available
We develop a Bayesian “sum-of-trees” model where each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian backfitting MCMC algorithm that generates samples from a posterior. Effectively, BART is a nonparametric Bayesian regression approach which uses dimensionally adaptive random basis elements. Motivated by ensemble methods in general, and boosting algorithms in particular, BART is defined by a statistical model: a prior and a likelihood. This approach enables full posterior inference including point and interval estimates of the unknown regression function as well as the marginal effects of potential predictors. By keeping track of predictor inclusion frequencies, BART can also be used for model-free variable selection. BART’s many features are illustrated with a bake-off against competing methods on 42 different data sets, with a simulation experiment and on a drug discovery classification problem.
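In notation close to the paper's, the sum-of-trees model is

    y_i = \sum_{j=1}^{m} g(x_i; T_j, M_j) + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2),

where g(x; T_j, M_j) assigns x to a terminal node of tree T_j and returns that node's parameter from M_j; the regularization prior keeps individual trees shallow and their leaf contributions small, so that each tree acts as a weak learner.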
Article
Full-text available
Causal inference in observational studies typically requires making comparisons between groups that are dissimilar. For instance, researchers investigating the role of a prolonged duration of breastfeeding on child outcomes may be forced to make comparisons between women with substantially different characteristics on average. In the extreme there may exist neighborhoods of the covariate space where there are not sufficient numbers of both groups of women (those who breastfed for prolonged periods and those who did not) to make inferences about those women. This is referred to as lack of common support. Problems can arise when we try to estimate causal effects for units that lack common support, thus we may want to avoid inference for such units. If ignorability is satisfied with respect to a set of potential confounders, then identifying whether, or for which units, the common support assumption holds is an empirical question. However, in the high-dimensional covariate space often required to satisfy ignorability such identification may not be trivial. Existing methods used to address this problem often require reliance on parametric assumptions and most, if not all, ignore the information embedded in the response variable. We distinguish between the concepts of "common support" and "common causal support." We propose a new approach for identifying common causal support that addresses some of the shortcomings of existing methods. We motivate and illustrate the approach using data from the National Longitudinal Survey of Youth to estimate the effect of breastfeeding at least nine months on reading and math achievement scores at age five or six. We also evaluate the comparative performance of this method in hypothetical examples and simulations where the true treatment effect is known.
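A hypothetical sketch of a check in this spirit (simplified; the simulated numbers and the cutoff rule below are stand-ins, not the paper's exact discard rule): compare each unit's posterior uncertainty about its counterfactual prediction with the uncertainty seen for factual predictions, and flag units whose counterfactual uncertainty is markedly larger.

    # Sketch of a response-informed common-support check: units whose counterfactual
    # predictions are much more uncertain than any factual prediction are flagged.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 500
    sd_factual = rng.uniform(0.2, 0.5, n)         # stand-in posterior sds of f(x_i, z_i)
    sd_counterfactual = rng.uniform(0.2, 1.5, n)  # stand-in posterior sds of f(x_i, 1 - z_i)

    cutoff = sd_factual.max()                     # simplified cutoff for the sketch
    lacks_support = sd_counterfactual > cutoff    # candidate units to exclude from inference
    print(lacks_support.mean())                   # share of units flagged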
Article
Full-text available
Researchers have long struggled to identify causal effects in nonexperimental settings. Many recently proposed strategies assume ignorability of the treatment assignment mechanism and require fitting two models—one for the assignment mechanism and one for the response surface. This article proposes a strategy that instead focuses on very flexibly modeling just the response surface using a Bayesian nonparametric modeling procedure, Bayesian Additive Regression Trees (BART). BART has several advantages: it is far simpler to use than many recent competitors, requires less guess-work in model fitting, handles a large number of predictors, yields coherent uncertainty intervals, and fluidly handles continuous treatment variables and missing data for the outcome variable. BART also naturally identifies heterogeneous treatment effects. BART produces more accurate estimates of average treatment effects compared to propensity score matching, propensity-weighted estimators, and regression adjustment in the nonlinear simulation situations examined. Further, it is highly competitive in linear settings with the "correct" model, linear regression. Supplemental materials including code and data to replicate simulations and examples from the article as well as methods for population inference are available online.
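A minimal sketch of the response-surface strategy described above, with a gradient-boosting regressor standing in for BART (BART itself is typically fit with R packages such as dbarts or BART; the stand-in only illustrates the logic of fitting one flexible model and predicting both potential outcomes):

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(0)
    n = 1000
    X = rng.normal(size=(n, 5))                        # covariates (simulated for the sketch)
    z = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))    # confounded treatment assignment
    y = X[:, 0] + 2 * z + 0.5 * z * X[:, 1] + rng.normal(size=n)

    # Fit one flexible response-surface model f(x, z) on the observed data.
    model = GradientBoostingRegressor().fit(np.column_stack([X, z]), y)

    y1 = model.predict(np.column_stack([X, np.ones(n)]))   # predicted outcome under z = 1
    y0 = model.predict(np.column_stack([X, np.zeros(n)]))  # predicted outcome under z = 0
    ite = y1 - y0                                          # unit-level effect estimates
    print("ATE estimate:", ite.mean())

With a full BART fit, repeating the two predictions for each posterior draw would also give the coherent uncertainty intervals the abstract refers to.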
Article
Full-text available
This article explores some of the challenges that arise when trying to implement propensity score strategies to answer a causal question using data with a large number of covariates. We discuss choices in propensity score estimation strategies, matching and weighting implementation strategies, balance diagnostics, and final analysis models. We demonstrate the wide range of estimates that can result from different combinations of these choices. Finally, an alternative estimation strategy is presented that may have benefits in terms of simplicity and reliability. These issues are explored in the context of an empirical example that uses data from the Early Childhood Longitudinal Study, Kindergarten Cohort to investigate the potential effect of grade retention after the 1st-grade year on subsequent cognitive outcomes.
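An illustrative sketch (not the article's analysis) of one combination of the choices discussed above: logistic-regression propensity scores, inverse-probability weights targeting the ATT, and a standardized-mean-difference balance diagnostic.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 2000
    X = rng.normal(size=(n, 10))                       # many covariates (simulated)
    z = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
    y = X[:, 0] + z + rng.normal(size=n)

    ps = LogisticRegression(max_iter=1000).fit(X, z).predict_proba(X)[:, 1]
    w = np.where(z == 1, 1.0, ps / (1 - ps))           # ATT weights: treated = 1, control = odds

    def smd(x, z, w):
        # Difference in weighted group means, scaled by the pooled (unweighted) sd.
        m1 = np.average(x[z == 1], weights=w[z == 1])
        m0 = np.average(x[z == 0], weights=w[z == 0])
        s = np.sqrt((x[z == 1].var() + x[z == 0].var()) / 2)
        return (m1 - m0) / s

    balance = [smd(X[:, j], z, w) for j in range(X.shape[1])]
    att = np.average(y[z == 1]) - np.average(y[z == 0], weights=w[z == 0])
    print("max |SMD|:", max(abs(b) for b in balance), "ATT estimate:", att)

Swapping the propensity model, the weighting or matching scheme, or the final analysis model changes the estimate, which is exactly the sensitivity the article documents.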
Article
There is growing interest in using routinely collected data from health care databases to study the safety and effectiveness of therapies in “real‐world” conditions, as it can provide complementary evidence to that of randomized controlled trials. Causal inference from health care databases is challenging because the data are typically noisy, high dimensional, and most importantly, observational. It requires methods that can estimate heterogeneous treatment effects while controlling for confounding in high dimensions. Bayesian additive regression trees, causal forests, causal boosting, and causal multivariate adaptive regression splines are off‐the‐shelf methods that have shown good performance for estimation of heterogeneous treatment effects in observational studies of continuous outcomes. However, it is not clear how these methods would perform in health care database studies where outcomes are often binary and rare and data structures are complex. In this study, we evaluate these methods in simulation studies that recapitulate key characteristics of comparative effectiveness studies. We focus on the conditional average effect of a binary treatment on a binary outcome using the conditional risk difference as an estimand. To emulate health care database studies, we propose a simulation design where real covariate and treatment assignment data are used and only outcomes are simulated based on nonparametric models of the real outcomes. We apply this design to 4 published observational studies that used records from 2 major health care databases in the United States. Our results suggest that Bayesian additive regression trees and causal boosting consistently provide low bias in conditional risk difference estimates in the context of health care database studies.
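In potential-outcome notation (ours), the estimand studied above is the conditional risk difference

    \mathrm{CRD}(x) = P\{Y(1) = 1 \mid X = x\} - P\{Y(0) = 1 \mid X = x\},

the difference in the probability of the binary outcome under treatment versus no treatment for units with covariates x.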
Article
We develop a Bayesian “sum-of-trees” model, named BART, where each tree is constrained by a prior to be a weak learner. Fitting and inference are accomplished via an iterative backfitting MCMC algorithm. This model is motivated by ensemble methods in general, and boosting algorithms in particular. Like boosting, each weak learner (i.e., each weak tree) contributes a small amount to the overall model. However, our procedure is defined by a statistical model: a prior and a likelihood, while boosting is defined by an algorithm. This model-based approach enables a full and accurate assessment of uncertainty in model predictions, while remaining highly competitive in terms of predictive accuracy.
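The backfitting step can be summarized (in notation matching the sum-of-trees display above) by the partial residual each tree is fit to within one sweep of the sampler:

    R_{ij} = y_i - \sum_{k \neq j} g(x_i; T_k, M_k),

so that each (T_j, M_j) is drawn from its conditional posterior given \{R_{ij}\} and \sigma, after which \sigma^2 is drawn given the full sum-of-trees fit.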
Article
Randomized experiments are considered the gold standard for causal inference because they can provide unbiased estimates of treatment effects for the experimental participants. However, researchers and policymakers are often interested in using a specific experiment to inform decisions about other target populations. In education research, increasing attention is being paid to the potential lack of generalizability of randomized experiments because the experimental participants may be unrepresentative of the target population of interest. This article examines whether generalization may be assisted by statistical methods that adjust for observed differences between the experimental participants and members of a target population. The methods examined include approaches that reweight the experimental data so that participants more closely resemble the target population and methods that utilize models of the outcome. Two simulation studies and one empirical analysis investigate and compare the methods’ performance. One simulation uses purely simulated data while the other utilizes data from an evaluation of a school-based dropout prevention program. Our simulations suggest that machine learning methods outperform regression-based methods when the required structural (ignorability) assumptions are satisfied. When these assumptions are violated, all of the methods examined perform poorly. Our empirical analysis uses data from a multisite experiment to assess how well results from a given site predict impacts in other sites. Using a variety of extrapolation methods, predicted effects for each site are compared to actual benchmarks. Flexible modeling approaches perform best, although linear regression is not far behind. Taken together, these results suggest that flexible modeling techniques can aid generalization while underscoring the fact that even state-of-the-art statistical techniques still rely on strong assumptions.
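A hedged sketch of one reweighting approach of the kind described above (an illustration under our own assumptions, not the article's exact estimators): model the probability of being in the experimental sample given covariates, then weight experimental units by the inverse odds of sample membership so they resemble the target population.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n_exp, n_pop = 500, 5000
    X_exp = rng.normal(0.5, 1.0, size=(n_exp, 3))    # experimental sample (simulated)
    X_pop = rng.normal(0.0, 1.0, size=(n_pop, 3))    # target population (simulated)

    # Model the probability of being in the experimental sample given covariates.
    X_all = np.vstack([X_exp, X_pop])
    s = np.r_[np.ones(n_exp), np.zeros(n_pop)]
    p = LogisticRegression(max_iter=1000).fit(X_all, s).predict_proba(X_exp)[:, 1]
    w = (1 - p) / p                                  # inverse odds of sample membership

    # Randomized treatment within the experiment; the effect varies with X[:, 0],
    # so the sample-average and population-average effects differ.
    z = rng.binomial(1, 0.5, n_exp)
    y = X_exp[:, 0] + (1.0 + X_exp[:, 0]) * z + rng.normal(size=n_exp)
    effect_reweighted = (np.average(y[z == 1], weights=w[z == 1])
                         - np.average(y[z == 0], weights=w[z == 0]))
    print("Reweighted effect estimate:", effect_reweighted)

Outcome-model approaches instead fit a flexible model of y given (X, z) in the experiment and average its predictions over the target population's covariates; both routes rest on the ignorability-style assumptions the abstract emphasizes.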