Improved propensity score weighting using machine learning

Department of Epidemiology and Biostatistics, Drexel University School of Public Health, Philadelphia, PA 19102, U.S.A.
Statistics in Medicine (Impact Factor: 2.04). 11/2009; 29(3):337-46. DOI: 10.1002/sim.3782
Source: PubMed

ABSTRACT Machine learning techniques such as classification and regression trees (CART) have been suggested as promising alternatives to logistic regression for the estimation of propensity scores. The authors examined the performance of various CART-based propensity score models using simulated data. Hypothetical studies of varying sample sizes (n=500, 1000, 2000) with a binary exposure, continuous outcome, and 10 covariates were simulated under seven scenarios differing by degree of non-linear and non-additive associations between covariates and the exposure. Propensity score weights were estimated using logistic regression (all main effects), CART, pruned CART, and the ensemble methods of bagged CART, random forests, and boosted CART. Performance metrics included covariate balance, standard error, per cent absolute bias, and 95 per cent confidence interval (CI) coverage. All methods displayed generally acceptable performance under conditions of either non-linearity or non-additivity alone. However, under conditions of both moderate non-additivity and moderate non-linearity, logistic regression had subpar performance, whereas ensemble methods provided substantially better bias reduction and more consistent 95 per cent CI coverage. The results suggest that ensemble methods, especially boosted CART, may be useful for propensity score weighting.
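The weighting step and the per cent absolute bias metric named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy data use a single confounder rather than the study's 10 covariates, the true propensity score is used in place of an estimated one, and the weights are inverse-probability weights for the average treatment effect, which the abstract does not confirm as the scheme used in the paper. The names `simulate`, `naive_effect`, `iptw_effect` and `percent_abs_bias` are hypothetical.

```python
import math
import random

TRUE_EFFECT = 1.0  # assumed exposure effect used to generate the toy data


def simulate(n, seed=0):
    """Toy data: one confounder x drives both the exposure a and the outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        ps = 1.0 / (1.0 + math.exp(-x))        # true propensity score
        a = 1 if rng.random() < ps else 0      # binary exposure
        y = TRUE_EFFECT * a + 2.0 * x + rng.gauss(0.0, 1.0)  # continuous outcome
        rows.append((a, y, ps))
    return rows


def naive_effect(rows):
    """Unweighted difference in mean outcome, exposed minus unexposed."""
    exposed = [y for a, y, _ in rows if a == 1]
    unexposed = [y for a, y, _ in rows if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


def iptw_effect(rows):
    """Weighted difference in means with weights 1/ps (exposed), 1/(1-ps) (unexposed)."""
    num1 = den1 = num0 = den0 = 0.0
    for a, y, ps in rows:
        if a == 1:
            w = 1.0 / ps
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - ps)
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


def percent_abs_bias(estimate, truth):
    """Absolute bias expressed as a percentage of the true effect."""
    return 100.0 * abs(estimate - truth) / abs(truth)


if __name__ == "__main__":
    rows = simulate(5000)
    for label, est in (("naive", naive_effect(rows)), ("weighted", iptw_effect(rows))):
        print(label, round(est, 3), round(percent_abs_bias(est, TRUE_EFFECT), 1))
```

In an applied analysis the propensity score would be estimated, for example by logistic regression or boosted CART, rather than known; the sketch isolates only the weighting and the bias calculation.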

Available from: Brian K Lee, Aug 24, 2015
    • "how many iterations), and a shrinkage parameter (i.e., the "learning rate" or how much change to make for each new regression tree) (Karwa et al., 2011; Lee et al., 2010; McCaffrey et al., 2004; Westreich et al., 2010; Wyss et al., 2014"
    ABSTRACT: A sufficient understanding of the safety impact of lane widths in urban areas is necessary to produce geometric designs that optimize safety performance for all users. The overarching trend found in the research literature is that as lane widths narrow, crash frequency increases. However, this trend is inconsistent and is the result of multiple cross-sectional studies that have issues related to lack of control for potential confounding variables, unobserved heterogeneity or omitted variable bias, or endogeneity among independent variables, among others. Using ten years of mid-block crash data on urban arterials and collectors from four cities in Nebraska, crash modification factors (CMFs) were estimated for various lane widths and crash types. These CMFs were developed using the propensity scores-potential outcomes methodology. This method reduces many of the issues associated with cross-sectional regression models when estimating the safety effects of infrastructure-related design features. Generalized boosting, a non-parametric modeling technique, was used to estimate the propensity scores. Matching was performed using both Nearest Neighbor and Mahalanobis matching techniques. CMF estimation was done using mixed-effects negative binomial or Poisson regression with the matched data. Lane widths included in the analysis included 9ft, 10ft, 11ft, and 12ft. Some of the estimated CMFs were point estimates while others were functions of traffic volume (i.e., the CMF changed depending on the traffic volume). Roadways with 10ft travel lanes were found to experience the highest crash frequency relative to other lane widths. Meanwhile, roads with 9ft travel lanes were found to experience the lowest relative crash frequency. While this may be due to increased driver caution when traveling on narrow lanes, it is possible that unobserved factors influenced this result. CMFs for target crash types (sideswipe same-direction and sideswipe opposite-direction) were consistent with the values currently used in the Highway Safety Manual (HSM). Copyright © 2015 Elsevier Ltd. All rights reserved.
    Accident; analysis and prevention 06/2015; 82:180-191. DOI:10.1016/j.aap.2015.06.002 · 1.65 Impact Factor
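    The two tuning parameters mentioned in the excerpt above, the number of iterations and the shrinkage (learning rate), can be seen in a minimal sketch of gradient boosting for propensity scores. It is an illustration under stated assumptions, not the implementation used in any of the cited papers: each tree is a single-split stump, leaf values are mean residuals rather than Newton steps, and the function names (`fit_stump`, `boost_propensity`, `log_loss`) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Least-squares regression stump: returns (feature, threshold, left value, right value)."""
    n = len(X)
    total = sum(residuals)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            prev, nxt = order[k - 1], order[k]
            left_sum += residuals[prev]
            if X[prev][j] == X[nxt][j]:
                continue  # cannot split between tied values
            right_sum = total - left_sum
            # Maximising this score is equivalent to minimising squared error.
            score = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if best is None or score > best[0]:
                best = (score, j, X[prev][j], left_sum / k, right_sum / (n - k))
    if best is None:
        raise ValueError("no valid split: all covariate values are tied")
    return best[1:]


def boost_propensity(X, a, n_trees=100, shrinkage=0.1):
    """Fitted propensity scores for the training rows after n_trees boosting iterations."""
    n = len(a)
    base = sum(a) / n                             # marginal exposure probability
    f = [math.log(base / (1.0 - base))] * n       # start from the constant log-odds
    for _ in range(n_trees):
        # Negative gradient of the logistic loss: observed exposure minus fitted probability.
        residuals = [ai - sigmoid(fi) for ai, fi in zip(a, f)]
        j, thr, left, right = fit_stump(X, residuals)
        for i in range(n):
            # Shrinkage scales how much each new stump changes the fit.
            f[i] += shrinkage * (left if X[i][j] <= thr else right)
    return [sigmoid(fi) for fi in f]


def log_loss(a, ps):
    """Mean logistic loss of fitted probabilities ps against exposures a."""
    return -sum(ai * math.log(p) + (1 - ai) * math.log(1.0 - p)
                for ai, p in zip(a, ps)) / len(a)
```

    With a small shrinkage value each iteration changes the fit only slightly, so more iterations are needed; in practice the iteration count is chosen by a stopping rule such as covariate balance or held-out loss, which this sketch omits.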
    • "However, weighting approaches may yield biased and inefficient estimates when the propensity score model is misspecified (Kang and Schafer, 2007). This problem can be overcome using a boosted classification and regression trees approach (boosted CART; McCaffrey et al., 2004), which can produce very accurate estimated propensity scores (Lee et al., 2010). However, no propensity score techniques can account for the possible presence of unobserved confounders."
    ABSTRACT: The aim of this paper is to estimate the effect of obesity on the employment probability for Italian men and women accounting for both observed and unobserved confounding. We use microdata collected by the Italian National Statistical Office for the year 2009 during a multi-scope survey of Italian households. The employment-obesity relationship is estimated after controlling for observed confounding by using regression modelling and a propensity score weighting approach. To control for both observed and unobserved confounding (endogeneity) a semiparametric recursive bivariate probit approach is employed instead. Our findings suggest that obesity has a significant negative effect on the employment probability and that endogeneity might not be an important issue here.
    Statistica Neerlandica 11/2013; 67(4):436-455. DOI:10.1111/stan.12016 · 0.48 Impact Factor
    • "However, in practice, there are two challenges to this theory. First, better covariate balance (at least by some measures) does not always yield less biased effect estimates [3]. Second, given the variety of balance measures available, assessing balance is not straightforward, either in terms of the measures for each covariate or in how to summarize across covariates."
    ABSTRACT: Examining covariate balance is the prescribed method for determining the degree to which propensity score methods should be successful at reducing bias. This study assessed the performance of various balance measures, including a proposed balance measure based on the prognostic score (similar to a disease risk score), to determine which balance measures best correlate with bias in the treatment effect estimate. The correlations of multiple common balance measures with bias in the treatment effect estimate produced by weighting by the odds, subclassification on the propensity score, and full matching on the propensity score were calculated. Simulated data were used, based on realistic data settings. Settings included both continuous and binary covariates and continuous covariates only. The absolute standardized mean difference (ASMD) in prognostic scores, the mean ASMD (in covariates), and the mean t-statistic all had high correlations with bias in the effect estimate. Overall, prognostic scores displayed the highest correlations with bias of all the balance measures considered. Prognostic score measure performance was generally not affected by model misspecification, and the prognostic score measure performed well under a variety of scenarios. Researchers should consider using prognostic score-based balance measures for assessing the performance of propensity score methods for reducing bias in nonexperimental studies.
    Journal of clinical epidemiology 08/2013; 66(8 Suppl):S84-S90.e1. DOI:10.1016/j.jclinepi.2013.01.013 · 5.48 Impact Factor
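    The absolute standardized mean difference (ASMD) and weighting by the odds, both named in the abstract above, can be sketched as follows. This is a rough illustration, not the cited study's code: it assumes one continuous covariate, a known propensity score, and the unweighted standard deviation of the treated group as the denominator (other denominators are in use). The function names and the `demo` helper are hypothetical.

```python
import math
import random


def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def odds_weights(a, ps):
    """Weighting by the odds: treated units get 1, controls get ps / (1 - ps)."""
    return [1.0 if ai == 1 else p / (1.0 - p) for ai, p in zip(a, ps)]


def asmd(x, a, weights):
    """Absolute standardized mean difference of covariate x between groups."""
    xt = [xi for xi, ai in zip(x, a) if ai == 1]
    wt = [wi for wi, ai in zip(weights, a) if ai == 1]
    xc = [xi for xi, ai in zip(x, a) if ai == 0]
    wc = [wi for wi, ai in zip(weights, a) if ai == 0]
    # Assumed denominator: unweighted sample SD of the covariate among the treated.
    mean_t_unweighted = sum(xt) / len(xt)
    sd_t = math.sqrt(sum((v - mean_t_unweighted) ** 2 for v in xt) / (len(xt) - 1))
    return abs(weighted_mean(xt, wt) - weighted_mean(xc, wc)) / sd_t


def demo(n=5000, seed=1):
    """Return (ASMD before weighting, ASMD after weighting by the odds) on toy data."""
    rng = random.Random(seed)
    x, a, ps = [], [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-xi))   # true propensity score
        x.append(xi)
        ps.append(p)
        a.append(1 if rng.random() < p else 0)
    before = asmd(x, a, [1.0] * n)
    after = asmd(x, a, odds_weights(a, ps))
    return before, after


if __name__ == "__main__":
    print(demo())
```

    With several covariates, the mean ASMD referred to in the abstract would be the average of `asmd` over covariates.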