# Guido W. Imbens

Stanford University | SU · Graduate School of Business

## About

Publications: 181

Reads: 68,375

Citations: 37,777

## Publications

We study identification and estimation of causal effects in settings with panel data. Traditionally researchers follow model-based identification strategies relying on assumptions governing the relation between the potential outcomes and the observed and unobserved confounders. We focus on a different, complementary approach to identification where...

We present a new estimator for causal effects with panel data that builds on insights behind the widely used difference-in-differences and synthetic control methods. Relative to these methods we find, both theoretically and empirically, that this “synthetic difference-in-differences” estimator has desirable robustness properties, and that it perfor...

We develop new semiparametric methods for estimating treatment effects. We focus on a setting where the outcome distributions may be thick tailed, where treatment effects are small, where sample sizes are large and where assignment is completely random. This setting is of particular interest in recent experimentation in tech companies. We propose u...

In this paper we study estimation of and inference for average treatment effects in a setting with panel data. We focus on the staggered adoption setting where units, e.g., individuals, firms, or states, adopt the policy or treatment of interest at a particular point in time, and then remain exposed to this treatment at all times afterwards. We take...

When researchers develop new econometric methods it is common practice to compare the performance of the new methods to those of existing methods in Monte Carlo studies. The credibility of such Monte Carlo studies is often limited because of the discretion the researcher has in choosing the Monte Carlo designs reported. To improve the credibility w...

We develop and analyze a tractable empirical model for strategic network formation that can be estimated with data from a single network at a single point in time. We model the network formation as a sequential process where in each period a single randomly selected pair of agents has the opportunity to form a link. Conditional on such an opportuni...

We study identification and estimation of causal effects of a binary treatment in settings with panel data. We highlight that there are two paths to identification in the presence of unobserved confounders. First, the conventional path based on making assumptions on the relation between the potential outcomes and the unobserved confounders. Second,...

In this essay I discuss potential outcome and graphical approaches to causality, and their relevance for empirical work in economics. I review some of the work on directed acyclic graphs, including the recent "The Book of Why," by Pearl and MacKenzie. I also discuss the potential outcome framework developed by Rubin and coauthors, building on work...

Fuzzy regression discontinuity designs identify the local average treatment effect (LATE) for the subpopulation of compliers, and with forcing variable equal to the threshold. We develop methods that assess the external validity of LATE to other compliance groups at the threshold, and allow for identification away from the threshold. Specifically,...

We present a new perspective on the Synthetic Control (SC) method as a weighted regression estimator with time fixed effects. This perspective suggests a generalization with two way (both unit and time) fixed effects, which can be interpreted as a weighted version of the standard Difference In Differences (DID) estimator. We refer to this new estim...

Consider two heterogeneous populations of agents who, when matched, jointly produce an output, Y. For example, teachers and classrooms of students together produce achievement, parents raise children, whose life outcomes vary in adulthood, assembly plant managers and workers produce a certain number of cars per month, and lieutenants and their plato...

Purpose: Observational pharmacoepidemiological studies can provide valuable information on the effectiveness or safety of interventions in the real world, but one major challenge is the existence of unmeasured confounder(s). While many analytical methods have been developed for dealing with this challenge, they appear under-utilized, perhaps due t...

Contextual bandit algorithms seek to learn a personalized treatment assignment policy, balancing exploration against exploitation. Although a number of algorithms have been proposed, there is little guidance available for applied researchers to select among various approaches. Motivated by the econometrics and statistics literatures on causal effec...

In this paper we develop new methods for estimating causal effects in settings with panel data, where a subset of units are exposed to a treatment during a subset of periods, and the goal is estimating counterfactual (untreated) outcomes for the treated unit/period combinations. We develop a class of estimators that uses the observed elements of th...

In empirical work in economics it is common to report standard errors that account for clustering of units. Typically, the motivation given for the clustering adjustments is that unobserved components in outcomes for units within clusters are correlated. However, because correlation may occur across more than one dimension, this motivation makes it...

Consider a researcher estimating the parameters of a regression function based on data for all 50 states in the United States or on data for all visits to a website. What is the interpretation of the estimated parameters and the standard errors? In practice, researchers typically assume that the sample is randomly drawn from a large population of i...

The increasing popularity of regression discontinuity methods for causal inference in observational studies has led to a proliferation of different estimating strategies, most of which involve first fitting non-parametric regression models on both sides of a threshold and then reporting plug-in estimates for the discontinuity parameter. In applicat...

There is a large literature on semiparametric estimation of average treatment effects under unconfounded treatment assignment in settings with a fixed number of covariates. More recently attention has focused on settings with a large number of covariates. In this paper we extend lessons from the earlier literature to this new setting. We propose th...

In this article, we develop new methods for estimating average treatment effects in observational studies, in settings with more than two treatment levels, assuming unconfoundedness given pretreatment variables. We emphasize propensity score subclassification and matching methods which have been among the most popular methods in the binary treatmen...

In a seminal paper Abadie, Diamond, and Hainmueller [2010] (ADH) develop the synthetic control procedure for estimating the effect of a treatment, in the presence of a single treated unit and a number of control units, with pre-treatment outcomes observed for all units. The method constructs a set of weights such that covariates and pre-treatment o...

In this paper we discuss the properties of confidence intervals for regression parameters based on robust standard errors. We discuss the motivation for a modification suggested by Bell and McCaffrey (2002) to improve the finite sample properties of the confidence intervals based on the conventional robust standard errors. We show that the Bell-McC...

In non-network settings, encouragement designs have been widely used to analyze causal effects of a treatment, policy, or intervention on an outcome of interest when randomizing the treatment was considered impractical or when compliance to treatment cannot be perfectly enforced. Unfortunately, such questions related to treatment compliance have re...

There are many studies where researchers are interested in estimating average treatment effects and are willing to rely on the unconfoundedness assumption, which requires that treatment assignment is as good as random conditional on pre-treatment variables. The unconfoundedness assumption is often more plausible if a large number of pre-treatment v...

We analyze linear models with a single endogenous regressor in the presence of many instrumental variables. We weaken a key assumption typically made in this literature by allowing all the instruments to have direct effects on the outcome. We consider restrictions on these direct effects that allow for point identification of the effect of interest...

We study the calculation of exact p-values for a large class of non-sharp null hypotheses about treatment effects in a setting with data from experiments involving members of a single connected network. The class includes null hypotheses that limit the effect of one unit's treatment status on another according to the distance between units; for exa...

Researchers often report estimates and standard errors for the object of interest (such as a treatment effect) based on a single specification of a statistical model. We propose a systematic approach to assessing sensitivity to specification. We construct estimates of the object of interest for each of a large set of models. Our proposed robustness...

In this paper we study the problems of estimating heterogeneity in causal effects in experimental or observational studies and conducting inference about the magnitude of the differences in treatment effects across subsets of the population. In applications, our method provides a data-driven approach to determine which subpopulations have large or...

Rejoinder of "Instrumental Variables: An Econometrician's Perspective" by Guido W. Imbens [arXiv:1410.0163].

Following the work by Eicker, Huber, and White it is common in empirical work to report standard errors that are robust against general misspecification. In a regression setting, these standard errors are valid for the parameter that minimizes the squared difference between the conditional expectation and a linear approximation, averaged over the p...

There is a large and growing literature on peer effects in economics. In the current article, we focus on a Manski-type linear-in-means model that has proved to be popular in empirical work. We critically examine some aspects of the statistical model that may be restrictive in empirical analyses. Specifically, we focus on three aspects. First, we e...

It is standard practice in empirical work to allow for clustering in the error covariance matrix if the explanatory variables of interest vary at a more aggregate level than the units of observation. Often, however, the structure of the error covariance matrix is more complex, with correlations varying in magnitude within clusters, and not vanishin...

Following the work by White (1980ab; 1982) it is common in empirical work in economics to report standard errors that are robust against general misspecification. In a regression setting these standard errors are valid for the parameter that in the population minimizes the squared difference between the conditional expectation and the linear approx...

In this paper we nonparametrically analyze the effects of reallocating individuals across social groups in the presence of social spillovers. Individuals are either 'high' or 'low' types. Own outcomes may vary with the fraction of high types in one's social group. We characterize the average outcome and inequality effects of small increases in segr...

To frame what are, in my view, the main challenges facing researchers in econometrics, let me set the stage by describing the current state of research. Much of the traditional research in econometrics can be divided into two branches, the first comprising cross-section and panel data econometrics and the second time series analysis. In the cross-s...

Two recent papers, Deaton (2009) and Heckman and Urzua (2009), argue against what they see as an excessive and inappropriate use of experimental and quasi-experimental methods in empirical work in economics in the last decade. They specifically question the increased use of instrumental variables and natural experiments in labor economics and of ra...

Properties of GMM estimators are sensitive to the choice of instruments. Using many instruments leads to high asymptotic efficiency but can cause high bias and/or variance in small samples. In this paper we develop and implement asymptotic mean square error (MSE) based criteria for instrumental variables to use for estimation of conditio...

In Shadish (2010) and West and Thoemmes (2010), the authors contrasted 2 approaches to causality. The first originated in the psychology literature and is associated with work by Campbell (e.g., Shadish, Cook, & Campbell, 2002), and the second has its roots in the statistics literature and is associated with work by Rubin (e.g., Rubin, 2006). In th...

In the last fifteen years there has been much work on nonparametric identification of causal effects in settings with endogeneity. Earlier, researchers focused on linear systems with additive residuals. However, such systems are often difficult to motivate by economic theory. In many cases it is precisely the nonlinearity of the system and the pres...

The Rubin Causal Model (RCM) is a formal mathematical framework for causal inference, first given that name by Holland (1986) for a series of previous articles developing the perspective (Rubin, 1974; 1975; 1976; 1977; 1978; 1979; 1980). There are two essential parts to the RCM, and a third optional one. The first part is the use of ‘potential outc...

We address a major discrepancy in matching methods for causal inference in observational data. Since these data are typically plentiful, the goal of matching is to reduce bias and only secondarily to keep variance low. However, most matching methods seem designed for the opposite goal, guaranteeing sample size ex ante but limiting bias by controll...

Properties of GMM estimators are sensitive to the choice of instrument. Using many instruments leads to high asymptotic efficiency but can cause high bias and/or variance in small samples. In this paper we develop and implement asymptotic mean square error (MSE) based criteria for instrument selection in estimation of conditional moment...

This paper uses control variables to identify and estimate models with nonseparable, multidimensional disturbances. Triangular simultaneous equations models are considered, with instruments and disturbances that are independent and a reduced form that is strictly monotonic in a scalar disturbance. Here it is shown that the conditional cumulative di...

Propensity score matching estimators (Rosenbaum and Rubin, 1983) are widely used in evaluation research to estimate average treatment effects. In this article, we derive the large sample distribution of propensity score matching estimators. Our derivations take into account that the propensity score is itself estimated in a first step, prior to mat...

Matching estimators are widely used in statistical data analysis. However, the distribution of matching estimators has been derived only for particular cases (Abadie and Imbens, 2006). This article establishes a martingale representation for matching estimators. This representation allows the use of martingale limit theorems to derive the asymptoti...

Estimation of average treatment effects under unconfounded or ignorable treatment assignment is often hampered by lack of overlap in the covariate distributions between treatment groups. This lack of overlap can lead to imprecise estimates, and can make commonly used estimators sensitive to the choice of specification. In such cases researchers hav...

We investigate the choice of the bandwidth for the regression discontinuity estimator. We focus on estimation by local linear regression, which was shown to have attractive properties (Porter, J. 2003, “Estimation in the Regression Discontinuity Model” (unpublished, Department of Economics, University of Wisconsin, Madison)). We derive the asymptot...

This paper evaluates a pilot program run by a company called OPOWER, previously known as Positive Energy, to mail home energy reports to residential utility consumers. The reports compare a household’s energy use to that of its neighbors and provide energy conservation tips. Using data from a randomized natural field experiment at 80,000 treatment an...

Many empirical questions in economics and other social sciences depend on causal effects of programs or policies. In the last two decades, much research has been done on the econometric and statistical analysis of such causal effects. This recent theoretical literature has built on, and combined features of, earlier work in both the statistics and...

I examine the impact of taxation on family labor supply and test economic models of the family by analyzing responses to the Tax Reform of 1991 in Sweden, known as the "tax reform of the century" because of its large magnitude. Using detailed administrative panel data on approximately 11% of the married Swedish population, I find that husbands and w...

In regression discontinuity (RD) designs for evaluating causal effects of interventions, assignment to a treatment is determined at least partly by the value of an observed covariate lying on either side of a fixed threshold. These designs were first introduced in the evaluation literature by Thistlethwaite and Campbell [1960. Regression-discontinuit...

Matching estimators are widely used in empirical economics for the evaluation of programs or treatments. Researchers using matching methods often apply the bootstrap to calculate the standard errors. However, no formal justification has been provided for the use of the bootstrap in this setting. In this article, we show that the standard bootstrap...

In this paper we develop two nonparametric tests of treatment effect heterogeneity. The first test is for the null hypothesis that the treatment has a zero average effect for all subpopulations defined by covariates. The second test is for the null hypothesis that the average effect conditional on the covariates is identical for all subpopulations,...

In paired randomized experiments units are grouped in pairs, often based on covariate information, with random assignment within the pairs. Average treatment effects are then estimated by averaging the within-pair differences in outcomes. Typically the variance of the average treatment effect estimator is estimated using the sample variance of the...
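The point estimate described above can be sketched in a few lines of Python. This is only an illustration of averaging within-pair differences, not the paper's procedure; the function name `paired_ate` and the `(treated, control)` tuple format are assumptions made for the example, and the variance estimator discussed in the abstract is not reproduced here.

```python
from statistics import mean


def paired_ate(pairs):
    """Average the within-pair outcome differences (treated minus control).

    `pairs` is a hypothetical input format chosen for this sketch:
    one (treated_outcome, control_outcome) tuple per matched pair.
    """
    if not pairs:
        raise ValueError("need at least one pair")
    diffs = [treated - control for treated, control in pairs]
    return mean(diffs)


# Made-up outcomes for three pairs, purely for illustration.
estimate = paired_ate([(3.0, 1.0), (6.0, 2.0), (4.0, 1.0)])
print(estimate)
```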

The Rubin Causal Model (RCM), a framework for causal inference, has three distinctive features. First, it uses ‘potential outcomes’ to define causal effects at the unit level, first introduced by Neyman in the context of randomized experiments and randomization-based inference, but not used formally in non-randomized studies or with other modes of...

In Regression Discontinuity (RD) designs for evaluating causal effects of interventions, assignment to a treatment is determined at least partly by the value of an observed covariate lying on either side of a fixed threshold. These designs were first introduced in the evaluation literature by Thistlethwaite and Campbell (1960). With the exception of...

There are many environments where knowledge of a structural relationship is required to answer questions of interest. Also, nonseparability of a structural disturbance is a key feature of many models. Here, we consider nonparametric identification and estimation of a model that is monotonic in a nonseparable scalar disturbance, which disturbance is...

In this paper we study the effect of reallocating an indivisible input across a population of production units on average output. We define the Average Redistributive Effect (ARE) as the effect of such a reallocation on average output. We consider the case where inputs are discretely-valued, studying the case where they take two or three levels in...

Since the pioneering work by Daniel McFadden, utility-maximization-based multinomial response models have become important tools of empirical researchers. Various generalizations of these models have been developed to allow for unobserved heterogeneity in taste parameters and choice characteristics. Here we investigate how rich a specification of t...

In this paper we analyze the causal effects of reallocating individuals across social groups in the presence of social interactions or spillovers. We consider the case where individuals are either 'high' or 'low' types. Own outcomes may depend on the fraction of high types in one's social group. We characterize the average outcome effect and inter-type i...

In this talk, I look at several methods for estimating average effects of a program, treatment, or regime, under unconfoundedness. The setting is one with a binary program. The traditional example in economics is that of a labor market program where some individuals receive training and others do not, and interest is in some measure of the effectiv...

Estimation of average treatment effects under unconfoundedness or exogenous treatment assignment is often hampered by lack of overlap in the covariate distributions. This lack of overlap can lead to imprecise estimates and can make commonly used estimators sensitive to the choice of specification. In such cases researchers have often used informal...

Using a Cox proportional hazard model that allows for a flexible time dependence that can incorporate both seasonal and business cycle effects, we analyze the determinants of re-employment probabilities of young workers from 1978-1989. We find considerable changes in the chances of young workers finding jobs over the business cycle, however, the ch...

We show how data from an evaluation in which subjects are randomly assigned to some treatment versus a control group can be combined with nonexperimental methods to estimate the differential effects of alternative treatments. We propose tests for the validity of these methods. We use these methods and tests to analyze the differential effects of la...

Matching estimators for average treatment effects are widely used in evaluation research despite the fact that their large sample properties have not been established in many cases. The absence of formal results in this area may be partly due to the fact that standard asymptotic expansions do not apply to matching estimators with a fixed number of...

This paper develops a generalization of the widely used difference-in-differences method for evaluating the effects of policy changes. We propose a model that allows the control and treatment groups to have different average benefits from the treatment. The assumptions of the proposed model are invariant to the scaling of the outcome. We provide co...

This paper develops a new nonparametric series estimator for the average treatment effect for the case with unconfounded treatment assignment, that is, where selection for treatment is on observables. The new estimator is efficient. In addition we develop an optimal procedure for choosing the smoothing parameter, the number of terms in the series b...

This paper develops a new efficient estimator for the average treatment effect, if selection for treatment is on observables. The new estimator is linear in the first-stage nonparametric estimator. This simplifies the derivation of the mean squared error (MSE) of the estimator as a function of the number of basis functions that is used in the firs...

I will give a brief overview of modern statistical methods for estimating treatment effects that have recently become popular in social and biomedical sciences. These methods are based on the potential outcome framework developed by Donald Rubin. The specific methods discussed include regression methods, matching, and methods involving the propensi...

This paper uses an experimental design to assess the effectiveness of calls on cooperation in managing the shortage of a vital commodity through non-price mechanisms. Using the large unexpected shortage of flu vaccines in the Fall of 2004, we observed the responses of the members of a campus population to two distinct randomized treatments: providing...

of the binary treatment propensity score, which we label the generalized propensity score (GPS). We demonstrate that the GPS has many of the attractive properties of the binary treatment propensity score. Just as in the binary treatment case, adjusting for this scalar function of the covariates removes all biases associated with differences in the co...

We investigate the problem of predicting the average effect of a new training program using experiences with previous implementations. There are two principal complications in doing so. First, the population in which the new program will be implemented may differ from the population in which the old program was implemented. Second, the two programs...

Our analysis of migration differs from previous research in three important aspects. First, we exploit the confidential geocoding in the NLSY79 to obtain a distance-based measure. Second, we let the effect of migration on wage growth differ by schooling level. Third, we use propensity score matching to measure the effect of migration on the wages o...

Estimation of average treatment effects under unconfoundedness or selection on observables is often hampered by lack of overlap in the covariate distributions. This lack of overlap can lead to imprecise estimates and can make commonly used estimators sensitive to the choice of specification. In this paper we develop formal methods for addressing s...

An instrument or instrumental variable manipulates a treatment and affects the outcome only indirectly through its manipulation of the treatment. For instance, encouragement to exercise might increase cardiovascular fitness, but only indirectly to the extent that it increases exercise. If instrument levels are randomly assigned to individuals, then...

Recently a growing body of research has studied inference in settings where parameters of interest are partially identified. In many cases the parameter is real-valued and the identification region is an interval whose lower and upper bounds may be estimated from sample data. For this case confidence intervals (CIs) have been proposed that cover th...

The structure of many applied quasi-experiments generates two control groups. This paper shows that by matching to the second control across the same space which assigns treatment status, we can test and relax the assumptions required for the estimation of treatment effects. By varying the number of matches used, the standard difference-in-differen...