The Stata Journal (yyyy) vv, Number ii, pp. 1–13
Quantile treatment effect estimation from
censored data by regression adjustment
David M. Drukker
Stata
College Station, Texas
ddrukker@stata.com
Abstract. I discuss the mqgamma command that estimates the quantiles of the
potential-outcome distributions for each treatment level from censored observa-
tional data in which the dependent variable is inherently positive, such as time-to-
event data and health-expenditure data. Differences in these marginal quantiles
are quantile treatment effects.
The implemented estimator is a regression-adjustment type estimator based on
a two-parameter gamma distribution for each potential outcome.
(This prepublication draft was distributed on 19 June 2014.)
Keywords: st0001, mqgamma, causal inference, treatment effects, quantile estimation
1 Introduction
I discuss the mqgamma command that estimates the quantiles of the potential-outcome
distributions for each treatment level from censored observational data in which the de-
pendent variable is inherently positive, such as time-to-event data and health-expenditure
data. Differences in these marginal quantiles are quantile treatment effects (QTEs).
QTEs can vary by quantile. For example, the treatment effect of exercise on in-
dividuals with weak hearts could be significantly smaller than the treatment effect on
those with strong hearts. The treatment effect at a lower marginal quantile of time to
second heart attack could be significantly smaller than the treatment effect at an upper
marginal quantile.
The regression-adjustment (RA) estimator implemented in mqgamma handles obser-
vational data. The logic of the estimator has several steps, but solving all the estimating
equations jointly produces a consistent, asymptotically normal, one-step estimator.
The RA estimator implemented in mqgamma is a simple application of parametric
modeling techniques. The details of the estimator are provided in section 9. As an RA
estimator, the implemented estimator is related to the RA estimators for average
treatment effects discussed in Wooldridge (2010, chapter 21). As an estimator of a quantile
treatment effect, the implemented estimator is related to the estimators discussed in
Frölich and Melly (2010), Cattaneo (2010), Cattaneo, Drukker, and Holland (2013),
and Firpo (2007).
© yyyy StataCorp LP st0001
To extend the weighting estimators derived by Firpo (2007) and Cattaneo (2010),
one would have to model the probability of censoring and then jointly correct for
missingness due to treatment and missingness due to censoring. (This extension follows
immediately from the results in Wooldridge (2002, 2007).) The advantage of the
implemented RA estimator over the weighting estimators is that only an outcome model is
required. The weighting estimators require models for the treatment allocation and the
censoring process. Furthermore, the model for the censoring process is most likely just
as complicated as the model for the outcome.
2 An example
Suppose I have data from a study that followed, for three years, a random sample of
middle-aged men who had previously had a heart attack. I am interested in whether an
exercise regime affects the time to a second heart attack. Some observations on the
time to second heart attack are censored. Because the data are observational, treatment
allocation depends on covariates, and I use a model for the outcome to adjust for this
dependence.
Key to this story is that exercise could help individuals with relatively strong hearts
but not help those with weak hearts. An individual with a “strong heart” would be
in an upper quantile of the marginal distribution, over the covariates, of the potential
outcome generated by each treatment level. Analogously, an individual with a “weak
heart” would be in a lower quantile of the marginal distribution of the potential outcome
generated by each treatment level. As is standard in the treatment-effect literature, the
treated potential outcome is the variable that would occur if everyone in the population
received the treatment. Analogously, the control potential outcome is the variable
that would occur if everyone received the control. See Holland (1986), Imbens and
Wooldridge (2009), and Wooldridge (2010) for discussions and further references.
I estimate the difference in the marginal quantiles for the treated potential-outcome
distribution and the control potential-outcome distribution at the upper quantile of .75.
This difference in marginal quantiles is a QTE at quantile .75, denoted by QTE(.75).
Similarly, I estimate the difference in the marginal quantiles for the treated potential-
outcome distribution and control potential-outcome distribution at the lower quantile
of .25. This difference in marginal quantiles is a QTE at quantile .25, denoted by
QTE(.25). My story indicates that QTE(.75) should be significantly larger than
QTE(.25).
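The definitions used in this section can be written compactly. The notation below is shorthand introduced only for this sketch (it is an assumption, not notation taken from the rest of the article): $F_{y_d}$ is the marginal distribution function of the potential outcome under treatment level $d$, and $q_d(\tau)$ is its $\tau$ quantile.

```latex
% Marginal quantile of potential outcome y_d (shorthand assumed for this sketch)
q_d(\tau) = \inf\{\, q : F_{y_d}(q) \ge \tau \,\}, \qquad d \in \{0, 1\}, \quad \tau \in (0, 1)

% Quantile treatment effect at quantile tau: treated minus control
\mathrm{QTE}(\tau) = q_1(\tau) - q_0(\tau)
```

Under this shorthand, QTE(.25) and QTE(.75) are the two contrasts of interest in the example.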
In the (fictional) data for this example, t is the possibly censored observation on
the time in years to second heart attack, and fail is 1 if the observation is not censored
and 0 if it is censored. In this oversimplified example, the covariates are an index of
pretreatment health status, health, and an index of pretreatment activity level, active.
The binary variable exercise is 1 if an individual joins the exercise regime and 0 if he
does not.
I begin by estimating the marginal .25 quantile and the marginal .75 quantile for
each potential-outcome distribution.
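. use exercise
. mqgamma t active, treat(exercise) fail(fail) lns(health) quantile(.25 .75)
Iteration 0:   EE criterion =  .7032254
Iteration 1:   EE criterion =  .05262105
Iteration 2:   EE criterion =  .00028553
Iteration 3:   EE criterion =  6.892e-07
Iteration 4:   EE criterion =  4.706e-12
Iteration 5:   EE criterion =  1.604e-22

Gamma quantile-treatment-effect estimation        Number of obs   =       2000

------------------------------------------------------------------------------
             |               Robust
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
q25_0        |
       _cons |   .2151604   .0159611    13.48   0.000     .1838771    .2464436
-------------+----------------------------------------------------------------
q25_1        |
       _cons |   .2612655   .0249856    10.46   0.000     .2122946    .3102364
-------------+----------------------------------------------------------------
q75_0        |
       _cons |   1.591147   .0725607    21.93   0.000      1.44893    1.733363
-------------+----------------------------------------------------------------
q75_1        |
       _cons |   2.510068   .1349917    18.59   0.000     2.245489    2.774647
------------------------------------------------------------------------------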
The estimated .25 quantile for the treated potential outcome is 0.26 while the esti-
mated .25 quantile for the control potential outcome is .22. The estimated .75 quantile
for the treated potential outcome is 2.51 while the estimated .75 quantile for the con-
trol potential outcome is 1.59. These results appear to confirm the conjecture that the
QTE(.75) is significantly larger than the QTE(.25). Below I use nlcom to estimate the
QTEs from the estimated marginal quantiles.
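. nlcom (_b[q25_1:_cons] - _b[q25_0:_cons]) ///
>       (_b[q75_1:_cons] - _b[q75_0:_cons])

       _nl_1:  _b[q25_1:_cons] - _b[q25_0:_cons]
       _nl_2:  _b[q75_1:_cons] - _b[q75_0:_cons]

------------------------------------------------------------------------------
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |   .0461051   .0295846     1.56   0.119    -.0118796    .1040899
       _nl_2 |   .9189214   .1529012     6.01   0.000     .6192405    1.218602
------------------------------------------------------------------------------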
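The above output shows that the estimated QTE(.75) of .92 is much larger than the
estimated QTE(.25) of .05. The estimated QTE(.75) is significantly different from zero,
the estimated QTE(.25) is not, and their 95% confidence intervals do not overlap.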
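A direct test of whether QTE(.75) exceeds QTE(.25) can be based on the difference between the two contrasts. The lines below are a minimal sketch that assumes the mqgamma results from the example are still the active estimation results; the label dQTE is a name made up for this illustration.

```stata
* Assumes the mqgamma estimates from the example above are in memory.
* dQTE is a hypothetical label for QTE(.75) - QTE(.25).
nlcom (dQTE: (_b[q75_1:_cons] - _b[q75_0:_cons]) ///
           - (_b[q25_1:_cons] - _b[q25_0:_cons]))

* The contrast is linear in the parameters, so lincom is an alternative.
lincom [q75_1]_cons - [q75_0]_cons - [q25_1]_cons + [q25_0]_cons
```

Both commands report a standard error that accounts for the covariance among the four estimated quantiles, which a comparison of the two separate confidence intervals does not.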
3 Estimator details: Some examples
The estimator implemented in mqgamma is a regression-adjustment type estimator. For
each treatment level, after finding the maximum-likelihood (ML) estimates of the conditional-
on-covariates distribution, the implemented estimator uses the ML estimates to estimate
the marginal quantiles.
This section clarifies the above description by discussing a detailed example of how
this RA estimator works. (You may skip this section if you wish to avoid these details.)
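The implemented RA estimator models each treatment level using a two-parameter
gamma distribution. For treatment level $j \in \{0,1\}$, as a function of covariates
$\mathbf{x}_i$ and $\mathbf{w}_i$, the shape parameter is parameterized as
$\alpha_j = \exp(-2\mathbf{x}_i\boldsymbol{\beta}_j')$ and the scale parameter
is parameterized as
$\beta_j = \exp(\mathbf{w}_i\boldsymbol{\gamma}_j')\exp(2\mathbf{x}_i\boldsymbol{\beta}_j')$.
(The scalar $\beta_j$ is the scale parameter; the boldface $\boldsymbol{\beta}_j$ is a
coefficient vector.) To facilitate model specification, I note
that a variable $y_j$ with this two-parameter gamma distribution has conditional mean

$$
\begin{aligned}
E[y_j|\mathbf{x},\mathbf{w}] &= \alpha_j\beta_j \\
&= \exp(-2\mathbf{x}_i\boldsymbol{\beta}_j')\exp(\mathbf{w}_i\boldsymbol{\gamma}_j')\exp(2\mathbf{x}_i\boldsymbol{\beta}_j') \\
&= \exp(\mathbf{w}_i\boldsymbol{\gamma}_j')
\end{aligned}
$$

The conditional variance is given by

$$
\begin{aligned}
\mathrm{Var}[y_j|\mathbf{x},\mathbf{w}] &= \alpha_j\beta_j^2 \\
&= \exp(-2\mathbf{x}_i\boldsymbol{\beta}_j')\left[\exp(\mathbf{w}_i\boldsymbol{\gamma}_j')\exp(2\mathbf{x}_i\boldsymbol{\beta}_j')\right]^2 \\
&= \exp(2\mathbf{w}_i\boldsymbol{\gamma}_j' + 2\mathbf{x}_i\boldsymbol{\beta}_j')
\end{aligned}
$$

The conditional distribution function is given by

$$
F(y_j|\mathbf{x},\mathbf{w}) = G\left\{\exp(-2\mathbf{x}_i\boldsymbol{\beta}_j'),\; y/\left[\exp(\mathbf{w}_i\boldsymbol{\gamma}_j')\exp(2\mathbf{x}_i\boldsymbol{\beta}_j')\right]\right\}
$$

where $G(a, x)$ is the distribution function of the one-parameter gamma distribution
with shape $a$, implemented in Stata as gammap(a, x).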
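The mean, variance, and distribution function implied by this parameterization can be checked numerically. The following is a minimal sketch, not part of mqgamma: the index values 0.3 and 0.2 are hypothetical stand-ins for $\mathbf{w}_i\boldsymbol{\gamma}_j'$ and $\mathbf{x}_i\boldsymbol{\beta}_j'$, and the seed and sample size are arbitrary.

```stata
clear
set seed 12345
set obs 200000

* Hypothetical index values, chosen only for illustration
local z   = 0.3                     // stands in for w*gamma'
local lns = 0.2                     // stands in for x*beta'

local a = exp(-2*`lns')             // shape parameter alpha
local b = exp(`z')*exp(2*`lns')     // scale parameter beta

* rgamma(a, b) draws from a gamma distribution with shape a and scale b
generate double y = rgamma(`a', `b')
summarize y

display "mean:     simulated = " r(mean) "   implied = " exp(`z')
display "variance: simulated = " r(Var)  "   implied = " exp(2*`z' + 2*`lns')

* Distribution function at y = 1: empirical share versus gammap()
quietly count if y <= 1
display "F(1):     simulated = " r(N)/_N "   implied = " gammap(`a', 1/`b')
```

With a large number of draws, each simulated quantity should be close to its implied value, up to simulation noise.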
I repeat the previous estimation but also specify the option aequations so that
the command reports the auxiliary parameters.
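. mqgamma t active, treat(exercise) fail(fail) lns(health) ///
>         quantile(.25 .75) aequations
Iteration 0:   EE criterion =  .7032254
Iteration 1:   EE criterion =  .05262105
Iteration 2:   EE criterion =  .00028553
Iteration 3:   EE criterion =  6.892e-07
Iteration 4:   EE criterion =  4.706e-12
Iteration 5:   EE criterion =  1.604e-22

Gamma quantile-treatment-effect estimation        Number of obs   =       2000

------------------------------------------------------------------------------
             |               Robust
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
q25_0        |
       _cons |   .2151604   .0159611    13.48   0.000     .1838771    .2464436
-------------+----------------------------------------------------------------
q25_1        |
       _cons |   .2612655   .0249856    10.46   0.000     .2122946    .3102364
-------------+----------------------------------------------------------------
q75_0        |
       _cons |   1.591147   .0725607    21.93   0.000      1.44893    1.733363
-------------+----------------------------------------------------------------
q75_1        |
       _cons |   2.510068   .1349917    18.59   0.000     2.245489    2.774647
-------------+----------------------------------------------------------------
z_0          |
      active |   .1571665   .1363228     1.15   0.249    -.1100212    .4243542
       _cons |   .0588663   .0824981     0.71   0.476     -.102827    .2205596
-------------+----------------------------------------------------------------
lns_0        |
      health |   .1667081   .1069483     1.56   0.119    -.0429067    .3763229
       _cons |   .1275698   .0356235     3.58   0.000     .0577489    .1973906
-------------+----------------------------------------------------------------
z_1          |
      active |   .9148007   .1010204     9.06   0.000     .7168043    1.112797
       _cons |   .0968546   .0905065     1.07   0.285    -.0805349    .2742441
-------------+----------------------------------------------------------------
lns_1        |
      health |   .4050442   .0851527     4.76   0.000      .238148    .5719404
       _cons |   .1299308   .0368826     3.52   0.000     .0576421    .2022194
------------------------------------------------------------------------------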
Note that
• the z_0 equation reports the estimated $\boldsymbol{\gamma}_0$ for the control potential outcome,
• the lns_0 equation reports the estimated $\boldsymbol{\beta}_0$ for the control potential outcome,
• the z_1 equation reports the estimated $\boldsymbol{\gamma}_1$ for the treated potential outcome, and
• the lns_1 equation reports the estimated $\boldsymbol{\beta}_1$ for the treated potential outcome.
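Given the control-potential-outcome estimates $\widehat{\boldsymbol{\beta}}_0$ and
$\widehat{\boldsymbol{\gamma}}_0$,

$$
\frac{1}{N}\sum_{i=1}^{N} G\left\{\exp(-2\mathbf{x}_i\widehat{\boldsymbol{\beta}}_0'),\; q/\left[\exp(\mathbf{w}_i\widehat{\boldsymbol{\gamma}}_0')\exp(2\mathbf{x}_i\widehat{\boldsymbol{\beta}}_0')\right]\right\}
$$

consistently estimates the marginal distribution, over the covariates, of t for the control
potential outcome at the point $q$. The $\widehat{q}$ that sets this average to a value
$\tau \in (0,1)$ is a consistent estimator of the marginal $\tau$ quantile in the control
potential outcome.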
I provide two illustrations of this point. First, I show that the average of the
conditional distribution function for the controls at the estimated .25 marginal quantile is
.25. I begin this first illustration by computing $\mathbf{w}_i\widehat{\boldsymbol{\gamma}}_0'$
for all the observations in the data.
. predict double z0, equation(z_0)
Now I compute $\mathbf{x}_i\widehat{\boldsymbol{\beta}}_0'$ for all the observations in the data.
. predict double lns0, equation(lns_0)
Next I compute the conditional-on-covariates distribution at the marginal-quantile
point .2151604 for all the observations in the data. The mean of this variable estimates
the marginal distribution for the control potential outcome at the marginal-quantile
point .2151604 and this mean is .25 because .2151604 is the .25 quantile of this distri-
bution.
. generate double cd0 = gammap(exp(-2*lns0), .2151604/(exp(z0)*exp(2*lns0)))
. sum cd0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         cd0 |      2000         .25    .0213933   .1777323   .3922674
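Second, I use the gmm command to show that .2151604 is the value that solves the
sample moment condition

$$
\frac{1}{N}\sum_{i=1}^{N} G\left\{\exp(-2\mathbf{x}_i\widehat{\boldsymbol{\beta}}_0'),\; \widehat{q}/\left[\exp(\mathbf{w}_i\widehat{\boldsymbol{\gamma}}_0')\exp(2\mathbf{x}_i\widehat{\boldsymbol{\beta}}_0')\right]\right\} - .25 = 0
$$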
. gmm ( gammap(exp(-2*lns0), {qh}/(exp(z0)*exp(2*lns0))) - .25), onestep

Step 1
Iteration 0:   GMM criterion Q(b) =      .0625
Iteration 1:   GMM criterion Q(b) =  .00235841
Iteration 2:   GMM criterion Q(b) =  .00001057
Iteration 3:   GMM criterion Q(b) =  1.934e-10
Iteration 4:   GMM criterion Q(b) =  6.505e-20

GMM estimation

Number of parameters =   1
Number of moments    =   1
Initial weight matrix: Unadjusted                     Number of obs  =    2000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /qh |   .2151604   .0006355   338.57   0.000     .2139148    .2164059
------------------------------------------------------------------------------
Instruments for equation 1: _cons
As expected, gmm reports the estimated parameter to be .2151604. The standard
errors reported by gmm do not match those reported by mqgamma because I used gmm to
solve this moment condition taking the other estimated parameters as fixed. mqgamma
estimates all the parameters jointly and it reports consistent estimates of the standard
errors.
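The same root-finding step can be illustrated without gmm or an estimated model. The sketch below is self-contained under stated assumptions: it simulates covariates, treats the control-outcome coefficients from the data-generation program in section 10 as known, and uses plain bisection (a swapped-in technique, not what mqgamma does) to find the point at which the average conditional distribution function equals .25. The variable names and the bracketing interval [0, 50] are choices made for this illustration.

```stata
clear
set seed 2014
set obs 2000

* Covariates and indexes; coefficient values borrowed from section 10 (y0)
generate double x1  = rchi2(3)/10
generate double x2  = rchi2(4)/7
generate double lns = .12 + .2*x1       // x*beta'
generate double z   = .12 + .3*x2       // w*gamma'

* Bisection on q: the average conditional CDF is increasing in q
local lo = 0
local hi = 50                           // assumed upper bracket
generate double cd = .
while (`hi' - `lo' > 1e-8) {
    local q = (`lo' + `hi')/2
    quietly replace cd = gammap(exp(-2*lns), `q'/(exp(z)*exp(2*lns)))
    quietly summarize cd, meanonly
    if (r(mean) < .25) {
        local lo = `q'
    }
    else {
        local hi = `q'
    }
}
display "marginal .25 quantile (bisection) = " `q'
```

Because the parameters are held fixed here, this sketch yields only a point estimate; it says nothing about the sampling variability that the joint estimation in mqgamma accounts for.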
Further estimation details are provided in section 9.
4 Estimator details: Some assumptions
In this section, I provide an intuitive discussion of the assumptions needed to identify
the marginal quantiles and to interpret differences in them as QTEs. I also provide
references to more formal versions in the literature.
I assume that the two-parameter gamma distributions are correctly specified for each
potential outcome. Note that this implies that the true distributions are continuous, so
that smooth quantile estimation makes sense.
I assume that, conditional on the covariates, the distribution of the potential
outcomes is independent of the treatment; see, for example, Assumption 1 in Firpo (2007).
This assumption allows us to recover the distributions of potential outcomes from the
conditional-on-treatment distributions on which I have data. I also assume that,
conditional on the covariates, each person has a positive probability of getting either
treatment; see, for example, Assumption 1 in Firpo (2007). This assumption ensures that for
each covariate pattern, there are observations in each treatment level so that
comparisons make sense.
Finally, I assume that the rank of an individual in the conditional-on-covariates
distributions is the same for each treatment level. This assumption is known as the
rank-preservation assumption; see Firpo (2007).
All of these assumptions rule out some cases of interest, but all are standard in the
literature.
5 Simulations
To test the implementation and illustrate the finite-sample performance of the imple-
mented estimator, I ran a Monte Carlo simulation. Each sample had 2,000 observations
and I drew 10,000 samples.
The data-generating process (DGP) had the following features:
• Selection into treatment depended on each of two covariates.
• Each potential outcome came from a two-parameter gamma distribution, and the
  parameters differed by potential outcome.
• A censoring time came from another two-parameter gamma distribution that was
  a function of the same covariates that determined the treatment allocation and
  the potential outcomes. Each potential outcome was possibly censored by setting
  it to the minimum of the true potential outcome and the censoring time.
For more details, see the program used to draw each repetition, which is provided in
section 10.
For each sample I estimated the marginal .25, .50, and .75 quantiles, and the pa-
rameters of the conditional distributions.
Table 1 summarizes the results.
• Column 1 gives the parameter name.
• Column 2 gives the true value of the parameter.
• Column 3 gives the mean of the estimates for the parameter.
• Column 4 gives the standard deviation of the estimates for the parameter.
• Column 5 gives the mean of the estimated standard errors.
• Column 6 gives the rejection rate for a 5% Wald test against the null hypothesis
  that the parameter equals its true value.

Table 1: Results

Parameter     True    Mean    S.D.    Mean S.E.    Rej.
q25_0         0.25    0.25    0.25      0.02       0.05
q25_1         0.27    0.27    0.27      0.02       0.05
q50_0         0.78    0.78    0.78      0.04       0.05
q50_1         1.01    1.01    1.01      0.06       0.05
q75_0         1.84    1.84    1.84      0.09       0.06
q75_1         2.69    2.69    2.69      0.14       0.05
γ_{x2,0}      0.30    0.30    0.30      0.13       0.06
γ_{1,0}       0.12    0.12    0.12      0.08       0.06
β_{x1,0}      0.20    0.20    0.20      0.10       0.05
β_{1,0}       0.12    0.12    0.12      0.03       0.05
γ_{x2,1}      1.00    1.00    1.00      0.11       0.06
γ_{1,1}       0.11    0.11    0.11      0.09       0.05
β_{x1,1}      0.50    0.50    0.50      0.08       0.06
β_{1,1}       0.11    0.11    0.11      0.03       0.05
The results in table 1 indicate that the command and the method perform well.
There is no evidence of bias and the rejection rates are all close to the expected .05.
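The columns of table 1 can be computed with Stata's simulate prefix. What follows is only a sketch of one way to organize such a study, not the program actually used: it assumes mqgamma is installed and that mkxdata from section 10 has been defined, and the wrapper name onerep, the seed, and the reduced number of repetitions are hypothetical. It tracks a single parameter, q25_0, and uses the rounded true value from table 1, so its rejection rate is only indicative.

```stata
* Sketch only: requires mqgamma and the mkxdata program from section 10.
* onerep is a hypothetical wrapper name.
program define onerep, rclass
    mkxdata
    mqgamma w x2, treat(treat) fail(f) lns(x1) quantile(.25 .5 .75)
    return scalar b  = _b[q25_0:_cons]
    return scalar se = _se[q25_0:_cons]
end

* 1,000 repetitions to keep the sketch quick (the study above used 10,000)
simulate b=r(b) se=r(se), reps(1000) seed(12345): onerep

* Rounded true value taken from table 1; an exact value would be preferable
local truth = .25
generate byte reject = abs((b - `truth')/se) > invnormal(.975)

* Mean and S.D. of b, mean of se, and the rejection rate
summarize b se reject
```

The mean of b, the standard deviation of b, the mean of se, and the mean of reject correspond to columns 3 through 6 of the table for this one parameter.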
6 Extensions
The implemented RA estimator can be extended to a finite integer set of k > 2 treat-
ments by estimating different parameters for each treatment level and using each set of
estimated parameters to estimate the marginal quantiles. I will perform this extension
in future research.
The idea underlying the implemented RA estimator could also be applied to estimate
marginal hazard functions, at a point, for each potential outcome. Differences and ra-
tios of these estimated marginal hazard functions define hazard treatment effects. The
point-wise asymptotics follow immediately from the point-wise results in Newey and
McFadden (1994). Results over intervals of points and tests for, say, stochastic domi-
nance would require another asymptotic framework such as empirical process theory. I
will also investigate these extensions in future research.
Another set of extensions replaces the parametric gamma distribution with a flexible
distribution built from terms of a basis for the space of possible distributions, such
as orthogonal polynomials; see Chen (2007) and Gallant and Nychka (1987). I will also
investigate these extensions in future research.
7 Appendix: Syntax
Here is the syntax of the mqgamma command.
mqgamma depvar indepvars [if] [in], treat(varname) [quantile(numlist)
        lns(varlist) fail(varname) aequations from(matrix) display_options]
Note that the indepvars are used to model the natural log of the conditional mean
of the depvar. See section 9 for details.
Options
treat(varname) is a required option and it specifies the binary treatment variable. The
treatment variable must be coded 0 for control cases and 1 for treated cases.
quantile(numlist) specifies the marginal quantiles to be estimated. Each specified
quantile must be in (0,1).
lns(varlist) specifies the variables used to model the natural log of the scale. See section
9 for details.
fail(varname) specifies the binary failure indicator, which must be coded 1 for an
observed value and 0 for a censored observation.
aequations specifies that the auxiliary-equation parameters should be displayed.
from(matrix) specifies a row vector of initial values for the optimization routine. Each
element in the specified matrix specifies the initial value for the corresponding
parameter.
display_options are the standard display options; see [R] estimation options.
7.1 Postestimation syntax
After mqgamma, predict has the following syntax.
predict [type] newvarname [if] [in] [, equation(eqno)]
Options
equation(eqno) specifies the equation for which predict calculates the xb term. eqno
specifies the equation number or name. For example, equation(#1) specifies the first
equation and equation(z_0) specifies the z_0 equation. See [R] predict for more details
about the equation() option.
After mqgamma, predict computes linear predictions from the fitted model, known
as xb terms in Stata parlance.
8 Appendix: Saved results
mqgamma saves the following results in e().

Scalars
    e(N)            number of observations
    e(converged)    1 if the estimator converged
    e(k_eq)         number of equations to display
    e(k_quant)      number of estimated quantiles
Macros
    e(cmd)          mqgamma
    e(cmdline)      command line input
    e(title)        title for estimation header
    e(vce)          robust
    e(vcetype)      Robust
    e(quantile)     quantiles estimated
    e(predict)      mqgamma_p
Matrices
    e(b)            coefficient vector
    e(V)            variance-covariance matrix of the estimators
Functions
    e(sample)       marks estimation sample
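As a brief sketch of how these results might be used, the lines below assume that mqgamma has just been run, so its results are the active estimation results; only result names listed in this section are referenced.

```stata
* Assumes mqgamma has just been run.
display "N = " e(N) ", converged = " e(converged)
display "quantiles estimated: `e(quantile)'"

* Inspect the coefficient vector and its variance matrix
matrix list e(b)
matrix V = e(V)
matrix list V

* Count the observations in the estimation sample
count if e(sample)
```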
9 Appendix: Methods and Formulas
The probability density function (PDF) of the two-parameter gamma distribution is
frequently written as

$$
f(y|\alpha,\beta) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, y^{\alpha-1} e^{-y/\beta}
$$

To model the conditional-on-covariates PDF, I let

$$
\begin{aligned}
\alpha &= 1/s_i^2 = \exp[-2\ln(s_i)] = \exp(-2\mathbf{x}_i\boldsymbol{\beta}') \\
\beta &= e^{z_i}s_i^2 = \exp(\mathbf{w}_i\boldsymbol{\gamma}')\exp(2\mathbf{x}_i\boldsymbol{\beta}')
\end{aligned}
$$

The conditional mean is $E[y|z_i,s_i] = e^{z_i}$ and the conditional variance is
$\mathrm{Var}[y|z_i,s_i] = e^{2z_i}s_i^2$. Thus, $z_i$ parameterizes the natural log of the
conditional mean and $s_i$ parameterizes the scale. The variables specified as indepvars
are the $\mathbf{w}_i$ used to model the natural log of the conditional mean;
$z_i = \mathbf{w}_i\boldsymbol{\gamma}'$. The variables specified in option lns()
are the $\mathbf{x}_i$ used to model the natural log of $s_i$;
$\ln(s_i) = \mathbf{x}_i\boldsymbol{\beta}'$.

This parameterization of $\alpha$ and $\beta$ yields a PDF of

$$
f(y_i|z_i,s_i) = \frac{1}{\Gamma(s_i^{-2})\,(e^{z_i}s_i^2)^{s_i^{-2}}}\; y_i^{s_i^{-2}-1}\, e^{-y_i/(e^{z_i}s_i^2)}
$$

and CDF of

$$
F(y_i|z_i,s_i) = G\left(s_i^{-2},\; y_i e^{-z_i-2\ln(s_i)}\right)
$$

where $G(a, x)$ is the CDF of a one-parameter gamma distribution evaluated at $x$, which
is implemented in Stata as gammap(a, x).
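Letting $c_i = 1$ if observation $i$ is censored and $c_i = 0$ when the observation is not
censored yields the following log-likelihood function for observation $i$

$$
\begin{aligned}
L_i ={}& (1-c_i)\left\{-\ln\Gamma(s_i^{-2}) - s_i^{-2}\left[z_i + 2\ln(s_i) + y_i e^{-z_i}\right] + (s_i^{-2}-1)\ln(y_i)\right\} \\
&+ c_i \ln\left\{1 - G\left(s_i^{-2},\; y_i e^{-z_i-2\ln(s_i)}\right)\right\}
\end{aligned}
$$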
This log-likelihood is implemented separately for the treated observations and for
the control observations. For each treatment level, the score equations for the model
parameters are estimating equations. In addition, for each treatment level $d \in \{0,1\}$,
each quantile $q_d$ solves a sample estimating equation

$$
\frac{1}{N}\sum_{i=1}^{N} G\left\{\exp(-2\mathbf{x}_i\widehat{\boldsymbol{\beta}}_d'),\; q_d/\left[\exp(\mathbf{w}_i\widehat{\boldsymbol{\gamma}}_d')\exp(2\mathbf{x}_i\widehat{\boldsymbol{\beta}}_d')\right]\right\} - \tau = 0
$$

where $\tau \in (0,1)$ and $G(a, w)$ is the CDF of a one-parameter gamma distribution
evaluated at $w$, which is implemented in Stata as gammap(a, w).
The estimating equations are solved jointly using gmm, whose robust standard errors
are consistent.
The implemented RA estimator is a simple application of stacking the moment conditions
from a parametric estimator and smooth functions of the parameters. Newey (1984) discussed
stacking the moment conditions to produce a one-step consistent and asymptotically
normal estimator. Alternatively, I could prove the consistency and asymptotic normality
of the implemented RA estimator using the results in Newey and McFadden (1994).
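The stacking can be sketched for a single quantile $\tau$. The notation here is an assumption made for illustration: $D_i$ denotes the observed treatment indicator, $\boldsymbol{\theta}$ collects $(\boldsymbol{\gamma}_0, \boldsymbol{\beta}_0, \boldsymbol{\gamma}_1, \boldsymbol{\beta}_1, q_0, q_1)$, and $L_i(\cdot)$ is the log-likelihood contribution given above evaluated at the parameters of one treatment level.

```latex
% Stacked moment function for observation i (one quantile tau; D_i is assumed notation)
\mathbf{g}_i(\boldsymbol{\theta}) =
\begin{pmatrix}
\mathbf{1}(D_i = 0)\,\partial L_i(\boldsymbol{\gamma}_0,\boldsymbol{\beta}_0)
   /\partial(\boldsymbol{\gamma}_0,\boldsymbol{\beta}_0)' \\[4pt]
\mathbf{1}(D_i = 1)\,\partial L_i(\boldsymbol{\gamma}_1,\boldsymbol{\beta}_1)
   /\partial(\boldsymbol{\gamma}_1,\boldsymbol{\beta}_1)' \\[4pt]
G\left\{\exp(-2\mathbf{x}_i\boldsymbol{\beta}_0'),\;
   q_0/\left[\exp(\mathbf{w}_i\boldsymbol{\gamma}_0')\exp(2\mathbf{x}_i\boldsymbol{\beta}_0')\right]\right\} - \tau \\[4pt]
G\left\{\exp(-2\mathbf{x}_i\boldsymbol{\beta}_1'),\;
   q_1/\left[\exp(\mathbf{w}_i\boldsymbol{\gamma}_1')\exp(2\mathbf{x}_i\boldsymbol{\beta}_1')\right]\right\} - \tau
\end{pmatrix},
\qquad
\frac{1}{N}\sum_{i=1}^{N}\mathbf{g}_i(\widehat{\boldsymbol{\theta}}) = \mathbf{0}
```

The score blocks use only the observations at the corresponding treatment level, while the quantile blocks average over all $N$ observations. Because the system is exactly identified, the weight matrix does not affect the point estimates.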
10 Appendix: Data Generation
The following command was used to generate each draw from the DGP.
program define mkxdata
    drop _all
    set obs 2000

    // generate covariates
    gen double x1 = rchi2(3)/10
    gen double x2 = rchi2(4)/7

    // one uniform draw is shared by y0 and y1, so an individual's
    // conditional rank is the same under both treatment levels
    gen double uy = runiform()

    // generate y0: shape = 1/s^2, scale = s^2*exp(z)
    gen double ln_sy0  = .12 + .2*x1
    gen double sy0     = exp(ln_sy0)
    gen double zy0     = .12 + .3*x2
    gen double exp_zy0 = exp(zy0)
    gen double alphay0 = 1/(sy0^2)
    gen double betay0  = (sy0^2)*exp_zy0
    gen double y0      = invgammap(alphay0, uy)*(betay0)

    // generate y1
    gen double ln_sy1  = .11 + .5*x1
    gen double sy1     = exp(ln_sy1)
    gen double zy1     = .11 + 1.0*x2
    gen double exp_zy1 = exp(zy1)
    gen double alphay1 = 1/(sy1^2)
    gen double betay1  = (sy1^2)*exp_zy1
    gen double y1      = invgammap(alphay1, uy)*(betay1)

    // generate censoring time from its own uniform draw
    gen double uc     = runiform()
    gen double ln_sc  = .7 + .7*x1
    gen double sc     = exp(ln_sc)
    gen double zc     = 3.3 + 3.2*x2
    gen double exp_zc = exp(zc)
    gen double alphac = 1/(sc^2)
    gen double betac  = (sc^2)*exp_zc
    gen double c      = invgammap(alphac, uc)*(betac)

    // treatment allocation depends on both covariates
    gen treat = (-.6 + .5*x1 + .75*x2 + rnormal()) > 0

    // observed outcome w and failure indicator f (1 = not censored)
    gen double w = treat*min(y1,c) + (1-treat)*min(y0,c)
    gen f0 = y0<=c
    gen f1 = y1<=c
    gen double f = treat*f1 + (1-treat)*f0
    gen double cons = 1
end
11 References
Cattaneo, M. 2010. Efficient semiparametric estimation of multi-valued treatment effects
under ignorability. Journal of Econometrics 155(2): 138–154.
Cattaneo, M. D., D. M. Drukker, and A. D. Holland. 2013. Estimation of multivalued
treatment effects under conditional independence. Stata Journal 13(3): ??
Chen, X. 2007. Large sample sieve estimation of semi-nonparametric models. In Hand-
book of Econometrics, vol. 6, 5549–5632. Amsterdam: Elsevier.
Firpo, S. 2007. Efficient semiparametric estimation of quantile treatment effects. Econo-
metrica 75(1): 259–276.
Frölich, M., and B. Melly. 2010. Estimation of quantile treatment effects
with Stata. Stata Journal 10(3): 423–457.
http://www.stata-journal.com/article.html?article=st0203.
Gallant, A. R., and D. W. Nychka. 1987. Semi-nonparametric maximum likelihood
estimation. Econometrica 55(2): 363–390.
Holland, P. W. 1986. Statistics and causal inference. Journal of the American Statistical
Association 81(396): 945–960.
Imbens, G. W., and J. M. Wooldridge. 2009. Recent developments in the econometrics
of program evaluation. Journal of Economic Literature 47(1): 5–86.
Newey, W. K. 1984. A method of moments interpretation of sequential estimators.
Economics Letters 14(2): 201–206.
Newey, W. K., and D. McFadden. 1994. Large sample estimation and hypothesis testing.
In Handbook of Econometrics, vol. 4, 2111–2245. Amsterdam: Elsevier.
Wooldridge, J. M. 2002. Inverse probability weighted M-estimators for sample selection,
attrition, and stratification. Portuguese Economic Journal 1: 117–139.
Wooldridge, J. M. 2007. Inverse probability weighted estimation for general missing
data problems. Journal of Econometrics 141(2): 1281–1301.
Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed.
Cambridge, Massachusetts: MIT Press.
About the author
David M. Drukker is the Director of Econometrics at Stata.