The Stata Journal (yyyy) vv, Number ii, pp. 1–13

Quantile treatment effect estimation from censored data by regression adjustment

David M. Drukker

Stata

College Station, Texas

ddrukker@stata.com

Abstract. I discuss the mqgamma command that estimates the quantiles of the potential-outcome distributions for each treatment level from censored observational data in which the dependent variable is inherently positive, such as time-to-event data and health-expenditure data. Differences in these marginal quantiles are quantile treatment effects.

The implemented estimator is a regression-adjustment type estimator based on a two-parameter gamma distribution for each potential outcome.

(This prepublication draft was distributed on 19 June 2014.)

Keywords: st0001, mqgamma, causal inference, treatment effects, quantile estimation

1 Introduction

I discuss the mqgamma command that estimates the quantiles of the potential-outcome distributions for each treatment level from censored observational data in which the dependent variable is inherently positive, such as time-to-event data and health-expenditure data. Differences in these marginal quantiles are quantile treatment effects (QTEs).

QTEs can vary by quantile. For example, the treatment effect of exercise on individuals with weak hearts could be significantly smaller than the treatment effect on those with strong hearts. The treatment effect at a lower marginal quantile of time to second heart attack could be significantly smaller than the treatment effect at an upper marginal quantile.

The regression-adjustment (RA) estimator implemented in mqgamma handles observational data. The logic of the estimator has several steps, but solving all the estimating equations jointly produces a consistent, asymptotically normal, one-step estimator.

The RA estimator implemented in mqgamma is a simple application of parametric modeling techniques. The details of the estimator are provided in section 9. As an RA estimator, the implemented estimator is related to the RA estimators for average treatment effects discussed in Wooldridge (2010, chapter 21). As an estimator of a quantile treatment effect, the implemented estimator is related to the estimators discussed in Frölich and Melly (2010), Cattaneo (2010), Cattaneo, Drukker, and Holland (2013), and Firpo (2007).

© yyyy StataCorp LP st0001

To extend the weighting estimators derived by Firpo (2007) and Cattaneo (2010), one would have to model the probability of censoring and then jointly correct for missingness due to treatment and missingness due to censoring. (This extension follows immediately from the results in Wooldridge (2002, 2007).) The advantage of the implemented RA estimator over the weighting estimators is that only an outcome model is required. The weighting estimators require models for the treatment allocation and the censoring process. Furthermore, the model for the censoring process is most likely just as complicated as the model for the outcome.

2 An example

Suppose I have data from a three-year study that followed a random sample of middle-aged men who had previously had a heart attack. I am interested in whether an exercise regime affects the time to a second heart attack. Some observations on the time to second heart attack are censored. Because the data are observational, treatment allocation depends on covariates, and I use a model for the outcome to adjust for this dependence.

Key to this story is that exercise could help individuals with relatively strong hearts

but not help those with weak hearts. An individual with a “strong heart” would be

in an upper quantile of the marginal distribution, over the covariates, of the potential

outcome generated by each treatment level. Analogously, an individual with a “weak

heart” would be in a lower quantile of the marginal distribution of the potential outcome

generated by each treatment level. As is standard in the treatment-eﬀect literature, the

treated potential outcome is the variable that would occur if everyone in the population

received the treatment. Analogously, the control potential outcome is the variable

that would occur if everyone received the control. See Holland (1986), Imbens and Wooldridge (2009), and Wooldridge (2010) for discussions and further references.

I estimate the diﬀerence in the marginal quantiles for the treated potential-outcome

distribution and the control potential-outcome distribution at the upper quantile of .75.

This diﬀerence in marginal quantiles is a QTE at quantile .75, denoted by QTE(.75).

Similarly, I estimate the diﬀerence in the marginal quantiles for the treated potential-

outcome distribution and control potential-outcome distribution at the lower quantile

of .25. This diﬀerence in marginal quantiles is a QTE at quantile .25, denoted by

QTE(.25). Our story indicates that the QTE(.75) should be significantly larger than the QTE(.25).
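In symbols, the quantities in this example can be written as follows. The notation is assumed here for illustration and is not taken from the command itself: $F_{y_j}$ denotes the marginal distribution function of potential outcome $y_j$, and $q_j(\tau)$ denotes its $\tau$ quantile.

```latex
\begin{aligned}
q_j(\tau) &= \inf\{q : F_{y_j}(q) \ge \tau\}, \qquad j \in \{0,1\}, \\
\mathrm{QTE}(\tau) &= q_1(\tau) - q_0(\tau), \qquad \tau \in (0,1).
\end{aligned}
```

Here $j = 1$ is the treated potential outcome and $j = 0$ is the control potential outcome, so QTE(.75) and QTE(.25) are this difference evaluated at $\tau = .75$ and $\tau = .25$.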

In the (fictional) data for this example, t is the possibly censored observation on the time in years to second heart attack, and fail is 1 if the observation is not censored and 0 if it is censored. In this oversimplified example, the covariates are health, an index of pretreatment health status, and active, an index of pretreatment activity level. The binary variable exercise is 1 if an individual joins the exercise regime and 0 if he does not.

I begin by estimating the marginal .25 quantile and the marginal .75 quantile for

each potential-outcome distribution.

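. use exercise

. mqgamma t active, treat(exercise) fail(fail) lns(health) quantile(.25 .75)
Iteration 0:  EE criterion = .7032254
Iteration 1:  EE criterion = .05262105
Iteration 2:  EE criterion = .00028553
Iteration 3:  EE criterion = 6.892e-07
Iteration 4:  EE criterion = 4.706e-12
Iteration 5:  EE criterion = 1.604e-22

Gamma quantile-treatment-effect estimation      Number of obs     =       2000

------------------------------------------------------------------------------
             |               Robust
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
q25_0        |
       _cons |   .2151604   .0159611    13.48   0.000     .1838771    .2464436
-------------+----------------------------------------------------------------
q25_1        |
       _cons |   .2612655   .0249856    10.46   0.000     .2122946    .3102364
-------------+----------------------------------------------------------------
q75_0        |
       _cons |   1.591147   .0725607    21.93   0.000      1.44893    1.733363
-------------+----------------------------------------------------------------
q75_1        |
       _cons |   2.510068   .1349917    18.59   0.000     2.245489    2.774647
------------------------------------------------------------------------------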

The estimated .25 quantile for the treated potential outcome is .26, while the estimated .25 quantile for the control potential outcome is .22. The estimated .75 quantile for the treated potential outcome is 2.51, while the estimated .75 quantile for the control potential outcome is 1.59. These point estimates suggest that the QTE(.75) is larger than the QTE(.25). Below I use nlcom to estimate the QTEs from the estimated marginal quantiles.

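. nlcom (_b[q25_1:_cons] - _b[q25_0:_cons]) ///
>       (_b[q75_1:_cons] - _b[q75_0:_cons])

       _nl_1:  _b[q25_1:_cons] - _b[q25_0:_cons]
       _nl_2:  _b[q75_1:_cons] - _b[q75_0:_cons]

------------------------------------------------------------------------------
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |   .0461051   .0295846     1.56   0.119    -.0118796    .1040899
       _nl_2 |   .9189214   .1529012     6.01   0.000     .6192405    1.218602
------------------------------------------------------------------------------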

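The estimated QTE(.75) of .92 is significantly different from zero, while the estimated QTE(.25) of .05 is not. The two 95% confidence intervals do not overlap, which supports the conjecture that the QTE(.75) is larger than the QTE(.25).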
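A direct test of whether the two QTEs differ can be obtained by passing their difference to nlcom, which uses the joint covariance matrix posted by mqgamma. The following is a minimal sketch that continues the session above; the label qte_diff is a name introduced here, and the equation names are those shown in the mqgamma output. From the point estimates already reported, the estimated difference is .9189214 - .0461051 = .8728163; its standard error depends on covariances that are not displayed above.

```stata
* Continues the session above (mqgamma results must be the active estimates).
* H0: QTE(.75) - QTE(.25) = 0, with a delta-method standard error.
nlcom (qte_diff: (_b[q75_1:_cons] - _b[q75_0:_cons]) ///
               - (_b[q25_1:_cons] - _b[q25_0:_cons]))
```

Because the expression is linear in the parameters, lincom would give the same answer; nlcom is used here only to match the command used above.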

3 Estimator details: Some examples

The estimator implemented in mqgamma is a regression-adjustment type estimator. For each treatment level, after finding the maximum-likelihood (ML) estimates of the conditional-on-covariates distribution, the implemented estimator uses the ML estimates to estimate the marginal quantiles.

This section clariﬁes the above description by discussing a detailed example of how

this RA estimator works. (You may skip this section if you wish to avoid these details.)

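The implemented RA estimator models each treatment level using a two-parameter gamma distribution. For treatment level $j \in \{0,1\}$, as a function of covariates $\mathbf{x}_i$ and $\mathbf{w}_i$, the shape parameter is parameterized as $\alpha_j = \exp(-2\mathbf{x}_i\boldsymbol{\beta}_j')$ and the scale parameter is parameterized as $\beta_j = \exp(\mathbf{w}_i\boldsymbol{\gamma}_j')\exp(2\mathbf{x}_i\boldsymbol{\beta}_j')$. (The scalar scale parameter $\beta_j$ is distinct from the coefficient vector $\boldsymbol{\beta}_j$.) To facilitate model specification, I note that a variable $y_j$ with this two-parameter gamma distribution has conditional mean

$$
\begin{aligned}
E[y_j \mid \mathbf{x}, \mathbf{w}] &= \alpha_j\beta_j \\
&= \exp(-2\mathbf{x}_i\boldsymbol{\beta}_j')\exp(\mathbf{w}_i\boldsymbol{\gamma}_j')\exp(2\mathbf{x}_i\boldsymbol{\beta}_j') \\
&= \exp(\mathbf{w}_i\boldsymbol{\gamma}_j')
\end{aligned}
$$

The conditional variance is given by

$$
\begin{aligned}
\mathrm{Var}[y_j \mid \mathbf{x}, \mathbf{w}] &= \alpha_j\beta_j^2 \\
&= \exp(-2\mathbf{x}_i\boldsymbol{\beta}_j')\left[\exp(\mathbf{w}_i\boldsymbol{\gamma}_j')\exp(2\mathbf{x}_i\boldsymbol{\beta}_j')\right]^2 \\
&= \exp(2\mathbf{w}_i\boldsymbol{\gamma}_j' + 2\mathbf{x}_i\boldsymbol{\beta}_j')
\end{aligned}
$$

The conditional distribution function is given by

$$
F(y_j \mid \mathbf{x}, \mathbf{w}) = G\left(\exp(-2\mathbf{x}_i\boldsymbol{\beta}_j'),\ \frac{y}{\exp(\mathbf{w}_i\boldsymbol{\gamma}_j')\exp(2\mathbf{x}_i\boldsymbol{\beta}_j')}\right)
$$

where $G(a, x)$ is the distribution function of the one-parameter gamma distribution, implemented in Stata as gammap(a, x).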
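The mean, variance, and distribution-function formulas above can be checked by simulation. The following is a minimal, self-contained sketch that does not use mqgamma; the index values xb and wg, the evaluation point q, the seed, and the sample size are hypothetical choices made only for illustration. It relies on the built-in functions rgamma(a, b), which draws gamma variates with shape a and scale b, and gammap(a, x).

```stata
clear
set seed 20140619
set obs 200000

* Hypothetical index values for a single covariate pattern
scalar xb = 0.10                        // x_i*beta_j'
scalar wg = 0.30                        // w_i*gamma_j'
scalar alpha = exp(-2*xb)               // shape parameter
scalar beta  = exp(wg)*exp(2*xb)        // scale parameter

generate double y = rgamma(scalar(alpha), scalar(beta))

* Simulated moments versus the closed forms derived above
summarize y
display "simulated mean     = " r(mean) "   exp(wg)          = " exp(wg)
display "simulated variance = " r(Var)  "   exp(2*wg + 2*xb) = " exp(2*wg + 2*xb)

* Empirical distribution function at q versus gammap()
scalar q = 1
count if y <= scalar(q)
display "empirical CDF at q = " r(N)/_N "   gammap() = " gammap(alpha, q/beta)
```

Each pair of displayed values should agree up to simulation error.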

I repeat the previous estimation, now specifying the option aequations so that the command reports the auxiliary parameters.

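. mqgamma t active, treat(exercise) fail(fail) lns(health) ///
>         quantile(.25 .75) aequations
Iteration 0:  EE criterion = .7032254
Iteration 1:  EE criterion = .05262105
Iteration 2:  EE criterion = .00028553
Iteration 3:  EE criterion = 6.892e-07
Iteration 4:  EE criterion = 4.706e-12
Iteration 5:  EE criterion = 1.604e-22

Gamma quantile-treatment-effect estimation      Number of obs     =       2000

------------------------------------------------------------------------------
             |               Robust
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
q25_0        |
       _cons |   .2151604   .0159611    13.48   0.000     .1838771    .2464436
-------------+----------------------------------------------------------------
q25_1        |
       _cons |   .2612655   .0249856    10.46   0.000     .2122946    .3102364
-------------+----------------------------------------------------------------
q75_0        |
       _cons |   1.591147   .0725607    21.93   0.000      1.44893    1.733363
-------------+----------------------------------------------------------------
q75_1        |
       _cons |   2.510068   .1349917    18.59   0.000     2.245489    2.774647
-------------+----------------------------------------------------------------
z_0          |
      active |   .1571665   .1363228     1.15   0.249    -.1100212    .4243542
       _cons |   .0588663   .0824981     0.71   0.476     -.102827    .2205596
-------------+----------------------------------------------------------------
lns_0        |
      health |   .1667081   .1069483     1.56   0.119    -.0429067    .3763229
       _cons |   .1275698   .0356235     3.58   0.000     .0577489    .1973906
-------------+----------------------------------------------------------------
z_1          |
      active |   .9148007   .1010204     9.06   0.000     .7168043    1.112797
       _cons |   .0968546   .0905065     1.07   0.285    -.0805349    .2742441
-------------+----------------------------------------------------------------
lns_1        |
      health |   .4050442   .0851527     4.76   0.000      .238148    .5719404
       _cons |   .1299308   .0368826     3.52   0.000     .0576421    .2022194
------------------------------------------------------------------------------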

Note that

• the z_0 equation reports the estimated $\boldsymbol{\gamma}_0$ for the control potential outcome,
• the lns_0 equation reports the estimated $\boldsymbol{\beta}_0$ for the control potential outcome,
• the z_1 equation reports the estimated $\boldsymbol{\gamma}_1$ for the treated potential outcome, and
• the lns_1 equation reports the estimated $\boldsymbol{\beta}_1$ for the treated potential outcome.

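Given the control-potential-outcome estimates $\hat{\boldsymbol{\beta}}_0$ and $\hat{\boldsymbol{\gamma}}_0$,

$$
\frac{1}{N}\sum_{i=1}^{N} G\left(\exp(-2\mathbf{x}_i\hat{\boldsymbol{\beta}}_0'),\ \frac{q}{\exp(\mathbf{w}_i\hat{\boldsymbol{\gamma}}_0')\exp(2\mathbf{x}_i\hat{\boldsymbol{\beta}}_0')}\right)
$$

consistently estimates the marginal distribution, over the covariates, of t for the control potential outcome at the point $q$. The $\hat{q}$ that sets this average to a value $\tau \in (0,1)$ is a consistent estimator of the marginal $\tau$ quantile in the control potential outcome.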

I provide two illustrations of this point. First, I show that the average of the conditional distribution function for the controls at the estimated .25 marginal quantile is .25. I begin this first illustration by computing $\mathbf{w}_i\hat{\boldsymbol{\gamma}}_0'$ for all the observations in the data.

. predict double z0, equation(z_0)

Now I compute $\mathbf{x}_i\hat{\boldsymbol{\beta}}_0'$ for all the observations in the data.

. predict double lns0, equation(lns_0)

Next I compute the conditional-on-covariates distribution at the marginal-quantile

point .2151604 for all the observations in the data. The mean of this variable estimates

the marginal distribution for the control potential outcome at the marginal-quantile

point .2151604 and this mean is .25 because .2151604 is the .25 quantile of this distri-

bution.

. generate double cd0 = gammap(exp(-2*lns0), .2151604/(exp(z0)*exp(2*lns0)))

. sum cd0

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         cd0 |       2000         .25    .0213933   .1777323   .3922674
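The same check can be repeated for the treated potential outcome. This sketch continues the session above and assumes, by analogy with the control case, that the z_1 and lns_1 equations and the reported q25_1 estimate of .2612655 play the same roles as their control counterparts. The variable names z1, lns1, and cd1 are introduced here.

```stata
* Linear predictions from the treated-outcome auxiliary equations
predict double z1, equation(z_1)
predict double lns1, equation(lns_1)

* Conditional distribution function at the estimated .25 marginal quantile
* of the treated potential outcome, evaluated for every observation
generate double cd1 = gammap(exp(-2*lns1), .2612655/(exp(z1)*exp(2*lns1)))

* If .2612655 is the .25 marginal quantile, the mean should be close to .25
summarize cd1
```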

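Second, I use the gmm command to show that .2151604 is the value that solves the sample moment condition

$$
\frac{1}{N}\sum_{i=1}^{N} G\left(\exp(-2\mathbf{x}_i\hat{\boldsymbol{\beta}}_0'),\ \frac{\hat{q}}{\exp(\mathbf{w}_i\hat{\boldsymbol{\gamma}}_0')\exp(2\mathbf{x}_i\hat{\boldsymbol{\beta}}_0')}\right) - .25 = 0
$$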

. gmm ( gammap(exp(-2*lns0), {qh}/(exp(z0)*exp(2*lns0))) - .25), onestep

Step 1
Iteration 0:  GMM criterion Q(b) = .0625
Iteration 1:  GMM criterion Q(b) = .00235841
Iteration 2:  GMM criterion Q(b) = .00001057
Iteration 3:  GMM criterion Q(b) = 1.934e-10
Iteration 4:  GMM criterion Q(b) = 6.505e-20

GMM estimation

Number of parameters =   1
Number of moments    =   1
Initial weight matrix: Unadjusted               Number of obs     =       2000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /qh |   .2151604   .0006355   338.57   0.000     .2139148    .2164059
------------------------------------------------------------------------------
Instruments for equation 1: _cons

As expected, gmm reports the estimated parameter to be .2151604. The standard errors reported by gmm do not match those reported by mqgamma because I used gmm to solve this moment condition taking the other estimated parameters as fixed. mqgamma estimates all the parameters jointly, so its standard errors account for the estimation of the auxiliary parameters and are consistent.

Further estimation details are provided in section 9.

4 Estimator details: Some assumptions

In this section, I provide an intuitive discussion of the assumptions needed to identify

the marginal quantiles and to interpret diﬀerences in them as QTEs. I also provide

references to more formal versions in the literature.

I assume that the two-parameter gamma distributions are correctly speciﬁed for each

potential outcome. Note that this implies that the true distributions are continuous, so

that smooth quantile estimation makes sense.

I assume that, conditional on the covariates, the distribution of the potential outcomes is independent of the treatment; see, for example, Assumption 1 in Firpo (2007). This assumption allows me to recover the distributions of the potential outcomes from the conditional-on-treatment distributions on which I have data. I also assume that, conditional on the covariates, each person has a positive probability of getting either treatment; see, for example, Assumption 1 in Firpo (2007). This assumption ensures that for each covariate pattern, there are observations in each treatment level so that comparisons make sense.

Finally, I assume that the rank of an individual in the conditional-on-covariates

distributions is the same for each treatment level. This assumption is known as the

rank-preservation assumption; see Firpo (2007).

All of these assumptions rule out some cases of interest, but all are standard in the

literature.
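The role of the conditional-independence and overlap assumptions can be made explicit. The following is a sketch in notation assumed here for illustration: $d$ is the treatment indicator, $t$ is the observed outcome, $\mathbf{x}$ collects all the covariates, and censoring is ignored because it is handled by the likelihood in section 9.

```latex
\begin{aligned}
F_{y_j}(q) &= E_{\mathbf{x}}\left[\Pr(y_j \le q \mid \mathbf{x})\right]
  && \text{(law of iterated expectations)} \\
&= E_{\mathbf{x}}\left[\Pr(y_j \le q \mid \mathbf{x}, d = j)\right]
  && \text{(conditional independence)} \\
&= E_{\mathbf{x}}\left[\Pr(t \le q \mid \mathbf{x}, d = j)\right]
  && \text{(} t = y_j \text{ when } d = j \text{)}
\end{aligned}
```

The overlap assumption, $0 < \Pr(d = 1 \mid \mathbf{x}) < 1$, guarantees that the conditioning event $d = j$ has positive probability at every covariate value, so the last line is well defined. Its sample analog, with the gamma model supplying the conditional distribution, is the average of $G(\cdot)$ over the observations used in section 3.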

5 Simulations

To test the implementation and illustrate the ﬁnite-sample performance of the imple-

mented estimator, I ran a Monte Carlo simulation. Each sample had 2,000 observations

and I drew 10,000 samples.

The data-generating process (DGP) had the following features:

•Selection into treatment depended on each of two covariates.

•Each potential outcome came from a two-parameter gamma distribution and the

parameters diﬀered by potential outcome.

•A censoring time came from another two-parameter gamma distribution that was

a function of the same covariates that determined the treatment allocation and

the potential outcomes. Each potential outcome was possibly censored by setting

it to the minimum of the true potential outcome and the censoring time.

For more details, see the program used to draw each repetition, which is provided in section 10.

For each sample I estimated the marginal .25, .50, and .75 quantiles, and the pa-

rameters of the conditional distributions.
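The simulation driver itself is not shown in this article, so the following is only a sketch of how one replication could be wrapped and repeated with simulate. It assumes that mqgamma is installed and that the mkxdata program from section 10 has been defined. The variable roles (w as the outcome, f as the failure indicator, x2 in the mean equation, x1 in lns()) are inferred from the DGP code and the parameter labels in table 1. The program name onerep, the returned names, the seed, and the small number of repetitions are hypothetical.

```stata
capture program drop onerep
program define onerep, rclass
    // draw one sample from the DGP (mkxdata is defined in section 10)
    mkxdata
    // fit the model; variable roles are inferred from the DGP code
    quietly mqgamma w x2, treat(treat) fail(f) lns(x1) quantile(.25 .5 .75)
    // return one quantile and one auxiliary parameter with standard errors
    return scalar q25_0    = _b[q25_0:_cons]
    return scalar se_q25_0 = _se[q25_0:_cons]
    return scalar b_x2_0   = _b[z_0:x2]
    return scalar se_x2_0  = _se[z_0:x2]
end

set seed 12345
simulate q25_0=r(q25_0) se_q25_0=r(se_q25_0)    ///
         b_x2_0=r(b_x2_0) se_x2_0=r(se_x2_0),   ///
         reps(100): onerep

* Mean and standard deviation of the estimates, and mean of the standard errors
summarize q25_0 se_q25_0 b_x2_0 se_x2_0

* Rejection rate of a nominal 5% Wald test that the x2 coefficient in the
* control mean equation equals its true value of .3 in the DGP
generate byte reject = abs((b_x2_0 - .3)/se_x2_0) > invnormal(.975)
summarize reject
```

The Wald test is shown only for a parameter whose true value is stated exactly in the DGP code. The true marginal quantiles are reported in table 1 only to two decimals, so they are not hard-coded here.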

Table 1 summarizes the results.

•Column 1 gives the parameter name.

•Column 2 gives the true value of the parameter.

•Column 3 gives the mean of the estimates for the parameter.

•Column 4 gives the standard deviation of the estimates for the parameter.

•Column 5 gives the mean of the estimated standard errors.

•Column 6 gives the rejection rate for a 5% Wald test against the null hypothesis

that the parameter equals its true value.

Table 1: Results

Parameter     True    Mean    S.D.    Mean S.E.    Rej.
q25_0         0.25    0.25    0.25    0.02         0.05
q25_1         0.27    0.27    0.27    0.02         0.05
q50_0         0.78    0.78    0.78    0.04         0.05
q50_1         1.01    1.01    1.01    0.06         0.05
q75_0         1.84    1.84    1.84    0.09         0.06
q75_1         2.69    2.69    2.69    0.14         0.05
γ_{x2,0}      0.30    0.30    0.30    0.13         0.06
γ_{1,0}       0.12    0.12    0.12    0.08         0.06
β_{x1,0}      0.20    0.20    0.20    0.10         0.05
β_{1,0}       0.12    0.12    0.12    0.03         0.05
γ_{x2,1}      1.00    1.00    1.00    0.11         0.06
γ_{1,1}       0.11    0.11    0.11    0.09         0.05
β_{x1,1}      0.50    0.50    0.50    0.08         0.06
β_{1,1}       0.11    0.11    0.11    0.03         0.05

The results in table 1 indicate that the command and the method perform well.

There is no evidence of bias and the rejection rates are all close to the expected .05.

6 Extensions

The implemented RA estimator can be extended to a ﬁnite integer set of k > 2 treat-

ments by estimating diﬀerent parameters for each treatment level and using each set of

estimated parameters to estimate the marginal quantiles. I will perform this extension

in future research.

The idea underlying the implemented RA estimator could also be applied to estimate

marginal hazard functions, at a point, for each potential outcome. Diﬀerences and ra-

tios of these estimated marginal hazard functions deﬁne hazard treatment eﬀects. The

point-wise asymptotics follow immediately from the point-wise results in Newey and

McFadden (1994). Results over intervals of points and tests for, say, stochastic domi-

nance would require another asymptotic framework such as empirical process theory. I

will also investigate these extensions in future research.

Another set of extensions replaces the parametric gamma distribution with a flexible distribution built from terms of a basis for the space of possible distributions, such as orthogonal polynomials; see Chen (2007) and Gallant and Nychka (1987). I will also investigate these extensions in future research.

7 Appendix: Syntax

Here is the syntax of the mqgamma command.

mqgamma depvar indepvars [if] [in], treat(varname) [quantile(numlist) lns(varlist) fail(varname) aequations from(matrix) display_options]

Note that the indepvars are used to model the natural log of the conditional mean

of the depvar. See section 9 for details.

Options

treat(varname) is a required option, and it specifies the binary treatment variable. The treatment variable must be coded 0 for control cases and 1 for treated cases.

quantile(numlist) specifies the marginal quantiles to be estimated. Each specified quantile must be in (0,1).

lns(varlist) specifies the variables used to model the natural log of the scale. See section 9 for details.

fail(varname) specifies the binary failure indicator, which must be coded 1 for an observed value and 0 for a censored observation.

aequations specifies that the auxiliary-equation parameters be displayed.

from(matrix) specifies a row vector of initial values for the optimization routine. Each element in the specified matrix specifies the initial value for the corresponding parameter.

display_options are the standard display options; see [R] estimation options.

7.1 Post estimation syntax

After mqgamma, predict has the following syntax.

predict [type] newvarname [if] [in], equation(eqno)

Options

equation(eqno) specifies the equation for which predict calculates the xb term. eqno specifies the equation number or name. For example, equation(#1) specifies the first equation, and equation(z_0) specifies the z_0 equation. See [R] predict for more details about the equation() option.

After mqgamma, predict computes linear predictions from the fitted model, known as xb terms in Stata parlance.

8 Appendix: Saved results

mqgamma saves the following results in e():

Scalars
    e(N)            number of observations
    e(converged)    1 if the estimator converged
    e(k_eq)         number of equations to display
    e(k_quant)      number of estimated quantiles

Macros
    e(cmd)          mqgamma
    e(cmdline)      command line input
    e(title)        title for estimation header
    e(vce)          robust
    e(vcetype)      Robust
    e(quantile)     quantiles estimated
    e(predict)      mqgamma_p

Matrices
    e(b)            coefficient vector
    e(V)            variance-covariance matrix of the estimators

Functions
    e(sample)       marks estimation sample

9 Appendix: Methods and Formulas

The probability density function (PDF) of the two-parameter gamma distribution is frequently written as

$$
f(y \mid \alpha, \beta) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, y^{\alpha-1} e^{-y/\beta}
$$

To model the conditional-on-covariates PDF, I let

$$
\begin{aligned}
\alpha &= 1/s_i^2 = \exp[-2\ln(s_i)] = \exp(-2\mathbf{x}_i\boldsymbol{\beta}') \\
\beta &= e^{z_i}s_i^2 = \exp(\mathbf{w}_i\boldsymbol{\gamma}')\exp(2\mathbf{x}_i\boldsymbol{\beta}')
\end{aligned}
$$

The conditional mean is $E[y \mid z_i, s_i] = e^{z_i}$, and the conditional variance is $\mathrm{Var}[y \mid z_i, s_i] = e^{2z_i}s_i^2$. Thus, $z_i$ parameterizes the natural log of the conditional mean and $s_i$ parameterizes the scale. The variables specified as indepvars are the $\mathbf{w}_i$ used to model the natural log of the conditional mean; $z_i = \mathbf{w}_i\boldsymbol{\gamma}'$. The variables specified in option lns() are the $\mathbf{x}_i$ used to model the natural log of $s_i$; $\ln(s_i) = \mathbf{x}_i\boldsymbol{\beta}'$.

This parameterization of $\alpha$ and $\beta$ yields a PDF of

$$
f(y_i \mid z_i, s_i) = \frac{1}{\Gamma(s_i^{-2})\,(e^{z_i}s_i^2)^{s_i^{-2}}}\; y_i^{s_i^{-2}-1}\, e^{-y_i/(e^{z_i}s_i^2)}
$$

and CDF of

$$
F(y_i \mid z_i, s_i) = G\left(s_i^{-2},\ y_i e^{-z_i - 2\ln(s_i)}\right)
$$

where $G(a, x)$ is the CDF of a one-parameter gamma distribution evaluated at $x$, which is implemented in Stata as gammap(a, x).

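Letting $c_i = 1$ if observation $i$ is censored and $c_i = 0$ when the observation is not censored yields the following log-likelihood function for observation $i$

$$
\begin{aligned}
L_i ={}& (1 - c_i)\left\{ -\ln\Gamma(s_i^{-2}) - s_i^{-2}\left[z_i + 2\ln(s_i) + y_i e^{-z_i}\right] + (s_i^{-2} - 1)\ln(y_i) \right\} \\
&+ c_i\left\{ \ln\left[1 - G\left(s_i^{-2},\ y_i e^{-z_i - 2\ln(s_i)}\right)\right] \right\}
\end{aligned}
$$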

This log-likelihood is implemented separately for the treated observations and for the control observations. For each treatment level, the score equations for the model parameters are estimating equations. In addition, for each treatment level $d \in \{0,1\}$, each quantile $q_d$ solves a sample estimating equation

$$
\frac{1}{N}\sum_{i=1}^{N} G\left(\exp(-2\mathbf{x}_i\hat{\boldsymbol{\beta}}_d'),\ \frac{q_d}{\exp(\mathbf{w}_i\hat{\boldsymbol{\gamma}}_d')\exp(2\mathbf{x}_i\hat{\boldsymbol{\beta}}_d')}\right) - \tau = 0
$$

where $\tau \in (0,1)$ and $G(a, w)$ is the CDF of a one-parameter gamma distribution evaluated at $w$, which is implemented in Stata as gammap(a, w).

The estimating equations are solved jointly using gmm, whose robust standard errors

are consistent.

The implemented RA estimator is a simple application of stacking the moment conditions from a parametric estimator and smooth functions of the parameters. Newey (1984) discussed stacking the moment conditions to produce a one-step consistent and asymptotically normal estimator. Alternatively, I could prove the consistency and asymptotic normality of the implemented RA estimator using the results in Newey and McFadden (1994).

10 Appendix: Data Generation

The following command was used to generate each draw from the DGP.

program define mkxdata
    drop _all
    set obs 2000

    // generate covariates
    gen double x1 = rchi2(3)/10
    gen double x2 = rchi2(4)/7

    // uniform draw shared by both potential outcomes
    gen double uy = runiform()

    // generate y0
    gen double ln_sy0  = .12 + .2*x1
    gen double sy0     = exp(ln_sy0)
    gen double zy0     = .12 + .3*x2
    gen double exp_zy0 = exp(zy0)
    gen double alphay0 = 1/(sy0^2)
    gen double betay0  = (sy0^2)*exp_zy0
    gen double y0      = invgammap(alphay0, uy)*(betay0)

    // generate y1
    gen double ln_sy1  = .11 + .5*x1
    gen double sy1     = exp(ln_sy1)
    gen double zy1     = .11 + 1.0*x2
    gen double exp_zy1 = exp(zy1)
    gen double alphay1 = 1/(sy1^2)
    gen double betay1  = (sy1^2)*exp_zy1
    gen double y1      = invgammap(alphay1, uy)*(betay1)

    // generate censor time
    gen double uc     = runiform()
    gen double ln_sc  = .7 + .7*x1
    gen double sc     = exp(ln_sc)
    gen double zc     = 3.3 + 3.2*x2
    gen double exp_zc = exp(zc)
    gen double alphac = 1/(sc^2)
    gen double betac  = (sc^2)*exp_zc
    gen double c      = invgammap(alphac, uc)*(betac)

    // treatment assignment depends on both covariates
    gen treat = (-.6 + .5*x1 + .75*x2 + rnormal()) > 0

    // observed outcome: minimum of the realized potential outcome and the censor time
    gen double w = treat*min(y1,c) + (1-treat)*min(y0,c)

    // failure indicators: 1 if not censored, 0 if censored
    gen f0 = y0<=c
    gen f1 = y1<=c
    gen double f = treat*f1 + (1-treat)*f0

    gen double cons = 1
end
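The program draws each potential outcome by the inverse-CDF method: a uniform draw is passed through invgammap() and multiplied by the scale. Because the same uniform draw uy generates both y0 and y1, each simulated individual has the same conditional rank under both treatment levels. The following is a minimal, self-contained sketch of the inverse-CDF step for the control outcome, using the parameter values from mkxdata. The variable names u, ucheck, and diff, the seed, and the sample size are hypothetical.

```stata
clear
set seed 2014
set obs 1000

* Covariates and a uniform draw, as in mkxdata
generate double x1 = rchi2(3)/10
generate double x2 = rchi2(4)/7
generate double u  = runiform()

* Shape and scale of the control potential outcome, as in mkxdata
generate double alpha0 = exp(-2*(.12 + .2*x1))
generate double beta0  = exp(.12 + .3*x2)*exp(2*(.12 + .2*x1))

* Inverse-CDF draw from the two-parameter gamma distribution
generate double y0 = invgammap(alpha0, u)*beta0

* Applying the conditional CDF to y0 should recover the uniform draw
generate double ucheck = gammap(alpha0, y0/beta0)
generate double diff   = abs(u - ucheck)
summarize u ucheck diff
```

The maximum of diff should be zero up to the numerical precision of invgammap() and gammap().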

11 References

Cattaneo, M. 2010. Efficient semiparametric estimation of multi-valued treatment effects under ignorability. Journal of Econometrics 155(2): 138–154.

Cattaneo, M. D., D. M. Drukker, and A. D. Holland. 2013. Estimation of multivalued treatment effects under conditional independence. Stata Journal 13(3): ??

Chen, X. 2007. Large sample sieve estimation of semi-nonparametric models. In Handbook of Econometrics, vol. 6, 5549–5632. Amsterdam: Elsevier.

Firpo, S. 2007. Efficient semiparametric estimation of quantile treatment effects. Econometrica 75(1): 259–276.

Frölich, M., and B. Melly. 2010. Estimation of quantile treatment effects with Stata. Stata Journal 10(3): 423–457. http://www.stata-journal.com/article.html?article=st0203.

Gallant, A. R., and D. W. Nychka. 1987. Semi-nonparametric maximum likelihood estimation. Econometrica: Journal of the Econometric Society 363–390.

Holland, P. W. 1986. Statistics and causal inference. Journal of the American Statistical Association 945–960.

Imbens, G. W., and J. M. Wooldridge. 2009. Recent developments in the econometrics of program evaluation. Journal of Economic Literature 47: 5–86.

Newey, W. K. 1984. A method of moments interpretation of sequential estimators. Economics Letters 14(2): 201–206.

Newey, W. K., and D. McFadden. 1994. Large sample estimation and hypothesis testing. In Handbook of Econometrics, vol. 4, 2111–2245. Amsterdam: Elsevier.

Wooldridge, J. M. 2002. Inverse probability weighted M-estimators for sample selection, attrition, and stratification. Portuguese Economic Journal 1: 117–139.

Wooldridge, J. M. 2007. Inverse probability weighted estimation for general missing data problems. Journal of Econometrics 141(2): 1281–1301.

Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge, Massachusetts: MIT Press.

About the authors

David M. Drukker is the Director of Econometrics at Stata.