Biostatistics (2010), 11, 1, pp. 34–47
Advance Access publication on October 8, 2009
Semiparametric estimation of the average causal effect of
treatment on an outcome measured after a
postrandomization event, with missing outcome data
PETER B. GILBERT∗
Department of Biostatistics, University of Washington, Seattle, WA 98105, USA and Fred Hutchinson
Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
Department of Biostatistics, University of Washington, Seattle, WA 98105, USA
In the past decade, several principal stratification–based statistical methods have been developed for test-
ing and estimation of a treatment effect on an outcome measured after a postrandomization event. Two
examples are the evaluation of the effect of a cancer treatment on quality of life in subjects who remain
alive and the evaluation of the effect of an HIV vaccine on viral load in subjects who acquire HIV infec-
tion. However, in general the developed methods have not addressed the issue of missing outcome data,
and hence their validity relies on a missing completely at random (MCAR) assumption. Because in many
applications the MCAR assumption is untenable, while a missing at random (MAR) assumption is defen-
sible, we extend the semiparametric likelihood sensitivity analysis approach of Gilbert and others (2003)
and Jemiai and Rotnitzky (2005) to allow the outcome to be MAR. We combine these methods with the
robust likelihood–based method of Little and An (2004) for handling MAR data to provide semiparamet-
ric estimation of the average causal effect of treatment on the outcome. The new method, which does not
require a monotonicity assumption, is evaluated in a simulation study and is applied to data from the first
HIV vaccine efficacy trial.
Keywords: Causal inference; HIV vaccine trial; Missing at random; Posttreatment selection bias; Principal stratifica-
tion; Sensitivity analysis.
1. INTRODUCTION
For randomized placebo-controlled efficacy trials of an HIV vaccine evaluated in HIV-uninfected vol-
unteers, a primary objective is to evaluate the effect of vaccination on the incidence of HIV infection.
Another objective, which was secondary in 2 trials of antibody-based vaccines (Flynn and others, 2005;
Pitisittithum and others, 2006) and co-primary in 2 trials of T cell–based vaccines (Buchbinder and others,
∗To whom correspondence should be addressed.
© The Author 2009. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: email@example.com.
2008), is to evaluate the effect of vaccination on HIV viral load measured after HIV infection. Two causal
approaches of interest for assessing the latter objective are intent-to-treat (ITT) methods that assess the
burden of illness (a composite end point that combines the infection end point and the postinfection end
point) (Chang and others, 1994) and conditional methods that assess the postinfection end point in the
principal stratum of subjects who would be HIV-infected under either treatment assignment (the so-called
“always infected” stratum). As discussed by Gilbert and others (2003) [henceforth GBH] and others, the
2 approaches address different substantive questions. For example, in registrational/licensure trials, the
ITT approach may be most appropriate for the primary analysis and the conditional approach would
be used in secondary analyses of the mechanistic vaccine effect on the postinfection outcome, whereas
in preregistrational test-of-concept trials, the conditional approach might be used for the primary analysis
(Mehrotra and others, 2006).
Here, we develop methodology for the conditional approach to evaluate the causal vaccine effect on
the postinfection end point. For concreteness, we refer to the first postrandomization event as infection
and the outcome of interest measured after this event as viral load. However, the method has general
application to evaluating causal treatment effects on outcomes measured after a postrandomization event,
including studies of quality of life (Rubin, 2000), prostate cancer severity (Shepherd and others, 2008),
kidney disease, and cancer screening (Joffe and others, 2007).
Because the set of trial participants who are in the always infected stratum is unknown, causal treat-
ment effects for this group are not identified from randomized trial data. Two approaches for addressing
the nonidentifiability have been to derive sharp bounds for causal effects (Jemiai and Rotnitzky, 2003;
Hudgens and others, 2003; Zhang and Rubin, 2003) and to estimate causal effects under an additional set
of identifiability assumptions that include models describing the nature and degree of possible selection
bias, with a sensitivity analysis to explore how the inferences vary over a range of the selection models
(GBH; Hayden and others, 2005; Hudgens and Halloran, 2006; Jemiai and Rotnitzky, 2005 (henceforth
JR); Jemiai and others, 2007; Shepherd, Gilbert, Jemiai and others, 2006; Shepherd and others, 2007,
2008). All these methods assume that viral load is observed for all infected subjects. Therefore, the
validity of their inferences depends on a missing completely at random (MCAR) assumption. While the
MCAR assumption is often untenable, trials may collect sufficient data on participant characteristics to
make a missing at random (MAR) assumption plausible. This article extends the approach of GBH and
JR to accommodate MAR missingness of the viral load end point.
To illustrate the new method, we focus on the first efficacy trial (Flynn and others, 2005) and define
the end point Yp to be the pre-antiretroviral therapy (pre-ART) log10 viral load at the month 12 visit post
HIV infection diagnosis. Only subjects who have not started ART by the month 12 visit contribute a value
Yp. We exclude viral load values measured after ART initiation because ART strongly affects viral load
levels (Gilbert and others, 2003). Inferences about Yp apply to a population where ART is not prescribed
during the first 12 months after infection diagnosis.
Of the 368 subjects who became HIV-infected during the trial, 121 had Yp observed, 138 had missing
data because they initiated ART prior to the month 12 study visit, and 109 had missing data because they
dropped out prior to initiating ART and prior to the visit. Figure 1 shows the VaxGen data on Yp = Y3 in
relationship to 2 key covariates: Y1 is early pre-ART square root CD4 cell count, and Y2 is early pre-ART
log10 viral load; these variables average all pre-ART values measured within 2 months of infection
diagnosis. Viral loads early and at 12 months were positively correlated (Figures 1(e) and (f)). Early CD4
cell count and viral load at 12 months (Figures 1(c) and (d)) were negatively correlated for the vaccine
group (Spearman correlation = −0.33) but not for the placebo group (Spearman correlation = 0.16).
In addition, lower levels of Y1 and higher levels of Y2 were highly predictive of ART initiation
(P values < 0.001 in a multivariate Cox model; Gilbert and others, 2005), showing that MCAR was badly
violated. MAR may be plausible, however, because physicians base decisions to prescribe ART on the
36 P. B. GILBERT AND Y. JIN
Fig. 1. VaxGen trial data: jittered pre-antiretroviral (pre-ART) viral loads at the month 12 postinfection diagnosis
visit (Y3) ((a) and (b)); pre-ART viral loads at the month 12 postinfection diagnosis visit versus square root CD4 cell
counts early after infection diagnosis (Y3 versus Y1; (c) and (d)); pre-ART viral loads at the month 12 postinfection
diagnosis visit versus early after infection diagnosis (Y3 versus Y2; (e) and (f)).
monitoring of CD4 cell counts, viral loads, HIV-related clinical events, and comorbidities (Hammer and
others, 2008). In fact, MAR missingness due to ART initiation may approximately hold “systematically”
in settings where ART is offered to all infected participants when their biomarkers are observed to cross
prespecified thresholds or they present with prespecified symptomatic illnesses.
Two popular approaches to making valid inferences under MAR are inverse probability weighted
(IPW)–based methods (e.g. Robins and others, 1995) and likelihood-based methods (e.g. Little and
possibly unstable because the estimated weights for some subjects are expected to be near zero.
Specifically, for a subject with Yp observed, the weight in the denominator of the estimating equation equals
the estimated probability that the subject did not drop out by the month 12 visit multiplied by the
estimated probability that the subject did not start ART by the visit conditional on not dropping out.
These estimates can be computed as fitted values in regression models based on the subject’s covariates.
The latter conditional probability may be near zero for subjects whose CD4 cell counts drop below the
prespecified threshold at which ART initiation is recommended (currently the recommended threshold
is between 200 and 500 cells/mm³ depending on the country). In fact, for ethical reasons any efficacy
trial will offer ART to all infected participants who meet treatment criteria, such that the more suc-
cessful the treatment coverage the closer some estimated weights in the denominator may be to zero.
Therefore, for some HIV vaccine trials, IPW methods are expected to provide relatively imprecise
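To make the instability concrete, the following minimal sketch computes the inverse probability weight for a subject with Yp observed, as the reciprocal of the product of the two estimated probabilities described above. The function name and the numeric probabilities are hypothetical; in practice the probabilities would be fitted values from regression models.

```python
def ipw_weight(p_no_dropout: float, p_no_art_given_no_dropout: float) -> float:
    """Inverse probability weight for a subject with Yp observed.

    Both arguments are hypothetical fitted probabilities. Their product is
    the estimated probability that Yp is observed (the denominator).
    """
    p_observed = p_no_dropout * p_no_art_given_no_dropout
    if not 0.0 < p_observed <= 1.0:
        raise ValueError("estimated probability of observing Yp must be in (0, 1]")
    return 1.0 / p_observed


# Made-up fitted probabilities for two subjects with the same dropout risk.
w_typical = ipw_weight(0.90, 0.60)  # ART initiation fairly unlikely
w_extreme = ipw_weight(0.90, 0.02)  # CD4 below threshold: ART very likely

print(f"typical weight: {w_typical:.2f}, extreme weight: {w_extreme:.2f}")
```

A single subject with a very small denominator receives a weight many times larger than a typical subject, which is the source of the imprecision discussed above.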
While likelihood methods are less subject to the instability problem, they are susceptible to mis-
specification of the model relating Yp to covariates. To partially ameliorate this problem, we use the
robust likelihood–based method of Little and An (2004) (henceforth LA) that is based on penalized
splines of the propensity score. Our approach only handles monotone missing data (i.e. dropout); an
alternative approach would handle nonmonotone missingness using parametric multiple imputation and
Monte Carlo Markov chain, for example, by extending the method of Mogg and Mehrotra (2007) for
analyzing viral load data to address postrandomization selection bias. The article is organized as fol-
lows. Section 2 describes the causal estimand of interest and identifiability assumptions. Section 3 shows
how to combine the methodologies of GBH/JR and LA into a procedure for consistently estimating the
causal estimand under an MAR assumption. Section 4 evaluates the new method in a simulation study.
Section 5 applies the method to the first HIV vaccine efficacy trial, and Section 6 offers concluding
2. AVERAGE CAUSAL EFFECT ESTIMAND AND IDENTIFIABILITY ASSUMPTIONS
Notation and estimand
Let $Z$ be treatment assignment, and let $X$ be a $q$-vector of baseline covariates fully observed for everyone.
Let $S$ be the indicator of the postrandomization event. Subjects experiencing $S = 1$ are subsequently
evaluated at $V$ visits, where variables $Y_1,\ldots,Y_{n_1}$ are collected at visit 1, variables
$Y_{n_1+1},\ldots,Y_{n_1+n_2}$ are collected at visit 2, and so on, with variables
$Y_{\sum_{i=1}^{V-1} n_i+1},\ldots,Y_p$ collected at visit $V$, where $p = \sum_{i=1}^{V} n_i$.
The entire collection of $p$ variables measured after $S = 1$ is $Y \equiv (Y_1,\ldots,Y_p)'$, where $Y_p$ is the
outcome variable of interest. For $j = 1,\ldots,p$, let $M_j$ be the indicator of whether $Y_j$ is missing and set
$M = (M_1,\ldots,M_p)'$. The variables $Y$ are only meaningful if $S = 1$; thus, $Y$ and $M$ are undefined if
$S = 0$ and we denote this by $Y = M = *$. For HIV vaccine trials, $Z$ is vaccination assignment ($Z = 1$, vaccine;
$Z = 0$, placebo) and $S$ is HIV infection diagnosis during the trial.
Each participant has potential infection outcome $S(1)$ if assigned vaccine and $S(0)$ if assigned placebo.
For $Z = 0,1$, the potential outcomes $Y(Z) \equiv (Y_1(Z),Y_2(Z),\ldots,Y_p(Z))'$ and
$M(Z) \equiv (M_1(Z),M_2(Z),\ldots,M_p(Z))'$ are defined if $S(Z) = 1$; otherwise $Y(Z) \equiv *$ and
$M(Z) \equiv *$. With $\mu_z \equiv E(Y_p(z) \mid S(0) = S(1) = 1)$ for $z = 0,1$, the “average causal effect
(ACE)” estimand of interest is $\mathrm{ACE} \equiv \mu_1 - \mu_0$. Our goal is to estimate the ACE based on
assumptions and the observed i.i.d. data $(Z_i, X_i, S_i, M_i, Y_i)$, $i = 1,\ldots,N$.
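The difference between the ACE and a direct comparison of infected vaccine and placebo recipients can be illustrated with simulated potential outcomes. The sketch below is not part of the method: the data-generating model (a latent frailty `u` that raises both infection risk and viral load, and a vaccine that never causes infection and lowers every viral load by 0.3 log10) is an assumption chosen only for illustration, and missing outcome data are ignored.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


# Hypothetical latent frailty that raises both infection risk and viral load.
u = rng.normal(size=n)

# Potential infection indicators S(0) and S(1). Here the vaccine never causes
# infection, and it protects low-frailty subjects more often.
s0 = rng.random(n) < expit(-1.0 + u)
s1 = s0 & (rng.random(n) < expit(u))

# Potential viral loads (only meaningful when the matching S(z) equals 1).
# The vaccine lowers every subject's viral load by exactly 0.3 log10.
y0 = 4.0 + 0.5 * u + rng.normal(0.0, 0.5, size=n)
y1 = y0 - 0.3

# ACE: contrast within the always infected stratum S(0) = S(1) = 1.
always = s0 & s1
ace = (y1[always] - y0[always]).mean()

# Randomize, then compare infected vaccine with infected placebo recipients.
z = rng.random(n) < 0.5
s = np.where(z, s1, s0)
y = np.where(z, y1, y0)
naive = y[z & s].mean() - y[~z & s].mean()

print(f"ACE = {ace:.3f}, naive infected-only contrast = {naive:.3f}")
```

In this setup the infected vaccine recipients are a higher-frailty subset of those who would be infected under placebo, so the infected-only contrast understates the benefit even though treatment was randomized.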
For subjects with S = 1, let Yobs denote the components of Y that are observed and Ymis denote the
components of Y that are missing. Let f be the conditional cumulative distribution function (cdf) of
M given Y and S = 1, f (M|Y, S = 1,ν), where ν denotes unknown parameters. MAR states that
missingness depends only on the observed values Yobs, that is,
$$f(M \mid Y, S = 1, \nu) = f(M \mid Y_{\mathrm{obs}}, S = 1, \nu) \quad \text{for all } Y_{\mathrm{mis}}, \nu.$$
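As an illustration of what MAR permits, the sketch below simulates infected subjects whose final viral load is missing with a probability that depends only on an always-observed early viral load (`y_early`, a hypothetical stand-in for a fully observed component of Y). The complete-case mean is biased, whereas a regression fitted to the complete cases and averaged over all infected subjects recovers the mean. The linear outcome model and all numeric values are assumptions for illustration; this is not the penalized spline approach of Little and An (2004).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Always-observed early log10 viral load and the outcome of interest.
y_early = rng.normal(4.5, 0.8, size=n)
y_p = 1.0 + 0.8 * y_early + rng.normal(0.0, 0.5, size=n)

# MAR: the missingness probability depends on y_early only
# (higher early viral load makes the outcome more likely to be missing).
p_missing = 1.0 / (1.0 + np.exp(-2.0 * (y_early - 4.5)))
observed = rng.random(n) >= p_missing

true_mean = y_p.mean()
complete_case_mean = y_p[observed].mean()

# Fit a linear regression on complete cases, then predict for everyone.
slope, intercept = np.polyfit(y_early[observed], y_p[observed], deg=1)
regression_mean = (intercept + slope * y_early).mean()

print(f"true {true_mean:.3f}  complete-case {complete_case_mean:.3f}  "
      f"regression {regression_mean:.3f}")
```

The regression estimate works here only because the outcome model is correctly specified, which is the susceptibility to misspecification that motivates a more robust likelihood-based method.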
For simplicity, we develop the methods for a setting where Y1,...,Yp−1 are fully observed in infected
subjects (with S = 1) and only Yp has missing values. In Section 6, we describe how to extend the method
to the case of monotone missing data.
Throughout, we make the following 3 assumptions.
A1: Stable unit treatment values assumption (Rubin, 1978).
A2: The treatment assignment Z is independent of (X, S(0), S(1), M(0), M(1),Y(0),Y(1)).
A3: For infected subjects (with S = 1), the missing data mechanism for Y is MAR.
All the papers for studying the causal vaccine effect in the always infected stratum cited in Section 1
assume A1 and A2; we refer the reader to these articles for discussion about their justification. As dis-
cussed in Section 1, the MAR assumption A3 may be quite plausible when missingness is mainly due to
ART initiation and ART guidelines are used. If the missing data mechanism is believed to be non-MAR,
then the methodology here may mislead and should be used with caution. An advantageous feature of the
MAR assumption is that investigators can design clinical trials to collect the covariate data that make it
We next postulate additional assumptions that identify the ACE and are indexed by fixed sensitivity
parameters. Following JR, we suppose 3 models, which we refer to collectively as A4:
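$$g_0\{\Pr(S(1) = 1 \mid S(0) = 1, Y_p(0) = y)\} = \alpha_0 + \beta_0 y, \tag{2.1}$$
$$g_1\{\Pr(S(0) = 1 \mid S(1) = 1, Y_p(1) = y)\} = \alpha_1 + \beta_1 y, \tag{2.2}$$
$$\Pr(S(0) = 1 \mid S(1) = 1) = \phi, \tag{2.3}$$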
where g0 and g1 are known invertible link functions whose inverses are continuous in α0 and α1, α0 and
α1 are unknown parameters to be estimated, and β0, β1, and φ are known sensitivity parameters that are
varied over plausible ranges, where subject matter experts can help define the plausible ranges. With logit
links g0 and g1, β0 is interpreted as the difference in the log odds of infection in the vaccine group given
infection in the placebo group with y versus y − 1 viral load, and β1 is interpreted similarly reversing
the role of vaccine and placebo. The parameter φ is interpreted as the probability that a subject infected
in the vaccine group would also be infected in the placebo group. Except for the method of JR, all of the
previously developed methods cited above assume φ = 1 (i.e. monotonicity, that the vaccine does not
increase the risk of infection for any subject), in which case the selection model (2.2) is superfluous and
only model (2.1) is used, and the only sensitivity parameter is β0. In this case, the methods of GBH and
JR are equivalent. For greater applicability of the method, here, we allow for nonmonotonic settings by
considering the trio of models (2.1–2.3).
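To see how model (2.1) operates for a fixed sensitivity parameter, consider the monotone special case φ = 1 with a logit link and no missing data. Monotonicity and randomization imply that Pr(S(1) = 1|S(0) = 1) equals the ratio of the vaccine to placebo infection probabilities, so α0 is determined by requiring the average of expit(α0 + β0 y) over infected placebo recipients to equal that ratio, and µ0 is the correspondingly weighted mean of their viral loads. The sketch below covers this special case only; the function name, argument names, and simulated inputs are ours, and it is not the estimator developed for MAR data.

```python
import numpy as np


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def mu0_monotone(y_placebo, infection_ratio, beta0, lo=-50.0, hi=50.0, tol=1e-10):
    """Return (alpha0, mu0) for the special case phi = 1 with a logit link.

    y_placebo: viral loads of infected placebo recipients (no missing data).
    infection_ratio: vaccine-to-placebo ratio of infection probabilities,
        assumed to lie strictly between 0 and 1.
    beta0: fixed sensitivity parameter.
    """
    y = np.asarray(y_placebo, dtype=float)

    def gap(alpha0):
        return expit(alpha0 + beta0 * y).mean() - infection_ratio

    # gap is increasing in alpha0, so bisection locates its root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    alpha0 = 0.5 * (lo + hi)

    # Weight each placebo viral load by its modeled probability of
    # infection under vaccine, then take the weighted mean.
    w = expit(alpha0 + beta0 * y)
    return alpha0, np.sum(w * y) / np.sum(w)


rng = np.random.default_rng(2)
y_placebo = rng.normal(4.5, 0.8, size=500)  # simulated log10 viral loads

for beta0 in (-1.0, 0.0, 1.0):
    alpha0, mu0 = mu0_monotone(y_placebo, infection_ratio=0.7, beta0=beta0)
    print(f"beta0={beta0:+.1f}  alpha0={alpha0:.3f}  mu0={mu0:.3f}")
```

With β0 = 0 the weights are constant and µ0 reduces to the plain mean of the infected placebo viral loads (no selection bias); positive β0 up-weights higher viral loads and negative β0 down-weights them, which is how the sensitivity analysis traces out a range of estimates.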