
Biostatistics (2010), 11, 1, pp. 34–47

doi:10.1093/biostatistics/kxp034

Advance Access publication on October 8, 2009

Semiparametric estimation of the average causal effect of treatment on an outcome
measured after a postrandomization event, with missing outcome data

PETER B. GILBERT∗

Department of Biostatistics, University of Washington, Seattle, WA 98105, USA and Fred Hutchinson

Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA

pgilbert@scharp.org

YUYING JIN

Department of Biostatistics, University of Washington, Seattle, WA 98105, USA

SUMMARY

In the past decade, several principal stratification–based statistical methods have been developed for
testing and estimation of a treatment effect on an outcome measured after a postrandomization event. Two
examples are the evaluation of the effect of a cancer treatment on quality of life in subjects who remain
alive and the evaluation of the effect of an HIV vaccine on viral load in subjects who acquire HIV
infection. However, in general the developed methods have not addressed the issue of missing outcome data,
and hence their validity relies on a missing completely at random (MCAR) assumption. Because in many
applications the MCAR assumption is untenable, while a missing at random (MAR) assumption is
defensible, we extend the semiparametric likelihood sensitivity analysis approach of Gilbert and others (2003)
and Jemiai and Rotnitzky (2005) to allow the outcome to be MAR. We combine these methods with the
robust likelihood–based method of Little and An (2004) for handling MAR data to provide
semiparametric estimation of the average causal effect of treatment on the outcome. The new method, which does not
require a monotonicity assumption, is evaluated in a simulation study and is applied to data from the first
HIV vaccine efficacy trial.

Keywords: Causal inference; HIV vaccine trial; Missing at random; Posttreatment selection bias; Principal
stratification; Sensitivity analysis.

1. INTRODUCTION

For randomized placebo-controlled efficacy trials of an HIV vaccine evaluated in HIV-uninfected vol-

unteers, a primary objective is to evaluate the effect of vaccination on the incidence of HIV infection.

Another objective, which was secondary in 2 trials of antibody-based vaccines (Flynn and others, 2005;

Pitisittithum and others, 2006) and co-primary in 2 trials of T cell–based vaccines (Buchbinder and others,

∗To whom correspondence should be addressed.

© The Author 2009. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.


2008), is to evaluate the effect of vaccination on HIV viral load measured after HIV infection. Two causal

approaches of interest for assessing the latter objective are intent-to-treat (ITT) methods that assess the

burden of illness (a composite end point that combines the infection end point and the postinfection end

point) (Chang and others, 1994) and conditional methods that assess the postinfection end point in the

principal stratum of subjects who would be HIV-infected under either treatment assignment (the so-called

“always infected” stratum). As discussed by Gilbert and others (2003) [henceforth GBH] and others, the

2 approaches address different substantive questions. For example, in registrational/licensure trials, the

ITT approach may be most appropriate for the primary analysis and the conditional approach would

be used in secondary analyses of the mechanistic vaccine effect on the postinfection outcome, whereas

in preregistrational test-of-concept trials, the conditional approach might be used for the primary analysis

(Mehrotra and others, 2006).

Here, we develop methodology for the conditional approach to evaluate the causal vaccine effect on

the postinfection end point. For concreteness, we refer to the first postrandomization event as infection

and the outcome of interest measured after this event as viral load. However, the method has general

application to evaluating causal treatment effects on outcomes measured after a postrandomization event,

including studies of quality of life (Rubin, 2000), prostate cancer severity (Shepherd and others, 2008),

kidney disease, and cancer screening (Joffe and others, 2007).

Because the set of trial participants who are in the always infected stratum is unknown, causal treat-

ment effects for this group are not identified from randomized trial data. Two approaches for addressing

the nonidentifiability have been to derive sharp bounds for causal effects (Jemiai and Rotnitzky, 2003;

Hudgens and others, 2003; Zhang and Rubin, 2003) and to estimate causal effects under an additional set

of identifiability assumptions that include models describing the nature and degree of possible selection

bias, with a sensitivity analysis to explore how the inferences vary over a range of the selection models

(GBH; Hayden and others, 2005; Hudgens and Halloran, 2006; Jemiai and Rotnitzky, 2005 (henceforth

JR); Jemiai and others, 2007; Shepherd, Gilbert, Jemiai and others, 2006; Shepherd and others, 2007,

2008). All these methods assume that viral load is observed for all infected subjects. Therefore, the va-

lidity of their inferences depends on a missing completely at random assumption (MCAR). While the

MCAR assumption is often untenable, trials may collect sufficient data on participant characteristics to

make a missing at random (MAR) assumption plausible. This article extends the approach of GBH and

JR to accommodate MAR missingness of the viral load end point.

To illustrate the new method, we focus on the first efficacy trial (Flynn and others, 2005) and define

the end point Yp to be the pre-antiretroviral therapy (pre-ART) log10 viral load at the month 12 visit post

HIV infection diagnosis. Only subjects who have not started ART by the month 12 visit contribute a value

Yp. We exclude viral load values measured after ART initiation because ART strongly affects viral load

levels (Gilbert and others, 2003). Inferences about Yp apply to a population where ART is not prescribed

during the first 12 months after infection diagnosis.

Of the 368 subjects who became HIV-infected during the trial, 121 had Yp observed, 138 had missing

data because they initiated ART prior to the month 12 study visit, and 109 had missing data because they

dropped out prior to initiating ART and prior to the visit. Figure 1 shows the VaxGen data on Yp = Y3 in
relationship to 2 key covariates: Y1 is early pre-ART square root CD4 cell count, and Y2 is early pre-ART
log10 viral load; these variables average all pre-ART values measured within 2 months of infection diagnosis.
Early and month 12 viral loads were positively correlated (Figures 1(e) and (f)), with Spearman rank
correlation 0.68 (0.55) for the vaccine (placebo) group. There is evidence of a negative correlation between
early CD4 cell count and viral load at 12 months (Figures 1(c) and (d)) for the vaccine group (Spearman
correlation = −0.33) but not for the placebo group (Spearman correlation = 0.16).

In addition, lower levels of Y1 and higher levels of Y2 were highly predictive of ART initiation

(P values < 0.001 in a multivariate Cox model; Gilbert and others, 2005), showing that MCAR was badly

violated. MAR may be plausible, however, because physicians base decisions to prescribe ART on the
monitoring of CD4 cell counts, viral loads, HIV-related clinical events, and comorbidities (Hammer and
others, 2008). In fact, MAR missingness due to ART initiation may approximately hold “systematically”
in settings where ART is offered to all infected participants when their biomarkers are observed to cross
prespecified thresholds or they present with prespecified symptomatic illnesses.

Fig. 1. VaxGen trial data: jittered pre-antiretroviral (pre-ART) viral loads at the month 12 postinfection diagnosis
visit (Y3) ((a) and (b)); pre-ART viral loads at the month 12 postinfection diagnosis visit versus square root CD4 cell
counts early after infection diagnosis (Y3 versus Y1; (c) and (d)); pre-ART viral loads at the month 12 postinfection
diagnosis visit versus early after infection diagnosis (Y3 versus Y2; (e) and (f)).


Two popular approaches to making valid inferences under MAR are inverse probability weighted

(IPW)–based methods (e.g. Robins and others, 1995) and likelihood-based methods (e.g. Little and

Rubin, 2002). For some HIV vaccine trials, the former methods are expected to be relatively inefficient and

possibly unstable because the estimated weights for some subjects are expected to be near zero. Specifically,
for a subject with Yp observed, the weight in the denominator of the estimating equation equals

the estimated probability that the subject did not drop out by the month 12 visit multiplied by the

estimated probability that the subject did not start ART by the visit conditional on not dropping out.

These estimates can be computed as fitted values in regression models based on the subject’s covariates.

The latter conditional probability may be near zero for subjects whose CD4 cell counts drop below the

prespecified threshold at which ART initiation is recommended (currently the recommended threshold

is between 200 and 500 cells/mm³ depending on the country). In fact, for ethical reasons any efficacy

trial will offer ART to all infected participants who meet treatment criteria, such that the more suc-

cessful the treatment coverage the closer some estimated weights in the denominator may be to zero.

Therefore, for some HIV vaccine trials, IPW methods are expected to provide relatively imprecise

inferences.
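To make the instability concrete, the following minimal sketch computes the weights described above for a handful of subjects with Yp observed. The fitted probabilities are hypothetical values chosen for illustration, not estimates from the VaxGen trial, and the function name ipw_weights is illustrative. The effective sample size shown is the standard Kish approximation, which is not part of the analysis in this article.

```python
import numpy as np

def ipw_weights(p_no_dropout, p_no_art_given_no_dropout):
    """Inverse probability weights for subjects with Yp observed.

    The denominator is Pr(no dropout by month 12) multiplied by
    Pr(no ART by month 12 | no dropout), as described in the text.
    """
    p_observed = np.asarray(p_no_dropout) * np.asarray(p_no_art_given_no_dropout)
    return 1.0 / p_observed

# Hypothetical fitted probabilities for five infected subjects (assumed values).
p_stay = np.array([0.90, 0.85, 0.90, 0.80, 0.90])
# The last subject has a CD4 count below the treatment threshold, so the
# estimated probability of remaining ART-free is close to zero.
p_no_art = np.array([0.80, 0.70, 0.60, 0.50, 0.02])

w = ipw_weights(p_stay, p_no_art)
share = w / w.sum()                      # each subject's share of the total weight
ess = w.sum() ** 2 / (w ** 2).sum()      # Kish effective sample size

print("weights:", np.round(w, 2))
print("weight shares:", np.round(share, 3))
print("effective sample size:", round(ess, 2))
```

In this toy example a single subject carries most of the total weight, so the weighted estimate behaves as if it were based on little more than one observation.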

While likelihood methods are less subject to the instability problem, they are susceptible to mis-

specification of the model relating Yp to covariates. To partially ameliorate this problem, we use the

robust likelihood–based method of Little and An (2004) (henceforth LA) that is based on penalized

splines of the propensity score. Our approach only handles monotone missing data (i.e. dropout); an

alternative approach would handle nonmonotone missingness using parametric multiple imputation and

Markov chain Monte Carlo, for example, by extending the method of Mogg and Mehrotra (2007) for

analyzing viral load data to address postrandomization selection bias. The article is organized as fol-

lows. Section 2 describes the causal estimand of interest and identifiability assumptions. Section 3 shows

how to combine the methodologies of GBH/JR and LA into a procedure for consistently estimating the

causal estimand under an MAR assumption. Section 4 evaluates the new method in a simulation study.

Section 5 applies the method to the first HIV vaccine efficacy trial, and Section 6 offers concluding

remarks.

2. AVERAGE CAUSAL EFFECT ESTIMAND AND IDENTIFIABILITY ASSUMPTIONS

2.1 Notation and estimand

Let $Z$ be treatment assignment, and let $X$ be a $q$-vector of baseline covariates fully observed for everyone.
Let $S$ be the indicator of the postrandomization event. Subjects experiencing $S = 1$ are subsequently
evaluated at $V$ visits, where variables $Y_1, \ldots, Y_{n_1}$ are collected at visit 1, variables
$Y_{n_1+1}, \ldots, Y_{n_1+n_2}$ are collected at visit 2, and so on, with variables
$Y_{\sum_{i=1}^{V-1} n_i + 1}, \ldots, Y_p$ collected at visit $V$, where $p = \sum_{i=1}^{V} n_i$.
The entire collection of $p$ variables measured after $S = 1$ is $Y \equiv (Y_1, \ldots, Y_p)'$, where $Y_p$ is the outcome
variable of interest. For $j = 1, \ldots, p$, let $M_j$ be the indicator of whether $Y_j$ is missing and set
$M = (M_1, \ldots, M_p)'$. The variables $Y$ are only meaningful if $S = 1$; thus, $Y$ and $M$ are undefined if $S = 0$ and
we denote this by $Y = M = *$. For HIV vaccine trials, $Z$ is vaccination assignment ($Z = 1$, vaccine;
$Z = 0$, placebo) and $S$ is HIV infection diagnosis during the trial.

Each participant has potential infection outcome $S(1)$ if assigned vaccine and $S(0)$ if assigned placebo.
For $Z = 0, 1$, the potential outcomes $Y(Z) \equiv (Y_1(Z), Y_2(Z), \ldots, Y_p(Z))'$ and
$M(Z) \equiv (M_1(Z), M_2(Z), \ldots, M_p(Z))'$ are defined if $S(Z) = 1$; otherwise $Y(Z) \equiv *$ and $M(Z) \equiv *$.
With $\mu_z \equiv E(Y_p(z) \mid S(0) = S(1) = 1)$ for $z = 0, 1$, the “average causal effect (ACE)” estimand of interest is
$\mathrm{ACE} \equiv \mu_1 - \mu_0$. Our goal is to estimate the ACE based on assumptions and the observed i.i.d. data
$(Z_i, X_i, S_i, M_i, Y_i)$, $i = 1, \ldots, N$.
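As an illustration of the estimand, the following sketch simulates both potential outcomes for every subject, which is possible only in a simulation, and compares the ACE in the always infected stratum with the naive contrast among observed infected subjects. The data-generating model (a latent frailty variable, the logistic infection risks, and a constant vaccine effect of −0.3 log10 on viral load) is entirely hypothetical and is chosen only to show how conditioning on infection can induce selection bias.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical latent frailty that raises both infection risk and viral load.
u = rng.normal(size=n)

# Potential infection indicators S(0) and S(1); the vaccine lowers infection risk.
s0 = rng.random(n) < expit(-1.0 + u)
s1 = rng.random(n) < expit(-2.5 + u)

# Potential month-12 log10 viral loads Yp(0) and Yp(1). They are meaningful only
# when the matching S(z) = 1, but are generated for everyone for convenience.
y0 = 4.5 + u + rng.normal(scale=0.5, size=n)
y1 = y0 - 0.3  # assumed constant vaccine effect on viral load

# ACE = mu_1 - mu_0 within the always infected stratum {S(0) = S(1) = 1}.
always_infected = s0 & s1
ace = y1[always_infected].mean() - y0[always_infected].mean()

# Observed data under randomization (A2): only S(Z) and Yp(Z) are seen.
z = rng.random(n) < 0.5
s_obs = np.where(z, s1, s0)
y_obs = np.where(z, y1, y0)
naive = y_obs[z & s_obs].mean() - y_obs[~z & s_obs].mean()

print(f"ACE in always infected stratum: {ace:.3f}")
print(f"Naive contrast among infected:  {naive:.3f}")
```

Because infected vaccine recipients in this toy model tend to have higher frailty than infected placebo recipients, the naive contrast understates the benefit on viral load even though treatment is randomized.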


2.2 MAR assumption

For subjects with $S = 1$, let $Y_{\mathrm{obs}}$ denote the components of $Y$ that are observed and $Y_{\mathrm{mis}}$ denote the
components of $Y$ that are missing. Let $f$ be the conditional cumulative distribution function (cdf) of
$M$ given $Y$ and $S = 1$, $f(M \mid Y, S = 1, \nu)$, where $\nu$ denotes unknown parameters. MAR states that
missingness depends only on the observed values $Y_{\mathrm{obs}}$, that is,

$$f(M \mid Y, S = 1, \nu) = f(M \mid Y_{\mathrm{obs}}, S = 1, \nu) \quad \text{for all } Y_{\mathrm{mis}}, \nu.$$

For simplicity, we develop the methods for a setting where $Y_1, \ldots, Y_{p-1}$ are fully observed in infected
subjects (with $S = 1$) and only $Y_p$ has missing values. In Section 6, we describe how to extend the method

to the case of monotone missing data.
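The practical meaning of MAR in this simplified setting can be illustrated with a small simulation, sketched below under assumed values. A single fully observed covariate (playing the role of early viral load) drives both the outcome and its missingness; the complete-case mean is then biased, whereas a regression fitted to the complete cases and averaged over all infected subjects recovers the mean. The linear outcome model and all numerical values are hypothetical, and the simple regression estimator stands in for, and is not, the penalized spline method of LA.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Fully observed early log10 viral load (assumed distribution).
y2 = rng.normal(4.5, 0.8, size=n)
# Month-12 log10 viral load, linearly related to the early value (assumed model).
yp = 1.0 + 0.7 * y2 + rng.normal(0.0, 0.5, size=n)

# MAR: the probability that yp is missing depends only on the observed y2.
# Subjects with high early viral load are more likely to start ART.
p_missing = 1.0 / (1.0 + np.exp(-2.0 * (y2 - 4.5)))
observed = rng.random(n) > p_missing

true_mean = yp.mean()                # uses values that would be missing in practice
cc_mean = yp[observed].mean()        # complete-case mean, valid only under MCAR

# Regression of yp on y2 among complete cases, then average the fitted
# values over all subjects (valid under MAR if the model is correct).
slope, intercept = np.polyfit(y2[observed], yp[observed], 1)
reg_mean = (intercept + slope * y2).mean()

print(f"true mean:            {true_mean:.3f}")
print(f"complete-case mean:   {cc_mean:.3f}")
print(f"regression-based mean:{reg_mean:.3f}")
```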

2.3 Identifiability assumptions

Throughout, we make the following 3 assumptions.

A1: Stable unit treatment values assumption (Rubin, 1978).

A2: The treatment assignment Z is independent of (X, S(0), S(1), M(0), M(1),Y(0),Y(1)).

A3: For infected subjects (with S = 1), the missing data mechanism for Y is MAR.

All the papers cited in Section 1 that study the causal vaccine effect in the always infected stratum
assume A1 and A2; we refer the reader to these articles for discussion of their justification. As discussed
in Section 1, the MAR assumption A3 may be quite plausible when missingness is mainly due to
ART initiation and ART is prescribed according to guidelines. If the missing data mechanism is believed to be non-MAR,

then the methodology here may mislead and should be used with caution. An advantageous feature of the

MAR assumption is that investigators can design clinical trials to collect the covariate data that make it

plausible.

We next postulate additional assumptions that identify the ACE and are indexed by fixed sensitivity

parameters. Following JR, we suppose 3 models, which we refer to collectively as A4:

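\begin{align}
g_0\bigl(\Pr(S(1) = 1 \mid S(0) = 1, Y_p(0) = y)\bigr) &= \alpha_0 + \beta_0 y, \tag{2.1}\\
g_1\bigl(\Pr(S(0) = 1 \mid S(1) = 1, Y_p(1) = y)\bigr) &= \alpha_1 + \beta_1 y, \tag{2.2}\\
\Pr(S(0) = 1 \mid S(1) = 1) &= \phi, \tag{2.3}
\end{align}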

where $g_0$ and $g_1$ are known invertible link functions whose inverses are continuous in $\alpha_0$ and $\alpha_1$,
$\alpha_0$ and $\alpha_1$ are unknown parameters to be estimated, and $\beta_0$, $\beta_1$, and $\phi$ are known sensitivity
parameters that are varied over plausible ranges, which subject matter experts can help define. With logit
links $g_0$ and $g_1$, $\beta_0$ is interpreted as the difference in the log odds of infection in the vaccine group given
infection in the placebo group with viral load $y$ versus $y - 1$, and $\beta_1$ is interpreted similarly with the
roles of vaccine and placebo reversed. The parameter $\phi$ is interpreted as the probability that a subject infected
in the vaccine group would also be infected in the placebo group. Except for the method of JR, all of the
previously developed methods cited above assume $\phi = 1$ (i.e. monotonicity, that the vaccine does not
increase the risk of infection for any subject), in which case the selection model (2.2) is superfluous,
only model (2.1) is used, and the only sensitivity parameter is $\beta_0$. In this case, the methods of GBH and
JR are equivalent. For greater applicability of the method, here we allow for nonmonotonic settings by
considering the trio of models (2.1)–(2.3).
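The role of the sensitivity parameters can be sketched numerically for the placebo arm. The example below assumes logit links and complete viral load data, so it is not the estimator developed in this article, which must also handle missing values of Yp. It relies on one step that is our reading rather than a statement from this section: by (2.3) and Bayes' rule, the average of Pr(S(1) = 1 | S(0) = 1, Yp(0)) over infected placebo recipients equals φ Pr(S(1) = 1)/Pr(S(0) = 1), which pins down α0 for fixed β0 and φ. The function names, infection risks, and simulated viral loads are all hypothetical.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def solve_alpha0(y0, beta0, target, lo=-50.0, hi=50.0, tol=1e-10):
    """Bisection for alpha0 with mean(expit(alpha0 + beta0 * y0)) == target."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expit(mid + beta0 * y0).mean() < target:
            lo = mid          # selection probabilities too small: raise alpha0
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def mu0_always_infected(y0, beta0, phi, p0, p1):
    """Weighted mean of placebo-arm Yp under model (2.1) with a logit link.

    y0: viral loads of infected placebo recipients (assumed complete).
    p0, p1: infection probabilities in the placebo and vaccine arms.
    """
    target = phi * p1 / p0    # implied Pr(S(1) = 1 | S(0) = 1)
    if target <= 0.0 or target > 1.0:
        raise ValueError("phi, p0, p1 imply an impossible selection probability")
    if target == 1.0:
        return float(np.mean(y0))   # every infected placebo subject is always infected
    alpha0 = solve_alpha0(y0, beta0, target)
    w = expit(alpha0 + beta0 * y0)  # Pr(S(1) = 1 | S(0) = 1, Yp(0) = y)
    return float(np.sum(w * y0) / np.sum(w))

rng = np.random.default_rng(2)
y0 = rng.normal(4.5, 0.8, size=2000)   # hypothetical placebo-arm log10 viral loads
p0, p1, phi = 0.070, 0.067, 0.95       # hypothetical infection risks and phi

mu0_by_beta = {b: mu0_always_infected(y0, b, phi, p0, p1) for b in (-1.0, 0.0, 1.0)}
for beta0, mu0 in mu0_by_beta.items():
    print(f"beta0 = {beta0:+.1f}: mu0 = {mu0:.3f}")
```

With β0 = 0 there is no selection on viral load and the weighted mean reduces to the ordinary mean among infected placebo recipients; positive and negative values of β0 shift the estimate up and down, which is the variation a sensitivity analysis would report.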