Causal Inference in Hybrid Intervention Trials Involving Treatment Choice
ABSTRACT Although the randomized, controlled trial (RCT) is considered the gold standard in research for determining the efficacy of health education interventions, such trials may be vulnerable to "preference effects"; that is, differential outcomes depending on whether an individual is randomized to his or her preferred treatment. In this study, we review theoretical and empirical literature regarding designs that account for such effects in medical research, and consider the appropriateness of these designs to health education research. To illustrate the application of a preference design to health education research, we present analyses using process data from a mixed RCT/preference trial comparing two formats (Group or Self-Directed) of the "Women take PRIDE" heart disease management program. Results indicate that being able to choose one's program format did not significantly affect the decision to participate in the study. However, women who chose the Group format were over 4 times as likely to attend at least one class and were twice as likely to attend a greater number of classes than those who were randomized to the Group format. Several predictors of format preference were also identified, with important implications for targeting disease-management education to this population.

Causal Inference in Hybrid Intervention Trials Involving
Treatment Choice
Qi Long, Roderick J. Little, and Xihong Lin*
June 14, 2006
Abstract
Randomized allocation of treatments is a cornerstone of experimental design, but has drawbacks when only a limited set of individuals is willing to be randomized, or when the act of randomization undermines the success of the treatment. Choice-based experimental designs allow a subset of the participants to choose their treatments. We discuss here causal inferences for hybrid experimental designs where some participants are randomly allocated to treatments and others receive their treatment preference. This work was motivated by the "Women Take Pride" (WTP) study (Janevic et al., 2001), a doubly randomized preference trial (DRPT) to assess behavioral interventions for women with heart disease. We propose a model for estimating the causal effects in the subpopulations defined by treatment preferences, and hence preference effects. An EM algorithm is described for computing maximum likelihood estimates of the model parameters. The method is illustrated by analyzing sickness impact profile (SIP) scores and treatment adherence in the WTP data. Our results show 1) some evidence that SIP scores were improved when women received their preferred treatment; and 2) strong preference effects on program adherence; that is, women assigned to their preferred treatment were more likely to adhere to the program. We also provide a framework for assessing the DRPT and other hybrid trial designs, and discuss some alternative designs from the perspective of the strength of assumptions required to make causal inferences.
KEY WORDS: Clinical Trials; Doubly Randomized Preference Trials; EM algorithm; Partially Randomized Preference Trials; Randomization; Selection Bias.
*Qi Long is Rollins Assistant Professor, Department of Biostatistics, Emory University, Atlanta, GA, 30329; Roderick J. Little is Professor, Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109; Xihong Lin is Professor, Department of Biostatistics, Harvard University, Boston, MA, 02115. This research is supported by the National Cancer Institute grant R01CA76404. We would like to thank Noreen Clark for providing the WTP data and useful comments.
1 Introduction
Randomized assignment of subjects to treatments is a cornerstone of good experimental design. With full compliance and no missing data, randomization allows valid estimates of average treatment effects under minimal assumptions, by avoiding selection bias and ensuring that the observed mean for each treatment is an unbiased estimate of the overall mean if all individuals in the population had received that treatment (e.g., Little and Rubin, 2001).
However, it is also well known that randomization does not solve all problems in experiments involving alternative treatments. It is not always ethically feasible, as in medical trials when the principle of equipoise is not widely accepted, or in assessments of potentially harmful environmental effects that are unlikely to be beneficial. Inferences are only possible for the subset of subjects willing to be randomized, potentially excluding a significant fraction of the population, including subjects with strong treatment preferences. A behavioral treatment, in which strong motivation on the part of the participant is required and treatment assignment cannot be blinded, may be more successful if subjects are allowed to choose a treatment rather than being randomized to one. Random assignment to a treatment perceived to be inferior may lead to issues of noncompliance and missing data that undermine the randomization and complicate causal inferences.
An alternative to randomization is to simply allow participants to choose their treatments. Participants who choose their treatment are more likely to participate and fully comply with the protocol, and the trial may more realistically measure the outcome of the treatment in the population of interest. On the other hand, this approach has all the well-known problems of observational studies, in that treatment comparisons are obscured by the confounding effects of selection. Few experimentalists would accept the notion that the advantages of these designs compensate for their major weaknesses.
A natural question is whether there are hybrid designs between the extremes of randomization and choice that improve on either design. One candidate is Zelen's (1990) randomized consent design, which reverses the usual order of consent and randomization, by randomizing prior to consent and predicating the consent process on the randomized treatment. In a study with two alternative treatments (say A and B), participants are randomized to treatment A or B. Those randomized to A are asked if they are willing to receive A, after a discussion of the two treatments. If the participant agrees, A is given. If not, then B is given. The same procedure is followed in the other arm, with the roles of A and B reversed. Zelen (1990) shows that this design allows for valid tests of the null hypothesis of no treatment effect, and can be more powerful than a randomized design restricted to participants willing to be randomized. However, the ethics of describing the treatments after the treatment assignment has already been made have been questioned (Ellenberg, 1992). See Altman et al. (1995) for more discussion of this design.
We consider here other hybrid designs that combine features of randomization and patient choice of treatments (Lambert and Wood, 2000). Perhaps the simplest approach is to ask participants for their treatment preferences in the context of a conventional fully randomized trial (Torgerson et al., 1996), and use that information as a covariate or effect-modifier; however, randomizing to a treatment other than the stated preference may be problematic. A more radical approach is the partially randomized preference trial (PRPT), where participants who are willing to be randomized to the treatments are randomized, and those who are not are assigned to their preferred treatment (Brewin and Bradley, 1989). A variation on this theme is the doubly randomized preference trial (DRPT), where participants are randomized into a "randomization arm", within which treatments are randomized, and a "preference arm", within which participants get to choose their treatments. Versions of the DRPT are described by Rücker (1989), Wennberg et al. (1993) and Janevic et al. (2003). It seems plausible that in PRPTs and DRPTs, the additional information on participants who get to choose their treatments might usefully supplement the information from the participants who are randomized, although as discussed below this often requires modeling assumptions. Overall, hybrid designs enable us to estimate the preference effects and causal effects in subpopulations defined by treatment preferences, which cannot be estimated in a completely randomized trial.
We consider causal inference for these designs within the framework of potential outcomes, also known as Rubin's causal model (Holland, 1986). Originally formalized by Neyman (1923) in the context of randomized experiments, this framework was generalized and extended by Rubin (1974, 1977, 1978) to nonrandomized studies. The key underlying idea is that causal estimands are comparisons of the potential outcomes that would have been observed under different exposures of the same units to treatments at a particular place and time. Robins (1986, 1987) extended Rubin's "point treatment" potential outcome framework to evaluate direct and indirect effects of time-varying treatments in experimental and observational longitudinal studies. However, those methods are not directly applicable to the hybrid designs discussed in this paper.
The first objective of this paper is to propose a general model for assessing preference effects on the outcomes of interest. Our second objective is to propose a framework, based on recent statistical ideas of causal inference, for assessing the hybrid designs described above and their extensions. The basic idea is to classify individuals in the population into strata, which may or may not be observed for participants in the trial, and then assess the assumptions required to identify the causal effects of treatments within these strata. Causal effects are defined as the difference between the average outcome if all individuals within a stratum were assigned to treatment A and the average outcome if all individuals within that stratum were assigned to treatment B (e.g., Rubin, 1974).
Our paper is motivated by the "Women Take Pride" study (Janevic et al., 2003), which utilized a DRPT to assess behavioral interventions for women with heart disease. In Section 2 we describe the design of that study, and show how to estimate causal effects in subpopulations defined by treatment preference, using a method of moments approach similar to that proposed by Rücker (1989). In Section 3, we propose a conceptual model for analyzing DRPTs and other hybrid trial designs. In Section 4, we propose a more general likelihood-based method of analysis that accommodates covariates and estimates causal parameters using the EM algorithm, and in Section 5 we apply these methods to the WTP data. In Section 6, we present a simulation study to evaluate the performance of the proposed method. In Section 7 we discuss alternative designs from the perspective of the strength of assumptions required to make causal inference. Section 8 presents some concluding remarks.
2 The "Women Take Pride" Study
The "Women Take Pride" (WTP) intervention study (Janevic et al., 2001) concerns women aged 60 years and older with diagnosed cardiac disease being treated by daily heart medication. The interventions are behavioral programs aimed at enhancing the women's ability to manage their disease, based on principles of self-regulation from social cognitive theory. The comparison is between
two versions of an intervention consisting of six weekly units: a Group treatment ($T_i = B$), where 6 to 8 women meet for 2 to 2 1/2 hours in a group setting; and a Self-directed treatment ($T_i = A$), where
the participant studies at home following an initial orientation session. Motivation and support are provided through the social environment in the Group version, and through weekly telephone calls from the health educator and peer leader in the Self-directed treatment. These two versions of the intervention present the same material and differ only in format.
A DRPT design where some participants choose their treatment was seen as preferable to a completely randomized design in this setting, since the choice more accurately reflects a clinical situation. In a "real world" setting, patients may well have a preference for a group or self-directed format, and preference may well impact the motivation and adherence of the participants, and hence the success of the treatment. The design is summarized in Figure 1. At the first stage, a total of 3079 women with heart disease were randomized to a Random arm ($W_i = R$), where treatments were randomly assigned, and a Choice arm ($W_i = P$), where participants received their preferred treatment. Within the Random arm (n = 1613), 575 (35.6%) women agreed to participate; within the Choice arm (n = 1466), 496 (33.8%) agreed to participate. At the second stage, the women in the
Random arm were randomized to three groups: control (n = 184), the Group treatment B (n = 190) and the Self-Directed treatment A (n = 201); women in the Choice arm were asked to choose between the Group treatment B (n = 321) and the Self-Directed treatment A (n = 175). We do not analyze the data for the Control
group here, since our focus is on comparisons of the Group and Self-Directed treatments, and our general conceptual model is more simply presented for the case of two treatments. Extensions of our methods to more than two treatments are straightforward, however.
Primary outcomes $Y$ of the study are measures of improved physical and psychosocial functioning, frequency and severity of symptoms, and health-care utilization measured at baseline, 4, 12 and 18 months. In this paper, we analyze physical, psychological and total scores of the sickness impact profile (SIP) at month 12. These are measures of physical, psychological and total functional health status, scored between 0 and 100, with higher scores indicating greater impairment due to illness (Bergner et al., 1981). To make the normality assumption more plausible, we analyze all three SIP scores on the log-transformed scale after adding a constant 0.05. We conduct an intent-to-treat analysis of the SIP score outcomes. However, we also analyze a measure of treatment adherence as a secondary outcome measure, specifically a binary variable for whether a woman completed at least one unit of materials. Table 1a summarizes the observed data for those outcomes.
Table 1a shows that in the Random arm ($W_i = R$), women assigned the Self-Directed treatment ($T_i = A$) have higher average SIP scores than women assigned the Group treatment ($T_i = B$); the same trend is observed in the Choice arm ($W_i = P$), with a greater mean difference. Adherence rates are 76% for both treatments in the Random arm, while in the Choice arm the adherence rate is higher for the Group treatment (93%) than for the Self-Directed treatment (77%).
The treatment comparisons in the Random arm are valid because of the random allocation. On the other hand, the treatment comparisons in the Choice arm are potentially biased by the effects of self-selection. Let $C_i$ denote treatment preference, with $C_i = A$ if an individual prefers treatment A and $C_i = B$ if an individual prefers treatment B. In the Choice arm, the mean outcome for treatment A is for individuals with $C_i = A$, and the mean outcome for treatment B is for individuals with $C_i = B$. The comparison of these two means does not estimate a causal effect in a particular population, which is the key requirement for a causal effect in Rubin's (1974) sense. A direct comparison of these two means requires the very debatable assumption that the subpopulation with $C_i = A$ and the subpopulation with $C_i = B$ are equivalent with respect to treatment outcomes. This assumption might be made more plausible by regression adjustment for known characteristics of participants in the two groups, but as in any observational setting, such adjustments do not necessarily remove the bias. A direct comparison of individuals with $C_i = A$ (B) in the choice arm and individuals with $T_i = A$ (B) in the random arm, as was done in Janevic et al. (2003), is problematic for similar reasons.
A causal analysis that addresses the issue of choice constructs estimates of the mean outcomes of treatments A and B within each of the two preference subpopulations $C_i = A$ and $C_i = B$. This is not possible from data in the Choice arm alone, because it requires outcome data for participants who do not receive their treatment of choice. However, it can be addressed with a DRPT, since some participants in the Random arm do not receive their treatment of choice, and treatment assignment remains random within the two preference subpopulations, $C_i = A$ and $C_i = B$. Specifically, define
• $\mu(A)$ = overall mean outcome if assigned to treatment A in the whole population
• $\mu_A(A)$ = mean outcome if assigned to treatment A in the subpopulation that prefers A ($C_i = A$)
• $\mu_B(A)$ = mean outcome if assigned to treatment A in the subpopulation that prefers B ($C_i = B$),
and define $\mu(B)$, $\mu_A(B)$ and $\mu_B(B)$ as the corresponding mean outcomes if assigned to the Group treatment B, in the overall population and the two subpopulations respectively. Let $\pi_B$ be the proportion of the population that prefers the Group treatment ($C_i = B$). Then
$$\mu(A) = \pi_B\,\mu_B(A) + (1 - \pi_B)\,\mu_A(A),$$
$$\mu(B) = \pi_B\,\mu_B(B) + (1 - \pi_B)\,\mu_A(B).$$
From the choice arm, we obtain the estimates $\hat{\mu}_A(A)$, $\hat{\mu}_B(B)$, and $\hat{\pi}_B = 321/496 = 0.65$ (Fig. 1). From the random arm, we obtain $\hat{\mu}(A)$ and $\hat{\mu}(B)$. Thus, $\hat{\mu}_B(A)$ and $\hat{\mu}_A(B)$ can be estimated by solving a set of linear equations.
Let $\Delta_B = \mu_B(B) - \mu_B(A)$, the difference in outcome means for treatments B and A in the subpopulation that prefers B. Then $\Delta_B$ can be estimated as
$$\hat{\Delta}_B = \hat{\mu}_B(B) - \hat{\mu}_B(A).$$
Similarly, in the subpopulation that prefers A, we have $\Delta_A = \mu_A(B) - \mu_A(A)$, which is estimated by
$$\hat{\Delta}_A = \hat{\mu}_A(B) - \hat{\mu}_A(A).$$
The difference $\Delta_B - \Delta_A$ is defined as the preference effect for the two treatments, and measures the extent to which treatment preference modifies the treatment effect.
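The linear equations above can be solved in closed form. The following is a minimal Python sketch of that calculation, not the authors' code: the function name `preference_effects` and its argument names are hypothetical, and the inputs in the usage line are the rounded adherence rates quoted earlier in this section (76% in both Random-arm groups, 77% and 93% in the Choice arm), so the output is only approximate.

```python
def preference_effects(mu_A_rand, mu_B_rand, mu_AA, mu_BB, pi_B):
    """Method-of-moments estimates for a two-treatment DRPT.

    mu_A_rand, mu_B_rand: Random-arm means under A and B, estimating mu(A), mu(B).
    mu_AA, mu_BB: Choice-arm means, estimating mu_A(A) and mu_B(B).
    pi_B: Choice-arm proportion preferring B.
    (All argument names are illustrative, not taken from the paper.)
    """
    if not 0.0 < pi_B < 1.0:
        raise ValueError("pi_B must lie strictly between 0 and 1")
    # Solve mu(A) = pi_B * mu_B(A) + (1 - pi_B) * mu_A(A) for mu_B(A).
    mu_BA = (mu_A_rand - (1.0 - pi_B) * mu_AA) / pi_B
    # Solve mu(B) = pi_B * mu_B(B) + (1 - pi_B) * mu_A(B) for mu_A(B).
    mu_AB = (mu_B_rand - pi_B * mu_BB) / (1.0 - pi_B)
    delta_B = mu_BB - mu_BA  # effect of B versus A among those preferring B
    delta_A = mu_AB - mu_AA  # effect of B versus A among those preferring A
    return {
        "mu_B(A)": mu_BA,
        "mu_A(B)": mu_AB,
        "delta_B": delta_B,
        "delta_A": delta_A,
        "preference_effect": delta_B - delta_A,
    }


# Rounded adherence rates from the text; pi_B = 321/496 from the Choice arm.
est = preference_effects(0.76, 0.76, 0.77, 0.93, 321 / 496)
print(est)
```

Because the off-diagonal means are obtained by dividing by $\hat{\pi}_B$ or $1 - \hat{\pi}_B$, the estimates become unstable when nearly everyone prefers the same treatment, and for a binary outcome nothing in this calculation constrains them to lie between 0 and 1.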
We first apply this method to our health outcome measures in the WTP study. Our results (Table 1b) show that for women who preferred the Group format, average SIP physical scores at month 12 were lower ($-0.370$) when they were assigned to the Group format than when they were assigned to the Self-Directed format; for women who preferred the Self-Directed format, average SIP physical scores at month 12 were higher ($+0.080$) when they were assigned to the Group format than when they were assigned to the Self-Directed format. These results, though not statistically significant, are in the direction of women having better physical functional health status when assigned their treatment of choice. Our results also show a similar trend for the other two SIP scores (results not included). The intervention effects as well as the preference effects, however, are not statistically significant.
We now apply our method to study intervention adherence. The results show that for women who prefer the Group format, the adherence rate when assigned to the Group format is estimated to be 18% ($P < 0.001$) higher than the adherence rate when assigned to the Self-Directed format; for women who prefer the Self-Directed format, the adherence rate when assigned to the Self-Directed format is estimated to be 33% ($P < 0.001$) higher than the adherence rate when assigned to the Group format. These results indicate that women are more likely to adhere to the program they prefer. The preference effect, defined as the treatment effect difference between the two subpopulations, is thus estimated as $\hat{\Delta}_B - \hat{\Delta}_A = 0.51$ ($P < 0.001$). It follows that the treatment effects on adherence are highly significant in the two subpopulations and that they are significantly different. The results suggest that the very similar adherence rates for the two intervention groups in the Random arm mask strong preference effects, with much higher adherence rates for the preferred interventions.
This analysis method is a more formal description of the method of Rücker (1989). It does not require a distributional assumption and is applicable to estimating causal effects on an arbitrary outcome, e.g., the causal effect on a health outcome at a post-intervention time point such as month 12. We now describe a framework that elucidates the implicit assumptions in this analysis, and then generalize the analysis to include covariates.
3 A Conceptual Model for Analyzing Hybrid Trial Designs
We present a general conceptual model for assessing hybrid intervention trials like the WTP study, which clarifies assumptions that are implicit in the above analysis. This framework will also be applied to assess other designs in Section 7. For simplicity we focus on designs involving just two treatments A and B, although the framework extends in an obvious way to designs with more than two treatments. We first stratify the target population into five groups (Figure 2(a)):
1. The set of individuals unwilling to participate even if given their choice of treatments ($\bar{P}$). Clearly we cannot learn anything empirically about treatment effects for this group without making assumptions that relate it to a group we can study. We do not consider this group further here.
2. The set of individuals willing to participate if given the choice of treatment ($P$). We stratify this group into four subpopulations:
(a) Individuals that prefer A and will not participate unless allowed to choose A ($P\bar{R}A$).
(b) Individuals that prefer A but are willing to participate in a randomized trial ($PRA$).
(c) Individuals that prefer B but are willing to participate in a randomized trial ($PRB$).
(d) Individuals that prefer B and will not participate unless allowed to choose B ($P\bar{R}B$).
We consider two versions of each treatment, a version where the treatment is chosen by the participant ($A_C$ and $B_C$) and a version where the treatment is assigned by randomization ($A_R$ and $B_R$). We allow for the possibility that outcomes under these two versions of each treatment might differ. Throughout we assume the potential outcomes for each individual do not depend on the treatment status of other individuals in the sample, the so-called Stable Unit Treatment Value Assumption (SUTVA) (Rubin, 1978).
The table in Figure 2(b) results from crossing subpopulation stratum with treatment. Mean outcomes in the empty cells can be estimated from the data, but mean outcomes in the cells labeled F are inestimable, since they are a priori counterfactuals (Angrist, Imbens and Rubin, 1996; hereafter AIR): we cannot observe outcomes in these cells under any design. For example, we do not get to see the effect of randomizing to A ($A_R$) in the subpopulation of individuals who prefer A but will not participate in a randomized trial ($P\bar{R}A$). Opinions differ on the extent to which it is meaningful to consider treatment outcomes in such cells, and this needs to be considered in the specific context of each trial. In our discussion we follow AIR and focus attention on outcomes that are measurable under some design, that is, the cells without F's in Figure 2(b). Causal comparisons (in Rubin's sense) are comparisons of column means within rows of Figure 2(b). Comparisons between means in different rows are not causal since they concern different subpopulations.
The WTP study described above is an example of a particular hybrid trial design, the Doubly Randomized Preference Trial (DRPT), which has the generic form of Figure 3(a). People willing to participate ($P$) are first randomized to a choice arm and a random arm. Within the choice arm, people receive their treatment of choice ($PA$ or $PB$). Within the random arm, individuals willing to be randomized ($R$) are randomized to $A_R$ or $B_R$, and those not willing to be randomized do not participate. Figure 3(b) indicates that mean outcomes can be estimated directly for four pooled subpopulations: the mean for $A_C$ in the subpopulation that prefers A, namely $PRA \cup P\bar{R}A$; the means for $A_R$ and $B_R$ in the subpopulation that is willing to be randomized, namely $PRA \cup PRB$; and the mean for $B_C$ in the subpopulation that prefers B, namely $PRB \cup P\bar{R}B$. The only causal effect that is estimable directly without additional assumptions is the comparison of $A_R$ and $B_R$ in the combined population $PR = PRA \cup PRB$; this is the treatment comparison from the random arm of the study.
If the outcome for an individual randomly assigned to a treatment is the same as if that individual had chosen that treatment, then $A_R = A_C = A$ and $B_R = B_C = B$. We follow other authors by calling this assumption the "exclusion restriction" (ER), since it is an example of an exclusion restriction in the sense that the term is used in econometrics (Angrist, Imbens and Rubin, 1996). Under the ER assumption, the four columns in Figure 3(b) reduce to two. Additional assumptions are still needed to estimate the individual cells in the table, since there are eight cells (two a priori counterfactual) and only four means can be directly estimated from the data. Suppose now we also assume that the random arm and choice arm participants are random samples from the same population, that is, $PRA = PA$ and $PRB = PB$. We call this assumption "no selection bias from randomization" (NSBR), which allows us to combine the information from the Random and Choice arms of the study. Under ER and NSBR, the table in Figure 3(b) collapses to Figure 3(c) with just four cells, and the mean outcomes of A and B in the $PA$ and $PB$ subpopulations are then identified. The method described in Section 2 estimates these means, using information from the random and choice arms.
In particular, the analysis of the WTP data in the previous section implicitly makes the ER and NSBR assumptions. The mean outcomes in the two diagonal cells of Figure 3(c) are estimated from the Choice arm, and the column marginal mean outcomes are estimated from the Random arm. The remaining off-diagonal cells can then be estimated, with the proportion of participants that prefer A estimated from the choice arm. In support of the NSBR assumption, we note that the proportion of screened individuals agreeing to participate is comparable in the randomization (575/1613) and choice (496/1466) arms. If a sizeable proportion of the population only participated if given their treatment of choice, we would expect the participation rate to be higher in the choice arm than in the randomization arm. We discuss designs under which the ER and NSBR assumptions can be relaxed in Section 7.
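One way to put a number on "comparable" participation rates is a two-sample test of proportions. The sketch below is an illustration added here, not an analysis reported in the paper: the pooled z-test is a swapped-in standard technique, the function name `two_proportion_z` is hypothetical, and only the four counts come from the text.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z-test for equality of two proportions.

    Returns (p1, p2, z, two_sided_p_value). Illustrative helper only.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return p1, p2, z, p_value


# Participation counts from the text: Random arm 575/1613, Choice arm 496/1466.
p_rand, p_choice, z, p_value = two_proportion_z(575, 1613, 496, 1466)
print(f"random={p_rand:.3f} choice={p_choice:.3f} z={z:.2f} p={p_value:.2f}")
```

A non-significant difference is consistent with NSBR but does not establish it: equal participation rates could still conceal differences in who participates in each arm.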
4 A General Model for a DRPT with Covariates
Janevic et al. (2003) found that the probability of choosing each treatment was affected by demographic variables and disease severity. The outcomes within the two subpopulations ($C_i = A$ and $C_i = B$) are also likely to be affected by baseline covariates other than the treatments. We hence extend the analysis of Section 2 to accommodate covariates. As before, we consider a DRPT with two treatments, generically denoted as A and B. In this section, we assume ER and NSBR within the subpopulations defined by values of the covariates.
4.1 The Model
Suppose the data comprise $n$ subjects. For subject $i$, let $Y_i$ denote the observed outcome of interest, $T_i$ denote the treatment assignment (A or B), $C_i$ denote the treatment preference (A or B), $W_i = R$ if subject $i$ is randomized to the random arm and $W_i = P$ if subject $i$ is randomized to the choice arm, $X_{1i}$ be a set of covariates associated with $Y_i$, and $X_{2i}$ be a set of covariates associated with $C_i$, where $X_{1i}$ and $X_{2i}$ may overlap. In the application to the WTP data, the observed outcome of interest is the SIP physical score, the covariates associated with the SIP physical score are age, employment status and some baseline measures, and the covariates associated with treatment preference $C_i$ are employment status, baseline total symptom impact and baseline SIP physical score.
Let $Y_i(A)$ and $Y_i(B)$ be the potential outcomes of $Y$ for subject $i$ when $T_i = A$ and $T_i = B$, respectively. The average causal effect of treatment assignment for the whole population is
$$\Delta = E\{Y_i(B)\} - E\{Y_i(A)\}.$$
The average causal effect of treatment assignment for the subpopulation preferring treatment $m$ ($m = A, B$) and having covariates $X_1$ is
$$\Delta_m(X_1) = E\{Y_i(B) \mid C_i = m, X_1\} - E\{Y_i(A) \mid C_i = m, X_1\}.$$
Averaging over the distribution of $X_1$, the average causal effect of treatment assignment for the subpopulation preferring treatment $m$ equals
$$\Delta_m = E_{X_1 \mid C_i = m}\left[E\{Y_i(B) \mid C_i = m, X_1\} - E\{Y_i(A) \mid C_i = m, X_1\}\right].$$
The causal parameters $\Delta_m(X_1)$ and $\Delta_m$ can be related to estimable quantities under the ER and NSBR assumptions. Specifically, since subjects are randomized to the Choice or Random arms, we have
$$E\{Y_i(j) \mid C_i = m, X_1\} = E\{Y_i(j) \mid C_i = m, X_1, W_i\},$$
where $m$ and $j$ take values $A, B$. In the Random arm, since subjects are randomized to treatment groups, we have
$$E\{Y_i(j) \mid C_i = m, X_1, W_i = R\} = E\{Y_i(j) \mid T_i = j, C_i = m, X_1, W_i = R\} = E(Y_i \mid T_i = j, C_i = m, X_1, W_i = R).$$
In the Choice arm, subjects are assigned to treatments they prefer, that is, $T_i = C_i$. Thus for $j = A, B$,
$$E\{Y_i(j) \mid C_i = j, X_1, W_i = P\} = E\{Y_i(j) \mid T_i = C_i, C_i = j, X_1, W_i = P\} = E(Y_i \mid T_i = j, C_i = j, X_1, W_i = P).$$
We cannot estimate $E\{Y_i(A) \mid C_i = B, X_1, W_i = P\}$ and $E\{Y_i(B) \mid C_i = A, X_1, W_i = P\}$ from data from the Choice arm alone. However, under the ER and NSBR assumptions, we have
$$E\{Y_i(A) \mid C_i = B, X_1, W_i = P\} = E\{Y_i(A) \mid T_i = A, C_i = B, X_1, W_i = R\} = E\{Y_i \mid T_i = A, C_i = B, X_1, W_i = R\},$$
$$E\{Y_i(B) \mid C_i = A, X_1, W_i = P\} = E\{Y_i(B) \mid T_i = B, C_i = A, X_1, W_i = R\} = E\{Y_i \mid T_i = B, C_i = A, X_1, W_i = R\}.$$
Hence, we can use data in the random arm in conjunction with the data in the choice arm to estimate these quantities, by viewing each group in the random arm as a mixture of the two preference subpopulations.
For notational simplicity, we recode the values of $T_i$ and $C_i$ by replacing A by 1 and B by 0. We assume that the distribution of $Y_i$ given $T_i$, $C_i$, $X_{1i}$ and $W_i$ belongs to the exponential family
$$f(Y_i \mid T_i, C_i, X_{1i}, W_i) = \exp\left\{\frac{Y_i\theta_i - b(\theta_i)}{\phi a_i^{-1}} + c(Y_i, \phi)\right\},$$
where $a_i$ is a known constant, $\phi$ is a scale parameter, $\theta_i$ is the canonical parameter, and $b(\cdot)$ and $c(\cdot)$ are known functions. The mean of $Y_i$ is $\mu_i = E(Y_i \mid T_i, C_i, X_{1i}, W_i) = b'(\theta_i)$ and is assumed to have the form
$$g(\mu_i) = \beta_0 + X_{1i}^T\beta_{X_1} + T_i\beta_T + C_i\beta_C + T_iC_i\beta_{TC}, \tag{1}$$
where $g(\cdot)$ is a monotonic link function (McCullagh and Nelder, 1989). The model is completed by assuming the treatment preference $C_i$ given $X_{2i}$ follows a logistic model with $\pi_i = \Pr(C_i = 1 \mid X_{2i})$ satisfying
$$\mathrm{logit}(\pi_i) = \alpha_0 + X_{2i}^T\alpha_{X_2}. \tag{2}$$
The causal effect of treatment assignment for the subpopulation preferring treatment $m$ ($m = 1, 0$) given covariates $X_1$ is
$$\Delta_m(X_1) = E\{Y(1) \mid C = m, X_1\} - E\{Y(0) \mid C = m, X_1\} = g^{-1}\{\beta_0 + X_1^T\beta_{X_1} + \beta_T + m(\beta_C + \beta_{TC})\} - g^{-1}(\beta_0 + X_1^T\beta_{X_1} + m\beta_C).$$
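To make the role of the link function concrete, here is a small Python sketch that evaluates the conditional effect $\Delta_m(X_1)$ from a set of coefficients. It is an illustration under stated assumptions: the function names and all coefficient values are hypothetical, and the logistic inverse link is just one possible choice of $g^{-1}$.

```python
import math


def expit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def conditional_effect(m, x1, beta0, beta_x1, beta_T, beta_C, beta_TC, inv_link=expit):
    """Delta_m(x1): mean outcome at T = 1 minus mean outcome at T = 0,
    among subjects with preference C = m and covariates x1."""
    xb = sum(x * b for x, b in zip(x1, beta_x1))
    eta_t1 = beta0 + xb + beta_T + m * (beta_C + beta_TC)  # linear predictor, T = 1
    eta_t0 = beta0 + xb + m * beta_C                       # linear predictor, T = 0
    return inv_link(eta_t1) - inv_link(eta_t0)


# Hypothetical coefficients, chosen only for illustration.
coef = dict(beta0=0.2, beta_x1=[1.0], beta_T=0.3, beta_C=-0.1, beta_TC=0.4)

# Identity link: Delta_1 - Delta_0 reduces to beta_TC, whatever x1 is.
d1 = conditional_effect(1, [0.5], inv_link=lambda eta: eta, **coef)
d0 = conditional_effect(0, [0.5], inv_link=lambda eta: eta, **coef)
print(d1 - d0)

# Logit link: the effect is a difference of probabilities and depends on x1.
print(conditional_effect(1, [0.5], **coef), conditional_effect(1, [2.0], **coef))
```

Under the identity link the preference effect equals the interaction coefficient $\beta_{TC}$ for every covariate value; under a nonlinear link it varies with $X_1$.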
The marginal causal effects $\Delta_m$ have the form
$$\Delta_m = \int \Delta_m(X_1 = x_1) f(x_1 \mid C = m)\, dx_1, \tag{3}$$
where $f(x_1 \mid C = m)$ can be empirically estimated from the Choice arm, that is, $\hat{f}(x_1 \mid C = m) = \sum_i I(C_i = m, X_{1i} = x_1) / \sum_i I(C_i = m)$, where $I(\cdot)$ is an indicator function.
In the random arm $(W_i, T_i, Y_i)$ are observed but $C_i$ is not observed. In the choice arm, $(W_i, T_i, Y_i, C_i)$ are observed and $T_i = C_i$, since subjects are assigned to their preferred treatment. Denote by $\theta = (\beta, \phi, \alpha)$ the parameter vector. Define $Y = (Y_1, Y_2, \ldots, Y_n)^T$, and $T$, $C$, $X_1$, $X_2$, $W$ similarly.
The observed-data log-likelihood is given by
$$\ell(Y, C_{obs} \mid T, X_1, X_2, W, \theta) = \sum_{W_i = P} [\log\{f(Y_i \mid T_i, X_{1i}, C_i)\} + C_i \log(\pi_i) + (1 - C_i)\log(1 - \pi_i)] + \sum_{W_i = R} [\log\{\pi_i f(Y_i \mid T_i, X_{1i}, C_i = 1) + (1 - \pi_i) f(Y_i \mid T_i, X_{1i}, C_i = 0)\}], \tag{4}$$
where $C_{obs}$ denotes the observed $C$ values for the Choice arm, $f(Y_i \mid T_i, X_{1i}, C_i)$ follows the generalized linear model (1), and $\pi_i$ follows the logistic model (2).
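The structure of the observed-data log-likelihood (4) can be made concrete with a minimal sketch, assuming a binary outcome, a logit link, and scalar covariates. The function name, the record layout, and the toy records are hypothetical and serve only to show how the choice arm contributes complete-data terms while the random arm contributes mixture terms.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def bernoulli_pmf(y, p):
    """f(y | p) for a binary outcome."""
    return p if y == 1 else 1.0 - p

def observed_loglik(data, beta, alpha):
    """Observed-data log-likelihood for binary Y with a logit link.

    data  : list of dicts with keys arm ('P' or 'R'), t, y, x1, x2 and,
            in the choice arm only, c.
    beta  : (b0, bx, bt, bc, btc) as in model (1), scalar x1.
    alpha : (a0, ax) as in model (2), scalar x2.
    """
    b0, bx, bt, bc, btc = beta
    a0, ax = alpha

    def outcome_prob(t, c, x1):
        return expit(b0 + x1 * bx + t * bt + c * bc + t * c * btc)

    total = 0.0
    for row in data:
        pi = expit(a0 + row["x2"] * ax)        # Pr(C = 1 | x2)
        t, y, x1 = row["t"], row["y"], row["x1"]
        if row["arm"] == "P":                  # preference observed
            c = row["c"]
            total += math.log(bernoulli_pmf(y, outcome_prob(t, c, x1)))
            total += c * math.log(pi) + (1 - c) * math.log(1 - pi)
        else:                                  # preference latent: mixture
            mix = (pi * bernoulli_pmf(y, outcome_prob(t, 1, x1))
                   + (1 - pi) * bernoulli_pmf(y, outcome_prob(t, 0, x1)))
            total += math.log(mix)
    return total

# Hypothetical toy records, not WTP data.
toy = [
    {"arm": "P", "t": 1, "c": 1, "y": 1, "x1": 0.2, "x2": 0.7},
    {"arm": "P", "t": 0, "c": 0, "y": 0, "x1": 0.9, "x2": 0.1},
    {"arm": "R", "t": 1, "y": 1, "x1": 0.4, "x2": 0.5},
    {"arm": "R", "t": 0, "y": 0, "x1": 0.6, "x2": 0.3},
    {"arm": "R", "t": 1, "y": 0, "x1": 0.1, "x2": 0.8},
]
ll_null = observed_loglik(toy, beta=(0, 0, 0, 0, 0), alpha=(0, 0))
```

With all coefficients set to zero, each choice-arm subject contributes two factors of one half and each random-arm subject contributes one, which gives a simple check on the two branches.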
4.2 Estimation Using the EM Algorithm
An EM algorithm (Dempster, Laird and Rubin 1977) can be used to calculate the maximum likelihood (ML) estimate of $\theta$ for the above model. The complete data $(Y_i, C_i, T_i, X_{1i}, X_{2i}, W_i)$ have log-likelihood
$$\ell(Y, C \mid T, X_1, X_2, W, \theta) = \sum_i \ell(Y_i, C_i \mid T_i, X_{1i}, X_{2i}, W_i, \theta) = \sum_i [\log\{f(Y_i \mid T_i, X_{1i}, C_i, W_i)\} + C_i \log(\pi_i) + (1 - C_i)\log(1 - \pi_i)]. \tag{5}$$
The EM algorithm iterates between an E step, which replaces missing values of $C_i$ in (5) by their conditional expectations given the observed data, and an M step, which maximizes the expected complete-data log-likelihood (5) to yield updated parameter estimates.
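1. E Step at the $k$th iteration. We calculate the expected complete-data log-likelihood given $Y$, $C_{obs}$, $T$, $X_1$, $X_2$, $W$ and the current parameter estimates $\theta^{(k)}$, namely
$$E\{\ell(Y, C \mid T, X_1, X_2, W) \mid Y, C_{obs}, T, X_1, X_2, W, \theta^{(k)}\} = \sum_{W_i = R} \sum_{m = 0, 1} w_{i,m}^{(k)} \ell(Y_i, C_i = m \mid T_i, X_{1i}, X_{2i}, \theta^{(k)}) + \sum_{W_i = P} \ell(Y_i, C_i \mid T_i, X_{1i}, X_{2i}, \theta^{(k)}), \tag{6}$$
where for participants in the random arm ($W_i = R$),
$$w_{i,m}^{(k)} = p(C_i = m \mid Y_i, T_i, X_{1i}, X_{2i}, \theta^{(k)}) = \frac{f(Y_i \mid T_i, X_{1i}, C_i = m, \theta^{(k)})\, p(C_i = m \mid T_i, X_{2i}, \theta^{(k)})}{\pi_i^{(k)} f(Y_i \mid T_i, X_{1i}, C_i = 1, \theta^{(k)}) + (1 - \pi_i^{(k)}) f(Y_i \mid T_i, X_{1i}, C_i = 0, \theta^{(k)})},$$
where $\pi_i^{(k)} = p(C_i = 1 \mid T_i, X_{2i}, \theta^{(k)})$. The E step estimates the weights $w_{i,m}^{(k)}$ in the weighted complete-data log-likelihood (5) for the random arm.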
2. M Step at the $k$th iteration. This step updates the parameter estimates $\theta^{(k+1)}$ by maximizing the expected complete-data log-likelihood (5). We first construct an augmented data set as follows. Each observation in the random arm, with $C_i$ missing, is replaced by two "filled-in" observations, in which the missing treatment preference indicator $C_i$ is replaced by 0 and 1 respectively, and the corresponding weights $w_{i,0}$ and $w_{i,1}$ are computed using the current estimates of the parameters. The observations in the choice arm are unchanged, with weights set to one. Using this augmented data set, we fit a weighted generalized linear model for $\beta^{(k+1)}$ and $\phi^{(k+1)}$, and a weighted logistic model for $\alpha^{(k+1)}$.
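The two steps above can be sketched for the simplest setting, assuming a binary outcome and no covariates. Under that assumption the outcome model is saturated in $(T_i, C_i)$, so the weighted model fits in the M step reduce to weighted proportions. The function name, record layout, and toy counts are hypothetical; the counts were constructed so that the ML estimates are known in advance.

```python
from collections import defaultdict

def bern(y, p):
    """Bernoulli mass f(y | p)."""
    return p if y == 1 else 1.0 - p

def em_no_covariates(records, n_iter=1000):
    """EM for the hybrid design with binary Y and no covariates.

    records: tuples (arm, t, y, c); arm is 'P' (choice) or 'R' (random),
             and c is None in the random arm, where preference is latent.
    Returns (pi, p): pi = Pr(C = 1), p[(t, c)] = Pr(Y = 1 | T = t, C = c).
    """
    n = len(records)
    pi = 0.5
    p = {(t, c): 0.5 for t in (0, 1) for c in (0, 1)}
    for _ in range(n_iter):
        wsum = defaultdict(float)  # total weight in each (t, c) cell
        ysum = defaultdict(float)  # weighted count of Y = 1 in each cell
        for arm, t, y, c in records:
            if arm == "P":
                # Choice arm: preference observed, weight fixed at one.
                rows = [(c, 1.0)]
            else:
                # E step: posterior probability that C = 1 given (t, y).
                num = pi * bern(y, p[(t, 1)])
                den = num + (1.0 - pi) * bern(y, p[(t, 0)])
                w1 = num / den if den > 0 else pi
                # Two filled-in rows, one per possible preference.
                rows = [(1, w1), (0, 1.0 - w1)]
            for c_fill, w in rows:
                wsum[(t, c_fill)] += w
                ysum[(t, c_fill)] += w * y
        # M step: weighted proportions on the augmented data set.
        pi = (wsum[(1, 1)] + wsum[(0, 1)]) / n
        for cell in p:
            if wsum[cell] > 0:
                p[cell] = ysum[cell] / wsum[cell]
    return pi, p

# Hypothetical counts (not WTP data), built so that the ML estimates are
# pi = 0.5, p11 = 0.8, p10 = 0.4, p01 = 0.3, p00 = 0.6.
toy = ([("P", 1, 1, 1)] * 80 + [("P", 1, 0, 1)] * 20
       + [("P", 0, 1, 0)] * 60 + [("P", 0, 0, 0)] * 40
       + [("R", 1, 1, None)] * 60 + [("R", 1, 0, None)] * 40
       + [("R", 0, 1, None)] * 45 + [("R", 0, 0, None)] * 55)
pi_hat, p_hat = em_no_covariates(toy)
```

With covariates, the weighted proportions would be replaced by weighted regression fits on the same augmented data set; the weights and the filled-in rows are constructed in the same way.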
The resulting estimates from the EM algorithm at convergence give the ML estimates of $\theta$. For the special case of no covariates $X_{1i}$, $X_{2i}$ and a binary outcome, the ML estimates of $\theta$ transformed to the mean scale reduce to the method of moments estimates in Section 2.2. A similar EM algorithm was proposed by Ibrahim (1990) for missing categorical covariates. Our EM algorithm extends his algorithm to the hybrid randomized-preference design by estimating the weights in the random arm and fixing the weights to be one in the choice arm. We consider estimating standard errors of the ML estimates using the bootstrap, the observed information, and the approximation proposed in Ibrahim (1990). The observed information is obtained by directly computing the second derivative of the observed likelihood using a symbolic differentiation algorithm.
5 Analysis of the WTP data
We now apply the methods of Section 4 to the WTP data, to estimate preference effects adjusting for covariates. Group format is coded as 1 and Self-Directed format is coded as 0. Based on the previous analysis in Janevic et al. (2003), we consider the following covariates in models (1) and (2): employment status, age, total symptom impact at baseline, which is a measure of symptom severity scored between 0 and 70 (Clark et al., 1997), and SIP scores at baseline.
Table 2 presents the estimates of the regression parameters in (1) and (2) and their estimated standard errors for the outcome SIP physical score. The results from model (2) show that employed women (OR = 1.84, P = 0.046) and women with greater physical limitations at baseline (OR = 1.03, P = 0.012) are more likely to choose the Self-Directed format, suggesting that these women tend to opt for the more flexible scheduling that the Self-Directed format provides. Women with a higher total symptom impact score at baseline are less likely to choose the Self-Directed format (OR = 0.97, P = 0.013), suggesting that these women may be interested in opportunities to meet women in a similar situation, which the Group format provides. The magnitudes of these effects on treatment preference are comparable to those in Janevic et al. (2003), which includes more detailed discussion of predictors of program format preference. In terms of the covariate effects on SIP physical score, women with higher baseline SIP score or total symptom impact score have higher SIP physical score at month 12; other baseline covariates are not significant (P > 0.05). The significant interaction between treatment preference and treatment assignment (P = 0.039) suggests an effect of treatment preference for this outcome, with women having a better SIP physical score when assigned their treatment of choice.
Table 3 summarizes the treatment and preference effects on the subpopulation means for SIP physical score and the other two SIP outcomes. These adjusted means are obtained by integrating over the distributions of the covariates using equation (3). The treatment and preference effects for the SIP psychological and SIP Total scores have a similar pattern to that for SIP physical score, but are not statistically significant. Overall, our analysis shows limited evidence of a benefit of women getting the types of treatment they prefer. This finding addresses one of the hypotheses proposed by the investigators in the WTP study, that is, making both treatment formats available to women may be advantageous.
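The integration over the covariate distribution in equation (3) amounts to averaging the conditional effect over Choice-arm subjects with the given preference. A minimal sketch follows, assuming a logit link and a scalar covariate; the function names, coefficients, and records are hypothetical and are not the WTP estimates.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def conditional_effect(m, x1, beta):
    """Conditional assignment effect for preference m at covariate x1."""
    b0, bx, bt, bc, btc = beta
    return (expit(b0 + x1 * bx + bt + m * (bc + btc))
            - expit(b0 + x1 * bx + m * bc))

def marginal_effect(m, choice_arm, beta):
    """Average the conditional effect over the empirical covariate
    distribution of choice-arm subjects whose preference is m.

    choice_arm: list of (c, x1) pairs observed in the choice arm.
    """
    xs = [x1 for c, x1 in choice_arm if c == m]
    return sum(conditional_effect(m, x1, beta) for x1 in xs) / len(xs)

# Hypothetical choice-arm records (preference, covariate) and coefficients.
choice_arm = [(1, 0.0), (1, 1.0), (0, 0.5), (0, 0.5)]
beta = (-2.0, 2.0, 2.0, 2.0, -2.0)
marg_1 = marginal_effect(1, choice_arm, beta)
marg_0 = marginal_effect(0, choice_arm, beta)
```

Because the empirical distribution places mass 1/n on each observed covariate value, the integral becomes a plain average over the relevant Choice-arm subjects.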
The last row of Table 3 shows strong preference effects on treatment adherence. The marginal causal treatment effect on adherence on the probability scale for the subpopulation preferring the Group treatment is $\hat{\Delta}_1 = 0.208$ (P < 0.001), and the marginal causal treatment effect on adherence on the probability scale for the subpopulation preferring the Self-Directed treatment is $\hat{\Delta}_0 = -0.336$ (P < 0.001). These results show that women who prefer the Group treatment are 20.8 percentage points more likely to adhere to the treatment if assigned to the Group format than if assigned to the Self-Directed format, whereas women who prefer the Self-Directed treatment are 33.6 percentage points more likely to adhere if assigned to the Self-Directed treatment than if assigned to the Group treatment. These results are consistent with the findings in Section 2 that women are more likely to adhere to the program they prefer. The covariate-adjusted treatment effects are slightly stronger than those in Section 2.2 without covariate adjustments.
6 A Simulation Study
We conducted a simulation study to evaluate the finite sample performance of the proposed method. The design of the simulation study was similar to that of the WTP study. Each data set consisted of 1000 observations. Independent binary observations $C_i$ of the treatment preference indicator were generated using the logistic model
$$\mathrm{logit}\{\Pr(C_i = 1)\} = \alpha_0 + X_{2i}\alpha_1,$$
where $\alpha_0 = -1$, $\alpha_1 = 2$, and the $X_{2i}$ were generated from a uniform distribution on the interval $[0, 1]$. The outcome variable $Y_i$ was assumed to be binary and generated independently using the logistic model
$$\mathrm{logit}\{E(Y_i)\} = \beta_0 + X_{1i}\beta_1 + T_i\beta_T + C_i\beta_C + T_i C_i\beta_{TC},$$
where $\beta_0 = -2$, $\beta_1 = \beta_T = \beta_C = 2$, $\beta_{TC} = -2$, and the $X_{1i}$ were generated from a uniform distribution on the interval $[0, 1]$. Values of $W_i$ denoting random versus choice arm were generated from a Bernoulli distribution with $P(W_i = R) = 0.5$. The treatment assignment indicators $T_i$ were set equal to $C_i$ in the choice arm ($W_i = P$), and were generated from a Bernoulli distribution with $P(T_i = 1) = 0.5$ in the random arm ($W_i = R$). The preference indicators $C_i$ were set to missing in the random arm ($W_i = R$).
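The data-generating steps just described can be sketched as follows. The function name and record layout are hypothetical, the seed is arbitrary, and the parameter values are our reading of the settings stated above (in particular the intercept of the preference model); this is a sketch of the design, not the authors' simulation code.

```python
import math
import random

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def simulate_hybrid_trial(n, alpha, beta, seed=0):
    """Generate one data set under the hybrid design.

    alpha = (a0, a1) for the preference model and
    beta = (b0, b1, bt, bc, btc) for the outcome model.
    Returns a list of dicts; 'c' is None in the random arm,
    where the preference indicator is set to missing.
    """
    rng = random.Random(seed)
    a0, a1 = alpha
    b0, b1, bt, bc, btc = beta
    rows = []
    for _ in range(n):
        # rng.random() draws from [0, 1), used here for Uniform[0, 1].
        x1, x2 = rng.random(), rng.random()
        c = int(rng.random() < expit(a0 + a1 * x2))   # preference
        arm = "R" if rng.random() < 0.5 else "P"      # random or choice arm
        # Random arm: coin flip; choice arm: treatment equals preference.
        t = int(rng.random() < 0.5) if arm == "R" else c
        eta = b0 + b1 * x1 + bt * t + bc * c + btc * t * c
        y = int(rng.random() < expit(eta))
        rows.append({"arm": arm, "t": t, "y": y, "x1": x1, "x2": x2,
                     "c": c if arm == "P" else None})
    return rows

data = simulate_hybrid_trial(1000, alpha=(-1.0, 2.0),
                             beta=(-2.0, 2.0, 2.0, 2.0, -2.0))
```

Repeating the call with different seeds yields the replicate data sets to which the estimation procedure would be applied.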
A total of 125 simulated data sets were generated and analyzed. Table 4 presents the simulation results. The point estimates are very close to the true values and there is no evidence of bias. We compared three methods for estimating the standard errors: the bootstrap method, the observed information, and the approximation given by Ibrahim (1990). These estimated standard errors can be compared with the empirical standard errors. Our results show that the observed-information-based standard errors are closest to the empirical standard errors. The bootstrap standard errors perform similarly but slightly overestimate the standard errors for the coefficients of treatment preference and of the interaction between treatment preference and treatment assignment. The approximation given by Ibrahim (1990) seems to slightly underestimate the standard errors. This might be due to the fact that the estimators of the regression parameters $\beta$ and $\alpha$ in models (1) and (2) are assumed independent in the approximation.
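Of the three standard error methods compared above, the bootstrap is the simplest to sketch. The version below assumes a nonparametric bootstrap that resamples subjects with replacement; the text does not specify the resampling scheme, so this choice, the function names, and the toy inputs are assumptions.

```python
import math
import random

def bootstrap_se(data, estimator, n_boot=200, seed=0):
    """Nonparametric bootstrap standard error of estimator(data).

    Subjects are resampled with replacement; estimator is any function
    mapping a list of records to a single number.
    """
    rng = random.Random(seed)
    n = len(data)
    reps = []
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(estimator(sample))
    mean = sum(reps) / n_boot
    return math.sqrt(sum((r - mean) ** 2 for r in reps) / (n_boot - 1))

def sample_mean(xs):
    return sum(xs) / len(xs)

# Hypothetical inputs: a constant sample and a balanced binary sample.
se_const = bootstrap_se([1.0] * 50, sample_mean)
se_binary = bootstrap_se([0, 1] * 50, sample_mean)
```

In the hybrid design, the estimator passed in would refit the model on each resampled data set and return the coefficient of interest; resampling within arms is a possible variant that keeps the arm sizes fixed.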
7 Alternative Hybrid Trial Designs
We now consider some other hybrid designs using the framework developed in Section 3. A simple alternative to the DRPT is the partially randomized preference trial (PRPT) mentioned in Section 1. In this design, individuals willing to be randomized are assigned to the random arm and receive $A_R$ or $B_R$, and individuals not willing to be randomized are allowed to choose the treatment, $A_C$ or $B_C$ (Figure 4(a)). Thus the PR individuals are randomized to $A_R$ and $B_R$, the PRA individuals receive $A_C$ and the PRB individuals receive $B_C$. Figure 4(b) shows which of the cell means in Figure 2(b) can be directly estimated under this design (E); the cells denoted by E are not a priori counterfactual