Journal of Biopharmaceutical Statistics, 2023, Vol. 33, No. 2, 234–252
https://doi.org/10.1080/10543406.2022.2118763
Published online: 19 Sep 2022
Retrieved-Dropout-Based multiple imputation for time-to-event
data in cardiovascular outcome trials
Jiwei He , Roberto Crackel, William Koh, Ling-Wan Chen, Feng Li , Jialu Zhang,
and Mark Rothmann
Office of Biostatistics, Office of Translational Sciences, Center for Drug Evaluation and Research, U.S. Food and Drug
Administration, Silver Spring, Maryland, USA
ABSTRACT
Recently, retrieved-dropout-based multiple imputation has been used in some therapeutic areas to address the treatment policy estimand, mostly for continuous endpoints. In this approach, data from subjects who discontinued study treatment but remained in study were used to construct a model for multiple imputation for the missing data of subjects in the same treatment arm who discontinued study. We extend this approach to time-to-event endpoints and provide a practical guide for its implementation. We use a cardiovascular outcome trial dataset to illustrate the method and compare the results with those from Cox proportional hazard and reference-based multiple imputation methods.
ARTICLE HISTORY
Received 16 January 2022
Accepted 25 August 2022
KEYWORDS
Missing data; multiple imputation; treatment policy estimand; retrieved dropout; cardiovascular outcome trial
1. Introduction
The Cox proportional hazard (PH) model is a conventional way of analyzing time-to-event endpoints in clinical trials. Under this method, censoring is assumed to be non-informative, i.e., each subject's censoring time is independent of their event time. In other words, among those at risk of an event at time t, the event rate of those censored at time t is similar to that of those who remain under follow-up at time t. Although this assumption may be valid for administrative censoring at the end of the study (EOS), it is unlikely to hold for subjects who withdraw from the study early due to lack of efficacy or an adverse event. Informative censoring can lead to biased results, similar to the missing not at random (MNAR) problem for continuous and categorical endpoints.
The National Research Council (NRC) and Food and Drug Administration (FDA) have recom-
mended the use of an estimand framework (Food and Drug Administration 2021; National Research
Council 2010). An estimand defines the target of estimation for a particular trial. ICH E9 (R1)
mentions four common ways of handling intercurrent events: treatment policy, hypothetical, compo-
site, and while on treatment. In cardiovascular outcome trials (CVOTs), if subjects' times to event are censored at treatment discontinuation or initiation of rescue therapy and a Cox PH model is applied, we would consider the target estimand a hypothetical one. The targeted estimand might involve a hypothetical scenario in which 'the subjects remain on their assigned treatment throughout the study', counter to the fact that not all of the subjects can. Sometimes this estimand is viewed as lacking clinical relevance. In many therapeutic areas, FDA recommends the use of the treatment policy estimand, which targets the intent-to-treat treatment effect. For example, the treatment policy estimand is believed to provide the best summary of the drug effect for evaluating long-term, real-world effectiveness of weight management drugs (McEvoy 2016). The FDA's analyses for the weight management
drug liraglutide included data from subjects who discontinued study treatment but remained in study
(retrieved dropouts) and used them to construct a model for multiple imputation of missing data of
subjects in the same treatment arm who discontinued study. The objective is to adequately represent
the missing outcome based on what would have been the expected outcome had the outcome been
measured (Rothmann et al. 2011). For time-to-event endpoints, the Cox PH model is often applied by
convention without questioning the underlying estimands and assumptions.
The goal of this paper is to describe a treatment policy strategy to impute time-to-event for non-
administratively censored subjects in CVOTs and to provide a practical guide for its implementation.
In recent years, multiple imputation methods have been increasingly used in the analysis of time-to-
event endpoints. They were usually used as sensitivity analyses to assess the impact of the non-
informative censoring assumption in the primary analysis, for example, imputing missing data based on censoring at random followed by a multiplicative adjustment to the hazard rate (tipping point analysis) (Lipkovich et al. 2016; Lu et al. 2015), or imputing missing data based on the hazard rate in
a reference control group (‘Jump to Reference’) (Atkinson et al. 2019). Our proposed analysis intends
to impute missing time-to-event data in a fashion consistent with what the outcome would have been,
had it been observed. It will include off-treatment data from subjects who discontinued study
treatment but remained in study and use the off-treatment hazard rate to impute for the missing
data of subjects in the same treatment arm who discontinued study. It can be considered an extension
of the retrieved dropout approach in McEvoy (2016) to time-to-event endpoints.
This paper is organized as follows: Section 2 describes the missing data problem in CVOTs. Section 3
outlines a strategy for imputing missing time-to-event data with the goal of targeting the treatment policy
estimand, i.e., the effect on cardiovascular risk regardless of whether subjects remain on the assigned
treatment. Section 4 provides details about implementation of this strategy with an emphasis on three
different multiple imputation methods. Section 5 illustrates the imputation methods using a dataset from
a clinical trial. Section 6 discusses the findings and provides recommendations.
2. Missing data problem in CVOTs
The primary composite endpoint in CVOTs often involves clinical components such as cardiovascular
(CV) death, CV hospitalization, myocardial infarction (MI), or stroke.
Figure 1. Typical scenarios in CVOTs. P: pattern; X: event of interest; C: censored; D: death.
Figure 1 shows the typical scenarios for follow-up in CVOTs. P1–P8 indicate 8 patterns. The blue and red segments represent on-treatment and off-treatment periods, respectively.
P1 and P2 are subjects who were followed up until EOS without experiencing any event. P1 was on
study treatment for the duration of the study, whereas P2 discontinued from study treatment early but
nevertheless completed the study. The censoring of their times is considered non-informative since it
is due to the administrative cutoff date, unrelated to study treatment or underlying medical conditions.
P3 and P4 are subjects who experienced an event during the trial. P1-P4 are considered completers and
do not require imputation. P5 and P6 are subjects who discontinued from study before EOS and
without any event. The reason for their withdrawal may be related to their study treatment or
underlying medication conditions. These non-administratively censored subjects are considered non-
completers and require imputation. In CVOTs, subjects who discontinue from study usually discon-
tinue from treatment prior to or around the same time of study discontinuation. P7 and P8 are subjects
who died during the trial. P7 and P8 should only include deaths that are not a component of the
composite endpoint of interest. If the composite endpoint includes CV death, P7 and P8 should only
include non-CV death. For this paper, we assume non-informative censoring for non-CV death and
apply the Cox PH model without imputation to these cases. See the next section for discussion about
the targeted estimand.
3. Strategy to target treatment policy estimand
A key objective when addressing missing data is determining the most appropriate estimate of the
treatment effect or treatment difference along with the appropriate corresponding uncertainty in that
estimate. When possible, this would involve determining the best guess of what that missing value
would have been had it been measured and the corresponding uncertainty in that guess (Rothmann et
al. 2011). In that regard, we are interested in the contrast between two treatment groups based on the
assigned treatment regardless of the actual adherence status, namely the treatment policy estimand.
One approach that aims to estimate the treatment policy estimand is to include data from subjects
who discontinued study treatment but remained in study (retrieved dropouts) and use them to
construct a model for multiple imputation for the missing data of subjects in the same treatment
arm who discontinued study (subjects in P5 and P6 categories) (McEvoy 2016). Note that ‘retrieved
dropout’, a term used in the McEvoy (2016) paper, refers to a subject who did NOT drop out from
study and did have their outcome assessed despite discontinuing the assigned treatment. Investigators
are recommended to continue to measure endpoints on all subjects, even those who have discontinued
the assigned treatment to minimize missing data as well as to provide more retrieved dropout data for
imputation. It is important to distinguish in the protocol between the reasons for treatment disconti-
nuation and reasons for study withdrawal. We want to extend this retrieved-dropout approach to
time-to-event endpoints. In time-to-event analyses, we would like to use all observed person time
including off-treatment time from subjects who discontinued the assigned treatment. The event rate
during the off-treatment time can be used to impute missing time-to-event of subjects in the same
treatment arm who discontinued from study early before the EOS date without having an event or
death. In most trials including CVOTs, subjects who discontinued from study had no access to study
treatment. It may be reasonable to assume that their event rate after discontinuation from study
resembles that from subjects in the same treatment arm who discontinued from treatment but
continued to be followed in the study (retrieved dropout). In doing so, we intend to impute missing
time-to-event data in a fashion consistent with what the outcome would have been, had it been
observed in the real setting, to the best of our knowledge. The next section will describe the details for
implementation.
We would like to use a strategy that targets the treatment policy estimand whenever possible.
However, for subjects who died of non-CV death before any event occurred, their time-to-event does
not exist. The aforementioned imputation approach will not be applicable in these cases. For this
paper, we assume there is interest in an estimand involving a hypothetical scenario where non-CV
death would not occur and use a model with an underlying non-informative censoring assumption for
this intercurrent event. Additional discussion on this estimand strategy and corresponding assump-
tions is warranted.
Atkinson et al. (2019) have proposed a ‘Jump to Reference’ imputation method, which is also
considered a treatment policy strategy by the authors (Atkinson et al. 2019). Their approach also uses
all observed data regardless of the actual adherence status, consistent with the treatment policy
estimand. Under jump to reference, imputation for missing data in both treatment groups is based
on all observed data in the control group. Specifically, an active arm subject censored at time c switches
to the control-arm hazard for t > c. This method relies on the assumption that subjects who discontinued
the assigned treatment and had missing follow-up, in either group, are like the other subjects in
the control group. This seems implausible: subjects in the control group who discontinued
treatment are likely inherently different from those in the control group who did not discontinue
treatment, and subjects in the active group who discontinued treatment are likely inherently different
from those in the control group. For instance, subjects in the active group who discontinue treatment
due to adverse events from the active treatment may also have worse prognosis compared to the
control group. Considering the limitation of the reference-based imputation method, the retrieved-
dropout-based imputation method we propose in this paper can be a useful alternative analysis to
address a treatment policy type of estimand.
4. Multiple imputation methods
4.1. Overview
Multiple imputation often mimics the data generation process based on an assumed parametric model.
Since the true parameters of the model are unknown, a reasonable estimator of the true parameters
must be used. Based on how the uncertainty associated with the estimator is accounted for, imputation
methods can be classified as improper (fixed parameter) method, proper (Bayesian) method, and
proper-like method. More details about these methods are provided below.
One imputation method is to use the estimated parameter values from the observed data as true
values and directly substitute them into the parametric model of the sampling distribution. Rubin
(1987) refers to this type of imputation as ‘improper’ imputation. Rubin’s variance estimator used
together with improper imputation underestimates the true asymptotic variance of the final estimator
from improper imputation (Tsiatis 2006).
With proper imputation, also known as Bayesian imputation, the parameters are regarded as
random with a prior distribution (usually non-informative) and the parameter values are generated
using random draws from the posterior distribution (Tsiatis 2006). Missing values for individuals are
then produced based on the conditional distribution given the sampled random values of the
parameters.
Tsiatis (2006) mentions another approach that resembles proper imputation, which we will refer to
as ‘proper-like’ imputation in this paper. With this approach, the parameter values are randomly
drawn from the asymptotic normal distribution of a consistent estimator. With large sample sizes, the
asymptotic distribution is expected to closely approximate the posterior distribution of the parameter.
Missing values for individuals are then produced based on the conditional distribution given the
sampled random values of the parameters.
We have seen all three methods being used in CVOTs in recent years, although it is inappropriate to
use improper imputation together with Rubin's variance estimator. These imputation methods have
often been implemented with mistakes, since their implementation for time-to-event
endpoints is not as straightforward as that for continuous endpoints. It is therefore helpful to offer
a practical guide to the application of these methods in time-to-event analyses.
Piecewise exponential models are popular in multiple imputation of time-to-event data. Such
models allow the hazard rate to vary over time with considerable flexibility. If the hazard rate appears
to be constant over time, the regular exponential model can be used as a special case by setting the
number of pieces to one. Therefore, we will use piecewise exponential models in our imputation
methods. Weibull models can also be considered. They have been used by Atkinson et al. (2019) in
their Jump to Reference imputation.
4.2. Notation
Since we impute missing data for each treatment group separately in the same way, the methods
discussed in this paper apply generally to each treatment group unless otherwise stated. The subscript for
the treatment group is omitted from all notation for simplicity.
For the full dataset of all subjects, let $Y^\dagger$ and $C^\dagger$ denote the time from randomization to event and the time from randomization to censoring at EOS or death (e.g., non-CV death in CVOTs), whichever occurs first, respectively, and let $T^\dagger = \min(Y^\dagger, C^\dagger)$. We consider death non-informative censoring in this case, but it may not be appropriate in some situations. Let $\delta^\dagger = I(Y^\dagger \le C^\dagger)$ denote the censoring indicator. $X^\dagger$ represents a vector of baseline covariates at the time of randomization. The full data consist of $\{t_i^\dagger, \delta_i^\dagger, x_i^\dagger\}_{i=1}^{N}$ for subjects $i = 1, \ldots, N$, but not all subjects had observed $\{t_i^\dagger, \delta_i^\dagger\}$.

For subjects who discontinued treatment but continued to be followed for an off-treatment period (P2, P4, P6, P7 in Figure 1), we denote their observed off-treatment time by $T = \min(Y, C)$, where $Y$ and $C$ are the time from treatment discontinuation to event and the time from treatment discontinuation to censoring or death (whichever occurs earlier), respectively. Let $\delta = I(Y \le C)$ denote the censoring indicator. $X$ represents a vector of covariates used for imputation. $X$ may include covariates at the time of treatment discontinuation in addition to baseline covariates. The observed off-treatment data for this subgroup consist of $\{t_i, \delta_i, x_i\}_{i=1}^{n}$ for subjects $i = 1, \ldots, n$.

We will impute the time from treatment discontinuation to event, $T^*$, for subjects who were censored prior to EOS and death without having an event (P5, P6 in Figure 1) and therefore had missing $\{T^\dagger, \delta^\dagger\}$. We consider these subjects non-completers and the rest completers. Non-completers are assumed to discontinue the assigned treatment after dropout. $T_i^*$ for non-completers $i = 1, \ldots, n_0$ will be imputed based on the observed off-treatment data from P2, P4, P6, P7 in the same treatment group. Their observed off-treatment time is also denoted by $T_i$ ($T_i = 0$ for P5). Note that P6 is in both the group for estimating off-treatment hazard rates and the group for imputation, since P6 has an off-treatment period but is also a non-completer with early censoring before EOS. Imputation for $T_i^*$ in non-completers is conditional on $T_i^* > t_i$. Based on the imputed $t_i^*$ and the date of EOS or death, whichever occurs first, we can derive $(t_i^\dagger, \delta_i^\dagger)$ for non-completers. Sometimes death information can be collected from a subject's vital record even after he or she discontinues from the study.
4.3. Proper (Bayesian) imputation
In this section, we show how time-to-event can be imputed from a piecewise exponential model using
a Bayesian multiple imputation approach. For the imputation, we only consider the off-treatment time
after treatment discontinuation.
Let the time from treatment discontinuation be partitioned into $K$ intervals, $[\tau_{k-1}, \tau_k)$ for $k = 1, \ldots, K$, with $0 = \tau_0 < \tau_1 < \cdots < \tau_K = \infty$. The baseline hazard $h_0(t)$ is assumed to be constant within each interval:

$$h_0(t) = \lambda_k, \quad \text{if } \tau_{k-1} \le t < \tau_k, \quad k = 1, \ldots, K.$$

The hazard function for subject $i$ is $h(t \mid x_i, \beta, \lambda) = h_0(t)\exp(\beta^T x_i)$, where $\beta$ is a vector of regression coefficients for baseline covariates and $\lambda = (\lambda_1, \ldots, \lambda_K)$ is a vector of parameters for the baseline hazard. Given that

$$\Delta_k(t) = \begin{cases} 0, & t \le \tau_{k-1} \\ t - \tau_{k-1}, & \tau_{k-1} < t \le \tau_k \\ \tau_k - \tau_{k-1}, & t > \tau_k, \end{cases}$$

the baseline cumulative hazard function can be written as

$$H_0(t) = \sum_{k=1}^{K} \lambda_k \Delta_k(t).$$

The survival function for subject $i$ is

$$S(t \mid x_i, \beta, \lambda) = \exp\{-H_0(t)\exp(\beta^T x_i)\}. \tag{1}$$
Thus, the likelihood for the observed off-treatment data is

$$L(D \mid \lambda, \beta) = \prod_{i=1}^{n} \prod_{k=1}^{K} \left[\lambda_k \exp(\beta^T x_i)\right]^{\delta_i I(\tau_{k-1} \le t_i < \tau_k)} \exp\{-H_0(t_i)\exp(\beta^T x_i)\}, \tag{2}$$

where $D$ refers to the observed data. For Bayesian imputation, we need to specify a prior distribution for $(\lambda, \beta)$. For example, we can assume each $\lambda_k$ has an independent gamma prior $G(a_k, b_k)$, such that $\pi(\lambda) = \prod_{k=1}^{K} G(a_k, b_k)$, and $\beta$ has a multivariate normal prior $N(\beta_0, \Sigma_0)$. The parameter values for the priors should be set as reasonably noninformative. Alternatively, the likelihood (2) can be parameterized in terms of $\alpha = \log(\lambda)$ and $\beta$.

The posterior distribution for $(\lambda, \beta)$ given the observed data is

$$P(\lambda, \beta \mid D) = \frac{L(D \mid \lambda, \beta)\,\pi(\lambda)\,\pi(\beta)}{\int L(D \mid \lambda, \beta)\,\pi(\lambda)\,\pi(\beta)\, d\nu(\lambda, \beta)} \propto L(D \mid \lambda, \beta)\,\pi(\lambda)\,\pi(\beta). \tag{3}$$

Bayesian imputation is to sample the time from treatment discontinuation to event, $T_i^*$, for non-administratively censored subjects $i = 1, \ldots, n_0$ (P5 and P6 in Figure 1) from the posterior predictive distribution

$$P(T^* \mid D) = \int P(T^* \mid D, \lambda, \beta)\, P(\lambda, \beta \mid D)\, d\nu(\lambda, \beta).$$

It can be implemented using the following Steps 1 and 2.
Step 1: Sample the parameters from the posterior distribution.
Let $M$ be the number of imputations. For $m = 1, \ldots, M$, $(\hat\lambda^{(m)}, \hat\beta^{(m)})$ can be drawn from the joint posterior distribution in (3) via a Markov chain Monte Carlo (MCMC) algorithm. It is straightforward to implement the sampling of $(\hat\lambda^{(m)}, \hat\beta^{(m)})$ using the BAYES statement in SAS PROC PHREG. Refer to Appendix C for details.
Step 2: Conditional on sampled parameters, impute time-to-event.
Time-to-event can be imputed based on the method in Lipkovich et al. (2016). For each imputation $m = 1, \ldots, M$, conditional on the sampled $(\hat\lambda^{(m)}, \hat\beta^{(m)})$, we want to sample $T_i^*$ from the conditional survival function $S(t \mid t > t_i, x_i, \hat\beta, \hat\lambda)$, where $t_i$ is the observed off-treatment duration. This ensures $t_i^*$ is greater than $t_i$. The survival function is stated in (1). The superscript $(m)$ is now omitted for all the parameters for convenience. Since

$$S(t_i^* \mid t_i^* > t_i, x_i, \hat\beta, \hat\lambda) = \frac{S(t_i^* \mid x_i, \hat\beta, \hat\lambda)}{S(t_i \mid x_i, \hat\beta, \hat\lambda)} \tag{4}$$

follows a uniform distribution $U(0, 1)$, it is equivalent to solving for $t_i^*$ from $S(t_i^* \mid x_i, \hat\beta, \hat\lambda) = u_i$, where $u_i$ is drawn from a uniform distribution $U(0, S(t_i \mid x_i, \hat\beta, \hat\lambda))$. Specifically, with the notation $\lambda_{ik} = \hat\lambda_k \exp(\hat\beta^T x_i)$ and $S_{ik} = S(\tau_k \mid x_i, \hat\beta, \hat\lambda)$, $T_i^*$ for each non-completer $i = 1, \ldots, n_0$ can be imputed as follows:

$$t_i^* = \begin{cases} \dfrac{-\log(u_i)}{\lambda_{i1}}, & S_{i1} < u_i \\[2ex] \tau_{k-1} + \dfrac{-\log(u_i) - \sum_{q=1}^{k-1} \lambda_{iq}(\tau_q - \tau_{q-1})}{\lambda_{ik}}, & S_{i,k} < u_i \le S_{i,k-1}, \quad k = 2, \ldots, K-1 \\[2ex] \tau_{K-1} + \dfrac{-\log(u_i) - \sum_{q=1}^{K-1} \lambda_{iq}(\tau_q - \tau_{q-1})}{\lambda_{iK}}, & u_i \le S_{i,K-1}. \end{cases} \tag{5}$$

The time from randomization to event $y_i^\dagger$ is simply the time from randomization to the last exposure date plus $t_i^*$, or the time from randomization to the last follow-up date plus $t_i^* - t_i$. By censoring at EOS or death, whichever occurs first, we obtain $t_i^\dagger = \min(y_i^\dagger, c_i^\dagger)$ and $\delta_i^\dagger = I(y_i^\dagger \le c_i^\dagger)$.
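As a small worked illustration of (5), with three pieces, $\tau_1 = 40$ and $\tau_2 = 500$ days (the cut points later used in the data example), and invented rates $\lambda_{i1} = 0.002$, $\lambda_{i2} = 0.0005$, and $\lambda_{i3} = 0.0008$ per person-day: for a non-completer with observed off-treatment time $t_i = 100$ days,

$$S(t_i \mid x_i, \hat\beta, \hat\lambda) = \exp\{-(0.002 \times 40 + 0.0005 \times 60)\} = e^{-0.11} \approx 0.896,$$

so $u_i$ is drawn from $U(0, 0.896)$. If $u_i = 0.30$, then $S_{i1} = e^{-0.08} \approx 0.923$ and $S_{i2} = e^{-(0.08 + 0.0005 \times 460)} = e^{-0.31} \approx 0.733$, so $u_i \le S_{i2}$ and the last case of (5) applies:

$$t_i^* = 500 + \frac{-\log(0.30) - (0.08 + 0.23)}{0.0008} \approx 500 + \frac{0.894}{0.0008} \approx 1{,}618 \text{ days},$$

which is then converted to $y_i^\dagger$ and censored at EOS or death as described above.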
Steps 3 and 4 below show how the final estimator for the parameter of interest and the associated variance estimator can be obtained from the imputed full datasets.

Step 3: Analyze each of the M imputed datasets.
For each imputation $m = 1, \ldots, M$, by combining the $m$th set of imputed data $\{t_i^\dagger, \delta_i^\dagger\}_{i=1}^{n_0}$ for non-completers with the observed data from completers, we obtain a full dataset $\{t_i^\dagger, \delta_i^\dagger, x_i^\dagger\}_{i=1}^{N}$ for all subjects $i = 1, \ldots, N$. Steps 1 and 2 can be applied within each treatment group to obtain an imputed full dataset for all subjects.

Suppose the parameter of interest is $\theta$, the hazard ratio between the two treatment groups. Each of the $M$ imputed full datasets can be analyzed using a Cox PH model to yield an estimator $\hat\theta^{(m)}$ as well as the associated variance estimator $\widehat{\mathrm{Var}}(\hat\theta^{(m)})$, $m = 1, \ldots, M$.

Step 4: Combine the results of the M analyses using Rubin's rule.
The final estimator $\hat\theta$ for $\theta$ is given by the average of $\hat\theta^{(m)}$, $m = 1, \ldots, M$, obtained from each of the $M$ imputed full datasets. Rubin's rule uses the law of total variance to derive the variance estimator as a sum of the between- and within-imputation variances. This procedure can be implemented using PROC MIANALYZE in SAS.
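For concreteness, a standard statement of Rubin's combining rules is sketched below; in the appendix code (Appendix C4), the quantity combined across imputations is the Cox model coefficient, i.e., the log hazard ratio, which is exponentiated at the end:

$$\hat\theta = \frac{1}{M}\sum_{m=1}^{M}\hat\theta^{(m)}, \qquad \bar{W} = \frac{1}{M}\sum_{m=1}^{M}\widehat{\mathrm{Var}}\!\left(\hat\theta^{(m)}\right), \qquad B = \frac{1}{M-1}\sum_{m=1}^{M}\left(\hat\theta^{(m)} - \hat\theta\right)^2,$$

$$\widehat{\mathrm{Var}}\!\left(\hat\theta\right) = \bar{W} + \left(1 + \frac{1}{M}\right)B.$$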
4.4. Proper-Like imputation
The proper-like imputation approach mentioned in Section 4.1 can be an alternative method to fully
Bayesian proper imputation when the sample size is not too small. The two approaches only differ in
the way the λ and β parameters are sampled (Step 1 in Section 4.3). The rest of the procedures (Steps
2–4) can be identical. This section describes the proper-like imputation method in general and its
application to a piecewise exponential model.
Assume that we want to sample a certain parameter $\lambda$. Suppose that $\hat\lambda^{(m)}$, $m = 1, \ldots, M$, are randomly drawn from a normal distribution

$$N\!\left(\hat\lambda^{(\mathrm{obs})}, \hat\Sigma_\lambda^{(\mathrm{obs})}\right),$$

where $\hat\lambda^{(\mathrm{obs})}$ is a consistent and asymptotically normal estimator for $\lambda$ from the observed data (e.g., the maximum likelihood estimator, MLE) and $\hat\Sigma_\lambda^{(\mathrm{obs})}$ is a consistent estimator of the true asymptotic covariance matrix $\Sigma_\lambda^{(\mathrm{obs})}$. By doing so, we are roughly drawing from the posterior distribution in an asymptotic sense; for large sample sizes and certain choices of prior distribution for $\lambda$, the asymptotic distribution can closely approximate the posterior distribution (Tsiatis 2006). In contrast, with improper imputation, a fixed $\hat\lambda^{(\mathrm{obs})}$ is used to impute missing values.

For exponential models, we usually do not draw $\lambda$ directly from a normal distribution, which may result in negative values. Instead, each $\lambda_k$ is parameterized as $\alpha_k = \log(\lambda_k)$.
From (2), the log-likelihood function for a piecewise exponential model is

$$l(\lambda, \beta) = \sum_{k=1}^{K} d_k \log(\lambda_k) + \sum_{i=1}^{n} \delta_i \beta^T x_i - \sum_{k=1}^{K} \lambda_k \left[\sum_{i=1}^{n} \Delta_k(t_i)\exp(\beta^T x_i)\right],$$

where $d_k = \sum_{i=1}^{n} \delta_i I(\tau_{k-1} \le t_i < \tau_k)$ is the total number of observed events in the $k$th interval. Equivalently, in terms of $\alpha$,

$$l(\alpha, \beta) = \sum_{k=1}^{K} d_k \alpha_k + \sum_{i=1}^{n} \delta_i \beta^T x_i - \sum_{k=1}^{K} \exp(\alpha_k)\left[\sum_{i=1}^{n} \Delta_k(t_i)\exp(\beta^T x_i)\right]. \tag{6}$$
The details of solving for the MLE and covariance matrix, based on Lawless (2011), are provided in Appendix A. When there is no covariate in the model, (6) simplifies to

$$l(\alpha) = \sum_{k=1}^{K} d_k \alpha_k - \sum_{k=1}^{K} \exp(\alpha_k)\left[\sum_{i=1}^{n} \Delta_k(t_i)\right].$$

In this case, it is straightforward to obtain $\hat\alpha_k = \log\!\left(\frac{d_k}{\sum_{i=1}^{n}\Delta_k(t_i)}\right)$, the variance estimator $\hat\sigma_k^2 = 1/d_k$, $k = 1, \ldots, K$, and the covariance $\hat\sigma_{k,k'} = 0$ for $k \ne k'$. Each set of $\hat\alpha_k^{(m)}$, $m = 1, \ldots, M$, can be drawn from a separate normal distribution $N(\hat\alpha_k, \hat\sigma_k^2)$. Once $\hat\alpha_k^{(m)}$ is sampled, it can be converted to $\hat\lambda_k^{(m)} = \exp(\hat\alpha_k^{(m)})$. In cases with covariates, $\hat\alpha^{(m)}$ and $\hat\beta^{(m)}$ need to be drawn jointly from a multivariate normal distribution based on the MLE and covariance matrix stated in Appendix A. The sampled parameters can be used to impute $T_i^*$ for subjects $i = 1, \ldots, n_0$ in the same way as in the proper imputation method.
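As a hypothetical numerical illustration (the counts below are invented for exposition and are not taken from the data example): if piece $k$ contains $d_k = 50$ observed events over $\sum_{i=1}^{n}\Delta_k(t_i) = 10{,}000$ off-treatment person-days, then

$$\hat\alpha_k = \log\!\left(\frac{50}{10{,}000}\right) \approx -5.30, \qquad \hat\sigma_k^2 = \frac{1}{50} = 0.02,$$

so each draw $\hat\alpha_k^{(m)}$ comes from $N(-5.30, 0.02)$ and is converted to $\hat\lambda_k^{(m)} = \exp(\hat\alpha_k^{(m)})$, centered near 0.005 events per person-day.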
5. Data example
We use a dataset from a clinical trial to illustrate the retrieved-dropout-based multiple imputation
methods.
5.1. Subject dispositions
The study was an event-driven, multi-center, randomized, double-blind, placebo-controlled parallel-
group study to evaluate the effect of active treatment on CV health and mortality in hypertriglyceri-
demic patients with CV disease or at high risk of CV disease. The primary endpoint was the time to the
first occurrence of 5-point major adverse cardiac event (MACE), which is a composite of CV death,
nonfatal MI, nonfatal stroke, coronary revascularization, and hospitalization for unstable angina.
A total of 4089 and 4090 subjects were randomized to the active treatment and placebo groups,
respectively. Table 1 lists the number of subjects for each follow-up pattern depicted in Figure 1. The
percentage of subjects with early withdrawal was 9.7% in the active treatment group and 10.3% in the
placebo group. Their time-to-event needs to be imputed. P0 in Table 1 refers to subjects with an unknown
last exposure date, who will not be included in the imputation. Subjects who experienced non-CV death
without 5-point MACE were considered as having non-informative censoring. They will not be
included for imputation but off-treatment data from P8 subjects will contribute to the estimation of
the off-treatment hazard rate. We noticed that more than 500 subjects had their last follow-up date
1 day after their last exposure date. Most of these cases were due to study completion. Therefore, we
define the 'on-treatment' period as on or before the last exposure date + 1 day to exclude such cases from
imputation.
5.2. Results
Figure 2 shows the cumulative hazard for each treatment group for all observed person time regardless
of adherence to the assigned treatment. The hazard rate appeared to be roughly constant over time in
both treatment groups. The hazard ratio (HR) from the Cox PH model was 0.752 (95% CI: 0.682,
0.830), suggesting the active treatment is beneficial to the study population.
Figure 3 shows the cumulative hazard for each treatment group during off-treatment time among
subjects who discontinued from their assigned treatment but continued to be followed in the study for
at least some time (retrieved dropouts). A piecewise exponential model with three pieces, $[0, 40)$, $[40, 500)$, and $[500, \infty)$ days, appeared to fit the off-treatment data well. The choice of pieces accommodates the large change in event rate in the initial period after treatment discontinuation. It also ensures there are enough events within each piece for estimating the hazard rate.
We applied the proper, proper-like, and improper imputation methods described in Section 4 to impute missing time-to-event data in non-administratively censored subjects. First, we used a piecewise exponential model with no covariate to fit the off-treatment data from retrieved dropouts. The main results are listed in Table 2. For proper imputation, a gamma prior $G(10^{-4}, 10^{-2})$ was used for each $\lambda_k$, setting the prior mean to 0.01 (per person-day) and the variance to 1 (with shape $a = 10^{-4}$ and inverse scale $b = 10^{-2}$, the prior mean is $a/b = 0.01$ and the prior variance is $a/b^2 = 1$). This prior is proper and reasonably noninformative. Results from the Cox PH model based on all observed data, or on on-treatment data only, without imputation were also included for comparison. Artificially censoring at treatment discontinuation might be considered to target a different estimand, i.e., one involving a scenario in which subjects are able to remain on treatment throughout the study. These methods without imputation rely on the assumption of non-informative censoring for the early dropouts.
Table 1. Number of subjects in each follow-up pattern – data example.

Pattern  Event Status                             Treatment Status(a)   Active Treatment (N=4089)   Placebo (N=4090)
P1       Administrative censoring at EOS          ON                    2343 (57.30%)               2028 (49.58%)
P2       Administrative censoring at EOS          OFF                   588 (14.38%)                684 (16.72%)
P3       Event                                    ON                    579 (14.16%)                759 (18.56%)
P4       Event                                    OFF                   125 (3.06%)                 142 (3.47%)
P5       Early withdrawal                         ON                    186 (4.55%)                 190 (4.65%)
P6       Early withdrawal                         OFF                   209 (5.11%)                 231 (5.65%)
P7       Non-CV death                             ON                    15 (0.37%)                  24 (0.59%)
P8       Non-CV death                             OFF                   42 (1.03%)                  30 (0.73%)
P0       Unknown treatment discontinuation date   –                     2 (0.05%)                   2 (0.05%)

(a) Treatment status at the time when the subject experienced an event or was censored due to completion of study, early withdrawal, or non-CV death.
EOS: end of study.
Figure 2. Cumulative hazard including all observed person time – data example.
Note: Average event rate is calculated in terms of per person-day within each interval.

Figure 3. Cumulative hazard during off-treatment: actual data vs. estimated model – data example.
Note: Average event rate is calculated in terms of per person-day within each interval. The dotted lines represent the cumulative hazard estimated from a piecewise constant exponential model.
On average, the three imputation methods imputed a similar number of events in both treatment groups. The point estimates for the HR from the three imputation methods were very similar. They were all closer to the null compared with that from the Cox PH model. This can be explained by the fact that the HR from off-treatment time was 0.966 (95% CI: 0.758, 1.231), which is very close to the null, as also shown by the largely overlapping cumulative hazard curves in Figure 3. The 95% confidence intervals (CI) from proper imputation and proper-like imputation were very similar. Both gave a slightly larger standard error (SE) than the Cox PH model without imputation. The SE from improper imputation was smaller than those from the other two imputation methods. This is consistent with Tsiatis's (2006) conclusion that improper imputation used together with Rubin's variance estimator will underestimate the true asymptotic variance.
Next, we consider including covariates in the piecewise exponential model that may predict time-
to-event. We included baseline CV risk category (primary versus secondary prevention) as a covariate
in the piecewise exponential model and repeated all three imputation methods. The results are also
listed in Table 2. They were quite similar to those from retrieved-dropout-based imputation without a
covariate. Interestingly, the SEs did not increase even with an additional parameter in the imputation
model. Another potentially important covariate to consider is the on-treatment duration prior to
treatment discontinuation.
We then applied the ‘Jump to Reference’ imputation method from Atkinson et al. (2019) for
comparison since their approach also intends to address a treatment policy type of estimand. For this
part, we only applied proper imputation since proper-like imputation is expected to give similar results.
Table 2. Retrieved-dropout-based imputation versus Cox proportional hazard model without imputation and jump-to-reference imputation – data example.

Method                                   Active Treatment     Placebo              Hazard Ratio           Log(HR)   SE of     P-value
                                         Event (%)(a)         Event (%)(a)         (95% CI)                         log(HR)
                                         (N=4089)             (N=4090)
Cox PH without Imputation(e)
  Cox PH                                 705 (17.24)          901 (22.03)          0.752 (0.682, 0.830)   −0.284    0.0503    <0.0001
  Cox PH (On-treatment)                  580 (14.18)          759 (18.56)          0.721 (0.647, 0.803)   −0.328    0.0552    <0.0001
Retrieved-Dropout-Based Imputation(b, e)
  Improper Imputation                    774 (18.92)          970 (23.72)          0.773 (0.700, 0.853)   −0.258    0.0502    <0.0001
  Proper Imputation                      774 (18.92)          970 (23.72)          0.773 (0.699, 0.854)   −0.258    0.0513    <0.0001
  Proper-like Imputation                 774 (18.93)          971 (23.73)          0.773 (0.699, 0.854)   −0.258    0.0512    <0.0001
Retrieved-Dropout-Based Imputation with Covariate(c, e)
  Improper Imputation                    772 (18.89)          963 (23.54)          0.777 (0.705, 0.858)   −0.252    0.0501    <0.0001
  Proper Imputation                      772 (18.89)          963 (23.54)          0.777 (0.703, 0.859)   −0.252    0.0512    <0.0001
  Proper-like Imputation                 773 (18.91)          963 (23.56)          0.778 (0.703, 0.860)   −0.252    0.0512    <0.0001
Jump-to-Reference Imputation(d, e)
  Proper Imputation                      769 (18.81)          970 (23.71)          0.769 (0.697, 0.848)   −0.263    0.0501    <0.0001
  Proper Imputation with covariate       763 (18.66)          963 (23.54)          0.768 (0.696, 0.847)   −0.264    0.0501    <0.0001

HR: hazard ratio; SE: standard error; PH: proportional hazard.
(a) For analyses with multiple imputation, the average number of events across imputations is presented.
(b) The piecewise exponential model contained no covariate.
(c) The piecewise exponential model contained CV risk category (primary vs. secondary prevention) as a covariate.
(d) The covariate used in proper imputation with covariate is CV risk category (primary vs. secondary prevention).
(e) All the analyses used a Cox PH model stratified by geographical region, CV risk category, and use of ezetimibe.
Under Jump to Reference, an active-arm subject censored at time c switches to the control-arm
hazard for t > c. For the placebo arm, it may be reasonable to fit a model to time from randomization if
the hazard is not expected to change much after discontinuation from placebo. A piecewise exponential
model with three pieces, $[0, 500)$, $[500, 1000)$, and $[1000, \infty)$ days, was fit to all the observed data from the
placebo group based on roughly even allocation of events. The origin here refers to the time of randomi-
zation. The results are listed in Table 2. On average, the jump-to-reference approach imputed a similar
number of events in the placebo arm and slightly fewer events in the active treatment arm compared with
the retrieved-dropout-based approach, resulting in slightly smaller point estimates for the HR. Note that
similar observations may not generalize to other datasets, for instance, when the placebo group
does not have constant hazard over time. The SEs from the jump to reference approach were
considerably smaller than those from the retrieved-dropout-based approach. They were even smaller
than the SEs from Cox PH model, despite the fact that Rubin’s rule is said to bias the variance upward
in reference-based multiple imputation (Bartlett 2021). This may be due to the between-group
correlation induced by imputation of all missing data based on the same reference group. The
simulation results from Atkinson et al. (2019) also suggested that jump to reference may yield smaller
SEs than censoring at random. These results suggest that the jump-to-reference
approach may not be a conservative way of imputing missing data as some may assume.
6. Discussion
This paper proposed a retrieved-dropout-based strategy for imputing missing time-to-event data for
non-administratively censored subjects in CVOTs and provided a detailed guide for its implementa-
tion. Although this paper used CVOTs as examples, the proposed method is applicable to other time-
to-event trials with similar missing data situations. Our imputation method is illustrated using
a piecewise exponential model for the hazard function, since it is believed to be flexible enough in
most cases. However, the concept of imputing missing time-to-event based on retrieved dropouts
could be applied to other models. Ideally, the pieces and all other details of the imputation (and
analysis) model should be pre-specified. Pre-specification is critical for primary and secondary
analyses of confirmatory clinical trials. However, this may be less critical if this approach is used as
a sensitivity analysis. Selecting pieces and models in a data-adaptive manner based on a certain
goodness-of-fit criterion could be considered as a future exploration.
In the data example, proper (Bayesian) imputation and proper-like imputation gave similar results.
Both are valid choices for imputing missing time-to-event data. For proper imputation, a relatively
non-informative prior needs to be selected. Given that and a relatively large sample size, the posterior
and asymptotic distributions are expected to be similar. Improper imputation used together with
Rubin’s rule appeared to underestimate the true variance, consistent with the theory from Tsiatis
(2006), and should be avoided. We previously saw improper imputation being used in some CVOT
applications. Another common mistake is sampling a different set of parameters for each subject with
missing data in the same imputation. This is against the recommended procedure of conditioning on
the same set of parameters in each imputation. This mistake also leads to underestimation of the true
variance under Rubin’s rule since it would make the imputed values more similar on average among
the imputations.
We have conducted simulation studies to confirm that the proposed imputation method works well
under the correct model specification (Appendix E). The method’s performance depends on the
assumption that the observed off-treatment data resemble the missing off-treatment data. Although
we cannot formally test this assumption, we analyzed the baseline characteristics of non-completers,
completers who discontinued treatment (OFF completer), and completers who were on-treatment
throughout the study (ON completer) to draw some evidence. For the data example, we examined sex,
age, region, BMI, diabetes status, CV risk category and ezetimibe use (Table A1, Appendix B). Non-
completers appeared to be more similar to OFF completers compared to ON completers in terms of
important baseline factors such as CV risk category and diabetes status. The impact of differences
between non-completers and retrieved dropouts can be further explored using a tipping point analysis, i.e., each $\lambda_k$ parameter for the active treatment group can be multiplied by a sensitivity parameter $\delta > 1$ to increase the hazard rate in this group until the conclusion changes (an illustrative sketch of this adjustment is given after this paragraph). In addition, if the number of retrieved dropouts is too small to build a reliable imputation model, it is not feasible to apply this method. Although unlikely to happen for CVOTs, this could be a concern for small studies. In cases where the vast majority of subjects who discontinue treatment continue to be followed until the EOS, the Cox PH model may be acceptable, since a very small amount of missing data is not expected to impact the results significantly regardless of the missing data mechanism.
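As a hypothetical illustration of the tipping-point adjustment (the dataset and variable names follow the layout of the Appendix C2 code and are assumptions, not the authors' analysis code):

/* Multiply the sampled off-treatment hazard parameters for the active arm by a
   sensitivity multiplier, then rerun the imputation (Appendix C3) and combination
   (Appendix C4) steps; increase delta on a grid until the conclusion changes. */
%let delta=1.2;
data lambda_tip;
  set lambda;                     /* sampled lambda1-lambda3 per imputation (Appendix C2) */
  if trt01p="ACTIVE" then do;     /* arm label is an assumption; match the analysis data  */
    lambda1=&delta.*lambda1;
    lambda2=&delta.*lambda2;
    lambda3=&delta.*lambda3;
  end;
run;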
In the data example, we assume there is interest in an estimand involving a hypothetical scenario where non-CV death would not occur and used the Cox PH model without imputation for this intercurrent event. Other analyses that target different estimands have been considered, for example, (1) using all-cause mortality in place of CV death in the composite endpoint, and (2) using Fine and Gray regression to estimate the sub-distribution HR, considering non-CV death as a competing risk (Austin and Fine 2017). For the sub-distribution hazard, subjects who fail from a competing cause (non-CV death) remain in the risk set until the EOS. Appendix D contains the SAS code for this method. For this data example, the sub-distribution method gave results very similar to those in Table 2 (results not shown). Additional discussion on the targeted estimands and corresponding assumptions of these strategies is warranted. The details are beyond the scope of this paper.
Our method assumes that almost all subjects who discontinued from study also discontinued from
treatment. Non-completers and retrieved dropouts shared an important attribute, namely they dis-
continued study drug. This makes sense since subjects usually do not have access to study treatment
after discontinuing from study. If many non-completers were still on treatment after censoring, which
rarely happens, it is inappropriate to impute time-to-event for these subjects based on the event rates
from off-treatment time. Our method cannot be applied directly in such cases. The reason for
discontinuing from study without discontinuing from treatment in these subjects needs to be
examined. It may be reasonable to consider the censoring of their time non-informative if the reason
appeared to be unrelated to study treatment or underlying medical conditions.
Acknowledgements
The authors thank Greg Levin for his helpful comments on this manuscript.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Disclaimer
The views expressed in this article should not be construed to represent those of the U.S. Food and Drug Administration.
Funding
The author(s) reported there is no funding associated with the work featured in this article.
ORCID
Jiwei He http://orcid.org/0000-0001-6874-7803
Feng Li http://orcid.org/0000-0003-1247-832X
Mark Rothmann http://orcid.org/0000-0002-1830-7057
References
Atkinson, A., M. G. Kenward, T. Clayton, and J. R. Carpenter. 2019. Reference-Based sensitivity analysis for time-to-
event data. Pharmaceutical Statistics 18 (6):645–658. doi:10.1002/pst.1954.
Austin, P. C., and J. P. Fine. 2017. Practical recommendations for reporting Fine-Gray model analyses for competing
risk data. Statistics in Medicine 36 (27):4391–4400. doi:10.1002/sim.7501.
Bartlett, J. W. 2021. Reference-based multiple imputation—What is the right variance and how to estimate it. Statistics in
Biopharmaceutical Research. Published online 12 Nov 2021. doi:10.1080/19466315.2021.1983455.
Food and Drug Administration. 2021. E9 (R1) Statistical principles for clinical trials: Addendum: Estimands and
sensitivity analysis in clinical trials. Guidance for Industry.
Lawless, J. F. 2011. Parametric regression models. In Statistical models and methods for lifetime data, Hoboken, NJ: John
Wiley & Sons; pp. 322–324.
Lipkovich, I., B. Ratitch, and M. O’Kelly. 2016. Sensitivity to censored-at-random assumption in the analysis of time-to-
event endpoints. Pharmaceutical Statistics 15 (3):216–229. doi:10.1002/pst.1738.
Lu, K., D. Li, and G. G. Koch. 2015. Comparison between two controlled multiple imputation methods for sensitivity
analyses of time-to-event data with possibly informative censoring. Statistics in Biopharmaceutical Research
7 (3):199–213. doi:10.1080/19466315.2015.1053572.
McEvoy, B. W. 2016. Missing data in clinical trials for weight. Journal of Biopharmaceutical Statistics 26 (1):30–36.
doi:10.1080/10543406.2015.1094814.
National Research Council. 2010. The prevention and treatment of missing data in clinical trials. Washington, DC: The National Academies Press.
Rothmann, M. D., B. L. Wiens, and I. S. Chan. 2011. Missing data and analysis sets. In Design and analysis of non-
inferiority trials, New York, NY: CRC Press; p. 181.
Tsiatis, A. A. 2006. Multiple imputation: A frequentist perspective. In Semiparametric theory and missing data, New
York, NY: Springer; pp. 366–371.
Appendix

A. MLE and Covariance Matrix for the Piecewise Exponential Model

Based on the log-likelihood (6), setting $\frac{\partial l(\alpha, \beta)}{\partial \alpha} = 0$, we obtain

$$\hat\alpha_k(\beta) = \log\!\left(\frac{d_k}{\sum_{i=1}^{n}\Delta_k(t_i)\exp(\beta^T x_i)}\right), \quad k = 1, \ldots, K. \tag{7}$$

Substituting this into (6) gives the profile likelihood for $\beta$ after some rearrangement:

$$l_p(\beta) = \sum_{i=1}^{n} \delta_i \beta^T x_i - \sum_{k=1}^{K} d_k \log\!\left[\sum_{i=1}^{n} \Delta_k(t_i)\exp(\beta^T x_i)\right] + c, \tag{8}$$

where $c$ is the part of the likelihood that does not depend on $\beta$. The partial derivatives of $l_p(\beta)$ are

$$\frac{\partial l_p(\beta)}{\partial \beta} = \sum_{i=1}^{n} \delta_i x_i - \sum_{k=1}^{K} d_k \frac{S_k^{(1)}(\beta)}{S_k^{(0)}(\beta)},$$

$$\frac{\partial^2 l_p(\beta)}{\partial \beta^2} = \sum_{k=1}^{K} d_k \left\{ \frac{S_k^{(1)}(\beta)}{S_k^{(0)}(\beta)}\left(\frac{S_k^{(1)}(\beta)}{S_k^{(0)}(\beta)}\right)^T - \frac{S_k^{(2)}(\beta)}{S_k^{(0)}(\beta)} \right\},$$

where $S_k^{(0)} = \sum_{i=1}^{n}\Delta_k(t_i)\exp(\beta^T x_i)$, $S_k^{(1)} = \sum_{i=1}^{n}\Delta_k(t_i)\exp(\beta^T x_i)\, x_i$, and $S_k^{(2)} = \sum_{i=1}^{n}\Delta_k(t_i)\exp(\beta^T x_i)\, x_i x_i^T$.

$\hat\beta$ can be obtained by maximizing the profile likelihood (8), for example, by solving $\frac{\partial l_p(\beta)}{\partial \beta} = 0$ using a root-finding algorithm such as Newton-Raphson. Substituting $\hat\beta$ into (7) gives $\hat\alpha_k = \hat\alpha_k(\hat\beta)$, $k = 1, \ldots, K$. Alternatively, $\hat\beta$ and $\hat\alpha$ can be obtained using the "pchreg" function in the R package "eha". That function does not provide the full covariance matrix.

The asymptotic covariance matrix can be obtained as the inverse of the information matrix, whose blocks evaluated at $(\hat\alpha, \hat\beta)$ are

$$-\frac{\partial^2 l(\alpha, \beta)}{\partial \alpha^2}\bigg|_{\alpha=\hat\alpha,\,\beta=\hat\beta} = \mathrm{Diag}\!\left[\exp(\hat\alpha_1) S_1^{(0)}, \ldots, \exp(\hat\alpha_K) S_K^{(0)}\right] = \mathrm{Diag}\!\left[d_1, \ldots, d_K\right],$$

$$-\frac{\partial^2 l(\alpha, \beta)}{\partial \beta^2}\bigg|_{\alpha=\hat\alpha,\,\beta=\hat\beta} = \sum_{k=1}^{K} \exp(\hat\alpha_k)\, S_k^{(2)},$$

$$-\frac{\partial^2 l(\alpha, \beta)}{\partial \alpha\,\partial \beta}\bigg|_{\alpha=\hat\alpha,\,\beta=\hat\beta} = \left[\exp(\hat\alpha_1) S_1^{(1)}, \ldots, \exp(\hat\alpha_K) S_K^{(1)}\right],$$

where Diag stands for a diagonal matrix.
B. Baseline Characteristics in Data Example
This section summarizes subject characteristics for non-completers, completers who discontinued treatment (OFF completers), and completers who were on-treatment throughout the study (ON completers) by treatment group in the data example.

Table A1. Baseline characteristics – data example.

                              Active Treatment                                        Placebo
                              Non-completer   OFF completer   ON completer            Non-completer   OFF completer   ON completer
                              (N=396)         (N=755)         (N=2937)                (N=423)         (N=856)         (N=2811)
Sex, n (%)
  Female                      139 (35.10%)    250 (33.10%)    772 (26.20%)            180 (42.50%)    309 (36.00%)    706 (25.10%)
  Male                        257 (64.80%)    505 (66.80%)    2165 (73.70%)           243 (57.40%)    547 (63.90%)    2105 (74.80%)
Age group, n (%)
  <65 years                   199 (50.20%)    338 (44.70%)    1694 (57.60%)           218 (51.50%)    403 (47.00%)    1563 (55.60%)
  >=65 years                  197 (49.70%)    417 (55.20%)    1243 (42.30%)           205 (48.40%)    453 (52.90%)    1248 (44.30%)
Region, n (%)
  Asia Pacific                29 (7.30%)      14 (1.80%)      87 (2.90%)              32 (7.50%)      13 (1.50%)      87 (3.00%)
  Eastern Europe              66 (16.60%)     118 (15.60%)    869 (29.50%)            67 (15.80%)     138 (16.10%)    848 (30.10%)
  Westernized                 301 (76.00%)    623 (82.50%)    1981 (67.40%)           324 (76.50%)    705 (82.30%)    1876 (66.70%)
Baseline BMI, n (%)
  <25 kg/m2                   41 (10.30%)     62 (8.20%)      217 (7.30%)             40 (9.40%)      65 (7.50%)      190 (6.70%)
  >=25 to <30 kg/m2           119 (30.00%)    232 (30.70%)    1076 (36.60%)           116 (27.40%)    292 (34.10%)    1006 (35.70%)
  >=30 kg/m2                  230 (58.00%)    460 (60.90%)    1641 (55.80%)           254 (60.00%)    495 (57.80%)    1613 (57.30%)
Diabetes at baseline, n (%)
  Yes                         284 (71.70%)    472 (62.50%)    1637 (55.70%)           299 (70.60%)    542 (63.30%)    1552 (55.20%)
  No                          112 (28.20%)    283 (37.40%)    1300 (44.20%)           121 (28.60%)    314 (36.60%)    1259 (44.70%)
CV risk category, n (%)
  Primary prevention          213 (53.70%)    500 (66.20%)    2179 (74.10%)           213 (50.30%)    569 (66.40%)    2111 (75.00%)
  Secondary prevention        183 (46.20%)    255 (33.70%)    758 (25.80%)            210 (49.60%)    287 (33.50%)    700 (24.90%)
Ezetimibe use, n (%)
  Yes                         17 (4.20%)      62 (8.20%)      183 (6.20%)             25 (5.90%)      70 (8.10%)      167 (5.90%)
  No                          379 (95.70%)    693 (91.70%)    2754 (93.70%)           398 (94.00%)    786 (91.80%)    2644 (94.00%)
C. Example SAS Code
In this section, we show the main SAS code for proper imputation and proper-like imputation applied to the data
example. A piecewise exponential model without covariates was used here, but the code can be adapted for models with
covariates. The baseline hazard contained three pieces: 0;40½ Þ;40;500½ Þ;500;1½ Þ days. The number of imputations was
M¼1000.
C1. Sample Lambda in Proper Imputation
For proper imputation, we used the BAYES statement in PROC PHREG to sample λ from its posterior distribution. The
PIECEWISE option specifies a piecewise exponential model. This allows sampling of either λ or log(λ) with 'piecewise=hazard'
and 'piecewise=log', respectively. Users can specify either the number of intervals using 'n=' or the specific cut-
points using 'intervals=()'. Setting 'n=1' gives a usual exponential distribution. The initial samples before convergence
were discarded (by nbi=1000), and then every 100th sample was retained.
* trt01p: treatment group;
* offipday: observed off-treatment days;
* CNSR: censoring indicator (1 = censored);
%let iterations=1000;
%let tau1=40;
%let tau2=500;
proc phreg data=data1;
by trt01p;
class trt01p;
model offipday*CNSR(1)=trt01p;
hazardratio trt01p;
bayes seed=12345 outpost=outsample nbi=1000 nmc=%eval(100*(&iterations. -1) +1) thin=1
coeffprior=normal(var=1e+3) statistics=(summary interval)
piecewise=hazard(intervals=(&tau1., &tau2.) prior=gamma(shape=1e-4 iscale=1e-2));
ods output postsummaries=sum;
run;
data lambda; set outsample; by trt01p;
if mod(iteration,100)=1;
if first.trt01p then sim=0;
sim+1;
drop loglike logpost iteration;
run;
C2. Sample Lambda in Proper-like Imputation
* hr1, hr2, hr3: observed event rates for the three pieces in the piecewise exponential model;
* event1, event2, event3: number of observed events for the three pieces;
data lambda;
call streaminit(12345);
set rates;
do imput = 1 to &iterations. by 1;
keep trt01p lambda1 lambda2 lambda3 sim;
sim=imput;
randnml1=rand("NORMAL");
randnml2=rand("NORMAL");
randnml3=rand("NORMAL");
lambda1 =exp(log(hr1) + (sqrt(1/event1)*randnml1));
lambda2 =exp(log(hr2) + (sqrt(1/event2)*randnml2));
lambda3 =exp(log(hr3) + (sqrt(1/event3)*randnml3));
output;
end;
run;
C3. Impute Time-to-Event
* adtte_NONcomp: data for non-completers;
data MI; set adtte_NONcomp;
do sim=1 to &iterations.;
output;
end;
run;
proc sort data=MI; by trt01p sim; run;
* ADT: analysis date (last follow-up date);
* PACUTDT: end of study date;
* DTHDT: death date;
* RANDDT: randomization date;
data MI; merge MI lambda; by trt01p sim;
if 0<=offipday<&tau1. then do;
time1=offipday;
time2=0;
time3=0;
end;
else if &tau1.<=offipday<&tau2. then do;
time1=&tau1.;
time2=offipday-&tau1.;
time3=0;
end;
else if offipday>=&tau2. then do;
time1=&tau1.;
time2=&tau2.-&tau1.;
time3=offipday-&tau2.;
end;
St=exp(-(lambda1*time1+lambda2*time2+lambda3*time3));
u=ranuni(12345)*St;
if u> exp(- Lambda1*&tau1.) then t=-log(u)/Lambda1;
else if exp(- Lambda1*&tau1.)>=u>exp(-(Lambda1*&tau1.+Lambda2*(&tau2.-&tau1.))) then t=&tau1.-(log(u)
+lambda1*&tau1.)/lambda2;
else if exp(- (Lambda1*&tau1.+Lambda2*(&tau2.-&tau1.)))>=u then t=&tau2.-(log(u)+lambda1*&tau1.+lambda2*
(&tau2.-&tau1.))/lambda3;
ADT2=ADT+(t-offipday); * imputed event date;
if ADT2 le min(PACUTDT, DTHDT) then do; CNSR=0; AVAL3=(ADT2- RANDDT+1); end;
else do; CNSR=1; AVAL3=(min(PACUTDT, DTHDT)-RANDDT+1); ADT2=min(PACUTDT, DTHDT); end;
drop u t Lambda1 Lambda2 Lambda3 adt aval;
format adt2 date9.;
run;
C4. Combine Multiple Imputation Estimates
* adtte_comp: data for completers;
* Combine each imputed non-completer dataset with the completer dataset;
data data2; set adtte_comp(in=A) MI(rename=(adt2=adt aval3=aval));
if A then do;
do sim=1 to &iterations.;
output;
end;
end;
else output;
run;
proc sort data=data2; by sim subjid; run;
ods select none;
ods output ParameterEstimates=ParameterEstimates;
proc phreg data =data2;
by sim;
class TRT01P;
model aval*CNSR(1) = TRT01P;
hazardratio TRT01P;
run;
ods output ParameterEstimates = est;
proc mianalyze data=parameterestimates;
modeleffects estimate;
stderr StdErr;
run;
ods select all;
data finalest; set est;
HR=exp(estimate);
LCL=exp(LCLMean);
UCL=exp(UCLMean);
run;
D. Subdistribution Method for Handling Non-CV Death
The following SAS code can be used to fit a proportional subdistribution hazard model to each imputed dataset, where
CNSR is a modified censoring indicator that has 3 levels: CNSR = 0: event of interest, CNSR = 1: censor, CNSR = 2: non-
CV death. ‘eventcode = 0’ needs to be specified in the model statement.
proc phreg data=data2;
by sim;
class trt01pn;
model aval*CNSR(1)=trt01pn/eventcode=0;
hazardratio trt01pn;
run;
E. Simulation Studies
E1. Simulation Set-up
For simplicity, time-to-event data were simulated from exponential distributions. The results can be generalized to
more complicated piecewise exponential models. Six different simulation scenarios were tested. The sample size is $N = 5000$ per group. The number of simulations for each scenario is 1000.
For Scenario 1, on-treatment times were simulated from an exponential distribution with rate $\lambda_{on} = 0.01$, and off-treatment times were simulated from an exponential distribution with rate $\lambda_{off} = 0.02$ for both treatment groups. The true hazard ratio is 1. On-treatment time was censored with another exponential distribution with rate $\lambda_{censor} = 0.005$ to mimic treatment discontinuation. The simulated time-to-event (on-treatment time plus off-treatment time) was censored at 100 to mimic the EOS. Missing data were created by randomly choosing 20% and 60% of subjects who discontinued treatment from the control and treated groups, respectively, and censoring them at the time of treatment discontinuation. The percent of non-completers is around 10% (an illustrative sketch of this data-generating process is given after the scenario descriptions below).
JOURNAL OF BIOPHARMACEUTICAL STATISTICS 251
Scenario 2 is similar to Scenario 1, except that the control and treated groups have different event rates. The control group
has the same rates as in Scenario 1. The treated group has $\lambda_{on} = 0.008$ and $\lambda_{off} = 0.016$. The true hazard ratio is around 0.8.
Scenarios 3 and 4 are the same as Scenarios 1 and 2 respectively, except that the proportion of missing data was higher
for both arms (50% and 90% among those who discontinued treatment). The percent of non-completers is around 20%.
Scenarios 5 and 6 are the same as Scenarios 1 and 2 respectively, except that missing data were created by censoring
the simulated time-to-event with another exponential distribution (with rates 0.02 and 0.06 for the two groups
respectively). Therefore, some non-completers have an off-treatment period.
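The following SAS data step is an illustrative sketch of the Scenario 1 data-generating process described above; the dataset and variable names (and the seed) are assumptions for exposition, not the authors' simulation code.

data sim1;
  call streaminit(20220919);
  do trt = 0 to 1;                                   /* 0 = control, 1 = treated */
    do i = 1 to 5000;
      t_on   = rand("EXPONENTIAL") / 0.01;           /* time to event while on treatment  */
      t_disc = rand("EXPONENTIAL") / 0.005;          /* time to treatment discontinuation */
      if t_on <= t_disc then t_event = t_on;         /* event occurs on treatment */
      else t_event = t_disc + rand("EXPONENTIAL") / 0.02;   /* off-treatment event time */
      aval = min(t_event, 100);                      /* administrative censoring at EOS   */
      cnsr = (t_event > 100);
      /* Create missingness: censor 20% (control) / 60% (treated) of subjects who   */
      /* discontinued treatment before an event or EOS at their discontinuation time */
      p_miss = ifn(trt = 1, 0.6, 0.2);
      if t_disc < t_on and t_disc < 100 and rand("UNIFORM") < p_miss then do;
        aval = t_disc;
        cnsr = 1;
      end;
      output;
    end;
  end;
run;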
E2. Simulation Results
For all the scenarios tested, the estimates from the retrieved-dropout-based imputation method are unbiased and the 95%
CIs have coverage probabilities close to the nominal level. The fact that the coverage probability is about 95% when the null hypothesis is
true implies type I error is controlled at 5%. In contrast, the estimators from the Cox PH method are biased and the 95%
CIs have poor coverage probabilities. When the percent of missing data is higher (in Scenarios 3 and 4), the SEs are
slightly larger. In Scenarios 3 and 4, the percent of retrieved dropouts in the treated group is very low (<3%, around 100
subjects), but the method appears to perform well.
Table A2. Results from simulation studies.

                            Cox PH Without Imputation                        Retrieved-Dropout-Based Imputation
Scenario   True HR   Mean HR   % Bias   Emp. SE of HR   95% Coverage   Mean HR   % Bias   Emp. SE of HR   95% Coverage
1          1.0       0.953     −4.73    0.024           51.7           1.000     −0.02    0.025           95.4
2          0.8       0.764     −4.92    0.020           50.3           0.804      0.09    0.022           95.1
3          1.0       0.945     −5.46    0.025           42.4           0.999     −0.07    0.033           94.9
4          0.8       0.756     −5.86    0.021           40.2           0.804      0.10    0.028           95.8
5          1.0       0.964     −3.58    0.024           70.5           1.000      0.01    0.026           96.0
6          0.8       0.772     −3.93    0.020           66.3           0.805      0.02    0.022           95.9

HR: hazard ratio; SE: standard error; PH: proportional hazard; Emp. SE of HR: empirical standard error of the HR estimates; 95% Coverage: coverage probability (%) of the 95% confidence interval.
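For clarity, the quantities reported in Table A2 can be computed from the stored per-simulation estimates along the following lines. The dataset simresults and its variables (hr, lcl, ucl, truehr) are hypothetical names for the estimated hazard ratio, its 95% confidence limits, and the true hazard ratio from each of the 1000 simulated trials; they are not taken from the paper.
/* Illustrative sketch (dataset and variable names are assumptions): summarize the */
/* per-simulation estimates into mean HR, empirical SE, and coverage probability.  */
data simsummary;
   set simresults;
   covered = (lcl <= truehr and truehr <= ucl);   /* 1 if the 95% CI covers the true HR */
run;

proc means data=simsummary n mean std;
   var hr covered;   /* mean of hr = Mean HR; std of hr = empirical SE of HR;            */
                     /* mean of covered = 95% coverage probability                       */
run;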