On Estimating the Relationship between Longitudinal
Measurements and Time-to-Event Data Using a Simple
Paul S. Albert and Joanna H. Shih
Biometric Research Branch
Division of Cancer Treatment and Diagnosis
National Cancer Institute
Bethesda, Maryland U.S.A. 20892
January 30, 2009
Ye et al. (2008) proposed a joint model for longitudinal measurements and time-to-event
data in which the longitudinal measurements are modeled with a semiparametric mixed
model to allow for the complex patterns in longitudinal biomarker data. They proposed a
two-stage regression calibration approach which is simpler to implement than a joint modeling
approach. In the first stage of their approach, the mixed model is fit without regard
to the time-to-event data. In the second stage, the posterior expectations of an individual’s
random effects from the mixed model are included as covariates in a Cox model. Although
Ye et al. (2008) acknowledged that their regression calibration approach may cause bias
due to the problem of informative dropout and measurement error, they argued that the
bias is small relative to alternative methods. In this article, we show that this bias may
be substantial. We show how to alleviate much of this bias with an alternative regression
calibration approach which can be applied for both discrete and continuous time-to-event
data. Through simulations, the proposed approach is shown to have substantially less bias
than the regression calibration approach proposed by Ye et al. (2008). In agreement with the
methodology proposed by Ye et al., an advantage of our proposed approach over joint modeling
is that it can be implemented with standard statistical software and does not require
complex estimation techniques.
Ye et al. (2008) proposed a two-stage regression calibration approach for estimating the
relationship between longitudinal measurements and time-to-event data. Their approach was
motivated by trying to establish such a relationship when the longitudinal measurements
follow a complex semi-parametric mixed model with subject-specific random stochastic
processes and the time-to-event data follow a proportional hazards model. Specifically, they
proposed a semi-parametric model with additive errors for the longitudinal measurements
$X_{ij}$ of the form

$$X_{ij} = Z_i^{T}\beta + \phi(t_{ij}) + U_i(t_{ij})\,b_i + W_i(t_{ij}) + \epsilon_{ij} = X_i^{*}(t_{ij}) + \epsilon_{ij}, \qquad (1)$$

where $\beta$ is a vector of regression coefficients associated with fixed effect covariates $Z_i$, $\phi(t)$
is an unknown smooth function over time, and $b_i$ is a vector of subject-specific random effects
corresponding to covariates $U_i(t)$, assumed normally distributed with mean 0 and
variance $\Sigma_b$. Further, $W_i(t_{ij})$ is a zero mean integrated Wiener stochastic process. We
denote by $X_i$ all longitudinal measurements on the $i$th individual.
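To make the structure of model (1) concrete, consider a simplified special case. This reduction is an illustrative assumption, not a restatement of the authors' models: the fixed-effect and smooth terms are replaced by a straight line, the random effects are a random intercept and a random slope, and the integrated Wiener process is dropped. The symbols $\beta_0$, $\beta_1$, $b_{i0}$, and $b_{i1}$ are chosen to match those used in the table captions.

```latex
% Illustrative special case of model (1), assumed for exposition only:
%   Z_i^T \beta + \phi(t) = \beta_0 + \beta_1 t,
%   U_i(t) = (1, t),  b_i = (b_{i0}, b_{i1})^T,  W_i(t) \equiv 0.
\begin{align*}
X_{ij} &= \beta_0 + \beta_1 t_{ij} + b_{i0} + b_{i1} t_{ij} + \epsilon_{ij}
        = X_i^{*}(t_{ij}) + \epsilon_{ij}, \\
\frac{d}{dt} X_i^{*}(t) &= \beta_1 + b_{i1}.
\end{align*}
```

Under this simplification the slope of the underlying trajectory is constant in time and differs between subjects only through the random slope $b_{i1}$, so relating the slope to an event time amounts to relating $b_{i1}$ to the event time.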
In Ye et al.’s approach, the relationship between the slope of the longitudinal process
and a time-to-event outcome $T_i$ is characterized by a Cox proportional hazards model with
the slope at time $t$, denoted as $X_i^{*\prime}(t)$, being treated as a time-dependent covariate. The
authors proposed a two-stage estimation procedure. In the first stage, the mean of
the posterior distribution of the slope at time $t$, $E[X_i^{*\prime}(t) \mid X_i, Z_i]$, is estimated using model
(1) without regard to the time-to-event process $T_i$. In the second stage, $E[X_i^{*\prime}(t) \mid X_i, Z_i]$
replaces $X_i^{*\prime}(t)$ in the Cox model. Ye et al. (2008) proposed two approaches: (i) the ordinary
regression calibration (ORC) approach in which $E[X_i^{*\prime}(t) \mid X_i, Z_i]$ is estimated using (1) with
all available longitudinal measurements and (ii) the risk set regression calibration (RRC)
approach in which these expectations are obtained by estimating model (1) after each event
using only longitudinal measurements for subjects at risk at time $t$ (i.e., subjects who have
Table 1: Estimates of $\alpha_1$ from model (2)-(4) when $\beta_0 = 1$, $\beta_1 = 3$, $\sigma_{b0} = 1$, $\sigma_{b1} = 1$, and
$\sigma_{b0,b1} = 0$. We assume that $\sigma_\epsilon = 0.75$, $\alpha_0 = -1.5$, $\alpha_1 = 0.50$, $J = 5$, and $I = 300$. Further,
we assume that $t_j = j$ and all individuals who are alive at $t_5 = 5$ are administratively
censored at that time point. The means (standard deviations) from 1000 simulations are presented.
Prop M=3 w/o MC
Prop M=3 w/ MC
Prop M=10 w/o MC
Prop M=10 w/ MC
Prop M=20 w/o MC
Prop M=20 w/ MC
Prop M=50 w/o MC
Prop M=50 w/ MC
Prop M=100 w/o MC
Prop M=100 w/ MC
1 Model (2) fit with $b_{i1}$ assumed known.
2 Model (2) fit with $\hat{b}_{i1}$ replacing $b_{i1}$. The empirical Bayes estimates $\hat{b}_{i1}$ are obtained by
fitting (3) and (4) with complete longitudinal measurements.
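The simulation design in the Table 1 caption can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' code. Models (2)-(4) are not reproduced in this section, so the sketch assumes a linear random intercept and slope model for the measurements and a discrete-time probit hazard $\Phi(\alpha_0 + \alpha_1 b_{i1})$ for the event. Both forms, and all function names, are hypothetical. The posterior mean of $b_{i1}$ is computed in closed form with the true parameter values plugged in, whereas footnote 2 refers to empirical Bayes estimates from a fitted model.

```python
import math
import random

# Values quoted in the Table 1 caption.
BETA0, BETA1 = 1.0, 3.0
SIGMA_B0, SIGMA_B1 = 1.0, 1.0   # random-effect standard deviations (covariance 0)
SIGMA_EPS = 0.75
ALPHA0, ALPHA1 = -1.5, 0.50
J, I = 5, 300
TIMES = [float(j) for j in range(1, J + 1)]  # t_j = j


def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_subject(rng):
    """Draw random effects, a complete measurement series, and an event time."""
    b0 = rng.gauss(0.0, SIGMA_B0)
    b1 = rng.gauss(0.0, SIGMA_B1)
    # ASSUMPTION: linear random intercept and slope model for the measurements.
    x = [BETA0 + b0 + (BETA1 + b1) * t + rng.gauss(0.0, SIGMA_EPS) for t in TIMES]
    # ASSUMPTION: constant discrete-time probit hazard depending on b1.
    hazard = std_normal_cdf(ALPHA0 + ALPHA1 * b1)
    time, event = J, 0              # administratively censored at t_J
    for j in range(1, J + 1):
        if rng.random() < hazard:
            time, event = j, 1
            break
    return {"b1": b1, "x": x, "time": time, "event": event}


def posterior_slope_effect(x):
    """E[b_i1 | X_i] from complete data, with all parameters treated as known.

    Uses b_hat = (Sigma_b^{-1} + U'U / s2)^{-1} U'r / s2 with rows of U = (1, t_j)
    and r the residuals from the population line; the 2x2 system is solved by hand.
    """
    inv_var = 1.0 / SIGMA_EPS ** 2
    resid = [xj - BETA0 - BETA1 * t for xj, t in zip(x, TIMES)]
    a00 = 1.0 / SIGMA_B0 ** 2 + inv_var * len(TIMES)
    a01 = inv_var * sum(TIMES)
    a11 = 1.0 / SIGMA_B1 ** 2 + inv_var * sum(t * t for t in TIMES)
    c0 = inv_var * sum(resid)
    c1 = inv_var * sum(t * r for t, r in zip(TIMES, resid))
    det = a00 * a11 - a01 * a01
    return (a00 * c1 - a01 * c0) / det


def simulate_cohort(seed=0):
    """Simulate I independent subjects with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_subject(rng) for _ in range(I)]
```

Calling `simulate_cohort()` and applying `posterior_slope_effect` to each subject's `x` gives the covariate that would replace $b_{i1}$ in the second-stage event model. To mimic informative dropout, one would instead pass only the measurements taken before each subject's event time, which requires restricting `TIMES` accordingly; that step is left out of this sketch.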
Table 2: Estimates of $\alpha$ from model (3)-(4) and $\lambda(t, b_{i1}) = \lambda_0(t)\exp(\alpha b_{i1})$ where $I = 300$
and $M = 10$. We also assume that $\beta_0 = 1$, $\beta_1 = 3$, $\sigma_{b0} = 1$, $\sigma_{b1} = 1$, and $\sigma_{b_{i0},b_{i1}} = 0$. The
means (standard deviations) from 1000 simulations are presented.
Estimators of α