On Estimating the Relationship between Longitudinal

Measurements and Time-to-Event Data Using a Simple

Two-Stage Procedure

Paul S. Albert and Joanna H. Shih

Biometric Research Branch

Division of Cancer Treatment and Diagnosis

National Cancer Institute

Bethesda, Maryland U.S.A. 20892

January 30, 2009

SUMMARY

Ye et al. (2008) proposed a joint model for longitudinal measurements and time-to-event

data in which the longitudinal measurements are modeled with a semiparametric mixed

model to allow for the complex patterns in longitudinal biomarker data. They proposed a

two-stage regression calibration approach which is simpler to implement than a joint mod-

eling approach. In the first stage of their approach, the mixed model is fit without regard

to the time-to-event data. In the second stage, the posterior expectations of an individual’s

random effects from the mixed model are included as covariates in a Cox model. Although

Ye et al. (2008) acknowledged that their regression calibration approach may cause bias

due to the problem of informative dropout and measurement error, they argued that the

bias is small relative to alternative methods. In this article, we show that this bias may

be substantial. We show how to alleviate much of this bias with an alternative regression

calibration approach which can be applied for both discrete and continuous time-to-event

data. Through simulations, the proposed approach is shown to have substantially less bias

than the regression calibration approach proposed by Ye et al. (2008). As with the

methodology proposed by Ye et al., an advantage of our proposed approach over joint

modeling is that it can be implemented with standard statistical software and does not require

complex estimation techniques.

1 Introduction

Ye et al. (2008) proposed a two-stage regression calibration approach for estimating the re-

lationship between longitudinal measurements and time-to-event data. Their approach was

motivated by trying to establish such a relationship when the longitudinal measurements

follow a complex semi-parametric mixed model with subject-specific random stochastic pro-

cesses and the time-to-event data follow a proportional hazards model. Specifically, they

proposed a semi-parametric model with additive errors for the longitudinal measurements

$X_{ij}$ of the form

$$X_{ij} = Z_i^\top \beta + \phi(t_{ij}) + U_i(t_{ij}) b_i + W_i(t_{ij}) + \epsilon_{ij} = X_i^*(t_{ij}) + \epsilon_{ij}, \qquad (1)$$

where $\beta$ is a vector of regression coefficients associated with fixed effect covariates $Z_i$, $\phi(t)$

is an unknown smooth function over time, and $b_i$ is a vector of subject-specific random effects

corresponding to covariates $U_i(t)$, assumed normally distributed with mean 0 and

variance $\Sigma_b$. Further, $W_i(t_{ij})$ is a zero mean integrated Wiener stochastic process. We

denote by $X_i$ all longitudinal measurements on the $i$th individual.

In Ye et al.’s approach, the relationship between the slope of the longitudinal process

and a time-to-event outcome $T_i$ is characterized by a Cox proportional hazards model with

the slope at time $t$, denoted as $X_i^{*\prime}(t)$, being treated as a time-dependent covariate. The

authors proposed a two-stage estimation procedure. In the first stage, the mean of

the posterior distribution of the slope at time $t$, $E[X_i^{*\prime}(t) \mid X_i, Z_i]$, is estimated using model

(1) without regard to the time-to-event process $T_i$. In the second stage, $E[X_i^{*\prime}(t) \mid X_i, Z_i]$

replaces $X_i^{*\prime}(t)$ in the Cox model. Ye et al. (2008) proposed two approaches: (i) the ordinary

regression calibration (ORC) approach in which $E[X_i^{*\prime}(t) \mid X_i, Z_i]$ is estimated using (1) with

all available longitudinal measurements and (ii) the risk set regression calibration (RRC)

approach in which these expectations are obtained by estimating model (1) after each event

using only longitudinal measurements for subjects at risk at time t (i.e., subjects who have


an event before time t are removed from the estimation).

The advantage of these regression calibration approaches is that they do not require the

complex joint modeling of the longitudinal and time-to-event processes. In the discussion

of their paper, Ye et al. acknowledge that these approaches may result in biased estimation

due to informative dropout and measurement error, and that improved performance will

require incorporating informative dropout and the uncertainty of measurement error into

the estimation. In this article we show that an alternative two-stage procedure can be for-

mulated which reduces the bias considerably without requiring complex joint modeling of

both processes. For simplicity, we develop the approach for a longitudinal model without the

smooth function ϕ(t) and the stochastic component Wi(t) in (1), but the proposed approach

applies more generally. In this approach, we approximate the conditional distribution of the

longitudinal process given the event time, simulate complete follow-up data based on the

approximate conditional model, and then fit the longitudinal model with complete follow-up

on each patient (hence avoiding the problem of informative dropout in Ye et al.’s approach).

Section 2 develops the approach for a discrete event time distribution followed by an approx-

imation for the continuous event time distribution. The results of simulations which show

the advantages of the proposed approach over ORC and RRC are provided in Section 3. A

discussion follows in Section 4.

2 Modeling Framework

We begin by considering a discrete event time distribution. Define $T_i$ to be a discrete event

time which can take on discrete values $t_j$, $j = 1, 2, \ldots, J$, and $Y_{ij}$ to be a binary indicator of

whether the $i$th patient is dead at time $t_j$. Then

$$J_i = \sum_{j=1}^{J} (1 - Y_{ij}) = J - Y_{i\cdot}, \qquad \text{where } Y_{i\cdot} = \sum_{j=1}^{J} Y_{ij},$$

indicates the number of follow-up measurements before death or administrative censoring

for the $i$th patient. Every patient will be followed until death or the end of follow-up at time

$t_J$.
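The bookkeeping above can be checked with a minimal sketch. The function name `follow_up_count` and the example indicator vectors are hypothetical, chosen only to illustrate the identity between the two expressions for the number of follow-up measurements.

```python
def follow_up_count(death_indicators):
    """Return J_i = sum_j (1 - Y_ij) for one patient's indicators Y_i1, ..., Y_iJ."""
    # Y_ij marks "dead at time t_j", so the sequence can never drop back to 0.
    pairs = zip(death_indicators, death_indicators[1:])
    if any(earlier > later for earlier, later in pairs):
        raise ValueError("death indicators must be non-decreasing")
    return sum(1 - y for y in death_indicators)


# Hypothetical patients with J = 5 scheduled time points.
dead_from_t3 = [0, 0, 1, 1, 1]   # alive at t_1 and t_2 only
censored_at_tJ = [0, 0, 0, 0, 0]  # alive through the end of follow-up

print(follow_up_count(dead_from_t3))    # 2
print(follow_up_count(censored_at_tJ))  # 5
```

For the first patient, the indicators sum to 3, so both expressions give 5 - 3 = 2 follow-up measurements.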

For illustrative purposes, we will consider a joint model for longitudinal and discrete time-

to-event data in which the discrete event time distribution is modeled as a linear function of


the slope of an individual’s longitudinal process on the probit scale. Specifically,

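$$P(Y_{ij} = 1 \mid Y_{i(j-1)} = 0) = \Phi(\alpha_{0j} + \alpha_1 b_{i1}), \qquad (2)$$

where $j = 1, 2, \ldots, J$, $Y_{i0}$ is taken as 0, $\alpha_{0j}$ governs the baseline discrete event time distribution

and $b_{i1}$ is the individual slope from the linear mixed model,

$$X_{ij} = X_i^*(t_j) + \epsilon_{ij}, \qquad (3)$$

$$X_i^*(t_j) = \beta_0 + \beta_1 t_j + b_{i0} + b_{i1} t_j, \qquad (4)$$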

where $i = 1, 2, \ldots, I$ and $j = 1, 2, \ldots, J_i$. In (4), the parameters $\beta_0$ and $\beta_1$ are fixed-effect

parameters characterizing the mean intercept and slope of the longitudinal process, respectively,

$(b_{i0}, b_{i1})^\top$ is a vector of random effects which are assumed multivariate normal with

mean 0 and variance

$$\Sigma_b = \begin{pmatrix} \sigma_{b0}^2 & \sigma_{b0,b1} \\ \sigma_{b0,b1} & \sigma_{b1}^2 \end{pmatrix},$$

and $\epsilon_{ij}$ is a residual error term which is assumed normal with mean zero and variance

$\sigma_\epsilon^2$. In (2)-(4), the event time and the longitudinal process are linked through $b_{i1}$,

and the parameter $\alpha_1$ governs the relationship between the slope

of the longitudinal process and the event time distribution. Denote $X_i = (X_{i1}, X_{i2}, \ldots, X_{iJ_i})^\top$,

$b_i = (b_{i0}, b_{i1})^\top$, and $\beta = (\beta_0, \beta_1)^\top$. As in Ye et al., the normality assumption for $b_i$ is made

for these joint models. Although not the focus of this article, various articles have proposed

methods with flexible semi-parametric random effects distributions and have demonstrated

that inferences are robust to departures from normality (Song et al., 2002; Hsieh et al.,

2006).
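To make the link between the slope and the event time concrete, the following is a minimal simulation sketch of models (2)-(4) using only the Python standard library. The function name `simulate_patient`, the seed, and every parameter value are assumptions chosen for illustration; none of them come from the article.

```python
import math
import random
from statistics import NormalDist

PROBIT = NormalDist().cdf  # standard normal CDF, the Phi in model (2)


def simulate_patient(rng, times, alpha0, alpha1, beta, sd_b, corr_b, sd_eps):
    """Draw one patient from models (2)-(4); all parameter values are illustrative."""
    # Correlated random intercept b_i0 and slope b_i1 (2x2 Cholesky by hand).
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    b0 = sd_b[0] * z0
    b1 = sd_b[1] * (corr_b * z0 + math.sqrt(1.0 - corr_b ** 2) * z1)

    # Discrete-time probit hazard: the first j with Y_ij = 1 ends follow-up,
    # so J_i equals the number of time points survived.
    n_obs = len(times)
    for j in range(len(times)):
        if rng.random() < PROBIT(alpha0[j] + alpha1 * b1):
            n_obs = j
            break

    # Longitudinal measurements exist only at t_1, ..., t_{J_i}.
    x = [beta[0] + beta[1] * t + b0 + b1 * t + rng.gauss(0.0, sd_eps)
         for t in times[:n_obs]]
    return {"slope": b1, "n_obs": n_obs, "x": x}


rng = random.Random(2009)
times = [1.0, 2.0, 3.0, 4.0, 5.0]
patients = [
    simulate_patient(rng, times, alpha0=[-1.5] * 5, alpha1=1.0,
                     beta=(1.0, 0.5), sd_b=(1.0, 1.0), corr_b=0.0, sd_eps=0.5)
    for _ in range(2000)
]

# Compare true slopes of patients who die early with those who complete follow-up.
early = [p["slope"] for p in patients if p["n_obs"] <= 1]
complete = [p["slope"] for p in patients if p["n_obs"] == len(times)]
print(sum(early) / len(early), sum(complete) / len(complete))
```

With a positive `alpha1`, the patients with at most one measurement should show a larger average true slope than those who complete follow-up, because a large slope raises the hazard at every time point.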

For estimating the relationship between the slope of the longitudinal process and the

time-to-event process, the calibration approach of Ye et al. (2008) reduces to first, estimating

$E[b_{i1} \mid X_i, \beta]$ using (3) and (4), and second, replacing $b_{i1}$ by $E[b_{i1} \mid X_i, \hat{\beta}]$ in estimating (2).

As recognized by Ye et al., this methodology introduces bias in two ways. First, there is the

problem of informative dropout, whereby $b_{i0}$ and $b_{i1}$ can depend on the event time $T_i$ (which

will occur if $\alpha_1 \neq 0$ in (2)). Ignoring this informative dropout may result in substantial bias.

Second, not accounting for the measurement error in $E[b_{i1} \mid X_i, \hat{\beta}]$ relative to true values of

$b_{i1}$ will result in attenuated estimation of $\alpha_1$.
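For the linear mixed model (3)-(4), the first-stage quantity can be written in closed form. The display below is the standard best linear unbiased predictor result under the stated normality assumptions. The design matrix $D_i$ is notation introduced here for illustration and does not appear in the original.

```latex
\[
E[b_i \mid X_i, \beta]
  = \Sigma_b D_i^\top
    \left( D_i \Sigma_b D_i^\top + \sigma_\epsilon^2 I_{J_i} \right)^{-1}
    (X_i - D_i \beta),
\qquad
D_i = \begin{pmatrix} 1 & t_1 \\ \vdots & \vdots \\ 1 & t_{J_i} \end{pmatrix}.
\]
```

The second component of this vector is $E[b_{i1} \mid X_i, \beta]$. The expression uses only the $J_i$ observed measurements and no information about $T_i$, and it is pulled toward zero when $J_i$ is small, which is consistent with the two sources of bias described above.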


We propose a simple approach which reduces these two sources of bias. We first focus

on the problem of informative dropout. The bias from informative dropout is a result

of differential follow-up whereby the response process is related to the length of follow-up

(i.e., in (2)-(4), when α1 is positive, patients who die early are more likely to have large

positive slopes). There would be no bias if all J follow-up measurements were observed on

all patients. Thus, we recapture these missing measurements by generating data from the

conditional distribution of $X_i$ given $T_i$, denoted as $X_i \mid T_i$. Since $X_i \mid T_i$ under (2)-(4) does not

have a tractable form, we propose a simple approximation for this conditional distribution.

Under model (2)-(4), the distribution of $X_i \mid T_i$ can be expressed as

$$P(X_i \mid T_i) = \int h(X_i \mid b_i, T_i)\, g(b_i \mid T_i)\, db_i. \qquad (5)$$

Since $T_i$ and the values of $X_i$ are conditionally independent given $b_i$, $h(X_i \mid b_i, T_i) = h(X_i \mid b_i)$,

where $h(X_i \mid b_i)$ is the product of $J_i$ univariate normal density functions each with mean

$X_i^*(t_j)$ ($j = 1, 2, \ldots, J_i$) and variance $\sigma_\epsilon^2$. The distribution of $X_i \mid T_i$ can easily be obtained with

standard statistical software if we approximate $g(b_i \mid T_i)$ by a normal distribution. Under the

assumption that $g(b_i \mid T_i)$ is normally distributed with mean $\mu_{T_i} = (\mu_{0T_i}, \mu_{1T_i})^\top$ and variance

$\Sigma^*_{bT_i}$, and by rearranging mean structure parameters in the integrand of (5) so that the

random effects have mean zero, $X_i \mid T_i$ corresponds to the following mixed model

$$X_{ij} \mid (T_i, b^*_{i0T_i}, b^*_{i1T_i}) = \beta^*_{0T_i} + \beta^*_{1T_i} t_j + b^*_{i0T_i} + b^*_{i1T_i} t_j + \epsilon^*_{ij}, \qquad (6)$$

where $i = 1, 2, \ldots, I$, $j = 1, 2, \ldots, J_i$, and the residuals $\epsilon^*_{ij}$ are assumed to have independent

normal distributions with mean zero and variance $\sigma_\epsilon^{*2}$. Further, the fixed-effects parameters

$\beta^*_{0T_i}$ and $\beta^*_{1T_i}$ are intercept and slope parameters for patients who have an event at time

$T_i$ or who are censored at time $T_i = t_J$. In addition, the associated random effects

$b^*_{iT_i} = (b^*_{i0T_i}, b^*_{i1T_i})^\top$ are multivariate normal with mean 0 and variance $\Sigma^*_{bT_i}$ for each $T_i$. Thus,

this flexible conditional model involves estimating separate fixed effect intercept and slope

parameters for each potential event-time and for subjects who are censored at time $t_J$.
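The conditional model (6) can be used to generate a full set of $J$ measurements for a patient with a given event time. The sketch below shows one way to do this in Python. The function name `simulate_complete_follow_up`, the dictionary layout, and all numerical values are hypothetical stand-ins for estimates that would come from fitting (6) with standard mixed-model software. The sketch draws all $J$ values; whether the observed measurements are retained is not specified in this section.

```python
import math
import random


def simulate_complete_follow_up(rng, times, params):
    """Draw X_i1, ..., X_iJ from model (6) for one patient with a given T_i.

    `params` holds hypothetical estimates for that patient's event-time group:
    beta0, beta1, sd_b0, sd_b1, corr_b, sd_eps.
    """
    # Mean-zero random intercept and slope with the group-specific covariance.
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    b0 = params["sd_b0"] * z0
    b1 = params["sd_b1"] * (
        params["corr_b"] * z0 + math.sqrt(1.0 - params["corr_b"] ** 2) * z1
    )
    # Group-specific fixed effects plus random effects plus residual error,
    # evaluated at every scheduled time point, not only the first J_i.
    return [
        params["beta0"] + params["beta1"] * t + b0 + b1 * t
        + rng.gauss(0.0, params["sd_eps"])
        for t in times
    ]


# Hypothetical fitted values, keyed by event-time index (5 = censored at t_J).
fitted = {
    3: {"beta0": 1.2, "beta1": 0.9, "sd_b0": 0.8, "sd_b1": 0.6,
        "corr_b": 0.1, "sd_eps": 0.5},
    5: {"beta0": 1.0, "beta1": 0.3, "sd_b0": 0.9, "sd_b1": 0.7,
        "corr_b": 0.0, "sd_eps": 0.5},
}
times = [1.0, 2.0, 3.0, 4.0, 5.0]
rng = random.Random(0)
x_full = simulate_complete_follow_up(rng, times, fitted[3])
print(len(x_full))  # one simulated value per scheduled time point
```

Keying the estimates by event time mirrors the separate intercept and slope parameters that (6) assigns to each potential event time and to the censored group.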
