Page 1
Survival analysis with errorprone timevarying covariates: a risk
set calibration approach
Xiaomei Liao1,2,*, David M. Zucker3, Yi Li1, and Donna Spiegelman1,2,**
1Department of Biostatistics, Harvard School of Public Health, Boston, MA, 02115,U.S.A.
2Department of Epidemiology, Harvard School of Public Health, Boston, MA, 02115,U.S.A.
3Department of Statistics, Hebrew University, Mt. Scopus, 91905 Jerusalem, Israel.
Summary
Occupational, environmental, and nutritional epidemiologists are often interested in estimating the
prospective effect of timevarying exposure variables such as cumulative exposure or cumulative
updated average exposure, in relation to chronic disease endpoints such as cancer incidence and
mortality. From exposure validation studies, it is apparent that many of the variables of interest are
measured with moderate to substantial error. Although the ordinary regression calibration
approach is approximately valid and efficient for measurement error correction of relative risk
estimates from the Cox model with timeindependent point exposures when the disease is rare, it
is not adaptable for use with timevarying exposures. By recalibrating the measurement error
model within each risk set, a risk set regression calibration method is proposed for this setting. An
algorithm for a biascorrected point estimate of the relative risk using an RRC approach is
presented, followed by the derivation of an estimate of its variance, resulting in a sandwich
estimator. Emphasis is on methods applicable to the main study/external validation study design,
which arises in important applications. Simulation studies under several assumptions about the
error model were carried out, which demonstrated the validity and efficiency of the method in
finite samples. The method was applied to a study of diet and cancer from Harvard’s Health
Professionals Followup Study (HPFS).
Keywords
Cox proportional hazards model; Measurement error; Risk set regression calibration; Time
varying covariates
1. Introduction
Many epidemiological studies involve survival data with covariates measured with error: the
true covariate value c, as defined by some “gold standard”, is represented approximately by
a surrogate measure C. Often, interest centers on cumulatively updated total or cumulatively
updated average levels of a timevarying exposure, which are computed from a series of
errorprone point exposure measurements. For example, in prospective studies of diet and
*stxia@channing.harvard.edu. **stdls@channing.harvard.edu.
Supplementary Material
Web Appendix A, referenced in Section 6 is available under the Paper Information link at the Biometrics website http:
www.tibs.org/biometrics.
Web Appendix B, referenced in Section 5 is available under the Paper Information link at the Biometrics website
http:www.tibs.org/biometrics.
NIH Public Access
Author Manuscript
Biometrics. Author manuscript; available in PMC 2011 March 17.
Published in final edited form as:
Biometrics. 2011 March ; 67(1): 50–58. doi:10.1111/j.15410420.2010.01423.x.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 2
health such as the Nurses’ Health Study (Hunter et al., 1996), the primary exposure variable
is the cumulatively updated average dietary intake of a given nutrient. Similarly, in
prospective studies of the effects of air pollution on health, there is often interest in the
effect of cumulative exposure to specific pollutants (Zanobetti et al., 2000). Typically the
surrogate point exposures are measured once at each point of a specified grid, and are
validated at timepoints in a coarser grid (e.g., Willett et al., 1985; Brauer et al., 2003). There
is a practical need for statistical methods suited specifically for such applications.
Covariate measurement error in the Cox survival regression model was first addressed by
Prentice (1982), in the setting of a timeinvariant exposure. Under certain assumptions, with
a linear Gaussian model for c given C, the regression calibration estimator emerged. In the
Cox model, the regression calibration estimator is not consistent, but it is a good
approximation under certain conditions. In later papers by many authors, more accurate
methods were developed for various settings; see Zucker (2005) for a review. We note in
particular the risk set regression calibration estimator, which Xie et al. (2001) developed in
the setting of a timeinvariant exposure under a main/reliability study design. Xie et al.
(2001) assumed the classical homoscedastic measurement error model C = c+ϵ, with E(ϵ) =
0 and Var(ϵ) constant.
Timevarying covariates are more challenging to handle. A number of papers have dealt
with measurement error in timevarying covariates. Huang and Wang (2000) presented an
elegant solution for the setting where the classical homoscedastic error model applies and
replicate measurements of the surrogate covariate are available on all (or a sizeable sample
of) study individuals at all times at which events occur.
In practice, as noted above, measurements on the surrogate are usually available only on an
intermittent basis. A common strategy is to use the last available covariate measurements,
although this strategy can lead to some bias (Raboud et al., 1993). A number of authors have
developed more sophisticated methods, based on the joint modeling paradigm. A joint
model consists of a model for the covariate process (often a mixed linear model) and a
model for the hazard of the relevant event given the covariate (typically a Cox model).
Considerable work along this line has been published; Tsiatis and Davidian (2004) provide a
recent review.
The joint modeling approach has a number of features that impede its use in applications.
The approach is computationally intensive. In addition, it puts an undesirable focus on
modeling the exposure process, which requires significant effort but is of no intrinsic
interest to the investigators. Moreover, model checking is problematic, because the covariate
measurement times are typically too few and too sparse for effective model checking.
As we stated at the outset, we are specifically interested in epidemiological applications
involving cumulatively updated total or average exposure. The Willett’s classic textbook on
nutritional epidemiology cites hundreds of papers which use the cumulatively updated
average exposure variable in survival data models, and, similarly, the environmental
epidemiology textbooks by Thomas and by Savitz and Steenland cite hundreds of papers
using the cumulative exposure and distributed lagged exposure variable in survival data
models (Willett, 1998; Thomas, 2009; Steenland and Savitz, 1997). Commonly, in these
studies, the point exposures are subject to considerable measurement error, while the error
induced by carrying forward the most recent cumulative exposure value is less serious. This
motivates an effort to develop methods for analyzing the effect of cumulative exposure in
the presence of measurement error in the point exposures, without invoking the complex
joint modeling approach. Methodology of this sort would have immediate applicability in a
wide range of largescale epidemiological studies, including in our own work Harvard’s
Liao et al.Page 2
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 3
Nurses’ Health Studies, the Harvard Professionals FollowUp Study, the Harvard Six Cities
Study, and many others. It would allow cumulative exposure effects to be assessed in a
practical way that meets the needs of the applied reality.
The purpose of this paper is to develop such a method. Our approach is to extend the risk set
regression calibration method of Xie et al. (2001) from the setting of a timeinvariant
covariate with a classical measurement model to the setting of cumulative exposure with
respect to a timevarying covariate. We work with a measurement error model that is
substantially more general than the classical model, and our method is appropriately
designed to handle this more complex error structure. Instead of the replicate measures
setting, we work under the main study/validation study design, which is suitable for studies
in nutritional and environmental epidemiology, where a gold standard measure of exposure,
or an unbiased measure thereof, often exists.
Section 2 provides a motivating example. Section 3 presents notation and background.
Section 4 presents the method. In Section 5, we derive the variance of the proposed
estimator for the case of the main study / external validation design. Section 6 presents
simulation studies of the method for a range of scenarios motivated by practical
applications, including timevarying exposures with different correlation structures, and rare
and common disease settings. In Section 7, we illustrate the method in the setting of the
example presented in Section 2. Section 8 provides a discussion.
2. Motivating example
We present here an example of the type of problem that motivated our work. The Health
Professionals’ FollowUp Study (HPFS) is an ongoing prospective cohort study of cancer
and heart disease among 51529 U.S. male health professionals who responded to a mailed
baseline questionnaire in 1986, asking about demographics, family history of disease, diet,
smoking, physical activity, medications and other lifestyle factors. Every two years, study
participants receive questionnaires to update health status information and potential risk
factors.
In this paper, we consider data from the HPFS concerning the relationship between total
calcium intake and risk of fatal prostate cancer (Giovannucci et al., 2006). Total vitamin E
intake was the only important confounder with respect to this relationship. Nutrient intake,
including calcium and vitamin E, was assessed via a food frequency questionnaire (FFQ) in
1986, 1990, 1994, 1998 and 2002. After excluding men with a history of cancer at baseline,
or who did not adequately complete the 1986 dietary questionnaire, there were 390703
personyear observed in the main study with 357 fatal prostate cancer cases among 47760
subjects between 1986 to 2004. In our analysis, we used age as the time scale, as is more
suitable for epidemiologic studies of chronic disease (Korn, Graubard, and Midthune, 1997),
hence a lefttruncated analysis is implied here.
The FFQ measures dietary intake with some degree of error and more reliable information
can be obtained from a diet record (DR) (Willett, 1998). In the HPFS validation study, 2
weeks of weighed diet records (DR) were observed in 127 personyears among 127 study
participants. To increase the validation study sample size, we incorporated another dietary
validation study, the Eating at America’s Table Study (EATS) in our analysis (Subar et al.,
2001). EATS is a study that was designed to validate the Diet History Questionnaire (DHQ),
a food frequency questionnaire (FFQ) similar to the one used in HPFS. In EATS, the
exposures was validated by four telephoneadministered 24hour dietary recalls. We discuss
later the transportability of the EATS validation study to the HPFS cohort. The goal of the
analysis is to estimate the relationship between dietary calcium intake and prostate cancer
based on the HPFS main cohort data, using the validation data to correct for the
Liao et al.Page 3
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 4
measurement error associated with the dietary intake variables. Note that, in the context of
this study, calcium intake is a timedependent covariate. The variable of interest is
cumulatively updated average calcium intake, which is the average of all reported calcium
intake up to the present time.
3. Definition and preliminary results
The Cox model (Cox, 1972) for censored survival data specifies the hazard rate λ(t) for the
survival time T of an individual with sdimensional covariate vector ξ to have the form
(1)
where β is a svector of regression coefficients and λ0(t) is the underlying hazard function.
In the survival data setting with timeinvariant covariates, a main/external validation study
design consists of data {Ci, Wi, Ti, Di}, i = 1, … , n1 in the main study, and {ci, Ci, Wi, Ti},
i = n1 + 1, … , n in the validation study. Because data on the outcome, Di is not available in
the validation study, we call this an external validation study. Here, c is the p1vector of true
exposure which is subject to measurement error, and, in the main study, we observe a vector
of surrogate variables C instead. W is a p2vector of errorfree covariates . T is the followup
time, which is defined as the minimum of the potential failure time T0 and potential
censoring time V, i.e. T = min(T0, V, t*), where t* is the end of follow up; D is an indicator
for failure from the event of interest, n1 is the sample size of the main study, n2 is the sample
size of the validation study, and n = n1 + n2. Typically, c is expensive to measure, and hence
n1 >> n2. In what follows, we start by reviewing the ordinary regression calibration method
for several different error models.
Prentice (1982) shows that if
model holds in the perfectly measured covariates, if λ(tc, C, W) = λ(tc, W), i.e.
measurement error is nondifferential and if λ(tC, no censorship in [0, t)) = λ(tC), i.e. if
there is random censorship conditional on the observed main study data, then
, i.e. if the proportional hazards
(2)
where β1 and β2 are respectively p1vector and p2vector of regression coefficients
corresponding to c and W, and following Prentice (1982), T ≥ t can be dropped out when the
event is rare.
From (2), we see that the critical quantity is
dealing with this quantity: exact evaluation or approximation. Exact evaluation requires
assuming a model for the full distribution of (cC, W). Approximation can be carried out
using moment assumptions only. The simplest approximation involves modeling only the
conditional mean μc(C, W) = E(cC, W), and uses the firstorder approximation
. This approach leads naturally to imputing μc(C, W) for c
and running a standard Cox analysis. A more sophisticated approximation can be carried out
by introducing models for both the conditional mean μc(C, W) = E(cC, W) and the
conditional variance Σc(C, W) = Cov(cC, W). The approximation is given by
. There are two basic ways of
Liao et al.Page 4
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 5
(3)
which is obtained from a secondorder Taylor approximation to the cumulant generating
function of (cC, W). In the special case where (cC, W) is multivariate normal, the second
order approximation is exact (Prentice (1982)); however, the approximation can be used
even in the nonnormal case. The firstorder approximation is the approach most commonly
taken.
Equation (3) allows for a semiparametric error model (cC, W), where only the conditional
mean and covariance of (cC, W), rather than the whole distribution, needs to be specified.
For the ordinary regression calibration method, the multivariate results are similar to those
given for the logistic regression model in Rosner et al. (1990). For onedimensional β
without any errorfree covariates, when the disease is rare, or β is small, or if the
measurement error variance is small and constant, the ordinary regression calibration (ORC)
estimator is given by
et al., 1997), where β̂naive is the naive Cox regression estimate using the surrogate measure
C directly, and α̂1 is obtained in the validation study by fitting the linear regression model
given by E(cC) = α0 + α1C and Var(cC) = σ2.
(Spiegelman
4. Risk set regression calibration for timevarying exposures in a main
study/validation study design
The validity of the ordinary regression calibration method depends on the rare disease
assumption, i.e. when Pr(T ≥ t) ≈ 1, as shown in (2). Risk set regression calibration (RRC) is
an attempt to improve the estimator by recalibrating within each risk set (Xie et al., 2001).
Here, we consider the first order approximation of (2) as
(4)
Although T ≥ t is retained in (4), whenever Var(cC, W, T ≥ t) and higher order moments
are not constants or are not independent of time, an asymptotic bias is expected to be
incorporated due to the effect of the higher order moments. Xie et al. (2001) explored
analytically the asymptotic bias of the RRC estimate and derived the sandwich variance
estimator for the main/reliability study design, in which one or more additional
measurements are obtained from a random subsample of study subjects, under the
assumption that the classic additive homoscedastic error model is suitable for the data at
hand. Since the assumption of classical additive error in a timeinvariant exposure is rarely
suitable in nutritional and environmental epidemiology, it was necessary to extend the risk
set regression calibration method to timevarying exposures in the main study/external
validation study design, assuming a more general measurement error model.
With a timevarying point exposure, a main/external validation study design consists of data
{Ci(t), Wi(t), Ti, Di}, i = 1, …, n1 in the main study, and {ci(t), Ci(t), Wi(t), Ti}, i = n1 + 1,
… , n in the validation study, where the timevarying surrogate exposure Ci(t) is measured at
a discrete grid of time points and the true exposure ci(t) is also measured on certain
Liao et al.Page 5
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 6
occasions in the validation study. The gold standard ci(t) is usually measured much less
frequently than the surrogate Ci(t).
The exposure variables of interest in these studies are, as noted in the introduction, generally
some function of the timevarying point exposures. Denote the function of the timevarying
true exposure ci(t) as xi(t), and the function of the timevarying surrogate exposure Ci(t) as
Xi(t). Timevarying errorfree covariates, and functions of these errorfree covariates, are
denoted by Zi(t). Then, the main/external validation study data take the form {Xi(t), Zi(t), Ti,
Di}, i = 1, … , n1 for the main study, and {xi(t), Xi(t), Zi(t), Ti}, i = n1 + 1, … , n for the
validation study, and (4) becomes
The basic idea here is that the measurement error model
(5)
is reestimated from the validation study at each main study failure time, and the true
exposure is reestimated for everyone at risk. Then, β̂RRC will solve
(6)
where
and
Here,
1 if the subject i fails at the failure time t, Ni(dt) = 0 otherwise. Ym(i, tl) = I(Ti ≥ tl) is the risk
process indicator in the main study, the subscript ‘m’ indicates the main study, the subscript
‘v’ indicates the validation study. dim(α0(tl)) = p1, dim(α1(tl)) = p1 × p1, dim(α2(tl)) = p1 ×
p2, ψ(tl) = (α0(tl), α1(tl), α2(tl)), l = 1, … , r, and r is the number of unique failure times in
the main study. Next, we explain how to obtain estimates of α0(tl), α1(tl), α2(tl):
1.
Order the unique failure times that occur in the main study as t1 < t2 < … < tr. Find
the r risk sets Rm(tl), l = 1, 2, … , r, where Rm(tl) is the set of individuals in the
main study who are alive and uncensored at a time just prior to tl. Generate the risk
process indicator Ym(i, tl) so that Ym(i, tl) = 1 if i ∈ Rm(tl), and Ym(i, tl) = 0
otherwise.
, Ni(t) = I(Ti ≤ t, Di = 1) is the counting process corresponding to Ti. Ni(dt) =
2.
Find the r risk sets Rv(tl), l = 1, 2, … , r, in the validation study. Generate the risk
process indicator Yv(i, tl) = I(Ti ≥ tl), so that Yv(i, tl) = 1 if i ∈ Rv(tl), and Yv(i, tl) = 0
Liao et al.Page 6
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 7
otherwise. For any given t, let
values of Xi(t), xi(t) and Zi(t) prior to t. We then run, for each failure time tl, a
regression of
on the subjects in Rv(tl), obtaining α̂̂0(tl),
α̂̂1(tl), α̂̂2(tl).
be the most recent observed
3.
Estimate
risk set Rm(tl) in the main study, l = 1, 2, … , r.
Fit the “naive” Cox model on (x̂̂i(tl), Zi(tl), Ym(i, tl), Ti, Di) to get the risk set
regression calibration estimator β̂RRC.
for each subject i in the
4.
Nondifferential measurement error and random censorship are also required here to ensure
validity of estimation and inference, as in the original regression calibration estimate
discussed in Section 3. In addition, we assume that the measurement error model (5) that
holds in the validation study is applicable to the main study as well. This assumption is the
transportability assumption discussed by Carroll et al. (2006).
5. Var(β̂̂RRC) for the main/ external validation study
We write the score equation (6) as , where
(7)
with ψ̂ = (ψ̂(t1), ψ̂(t2), … , ψ̂(tr)), dim(ψ̂) = p1 × (p1+p2+1)r and dim(Uβi) = (p1+p2) × 1.
Define
Denote the covariance matrix of ψ̂ as Vψ̂̂, so dim(Vψ̂̂) = p1(p1 + p2 + 1)r × p1(p1 + p2 + 1)r.
, then dim(Û*) = (p1 + p2) × p1(p1 + p2+1)r.
In the validation study, ψ̂ solves
(8)
where is defined as follows
(9)
for j = 1, … , r, i = 1, … , n2, dim(Uα0i, j) = 1 × p1, dim(Uα1i, j) = p1 × p1, dim(Uα2i, j) = p2 ×
p1, dim(Uψi, j(ψ)) = p1 × (p1 + p2 + 1), Uψi = (Uψi, 1, Uψi, 2, … , Uψi, r) and dim(Uψi) = p1 ×
(p1 + p2 + 1)r.
Then, as shown in Web Appendix B, Cov(ψ̂) can be estimated by
constructed as
, with V̂̂ψ̂̂
Liao et al.Page 7
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 8
(10)
Following the derivation in Web Appendix B, we have
(11)
where
(12)
(13)
with
The asymptotic distribution theory proven in Web Appendix B follows arguments in
Andersen and Gill (1982) and Lin and Wei (1989); the reader is referred to Web Appendix
B for further details.
6. A simulation study under the main study/external validation study
We report finitesample simulation results for our proposed method under the main study /
external validation study design, following the algorithm in Sections 4 and 5. We consider a
single errorprone covariate c with surrogate C. All simulation results are based on 1000
replications.
We consider two event rate scenarios: rare disease and common disease. Motivated by our
real data set, the Health Professional Followup Study, we set the number of events to be
around 500. For the rare disease situation, we set the cumulative event rate at 1%, and thus
the main study sample size is set at n1 = 50; 000. For the common disease situation, we set
the cumulative event rate at 50%, and thus the main study sample size is set at n1=1000. The
cases we considered for the validation study size were n2 = 150 and n2 = 500. A validation
Liao et al. Page 8
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 9
sample size of n2 = 150 is very common in nutritional epidemiology studies. The case of n2
= 500 is less common, but does arise in some applications, particularly when two or more
validation studies can be combined, as in the example in Section 2.
We take the baseline hazard function to be of the Weibull form λ0(t) = θν(νt)θ−1, with θ = 6,
which is typical of many epithelial cancers (Armitage and Doll (1961); Breslow and Day
(1993)). Censoring is assumed exponential with a rate of 1% per year. The maximum
followup time is taken to be t* =50 years. The parameter ν is set to achieve the specified
cumulative event rate.
In preliminary simulations, we examined the case where the covariate is timeinvariant, and
compared the ORC method to the RRC method. These results are summarized in Web Table
1 of Web Appendix A.1. The parameter which describes the extent of measurement error, ρ
= Corr(c, C) was varied across the values 0.3, 0.6, 0.9. The key parameter in the simulation
comparisons is η = β2Var(cC) = (1 − ρ2)β2σ2, where β is the regression coefficient and σ2 is
the variance of the true exposure. When η is small, the ORC method was adequate, but when
η is large, the RRC method was clearly superior. These findings are consistent with those of
Xie et al. (2001).
We turn now to the timevarying exposure situation. In line with the motivating example in
Section 2, we worked with the cumulatively updated average exposure (Hu et al., 1999). The
true and surrogate exposures, x(t) and X(t), were defined as follows:
(14)
for t ∈ [tk, tk+1) with 1 ≤ k ≤ p, where the set {t0, t1, t2, …, tp} are the times at which ci(t) is
measured. Ci(t) can be measured on the same time scale, or, as is typically the case in
applications, Ci(t) is measured on a much finer time scale. In the simulation study below, we
used the same time scale for ci(t) and Ci(t).
The true and surrogate cumulatively updated average exposures were generated as follows:
1.
The true exposure ci ~ MV N(μc, Σc), where ci is a p–vector with p as the number of
the observation time points, μc is the mean vector and Σc is a covariance matrix.
Without loss of generality, we consider a simple case with μc = 0 and Σc such that
Σc(j, k) = 1 if j = k and if j ≠ k, with τ = 0 or 1. When τ = 0, a
compound symmetry covariance structure is obtained, and the intraclass
correlation ρICS was set at 0:3, 0:6, 0:9. When τ = 1, an AR(1) structure is obtained.
To put these two covariance scenarios on an equal footing, we set the average
correlation of ρIAR over [0; t*] equal to ρICS, i.e.,
(15)
Solving this equation, we obtained the corresponding values of ρIAR as 0.938,
0.978, 0.996.
2.
Cij = cij+eij, eij ~ N(0, Δ), with the measurement error variance Δ given by
, where ρ is the correlation between Cij and cij and varied as 0:3, 0:6, 0:9.
Liao et al. Page 9
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 10
3.
A cumulative average exposure x(t) and X(t) was then generated by
for 1 ≤ k ≤ p, over the time points {t0, t1, t2, … , tp}. In applications, time could be
expressed in terms of participant’s age, time on the study, calendar year, or in some
other appropriate manner. In our simulations, for simplicity, we set tj = 5 * j, for j =
0, 1, … , 10. Thus, there were 10 exposure measurements in total over the study
period of t* = 50 years. This scheme was patterned after the measurement schedule
in our motivating example, HPFS, and the other Harvard cohort studies.
The survival data were generated according to the hazard model λ(tx(t)) = λ0(t) exp(βx(t)),
where x(t) is as defined in (14). Web Appendix A.2 presents details of the data generation.
Since survival is usually measured on a finer time scale than that defined by the exposure
measurement schedule, we generated the survival time on a finer time scale, based on tj = j,
for j = 0, 1, … , 50.
Table 1 presents results for the compound symmetry structure with validation sample size n2
= 150. More results with different validation sample sizes are presented in Web Table 2 of
Web Appendix A.3. The results for the AR(1) structure, which were similar, are presented in
Web Table 3 of Web Appendix A.3. Overall, the RRC estimator performed very well in this
timevarying cumulatively updated average exposure setting, with the good performance
being robust across different levels of the autocorrelation in the true exposure process and
the different correlation structures. The bias was very small for all scenarios and became
even smaller as the autocorrelation became higher. An increased bias was seen with the
AR(1) correlation structure when the autocorrelation was low, consistent with the fact that
withinperson variability increases more quickly with time for the AR(1) structure than the
CS structure. However, the worst bias seen was only 5%. The coverage probability was
nearly accurate in all cases considered. Increasing the frequency of exposure measurements
improved the results even more (data not shown), although the bias was already minimal
with the exposure frequencies investigated here.
7. Application of the method
We illustrate our method in an analysis of the HPFS study described in Section 2. Recall that
the goal is to estimate the relationship between dietary calcium intake and prostate cancer,
adjusting for the confounding effect of vitamin E intake. Calcium and vitamin E intake were
assessed in the full cohort every four years using a food frequency questionnaire (FFQ). The
FFQ is known to be errorprone, so we use data from two validation studies where the
nutrients were measured using both the FFQ and the more accurate diet record (DR): the
HPFS validation study and the EATS validation study.
As a preliminary step, we checked the transportability of the EATS data to the HPFS
population. To this end, we ran a regression analysis of DR on FFQ for each study, adjusting
for age in 5year age groups. The slope for total calcium intake was 0.445 in EATS and
0.371 in HPFS. For total vitamin E, the slope was 0.818 in EATS and 0.762 in HPFS. Given
the similarity of these estimates, the transportability assumption appeared reasonable. The
rightmost major column of Table 2 shows the basic characteristics of the HPFS main study,
so they can be compared to the characteristics of the two validation studies.
Liao et al.Page 10
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 11
We then carried out a further empirical check of the transportability assumption. We defined
a binary indicator Studyind with value 1 if the study was HPFS and 0 if the study was EATS,
and then ran the regression model
(16)
and similarly for vitamin E. To test the hypothesis that there is no betweenstudies variation
in the slope, i.e. γ2 = 0 by age group, we fit regression models by 5year age groups to the
combined validation study, using model (16). We found that the null hypothesis was
accepted for most age groups, except in age group 56 – 60, the p–value for the total calcium
intake by study interaction was less than 0.001, and in age group older than 71, the p–value
for the total vitamin E intake by study interaction was less than 0.03. We tailored our
analyses accordingly: in analyses where both total calcium intake and total vitamin E intake
were regarded as errorprone exposures, we excluded the EATS study participants from the
risk sets in those ages; in analyses where only the total calcium intake was regarding as
being measured with error, we kept the EATS study participants in the risk set of age 71 and
older, but we excluded the EATS study subjects from the age of 56 – 60 risk sets.
After completing this preliminary step, we proceeded to the main analysis. Table 3 gives the
results of the RRC method for timevarying exposures developed in this paper compared to
the naive Cox approach. The relative risks are given in units of 1838 mg/day for total
calcium intake to facilitate comparison of the result reported in the original publication of
these data (Giovannucci et al., 2006), where 1838 mg/day was the difference between the
median of the top quintile and the median of bottom quintile. We performed the analysis
using only the HPFS validation study and using both HPFS and EATS validation studies.
Because the HPFS validation study had 127 personyear observations for 127 subjects, we
grouped the risk sets by 4year age interval to stabilize the estimates. This was not necessary
when we included both validation studies. The results show that with more data in the
validation study, the variance of the estimates is substantially lower. Moreover, with the
added validation data, the deattenuation effect is more pronounced, producing estimates in
better accord with prior expectations. From the results including both HPFS and EATS
validation studies, we can see that the underestimation of the effect of calcium intake on
fatal prostate cancer from the naive Cox approach was corrected by the RRC estimate. Total
calcium intake had a significant positive association with fatal prostate cancer and the total
vitamin E intake was weakly inversely associated, perhaps not associated at all.
8. Discussion
In this paper, we have considered the Cox survival regression model with timedependent
covariates subject to measurement error. We derived a biascorrected point and interval
estimate of the relative risk using a RRC approach. We emphasized the main study/external
validation study design, which arises commonly in nutritional epidemiology, but has been
given less attention in the literature than other designs, particularly in the setting of time
varying covariates. We focused on cumulatively updated total or average exposure, which is
often of interest in nutritional and environmental epidemiology, but the approach could also
be applied to analysis of timevarying point exposures. In addition, the approach extends in
a straightforward way to cover left truncation. A FORTRAN program for the method is
available at http://www.hsph.harvard.edu/faculty/spiegelman/rrc_timevarying_method.f.
Simulation studies in the setting of cumulatively updated average exposure showed that the
method performs very well.
Liao et al.Page 11
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 12
Clearly, the effectiveness of the method depends on the size of the validation study. The
analysis of the HPFS data in Section 7 illustrates this point. When we used only the data
from the 127 participants in the HPFS validation study, and with vitamin E assumed to be
measured without error, a very wide confidence interval for the effect of calcium intake was
obtained, leading to results that were not very meaningful. When we included validation
data data from the 573 men in the EATS study, the results were much more reasonable, and
significant pvalues for both calcium intake and vitamin E were obtained, even when both
these nutrients were regarded as subject to with error.
To date, most validation studies available in nutritional epidemiology have only one
measurement per subject. Only the Nurses Health Study has validation data repeatedly
assessed more than once, and this is among a small number of subjects. The limited
validation data makes it challenging to implement measurement error correction for time
varying exposures. However, we understand that new validation studies are underway,
including one within our own group at the Nurses Health Study, involving a much larger
number of subjects and repeated exposure assessments over time. Such expanded validation
studies will provide a firmer basis of measurement error correction, and the method we have
developed here provides a practical and reliable way for carrying out such correction.
In summary, measurement error in timevarying covariates, including those that are
functions of a series of exposure measurements available up to a failure time, is an
extremely common problem in nutritional and environmental epidemiology. The method
developed in this paper provides a mechanism for handling this problem that is wellsuited
to these applications.
Supplementary Material
Refer to Web version on PubMed Central for supplementary material.
Acknowledgments
We acknowledge the funding support of NIH R01 CA 50597 in performing this work. We thank Walter Willett, the
principal investigator for the HPFS program project (NIH/NCI grant CA 55075), for sharing the data which
motivated this analysis. We also thank Amy F. Subar and her colleagues from the NCI for sharing the EATS study
data with us.
References
Andersen PK, Gill RD. Cox’s regression model for counting process: a large sample study. The Annals
of Statistics. 1982; 10:1100–1120.
Armitage, P.; Doll, R. Stochastic models for carcinogenesis. In: Neyman, J., editor. Proceedings of the
Fourth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley: University of
California Press; 1961. p. 1938.
Brauer M, Hoek G, van Vliet P, Meliefste K, Fischer P, Gehring U, Heinrich J, Cyrys J, Bellander T,
Lewne M, Brunekreef B. Estimating longterm average particulate air pollution concentrations:
Application of traffic indicators and geographic information systems. Epidemiology. 2003; 14:228–
239. [PubMed: 12606891]
Breslow, NE.; Day, NE. World Health Organization. Statistical Methods in Cancer Research. 1993.
Carroll, RJ.; Ruppert, D.; Stefanski, LA.; Crainiceanu, C. Measurement error in nonlinear models: A
Modern Perspective. Second Edition. Chapman & Hall; 2006.
Cox D. Regression models and life tables(with discussion). Journal of the Royal Statistical Society,
Series B. 1972; 34:187–220.
Giovannucci E, Liu Y, Stampfer MJ, Willett WC. A prospective study of calcium intake and incident
and fatal prostate cancer. Cancer Epidemiology Biomarkers & Prevention. 2006; 15:203–210.
Liao et al. Page 12
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 13
Hu FB, Stampfer MJ, Rimm E, Ascherio A, Rosner BA, Spiegelman D, Willett WC. Dietary fat and
coronary heart disease: a comparison of approaches to adjusting for total energy intake and
modeling repeated dietary measurements. American Journal of Epidemiology. 1999; 149:531–540.
[PubMed: 10084242]
Huang Y, Wang CY. Cox regression with accurate covariates unascertainable: a nonparametric
correction approach. 2000; 45:1209–1219.
Hunter DJ, Spiegelman D, Adami HO, Beeson L, van den Brandt PA, Folsom AR, Fraser GE,
Goldbohm RA, Graham S, Howe GR, Kushi LH, Marshall JR, McDermott A, Miller AB, Speizer
FE, Wolk A, Yaun SS, Willett W. Cohort studies of fat intake and risk of breast cancer: a pooled
analysis. New England Journal of Medicine. 1996; 334:356–361. [PubMed: 8538706]
Korn EL, Graubard BI, Midthune D. Timetoevent analysis of longitudinal followup of a survey:
Choice of the timescale. American Journal of Epidemiology. 1997; 145:72–80. [PubMed:
8982025]
Lin D, Wei L. The robust inference for the cox proportional hazards model. Journal of the American
Statistical Association. 1989; 84:1074–1078.
Prentice R. Covariate measurement errors and parameter estimation in a failure time regression model.
Biometrika. 1982; 69:331–342.
Raboud J, Reid N, Coates R, Farewell V. Estimating risks of progressing to aids when covariates are
measured with error. Journal of the Royal Statistical Society, Series A. 1993; 156:393–406.
Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and
confidence intervals for measurement error: the case of multiple covariates measured with error.
American Journal of Epidemiology. 1990; 132:734–745. [PubMed: 2403114]
Spiegelman D, McDermott A, Rosner B. Regression calibration method for correcting measurement
error bias in nutritional epidemiology. American Journal of Clinical Nutrition. 1997; 65 suppl:
1179S–1186S. [PubMed: 9094918]
Steenland, K.; Savitz, DA. Topics in Environmental Epidemiology. New York: Oxford University
Press; 1997.
Subar A, Thompson F, Kipnis V, Midthune D, Hurwitz P, McNutt S, McIntosh A, Rosenfeld S.
Comparative validation of the block, willett, and national cancer institute food frequency
questionnaires: The eating at america’s table study. American Journal of Epidemiology. 2001;
154:1089–1099. [PubMed: 11744511]
Thomas, DC. Statistical Methods in Environmental Epidemiology. New York: Oxford University
Press; 2009.
Tsiatis A, Davidian M. Joint modeling of longitudinal and timetoevent data: an overview. Statistica
Sinica. 2004; 14:809–834.
Willett, WC. Nutritional Epidemiology. 2nd Ed. New York: Oxford University Press; 1998.
Willett WC, Sampson L, Stampfer MJ, Rosner BA, Bain C, Witschi J, Hennekens CH, Speizer FE.
Reproducibility and validity of a semiquantitative food frequency questionnaire. American Journal
of Epidemiology. 1985; 122:51–65. [PubMed: 4014201]
Xie SX, Wang C, Prentice R. A risk set calibration method for failure time regression by using a
covariate reliability sample. Journal of the Royal Statistical Society, Series B. 2001; 63:855–870.
Zanobetti A, Wand MP, Schwartz J, Ryan LM. Generalized additive distributed lag models:
quantifying mortality displacement. Biostatistics. 2000; 1:279–292. [PubMed: 12933509]
Zucker D. A pseudo partial likelihood method for semiparametric survival regression with covariate
errors. Journal of the American Statistical Association. 2005; 100:1264–1277.
Liao et al.Page 13
Biometrics. Author manuscript; available in PMC 2011 March 17.
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Page 14
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Liao et al.Page 14
Table 1
Results for simulation of cumulatively updated average exposure with a compound symmetry covariance structure, for different intraclass correlation
ρICS.
Estimated β̂̂(SÊ[β̂̂])
Percent Bias(%)
95% CI Coverage(%)
ρICS
ρ
Naive
RRC
Naive
RRC
Naive
RRC
n1 = 50000,
n2 = 150,
Rare disease
0.3
0.3
0.117(0.036)
0.502(0.179)
−76.6
0.4
0.0
95.6
0.6
0.318(0.058)
0.500(0.098)
−36.3
0.1
12.8
94.1
0.9
0.464(0.070)
0.499(0.077)
−7.3
−0.3
91.2
95.6
0.6
0.3
0.172(0.032)
0.509(0.118)
−65.6
1.7
0.0
95.0
0.6
0.373(0.048)
0.498(0.069)
−25.4
−0.4
24.5
95.6
0.9
0.474(0.054)
0.495(0.057)
−5.1
−0.9
92.6
94.9
0.9
0.3
0.212(0.030)
0.502(0.090)
−57.5
0.3
0.0
94.2
0.6
0.405(0.041)
0.503(0.056)
−19.1
0.6
37.4
95.0
0.9
0.486(0.045)
0.501(0.047)
−2.9
0.2
92.8
94.5
n1 = 1000,
n2 = 150,
Common disease
0.3
0.3
0.105(0.035)
0.492(0.193)
−79.1
−1.5
0.0
94.5
0.6
0.293(0.058)
0.490(0.103)
−41.3
−2.1
4.7
94.4
0.9
0.438(0.071)
0.490(0.078)
−12.5
−2.0
87.3
95.6
0.6
0.3
0.153(0.033)
0.509(0.135)
−69.4
1.8
0.0
94.1
0.6
0.352(0.049)
0.503(0.076)
−29.6
0.7
14.3
94.8
0.9
0.457(0.057)
0.498(0.061)
−8.5
−0.5
88.2
95.1
0.9
0.3
0.185(0.031)
0.502(0.107)
−62.9
0.3
0.0
94.3
0.6
0.380(0.044)
0.504(0.063)
−24.1
0.8
22.0
94.1
0.9
0.466(0.049)
0.501(0.052)
−6.8
0.1
89.8
94.6
True β = 0.5, the study duration t* = 50, the number of simulation replications B = 1000.
In the rare disease situation, the cumulative incidence is about 1% with n1 = 50000.
In the common disease situation, the cumulative incidence is about 50% with n1 = 1000.
Biometrics. Author manuscript; available in PMC 2011 March 17.
Page 15
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Liao et al.Page 15
Table 2
Basic characteristics of the study population.
Variable
Exposure
type
Validation study
Main study
HPFS (n21 = 127, Age:(39 – 75))
EATS (n22 = 446, Age:(21 – 76))
HPFS(n1 = 47760)
Mean(s.d.)
Range
Correlation
Mean(s.d.)
Range
Correlation
Mean(s.d.)
Range
Total calciuma
DR
0.43(0.12)
(0.24, 0.87)
0.46(0.16)
(0.18, 1.02)


(1838 mg/day)
FFQ
0.47(0.17)
(0.25, 1.17)
0.51
0.53(0.22)
(0.14, 1.54)
0.61
0.50(0.21)
(0.07,3.96)
Total vitamin Eb
DR
1.32(0.42)
(0.85, 2.72)
1.36(0.56)
(0.57, 3.03)


(mg/day)
FFQ
1.20(0.49)
(0.71, 2.97)
0.89
1.33(0.53)
(0.33, 2.87)
0.79
1.42(0.54)
(0.48,3.02)
Biometrics. Author manuscript; available in PMC 2011 March 17.
Page 16
NIHPA Author Manuscript
NIHPA Author Manuscript
NIHPA Author Manuscript
Liao et al.Page 16
Table 3
Estimated coefficients and standard errors for HPFS of the relationship between total calcium intake and the fatal prostate cancer incidence.
Total Calcium intakea
Total Vitamin E intakeb
Method
Validation study
Estimate(S.E.)
HRe(95% C.I.)
pvalue
Estimate(S.E.)
HRe(95% C.I.)
pvalue
Uncorrected
N/A
0.587(0.239)
1.80[1.13,2.87]
0.014
−0.274(0.103)
0.76[0.62,0.93]
0.008
RRCc
HPFS
0.623(0.653)
1.87[0.52,6.71]
0.340
−0.342(0.160)
0.71[0.52,0.97]
0.033
HPFS + EATS
1.359(0.500)
3.89[1.46,10.37]
0.007
−0.374(0.120)
0.69[0.54,0.87]
0.002
RRCd
HPFS
0.150(0.766)
1.16[0.26,5.22]
0.845
−0.203(0.447)
0.82[0.34,1.96]
0.650
HPFS + EATS
0.734(0.371)
2.08[1.01,4.31]
0.048
−0.370(0.201)
0.69[0.47,1.02]
0.065
aTotal calcium intake is energy adjusted and the unit is 1838 mg/day.
bTotal vitamin E intake is energy adjusted and log10 transformed.
cAssuming total calcium intake is errorprone, but total vitamin E intake is errorfree.
dAssuming both total calcium intake and total vitamin E intake are errorprone.
eHR is the hazard ratio eβ.
Biometrics. Author manuscript; available in PMC 2011 March 17.