A Review on Joint Models in Biometrical Research
ABSTRACT In some fields of biometrical research, joint modelling of longitudinal measures and event time data has become very popular. This article reviews the work in this fruitful area of recent research by classifying approaches to joint models into three categories: approaches with focus on serial trends, approaches with focus on event time data and approaches with equal focus on both outcomes. Typically, longitudinal measures and event time data are modelled jointly by introducing shared random effects or by considering conditional distributions together with marginal distributions. We present the approaches in a uniform nomenclature, comment on sub-models applied to longitudinal measures and event time data outcomes individually and exemplify applications in biometrical research.
Neuhaus, Augustin, Heumann, Daumer:
A Review on Joint Models in Biometrical Research
Sonderforschungsbereich 386, Paper 506 (2006)
Available online at: http://epub.ub.uni-muenchen.de/
University of Munich
Discussion Paper 506 - SFB 386
A Review on Joint Models in Biometrical Research
A. Neuhaus†1, T. Augustin‡, C. Heumann‡, M. Daumer†
†Sylvia Lawry Centre for MS Research, Hohenlindenerstr. 1, D-81677 Munich
‡Department of Statistics, Ludwigstr. 33, D-80539 Munich
Key words: Joint model, shared random effects, mixed effects model, survival analysis.
Clinical trials and epidemiological studies often collect more than one outcome for each
subject. In addition to the outcome for which the study was primarily initiated, secondary
and tertiary outcomes are collected during an investigation. These data are often time-to-event
data or repeated measurements. Various approaches have been described in past
decades to handle repeated measurements and survival data separately, but when both
outcomes are collected on the same subject, such separate modelling does not consider
dependencies between the two types of responses. A powerful method to overcome this
problem is joint modelling of survival and repeated measurements. Well-known examples
in which repeated measurements and event time data are generated are studies in the field of
the acquired immunodeficiency syndrome (e.g. Tsiatis et al., 1995). In these studies disease
markers (e.g. viral load or CD4 counts) are measured repeatedly and disease specific events
(e.g. seroconversion or death) are documented at the same time. A joint examination of
the longitudinal process of such disease markers and the time to event is possible using
joint model approaches. This example also indicates that the issue of surrogate markers is a
natural area for the application of joint models, since the course of the longitudinal disease
marker process might serve as surrogate for the event.
This article classifies approaches to joint modelling, which are scattered across the literature, into the
categories ‘focus on serial trend’, ‘focus on event times’ and ‘equal focus on both outcomes’.
We concentrate on methodological aspects of joint model approaches; details of the
estimation procedures are noted only in passing.
Throughout the paper, we assume that $k$ subjects are observed, each with a possibly different
visit schedule, i.e. at different time points $t_{i1} := 0, t_{i2}, \ldots, t_{in_i}$. Thus altogether $n_i$
individual observations are collected for the $i$th subject ($i = 1, \ldots, k$). In addition to the
outcome measured longitudinally, the time to a specific event is recorded.
We use the following notation for the $i$th subject to harmonise diverse approaches:
$y_i$ : ($n_i \times 1$) vector containing longitudinal observations
$t_i$ : ($n_i \times 1$) vector with corresponding observation time points, with $t_{i1} := 0$
$\epsilon_i$ : ($n_i \times 1$) vector of errors independent from $y_i$
$X_i$ : ($n_i \times p$) matrix of possibly time-varying covariates, with $X_i[t]$ the corresponding step function, which is $x_{ij}$ if $t_{ij} \le t < t_{i(j+1)}$ for $j = 1, \ldots, n_i - 1$ and $x_{in_i}$ if $t_{in_i} \le t$
$\beta$ : ($p \times 1$) fixed effects corresponding to $X = (X_1^\top, \ldots, X_k^\top)^\top$
$Z_i$ : ($n_i \times q$) matrix of covariates, with $Z_i[t]$ defined in the same way as $X_i[t]$
$b_i$ : ($q \times 1$) random effects corresponding to $Z_i$
$\tau_i$ : event time of survival outcome
$\delta_i$ : censoring indicator with $\delta_i = I(\tau_i \le c_i)$, where $c_i$ is the censoring time
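The step function $X_i[t]$ defined in the notation above can be illustrated with a short sketch. The function name `step_covariate` and the example values are hypothetical and serve only to show the lookup rule: the covariate row observed at the last visit at or before $t$ is carried forward.

```python
import numpy as np

def step_covariate(t, times, X):
    """Return X_i[t]: the covariate row x_ij with t_ij <= t < t_i(j+1),
    or the last row x_in_i if t >= t_in_i."""
    times = np.asarray(times, dtype=float)
    X = np.asarray(X, dtype=float)
    if t < times[0]:
        raise ValueError("t lies before the first observation time")
    # Number of visit times <= t, minus one, is the zero-based row index.
    j = np.searchsorted(times, t, side="right") - 1
    return X[j]

# Hypothetical subject with three visits and two covariates per visit.
times_i = [0.0, 1.0, 2.5]
X_i = [[1.0, 10.0], [1.0, 12.0], [1.0, 15.0]]
row_at_2_4 = step_covariate(2.4, times_i, X_i)  # row of the visit at t = 1.0
```

The value at a visit time itself is the value measured at that visit, since the intervals are closed on the left.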
Ideally, the complete longitudinal process $y_i$ is known and measurement times are
non-informative. The latter means that the $t_{ij}$ are not affected by the trend or values of $y_i$.
Therefore the $t_i$ might differ in length and schedule between subjects. If the $t_i$ are
identical for all subjects, panel data are available and corresponding models can be applied.
Two approaches are available to arrive at a joint distribution for repeated measures $y_i$ and
survival outcome $\tau_i$: (1) the introduction of shared random effects and (2) the use of mixture
and selection models. In the first approach random effects $b_i$ are used to connect $y_i$ and $\tau_i$.
Conditioning on these random effects provides assumed independence of $y_i$ and $\tau_i$. That is
$$f(y_i, \tau_i \mid b_i) = f(y_i \mid b_i)\, f(\tau_i \mid b_i).$$    (1)
Assuming a certain distribution $f(b_i)$ for the random effects gives the joint distribution
$f(y, \tau)$ for $k$ independent subjects:
$$f(y, \tau) = \prod_{i=1}^{k} \int f(y_i \mid b_i)\, f(\tau_i \mid b_i)\, f(b_i)\, db_i.$$    (2)
Unless explicitly stated otherwise, we rely on the common assumption $b_i \sim N(0, \Sigma_b)$.
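The integral over the random effects in the shared random effects construction can be made concrete with a minimal numerical sketch. The sub-models here are illustrative assumptions, not the specification of any particular paper: a scalar random intercept $b_i \sim N(0, \sigma_b^2)$, normal measurement errors, and an exponential event time with hazard $\lambda \exp(\gamma b_i)$. The function name `joint_density` and all parameter names are hypothetical; the integral is approximated by Gauss-Hermite quadrature.

```python
import numpy as np

def joint_density(y, tau, beta0, gamma, lam, sigma_b, sigma_e, n_nodes=40):
    """Approximate f(y_i, tau_i) = int f(y_i|b) f(tau_i|b) f(b) db
    for a scalar shared random intercept b ~ N(0, sigma_b^2)."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma_b * x  # change of variables for the N(0, sigma_b^2) weight
    # f(y_i | b): independent normal measurements around beta0 + b (assumed sub-model)
    resid = np.asarray(y, dtype=float)[None, :] - (beta0 + b[:, None])
    f_y = np.prod(
        np.exp(-0.5 * (resid / sigma_e) ** 2) / (sigma_e * np.sqrt(2.0 * np.pi)),
        axis=1,
    )
    # f(tau_i | b): exponential event time with hazard lam * exp(gamma * b) (assumed sub-model)
    haz = lam * np.exp(gamma * b)
    f_tau = haz * np.exp(-haz * tau)
    return np.sum(w * f_y * f_tau) / np.sqrt(np.pi)

# With gamma = 0 the random effect no longer links the outcomes,
# so the joint density is the product of the two marginal densities.
value = joint_density(np.array([0.3]), tau=1.2, beta0=0.1, gamma=0.0,
                      lam=0.5, sigma_b=1.0, sigma_e=1.0)
```

For $\gamma \neq 0$ the two outcomes are dependent marginally although they are independent given $b_i$, which is the mechanism the shared random effects approach relies on.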
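The second approach to arrive at the joint distribution is based on a factorisation of the joint
distribution of $y$ and $\tau$ into conditional and marginal distributions. That is
$$f(y, \tau) = f(y \mid \tau)\, f(\tau)$$    (3)
or
$$f(y, \tau) = f(\tau \mid y)\, f(y).$$    (4)
These models are known as mixture and selection models, respectively (Little, 1993).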
In both approaches joint distributions are constructed on the basis of sub-models for the
longitudinal process and the survival outcome. A variety of models can be fitted to both
outcomes. Longitudinal measurements are most easily described by a linear mixed effects model
$y_i = X_i\beta + Z_i b_i + \epsilon_i$. Possible extensions allow for more complex relationships, for example
polynomial specifications of $X_i$ and $Z_i$ or any functions $f(X_i, \beta)$ and $f(Z_i, b_i)$.
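As a small illustration of the linear mixed effects sub-model, the following sketch draws one subject's measurements from $y_i = X_i\beta + Z_i b_i + \epsilon_i$ with $b_i \sim N(0, \Sigma_b)$. The function name `simulate_lmm`, the independent normal errors and all numerical values are assumptions made for the example.

```python
import numpy as np

def simulate_lmm(X, Z, beta, Sigma_b, sigma_e, rng):
    """Draw y_i = X_i beta + Z_i b_i + eps_i for one subject.
    Returns the simulated measurements and the drawn random effects."""
    b = rng.multivariate_normal(np.zeros(Z.shape[1]), Sigma_b)
    eps = rng.normal(0.0, sigma_e, size=X.shape[0])  # assumed i.i.d. normal errors
    return X @ beta + Z @ b + eps, b

rng = np.random.default_rng(1)
t_i = np.array([0.0, 0.5, 1.0, 2.0])           # visit times, t_i1 = 0
X_i = np.column_stack([np.ones_like(t_i), t_i])  # intercept and linear time trend
Z_i = X_i                                        # random intercept and random slope
beta = np.array([5.0, -0.4])
Sigma_b = np.array([[1.0, 0.1], [0.1, 0.25]])
y_i, b_i = simulate_lmm(X_i, Z_i, beta, Sigma_b, sigma_e=0.5, rng=rng)
```

Here $Z_i = X_i$ gives every subject its own intercept and slope; a polynomial extension would simply add columns such as $t^2$ to $X_i$ and $Z_i$.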
In general, survival models are constructed within the class of multiplicative hazards models
and are built without random effects. The hazard rate $\lambda_i$, conditioned on covariates at time
$t$, has the form:
$$\lambda_i(t \mid X_i[t]) = \lambda_0(t)\, c(X_i[t]^\top \beta).$$    (5)
The best-known representatives are the Cox proportional hazards model and the Weibull
model. In both models $c(.)$ is specified as $\exp(.)$. Whereas the baseline hazard rate $\lambda_0(t)$ is
left completely unspecified in the Cox model, it is taken as $\alpha\mu t^{\alpha-1}$ in the Weibull model (e.g.
Klein & Moeschberger, 2003). The latter model is also a representative of another model class
used for specification of the survival outcome within joint models, the accelerated failure
time (AFT) models, in which covariates are assumed to have a linear influence on $\log(\tau_i)$,
that is $\log(\tau_i) = X_i[t]^\top\beta + e_i$, with $e_i$ the error. The distribution of the error specifies the
model (e.g. the extreme value distribution, which leads to the Weibull regression model). The
conditional hazard rate of an AFT model has the form
$$\lambda_i(t \mid X_i[t]) = \lambda_0\big(t \exp(X_i[t]^\top\beta)\big) \exp(X_i[t]^\top\beta)$$    (6)
where $\lambda_0$ is specified by the error $e_i$ as mentioned above. Extensions to semiparametric
approaches are possible by leaving $\lambda_0$ unspecified (Lin & Zhiliang, 1995).
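That the Weibull model belongs to both model classes can be checked numerically. The sketch below assumes a single time-constant covariate and takes the sign convention of (6) as given; the function names and parameter values are hypothetical. With baseline hazard $\alpha\mu t^{\alpha-1}$, inserting it into (6) gives $\alpha\mu t^{\alpha-1}\exp(\alpha x \beta)$, so the AFT form coincides with the multiplicative hazards form when the latter uses the coefficient $\alpha\beta$.

```python
import numpy as np

def weibull_baseline(t, alpha, mu):
    """Weibull baseline hazard alpha * mu * t^(alpha - 1)."""
    return alpha * mu * t ** (alpha - 1.0)

def ph_hazard(t, x, beta_ph, alpha, mu):
    """Multiplicative hazards form: lambda_0(t) * exp(x * beta)."""
    return weibull_baseline(t, alpha, mu) * np.exp(x * beta_ph)

def aft_hazard(t, x, beta_aft, alpha, mu):
    """AFT form as written in (6): lambda_0(t * exp(x * beta)) * exp(x * beta)."""
    scale = np.exp(x * beta_aft)
    return weibull_baseline(t * scale, alpha, mu) * scale

t = np.linspace(0.1, 5.0, 50)
alpha, mu, x, beta_aft = 1.5, 0.2, 0.8, 0.4
# The two forms agree when the proportional hazards coefficient is alpha * beta_aft.
same = np.allclose(ph_hazard(t, x, alpha * beta_aft, alpha, mu),
                   aft_hazard(t, x, beta_aft, alpha, mu))
```

The rescaling of the coefficient by $\alpha$ is why effect estimates from the two parameterisations of a Weibull regression are not directly comparable without conversion.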
Subsequent sections of this paper are organised as follows: In Section 2, we describe joint
model approaches with focus on serial trends, in which the pattern of repeated measurements
given a survival outcome is of main concern. Models with focus on event times
are described in Section 3. These models specify how the longitudinal measurements affect
survival outcomes. Section 4 reviews approaches that jointly focus on serial trends and event
times. Within these approaches covariate effects on longitudinal measurements and survival
outcome are jointly estimated. In Section 5, we give examples in which joint models have
been applied in practice. A brief look at further approaches and extensions is presented in
2 Joint models with focus on serial trends
Models with primary focus on serial trends are applied when the description of repeated
measurements is of main concern. Informative dropouts or events are considered within these
models to avoid biased estimates for the longitudinal process. We describe two approaches
to handle such data, one based on a shared random effects model and the other based on a
mixture model.
The approach introduced by Vonesh et al. (2006) is based on shared random effects; the joint
density is factorised as in (2). Within this factorisation the class of generalised non-linear
mixed-effects models is assumed for the sub-model $y_i \mid b_i$.
The conditional distribution of $\tau_i \mid b_i$ is modelled using multiplicative hazards models that
include subject-specific intercepts and time trends via a function $g_i(.)$ that may also depend
on fixed effects,
$$\lambda_i(t \mid b_i) = \lambda_0(t) \exp[g_i(\beta, b_i, X_i[t], Z_i[t])].$$    (7)
Vonesh et al. specify $\lambda_0(t)$ in two ways, similar to a Weibull or a piecewise exponential model.
In the latter case the time scale is partitioned into disjoint, exogenously given intervals. The
baseline hazard rate is assumed to be constant within each interval but may vary from
interval to interval. That is, for $p$ disjoint intervals,
$$\lambda_0(t) = \sum_{h=1}^{p} \lambda_{0h}\, I(t \in (t_{h-1}, t_h]).$$
Replacing $g_i(.)$ by $g_{ih}(.)$ in (7) additionally allows the inclusion of time-dependent covariates
in the model.
The unknown parameters are estimated via the likelihood, which integrates over the random
effects $b_i$ and involves the conditional survivor function $S(\cdot \mid b_i)$. Since the integral has
no closed form solution, numerical integration or alternatively the Laplace approximation is needed.
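The piecewise exponential specification of the baseline hazard can be sketched as follows. The function names, cut points and hazard levels are hypothetical; the cumulative baseline hazard is included because a survivor function of the form $\exp(-\Lambda_0(t)\exp(g))$ follows from it when $g$ does not vary over time, which is an assumption of this example rather than part of the approach described above.

```python
import numpy as np

def piecewise_hazard(t, cuts, levels):
    """lambda_0(t) = sum_h levels[h-1] * I(t in (cuts[h-1], cuts[h]]);
    zero outside (cuts[0], cuts[-1]]."""
    t = np.asarray(t, dtype=float)
    levels = np.asarray(levels, dtype=float)
    # side="left" returns h such that cuts[h-1] < t <= cuts[h].
    idx = np.searchsorted(cuts, t, side="left") - 1
    valid = (idx >= 0) & (idx < len(levels))
    return np.where(valid, levels[np.clip(idx, 0, len(levels) - 1)], 0.0)

def piecewise_cumhaz(t, cuts, levels):
    """Cumulative baseline hazard: time spent in each interval times its level."""
    t = np.asarray(t, dtype=float)
    cuts = np.asarray(cuts, dtype=float)
    lower, upper = cuts[:-1], cuts[1:]
    exposure = np.clip(t[..., None], lower, upper) - lower
    return exposure @ np.asarray(levels, dtype=float)

cuts = np.array([0.0, 1.0, 3.0, 6.0])   # t_0 < t_1 < t_2 < t_3, i.e. p = 3 intervals
levels = np.array([0.1, 0.3, 0.2])      # lambda_01, lambda_02, lambda_03
hazards = piecewise_hazard([1.0, 1.5, 3.0, 6.5], cuts, levels)
cumhaz_at_2 = piecewise_cumhaz(2.0, cuts, levels)
survivor_at_2 = np.exp(-cumhaz_at_2 * np.exp(0.5))  # assumed time-constant g = 0.5
```

Because the intervals are closed on the right, a time equal to a cut point takes the hazard level of the interval that ends there.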