Modeling Multistate Survival Illustrated In Bone Marrow Transplantation

John P. Klein, Ph.D.
and
Chunlin Qian, Ph.D.
Technical Report 15
August 1996
Division of Biostatistics
Medical College of Wisconsin
8701 Watertown Plank Road
Milwaukee WI 53226
Phone: (414) 456-8280
John P. Klein, Medical College of Wisconsin, 8701 Watertown Plank Road,
Milwaukee, WI 53226
KEY WORDS: Proportional hazards models, Time Dependent Covariates, Left Truncation
Abstract
In many applications of survival analysis techniques there are intermediate events whose
occurrence may affect a patient's prognosis. The occurrence of these intermediate events can be
modeled using a proportional hazards model with time dependent covariates or by a model using
distinct hazards for each event that allows for nonproportional hazard rates when other
intermediate events occur. Of interest to clinical investigators are not the estimates of these
transition intensities, but rather synthesized estimates of predictive probabilities of a patient's final
response given their current history of occurrence of these intermediate events. We show, using
an example of bone marrow transplantation taken from the database of the International Bone
Marrow Transplant Registry, that these predictive probabilities are equivalent to certain transition
probabilities in a multistate Markov model. We show how, by using a combination of proportional
hazards regression and left truncated proportional hazards regression, one can estimate model
parameters and the desired predictive probabilities. Asymptotic properties of the estimators are
discussed. Finally, we show how these predictive probabilities can be used to study the effects of
treatment strategies which alter the rate at which some intermediate events occur.
1. INTRODUCTION
In many applications of survival analysis techniques the ultimate outcome of a patient’s
treatment depends on the occurrence and timing of some intermediate events. This is particularly
true when studying the recovery process of a patient from a bone marrow transplant for leukemia.
Here a patient can experience one of several terminal events, such as death in remission,
recurrence of their leukemia or simply death. As the patient recovers from their transplant a
number of intermediate events may occur that have an influence on their eventual prognosis.
Examples of such intermediate events are the return of the patient’s platelets to a “normal” level, the
development of various types of infections, the occurrence of acute or chronic graft-versus-host
disease, etc.
A natural way to model complex experiments such as this is by using a multistate model.
Andersen et al (1991) (see also Andersen et al (1993)) have studied such models using a finite state
Markov process model where the hazard rates for each possible transition in the multistate model
are modeled by a separate Cox (1972) proportional hazards model. Here each of the transition
probabilities is estimated using a (left truncated) Cox model. In a multistate model with two
intermediate events and two terminal events this entails fitting 12 separate Cox models.
Recently, Klein et al (1993) have suggested an alternative approach to multistate modeling.
They suggest fitting a Cox model to each of the events with time dependent covariates used to
model the timing of the intermediate events that precede the event of interest. In a multistate model
with two intermediate events and two terminal events this entails fitting 4 separate Cox models.
This model is discussed in Section 3.
The Klein and Andersen approaches are two extremes of how one can model multistate
survival. In this report we shall examine how one may model multistate survival experiments
where some of the transition rates are assumed to be proportional to others. This general model is
discussed in Section 4.
Once the transition rates are modeled it is necessary to synthesize these rates to provide
predictions of the patient's eventual prognosis. The patient’s prognosis is a dynamic entity that
depends on their history at a given point in time. The models we fit allow us to estimate a series of
predictive probabilities based on potential patient histories which may be observed at some time t.
These patient histories include the information known on the patient at entry to the study (the fixed-
time covariates) and the knowledge of when the intermediate events have occurred.
Recently, Arjas and Eerola (1993) (cf. Eerola (1993)) have described a framework of
“predictive causality” for longitudinal studies that can be used to illustrate how the timing of the
occurrences of the time dependent covariates in a patient’s recovery process changes the prediction
of his or her final prognosis. For a given patient, let $(T,X) = \{(T_m, X_m);\ m \geq 1\}$ denote the ordered
times, $0 \leq T_1 \leq T_2 \leq \ldots$, at which events occur during a patient’s recovery from transplantation,
with description, $X_m$, of what has happened to the patient at time $T_m$. In the bone marrow
transplantation recovery process $X_m$ may denote return of the platelets to normal levels, the
development of acute GVHD, or the occurrence of relapse, or death. A patient history, $H_t$, at
some time t post-transplantation consists of all the pre-transplantation information available on the
patient (the fixed-time covariates) and the set of marked points, $\{(T_m, X_m);\ T_m \leq t\}$, reflecting
what has happened to the patient up to this point in time. We consider the prediction that some
event, W, such as relapse, occurs in a time interval, E ($W \in E$), for example within two years
post-transplantation. The predicted probability that $W \in E$ should depend on the patient’s history at the
time t at which this prediction is made. We define a prediction process by
$$\mu_t(E) = P[W \in E \mid H_t].$$
The prediction process allows us to examine the effect of time dependent (and fixed-time)
covariates on the predicted prognosis of a given patient in three ways. First, we can fix the time t
and the history, H, for a patient up to time t and see how the predicted probability of W being in E
changes as the prediction interval E varies. In the bone marrow transplantation example this will
allow us to estimate how the probability of relapse within τ years post-transplantation, changes as τ
varies for a patient with a given history at time t. That is, given a particular history at a given time
for a patient we can provide a prognosis for this patient at times in the future. Second, we can fix a
potential history, H, for a patient and the prediction interval, E, and see how $\mu_t(E)$ changes as t
increases. For example, for a patient with a given history of development of acute GVHD or
platelet recovery, this will give insight into how more and more of a patient history allows us to
refine our prediction of the chance that he or she would relapse within the first two years post-
transplantation, say. Arjas and Eerola call this the learning effect. Finally, we can fix the
prediction interval, E, and the time at which we observe the patient history, t, and look at the
prediction process for patients with different histories. This allows us to study directly the effect
of the timing of the intermediate endpoints on the prognosis of future patients. In the bone
marrow transplantation recovery process this may suggest to the physician that, if certain events
have not occurred by a given time, some additional therapy should be given, based on this model.
The example that is used throughout this paper is from a multicenter bone marrow
transplantation study of patients given HLA-identical sibling transplants, conducted between
1985 and 1990, for patients with acute lymphoblastic leukemia (ALL) or acute myelogenous
leukemia reported to the International Bone Marrow Transplant Registry. The data set consists of
1823 patients with observation times ranging from 10 days to 2236 days. 957 patients were alive
and disease free at their last observation time, 442 died in remission and 424 patients were
observed to relapse. In Section 2 a multistate model for this data is presented and in Section 5 we
shall present some empirical estimates of the predicted probabilities.
2. BONE MARROW TRANSPLANTATION
Bone marrow transplantation is a standard treatment for acute leukemia. Recovery
following bone marrow transplantation is a complex process. Prognosis for recovery may depend
on risk factors known at the time of transplantation, such as patient's or donor's age and sex, the
stage of initial disease, the time from diagnosis to transplantation, and so on. The final prognosis
may change as the patient’s post-transplantation history develops with the occurrence of events
during the recovery process, such as the development of acute or chronic graft-versus-host disease
(GVHD), the return of the platelet count to normal levels, the return of granulocytes to normal
levels, or the development of infections. Transplantation can be considered a failure when a
patient’s leukemia returns (relapse) or when he or she dies while in remission (treatment-related
death). Of interest is how the probabilities of relapse (denoted by R) and treatment-related death
(denoted by D), as well as leukemia-free survival (the probability of being alive and in remission),
depend on the pre-transplantation (fixed-time covariates) and post-transplantation (time dependent
covariates) patient history.
Figure 1 shows a simplified diagram of a patient’s recovery process based on two
intermediate events which may occur in the recovery process. These intermediate events are the
development of acute GVHD which typically occurs within the first 100 days following
transplantation (denoted by an A), and the recovery of the platelet count to a self-sustaining level
$\geq 40 \times 10^9$/L (called platelet recovery in the sequel and denoted by a P). Immediately following
transplantation, patients have depressed platelet counts and are free of acute GVHD. At some point
in time they may develop acute GVHD or have their platelets recover, at which time their prognosis
(probabilities of treatment-related death or relapse at some future time) may change. These events
may occur in any order or a patient may die or relapse without any of these events occurring.
Patients may then experience the other event, which again modifies their prognosis, or they may
die or relapse.
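[Figure 1 diagram: from the Transplant state (1), transitions 12 to P, 13 to A, 15 to D and 16 to R; from P (2), transitions 24 to A, 25 to D and 26 to R; from A (3), transitions 34 to P, 35 to D and 36 to R; from the state with both P and A (4), transitions 45 to D and 46 to R.]
FIGURE 1
Multistate Model For Bone Marrow Transplant Recovery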
Figure 1 shows that there are 12 possible transitions that can occur in this multistate model.
There are six possible states that a patient may be in at any given time, t. These states are:
1 - $\{T_P \geq t,\ T_A \geq t,\ T_D \geq t,\ T_R \geq t\}$ (Alive disease free without having GVHD or having had platelets recovered)
2 - $\{T_P < t,\ T_A \geq t,\ T_D \geq t,\ T_R \geq t\}$ (Alive disease free without having GVHD with platelets recovered)
3 - $\{T_P \geq t,\ T_A < t,\ T_D \geq t,\ T_R \geq t\}$ (Alive disease free without platelets recovered having experienced GVHD)
4 - $\{T_P < t,\ T_A < t,\ T_D \geq t,\ T_R \geq t\}$ (Alive disease free with platelets recovered having experienced GVHD)
5 - $\{T_D < t,\ T_R \geq t\}$ (Dead prior to relapse)
6 - $\{T_D \geq t,\ T_R < t\}$ (Relapsed)
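The six state definitions can be read as a rule that maps a patient's event times to the state occupied at time t. The following is a minimal sketch of that rule, not the authors' code; the function name `state_at` and the use of `math.inf` for an event that has not been observed are assumptions made for illustration.

```python
import math


def state_at(t, t_p=math.inf, t_a=math.inf, t_d=math.inf, t_r=math.inf):
    """Return the state (1-6) occupied at time t, given the event times.

    t_p, t_a, t_d, t_r are the times of platelet recovery, acute GVHD,
    death in remission and relapse; math.inf means "not observed".
    """
    if t_d < t and t_r < t:
        raise ValueError("death in remission and relapse cannot both precede t")
    if t_d < t:
        return 5  # dead prior to relapse
    if t_r < t:
        return 6  # relapsed
    platelets = t_p < t
    gvhd = t_a < t
    if platelets and gvhd:
        return 4
    if gvhd:
        return 3
    if platelets:
        return 2
    return 1
```

For example, `state_at(50, t_p=20, t_a=30)` places a patient whose platelets recovered at day 20 and who developed acute GVHD at day 30 in state 4 at day 50.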
3. PROPORTIONAL HAZARDS MODEL
In this section we shall present a basic model for multistate survival studies based on a
series of Cox regression analyses using time dependent covariates. To model survival we assume
that an individual is at risk of experiencing any one of the events in some set e. This set consists of both the
intermediate events which may affect a patient's eventual prognosis and the terminal events. In
the bone marrow transplant example the set e is {A, P, R, D}, where A is the event GVHD has
occurred, P is the event the platelets have recovered, R is the event relapsed and D is the event
died.
From the events in the set e we can define a set of states s = {1,2,...,p}. Each element of s
tells us which final event has occurred or what combination of intermediate events has occurred.
In the transplant example there are six states listed in the previous section.
For a given model only certain transitions are possible. We let t be the set of possible
transitions. In the transplant example t has twelve elements as shown in Figure 1. That is
t = {12, 13, 15, 16, 24, 25, 26, 34, 35, 36, 45, 46}. For any event $X \in e$ we define t(X) as the
set of transitions into event X that are possible. In our example t(P) = {12, 34}, t(A) = {13, 24},
t(D) = {15, 25, 35, 45}, and t(R) = {16, 26, 36, 46}.
For any event, X, in e we define the ancestor set a(X) as the set of intermediate events that
may happen prior to the occurrence of the event X. In our example we have a(P)= {A}, a(A)=
{P} and a(R)= a(D) = {A, P}.
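The transition sets and ancestor sets follow mechanically from which intermediate events have occurred in each transient state of Figure 1. The sketch below derives them under that reading; the names `STATE_HISTORY`, `build_transitions`, `transitions_into` and `ancestors` are hypothetical helpers introduced here, not notation from the report.

```python
# Intermediate events already experienced in each transient state (Figure 1).
STATE_HISTORY = {1: set(), 2: {"P"}, 3: {"A"}, 4: {"P", "A"}}
# Terminal events and the absorbing state each one leads to.
ABSORBING = {"D": 5, "R": 6}


def build_transitions():
    """Map each possible transition (i, j) to the event that causes it."""
    event_of = {}
    for i, hist in STATE_HISTORY.items():
        for x in ("P", "A"):
            if x not in hist:
                # The destination is the state whose history adds event x.
                j = next(k for k, h in STATE_HISTORY.items() if h == hist | {x})
                event_of[(i, j)] = x
        for x, j in ABSORBING.items():
            event_of[(i, j)] = x
    return event_of


def transitions_into(event, event_of):
    """The set t(X): all transitions caused by `event`."""
    return sorted(ij for ij, x in event_of.items() if x == event)


def ancestors(event, event_of):
    """The set a(X): intermediate events that may precede `event`."""
    out = set()
    for (i, _), x in event_of.items():
        if x == event:
            out |= STATE_HISTORY[i]
    return out
```

Calling `build_transitions()` yields twelve transitions, and `transitions_into("D", ...)` returns the four transitions into death in remission, one from each transient state.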
To model the transition rates for this model we shall use a proportional hazards regression
model. For each event, X, in e we fit a proportional hazards regression model which includes the
fixed time covariates specific to the event as well as a time dependent covariate for each of the events
in the ancestor set of X. Let $Z_F$ be the vector of fixed time covariates that have an influence
on any event in e and let $\beta_{FX}$ be a vector of risk coefficients for these covariates for the event X.
Note that if a fixed covariate has no effect on the timing of event X then the risk coefficient for that
factor is set to 0. The model for the hazard rate of the time to event X is given by
$$\lambda_X(t \mid Z_F) = \lambda_{oX}(t) \exp\Big\{\beta_{FX} Z_F + \sum_{X' \in a(X)} \beta_{X'X}\, I[T_{X'} < t]\Big\}. \qquad (3.1)$$
Here I[ ] is the indicator function and $\beta_{X'X}$ is the risk coefficient for the effect of the occurrence of
event X' on the time to event X. The baseline hazard rate, $\lambda_{oX}(t)$, can be different for distinct
levels of some fixed covariates although for simplicity we shall consider the unstratified case in the
sequel. The parameters in (3.1) can be estimated from any standard Cox regression package.
Using the model (3.1) the hazard rate for any of the transitions in the set t can be modeled.
Specifying a transition determines X and the values to be assigned to the indicators $I[T_{X'} < t]$ for any
intermediate event. For example,
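$$\lambda_{15}(t \mid Z_F) = \lambda_{oD}(t) \exp\{\beta_{FD} Z_F\},$$
$$\lambda_{25}(t \mid Z_F) = \lambda_{oD}(t) \exp\{\beta_{FD} Z_F + \beta_{PD}\},$$
$$\lambda_{35}(t \mid Z_F) = \lambda_{oD}(t) \exp\{\beta_{FD} Z_F + \beta_{AD}\},$$
and
$$\lambda_{45}(t \mid Z_F) = \lambda_{oD}(t) \exp\{\beta_{FD} Z_F + \beta_{PD} + \beta_{AD}\}.$$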
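Because the origin state fixes every indicator in the exponent, the four death hazards differ only by constant multipliers of a common baseline. The sketch below makes that explicit. The coefficient values are made up for illustration and are not estimates from this report, and the names `BETA_FIXED`, `BETA_TD`, `log_relative_hazard` and `hazard_ratio_vs_state1` are assumptions.

```python
import math

# Hypothetical coefficients for the death event D, for illustration only;
# these are NOT the estimates reported in the paper.
BETA_FIXED = {"D": [0.4, 0.3]}                  # beta_FD, one entry per fixed covariate
BETA_TD = {("P", "D"): -0.5, ("A", "D"): 0.7}   # beta_PD and beta_AD

# Intermediate events that have already occurred in each transient state.
STATE_HISTORY = {1: (), 2: ("P",), 3: ("A",), 4: ("P", "A")}


def log_relative_hazard(origin, event, z_fixed):
    """Exponent of the hazard model for the transition from `origin` caused by `event`."""
    lp = sum(b * z for b, z in zip(BETA_FIXED[event], z_fixed))
    # In state `origin` each indicator I[T_X' < t] is known: it is 1 for the
    # events in the state's history and 0 otherwise.
    lp += sum(BETA_TD.get((x, event), 0.0) for x in STATE_HISTORY[origin])
    return lp


def hazard_ratio_vs_state1(origin, event, z_fixed):
    """Ratio of the transition hazard from `origin` to that from state 1.

    The baseline hazard and the fixed-covariate term cancel in the ratio.
    """
    return math.exp(log_relative_hazard(origin, event, z_fixed)
                    - log_relative_hazard(1, event, z_fixed))
```

With these made-up values, the hazard of transition 45 relative to transition 15 is exp(beta_PD + beta_AD) regardless of the fixed covariates, which is exactly the proportionality that the model imposes.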
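For any transition, ij, we define the cumulative transition rate as
$$\Lambda_{ij}(t \mid Z_F) = \int_0^t \lambda_{ij}(u \mid Z_F)\, du, \quad i \neq j,\ ij \in t,$$
$$\Lambda_{ij}(t \mid Z_F) = 0 \quad \text{if } i \neq j,\ ij \notin t, \text{ and}$$
$$\Lambda_{ii}(t \mid Z_F) = -\sum_{j \in s,\ j \neq i} \Lambda_{ij}(t \mid Z_F), \quad i \in s.$$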
Since $\Lambda_{ij}(t \mid Z_F)$ is absolutely continuous for any $i, j \in s$ it follows that the matrix
$\Lambda = (\Lambda_{ij})_{p \times p}$ is the transition intensity of a Markov process with state space
s = {1,...,p} (see Andersen et al (1993), pp. 92-93). The transition probability matrix of this Markov process is given by
$$P[s,t \mid Z_F] = \prod_{s < u \leq t} [I + d\Lambda(u \mid Z_F)], \qquad (3.2)$$
where Π is the product-integral (cf. Gill and Johansen (1990) for details on the matrix product
integral) and I is the p×p identity matrix. This transition probability matrix serves as the basis for
making an inference about a patient's eventual prognosis given their current history.
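When the cumulative intensities are replaced by step-function estimates that jump only at observed event times, the product-integral reduces to a finite matrix product. The sketch below illustrates that reduction under this assumption, using plain Python lists and a three-state toy example; the function names and the toy increments are hypothetical and unrelated to the registry data.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """Return the n x n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def transition_matrix(increments, s, t, n_states):
    """Product of (I + dLambda(u)) over jump times u with s < u <= t.

    `increments` maps a jump time u to {(i, j): dLambda_ij(u)} with 0-based
    state indices and i != j. Diagonal entries are filled in so that each
    row of dLambda(u) sums to zero.
    """
    p = identity(n_states)
    for u in sorted(increments):
        if not (s < u <= t):
            continue
        step = identity(n_states)
        for (i, j), d in increments[u].items():
            step[i][j] += d
            step[i][i] -= d
        p = mat_mul(p, step)
    return p
```

As a usage example, with toy increments `{1.0: {(0, 1): 0.5}, 2.0: {(0, 2): 0.2, (1, 2): 0.5}}`, the first row of `transition_matrix(increments, 0.0, 2.0, 3)` gives the probabilities of being in each of the three states at time 2 for a subject who starts in state 0 at time 0. The order of multiplication matters, which is why the jump times are sorted.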
To estimate the transition probability matrix the required Cox models are fit and the
estimators of β are obtained. Breslow's estimators of the baseline hazard rates (Breslow 1972) are
then computed and substituted into (3.2). For the bone marrow transplant example this yields the
following estimators of the predicted probabilities (here we shall ignore the dependence on $Z_F$ for
notational convenience):
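$$\hat{P}_{ii}(s,t) = \prod_{s < u \leq t} \Big\{1 - \sum_{j:\, i < j} \Delta\hat{\Lambda}_{ij}(u)\Big\}, \quad i = 1, 2, 3, 4;$$
$$\hat{P}_{ij}(s,t) = \sum_{s < u \leq t} \hat{P}_{ii}(s,u-)\, \hat{P}_{jj}(u,t)\, \Delta\hat{\Lambda}_{ij}(u), \quad ij = 12, 13, 24, 34, 45, 46;$$
$$\hat{P}_{ij}(s,t) = \sum_{s < u \leq t} \hat{P}_{ii}(s,u-)\big[\Delta\hat{\Lambda}_{ij}(u) + \hat{P}_{4j}(u,t)\, \Delta\hat{\Lambda}_{i4}(u)\big], \quad ij = 25, 26, 35, 36;$$
and
$$\hat{P}_{1j}(s,t) = \sum_{s < u \leq t} \hat{P}_{11}(s,u-)\big[\Delta\hat{\Lambda}_{1j}(u) + \hat{P}_{2j}(u,t)\, \Delta\hat{\Lambda}_{12}(u) + \hat{P}_{3j}(u,t)\, \Delta\hat{\Lambda}_{13}(u)\big], \quad j = 4, 5, 6.$$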
The asymptotic distribution of $\hat{P}[s,t \mid Z_F]$ can be obtained by basic counting process
techniques. Details are found in Qian (1995). The basic result is as follows (here for ease of
exposition we have suppressed the dependence on the fixed covariates, $Z_F$):
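Theorem 1 Under suitable regularity conditions each of the elements of the random matrix
$\sqrt{n}\,\{\hat{P}[s,t \mid Z_F] - P[s,t \mid Z_F]\}$ converges weakly to a zero-mean Gaussian martingale with
covariance function given by
$$\mathrm{Cov}\big(\sqrt{n}\,\hat{P}_{ij}(s,t),\ \sqrt{n}\,\hat{P}_{km}(s,t)\big) = \sum_{X \in e} \Big\{ \int_s^t \frac{F_{ij,X}(s,u,t)\, F_{km,X}(s,u,t)}{s_X^{(0)}(\beta_X,u)}\, d\Lambda_{oX}(u) + G'_{ij,X}\, \Sigma_X^{-1}\, G_{km,X} \Big\},$$
where
$$F_{ij,X}(s,u,t) = \sum_{gh \in t(X),\ i \leq g < h \leq j} D_{ighj,X}(s,u,t), \quad i, j \in s;$$
$$G_{ij,X}(s,t) = \int_s^t \sum_{gh \in t(X),\ i \leq g < h \leq j} D_{ighj,X}(s,u,t)\,[Z_{gl} - e_X(\beta_X,u)]\, d\Lambda_{oX}(u), \quad i, j \in s;$$
$$D_{ighj,X}(s,u,t) = \exp\{\beta_X Z_{gh}\}\, P_{ig}(s,u-)\,[P_{hj}(u,t) - P_{gj}(u,t)], \quad i, j, g, h \in s;$$
$$s_X^{(0)}(\beta_X,u) = \sum_{l=1}^{n} \exp\{\beta_X Z_{Xl}(u)\}\, Y_{Xl}(u);$$
$$e_X(\beta_X,u) = \frac{\sum_{l=1}^{n} Z_{Xl}(u) \exp\{\beta_X Z_{Xl}(u)\}\, Y_{Xl}(u)}{s_X^{(0)}(\beta_X,u)};$$
and $\Sigma_X$ is the covariance matrix of the estimates of $\beta_X$.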
Here $Z_{jk}$ is the union of the set of fixed covariates with a set of indicator covariates that tell
us that an individual is in state j at time t. $Y_{Xl}(t)$ is the indicator that individual l is at risk for event
X at time t, and $Z_{Xl}(t)$ is the covariate vector for event X for individual l at time t.
Estimators of the variability of the predicted probabilities are obtained by substituting the
appropriate estimator into the covariance in Theorem 1. In particular we have that the variance of
$\hat{P}_{ij}(s,t)$ is estimated consistently by
$$\sum_{X \in e} \Big\{ \int_s^t \Big[ \frac{\hat{F}_{ij,X}(s,u,t)}{S_X^{(0)}(\beta_X,u)} \Big]^2 dN_X(u) + \hat{G}'_{ij,X}\, i_X^{-1}(\hat{\beta}_X)\, \hat{G}_{ij,X} \Big\}, \qquad (3.3)$$
where $dN_X(t)$ is the number of type X events occurring at time t and $i_X$ is the observed information
matrix for the regression estimates for event X.
4. CHILD-EVENT MODELS
The model constructed in Section 3 assumes that for any event X in e the hazard rates of
any two X transitions $ij, km \in t(X)$ are proportional. This is a testable hypothesis that may fail to
be true in some circumstances. In this section we shall look at models that relax this assumption.
To relax this proportionality assumption we consider models with time dependent
stratification. Suppose we can divide the ancestor set a(X) into two disjoint sets $a_s(X)$ and $a_c(X)$.
Here $a_s(X)$ is the set of ancestors of X for which a time dependent stratification will be used and
$a_c(X)$ is the set of ancestors for which the proportional hazards modeling will be used. Let
$m(X) = 2^{|a_s(X)|}$, where $|a_s(X)|$ is the number of elements in $a_s(X)$. Here m(X) is the total number of distinct baseline
hazard rates to be fit in the model. Number the m(X) baseline hazard rates from (0,...,0) to
(1,...,1). At an event time $T_X$ we shall call an event a type $X_h$ event if $h = (I[T_{X'} < t],\ X' \in a_s(X))$.
Thus we have created m(X) "child-events", $X_h$, from each parent-event X. The $X_h$ transition set is
naturally $t(X_h) = \{ij \in t(X) : h = (I[T_{X'} < t],\ X' \in a_s(X)) \text{ as determined by state } i\}$.
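The child events of a parent event are simply the possible indicator vectors over the stratified ancestors, so there are two to the power of the number of stratified ancestors. A minimal sketch of that bookkeeping follows; the function names and the dictionary of event times are illustrative assumptions, not part of the report.

```python
from itertools import product


def child_event_labels(stratified_ancestors):
    """All indicator vectors h, one per child event (2**k of them)."""
    return list(product((0, 1), repeat=len(stratified_ancestors)))


def child_event_of(t, event_times, stratified_ancestors):
    """Evaluate h = (I[T_X' < t] for X' in the stratified ancestor set) at time t.

    `event_times` maps an event label to its time; a missing label means the
    event has not occurred.
    """
    return tuple(int(event_times.get(x, float("inf")) < t)
                 for x in stratified_ancestors)
```

For instance, stratifying death on platelet recovery alone gives the two labels `(0,)` and `(1,)`, while stratifying on both platelet recovery and acute GVHD gives four.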
For each child event a distinct baseline hazard rate is assumed so that
$$\lambda_{X_h}(t \mid Z_F) = \lambda_{oX_h}(t) \exp\Big\{\beta_{FX} Z_F + \sum_{X' \in a_c(X)} \beta_{X'X}\, I[T_{X'} < t]\Big\}$$
and the hazard rate for each $X_h$ transition is
$$\lambda_{ij}(t) = \lambda_{oX_h}(t) \exp\{\beta_X Z_{ij}\}.$$
Here $Z_{ij}$ consists of the fixed covariates and a vector of 0's and 1's with a 1 in the correct position
for any event in $a_c(X)$ which must have occurred prior to time t to be in state i.
Estimates of $\Lambda_{oX_h}(t)$ and the β's can be obtained from standard Cox regression packages.
As opposed to the proportional hazards model, in this analysis there may be some time dependent
stratification so that left truncated regression models must be employed. Once the parameter
estimates are obtained and an estimate for $\Lambda_{ij}(t)$ is obtained then these can be used in (3.2) to obtain
estimates of the predicted probabilities.
To illustrate this approach consider the bone marrow transplantation example. One
possible time dependent stratification is to fit different baseline rates for the death event for
individuals whose platelets have or have not recovered. Consider the parent event D whose
ancestors are the events P and A. The set a(D) is divided into the sets $a_c(D) = \{A\}$ and $a_s(D) = \{P\}$.
Two child events, $D_1$ and $D_2$, are defined by $\{T_P \geq T_D\}$ and $\{T_P < T_D\}$. Here $D_1$ is the event death
without platelets being recovered and $D_2$ the event death with platelets recovered. Two
proportional hazards models are fit to the death event. The first model is
$\lambda_{D_1}(t \mid Z_F) = \lambda_{oD_1}(t) \exp\{\beta_{FX} Z_F + \beta_{AD}\, I[T_A \leq t]\}$.
Individuals are censored for $\lambda_{oD_1}$ when their platelets recover. For
the second model we have
$\lambda_{D_2}(t \mid Z_F) = \lambda_{oD_2}(t) \exp\{\beta_{FX} Z_F + \beta_{AD}\, I[T_A \leq t]\}$.
Here the analysis for $\lambda_{oD_2}$ is
based on a left truncated Cox regression model with individuals entering the risk set at the time at
which their platelets recover. The four transition rates to the state D are
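$$\lambda_{15}(t \mid Z_F) = \lambda_{oD_1}(t) \exp\{\beta_{FX} Z_F\},$$
$$\lambda_{25}(t \mid Z_F) = \lambda_{oD_2}(t) \exp\{\beta_{FX} Z_F\},$$
$$\lambda_{35}(t \mid Z_F) = \lambda_{oD_1}(t) \exp\{\beta_{FX} Z_F + \beta_{AD}\}; \text{ and}$$
$$\lambda_{45}(t \mid Z_F) = \lambda_{oD_2}(t) \exp\{\beta_{FX} Z_F + \beta_{AD}\}.$$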
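The censoring and delayed-entry rules for the two death strata can be made concrete by splitting each patient's follow-up into risk intervals. The sketch below shows one way to do this, assuming a counting-process layout of (stratum, entry, exit, event) rows that a Cox routine with left truncation could consume; the function name and layout are illustrative assumptions rather than the authors' procedure.

```python
def death_risk_intervals(t_exit, died, t_platelet=None):
    """Split one patient's follow-up into rows (stratum, entry, exit, event).

    t_exit is the time of death in remission or of censoring for that event,
    `died` says which, and t_platelet is the platelet recovery time (None if
    platelets never recovered during follow-up).
    """
    if t_platelet is None or t_platelet >= t_exit:
        # Never at risk in the recovered stratum: a single D1 row.
        return [("D1", 0.0, t_exit, int(died))]
    return [
        ("D1", 0.0, t_platelet, 0),              # censored at platelet recovery
        ("D2", t_platelet, t_exit, int(died)),   # delayed entry at recovery
    ]
```

A patient whose platelets recover at day 25 and who dies in remission at day 200 contributes a censored D1 interval (0, 25] and a D2 interval (25, 200] ending in an event.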
If in addition to stratifying on the recovery time for the platelets we also stratify for D on
the occurrence of acute GVHD we have $a_s(D) = \{P, A\}$ and $a_c(D)$ is the empty set. Now there are
four child events for D corresponding to h = (0,0), (1,0), (0,1) and (1,1). These correspond to the
states $\{T_P > T_D,\ T_A > T_D\}$, $\{T_P \leq T_D,\ T_A > T_D\}$, $\{T_P > T_D,\ T_A \leq T_D\}$ and
$\{T_P \leq T_D,\ T_A \leq T_D\}$,
respectively. The models for the transitions into state D contain distinct baseline hazard rates for
each of these states, and there are no time dependent covariates in the model.
The asymptotic properties of the estimated prediction probabilities are similar to those in
Theorem 1 with the simple change of the summations over $X \in e$ being changed to double sums
over both $X \in e$ and h = 1,...,m(X). For example, the estimated variance of the predicted
probability of a type ij transition in the time period (s,t] is
$$\hat{V}(\hat{P}_{ij}(s,t)) = \sum_{X \in e} \sum_{h=1}^{m(X)} \Big\{ \int_s^t \Big[ \frac{\hat{F}_{ij,X}(s,u,t)}{S_{X_h}^{(0)}(\beta_X,u)} \Big]^2 dN_{X_h}(u) + \hat{G}'_{ij,X_h}\, i_X^{-1}(\hat{\beta}_X)\, \hat{G}_{ij,X_h} \Big\}.$$
In the model presented above the coefficient vector, $\beta_X$, is the same for all child events,
$X_h$. This assumption can be relaxed as well by allowing each child event to have its own β. This
involves fitting separate Cox models for each child event. The estimation process follows as
above. Here an estimate of the asymptotic variance of $\hat{P}_{ij}(s,t)$ is
$$\hat{V}(\hat{P}_{ij}(s,t)) = \sum_{X \in e} \sum_{h=1}^{m(X)} \Big\{ \int_s^t \Big[ \frac{\hat{F}_{ij,X_h}(s,u,t)}{S_{X_h}^{(0)}(\beta_{X_h},u)} \Big]^2 dN_{X_h}(u) + \hat{G}'_{ij,X_h}\, i_{X_h}^{-1}(\hat{\beta}_{X_h})\, \hat{G}_{ij,X_h} \Big\}.$$
The extreme case of this model is where all events are divided to their fullest (i.e. each child
event corresponds to one and only one transition) and each transition has its own β. This is the
usual model for multi-state processes introduced by Andersen et al (1991) (Cf. Andersen et al
(1993) Section VII.2).
5. BONE MARROW TRANSPLANT EXAMPLE
To illustrate these calculations we shall fit the multistate proportional hazards model to the
data from the International Bone Marrow Transplant Registry. As shown in Figure 1 we have a
model with two intermediate events, platelet recovery (P) and acute GVHD (A) and two terminal
events, death in remission (D) and relapse (R). There were 1823 patients in the data set.
After a careful examination of the effects of various fixed time covariates on the four events
we found that the most important covariates were the patient's Karnofsky score at transplant, their
waiting time from diagnosis to transplant and their age. In testing for proportional hazards for each
of these covariates using a time dependent covariate approach (see Klein and Moeschberger
(1996)) we found that the relapse hazards were not proportional at different ages. In the analysis
reported below we have decided to stratify all the analyses on age (two strata: age ≤ 20 or age > 20).
The other two risk factors were discretized as Karnofsky score ≤ 80 versus Karnofsky score ≥ 90,
and time from diagnosis to transplant ≤ 10 weeks versus > 10 weeks.
To apply the proportional hazards model we fit four Cox models to the data, one for each
of the four endpoints. For each event, X, we include a time dependent covariate for each event in
a(X). The results are found in Table 1.
Table 1
Estimated Risk Coefficients And Standard Errors For The Proportional Hazards Model

Covariate                  Acute GVHD       Relapse
Karnofsky Score ≤ 80       .208 (.109)*     .414 (.119)
Waiting Time > 10 Weeks    .014 (.099)*     .351 (.102)
Platelet Recovered         -.347 (.166)     -.322 (.126)
Acute GVHD                                  -.283 (.130)

* Not significant at 5% level
Here we see that patients with a low Karnofsky score tend to take longer to have their
platelets recover and are more likely to die or relapse. Patients with a long waiting time to
transplant also have an increased risk of relapse and death.
Examining the two time dependent covariates we see that when a patient's platelets recover
their risks of GVHD, death and relapse are decreased. When a patient develops GVHD their risk
of relapse is decreased but their risk of death is increased. This decrease in relapse risk is the
well-known graft-versus-leukemia effect of GVHD.
To examine the fit of the proportional hazards model we also fit the Andersen model with
distinct baseline hazard rates (stratified on age) and different risk coefficients for each transition.
Here a standard Cox model is used for transitions 12, 13, 15, 16 and left truncated Cox models are
used for all other transitions. The results are in Table 2.
Table 2
Estimated Risk Coefficients And Standard Errors From Fitting The Andersen Model

Transition    Karnofsky Score ≤ 80    Waiting Time > 10 Weeks
1->2          -.319 (.083)            -.065 (.065)*
1->3          .251 (.115)             -.013 (.106)
1->5          .422 (.185)             .760 (.170)
1->6          .609 (.251)             .518 (.239)
2->4          -.098 (.364)*           .189 (.288)*
2->5          .959 (.254)             .031 (.267)*
2->6          .332 (.157)             .246 (.127)
3->4          -.334 (.173)            -.040 (.146)
3->5          .142 (.190)*            .330 (.180)*
3->6          1.063 (.454)            .445 (.434)*
4->5          .235 (.273)*            .297 (.233)*
4->6          .133 (.372)*            .474 (.297)*

* Not significant at 5% level
To examine the fit of the simpler proportional hazards model we plot in Figure 2 the logs of the
baseline hazards estimated from the Andersen model for each of the transitions. If the proportional
hazards model holds true then we should have parallel curves for each transition into one of the
four events. A cursory look at these figures does not suggest any marked departure from
proportionality.
We shall use the proportional hazards multistate model to examine how a patient's
prognosis at one year after transplant depends on their history in the first few weeks of their
recovery process. We first estimate the probability of dying in remission in the first year given the
patient's history at s weeks following transplant for each of the four possible states a patient may
be in at s weeks. This estimated probability is given by $\hat{P}_{i5}[7s, 365]$. Figure 3 shows the
estimates under the proportional model for an individual who is under 20 years of age with a
Karnofsky score of 90 or more and a waiting time to transplant of less than 10 weeks. Other
values of the fixed covariates would give slightly different pictures. Here a patient is initially in the
state 1 and we see that when their platelets recover their risk of death drops. The development of
GVHD at any point in time elevates the chance of death. This probability is particularly high if the
platelets have yet to recover. Figure 4 gives the one year probability of relapsing for each of the
four states. Here again a patient is initially in state 1 and has a relatively high likelihood of
relapsing. When graft-versus-host disease occurs this probability drops.
Figure 5 gives the leukemia free survival probabilities for the first year given a patient's
history at s weeks. This is the probability of being alive and disease free at the end of the first year
after transplant. This probability is given by $1 - \{P_{i5}[7s,365] + P_{i6}[7s,365]\}$. The curves
naturally increase as a patient survives disease free for a longer time. We see that once a patient
has their platelets recover their prognosis is much better. The occurrence of GVHD without the
platelets being recovered leads to the least favorable prognosis.
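Given a transition probability matrix for the interval from 7s days to day 365, the leukemia free survival probability for each transient state is one minus the sum of the two absorbing-state entries in that row. A small sketch, assuming the matrix is stored as nested lists with 0-based indices (so state 5 is column 4 and state 6 is column 5); the function name is hypothetical.

```python
def leukemia_free_survival(p):
    """Return {i: 1 - (P_i5 + P_i6)} for the transient states i = 1..4.

    `p` is a 6x6 transition probability matrix as nested lists; row and
    column 0 stand for state 1, and columns 4 and 5 for states 5 (dead in
    remission) and 6 (relapsed).
    """
    return {i + 1: 1.0 - (p[i][4] + p[i][5]) for i in range(4)}
```

Evaluating this for a grid of s values and plotting each state's entry against s would give curves of the kind described for Figure 5.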
Figure 6 shows 95% confidence intervals and point estimates for the leukemia free survival
at one year for each possible history a patient may have at s weeks. For comparison the
proportional hazards and Andersen models are presented. Here we note that the confidence
intervals based on the proportional hazards model are shorter. This is to be expected since this
model has fewer parameters to estimate.
6. DISCUSSION
In our example we have presented estimates of predicted probabilities for some basic
outcomes in bone marrow transplantation for a patient with a given history at some point in their
recovery process. Similar plots can be used to examine how different values of the fixed time
covariates affect the predicted patient prognosis.
We have chosen here to fix the time, t, to which the prediction is made at one year and to
see how changes in the history affect the estimated probabilities. We could have fixed the time at
which the history was measured and drawn a curve for a range of times. These curves would be
predicted survival curves given a patient's history at some time. An example of this approach can
be found in Klein et al (1993).
The models presented here can also be used to provide some insight into how changing the
rate or the timing of intermediate events affects a patient's eventual prognosis. For example, if
some therapy was developed to increase the rate at which platelets recover this hypothetical therapy
could be compared to existing therapy by modifying the baseline platelet recovery hazard rate and
examining the predicted probabilities of death and relapse. This approach can also be used to
examine how changing the rate at which one competing risk occurs affects the occurrence of
another competing risk. For example, if the treatment mortality rate were cut in half, how does
this affect the predicted probability of relapse? This approach is more reasonable than existing
methods for analyzing competing risks where one postulates a world in which one of the
competing risks cannot occur.
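The effect of halving one competing rate on the probability of the other event can be illustrated in the simplest possible setting. The sketch below assumes two competing risks with constant cause-specific hazards, which is far simpler than the multistate model in this report, and uses made-up rates; it is meant only to show the direction of the effect.

```python
import math


def cumulative_incidence(rate_cause, rate_other, t):
    """P(event of interest occurs by time t) with two constant competing hazards.

    Integrating rate_cause * exp(-(rate_cause + rate_other) * u) over (0, t]
    gives the closed form used here.
    """
    total = rate_cause + rate_other
    return rate_cause / total * (1.0 - math.exp(-total * t))


# Hypothetical yearly rates, for illustration only.
relapse_rate, death_rate = 0.2, 0.3
baseline = cumulative_incidence(relapse_rate, death_rate, 1.0)
halved = cumulative_incidence(relapse_rate, death_rate / 2, 1.0)
```

In this toy setting `halved` exceeds `baseline`: with fewer treatment-related deaths, more patients remain at risk of relapse, so the one year relapse probability rises even though the relapse hazard itself is unchanged.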
The basis of all the models presented here is a sound preliminary analysis of the data using
proportional hazards regression models. This analysis involves not only finding important
prognostic factors, but also involves checking of the proportionality assumptions of the models to
determine the number of child events.
ACKNOWLEDGMENTS
This research was supported by contract 2R01 CA54706-04A1 from the National Cancer
Institute.
REFERENCES
Andersen, P.K., Borgan, Ø., Gill, R.D. and Keiding, N. (1993). Statistical Models
Based on Counting Processes. Springer-Verlag.
Andersen, P.K., Hansen, L.S. and Keiding N. (1991). Non- and Semi- parametric
Estimation of Transition Probabilities from Censored Observation of a Non- homogeneous Markov
Process. Scand. J. Statist. 18: 153-167.
Arjas, E. and Eerola, M. (1993). On predictive Causality in Longitudinal Studies. J. of
Statist. Planning and Inference, 34, 361-386.
Breslow, N.E. (1972). Contribution to the discussion on the paper by D.R. Cox,
Regression models and life tables. J. Roy. Statist. Soc. B 34, 216-217.
Cox, D.R. (1972). Regression models and life tables (with discussion). J. Roy. Statist.
Soc. B 34, 187-220.
Eerola, M. (1993). On Predictive Causality in the Statistical Analysis of a Series of
Events. Statistical Research Report 14, The Finnish Statistical Society.
Gill, R.D. and Johansen, S. (1990). A survey of product-integration with a view towards
application in survival analysis. Ann. Statist. 18, 1501-1555.
Klein, J.P., Keiding, N. and Copelan, E.A. (1993). Plotting Summary Predictions in
Multistate Survival Models: Probabilities of Relapse and Death in Remission for Bone Marrow
Transplantation Patients. Statist. Med. 12, 2315-2332.
Klein, J.P. and Moeschberger, M.L. (1996). Survival Analysis, Springer, New York,
(In Press).
Qian, C. (1995). Time-dependent covariates in a general survival model with any finite
number of intermediate and final events. Unpublished Ph.D. Thesis, The Ohio State University.