
Biostatistics (2011), 12, 2, pp. 354–368
doi:10.1093/biostatistics/kxq061
Advance Access publication on September 21, 2010
Estimation of the 2-sample hazard ratio function using a semiparametric model
SONG YANG∗
Office of Biostatistics Research, National Heart, Lung, and Blood Institute,
6701 Rockledge Drive, MSC 7913, Bethesda, MD 20892, USA
yangso@nhlbi.nih.gov
ROSS L. PRENTICE
Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North,
PO Box 19024 Seattle, WA 98109, USA
SUMMARY
The hazard ratio provides a natural target for assessing a treatment effect with survival data, with the Cox proportional hazards model providing a widely used special case. In general, the hazard ratio is a function of time and provides a visual display of the temporal pattern of the treatment effect. A variety of nonproportional hazards models have been proposed in the literature. However, available methods for flexibly estimating a possibly time-dependent hazard ratio are limited. Here, we investigate a semiparametric model that allows a wide range of time-varying hazard ratio shapes. Point estimates as well as pointwise confidence intervals and simultaneous confidence bands of the hazard ratio function are established under this model. The average hazard ratio function is also studied to assess the cumulative treatment effect. We illustrate corresponding inference procedures using coronary heart disease data from the Women’s Health Initiative estrogen plus progestin clinical trial.
Keywords: Clinical trial; Empirical process; Gaussian process; Hazard ratio; Simultaneous inference; Survival analysis; Treatment–time interaction.
1. INTRODUCTION
Consider the comparison of failure times between a treated and control group under independent censorship. The hazard ratio provides a natural target of estimation in many applications since it permits a focus on relative failure rates across the study follow-up period, without the need to model absolute failure rates, which may be sensitive to study eligibility criteria and other factors. The proportional hazards special case of the Cox (1972) regression model is widely used for hazard ratio estimation. The maximum partial likelihood procedure (Cox, 1975) provides a convenient and robust means of estimating a constant hazard ratio and yields a logrank procedure for testing equality of hazards between the 2 groups.
∗To whom correspondence should be addressed.
© The Author 2010. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oxfordjournals.org.
In general, the hazard ratio may be a function of time, and estimation of the hazard ratio function may provide useful insights into temporal aspects of treatment effects. For example, Gilbert and others (2002) develop a nonparametric estimation procedure for the log-hazard ratio function with simultaneous confidence bands, for use as an exploratory data analytic tool. Naturally, confidence bands may be wide with such a nonparametric estimator, particularly at longer follow-up times where data may be sparse. See also Gray (1992), Kooperberg and others (1995), Cai and Sun (2003), Tian and others (2005), Abrahamowicz and Mackenzie (2007), and Peng and Huang (2007), and references therein, for additional related work.
Parametric or semiparametric hazard ratio models have the potential to contribute substantially to treatment effect assessment. Hazard ratio models whose parameters have a useful interpretation, and that embrace a range of hazard ratio shapes, may be particularly valuable. The Cox model allows time-varying covariates to be defined that can, for example, allow separate hazard ratios for the elements of a partition of the time axis or allow the hazard ratio to be a parametric function of follow-up time more generally. Various other semiparametric regression models have been proposed for failure time data analyses, including accelerated failure time models, proportional odds models, and linear transformation models, many of which are embraced by the broad class of models for which Zeng and Lin (2007) develop maximum likelihood estimation procedures. Further semiparametric models can be found in Vaupel and others (1979), Hsieh (1996), Chen and Wang (2000), Tsodikov (2002), Yang and Prentice (2005), and Chen and Cheng (2006). Many of these models induce a semiparametric class of models for the hazard ratio function that includes proportional hazards as a special case. Hazard ratio estimators under such semiparametric models can avoid the instability that may attend nonparametric hazard ratio function estimators.
One of these, proposed by Yang and Prentice (2005), involves short-term and long-term hazard ratio parameters, and a hazard ratio function that depends also on the control group survivor function. Assume absolutely continuous failure times and label the 2 groups control and treatment, with hazard functions $\lambda_C(t)$ and $\lambda_T(t)$, respectively. Let $h(t) = \lambda_T(t)/\lambda_C(t)$ be the hazard ratio function and $S_C(t)$ the survivor function of the control group. The model postulates that

$$h(t) = \frac{1}{e^{-\beta_2} + (e^{-\beta_1} - e^{-\beta_2})\, S_C(t)}, \qquad t < \tau_0, \tag{1.1}$$

where $\beta_1$ and $\beta_2$ are scalar parameters and

$$\tau_0 = \sup\left\{ x : \int_0^x \lambda_C(t)\, dt < \infty \right\}. \tag{1.2}$$
This model includes the proportional hazards model and the proportional odds model as special cases. It has a monotone $h(t)$ with a variety of patterns, including proportional hazards, no initial effect, disappearing effect, and crossing hazards, among others. Thus, the model presumably entails sufficient flexibility for many applications. It has also been studied for current status data in Tong and others (2007).
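As an illustration of these patterns, the following minimal sketch evaluates (1.1) as a function of the control survivor function. Setting $S_C = 1$ and $S_C = 0$ in (1.1) gives $e^{\beta_1}$ and $e^{\beta_2}$, so the two parameters act as the hazard ratios at the start and at the end of follow-up. The function name `hazard_ratio` and all parameter values are hypothetical and chosen only to show the shapes; they are not taken from the article.

```python
import numpy as np

def hazard_ratio(surv_control, beta1, beta2):
    """h(t) from model (1.1), written as a function of S_C(t)."""
    s = np.asarray(surv_control, dtype=float)
    return 1.0 / (np.exp(-beta2) + (np.exp(-beta1) - np.exp(-beta2)) * s)

# S_C decreases from 1 to 0 as t moves from 0 toward tau_0.
s_grid = np.linspace(1.0, 0.0, 6)

# Hypothetical (beta1, beta2) pairs, one per qualitative pattern.
shapes = {
    "proportional hazards": (0.5, 0.5),   # beta1 == beta2: constant ratio
    "proportional odds": (0.5, 0.0),      # beta2 == 0: effect disappears
    "no initial effect": (0.0, -0.7),     # beta1 == 0: ratio starts at 1
    "crossing hazards": (0.6, -0.6),      # ratio starts above 1, ends below 1
}
curves = {name: hazard_ratio(s_grid, b1, b2) for name, (b1, b2) in shapes.items()}

for name, h in curves.items():
    print(f"{name:22s}", np.round(h, 3))
```

In each case the curve moves monotonically from $e^{\beta_1}$ to $e^{\beta_2}$, which is the monotonicity property stated above.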
In comparison, for many commonly used special cases of the accelerated failure time model, either $\lim_{t \downarrow 0} h(t) = 1$ or $\lim_{t \uparrow \tau_0} h(t) \in \{0, 1, \infty\}$, and the hazard ratio stays above or below one when $\lambda_C$ is increasing. This is less flexible than desired. For the class of linear transformation models, with the logarithmic transformation, the hazard ratio also inherits some of these restrictions at many common baseline distributions. Similar properties hold as well for many other semiparametric models.
Under model (1.1), estimation procedures to date have focused on the finite-dimensional parameters, as has mostly been the case also for estimation under other semiparametric models. Here, we extend the estimation to pointwise and simultaneous inference on the hazard ratio function itself. First, consistency and asymptotic normality of the estimate at a fixed time point are established. Then procedures for constructing pointwise confidence intervals and simultaneous confidence bands for the hazard ratio function are developed, and some modifications are implemented to improve moderate sample size performance.
For additional display of the treatment effect, simultaneous confidence bands are also obtained for the average hazard ratio function over a time interval. The average hazard ratio gives a summary measure of treatment comparison and provides a picture of the cumulative treatment effect to augment display of the temporal pattern of the hazard ratio. These hazard ratio estimation procedures are applied to data from the Women’s Health Initiative (WHI) estrogen plus progestin clinical trial (Writing Group For the Women’s Health Initiative Investigators, 2002; Manson and others, 2003), which yielded a hazard ratio function for the primary coronary heart disease outcome that was decidedly nonproportional. Understanding the hazard ratio function shape in this setting was important to integrating the clinical trial data with a large body of preceding observational literature that had failed to identify an early hazard ratio increase (e.g. Manson and others, 2003; Prentice and others, 2005).
We organize the article as follows: In Section 2, the short-term and long-term hazard ratio model and the hazard ratio estimate are described. Pointwise confidence intervals of the hazard ratio are established. Simultaneous confidence bands for the hazard ratio and the average hazard ratio are provided in Section 3. Simulation results are presented in Section 4. Application to data from the WHI trial is given in Section 5. Some concluding remarks are given in Section 6. Proofs of the asymptotic results are contained in the Supplementary Material available at Biostatistics online.
2. HAZARD RATIO FUNCTION ESTIMATION
Let $T_1, \ldots, T_n$ be the pooled lifetimes of the 2 groups, with $T_1, \ldots, T_{n_1}$, $n_1 < n$, constituting the control group having the survivor function $S_C$. Let $C_1, \ldots, C_n$ be the censoring variables, and $Z_i = I(i > n_1)$, $i = 1, \ldots, n$, where $I(\cdot)$ is the indicator function. The available data consist of the independent triplets $(X_i, \delta_i, Z_i)$, $i = 1, \ldots, n$, where $X_i = \min(T_i, C_i)$ and $\delta_i = I(T_i \le C_i)$. We assume that $T_i$ and $C_i$ are independent given $Z_i$. The censoring variables ($C_i$'s) need not be identically distributed, and in particular, the 2 groups may have different censoring patterns. For $t < \tau_0$ with $\tau_0$ defined in (1.2), let $R(t)$ be the odds function $1/S_C(t) - 1$ of the control group. The model of Yang and Prentice (2005) can be expressed as

$$\lambda_i(t) = \frac{1}{e^{-\beta_1 Z_i} + e^{-\beta_2 Z_i} R(t)}\, \frac{dR(t)}{dt}, \qquad i = 1, \ldots, n, \quad t < \tau_0, \tag{2.1}$$

where $\lambda_i(t)$ is the hazard function for $T_i$ given $Z_i$. Under the model, the hazard ratio is

$$h(t) = \frac{1 + R(t)}{e^{-\beta_1} + e^{-\beta_2} R(t)}.$$
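The step from (2.1) to this hazard ratio, and its agreement with (1.1), can be checked directly. The short derivation below is a sketch that uses only the definitions already given: setting $Z_i = 0$ and $Z_i = 1$ in (2.1) gives the control and treatment hazards, and substituting $R(t) = 1/S_C(t) - 1$ recovers (1.1).

$$\lambda_C(t) = \frac{1}{1 + R(t)}\,\frac{dR(t)}{dt}, \qquad \lambda_T(t) = \frac{1}{e^{-\beta_1} + e^{-\beta_2} R(t)}\,\frac{dR(t)}{dt},$$

$$h(t) = \frac{\lambda_T(t)}{\lambda_C(t)} = \frac{1 + R(t)}{e^{-\beta_1} + e^{-\beta_2} R(t)} = \frac{1/S_C(t)}{\left\{ e^{-\beta_2} + (e^{-\beta_1} - e^{-\beta_2})\, S_C(t) \right\} / S_C(t)} = \frac{1}{e^{-\beta_2} + (e^{-\beta_1} - e^{-\beta_2})\, S_C(t)}.$$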
To estimate $h(t)$, we need to estimate the parameter $\boldsymbol{\beta} = (\beta_1, \beta_2)^T$ and the baseline function $R(t)$, where "$T$" denotes transpose. Let us first introduce the estimators from Yang and Prentice (2005).
Define

$$K(t) = \sum_{i=1}^{n} I(X_i \ge t), \qquad H_j(t; b) = \sum_{i=1}^{n} \delta_i e^{-b_j Z_i} I(X_i \le t), \quad j = 1, 2,$$

where $b = (b_1, b_2)^T$. Let $\tau < \tau_0$ be such that

$$\lim_{n} K(\tau) > 0 \tag{2.2}$$

with probability 1. For $t \le \tau$, let

$$\hat{P}(t; b) = \prod_{s \le t} \left( 1 - \frac{\Delta H_2(s; b)}{K(s)} \right), \qquad \hat{R}(t; b) = \frac{1}{\hat{P}(t; b)} \int_0^t \frac{\hat{P}_-(s; b)}{K(s)}\, H_1(ds; b),$$
where $\Delta H_2(s; b)$ denotes the jump of $H_2(s; b)$ in $s$ and $\hat{P}_-(s; b)$ denotes the left continuous (in $s$) version of $\hat{P}(s; b)$. Define the martingale residuals

$$\hat{M}_i(t; b) = \delta_i I(X_i \le t) - \int_0^t I(X_i \ge s)\, \frac{\hat{R}(ds; b)}{e^{-b_1 Z_i} + e^{-b_2 Z_i} \hat{R}(s; b)}, \qquad 1 \le i \le n.$$
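The estimators $K$, $H_j$, $\hat{P}$ and $\hat{R}$ are step functions that jump only at observed failure times, so they can be computed with cumulative products and sums. The following is a minimal sketch, not the authors' implementation; the function name `odds_estimate`, its argument names, and the toy data are hypothetical.

```python
import numpy as np

def odds_estimate(x, delta, z, b, tau):
    """Return the failure times s <= tau and R_hat(s; b) at those times."""
    x, delta, z = (np.asarray(a, dtype=float) for a in (x, delta, z))
    # Distinct observed failure times up to tau: the jump points of H_1 and H_2.
    times = np.unique(x[(delta == 1) & (x <= tau)])
    # K(s): number of subjects with X_i >= s (the risk set size at s).
    K = np.array([np.sum(x >= s) for s in times])
    # Jumps of H_j(.; b): a failure of subject i carries weight exp(-b_j * Z_i).
    fail_at = [(x == s) & (delta == 1) for s in times]
    dH1 = np.array([np.sum(np.exp(-b[0] * z[m])) for m in fail_at])
    dH2 = np.array([np.sum(np.exp(-b[1] * z[m])) for m in fail_at])
    # P_hat is a product over jump times; P_minus is its left-continuous version.
    P = np.cumprod(1.0 - dH2 / K)
    P_minus = np.concatenate(([1.0], P[:-1]))
    # R_hat(t) = (1 / P_hat(t)) * sum over s <= t of P_minus(s) * dH1(s) / K(s).
    R = np.cumsum(P_minus * dH1 / K) / P
    return times, R

# Toy data (hypothetical): five subjects, z = 1 marks the treatment group.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
delta = [1, 1, 0, 1, 1]
z = [0, 1, 0, 1, 0]
times, R = odds_estimate(x, delta, z, b=(0.0, 0.0), tau=4.0)
print(times, np.round(R, 4))
```

A useful sanity check on the sketch: with $b = (0, 0)$ every failure has weight 1, $\hat{P}$ reduces to the pooled Kaplan-Meier estimator, and $\hat{R}$ equals its odds $1/\hat{P} - 1$. Restricting to $t \le \tau$ keeps $K(s)$ positive and avoids the point where $\hat{P}$ can reach zero.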
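Yang and Prentice (2005) proposed a pseudo maximum likelihood estimator $\hat{\boldsymbol{\beta}} = (\hat{\beta}_1, \hat{\beta}_2)^T$ of $\boldsymbol{\beta}$, which is the zero of $Q(b)$, where

$$Q(b) = \sum_{i=1}^{n} \int_0^{\tau} f_i(t; b)\, \hat{M}_i(dt; b), \tag{2.3}$$

with $f_i = (f_{1i}, f_{2i})^T$, where

$$f_{1i}(t; b) = \frac{Z_i e^{-b_1 Z_i}}{e^{-b_1 Z_i} + e^{-b_2 Z_i} \hat{R}(t; b)}, \qquad f_{2i}(t; b) = \frac{Z_i e^{-b_2 Z_i} \hat{R}(t; b)}{e^{-b_1 Z_i} + e^{-b_2 Z_i} \hat{R}(t; b)}.$$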
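Because $f_i$ vanishes for control subjects ($Z_i = 0$), only the treatment group contributes to (2.3), and the integral becomes a finite sum over observed failure times. The sketch below evaluates $Q(b)$ under one assumed discretization: $\hat{R}$ is taken at the jump time itself, as the formulas are written, and its increment stands in for $\hat{R}(ds; b)$. The function names and toy data are hypothetical, and this is an illustration rather than the authors' code.

```python
import numpy as np

def odds_estimate(x, delta, z, b, tau):
    """Failure times s <= tau and R_hat(s; b) at those times."""
    times = np.unique(x[(delta == 1) & (x <= tau)])
    K = np.array([np.sum(x >= s) for s in times])
    fail_at = [(x == s) & (delta == 1) for s in times]
    dH1 = np.array([np.sum(np.exp(-b[0] * z[m])) for m in fail_at])
    dH2 = np.array([np.sum(np.exp(-b[1] * z[m])) for m in fail_at])
    P = np.cumprod(1.0 - dH2 / K)
    P_minus = np.concatenate(([1.0], P[:-1]))
    return times, np.cumsum(P_minus * dH1 / K) / P

def estimating_function(x, delta, z, b, tau):
    """Evaluate Q(b) in (2.3) as a sum over failure times up to tau."""
    x, delta, z = (np.asarray(a, dtype=float) for a in (x, delta, z))
    times, R = odds_estimate(x, delta, z, b, tau)
    dR = np.diff(R, prepend=0.0)                     # increments of R_hat
    denom = np.exp(-b[0]) + np.exp(-b[1]) * R        # denominator when Z_i = 1
    f = np.vstack([np.exp(-b[0]) / denom,            # f_1i for treated subjects
                   np.exp(-b[1]) * R / denom])       # f_2i for treated subjects
    treated = z == 1
    # Treated failures and treated subjects at risk at each failure time.
    dN2 = np.array([np.sum((x == s) & (delta == 1) & treated) for s in times])
    K2 = np.array([np.sum((x >= s) & treated) for s in times])
    # Summed increments of the treated martingale residuals, weighted by f.
    return f @ (dN2 - K2 * dR / denom)

# Toy data (hypothetical): z = 1 marks the treatment group.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
delta = [1, 1, 0, 1, 1]
z = [0, 1, 0, 1, 0]
Q = estimating_function(x, delta, z, b=(0.0, 0.0), tau=4.0)
print(np.round(Q, 4))
```

A root of this function could then be sought with a general-purpose solver such as `scipy.optimize.root`; whether such a solver converges on a given data set is not something this sketch establishes.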
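Once $\hat{\boldsymbol{\beta}}$ is obtained, $R(t)$ can be estimated by $\hat{R}(t; \hat{\boldsymbol{\beta}})$, and the hazard ratio $h(t)$ can be estimated by

$$\hat{h}(t) = \frac{1 + \hat{R}(t; \hat{\boldsymbol{\beta}})}{e^{-\hat{\beta}_1} + e^{-\hat{\beta}_2} \hat{R}(t; \hat{\boldsymbol{\beta}})}.$$

In Appendix A of the Supplementary Material available at Biostatistics online, we show that $\hat{h}(t)$ is strongly consistent for $h(t)$ under model (2.1).
To study the distributional properties of $\hat{h}(t)$, let

$$W_n(t) = \sqrt{n}\,(\hat{h}(t) - h(t)), \qquad t \le \tau.$$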
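For the asymptotic distribution of $\hat{\boldsymbol{\beta}}$, define

$$A(t) = \left( \frac{e^{-\beta_1}}{e^{-\beta_1} + e^{-\beta_2} \hat{R}(t; \boldsymbol{\beta})},\ \frac{e^{-\beta_2} \hat{R}(t; \boldsymbol{\beta})}{e^{-\beta_1} + e^{-\beta_2} \hat{R}(t; \boldsymbol{\beta})} \right)^T,$$

$$K_1(t) = \sum_{i \le n_1} I(X_i \ge t), \qquad K_2(t) = \sum_{i > n_1} I(X_i \ge t),$$

$$\omega(t) = \int_t^{\tau} \frac{A(s)\, h(s)\, K_1(s)\, K_2(s)}{(1 + R(s))(1 + \hat{R}(s; \boldsymbol{\beta}))\, K(s)}\, \left( h(s) e^{-\beta_2} - 1 \right) \frac{dR(s)}{\hat{P}(s; \boldsymbol{\beta})}.$$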
From Theorem A2 of Yang and Prentice (2005) and some algebra,

$$Q(\boldsymbol{\beta}) = \sum_{i \le n_1} \int_0^{\tau} \mu_1\, dM_i + \sum_{i > n_1} \int_0^{\tau} \mu_2\, dM_i + o_p(1),$$
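where

$$\mu_1(t) = -\frac{A(t)\, K_2(t)\, h(t)}{K(t)} + \frac{\hat{P}_-(t; \boldsymbol{\beta})\,(1 + \hat{R}(t; \boldsymbol{\beta}))}{K(t)}\, \omega(t),$$

$$\mu_2(t) = \frac{A(t)\, K_1(t)}{K(t)} + \frac{\hat{P}_-(t; \boldsymbol{\beta})\,(e^{-\beta_1} + e^{-\beta_2} \hat{R}(t; \boldsymbol{\beta}))}{K(t)}\, \omega(t), \tag{2.4}$$

$$M_i(t) = \delta_i I(X_i \le t) - \int_0^t I(X_i \ge s)\, \frac{dR(s)}{e^{-\beta_1 Z_i} + e^{-\beta_2 Z_i} R(s)}, \qquad i = 1, \ldots, n.$$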
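Now for $\hat{R}(t; \hat{\boldsymbol{\beta}})$, from Lemma A3 in Yang and Prentice (2005) and some algebra,

$$\sqrt{n}\,(\hat{R}(t; \boldsymbol{\beta}) - R(t)) = \frac{1}{\sqrt{n}\, \hat{P}(t; \boldsymbol{\beta})} \left( \sum_{i \le n_1} \int_0^t \nu_1\, dM_i + \sum_{i > n_1} \int_0^t \nu_2\, dM_i \right), \tag{2.5}$$

where

$$\nu_1(t) = \frac{n \hat{P}_-(t; \boldsymbol{\beta})}{K(t)}\,(1 + R(t)), \qquad \nu_2(t) = \frac{n \hat{P}_-(t; \boldsymbol{\beta})}{K(t)}\,(e^{-\beta_1} + e^{-\beta_2} R(t)).$$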
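Let

$$D(t; \boldsymbol{\beta}) = \frac{\partial \hat{R}(t; \boldsymbol{\beta})}{\partial \boldsymbol{\beta}}, \qquad U = \left( -\frac{1}{n} \frac{\partial Q(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} \right)^{-1},$$

$$B(t) = h(t) A(t) + \frac{e^{-\beta_1} - e^{-\beta_2}}{(e^{-\beta_1} + e^{-\beta_2} R(t))^2}\, D(t; \boldsymbol{\beta}), \qquad C(t) = \frac{e^{-\beta_1} - e^{-\beta_2}}{(e^{-\beta_1} + e^{-\beta_2} R(t))^2}\, \frac{1}{\hat{P}(t; \boldsymbol{\beta})}.$$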
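One way to read $B(t)$ and $C(t)$ is through a first-order expansion of $\hat{h}(t)$, sketched here as a heuristic rather than as the formal argument given in the Supplementary Material. Treating $h$ as a function of $(\beta_1, \beta_2, R)$,

$$\frac{\partial h}{\partial \beta_1} = h\, \frac{e^{-\beta_1}}{e^{-\beta_1} + e^{-\beta_2} R}, \qquad \frac{\partial h}{\partial \beta_2} = h\, \frac{e^{-\beta_2} R}{e^{-\beta_1} + e^{-\beta_2} R}, \qquad \frac{\partial h}{\partial R} = \frac{e^{-\beta_1} - e^{-\beta_2}}{(e^{-\beta_1} + e^{-\beta_2} R)^2}.$$

The first two derivatives correspond to $h(t) A(t)$, up to the use of $\hat{R}(t; \boldsymbol{\beta})$ in place of $R(t)$ inside $A(t)$. Writing $\hat{R}(t; \hat{\boldsymbol{\beta}}) - R(t) = \{\hat{R}(t; \hat{\boldsymbol{\beta}}) - \hat{R}(t; \boldsymbol{\beta})\} + \{\hat{R}(t; \boldsymbol{\beta}) - R(t)\}$ and linearizing the first difference with $D(t; \boldsymbol{\beta})$ gives

$$\sqrt{n}\,(\hat{h}(t) - h(t)) \approx B^T(t)\, \sqrt{n}\,(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) + C(t)\, \hat{P}(t; \boldsymbol{\beta})\, \sqrt{n}\,(\hat{R}(t; \boldsymbol{\beta}) - R(t)).$$

A one-step expansion of $Q(\hat{\boldsymbol{\beta}}) = 0$ around $\boldsymbol{\beta}$ suggests $\sqrt{n}\,(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) \approx U\, Q(\boldsymbol{\beta}) / \sqrt{n}$, which, together with (2.5), motivates the two-term form of the approximating process.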
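For $t \le \tau$, define the process

$$\tilde{W}_n(t) = \frac{B^T(t)\, U}{\sqrt{n}} \left( \sum_{i \le n_1} \int_0^{\tau} \mu_1\, dM_i + \sum_{i > n_1} \int_0^{\tau} \mu_2\, dM_i \right) + \frac{C(t)}{\sqrt{n}} \left( \sum_{i \le n_1} \int_0^t \nu_1\, dM_i + \sum_{i > n_1} \int_0^t \nu_2\, dM_i \right). \tag{2.6}$$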
With the representations for $Q(\boldsymbol{\beta})$ and $\sqrt{n}\,(\hat{R}(t; \boldsymbol{\beta}) - R(t))$, in Appendix B of the Supplementary Material available at Biostatistics online, we show that $W_n$ is asymptotically equivalent to $\tilde{W}_n$, which converges weakly to a zero-mean Gaussian process $W^*$. The weak convergence of $W_n$ thus follows. The limiting covariance function $\sigma(s, t)$ of $W^*$ involves the derivative $D(t; \boldsymbol{\beta})$ and the derivative matrix in $U$. Although analytic forms of these derivatives are available, they are quite complicated and cumbersome. Here, we approximate them by numerical derivatives. For the functions $B(t)$, $C(t)$, $\mu_1(t)$, $\mu_2(t)$, $\nu_1(t)$, and $\nu_2(t)$, define corresponding $\hat{B}(t), \hat{C}(t), \ldots$, by replacing $\boldsymbol{\beta}$ with $\hat{\boldsymbol{\beta}}$, $R(t)$ with $\hat{R}(t; \hat{\boldsymbol{\beta}})$, and $D(t; \boldsymbol{\beta})$ with the numerical derivatives. Similarly, let $\hat{U}$ be the numerical approximation of $U$. Simulation studies show