
Biostatistics (2009), 0, 0, pp. 1–10

doi:10.1093/biostatistics/kxp002

Efficient parameter estimation in longitudinal data

analysis using a hybrid GEE method

DENIS H. Y. LEUNG∗

School of Economics, Singapore Management University,

90 Stamford Road, Singapore

denisleung@smu.edu.sg

YOU-GAN WANG

Commonwealth Scientific and Industrial Research Organization, Mathematical and Information

Sciences, CSIRO Long Pocket Laboratories, 120 Meiers Road, Indooroopilly,

Queensland 4068, Australia

MIN ZHU

Finance Discipline, School of Business and Economics, University of Sydney, NSW 2006, and

Division of Mathematical and Information Sciences, Commonwealth Scientific and

Industrial Research Organisation, PO Box 120, Cleveland, QLD 4163, Australia

SUMMARY

The method of generalized estimating equations (GEEs) provides consistent estimates of the regression

parameters in a marginal regression model for longitudinal data, even when the working correlation model

is misspecified (Liang and Zeger, 1986). However, the efficiency of a GEE estimate can be seriously

affected by the choice of the working correlation model. This study addresses this problem by proposing

a hybrid method that combines multiple GEEs based on different working correlation models, using the

empirical likelihood method (Qin and Lawless, 1994). Analyses show that this hybrid method is more

efficient than a GEE using a misspecified working correlation model. Furthermore, if one of the working

correlation structures correctly models the within-subject correlations, then this hybrid method provides

the most efficient parameter estimates. In simulations, the hybrid method’s finite-sample performance

is superior to a GEE under any of the commonly used working correlation models and is almost fully

efficient in all scenarios studied. The hybrid method is illustrated using data from a longitudinal study of

the respiratory infection rates in 275 Indonesian children.

Keywords: Empirical likelihood; Generalized estimating equations; Longitudinal data.

1. INTRODUCTION

Generalized estimating equations (GEEs) have been found to be very useful in analysis of correlated

and longitudinal outcomes using marginal regression models. Following Liang and Zeger (1986), many

∗To whom correspondence should be addressed.

© The Author 2009. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.

Biostatistics Advance Access published April 4, 2009

aspects of GEE have been explored. Reviews of GEE include Pendergast and others (1996) and Desmond

(1997).

In a marginal regression model, the primary interest is in the regression parameters, which characterize

the expectations of the subject’s response over time. However, in order to make proper inference about

the regression parameters, the within-subject covariance (correlation) structures must be taken into

consideration. The GEE approach has been popular because estimates of mean parameters remain consistent

even if the correlation or the covariance structure is misspecified. On the other hand, accurate modeling of

the correlation structure generally improves statistical inference on means (Albert and McShane, 1995;

Fitzmaurice, 1995; Hall and Severini, 1998). Wang and Carey (2003) analyzed how efficiency can be

affected by (i) the choice of the working correlation structure, (ii) the method by which the working

correlation parameters are estimated, and (iii) the layout of the design matrix. Higher moments can be

incorporated into estimation using a generalized version of GEE called GEE2 (Liang and others, 1992).

However, bias or efficiency losses may be introduced if higher moment assumptions of GEE2 are

incorrectly specified. For this reason, GEE2 has yet to receive wide application.

In GEE modeling, the most commonly used working correlation models are the exchangeable, AR(1)

and MA(1). Wang and Carey (2003) found that among the 3, AR(1) is the most robust. However, they also

demonstrated scenarios where the exchangeable and MA(1) models give better results than the AR(1)

model. Therefore, it remains an issue of how to choose a working correlation model in a particular GEE

analysis. The AR(1) and MA(1) working correlation models appear to be favored by users of GEE

because (i) in most situations, they are sufficient as an approximation to the true correlation structure and

(ii) they represent sensible compromises between the independence model (which ignores within-subject

correlations) and the completely unstructured model (which requires the estimation of a large number of

nuisance parameters). These considerations lead us to propose a method that incorporates all 3

working correlation models in a single framework, yielding a method that is efficient if one of these 3 working

correlation models correctly captures the true correlation structure, and robust even if none of the

working correlation models is correct. The proposed method can be generalized to combining any number

of GEEs with working correlation models other than the exchangeable, AR(1) and MA(1) models.
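
For concreteness, the three working correlation structures named above can be written down directly. The following is a minimal sketch in plain Python; the function names are illustrative and not from the paper, and the MA(1) form used here (correlation α at lag 1 and zero beyond) is the standard one-dependent structure, which the text does not spell out.

```python
def exchangeable(K, alpha):
    """K x K exchangeable (compound symmetry) matrix: alpha off the diagonal."""
    return [[1.0 if j == k else alpha for k in range(K)] for j in range(K)]


def ar1(K, alpha):
    """K x K AR(1) matrix: correlation alpha**|j - k| between times j and k."""
    return [[alpha ** abs(j - k) for k in range(K)] for j in range(K)]


def ma1(K, alpha):
    """K x K MA(1) matrix: alpha at lag 1, zero at larger lags (assumed form)."""
    return [[1.0 if j == k else (alpha if abs(j - k) == 1 else 0.0)
             for k in range(K)] for j in range(K)]


# Example with K = 4 time points and alpha = 0.5.
R_cs = exchangeable(4, 0.5)
R_ar1 = ar1(4, 0.5)
R_ma1 = ma1(4, 0.5)
```

The three matrices agree at lag 1 and differ only in how fast the correlation decays at longer lags, which is why they are natural candidates to combine.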

Each GEE with a particular working correlation model is a mean-zero estimating equation under the

true parameters. When we are interested in combining multiple GEEs, then there are more estimating

equations than the number of parameters. In situations involving independent data where the number of

estimating equations may be larger than the number of parameters, Qin and Lawless (1994) showed how

to combine efficiently the estimation equations using an empirical likelihood (EL) (Owen, 1988). We

exploit this attribute of the EL technique to combine GEEs. The individual GEEs using different working

correlations are used as constraints in an EL for the parameters of interest. The parameter estimates are

then obtained by maximizing the EL. Other than its role as a tool for combining estimating equations,

EL also inherits a number of desirable properties from its parametric counterparts, as described in Owen

(2001).

The rest of this paper is organized as follows. Section 2 presents the basic problem, the modeling

framework of the proposed method, and its large-sample properties. The results of a simulation study are

summarized in Section 3. In Section 4, the method is illustrated using a real data set. Section 5 concludes

the paper with a discussion. Detailed simulation results and proofs are given as supplementary material

available at Biostatistics online, http://biostatistics.oxfordjournals.org.

2. COMBINING GEES

Consider a longitudinal study in which there are n subjects, each of whom is measured at K time points.

Let $y_i = (y_{i1}, \ldots, y_{iK})^T$ denote the underlying outcome for the ith subject, $x_i$ an associated vector of

r × 1 covariates, and $x_{ik}$ the value of the covariate at time k. Denote the marginal mean outcome at the

kth measurement for the ith subject by $\mu_{ik}(\beta) = g(x_{ik}^T\beta)$, for a vector of unknown parameters, β. For conciseness, we suppress the explicit association of $\mu_{ik}$ with β if there is no confusion.

Following Liang and Zeger (1986), a GEE can be used to estimate the regression parameters, β,

$$\sum_{i=1}^{n} D_i^T V_i^{-1}\{y_i - \mu_i\} = 0, \qquad (2.1)$$

where $\mu_i = (\mu_{i1}, \ldots, \mu_{iK})^T$, $D_i = \partial\mu_i/\partial\beta^T$, and $V_i$ is the covariance matrix of $y_i$. The matrix $V_i$ is often modeled as $\phi A_i^{1/2} R(\alpha) A_i^{1/2}$, where $A_i$ is a diagonal matrix representing the variances of $y_{ik}$, $R(\alpha)$ is a “working correlation” depending on a set of unknown parameters α, and φ is a scale parameter used to model over-dispersion or under-dispersion. Liang and Zeger (1986) showed that, whether or not $V_i$ is correctly specified, the estimators of β obtained from (2.1) remain consistent. In addition, if $V_i = \mathrm{Cov}(y_i)$ can be consistently estimated up to $n^{-1/2}$, then the estimator of β is fully efficient. On the other hand, an incorrectly specified $V_i$ will lead to a loss of efficiency (Wang and Carey, 2003).

As Liang and Zeger (1986) pointed out, (2.1) can be re-expressed as a function of β by writing $\alpha \equiv \alpha(\beta, \phi)$ and $\phi \equiv \phi(\beta)$. An iterative algorithm can then be used to estimate β, α, and φ, starting with initial estimates of α and φ. For suggested methods for estimating α and φ, see Liang and Zeger (1986), Chaganty (1997), and Chaganty and Shults (1999). For ease of exposition, we assume φ = 1. Liang and Zeger (1986, pp 17–18) discussed choices for $R(\alpha)$, while Wang and Carey (2003) studied their relative efficiencies. For any chosen working correlation matrix $R \equiv R(\alpha)$, write $S_i(\beta) \equiv D_i^T A_i^{-1/2} R^{-1} A_i^{-1/2}(y_i - \mu_i)$. Then, the GEE (2.1) estimates are solutions of

$$S(\beta) \equiv \sum_{i=1}^{n} S_i(\beta) = 0. \qquad (2.2)$$

Now, consider different, linearly independent choices of $R(\alpha)$, say $R^j(\alpha)$, j = 1, ..., J, and write

$$S^j(\beta) \equiv \sum_{i=1}^{n} S_i^j(\beta) = 0, \qquad (2.3)$$

for the estimating equation (2.2) but using working correlation matrix $R^j(\alpha)$. Let $h_i(\beta) \equiv (S_i^1(\beta)^T, \ldots, S_i^j(\beta)^T, \ldots, S_i^J(\beta)^T)^T$, and note that $h_i \equiv h_i(\beta)$ is a function of β only. In general, the dimension of $h_i$ is higher than the dimension of β. Our proposal is to use EL to combine the estimating equations $S_i^1, \ldots, S_i^J$. If one of the $S_i^1, \ldots, S_i^J$ is the optimal estimating equation, in the sense that it solves (2.2) with $A_i^{-1/2}\{R(\alpha)\}^{-1}A_i^{-1/2} = V_i^{-1}$, then the EL estimate will be optimal. If none of them is optimal, then the EL estimate is still consistent and combines optimally the information in $S_i^1, \ldots, S_i^J$. In practice, a few popular choices of $R(\alpha)$ may be used; for example, exchangeable, AR(1) and MA(1).

We now describe how to use an EL framework to combine the GEEs in $h_i$. Let F be the distribution function associated with the observations $\{(y_i, x_i)\}_{i=1}^{n}$. Denote $p_i = dF(y_i|x_i)$ as the jump size of F at $(y_i, x_i)$. Then, the nonparametric likelihood of the data can be written as $\prod_{i=1}^{n} dF(y_i|x_i) \equiv \prod_{i=1}^{n} p_i$, subject to the constraints $0 \le p_i \le 1$, i = 1, ..., n, and $\sum_{i=1}^{n} p_i = 1$. Without any other information, the maximum nonparametric likelihood estimate of F is the empirical distribution function $F_n(y_i|x_i) = \sum_{i=1}^{n} I(y_i \le y \mid x_i)$, which corresponds to $p_i = 1/n$. However, suppose we know that $E(h_i(\beta)) = 0$ under F. Then, the empirical distribution function is no longer desirable because $E(h_i(\beta)) \ne 0$ under $dF_n \equiv p_i = 1/n$. Instead, we can use the (empirical) likelihood

$$L(\beta) = \prod_{i=1}^{n} p_i \qquad (2.4)$$

subject to the constraints

$$0 \le p_i \le 1,\ i = 1, \ldots, n; \qquad \sum_{i=1}^{n} p_i = 1, \qquad \sum_{i=1}^{n} p_i\{S_i^1(\beta)^T, \ldots, S_i^j(\beta)^T, \ldots, S_i^J(\beta)^T\}^T \equiv \sum_{i=1}^{n} p_i h_i(\beta) = 0.$$

In this formulation, maximizing the EL gives a set of $p_i$s such that $E(h_i(\beta)) = 0$ under $\{p_i\}_{i=1}^{n}$. Since the resulting values of $\{p_i\}_{i=1}^{n}$ depend on the extra conditions $E(h_i(\beta)) = 0$, which in turn depend on the value of β, the EL estimate of F is sensitive to the value of β.

The EL (2.4) can be maximized as a constrained maximization problem. By introducing Lagrange multipliers η, $\lambda = (\lambda_1^T, \ldots, \lambda_j^T, \ldots, \lambda_J^T)^T$, where each $\lambda_j$ is r × 1, the log-EL can be written as

$$\log L(\beta) = \sum_{i=1}^{n} \log p_i + \eta\left(1 - \sum_{i=1}^{n} p_i\right) - n\lambda^T \sum_{i=1}^{n} p_i h_i(\beta). \qquad (2.5)$$

The values of $\{p_i\}_{i=1}^{n}$ can be profiled out by differentiating (2.5) with respect to $p_i$ to give

$$\frac{1}{p_i} - \eta - n\lambda^T h_i(\beta) = 0 \;\Rightarrow\; n - \eta = 0 \;\Rightarrow\; \eta = n. \qquad (2.6)$$
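
The middle implication in (2.6) compresses one step. A sketch of that step, using only the two constraints stated with (2.4), is to multiply the stationarity condition by $p_i$ and sum over i:

```latex
\begin{align*}
\sum_{i=1}^{n} p_i\left\{\frac{1}{p_i} - \eta - n\lambda^T h_i(\beta)\right\}
  &= n - \eta\sum_{i=1}^{n} p_i - n\lambda^T\sum_{i=1}^{n} p_i h_i(\beta) \\
  &= n - \eta = 0 ,
\end{align*}
```

where the last line uses $\sum_{i} p_i = 1$ and $\sum_{i} p_i h_i(\beta) = 0$.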

Equation (2.6) implies that the optimal values of $\{p_i\}_{i=1}^{n}$ are

$$p_i = \frac{1}{n\{1 + \lambda^T h_i(\beta)\}}. \qquad (2.7)$$

Furthermore, the constraint $\sum_{i=1}^{n} p_i h_i(\beta) = 0$ implies that λ satisfies the equation

$$\sum_{i=1}^{n} \frac{h_i(\beta)}{1 + \lambda^T h_i(\beta)} = 0. \qquad (2.8)$$

Using (2.7) and (2.8), η and $\{p_i\}_{i=1}^{n}$ can be profiled out in the negative log-EL to give

$$\ell(\beta) \equiv -\log L(\beta) = \sum_{i=1}^{n} \log\{1 + \lambda^T h_i(\beta)\} + n\log(n). \qquad (2.9)$$

Let $h_i^{\beta}(\beta) = \partial h_i(\beta)/\partial\beta^T$. Differentiating (2.9) with respect to β leads to

$$\sum_{i=1}^{n} \frac{\lambda^T h_i^{\beta}(\beta)}{1 + \lambda^T h_i(\beta)} = 0. \qquad (2.10)$$

The maximum EL estimates $(\hat\beta, \hat\lambda)$ are the solutions to (2.8) and (2.10). Note that (2.8) consists of J

equations for each parameter and (2.10) consists of r equations, so in total there are (J +1)r simultaneous

equations to solve. We now give the results for the large-sample behavior of the parameter estimates using

the proposed method.

THEOREM 2.1 Under the conditions given in the supplementary material available at Biostatistics online, as n → ∞,

$$n^{1/2}(\hat\beta - \beta^*) \xrightarrow{d} \mathrm{MVN}\left(0, (\Sigma_{12}^T \Sigma_{22}^{-1} \Sigma_{12})^{-1}\right), \qquad (2.11)$$

where $\Sigma_{12}$ and $\Sigma_{22}$ are defined in the supplementary material available at Biostatistics online.

THEOREM 2.2 If one of the $S_i^1, \ldots, S_i^J$ is the optimal estimating equation, then the EL estimate will be optimal in the sense that it will be equivalent to the GEE estimate with the correct specification of $R(\alpha)$. In that case, as n → ∞,

$$n^{1/2}(\hat\beta - \beta^*) \xrightarrow{d} \mathrm{MVN}\left(0, \left\{\lim_{n\to\infty} n^{-1}\sum_{i} D_i^T V_i^{-1} D_i\right\}^{-1}\right). \qquad (2.12)$$

In Theorem 2.2, $S_i^1, \ldots, S_i^J$ refer to the researcher’s “guesses” of the optimal estimating equation. In practice, it is possible that none of the guesses correspond to the optimal estimating equation. We demonstrate that even in that case, the EL method is still optimal in the sense that it optimally combines the guesses $S_i^1, \ldots, S_i^J$. This fact can be established by considering the following. Expand the left-hand side of (2.8) in a Taylor expansion around λ = 0 to give

$$0 = \sum_{i=1}^{n} \frac{h_i(\beta)}{1 + \lambda^T h_i(\beta)} = \sum_{i=1}^{n} h_i(\beta) - \sum_{i=1}^{n} h_i(\beta)h_i(\beta)^T\lambda + o_p(1) \;\Rightarrow\; \lambda = \left\{\sum_{i=1}^{n} h_i(\beta)h_i(\beta)^T\right\}^{-1}\sum_{i=1}^{n} h_i(\beta) + o_p(1). \qquad (2.13)$$

Substitute (2.13) back into the left-hand side of (2.10) to give

$$\sum_{i=1}^{n} \left\{\sum_{l=1}^{n} p_l h_l^{\beta}(\beta)\right\}^T \left\{\sum_{l=1}^{n} h_l(\beta)h_l(\beta)^T\right\}^{-1} h_i(\beta) = o_p(1)$$

$$\Rightarrow\; n^{-1}\sum_{i=1}^{n} \left\{n^{-1}\sum_{l=1}^{n} h_l^{\beta}(\beta)\right\}^T \left\{n^{-1}\sum_{l=1}^{n} h_l(\beta)h_l(\beta)^T\right\}^{-1} h_i(\beta) = 0 \qquad (2.14)$$

asymptotically. Expression (2.14) is in the form of the optimal combination of $S_i^1, \ldots, S_i^J$ (Small and McLeish, 1994, p 94).

In practice, finding the solution to the maximum EL via (2.8) and (2.10) may encounter numerical problems. Furthermore, solving (2.10) requires finding $h_i^{\beta}(\beta)$, which is not straightforward analytically. Therefore, we follow the method of Mittelhammer and others (2003) by profiling out the Lagrange multipliers as well, so that for fixed β, the Lagrange multipliers are $\lambda(\beta) = (\lambda_1^T(\beta), \ldots, \lambda_J^T(\beta))^T$. Given β, the first and second derivatives of (2.9) with respect to λ are

$$\sum_{i=1}^{n} \frac{h_i(\beta)}{1 + \lambda^T h_i(\beta)} \quad\text{and}\quad \sum_{i=1}^{n} \frac{-h_i(\beta)h_i(\beta)^T}{\{1 + \lambda^T h_i(\beta)\}^2}. \qquad (2.15)$$

Therefore, with some abuse of notation, for given β and a starting value $\lambda^0$, the following Newton–Raphson procedure can be used:

$$\lambda^{k} = \lambda^{k-1} + \left[\sum_{i=1}^{n} \frac{h_i(\beta)h_i(\beta)^T}{\{1 + (\lambda^{k-1})^T h_i(\beta)\}^2}\right]^{-1}\left[\sum_{i=1}^{n} \frac{h_i(\beta)}{1 + (\lambda^{k-1})^T h_i(\beta)}\right], \qquad (2.16)$$

and the solution used as λ(β). Substituting λ(β) back into (2.9) then gives

$$\ell(\beta) = \sum_{i=1}^{n} \log\{1 + \lambda^T(\beta)h_i(\beta)\} + n\log(n), \qquad (2.17)$$

which can be minimized with respect to β; because ℓ(β) is the negative log-EL, minimizing it maximizes the EL. Hence, the algorithm can be seen as a nested algorithm with an outside loop that involves minimizing (2.17) with respect to β, while for each β, the inside loop evaluates λ(β) using (2.16). The overall minimum of (2.17) gives the maximum EL estimate $\hat\beta$. Therefore, instead of solving (J + 1)r simultaneous equations, only a function of r parameters needs to be optimized. This method becomes especially useful when the number of estimating equations, J, is large. In our simulations, we used a simple modification of Owen’s S program for the inside loop (http://www-stat.stanford.edu/~owen/empirical/). The outside loop was performed using the optim function in R. Details of the program can be found in the supplementary material available at Biostatistics online.
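
The nested algorithm can be illustrated on a toy problem. The sketch below is not the authors' program: it assumes a scalar β (r = 1) and two estimating functions (J = 2), namely two noisy measurements of the same mean, as a stand-in for stacked GEEs. The data are made up, a golden-section search replaces R's optim for the outside loop, and the step-halving in the inside loop is an added safeguard that the text does not describe. All function names are hypothetical.

```python
import math

# Made-up data: each pair holds two noisy measurements of one common mean beta.
PAIRS = [(1.2, 0.8), (0.5, 1.1), (1.9, 1.4), (0.7, 0.3),
         (1.4, 1.6), (0.9, 1.3), (1.6, 0.9), (0.4, 0.6)]


def h_values(beta):
    """Stacked estimating functions h_i(beta) = (y_i1 - beta, y_i2 - beta)."""
    return [(y1 - beta, y2 - beta) for y1, y2 in PAIRS]


def log_sum(lam, h):
    """sum_i log{1 + lam'h_i}; -inf if any implied weight would be non-positive."""
    total = 0.0
    for a, b in h:
        d = 1.0 + lam[0] * a + lam[1] * b
        if d <= 0.0:
            return -math.inf
        total += math.log(d)
    return total


def inner_lambda(h, max_iter=100, tol=1e-12):
    """Inside loop: Newton-Raphson iteration (2.16) for lambda at fixed beta."""
    lam = (0.0, 0.0)
    for _ in range(max_iter):
        g0 = g1 = m00 = m01 = m11 = 0.0
        for a, b in h:
            d = 1.0 + lam[0] * a + lam[1] * b
            g0 += a / d
            g1 += b / d
            m00 += a * a / d ** 2
            m01 += a * b / d ** 2
            m11 += b * b / d ** 2
        det = m00 * m11 - m01 * m01
        s0 = (m11 * g0 - m01 * g1) / det   # Newton step, 2 x 2 solve by hand
        s1 = (m00 * g1 - m01 * g0) / det
        if abs(s0) + abs(s1) < tol:
            break
        # Step-halving (added safeguard): reject steps that lower the criterion
        # or that would make some 1 + lam'h_i non-positive.
        t, base = 1.0, log_sum(lam, h)
        while t > 1e-10 and log_sum((lam[0] + t * s0, lam[1] + t * s1), h) < base - 1e-12:
            t /= 2.0
        lam = (lam[0] + t * s0, lam[1] + t * s1)
    return lam


def neg_log_el(beta):
    """Profile criterion (2.17), dropping the additive constant in n."""
    h = h_values(beta)
    return log_sum(inner_lambda(h), h)


def golden_section_min(f, lo, hi, tol=1e-8):
    """Outside loop: one-dimensional golden-section minimization on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - ratio * (hi - lo)
    d = lo + ratio * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - ratio * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + ratio * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


beta_hat = golden_section_min(neg_log_el, 0.9, 1.2)
h_hat = h_values(beta_hat)
lam_hat = inner_lambda(h_hat)
n = len(PAIRS)
# EL weights from (2.7); they sum to 1 and satisfy sum_i p_i h_i = 0.
weights = [1.0 / (n * (1.0 + lam_hat[0] * a + lam_hat[1] * b)) for a, b in h_hat]
```

Only the one-dimensional search over β sees the outer problem; the multiplier λ is recomputed from scratch at every trial value of β, which mirrors the nested structure described above.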

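As an example, let $\mu_{ik} = \beta_0 + x_{ik}\beta_1$, i = 1, ..., n, k = 1, ..., K, and $\beta = (\beta_0, \beta_1)$. Furthermore, suppose 2 different choices of $R(\alpha)$ are used, namely, the AR(1) with $\alpha_{ij} = \alpha^{|i-j|}$ and the exchangeable with $\alpha_{ij} = \alpha$, for all $i \ne j$. Then,

$$h_i(\beta) = (S_i^1(\beta)^T, S_i^2(\beta)^T)^T = \begin{pmatrix} \mathbf{1}^T A_i^{-1/2}\{R_i^1(\alpha)\}^{-1}A_i^{-1/2}\{y_i - (\beta_0\mathbf{1} + \beta_1 x_i)\} \\ x_i^T A_i^{-1/2}\{R_i^1(\alpha)\}^{-1}A_i^{-1/2}\{y_i - (\beta_0\mathbf{1} + \beta_1 x_i)\} \\ \mathbf{1}^T A_i^{-1/2}\{R_i^2(\alpha)\}^{-1}A_i^{-1/2}\{y_i - (\beta_0\mathbf{1} + \beta_1 x_i)\} \\ x_i^T A_i^{-1/2}\{R_i^2(\alpha)\}^{-1}A_i^{-1/2}\{y_i - (\beta_0\mathbf{1} + \beta_1 x_i)\} \end{pmatrix},$$

where $\mathbf{1} = (1, \ldots, 1)^T$. Furthermore, $\lambda = (\lambda_1^T, \lambda_2^T)^T \equiv (\lambda_{11}, \lambda_{12}, \lambda_{21}, \lambda_{22})^T$ and

$$\begin{aligned} \lambda^T h_i(\beta) ={}& \lambda_{11}\mathbf{1}^T A_i^{-1/2}\{R_i^1(\alpha)\}^{-1}A_i^{-1/2}\{y_i - (\beta_0\mathbf{1} + \beta_1 x_i)\} \\ &+ \lambda_{12}x_i^T A_i^{-1/2}\{R_i^1(\alpha)\}^{-1}A_i^{-1/2}\{y_i - (\beta_0\mathbf{1} + \beta_1 x_i)\} \\ &+ \lambda_{21}\mathbf{1}^T A_i^{-1/2}\{R_i^2(\alpha)\}^{-1}A_i^{-1/2}\{y_i - (\beta_0\mathbf{1} + \beta_1 x_i)\} \\ &+ \lambda_{22}x_i^T A_i^{-1/2}\{R_i^2(\alpha)\}^{-1}A_i^{-1/2}\{y_i - (\beta_0\mathbf{1} + \beta_1 x_i)\}. \end{aligned}$$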
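
The stacked vector in this example can be computed for one subject as follows. This is a minimal sketch under two stated assumptions that are not in the text: the variance matrix $A_i$ is taken to be the identity, and α is treated as a known constant rather than estimated. The helper names are hypothetical, and the linear solve is a plain Gaussian elimination so that the block has no dependencies.

```python
def ar1(K, alpha):
    """AR(1) working correlation: alpha**|j - k|."""
    return [[alpha ** abs(j - k) for k in range(K)] for j in range(K)]


def exchangeable(K, alpha):
    """Exchangeable working correlation: alpha off the diagonal."""
    return [[1.0 if j == k else alpha for k in range(K)] for j in range(K)]


def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (A[r][n] - sum(A[r][k] * z[k] for k in range(r + 1, n))) / A[r][r]
    return z


def stacked_h(y, x, beta0, beta1, alpha):
    """h_i(beta) for mu_ik = beta0 + beta1 * x_ik with A_i = I (assumption).

    Returns [1'R1^{-1}e, x'R1^{-1}e, 1'R2^{-1}e, x'R2^{-1}e], where e is the
    residual vector, R1 is AR(1) and R2 is exchangeable.
    """
    K = len(y)
    resid = [y[k] - (beta0 + beta1 * x[k]) for k in range(K)]
    out = []
    for R in (ar1(K, alpha), exchangeable(K, alpha)):
        w = solve(R, resid)                       # R^{-1} (y_i - mu_i)
        out.append(sum(w))                        # intercept component
        out.append(sum(xk * wk for xk, wk in zip(x, w)))  # slope component
    return out


# One subject with K = 2 visits; for K = 2 the two structures coincide.
h_example = stacked_h([1.0, 0.0], [0.0, 1.0], 0.0, 0.0, 0.5)
```

With K = 2 the AR(1) and exchangeable matrices are identical, so the two halves of `h_example` agree; the structures, and hence the two GEEs, only separate once K is at least 3.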

3. SIMULATIONS

We carried out a simulation study to evaluate the moderate sample properties of the proposed method.

Two sets of simulations were used:

Set A: $x_{ik}$, i = 1, ..., n, k = 1, ..., 10, are independent and identically distributed as $N(1, \sigma_x^2)$ with $\sigma_x = 1$. In this setup, the $x_{ik}$ are subject-specific covariates that may change over time and are different between subjects but there is no time trend.

Set B: $x_i = (x_{i1}, \ldots, x_{iK})$ followed $\mathrm{MVN}(0, \sigma_x^2 R)$ with R a 10 × 10 matrix with unit diagonal and off-diagonal elements equal to 0.2 and $\sigma_x = 0.5$. In this setup, the intrasubject covariates are correlated over time.

Each set of simulations was based on 1000 runs. Sample sizes were n = 100 and 200. The following model was used for the mean response at time k for the ith subject, $E(y_{ik}) \equiv \mu_{ik} = \beta_0 - \beta_1 x_{ik}$, k = 1, ..., 10, i = 1, ..., n. The true values of $(\beta_0, \beta_1)$ were (1, −1). The simulation study shows that the proposed method is nearly as efficient as the standard GEE using the correct working correlation model and is superior to the standard GEE using an incorrect working correlation model. We also evaluated the empirical coverage probability of 95% confidence intervals of $(\beta_0, \beta_1)$ using Theorem 2 and found that they are close to the nominal level. Details of the results are given as supplementary material available at Biostatistics online.
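
The two covariate designs can be generated as in the following sketch. It covers the covariates only, since the response distribution is described in the supplementary material rather than here. For Set B it relies on a shared-factor construction, which is one standard way (an assumption of this sketch, not necessarily the authors' approach) to obtain a normal vector with unit-diagonal correlation matrix and constant off-diagonal 0.2. Function names and the seed are arbitrary.

```python
import math
import random


def simulate_set_a(n, K=10, sigma_x=1.0, rng=None):
    """Set A: x_ik i.i.d. N(1, sigma_x^2), no correlation over time."""
    rng = rng or random.Random()
    return [[rng.gauss(1.0, sigma_x) for _ in range(K)] for _ in range(n)]


def simulate_set_b(n, K=10, sigma_x=0.5, rho=0.2, rng=None):
    """Set B: x_i ~ MVN(0, sigma_x^2 R), R with unit diagonal and off-diagonal rho.

    Uses x_ik = sigma_x * (sqrt(rho) * z_i + sqrt(1 - rho) * e_ik), where z_i is
    a subject-level standard normal and e_ik are independent standard normals.
    """
    rng = rng or random.Random()
    rows = []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)
        rows.append([sigma_x * (math.sqrt(rho) * shared
                                + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
                     for _ in range(K)])
    return rows


def sample_corr(u, v):
    """Pearson correlation of two equal-length lists."""
    m = len(u)
    mu, mv = sum(u) / m, sum(v) / m
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


rng = random.Random(2009)
set_a = simulate_set_a(200, rng=rng)
set_b = simulate_set_b(2000, rng=rng)   # larger n only to check the correlation
corr_01 = sample_corr([r[0] for r in set_b], [r[1] for r in set_b])
```

Each covariate in Set B has variance $\sigma_x^2(\rho + 1 - \rho) = \sigma_x^2$ and any two share covariance $\sigma_x^2\rho$, so the construction reproduces $\sigma_x^2 R$ exactly in distribution.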

4. APPLICATION TO INDONESIAN CHILDREN’S INFECTION DATA

In this section, we apply the proposed method to data from a longitudinal study of the respiratory infection

rate in a group of Indonesian children (Diggle and others, 2002). The sample consists of 275 preschool

children examined at 3-month intervals for 18 months. The maximum number of visits is therefore K = 6.

In total, the 275 children generated 1200 repeated measures of the response (infection versus no infection).

The primary interest in this study is to determine the relationship between respiratory infection and

Vitamin A deficiency while adjusting for a number of confounders, as listed in Table 1.

We fitted the data by a GEE using an exchangeable (CS), AR(1), and MA(1) working correlation. We

then used the proposed method using $R^1(\alpha)$ = CS, $R^2(\alpha)$ = AR(1), and $R^3(\alpha)$ = MA(1). The results

are given in Table 1. Standard errors of the estimates for $\mathrm{GEE}_{MA(1)}$, $\mathrm{GEE}_{CS}$, and $\mathrm{GEE}_{AR(1)}$ were obtained

from the R routine geese in the geepack package. Those for the proposed method were estimated using

Theorem 1. The results using the 4 methods are quite similar. The conclusions from all methods are the

same, that there is no evidence of increased risk for infection due to xerophthalmia. These conclusions are

similar to those in earlier studies (e.g. Zeger and Karim, 1991; Lin and Carroll, 2001).

As a means to compare the merits of the different models, we used Akaike’s information criterion (AIC) for GEE as developed by Pan (2001). Let $Q(\beta) \equiv -\sum_{i=1}^{n}\{y_i - \mu_i(\beta)\}^T V_i^{-1}(y_i - \mu_i(\beta))$. Then, to assess the merits of a model with parameter estimates $\hat\beta$ obtained using a working correlation matrix R, the AIC is defined as $-2Q(\hat\beta) + 2\,\mathrm{trace}(\hat\Sigma_I^{-1}\hat\Sigma_R)$, where $\hat\Sigma_I^{-1}$ is the inverse of the variance of the model coefficients under an independence working correlation and $\hat\Sigma_R$ is that under working correlation R. The AIC values for the 4 methods are given in Table 2. In Table 2, we also give the second term of the AIC, that is, $2\,\mathrm{trace}(\hat\Sigma_I^{-1}\hat\Sigma_R)$, which has been shown in Hin and Wang (2009) to be more accurate in capturing the true correlation structure. The method proposed in this paper has the lowest AIC value and

Table 1. Parameter estimates (SE) using 4 methods to analyze the Indonesian children’s infection data

ParameterMethod

GEEMA(1)

−2.371 (0.162)

−0.0317 (0.00628)

0.680 (0.431)

−0.543 (0.161)

−0.398 (0.237)

−0.0488 (0.0244)

GEECS

GEEAR(1)

−2.377 (0.162)

−0.0315 (0.00627)

0.717 (0.419)

−0.550 (0.161)

−0.394 (0.237)

−0.0478 (0.0244)

EL

Intercept

Age

Xerophthalmia

Cos (season)

Sex

Height for age

−2.367 (0.162)

−0.0316 (0.00628)

0.651 (0.438)

−0.538 (0.160)

−0.396 (0.237)

−0.0493 (0.0243)

−2.370 (0.146)

−0.0317 (0.00578)

0.763 (0.372)

−0.537 (0.153)

−0.408 (0.227)

−0.0498 (0.0224)

Table 2. Goodness of fit for the 4 methods in the Indonesian children’s infection data analysis

Method        AIC         2 trace($\hat\Sigma_I^{-1}\hat\Sigma_R$)
GEE_MA(1)     3312.993    12.099
GEE_CS        3313.351    12.118
GEE_AR(1)     3313.575    12.137
EL            3310.236    10.199

the lowest value in $2\,\mathrm{trace}(\hat\Sigma_I^{-1}\hat\Sigma_R)$ and therefore, by these measures, is the most preferred method for this data set.

5. CONCLUSION AND FUTURE RESEARCH

We have introduced a method for combining GEEs in analyzing longitudinal data so as to improve the

efficiency of the GEE method when, as is typically the case in practice, correct specification of the correlation

structure is problematic.

Validity of the GEE approach requires correct specification of the mean function. If some observations are missing completely at random (Little and Rubin, 1987), the mean function is not affected, and in those cases, the method proposed here remains valid. However, if the missingness probability depends on the observed responses (missing at random) or on the missing responses conditional on the observed responses (nonignorable missingness), then correct modeling of the missingness probability is required for the GEE approach, and therefore our method, to be valid. However, correct specification of the missingness probability is a nontestable condition (Gill and others, 1997; Manski, 2003). Therefore, if missing at random or nonignorable missingness is suspected, some sort of sensitivity analysis is necessary.

A related method to the one proposed in this paper is the quadratic inference function (QIF) by Qu and others (2000). In their work, the inverse of the working correlation matrix is approximated by a linear combination of basis matrices, $M_i$, i = 1, ..., m, such as

$$R(\alpha)^{-1} \approx a_0 M_0 + a_1 M_1 + \cdots + a_m M_m, \qquad (5.1)$$

where $M_1 = I_{K\times K}$ is an identity matrix and $M_i$, i = 2, ..., m, are known symmetric matrices. Instead of estimating $a_0, \ldots, a_m$ directly, they recognized that a GEE based on (5.1) is equivalent to solving the linear combination of a vector of estimating equations:

$$g_n(\beta) = \frac{1}{n}\sum_{i=1}^{n} g_i(\beta) = \frac{1}{n}\sum_{i=1}^{n}\begin{pmatrix} D_i^T A_i^{-1/2} M_1 A_i^{-1/2}\{y_i - \mu_i\} \\ D_i^T A_i^{-1/2} M_2 A_i^{-1/2}\{y_i - \mu_i\} \\ \vdots \\ D_i^T A_i^{-1/2} M_m A_i^{-1/2}\{y_i - \mu_i\} \end{pmatrix},$$

which can be performed using the generalized method of moments (Hansen, 1982). Their method gives $\hat\beta_{QIF} = \arg\min_{\beta} Q_n(\beta) \equiv g_n^T(\beta) C_n^{-1}(\beta) g_n(\beta)$, where $C_n(\beta) = (1/n^2)\sum_{i=1}^{n} g_i(\beta) g_i^T(\beta)$ is an estimate of the variance of $g_n(\beta)$. We used a modest simulation study to compare our method to QIF and found that the 2 methods give very similar results throughout when the true correlation structure is AR(1), whereas for the CS structure there does seem to be a substantial difference in favor of EL when α = 0.7. Detailed results of the study are given as supplementary material available at Biostatistics online. However, we view these results as preliminary. More work needs to be done to compare these 2 related methods.
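
The QIF criterion itself is easy to evaluate once the per-subject vectors $g_i(\beta)$ are available. The sketch below takes those vectors as given (building them from $D_i$, $A_i$ and the basis matrices is omitted), uses made-up numbers, and solves the linear system with a plain Gaussian elimination; the function names are hypothetical.

```python
def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (A[r][n] - sum(A[r][k] * z[k] for k in range(r + 1, n))) / A[r][r]
    return z


def qif(g_list):
    """Q_n = g_n' C_n^{-1} g_n with g_n = mean of g_i and C_n = sum g_i g_i' / n^2."""
    n = len(g_list)
    q = len(g_list[0])
    g_bar = [sum(g[j] for g in g_list) / n for j in range(q)]
    C = [[sum(g[j] * g[k] for g in g_list) / n ** 2 for k in range(q)]
         for j in range(q)]
    z = solve(C, g_bar)                     # C_n^{-1} g_n without forming the inverse
    return sum(a * b for a, b in zip(g_bar, z))


# Made-up per-subject vectors g_i at some fixed beta (three subjects, q = 2).
q_value = qif([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

In an actual QIF fit this function would be minimized over β, with `g_list` recomputed at each trial value, in the same way that the profile criterion (2.17) is optimized for the EL method.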

Finally, our proposed method is motivated by combining GEEs to find an optimal combination of

working correlations for a single data set. There are also situations where multiple longitudinal studies

are to be combined in a single analysis, for example, in a meta-analysis or multicenter study (e.g. Inoue

and others, 2004). In that case, S1(β), ..., SJ(β) may be viewed as GEEs from the different studies that

share a common parameter β of interest. The difference between that situation and the one considered

here is the multiple samples in the former. The method proposed here can be modified using a multiple

sample EL.

6. ACKNOWLEDGMENTS

We thank Dr Annie Qu and Ms Guei-feng (Cindy) Tsai for their valuable comments. We also thank the

referees and the coeditor for their valuable comments.

FUNDING

Research Center at Singapore Management University to D.L. (05-C208-SMU-003).

REFERENCES

ALBERT, P. S. AND MCSHANE, L. M. (1995). A generalized estimating equations approach for spatially correlated

binary data: applications to the analysis of neuroimaging data. Biometrics 51, 627–638.

CHAGANTY, N. R. (1997). An alternative approach to the analysis of longitudinal data via generalized estimating

equations. Journal of Statistical Planning and Inference 63, 39–54.

CHAGANTY, N. R. AND SHULTS, J. (1999). On eliminating the asymptotic bias in the quasi-least squares estimate

of the correlation parameter. Journal of Statistical Planning and Inference 76, 145–161.

DESMOND, A. (1997). Optimal estimating functions, quasi-likelihood and statistical modelling. Journal of Statistical

Planning and Inference 60, 77–121.

DIGGLE, P. J., HEAGERTY, P., LIANG, K. Y. AND ZEGER, S. L. (2002). Analysis of Longitudinal Data. Oxford:

Oxford University Press.

FITZMAURICE, G. M. (1995). A caveat concerning independence estimating equations with multivariate binary data.

Biometrics 51, 309–317.

GILL, R. D., VAN DER LAAN, M. J. AND ROBINS, J. M. (1997). Coarsening at random: characterizations,

conjectures, counter-examples. In: Lin, D. Y. and Fleming, T. R. (editors), Proceedings of the First Seattle Symposium in

Biostatistics: Survival Analysis. New York: Springer, pp. 255–294.

HALL, D. AND SEVERINI, T. A. (1998). Extended generalized estimating equations for clustered data. Journal of

the American Statistical Association 93, 1365–1375.

HANSEN, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica 50,

1029–1054.

HIN, L.-Y. AND WANG, Y.-G. (2009). Working-correlation-structure identification in generalized estimating equations.

Statistics in Medicine 28, 642–658.

INOUE, L. Y., ETZIONI, R., SLATE, E., MORRELL, C. AND PENSON, D. F. (2004). Combining longitudinal studies

of PSA. Biostatistics 5, 483–500.

LIANG, K. Y. AND ZEGER, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73,

13–22.

LIANG, K. Y., ZEGER, S. L. AND QAQISH, B. (1992). Multivariate regression analyses for categorical data (with

discussion). Journal of the Royal Statistical Society, Series B 54, 3–24.

LIN, X. AND CARROLL, R. J. (2001). Semiparametric regression for clustered data using generalized estimating

equations. Journal of the American Statistical Association 96, 1045–1056.

LITTLE, R. AND RUBIN, D. (1987). Statistical Analysis with Missing Data. New York: Wiley.

MANSKI, C. (2003). Partial Identification of Probability Distributions. New York: Springer.

MITTELHAMMER, R., JUDGE, G. AND SCHOENBERG, R. (2003). Empirical evidence concerning the finite sample

performance of EL-type structural equation estimation and inference methods. CUDARE Working Paper Series,

Paper 945. Berkeley, CA: University of California.

OWEN, A. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237–249.

OWEN, A. (2001). Empirical Likelihood. Boca Raton, FL: Chapman and Hall.

PAN, W. (2001). Akaike’s information criterion in generalized estimating equations. Biometrics 57, 120–125.

PENDERGAST, J. F., GANGE, S. J., NEWTON, M. A., LINDSTROM, M. J., PALTA, M. AND FISHER, M. R. (1996).

A survey of methods for analysing clustered binary response data. International Statistical Review 64, 89–118.

QIN, J. AND LAWLESS, J. (1994). Empirical likelihood and general estimating functions. Annals of Statistics 22,

300–325.

QU, A., LINDSAY, B. AND LI, B. (2000). Improving generalised estimating equations using quadratic inference

functions. Biometrika 87, 823–836.

SMALL, C. G. AND MCLEISH, D. L. (1994). Hilbert Space Methods in Probability and Statistical Inference.

New York: Wiley.

WANG, Y. G. AND CAREY, V. (2003). Working correlation structure misspecification, estimation and covariate

design: implications for generalised estimating equations performance. Biometrika 90, 29–41.

ZEGER, S. L. AND KARIM, M. R. (1991). Generalized linear models with random effects. Journal of the American

Statistical Association 86, 79–86.

[Received November 1, 2007; revised May 12, 2008; second revision October 18, 2008;

accepted for publication January 20, 2009]