International Journal of Epidemiology

© International Epidemiological Association 1994

Vol. 23, No. 1

Printed in Great Britain

Letters to the Editor

Odds Ratio or Relative Risk for Cross-Sectional Data?

From JAMES LEE

Sir—The cross-sectional study is widely used in many

areas of research. Although this study design is appro-

priate mainly for descriptive investigations, it is used

also in some aetiological enquiries.1,2 Two effect

measures—the prevalence rate ratio (PRR) and

prevalence odds ratio (POR)—can be ascertained from

cross-sectional data with a dichotomous outcome

variable (presence or absence of a condition).

A cursory look at the epidemiology journals will

attest that the POR is much more frequently reported

than is the PRR. This practice is apparently attributable

to the routine use of logistic regression for the analysis

of cross-sectional data. Logistic regression is a

valuable statistical tool in that it allows statistical ad-

justment of several confounders as well as assessment

of effect modification based on modest study size. The

problem is that it gives POR as an effect measure but

the PRR appears to be a more meaningful statistic for

cross-sectional data.

First, the odds ratio is incomprehensible.1,2 As emphasized

by Savitz,3 an epidemiological measure must

not only convey the most germane information, but it

must also be easy to communicate and to comprehend.

As such, the odds ratio has no direct usefulness except

as a numerical mimic to other effect measures such as

the relative risk (rate ratio) or incidence density ratio.

In contrast, the PRR is easy to interpret. If the PRR

were 5, then at any given point in time the 'exposed'

subjects in the population are 5 times as likely to

have the condition in question as are the 'unexposed'

subjects. If the condition is of low prevalence, then

POR would be numerically similar to PRR so it would

not matter which effect measure was used. Because the

cross-sectional study is not appropriate for a rare

exposure or condition, the POR will generally be

markedly discrepant from PRR.
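The way the gap between POR and PRR widens with prevalence can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the example prevalences are assumptions made here, not values from the study. It uses the identity POR = PRR x (1 - p0)/(1 - p1), where p0 and p1 are the prevalences among the unexposed and the exposed.

```python
def prevalence_odds_ratio(p_unexposed: float, prr: float) -> float:
    """Return the POR implied by a baseline prevalence and a given PRR."""
    p_exposed = prr * p_unexposed
    if not (0.0 < p_unexposed < 1.0 and 0.0 < p_exposed < 1.0):
        raise ValueError("prevalences must lie strictly between 0 and 1")
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    return odds_exposed / odds_unexposed


if __name__ == "__main__":
    # Hypothetical baseline prevalences, all with the same PRR of 2.
    for p0 in (0.01, 0.10, 0.40):
        por = prevalence_odds_ratio(p0, 2.0)
        print(f"p0={p0:.2f}  PRR=2.00  POR={por:.2f}")
```

With a PRR of 2, the implied POR stays near 2 at a baseline prevalence of 1% but rises to 2.25 at 10% and to 6 at 40%.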

The odds ratio is the effect measure in a case-control

study only because the rate ratio cannot be deter-

mined. Fortunately the case-control study is most

Division of Biostatistics and Health Informatics, Department of Com-

munity, Occupational and Family Medicine, National University of

Singapore, NUH, Lower Kent Ridge, Singapore 0511.

suitable for diseases of low incidence, in which case

the odds ratio numerically resembles the rate ratio.

This is one of Cornfield's4 great contributions to

epidemiology and it made the case-control study im-

mensely popular. It has also been shown that the case-

control odds ratio is a direct estimate of the incidence

density ratio without imposing the 'rare disease'

assumption.5,6 Thus the splendour of the case-control

odds ratio is simply that it need not be interpreted in

terms of the odds ratio.

Greenland7 has demonstrated persuasively that as an

effect measure, the odds ratio is more defective for

cohort studies than is generally realized. The appro-

priate effect measure for the closed cohort is the

cumulative incidence ratio and that for the dynamic

cohort is the incidence density ratio. What this means

is that logistic regression, which is sometimes

employed for the analysis of closed and dynamic

cohort data, is not appropriate. Another serious pitfall

of logistic regression is that it does not consider the

time interval between exposure and disease occurrence

in the dynamic cohort.

The choice of effect measure for the cross-sectional

study (POR versus PRR) appears to be more equivocal

and, expectedly, textbooks are not explicit and may

even be contradictory. Thus, Checkoway8 prefers

POR whereas Elwood9 seems to favour PRR. Klein-

baum et al.6 noted that for a cross-sectional study in

which the disease has a protracted risk period (long

and ill-defined interval between exposure and disease

occurrence), the logical effect measure for aetiological

inference is the incidence density ratio (IDR). (If such

a condition were studied longitudinally, the study

design of choice would be the dynamic cohort, which

gives IDR). These authors6 also showed that the cross-

sectional POR is a better numerical approximation of

the IDR than is the cross-sectional PRR (the PRR

tends to underestimate IDR). However, the apparent

advantage of POR over PRR has little practical use

since a disease with a protracted risk period, especially

if the aetiological agent is changeable over time,

should not be investigated by a cross-sectional

design.2,10 Indeed, the cross-sectional study should

only be used for diseases with short and well-defined

risk periods, in which case the most logical effect

measure is the PRR.6 (If such a condition were studied

longitudinally, the closed cohort design would likely be

employed, which gives the cumulative incidence ratio,

or incidence relative risk). Also, the cross-sectional

study is used mostly for non-aetiological research such

as health care planning and resource allocation, in

which case the prevalence rate is more germane than

the incidence rate and consequently there will be no

reason to use the POR. This means that logistic regres-

sion, which is often employed for the analysis of cross-

sectional data, is inappropriate.
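The relation between POR, PRR and IDR discussed in this paragraph can be sketched with the standard steady-state identity, which is taken here as an assumption and is not derived in this letter: in a stable population the prevalence odds equal the incidence density multiplied by the mean duration of the condition. The symbols P_i, ID_i and \bar{D}_i (prevalence, incidence density and mean duration in exposure group i) are notation introduced for this sketch.

```latex
\frac{P_i}{1-P_i} = \mathrm{ID}_i\,\bar{D}_i \quad (i = 0, 1)
\qquad\Longrightarrow\qquad
\mathrm{POR} = \frac{P_1/(1-P_1)}{P_0/(1-P_0)}
             = \mathrm{IDR}\cdot\frac{\bar{D}_1}{\bar{D}_0},
\qquad
\mathrm{PRR} = \frac{P_1}{P_0}
             = \mathrm{POR}\cdot\frac{1-P_1}{1-P_0}.
```

If the mean duration does not depend on exposure, POR equals IDR, whereas PRR lies closer to 1 than IDR whenever the two prevalences differ.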

Therefore, what is needed for cross-sectional data is

a statistical model that estimates PRR rather than

POR yet preserves the virtues of logistic regression,

viz., statistical adjustment of several confounders and

assessment of effect modification based on modest

study size. Cox's proportional hazards model was

originally developed for the estimation of the condi-

tional hazard ratio and 'survival functions' based on

complete or censored longitudinal data with varying

follow-up times, viz., the 'person-time at risk' dynamic

cohort.11,12 (The dynamic cohort also gives the

incidence density ratio which can be estimated by

Poisson regression.) Subsequently, Breslow13 showed

that by assuming a constant risk period, namely the 'persons

at risk' closed cohort, the conditional hazard

ratio estimated by Cox's model is equal to the

cumulative incidence ratio. Thus by assuming a constant

risk period, the Cox model can be adapted to estimate

PRR for cross-sectional data.
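A minimal sketch of this adaptation follows. It is not the author's SAS program: it assumes Breslow's approximation for tied event times and gives every subject the same follow-up time, so that all events are tied and the risk set is the whole sample. Under those assumptions the partial likelihood depends only on grouped counts. The function names, the Newton iteration and the restriction to three groups are choices made for this illustration; the counts are those of Table 1.

```python
import math

# Counts from Table 1: mothers per education group and how many of them knew
# correctly. Order: primary (reference), secondary, tertiary.
TOTALS = [71, 105, 15]
EVENTS = [11, 44, 13]


def score_and_information(totals, events, b1, b2):
    """Gradient and 2x2 information of the Breslow log partial likelihood
    d1*b1 + d2*b2 - D*log(n0 + n1*exp(b1) + n2*exp(b2))."""
    n0, n1, n2 = totals
    d_all = sum(events)
    s = n0 + n1 * math.exp(b1) + n2 * math.exp(b2)
    w1 = n1 * math.exp(b1) / s
    w2 = n2 * math.exp(b2) / s
    g1 = events[1] - d_all * w1
    g2 = events[2] - d_all * w2
    i11 = d_all * w1 * (1.0 - w1)
    i22 = d_all * w2 * (1.0 - w2)
    i12 = -d_all * w1 * w2
    return (g1, g2), (i11, i12, i22)


def fit_constant_time_cox(totals, events, iterations=25):
    """Newton-Raphson fit; returns log ratios and standard errors versus
    the reference group (assumes exactly three groups)."""
    b1 = b2 = 0.0
    for _ in range(iterations):
        (g1, g2), (i11, i12, i22) = score_and_information(totals, events, b1, b2)
        det = i11 * i22 - i12 * i12
        # Newton step: beta <- beta + inverse(information) * score.
        b1 += (i22 * g1 - i12 * g2) / det
        b2 += (-i12 * g1 + i11 * g2) / det
    # Standard errors from the inverse information at the fitted values.
    _, (i11, i12, i22) = score_and_information(totals, events, b1, b2)
    det = i11 * i22 - i12 * i12
    return (b1, b2), (math.sqrt(i22 / det), math.sqrt(i11 / det))


if __name__ == "__main__":
    betas, ses = fit_constant_time_cox(TOTALS, EVENTS)
    for label, b, se in zip(("Secondary", "Tertiary"), betas, ses):
        low, high = math.exp(b - 1.96 * se), math.exp(b + 1.96 * se)
        print(f"{label}: ratio {math.exp(b):.2f} (95% CI {low:.2f}-{high:.2f})")
```

Under these assumptions the fitted ratios coincide with the crude prevalence ratios (44/105)/(11/71) and (13/15)/(11/71), that is, about 2.70 and 5.59. Adjusted estimates would need individual-level covariates, which are not reproduced in this letter.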

To illustrate the application of the Cox model for

the estimation of PRR with adjustment of confound-

ing, we consider a cross-sectional study to assess

whether the mother's correct knowledge (yes versus

no) about the developmental screening programme in

the maternal and child health clinic (dichotomous

response variable) is related to her educational

background (primary predictor variable). Potential

confounders include ethnicity, whether or not the

mother attended health education talks and whether

the mother is working outside the home or is a

housewife. Further details of the study are given

elsewhere.14 The observed results are given in Table 1.

It is clear that the PRR estimated by Cox's model

(Table 2) are highly discrepant from the POR

estimated by the logistic model (Table 3). All statistical

analyses were carried out using SAS.15

(The programs and related information document-

ing the analysis are available from the author. Please

send a 3.5 inch diskette for storage.)

TABLE 1 Mother's knowledge of developmental screening according to her educational attainment

                     Educational attainment
Know correctly?      Primary       Secondary     Tertiary
Yes                  11 (15.5%)    44 (41.9%)    13 (86.6%)
No                   60            61            2
Total                71            105           15

TABLE 2 Crude and adjusted^a prevalence rate ratio (relative likelihood) of correct knowledge of development screening according to educational attainment: proportional hazards model

Educational      Crude                   Adjusted^a
attainment       Rate ratio (95% CI)     Rate ratio (95% CI)
Secondary^b      2.70 (1.40-5.24)        2.83 (1.43-5.56)
Tertiary^b       5.59 (2.51-12.49)       5.57 (2.38-13.04)

a Adjusted for ethnicity (Malay relative to Chinese), whether or not the mother attended health education talks, and whether the mother is working outside home or a housewife.

b Relative to primary.

TABLE 3 Crude and adjusted^a prevalence odds ratio (relative odds) of correct knowledge of development screening according to educational attainment: logistic model

Educational      Crude                   Adjusted^a
attainment       Odds ratio (95% CI)     Odds ratio (95% CI)
Secondary^b      3.93 (1.86-8.33)        4.34 (1.97-9.60)
Tertiary^b       35.46 (7.01-179)        34.35 (6.47-182)

a Adjusted for ethnicity (Malay relative to Chinese), whether or not the mother attended health education talks, and whether the mother is working outside home or a housewife.

b Relative to primary.
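The crude entries of Tables 2 and 3 can be checked directly from the counts in Table 1. The sketch below does this under one stated assumption: the letter does not say how its confidence intervals were obtained, so a Wald interval on the log odds ratio is used here purely for illustration. The function name is likewise hypothetical.

```python
import math

# Counts from Table 1: (knows correctly, does not) by educational attainment.
TABLE_1 = {"Primary": (11, 60), "Secondary": (44, 61), "Tertiary": (13, 2)}


def crude_measures(group, reference):
    """Return (PRR, POR, 95% Wald interval for POR) for group vs reference."""
    a, b = group        # yes / no in the comparison group
    c, d = reference    # yes / no in the reference group
    prr = (a / (a + b)) / (c / (c + d))
    por = (a * d) / (b * c)
    # Assumed interval: Wald limits on the log odds ratio scale.
    se_log_por = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(por) - 1.96 * se_log_por)
    high = math.exp(math.log(por) + 1.96 * se_log_por)
    return prr, por, (low, high)


if __name__ == "__main__":
    for level in ("Secondary", "Tertiary"):
        prr, por, (low, high) = crude_measures(TABLE_1[level], TABLE_1["Primary"])
        print(f"{level}: PRR {prr:.2f}, POR {por:.2f} ({low:.2f}-{high:.2f})")
```

The point estimates land close to the crude columns of the two tables, and the contrast is sharpest for the tertiary group, where 13 of 15 mothers answered correctly: the prevalence ratio is below 6 while the odds ratio exceeds 35.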

ACKNOWLEDGEMENT:

I wish to express my thanks to Dr M M Thein for the

use of her data.14

REFERENCES

1 Miettinen O S. Theoretical Epidemiology. New York: John Wiley,

1985.

2 Rothman K J. Modern Epidemiology. Boston: Little, Brown,

1986.

3 Savitz D A. Measurements, estimates and inferences in reporting

epidemiologic study results. Am J Epidemiol 1992; 135:

223-24.

4 Cornfield J. A statistical property arising from retrospective

studies. Proceedings of the 3rd Berkeley Symposium on

Mathematical and Statistical Problems, 1956; 4: 135-48.

5 Miettinen O S. Estimability and estimation in case-referent studies.

Am J Epidemiol 1976; 103: 226-35.

6 Kleinbaum D G, Kupper L L, Morgenstern H. Epidemiologic

Research: Principles and Quantitative Methods. Belmont,

CA: Lifetime Learning Publications, 1982.

7 Greenland S. Interpretation and choice of effect measures in

epidemiologic analysis. Am J Epidemiol 1987; 125: 761-68.

8 Checkoway H, Pearce N, Crawford-Brown D J. Research Methods

in Occupational Epidemiology. New York: Oxford University

Press, 1989.

9 Elwood J M. Causal Relationships in Medicine. New York: Oxford

University Press, 1988.

10 Flanders W D, Lin L, Pirkle J L, Caudill S P. Assessing the direction

of causality in cross-sectional studies. Am J Epidemiol

1992; 135: 926-35.

11 Cox D R. Regression models and life-tables (with discussion).

J R Stat Soc B 1972; 34: 187-220.

12 Lee J, Yoshizawa C, Wilkens L, Lee H P. Covariance adjustment

of survival curves based on Cox's proportional hazard regression

model. Comput Appl Biosci (UK) 1992; 8: 23-27.

13 Breslow N E. Covariance analysis of censored survival data.

Biometrics 1974; 30: 89-99.

14 Thein M M, Lee J, Yoong T. Knowledge about developmental

screening in mothers attending a maternal and child health

clinic in Singapore. Ann Acad Med (Singapore) 1992; 21:

735-40.

15 SAS/STAT Software: The PHREG Procedure. SAS Technical

Report P-217. Cary, NC: SAS Institute, 1991.

Ascertainment Corrected Rates

From LAMBERTUS A L M KIEMENEY, LEO J SCHOUTEN AND HUUB STRAATMAN

Sir—In their well-written paper McCarty and col-

leagues argued that 'all rates be reported only after

formal evaluation and adjustments for underascertain-

ment have been completed'.1 Although they state that

the exact methodology to employ such evaluation and

adjustment is not critical, a strong case is made for the

use of capture-recapture methods. We agree with McCarty

et al. that capture-recapture methods can be a

very useful tool in the evaluation of completeness of

registries. In our opinion, however, capture-recapture

analyses and the interpretation of their results are less

straightforward than the authors suggest.

The first difficulty is the assumption of in-

dependence. The authors recognize that (without extra

information) the dependency between sources is not

identifiable in the two-sources situation. When they

explain that dependencies can be controlled if at least

three sources are available, e.g. by means of log-linear

modelling, they fail to recognize that the three-way in-

teraction term in this modelling approach will remain

unknown. Comparable to the two-sources situation it

is not possible to estimate the quantitative relevance of

this three-way dependency between the sources

without extra information.
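The effect of unrecognized dependence in the two-source situation can be illustrated with expected counts. The sketch below uses the simple two-source (Lincoln-Petersen) estimator as an assumed stand-in for the methods under discussion. The population size, capture probabilities and function names are hypothetical values chosen for the illustration.

```python
def two_source_estimate(n1: float, n2: float, both: float) -> float:
    """Two-source (Lincoln-Petersen) estimate of the total number of cases."""
    if both <= 0:
        raise ValueError("no overlap between sources: estimate undefined")
    return n1 * n2 / both


def expected_counts(total: float, p1: float, p2: float, p_both: float):
    """Expected numbers found by source 1, by source 2, and by both."""
    return total * p1, total * p2, total * p_both


# Hypothetical population of 1000 true cases; source 1 finds 60% and
# source 2 finds 50%. Only the probability of being in both sources varies.
SCENARIOS = {
    "independent": 0.30,          # P(both) = 0.6 * 0.5
    "positive dependence": 0.40,  # sources tend to find the same cases
    "negative dependence": 0.20,  # sources tend to find different cases
}

if __name__ == "__main__":
    for label, p_both in SCENARIOS.items():
        n1, n2, both = expected_counts(1000, 0.6, 0.5, p_both)
        estimate = two_source_estimate(n1, n2, both)
        print(f"{label}: estimated total {estimate:.0f} (true total 1000)")
```

For a true total of 1000 the estimates are 1000, 750 and 1500. The three observed counts alone cannot distinguish these scenarios, because the estimator returns the same value for any population that produces the same counts.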

Second, there is the problem of cases with an almost

zero probability of being captured by any source. One

Correspondence to: Leo J Schouten, Department of Medical Infor-

matics and Epidemiology, University of Nijmegen, P.O. Box 9101,

NL-6500 HB Nijmegen, The Netherlands.

may argue that this is just another form of negative

dependence but it results in underestimates of the

total number of cases instead of overestimates. For

example, in a cancer registry using hospital discharge

records and pathology reports as notification sources,

all cases with chronic lymphocytic leukaemia (CLL)

may be missed if the diagnosis (based on a blood

sample) is made by the haematologist and patients are

not hospitalized. Using the records of radiotherapy

departments as a third notification source will not

identify the number of missed cases with CLL. In

general, no method can estimate the true rate in the

population if certain cases are systematically missed by

all sources.

Third, although capture-recapture methods may be well

established, the specific features of their methodology

(especially in the case of multiple sources) are not well

known yet. Only recently, Hook and Regal discussed

the phenomenon that capture-recapture estimates can

vary with 'variable catchability' in population

subgroups, e.g. by race, even if the different sources

are independent in each subgroup.2
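A small numerical sketch shows how variable catchability can distort a pooled estimate even when the sources are independent within each subgroup. The subgroup sizes and capture probabilities are hypothetical, and the simple two-source estimator is again an assumption made for the illustration, not a method taken from the papers cited.

```python
def two_source_estimate(n1: float, n2: float, both: float) -> float:
    """Two-source (Lincoln-Petersen) estimate of the total number of cases."""
    return n1 * n2 / both


def expected_counts(size: float, p1: float, p2: float):
    """Expected counts when the two sources are independent in a subgroup."""
    return size * p1, size * p2, size * p1 * p2


# Hypothetical subgroups: (true size, capture probability in source 1,
# capture probability in source 2). Subgroup B is much harder to catch.
SUBGROUPS = {"A": (500, 0.8, 0.8), "B": (500, 0.2, 0.2)}


def stratified_estimate(subgroups) -> float:
    """Estimate each subgroup separately, then add the estimates."""
    return sum(two_source_estimate(*expected_counts(*g)) for g in subgroups.values())


def pooled_estimate(subgroups) -> float:
    """Add the counts over subgroups first, then apply the estimator once."""
    counts = [expected_counts(*g) for g in subgroups.values()]
    n1, n2, both = (sum(column) for column in zip(*counts))
    return two_source_estimate(n1, n2, both)


if __name__ == "__main__":
    print(f"stratified: {stratified_estimate(SUBGROUPS):.0f}")
    print(f"pooled:     {pooled_estimate(SUBGROUPS):.0f}")
```

In this example the stratified estimate recovers the true total of 1000, whereas the pooled estimate is about 735, because pooling over subgroups with different catchability induces an apparent positive dependence between the sources.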

For these reasons, we strongly disagree with the

authors' advice that rates should only be reported after

formal adjustment for underascertainment. Moreover,

if only ascertainment corrected rates are reported,

readers will lose access to the numbers on which these

adjusted rates were based. They may also lose impor-

tant information about the specific characteristics (or

perhaps: quality) of the registry that yielded those

numbers. We must remember that capture-recapture