International Journal of Epidemiology
© International Epidemiological Association 1994
Vol. 23, No. 1
Printed in Great Britain
Letters to the Editor
Odds Ratio or Relative Risk for Cross-Sectional Data?
From JAMES LEE
Sir—The cross-sectional study is widely used in many
areas of research. Although this study design is appro-
priate mainly for descriptive investigations, it is used
also in some aetiological enquiries.1,2 Two effect
measures—the prevalence rate ratio (PRR) and
prevalence odds ratio (POR)—can be ascertained from
cross-sectional data with a dichotomous outcome
variable (presence or absence of a condition).
A cursory look at the epidemiology journals will
attest that the POR is much more frequently reported
than is the PRR. This practice is apparently attributable
to the routine use of logistic regression for the analysis
of cross-sectional data. Logistic regression is a
valuable statistical tool in that it allows statistical ad-
justment of several confounders as well as assessment
of effect modification based on modest study size. The
problem is that it gives POR as an effect measure but
the PRR appears to be a more meaningful statistic for
First, the odds ratio is incomprehensible.1,2 As emphasized
by Savitz,3 an epidemiological measure must
not only convey the most germane information, but it
must also be easy to communicate and to comprehend.
As such, the odds ratio has no direct usefulness except
as a numerical mimic to other effect measures such as
the relative risk (rate ratio) or incidence density ratio.
In contrast, the PRR is easy to interpret. If the PRR
were 5, then at any given point in time the 'exposed'
subjects in the population are 5 times as likely to
have the condition in question as are the 'unexposed'
subjects. If the condition is of low prevalence, then
POR would be numerically similar to PRR so it would
not matter which effect measure was used. Because the
cross-sectional study is not appropriate for a rare
exposure or condition, the POR will generally be
markedly discrepant from PRR.
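The divergence between the two measures can be checked by hand from a 2x2 table. The sketch below is an illustration only: the function name `prevalence_ratios` and all counts are hypothetical and are not taken from the study discussed in this letter.

```python
def prevalence_ratios(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Return (PRR, POR) for a 2x2 cross-sectional table."""
    p1 = cases_exposed / n_exposed        # prevalence among the exposed
    p0 = cases_unexposed / n_unexposed    # prevalence among the unexposed
    prr = p1 / p0                         # prevalence rate ratio
    por = (p1 / (1 - p1)) / (p0 / (1 - p0))  # prevalence odds ratio
    return prr, por


# Hypothetical common condition: 40% versus 10% prevalence.
common = prevalence_ratios(40, 100, 10, 100)
# Hypothetical rare condition: 0.4% versus 0.1% prevalence.
rare = prevalence_ratios(4, 1000, 1, 1000)

print("common: PRR=%.2f POR=%.2f" % common)
print("rare:   PRR=%.2f POR=%.2f" % rare)
```

With these made-up counts the common condition gives a PRR of 4 against a POR of 6, whereas for the rare condition both measures are close to 4.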
Division of Biostatistics and Health Informatics, Department of Community,
Occupational and Family Medicine, National University of
Singapore, NUH, Lower Kent Ridge, Singapore 0511.
The odds ratio is the effect measure in a case-control
study only because the rate ratio cannot be determined.
Fortunately the case-control study is most
suitable for diseases of low incidence, in which case
the odds ratio numerically resembles the rate ratio.
This is one of Cornfield's4 great contributions to
epidemiology and it made the case-control study im-
mensely popular. It has also been shown that the case-
control odds ratio is a direct estimate of the incidence
density ratio without imposing the 'rare disease'
assumption.5,6 Thus the splendour of the case-control
odds ratio is simply that it need not be interpreted in
terms of the odds ratio.
Greenland7 has demonstrated persuasively that as an
effect measure, the odds ratio is more defective for
cohort studies than is generally realized. The appro-
priate effect measure for the closed cohort is the
cumulative incidence ratio and that for the dynamic
cohort is the incidence density ratio. What this means
is that logistic regression, which is sometimes
employed for the analysis of closed and dynamic
cohort data, is not appropriate. Another serious pitfall
of logistic regression is that it does not consider the
time interval between exposure and disease occurrence
in the dynamic cohort.
The choice of effect measure for the cross-sectional
study (POR versus PRR) appears to be more equivocal
and, as might be expected, textbooks are not explicit and may
even be contradictory. Thus, Checkoway8 prefers
POR whereas Elwood9 seems to favour PRR. Klein-
baum et al.6 noted that for a cross-sectional study in
which the disease has a protracted risk period (long
and ill-defined interval between exposure and disease
occurrence), the logical effect measure for aetiological
inference is the incidence density ratio (IDR). (If such
a condition were studied longitudinally, the study
design of choice would be the dynamic cohort, which
gives IDR). These authors6 also showed that the cross-
sectional POR is a better numerical approximation of
the IDR than is the cross-sectional PRR (the PRR
tends to underestimate IDR). However, the apparent
advantage of POR over PRR has little practical use
since a disease with a protracted risk period, especially
if the aetiological agent is changeable over time,
should not be investigated by a cross-sectional
design.2,10 Indeed, the cross-sectional study should
only be used for diseases with short and well-defined
risk periods, in which case the most logical effect
measure is the PRR.6 (If such a condition were studied
longitudinally, the closed cohort design would likely be
employed, which gives the cumulative incidence ratio,
or incidence relative risk). Also, the cross-sectional
study is used mostly for non-aetiological research such
as health care planning and resource allocation, in
which case the prevalence rate is more germane than
the incidence rate and consequently there will be no
reason to use the POR. This means that logistic regres-
sion, which is often employed for the analysis of cross-
sectional data, is inappropriate.
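The numerical relation among IDR, POR and PRR noted above for a protracted risk period can be illustrated under a simplifying assumption that this letter does not itself spell out: a steady-state population in which prevalence odds equal incidence density multiplied by mean disease duration, with the same duration in exposed and unexposed subjects. The rates, the duration and the function names below are hypothetical.

```python
def steady_state_prevalence(incidence_density, mean_duration):
    """Prevalence implied by: prevalence odds = incidence density * duration.

    This steady-state relation is an assumption made for the sketch.
    """
    prevalence_odds = incidence_density * mean_duration
    return prevalence_odds / (1.0 + prevalence_odds)


def odds(p):
    """Convert a proportion to odds."""
    return p / (1.0 - p)


# Hypothetical inputs: rates per person-year, duration in years.
rate_unexposed, rate_exposed, duration = 0.02, 0.10, 5.0

idr = rate_exposed / rate_unexposed
p0 = steady_state_prevalence(rate_unexposed, duration)
p1 = steady_state_prevalence(rate_exposed, duration)
por = odds(p1) / odds(p0)
prr = p1 / p0

print("IDR=%.2f POR=%.2f PRR=%.2f" % (idr, por, prr))
```

Under these assumptions the POR reproduces the IDR of 5, while the PRR is about 3.67 and so understates it. If disease duration differed between the exposure groups, the POR would no longer equal the IDR.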
Therefore, what is needed for cross-sectional data is
a statistical model that estimates PRR rather than
POR yet preserves the virtues of logistic regression,
viz., statistical adjustment of several confounders and
assessment of effect modification based on modest
study size. Cox's proportional hazards model was
originally developed for the estimation of the condi-
tional hazard ratio and 'survival functions' based on
complete or censored longitudinal data with varying
follow-up times, viz., the 'person-time at risk' dynamic
cohort.11,12 (The dynamic cohort also gives the
incidence density ratio which can be estimated by
Poisson regression.) Subsequently, Breslow13 showed
that by assuming a constant risk period, namely the
'persons at risk' closed cohort, the conditional hazard
ratio estimated by Cox's model is equal to the
cumulative incidence ratio. Thus, by assuming a constant
risk period, the Cox model can be adapted to estimate
PRR for cross-sectional data.
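The adaptation described above can be checked numerically for the simplest case of one dichotomous exposure and no other covariates. The sketch below is not the author's SAS program. It assumes the Breslow approximation for tied event times and gives every subject the same follow-up time, so that the log partial likelihood reduces to a*beta - (a + c)*log(n1*exp(beta) + n0), where a and c are the numbers with the condition among n1 exposed and n0 unexposed subjects. The function name and the counts are hypothetical.

```python
import math


def cox_ratio_constant_time(a, n1, c, n0, tol=1e-12, max_iter=50):
    """Maximise the Breslow partial likelihood for one binary covariate
    when all n1 + n0 subjects share a single follow-up time.

    a, c: subjects with the condition among the exposed / unexposed.
    Returns exp(beta), the estimated hazard ratio.
    """
    d = a + c                      # every "event" falls at the one time point
    beta = 0.0
    for _ in range(max_iter):
        w = n1 * math.exp(beta)
        p = w / (w + n0)           # exposed share of the risk-set weight
        score = a - d * p          # first derivative of the log partial likelihood
        info = d * p * (1.0 - p)   # minus the second derivative
        step = score / info
        beta += step               # Newton-Raphson update
        if abs(step) < tol:
            break
    return math.exp(beta)


a, n1, c, n0 = 40, 100, 10, 100    # hypothetical counts
hazard_ratio = cox_ratio_constant_time(a, n1, c, n0)
prr = (a / n1) / (c / n0)
por = (a / (n1 - a)) / (c / (n0 - c))

print("Cox estimate=%.4f PRR=%.4f POR=%.4f" % (hazard_ratio, prr, por))
```

For these counts the fitted ratio agrees with the crude PRR of 4 and not with the POR of 6. Other ways of handling tied times may give different estimates, so the choice of the Breslow approximation matters here.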
To illustrate the application of the Cox model for
the estimation of PRR with adjustment of confound-
ing, we consider a cross-sectional study to assess
whether the mother's correct knowledge (yes versus
no) about the developmental screening programme in
the maternal and child health clinic (dichotomous
response variable) is related to her educational
background (primary predictor variable). Potential
confounders include ethnicity, whether or not the
mother attended health education talks and whether
the mother is working outside the home or is a
housewife. Further details of the study are given
elsewhere.14 The observed results are given in Table 1.
It is clear that the PRRs estimated by Cox's model
(Table 2) are highly discrepant from the PORs
estimated by the logistic model (Table 3). All statistical
analyses were carried out with SAS.15
(The programs and related information document-
ing the analysis are available from the author. Please
send a 3.5 inch diskette for storage.)
TABLE 1 Mother's knowledge of developmental screening according
to her educational attainment
Know correctly? Primary Secondary Tertiary
TABLE 2 Crude and adjusted(a) prevalence rate ratio (relative
likelihood) of correct knowledge of developmental screening according
to educational attainment: proportional hazards model
Educational attainment   Rate ratio (95% CI)   Rate ratio (95% CI)
Tertiary(b)   5.59 (2.51-12.49)
(a) Adjusted for ethnicity (Malay relative to Chinese), whether or not
the mother attended health education talks, and whether the mother is
working outside home or a housewife.
(b) Relative to primary.
TABLE 3 Crude and adjusted(a) prevalence odds ratio (relative odds)
of correct knowledge of developmental screening according to
educational attainment: logistic model
Educational attainment   Odds ratio (95% CI)   Odds ratio (95% CI)
(a) Adjusted for ethnicity (Malay relative to Chinese), whether or not
the mother attended health education talks, and whether the mother is
working outside home or a housewife.
(b) Relative to primary.
I wish to express my thanks to Dr M M Thein for the
use of her data.14
1 Miettinen O S. Theoretical Epidemiology. New York: John Wiley,
2 Rothman K J. Modern Epidemiology. Boston: Little, Brown,
3 Savitz D A. Measurements, estimates and inferences in reporting
epidemiologic study results. Am J Epidemiol 1992; 135:
4 Cornfield J. A statistical property arising from retrospective
studies. Proceedings of the 3rd Berkeley Symposium on
Mathematical and Statistical Problems, 1956; 4: 135-48.
5 Miettinen O S. Estimability and estimation in case-referent studies.
Am J Epidemiol 1976; 103: 226-35.
6 Kleinbaum D G, Kupper L L, Morgenstern H. Epidemiologic
Research: Principles and Quantitative Methods.
CA: Lifetime Learning Publications, 1982.
7 Greenland S. Interpretation and choice of effect measures in
epidemiologic analysis. Am J Epidemiol 1987; 125: 761-68.
8 Checkoway H, Pearce N, Crawford-Brown D J. Research Methods
in Occupational Epidemiology. New York: Oxford University
9 Elwood J M. Causal Relationships in Medicine. New York: Oxford
University Press, 1988.
10 Flanders W D, Lin L, Pirkle J L, Caudill S P. Assessing the direction
of causality in cross-sectional studies. Am J Epidemiol
1992; 135: 926-35.
11 Cox D R. Regression models and life-tables (with discussion).
J R Stat Soc B 1972; 34: 187-220.
12 Lee J, Yoshizawa C, Wilkens L, Lee H P. Covariance adjustment
of survival curves based on Cox's proportional hazard regression
model. Comput Appl Biosci (UK) 1992; 8: 23-27.
13 Breslow N E. Covariance analysis of censored survival data.
Biometrics 1974; 30: 89-99.
14 Thein M M, Lee J, Yoong T. Knowledge about developmental
screening in mothers attending a maternal and child health
clinic in Singapore. Ann Acad Med (Singapore) 1992; 21:
15 SAS/STAT Software: The PHREG Procedure. SAS Technical
Report P-217. Cary, NC: SAS Institute, 1991.
Ascertainment Corrected Rates
From LAMBERTUS A L M KIEMENEY, LEO J SCHOUTEN AND HUUB STRAATMAN
Sir—In their well-written paper McCarty and col-
leagues argued that 'all rates be reported only after
formal evaluation and adjustments for underascertain-
ment have been completed'.1 Although they state that
the exact methodology to employ such evaluation and
adjustment is not critical, a strong case is made for the
use of capture-recapture methods. We agree with McCarty
et al. that capture-recapture methods can be a
very useful tool in the evaluation of completeness of
registries. In our opinion, however, capture-recapture
analyses and the interpretation of their results are less
straightforward than the authors suggest.
The first difficulty is the assumption of in-
dependence. The authors recognize that (without extra
information) the dependency between sources is not
identifiable in the two-sources situation. When they
explain that dependencies can be controlled if at least
three sources are available, e.g. by means of log-linear
modelling, they fail to recognize that the three-way in-
teraction term in this modelling approach will remain
unknown. Comparable to the two-sources situation it
is not possible to estimate the quantitative relevance of
this three-way dependency between the sources
without extra information.
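The effect of dependence in the two-sources situation can be made concrete with a small calculation. The sketch below assumes the usual two-source (Lincoln-Petersen) estimator, which this letter does not write out, and uses hypothetical capture probabilities applied to a population whose true size is fixed in advance. The function names are illustrative only.

```python
def two_source_estimate(n1, n2, both):
    """Two-source capture-recapture estimate of the total number of cases.

    n1, n2: cases found by source 1 and source 2; both: cases found by both.
    The estimate is valid only if the two sources are independent.
    """
    return n1 * n2 / both


def expected_counts(total, p1, p2_if_in_1, p2_if_not_in_1):
    """Expected (n1, n2, both) when capture by source 2 may depend on source 1."""
    n1 = total * p1
    both = n1 * p2_if_in_1
    n2 = both + (total - n1) * p2_if_not_in_1
    return n1, n2, both


true_total = 1000  # hypothetical true number of cases

independent = two_source_estimate(*expected_counts(true_total, 0.5, 0.5, 0.5))
positive = two_source_estimate(*expected_counts(true_total, 0.5, 0.8, 0.2))
negative = two_source_estimate(*expected_counts(true_total, 0.5, 0.2, 0.8))

print("independent=%.0f positive=%.0f negative=%.0f"
      % (independent, positive, negative))
```

In all three scenarios each source finds 500 cases, yet the estimate is 1000 under independence, 625 under positive dependence and 2500 under negative dependence. The observed two-source table cannot by itself show which scenario applies.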
Correspondence to: Leo J Schouten, Department of Medical Informatics
and Epidemiology, University of Nijmegen, P.O. Box 9101,
NL-6500 HB Nijmegen, The Netherlands.
Second, there is the problem of cases with an almost
zero probability of being captured by any source. One
may argue that this is just another form of negative
dependence but it results in underestimates of the
total number of cases instead of overestimates. For
example, in a cancer registry using hospital discharge
records and pathology reports as notification sources,
all cases with chronic lymphocytic leukaemia (CLL)
may be missed if the diagnosis (based on a blood
sample) is made by the haematologist and patients are
not hospitalized. Using the records of radiotherapy
departments as a third notification source will not
identify the number of missed cases with CLL. In
general, no method can estimate the true rate in the
population if certain cases are systematically missed by
Third, although capture-recapture methods may be well
established, the specific features of their methodology
(especially in the case of multiple sources) are not yet well
known. Only recently, Hook and Regal discussed
the phenomenon that capture-recapture estimates can
vary with 'variable catchability' in population
subgroups, e.g. by race, even if the different sources
are independent in each subgroup.2
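The 'variable catchability' phenomenon can be reproduced with expected counts. The sketch below is not Hook and Regal's own analysis. It assumes two hypothetical subgroups of equal size in which the two sources are independent within each subgroup but capture probabilities differ between subgroups, and it again uses the simple two-source estimator.

```python
def two_source_estimate(n1, n2, both):
    """Two-source capture-recapture estimate of the total number of cases."""
    return n1 * n2 / both


def expected_table(size, p1, p2):
    """Expected (n1, n2, both) for independent sources within one subgroup."""
    return size * p1, size * p2, size * p1 * p2


# Hypothetical subgroups: (true size, capture prob. source 1, source 2).
subgroups = {"A": (1000, 0.8, 0.8), "B": (1000, 0.2, 0.2)}
true_total = sum(spec[0] for spec in subgroups.values())

tables = {name: expected_table(*spec) for name, spec in subgroups.items()}

# Estimate within each subgroup, then add the estimates.
stratified = sum(two_source_estimate(*t) for t in tables.values())

# Pool the counts first, then estimate once.
pooled = two_source_estimate(
    sum(t[0] for t in tables.values()),
    sum(t[1] for t in tables.values()),
    sum(t[2] for t in tables.values()),
)

print("true=%d stratified=%.0f pooled=%.0f" % (true_total, stratified, pooled))
```

With these made-up probabilities the stratified estimate recovers the true total of 2000, whereas the pooled estimate is about 1471, although the sources are independent within each subgroup.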
For these reasons, we strongly disagree with the
authors' advice that rates should only be reported after
formal adjustment for underascertainment. Moreover,
if only ascertainment corrected rates are reported,
readers will lose access to the numbers on which these
adjusted rates were based. They may also lose impor-
tant information about the specific characteristics (or
perhaps: quality) of the registry that yielded those
numbers. We must remember that capture-recapture