Odds ratio or relative risk for cross-sectional data?
ABSTRACT: There have been large-scale outbreaks of hand, foot and mouth disease (HFMD) in Mainland China over the last decade. These events varied greatly across the country. It is necessary to identify the spatial risk factors and spatial distribution patterns of HFMD for public health control and prevention. Climate risk factors associated with HFMD occurrence have been recognized. However, few studies have discussed the socio-economic determinants of HFMD risk at a spatial scale. HFMD records in Mainland China in May 2008 were collected. Both climate and socio-economic factors were selected as potential risk exposures of HFMD. Odds ratio (OR) was used to identify the spatial risk factors. A spatial autologistic regression model was employed to get OR values for each exposure and model the spatial distribution patterns of HFMD risk. Results showed that both climate and socio-economic variables were spatial risk factors for HFMD transmission in Mainland China. The statistically significant risk factors are monthly average precipitation (OR = 1.4354), monthly average temperature (OR = 1.379), monthly average wind speed (OR = 1.186), the number of industrial enterprises above designated size (OR = 17.699), the population density (OR = 1.953), and the proportion of student population (OR = 1.286). The spatial autologistic regression model has good goodness of fit (ROC = 0.817) and prediction accuracy (Correct ratio = 78.45%) of HFMD occurrence. The autologistic regression model also reduces the contribution of the residual term in the ordinary logistic regression model significantly, from 17.25 to 1.25 for the odds ratio. Based on the prediction results of the spatial model, we obtained a map of the probability of HFMD occurrence that shows the spatial distribution pattern and local epidemic risk over Mainland China. The autologistic regression model was used to identify spatial risk factors and model spatial risk patterns of HFMD.
HFMD occurrences were found to be spatially heterogeneous over Mainland China, which is related to both the climate and socio-economic variables. The combination of socio-economic and climate exposures can explain the HFMD occurrences more comprehensively and objectively than those with only climate exposures. The modeled probability of HFMD occurrence at the county level reveals not only the spatial trends, but also the local details of epidemic risk, even in the regions where there were no HFMD case records.
BMC Public Health 04/2014; 14(1):358. · 2.08 Impact Factor
ABSTRACT: The quality and quantity of social relationships are associated with depression but there is less evidence regarding which aspects of social relationship are most predictive. We evaluated the relative magnitude and independence of the association of four social relationship domains with major depressive disorder and depressive symptoms. We analyzed a cross-sectional telephone interview and postal survey of a probability sample of adults living in Switzerland (N = 12,286). Twelve-month major depressive disorder was assessed via structured interview over the telephone using the Composite International Diagnostic Interview (CIDI). The postal survey assessed depressive symptoms as well as variables representing emotional support, tangible support, social integration, and loneliness. Each individual social relationship domain was associated with both outcome measures, but in multivariate models being lonely and perceiving unmet emotional support had the largest and most consistent associations across depression outcomes (incidence rate ratios ranging from 1.55-9.97 for loneliness and from 1.23-1.40 for unmet support, p's < 0.05). All social relationship domains except marital status were independently associated with depressive symptoms whereas only loneliness and unmet support were associated with depressive disorder. Perceived quality and frequency of social relationships are associated with clinical depression and depressive symptoms across a wide adult age spectrum. This study extends prior work linking loneliness to depression by showing that a broad range of social relationship domains are associated with psychological well-being.
BMC Public Health 03/2014; 14(1):273. · 2.08 Impact Factor
ABSTRACT: Despite randomization, selection bias may occur in cluster randomized trials. Classical multivariable regression usually allows for adjusting treatment effect estimates with unbalanced covariates. However, for binary outcomes with low incidence, such a method may fail because of separation problems. This simulation study focused on the performance of propensity score (PS)-based methods to estimate relative risks from cluster randomized trials with binary outcomes with low incidence. The results suggested that among the different approaches used (multivariable regression, direct adjustment on PS, inverse weighting on PS, and stratification on PS), only direct adjustment on the PS fully corrected the bias and moreover had the best statistical properties. Copyright © 2014 John Wiley & Sons, Ltd.
Statistics in Medicine 04/2014; · 2.04 Impact Factor
International Journal of Epidemiology
© International Epidemiological Association 1994
Vol. 23, No. 1
Printed in Great Britain
Letters to the Editor
Odds Ratio or Relative Risk for Cross-Sectional Data?
From JAMES LEE
Sir—The cross-sectional study is widely used in many
areas of research. Although this study design is appro-
priate mainly for descriptive investigations, it is used
also in some aetiological enquiries.1,2 Two effect
measures—the prevalence rate ratio (PRR) and
prevalence odds ratio (POR)—can be ascertained from
cross-sectional data with a dichotomous outcome
variable (presence or absence of a condition).
A cursory look at the epidemiology journals will
attest that the POR is much more frequently reported
than is the PRR. This practice is apparently attributable
to the routine use of logistic regression for the analysis
of cross-sectional data. Logistic regression is a
valuable statistical tool in that it allows statistical ad-
justment of several confounders as well as assessment
of effect modification based on modest study size. The
problem is that it gives POR as an effect measure but
the PRR appears to be a more meaningful statistic for
First, the odds ratio is incomprehensible.1,2 As emphasized
by Savitz,3 an epidemiological measure must
not only convey the most germane information, but it
must also be easy to communicate and to comprehend.
As such, the odds ratio has no direct usefulness except
as a numerical mimic to other effect measures such as
the relative risk (rate ratio) or incidence density ratio.
In contrast, the PRR is easy to interpret. If the PRR
were 5, then at any given point in time the 'exposed'
subjects in the population are 5 times as likely to
have the condition in question as are the 'unexposed'
subjects. If the condition is of low prevalence, then
POR would be numerically similar to PRR so it would
not matter which effect measure was used. Because the
cross-sectional study is not appropriate for a rare
exposure or condition, the POR will generally be
markedly discrepant from PRR.
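The numerical point above can be checked with a minimal sketch. The counts below are hypothetical (they are not taken from the letter) and were chosen only so that the PRR equals 5 in both scenarios; the function name `prr_and_por` is likewise an illustrative assumption.

```python
def prr_and_por(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return (PRR, POR) for a 2x2 cross-sectional table."""
    p1 = exposed_cases / exposed_total        # prevalence among the exposed
    p0 = unexposed_cases / unexposed_total    # prevalence among the unexposed
    prr = p1 / p0                             # prevalence rate ratio
    por = (p1 / (1 - p1)) / (p0 / (1 - p0))   # prevalence odds ratio
    return prr, por


# Hypothetical counts: same PRR, very different prevalence levels.
rare_prr, rare_por = prr_and_por(10, 1000, 2, 1000)          # prevalences 1% and 0.2%
common_prr, common_por = prr_and_por(500, 1000, 100, 1000)   # prevalences 50% and 10%

print(f"rare condition:   PRR = {rare_prr:.2f}, POR = {rare_por:.2f}")
print(f"common condition: PRR = {common_prr:.2f}, POR = {common_por:.2f}")
```

With the rare condition the POR stays close to the PRR of 5 (about 5.04), whereas with the common condition the same PRR of 5 corresponds to a POR of 9.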
The odds ratio is the effect measure in a case-control
study only because the rate ratio cannot be determined.
Fortunately the case-control study is most
suitable for diseases of low incidence, in which case
the odds ratio numerically resembles the rate ratio.
This is one of Cornfield's4 great contributions to
epidemiology and it made the case-control study immensely
popular. It has also been shown that the case-control
odds ratio is a direct estimate of the incidence
density ratio without imposing the 'rare disease'
assumption.5,6 Thus the splendour of the case-control
odds ratio is simply that it need not be interpreted in
terms of the odds ratio.

Division of Biostatistics and Health Informatics, Department of Community,
Occupational and Family Medicine, National University of
Singapore, NUH, Lower Kent Ridge, Singapore 0511.

Greenland7 has demonstrated persuasively that as an
effect measure, the odds ratio is more defective for
cohort studies than is generally realized. The appro-
priate effect measure for the closed cohort is the
cumulative incidence ratio and that for the dynamic
cohort is the incidence density ratio. What this means
is that logistic regression, which is sometimes
employed for the analysis of closed and dynamic
cohort data, is not appropriate. Another serious pitfall
of logistic regression is that it does not consider the
time interval between exposure and disease occurrence
in the dynamic cohort.
The choice of effect measure for the cross-sectional
study (POR versus PRR) appears to be more equivocal
and, expectedly, textbooks are not explicit and may
even be contradictory. Thus, Checkoway8 prefers
POR whereas Elwood9 seems to favour PRR. Klein-
baum et al.6 noted that for a cross-sectional study in
which the disease has a protracted risk period (long
and ill-defined interval between exposure and disease
occurrence), the logical effect measure for aetiological
inference is the incidence density ratio (IDR). (If such
a condition were studied longitudinally, the study
design of choice would be the dynamic cohort, which
gives IDR). These authors6 also showed that the cross-
sectional POR is a better numerical approximation of
the IDR than is the cross-sectional PRR (the PRR
tends to underestimate IDR). However, the apparent
advantage of POR over PRR has little practical use
since a disease with a protracted risk period, especially
if the aetiological agent is changeable over time,
should not be investigated by a cross-sectional
design.2,10 Indeed, the cross-sectional study should
only be used for diseases with short and well-defined
risk periods, in which case the most logical effect
measure is the PRR.6 (If such a condition were studied
longitudinally, the closed cohort design would likely be
employed, which gives the cumulative incidence ratio,
or incidence relative risk). Also, the cross-sectional
study is used mostly for non-aetiological research such
as health care planning and resource allocation, in
which case the prevalence rate is more germane than
the incidence rate and consequently there will be no
reason to use the POR. This means that logistic regres-
sion, which is often employed for the analysis of cross-
sectional data, is inappropriate.
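The numerical relation among POR, PRR and IDR mentioned above can be sketched as follows. This is an illustration under assumptions the letter does not spell out: a stationary (steady-state) population, with $P_i$ the prevalence, $ID_i$ the incidence density and $\bar{D}_i$ the mean duration of the condition in the exposed ($i = 1$) and unexposed ($i = 0$) groups.

```latex
\frac{P_i}{1 - P_i} = ID_i \, \bar{D}_i , \qquad i = 0, 1

\mathrm{POR} = \frac{P_1 / (1 - P_1)}{P_0 / (1 - P_0)}
             = \frac{ID_1}{ID_0} \cdot \frac{\bar{D}_1}{\bar{D}_0}
             = \mathrm{IDR} \cdot \frac{\bar{D}_1}{\bar{D}_0}

\mathrm{PRR} = \frac{P_1}{P_0} = \mathrm{POR} \cdot \frac{1 - P_1}{1 - P_0}
```

If exposure does not affect duration ($\bar{D}_1 = \bar{D}_0$), the POR equals the IDR. When the IDR exceeds 1, $P_1 > P_0$, so the factor $(1 - P_1)/(1 - P_0)$ is below 1 and the PRR falls short of the IDR, which is consistent with the underestimation noted above.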
Therefore, what is needed for cross-sectional data is
a statistical model that estimates PRR rather than
POR yet preserves the virtues of logistic regression,
viz., statistical adjustment of several confounders and
assessment of effect modification based on modest
study size. Cox's proportional hazards model was
originally developed for the estimation of the condi-
tional hazard ratio and 'survival functions' based on
complete or censored longitudinal data with varying
follow-up times, viz., the 'person-time at risk' dynamic
cohort.11,12 (The dynamic cohort also gives the
incidence density ratio which can be estimated by
Poisson regression.) Subsequently, Breslow13 showed
that by assuming constant risk period, namely 'the per-
sons at risk' closed cohort, the conditional hazard
ratio estimated by Cox's model is equal to the
cumulative incidence ratio. Thus by assuming constant
risk period, the Cox model can be adapted to estimate
PRR for cross-sectional data.
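The adaptation described above can be illustrated with a minimal sketch. It is not the author's SAS program: it is a pure-Python Newton-Raphson fit of the Cox partial likelihood with Breslow's handling of ties, for a single binary covariate, assuming every subject is assigned the same follow-up time so that all subjects share one risk set. The function name `breslow_log_ratio` and the counts are hypothetical.

```python
import math


def breslow_log_ratio(exposure, outcome, iterations=25):
    """Maximise the Breslow partial likelihood when all subjects share one risk set.

    With a constant follow-up time the log partial likelihood is
        sum over cases of (beta * x_i)  -  d * log(sum over all subjects of exp(beta * x_j)),
    where d is the number of cases.
    """
    d = sum(outcome)                                             # number of cases
    cases_x = sum(x for x, y in zip(exposure, outcome) if y == 1)
    beta = 0.0
    for _ in range(iterations):
        w = [math.exp(beta * x) for x in exposure]
        s0 = sum(w)
        s1 = sum(wi * x for wi, x in zip(w, exposure))
        s2 = sum(wi * x * x for wi, x in zip(w, exposure))
        score = cases_x - d * s1 / s0                            # first derivative
        information = d * (s2 / s0 - (s1 / s0) ** 2)             # minus second derivative
        beta += score / information                              # Newton-Raphson step
    return beta


# Hypothetical data: 100 exposed (40 with the condition), 100 unexposed (10 with it).
exposure = [1] * 100 + [0] * 100
outcome = [1] * 40 + [0] * 60 + [1] * 10 + [0] * 90

beta = breslow_log_ratio(exposure, outcome)
direct_prr = (40 / 100) / (10 / 100)
print(f"exp(beta) from the constant-time Cox fit: {math.exp(beta):.4f}")
print(f"PRR computed directly from the counts:    {direct_prr:.4f}")
```

In this sketch the exponentiated coefficient agrees with the directly computed PRR of 4, whereas the POR for the same counts would be (40/60)/(10/90) = 6. Only the point estimate is shown; confidence intervals and covariate adjustment are outside its scope.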
To illustrate the application of the Cox model for
the estimation of PRR with adjustment of confound-
ing, we consider a cross-sectional study to assess
whether the mother's correct knowledge (yes versus
no) about the developmental screening programme in
the maternal and child health clinic (dichotomous
response variable) is related to her educational
background (primary predictor variable). Potential
confounders include ethnicity, whether or not the
mother attended health education talks and whether
the mother is working outside the home or is a
housewife. Further details of the study are given
elsewhere.14 The observed results are given in Table 1.
It is clear that the PRRs estimated by Cox's model
(Table 2) are highly discrepant from the PORs
estimated by the logistic model (Table 3). All statistical
analyses were carried out in SAS.15
(The programs and related information document-
ing the analysis are available from the author. Please
send a 3.5 inch diskette for storage.)
TABLE 1 Mother's knowledge of developmental screening according
to her educational attainment
Know correctly? Primary Secondary Tertiary
TABLE 2 Crude and adjusted(a) prevalence rate ratio (relative
likelihood) of correct knowledge of development screening according
to educational attainment: proportional hazards model
Educational attainment  Rate ratio (95% CI)  Rate ratio (95% CI)
Tertiary(b)  5.59 (2.51-12.49)
(a) Adjusted for ethnicity (Malay relative to Chinese), whether or not
the mother attended health education talks, and whether the mother is
working outside home or a housewife.
(b) Relative to primary.
TABLE 3 Crude and adjusted(a) prevalence odds ratio (relative odds)
of correct knowledge of development screening according to educational
attainment: logistic model
Educational attainment  Odds ratio (95% CI)  Odds ratio (95% CI)
(a) Adjusted for ethnicity (Malay relative to Chinese), whether or not
the mother attended health education talks, and whether the mother is
working outside home or a housewife.
(b) Relative to primary.
I wish to express my thanks to Dr M M Thein for the
use of her data.14
1 Miettinen O S. Theoretical Epidemiology. New York: John Wiley,
2 Rothman K J. Modern Epidemiology. Boston: Little, Brown,
3 Savitz D A. Measurements, estimates and inferences in reporting
epidemiologic study results. Am J Epidemiol 1992; 135:
4 Cornfield J. A statistical property arising from retrospective
studies. Proceedings of the 3rd Berkeley Symposium on
Mathematical and Statistical Problems, 1956; 4: 135-48.
5 Miettinen O S. Estimability and estimation in case-referent studies.
Am J Epidemiol 1976; 103: 226-35.
6 Kleinbaum D G, Kupper L L, Morgenstern H. Epidemiologic
Research: Principles and Quantitative Methods.
CA: Lifetime Learning Publications, 1982.
7 Greenland S. Interpretation and choice of effect measures in
epidemiologic analysis. Am J Epidemiol 1987; 125: 761-68.
8 Checkoway H, Pearce N, Crawford-Brown D J. Research Methods
in Occupational Epidemiology. New York: Oxford University
9 Elwood J M. Causal Relationships in Medicine. New York: Oxford
University Press, 1988.
10 Flanders W D, Lin L, Pirkle J L, Caudill S P. Assessing the direction
of causality in cross-sectional studies. Am J Epidemiol
1992; 135: 926-35.
11 Cox D R. Regression models and life-tables (with discussion).
J R Stat Soc B 1972; 34: 187-220.
12 Lee J, Yoshizawa C, Wilkens L, Lee H P. Covariance adjustment
of survival curves based on Cox's proportional hazard regression
model. Comput Appl Biosci (UK) 1992; 8: 23-27.
13 Breslow N E. Covariance analysis of censored survival data.
Biometrics 1974; 30: 89-99.
14 Thein M M, Lee J, Yoong T. Knowledge about developmental
screening in mothers attending a maternal and child health
clinic in Singapore. Ann Acad Med (Singapore) 1992; 21:
15 SAS/STAT Software: The PHREG Procedure. SAS Technical
Report P-217. Cary, NC: SAS Institute, 1991.
Ascertainment Corrected Rates
From LAMBERTUS A L M KIEMENEY, LEO J SCHOUTEN AND HUUB STRAATMAN
Sir—In their well-written paper McCarty and col-
leagues argued that 'all rates be reported only after
formal evaluation and adjustments for underascertain-
ment have been completed'.1 Although they state that
the exact methodology to employ such evaluation and
adjustment is not critical, a strong case is made for the
use of capture-recapture methods. We agree with McCarty
et al. that capture-recapture methods can be a
very useful tool in the evaluation of completeness of
registries. In our opinion, however, capture-recapture
analyses and the interpretation of their results are less
straightforward than the authors suggest.
The first difficulty is the assumption of in-
dependence. The authors recognize that (without extra
information) the dependency between sources is not
identifiable in the two-sources situation. When they
explain that dependencies can be controlled if at least
three sources are available, e.g. by means of log-linear
modelling, they fail to recognize that the three-way in-
teraction term in this modelling approach will remain
unknown. Comparable to the two-sources situation it
is not possible to estimate the quantitative relevance of
this three-way dependency between the sources
without extra information.
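The role of the independence assumption in the two-source situation can be illustrated with a minimal sketch. The letter does not name an estimator; the standard two-source (Lincoln-Petersen) estimate is used here as an assumption, and the function name `two_source_estimate` and all counts are hypothetical.

```python
def two_source_estimate(n_a, n_b, n_both):
    """Two-source capture-recapture estimate, assuming the sources are independent.

    n_a, n_b : cases notified by source A and by source B
    n_both   : cases notified by both sources
    Returns (estimated total cases, estimated completeness of the combined registry).
    """
    if n_both == 0:
        raise ValueError("no overlap between the sources: the estimate is undefined")
    n_hat = n_a * n_b / n_both                 # estimated total number of cases
    observed = n_a + n_b - n_both              # distinct cases actually registered
    return n_hat, observed / n_hat


# Hypothetical registries with the same source sizes but different overlap.
n_hat_1, completeness_1 = two_source_estimate(80, 60, 40)
n_hat_2, completeness_2 = two_source_estimate(80, 60, 48)

print(f"overlap 40: estimated total = {n_hat_1:.0f}, completeness = {completeness_1:.1%}")
print(f"overlap 48: estimated total = {n_hat_2:.0f}, completeness = {completeness_2:.1%}")
```

The estimate is driven entirely by the size of the overlap. From two sources alone there is no way to tell whether a large overlap reflects a nearly complete registry or a positive dependency between the sources, which is the identifiability problem described above.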
Correspondence to: Leo J Schouten, Department of Medical Informatics
and Epidemiology, University of Nijmegen, P.O. Box 9101,
NL-6500 HB Nijmegen, The Netherlands.
Second, there is the problem of cases with an almost
zero probability of being captured by any source. One
may argue that this is just another form of negative
dependence but it results in underestimates of the
total number of cases instead of overestimates. For
example, in a cancer registry using hospital discharge
records and pathology reports as notification sources,
all cases with chronic lymphocytic leukaemia (CLL)
may be missed if the diagnosis (based on a blood
sample) is made by the haematologist and patients are
not hospitalized. Using the records of radiotherapy
departments as a third notification source will not
identify the number of missed cases with CLL. In
general, no method can estimate the true rate in the
population if certain cases are systematically missed by
Third, although capture-recapture methods may be well
established, the specific features of their methodology
(especially in the case of multiple sources) are not yet
well known. Only recently, Hook and Regal discussed
the phenomenon that capture-recapture estimates can
vary with 'variable catchability' in population
subgroups, e.g. by race, even if the different sources
are independent in each subgroup.2
For these reasons, we strongly disagree with the
authors' advice that rates should only be reported after
formal adjustment for underascertainment. Moreover,
if only ascertainment corrected rates are reported,
readers will lose access to the numbers on which these
adjusted rates were based. They may also lose impor-
tant information about the specific characteristics (or
perhaps: quality) of the registry that yielded those
numbers. We must remember that capture-recapture