Odds ratio or relative risk for cross-sectional data?
- ABSTRACT: Background: Syphilis prevalence continues to be high among at-risk populations such as men who have sex with men (MSM). In low and middle-income countries, syphilis remains a neglected epidemic with a lack of effective prevention strategies. Methods: PICASSO is a clinic-based study of MSM in Lima, Peru that includes behavioral surveys and syphilis testing with rapid plasma reagin (RPR) titers (BD Macro-Vue RPR, Becton-Dickinson, NJ) and Treponema pallidum Particle Agglutination (Serodia TP-PA, Fujirebio Inc, Japan). Participants with recent syphilis infection (RPR titer ≥1:16) were compared to participants with non-reactive titers. Participants with RPR titers 1:1-1:8 were excluded. Age in years was analyzed as a continuous variable. HIV perceived risk was self-reported on a 4-point scale. Factors associated with recent syphilis were explored using Poisson regression to compute risk ratios (RR). Results: The frequency of recent syphilis infection was 26/171 (15.2%). More individuals with recent syphilis infection were HIV-infected; 6/26 (23.0%) compared to 18/91 (19.8%) participants with nonreactive RPR titers (RR 1.16, p = 0.711). Recent syphilis infection was associated with younger age (RR 0.95, p = 0.016) and higher self-perceived risk of HIV (RR 1.33, p = 0.06). Insertive anal sex was associated with a lower relative risk of recent syphilis infection (RR 0.25, p = 0.05) compared to receptive or versatile. Other sexual behaviors, substance use and alcohol use were not associated with recent syphilis infection. Conclusions: Recent syphilis infection was common in the clinic-based sample of high-risk MSM. Frequent syphilis and HIV co-infection suggest an integrated strategy is necessary for prevention and treatment efforts. Our findings suggest that patterns of syphilis transmission are only partially explained by current measures of behavioral risk.
National STD Prevention Conference 2014, Centers for Disease Control and Prevention; 06/2014
- ABSTRACT: The objective for this paper was to present and discuss the use of odds ratios and prevalence ratios using real data with a complex sampling design. We carried out a cross-sectional study using data obtained from a two-stage stratified cluster sample from a study conducted in 2001-2002 (n = 1,958). Odds ratios and prevalence ratios were obtained by unconditional logistic regression and Poisson regression, respectively, for later comparison using the Stata statistical package (v. 7.0). Confidence intervals and design effects were considered in the evaluation of the precision of estimates. Two outcomes of a cross-sectional study with different prevalences were evaluated: vaccination against influenza (66.1%) and self-referred lung disease (6.9%). In the high-prevalence scenario, using prevalence ratios the estimates were more conservative and we found narrower confidence intervals. In the low-prevalence scenario, we found no important numeric differences between the estimates and standard errors obtained using the two techniques. A design effect greater than one indicates that the sample design has increased the variance of the estimate. However, it is the researcher's task to choose which technique and measure to use for each data set, since this choice must remain within the scope of epidemiology.
Revista Brasileira de Epidemiologia 09/2008; 11(3):347-355.
- ABSTRACT: Purpose: The aim of the study was to determine the relationship between orofacial pain (OFP) in the community and other symptoms. Materials and Methods: This cross-sectional population-based study was conducted in a general medical practice in South East Cheshire, UK. Questionnaires were mailed to a random sample of 4,000 adults aged 18-65 years, of whom 2,504 responded (adjusted participation rate 74%). Results: The current study showed an association between self-reported OFP and all the other symptoms measured. The strongest association was found for a high level of sleep disturbance (relative risk (RR) 3.7; 95% Confidence Interval (CI) 2.9-4.9), tenderness of jaw muscles in the morning (RR 3.7; 95% CI 3.3-4.1), persons with frequent headaches (RR 3.1; 95% CI 2.7-3.5), and tiredness or stiffness of jaw muscles (RR 2.6; 95% CI 2.3-3.0). Having pain in the body other than the head was associated with a relative risk of OFP of 1.6 (95% CI 1.4-1.9), and increased risk persisted when individual body locations were considered (back, abdominal, forearm, shoulder and knee pain). Those who took medication for bowels had a higher risk of OFP (RR 1.4; 95% CI 1.1-1.8). Problems with micturition were associated with an elevated risk of 1.5 (95% CI 1.0-2.0). None of these results changed significantly after adjustment for age and gender. Conclusions: This cross-sectional community-based study contributes additional information on the relationship between other symptoms and OFP. It suggests that future research should adopt a multidisciplinary approach to OFP; however, further longitudinal studies are required to establish the association between other symptoms and the onset of OFP.
International Journal of Epidemiology
© International Epidemiological Association 1994
Vol. 23, No. 1
Printed in Great Britain
Letters to the Editor
Odds Ratio or Relative Risk for Cross-Sectional Data?
From JAMES LEE
Sir—The cross-sectional study is widely used in many
areas of research. Although this study design is appro-
priate mainly for descriptive investigations, it is used
also in some aetiological enquiries.1,2 Two effect
measures—the prevalence rate ratio (PRR) and
prevalence odds ratio (POR)—can be ascertained from
cross-sectional data with a dichotomous outcome
variable (presence or absence of a condition).
A cursory look at the epidemiology journals will
attest that the POR is much more frequently reported
than is the PRR. This practice is apparently attributable
to the routine use of logistic regression for the analysis
of cross-sectional data. Logistic regression is a
valuable statistical tool in that it allows statistical ad-
justment of several confounders as well as assessment
of effect modification based on modest study size. The
problem is that it gives POR as an effect measure but
the PRR appears to be a more meaningful statistic for
the following reasons.
First, the odds ratio is incomprehensible.1,2 As emphasized
by Savitz,3 an epidemiological measure must
not only convey the most germane information, but it
must also be easy to communicate and to comprehend.
As such, the odds ratio has no direct usefulness except
as a numerical mimic of other effect measures such as
the relative risk (rate ratio) or incidence density ratio.
In contrast, the PRR is easy to interpret. If the PRR
were 5, then at any given point in time the 'exposed'
subjects in the population are 5 times as likely to
have the condition in question as are the 'unexposed'
subjects. If the condition is of low prevalence, then
POR would be numerically similar to PRR so it would
not matter which effect measure was used. Because the
cross-sectional study is not appropriate for a rare
exposure or condition, the POR will generally be
markedly discrepant from PRR.
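The numerical point can be checked with a small sketch. The two 2x2 tables below are hypothetical counts chosen only for illustration (they are not from the letter); both have a PRR of 5, one with a common condition and one with a rare condition.

```python
def prevalence_ratio(a, b, c, d):
    """PRR from a 2x2 table.

    a: exposed with the condition      b: exposed without it
    c: unexposed with the condition    d: unexposed without it
    """
    return (a / (a + b)) / (c / (c + d))


def prevalence_odds_ratio(a, b, c, d):
    """POR (cross-product ratio) from the same 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical counts: prevalence 50% vs 10% (common condition)
common = (50, 50, 10, 90)
# Hypothetical counts: prevalence 0.5% vs 0.1% (rare condition)
rare = (5, 995, 1, 999)

for label, table in (("common", common), ("rare", rare)):
    prr = prevalence_ratio(*table)
    por = prevalence_odds_ratio(*table)
    print(f"{label}: PRR = {prr:.2f}, POR = {por:.2f}")
```

With these assumed counts the POR is 9 for the common condition but about 5.02 for the rare one, which is the discrepancy the paragraph above describes.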
The odds ratio is the effect measure in a case-control
study only because the rate ratio cannot be determined.
Fortunately the case-control study is most suitable for
diseases of low incidence, in which case the odds ratio
numerically resembles the rate ratio. This is one of
Cornfield's4 great contributions to epidemiology and it
made the case-control study immensely popular. It has
also been shown that the case-control odds ratio is a
direct estimate of the incidence density ratio without
imposing the 'rare disease' assumption.5,6 Thus the
splendour of the case-control odds ratio is simply that
it need not be interpreted in terms of the odds ratio.
Division of Biostatistics and Health Informatics, Department of
Community, Occupational and Family Medicine, National University of
Singapore, NUH, Lower Kent Ridge, Singapore 0511.
Greenland7 has demonstrated persuasively that as an
effect measure, the odds ratio is more defective for
cohort studies than is generally realized. The appro-
priate effect measure for the closed cohort is the
cumulative incidence ratio and that for the dynamic
cohort is the incidence density ratio. What this means
is that logistic regression, which is sometimes
employed for the analysis of closed and dynamic
cohort data, is not appropriate. Another serious pitfall
of logistic regression is that it does not consider the
time interval between exposure and disease occurrence
in the dynamic cohort.
The choice of effect measure for the cross-sectional
study (POR versus PRR) appears to be more equivocal
and, expectedly, textbooks are not explicit and may
even be contradictory. Thus, Checkoway8 prefers
POR whereas Elwood9 seems to favour PRR. Klein-
baum et al.6 noted that for a cross-sectional study in
which the disease has a protracted risk period (long
and ill-defined interval between exposure and disease
occurrence), the logical effect measure for aetiological
inference is the incidence density ratio (IDR). (If such
a condition were studied longitudinally, the study
design of choice would be the dynamic cohort, which
gives IDR). These authors6 also showed that the cross-
sectional POR is a better numerical approximation of
the IDR than is the cross-sectional PRR (the PRR
tends to underestimate IDR). However, the apparent
advantage of POR over PRR has little practical use
since a disease with a protracted risk period, especially
if the aetiological agent is changeable over time,
should not be investigated by a cross-sectional
design.2,10 Indeed, the cross-sectional study should
only be used for diseases with short and well-defined
risk periods, in which case the most logical effect
measure is the PRR.6 (If such a condition were studied
longitudinally, the closed cohort design would likely be
employed, which gives the cumulative incidence ratio,
or incidence relative risk). Also, the cross-sectional
study is used mostly for non-aetiological research such
as health care planning and resource allocation, in
which case the prevalence rate is more germane than
the incidence rate and consequently there will be no
reason to use the POR. This means that logistic regres-
sion, which is often employed for the analysis of cross-
sectional data, is inappropriate.
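The remark that the POR approximates the IDR better than the PRR does can be illustrated with a rough sketch. It relies on the standard steady-state relation (prevalence odds = incidence density x mean duration), which assumes a stationary population and the same mean disease duration in exposed and unexposed subjects. These assumptions, the function names and the numbers are illustrative and are not taken from the letter.

```python
def steady_state_prevalence(incidence_density, mean_duration):
    """Prevalence implied by P / (1 - P) = ID * D in a stationary population."""
    x = incidence_density * mean_duration
    return x / (1.0 + x)


def ratio_measures(id_unexposed, idr, mean_duration):
    """Return (PRR, POR) implied by a given IDR under the assumptions above."""
    p0 = steady_state_prevalence(id_unexposed, mean_duration)
    p1 = steady_state_prevalence(id_unexposed * idr, mean_duration)
    prr = p1 / p0
    por = (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))
    return prr, por


# Hypothetical inputs: 0.02 cases per person-year among the unexposed,
# an IDR of 3, and a mean disease duration of 10 years in both groups.
prr, por = ratio_measures(id_unexposed=0.02, idr=3.0, mean_duration=10.0)
print(f"IDR = 3.00, PRR = {prr:.2f}, POR = {por:.2f}")
```

Under these assumptions the POR reproduces the IDR of 3, while the PRR (2.25) falls below it.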
Therefore, what is needed for cross-sectional data is
a statistical model that estimates PRR rather than
POR yet preserves the virtues of logistic regression,
viz., statistical adjustment of several confounders and
assessment of effect modification based on modest
study size. Cox's proportional hazards model was
originally developed for the estimation of the condi-
tional hazard ratio and 'survival functions' based on
complete or censored longitudinal data with varying
follow-up times, viz., the 'person-time at risk' dynamic
cohort.11,12 (The dynamic cohort also gives the
incidence density ratio which can be estimated by
Poisson regression.) Subsequently, Breslow13 showed
that by assuming constant risk period, namely 'the per-
sons at risk' closed cohort, the conditional hazard
ratio estimated by Cox's model is equal to the
cumulative incidence ratio. Thus by assuming constant
risk period, the Cox model can be adapted to estimate
PRR for cross-sectional data.
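A minimal sketch of why the adaptation works, for the simplest case of one binary exposure and no confounders. It assumes every subject is given the same follow-up time, so that all events are tied, and that ties are handled with a Breslow-type partial likelihood; this choice of tie handling is an assumption of the sketch, not something stated above. The log partial likelihood is then beta*d1 - d*log(n0 + n1*exp(beta)), and its maximum is at exp(beta) = (d1/n1)/(d0/n0), the crude PRR. The counts are hypothetical.

```python
import math


def cox_constant_time_ratio(d1, d0, n1, n0, tol=1e-12, max_iter=100):
    """exp(beta) maximising the Breslow partial likelihood when all
    subjects share one follow-up time.

    d1, d0: subjects with the condition among exposed / unexposed
    n1, n0: total exposed / unexposed subjects
    """
    d = d1 + d0
    beta = 0.0
    for _ in range(max_iter):
        w = n1 * math.exp(beta)
        p = w / (n0 + w)              # weighted share of exposed in the risk set
        score = d1 - d * p            # first derivative of the log likelihood
        info = d * p * (1.0 - p)      # minus the second derivative
        step = score / info           # Newton-Raphson update
        beta += step
        if abs(step) < tol:
            break
    return math.exp(beta)


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed have the condition.
estimate = cox_constant_time_ratio(d1=30, d0=10, n1=100, n0=100)
crude_prr = (30 / 100) / (10 / 100)
crude_por = (30 * 90) / (70 * 10)
print(f"Cox estimate = {estimate:.3f}, PRR = {crude_prr:.3f}, POR = {crude_por:.3f}")
```

In this unadjusted case the estimate coincides with the crude PRR of 3 rather than the POR of about 3.86. Adjusted analyses would need a full Cox regression routine, which this sketch does not attempt.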
To illustrate the application of the Cox model for
the estimation of PRR with adjustment of confound-
ing, we consider a cross-sectional study to assess
whether the mother's correct knowledge (yes versus
no) about the developmental screening programme in
the maternal and child health clinic (dichotomous
response variable) is related to her educational
background (primary predictor variable). Potential
confounders include ethnicity, whether or not the
mother attended health education talks and whether
the mother is working outside the home or is a
housewife. Further details of the study are given
elsewhere.14 The observed results are given in Table 1.
It is clear that the PRRs estimated by Cox's model
(Table 2) are highly discrepant from the POR
estimated by the logistic model (Table 3). All statistical
analyses were carried out with SAS.15
(The programs and related information document-
ing the analysis are available from the author. Please
send a 3.5 inch diskette for storage.)
TABLE 1 Mother's knowledge of developmental screening according
to her educational attainment
Know correctly? Primary Secondary Tertiary
TABLE 2 Crude and adjusted^a prevalence rate ratio (relative
likelihood) of correct knowledge of development screening according
to educational attainment: proportional hazards model
Educational attainment   Rate ratio (95% CI)   Rate ratio (95% CI)
Tertiary^b               5.59 (2.51-12.49)
^a Adjusted for ethnicity (Malay relative to Chinese), whether or not
the mother attended health education talks, and whether the mother is
working outside home or a housewife.
^b Relative to primary.
TABLE 3 Crude and adjusted^a prevalence odds ratio (relative odds)
of correct knowledge of development screening according to
educational attainment: logistic model
Educational attainment   Odds ratio (95% CI)   Odds ratio (95% CI)
^a Adjusted for ethnicity (Malay relative to Chinese), whether or not
the mother attended health education talks, and whether the mother is
working outside home or a housewife.
^b Relative to primary.
I wish to express my thanks to Dr M M Thein for the
use of her data.14
1 Miettinen O S. Theoretical Epidemiology. New York: John Wiley,
2 Rothman K J. Modern Epidemiology. Boston: Little, Brown,
3 Savitz D A. Measurements, estimates and inferences in reporting
epidemiologic study results. Am J Epidemiol 1992; 135:
4 Cornfield J. A statistical property arising from retrospective
studies. Proceedings of the 3rd Berkeley Symposium on
Mathematical and Statistical Problems, 1956; 4: 135-48.
5 Miettinen O S. Estimability and estimation in case-referent studies.
Am J Epidemiol 1976; 103: 226-35.
6 Kleinbaum D G, Kupper L L, Morgenstern H. Epidemiologic
Research: Principles and Quantitative Methods.
CA: Lifetime Learning Publications, 1982.
7 Greenland S. Interpretation and choice of effect measures in
epidemiologic analysis. Am J Epidemiol 1987; 125: 761-68.
8 Checkoway H, Pearce N, Crawford-Brown D J. Research Methods
in Occupational Epidemiology. New York: Oxford University
9 Elwood J M. Causal Relationships in Medicine. New York: Oxford
University Press, 1988.
10 Flanders W D, Lin L, Pirkle J L, Caudill S P. Assessing the
direction of causality in cross-sectional studies. Am J Epidemiol
1992; 135: 926-35.
11 Cox D R. Regression models and life-tables (with discussion).
J R Stat Soc B 1972; 34: 187-220.
12 Lee J, Yoshizawa C, Wilkens L, Lee H P. Covariance adjustment
of survival curves based on Cox's proportional hazard
regression model. Comput Appl Biosci (UK) 1992; 8: 23-27.
13 Breslow N E. Covariance analysis of censored survival data.
Biometrics 1974; 30: 89-99.
14 Thein M M, Lee J, Yoong T. Knowledge about developmental
screening in mothers attending a maternal and child health
clinic in Singapore. Ann Acad Med (Singapore) 1992; 21:
15 SAS/STAT Software: The PHREG Procedure. SAS Technical
Report P-217. Cary, NC: SAS Institute, 1991.
Ascertainment Corrected Rates
From LAMBERTUS A L M KIEMENEY, LEO J SCHOUTEN AND HUUB STRAATMAN
Sir—In their well-written paper McCarty and col-
leagues argued that 'all rates be reported only after
formal evaluation and adjustments for underascertain-
ment have been completed'.1 Although they state that
the exact methodology to employ such evaluation and
adjustment is not critical, a strong case is made for the
use of capture-recapture methods. We agree with
McCarty et al. that capture-recapture methods can be a
very useful tool in the evaluation of completeness of
registries. In our opinion, however, capture-recapture
analyses and the interpretation of their results are less
straightforward than the authors suggest.
The first difficulty is the assumption of in-
dependence. The authors recognize that (without extra
information) the dependency between sources is not
identifiable in the two-sources situation. When they
explain that dependencies can be controlled if at least
three sources are available, e.g. by means of log-linear
modelling, they fail to recognize that the three-way in-
teraction term in this modelling approach will remain
unknown. Comparable to the two-sources situation it
is not possible to estimate the quantitative relevance of
this three-way dependency between the sources
without extra information.
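The two-source case can be made concrete with a short sketch. It uses the usual two-source estimator N = n1 * n2 / m (often called the Lincoln-Petersen estimator, a name not used in the letter) and works with expected counts rather than sampled data. The population size and capture probabilities are hypothetical.

```python
def two_source_estimate(n1, n2, m):
    """Two-source capture-recapture estimate of the total number of cases.

    n1, n2: cases found by source 1 and by source 2
    m:      cases found by both sources
    """
    if m <= 0:
        raise ValueError("the estimate is undefined without overlap")
    return n1 * n2 / m


def expected_counts(total, p1, p2_if_in_1, p2_if_not_in_1):
    """Expected n1, n2 and m when capture by source 2 may depend on source 1."""
    n1 = total * p1
    m = n1 * p2_if_in_1
    n2 = m + (total - n1) * p2_if_not_in_1
    return n1, n2, m


TRUE_TOTAL = 1000  # hypothetical true number of cases
scenarios = {
    "independent": (0.5, 0.5, 0.5),
    "positive dependence": (0.5, 0.7, 0.3),
    "negative dependence": (0.5, 0.3, 0.7),
}
estimates = {}
for name, (p1, p2_in, p2_out) in scenarios.items():
    n1, n2, m = expected_counts(TRUE_TOTAL, p1, p2_in, p2_out)
    estimates[name] = two_source_estimate(n1, n2, m)
    print(f"{name}: n1 = {n1:.0f}, n2 = {n2:.0f}, estimate = {estimates[name]:.0f}")
```

All three scenarios give the same n1 and n2; only the overlap differs, and the two-source data alone cannot show which scenario holds. The estimate is about 714 under positive dependence and about 1667 under negative dependence, against a true total of 1000.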
Second, there is the problem of cases with an almost
zero probability of being captured by any source. One
may argue that this is just another form of negative
dependence but it results in underestimates of the
total number of cases instead of overestimates. For
example, in a cancer registry using hospital discharge
records and pathology reports as notification sources,
all cases with chronic lymphocytic leukaemia (CLL)
may be missed if the diagnosis (based on a blood
sample) is made by the haematologist and patients are
not hospitalized. Using the records of radiotherapy
departments as a third notification source will not
identify the number of missed cases with CLL. In
general, no method can estimate the true rate in the
population if certain cases are systematically missed by
all sources.
Correspondence to: Leo J Schouten, Department of Medical
Informatics and Epidemiology, University of Nijmegen, P.O. Box 9101,
NL-6500 HB Nijmegen, The Netherlands.
Third, although capture-recapture methods may be well
established, the specific features of their methodology
(especially in the case of multiple sources) are not yet
well known. Only recently, Hook and Regal discussed
the phenomenon that capture-recapture estimates can
vary with 'variable catchability' in population
subgroups, e.g. by race, even if the different sources
are independent in each subgroup.2
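A small sketch of this phenomenon, again using the usual two-source estimator n1 * n2 / m on expected counts. The two subgroups, their sizes and their capture probabilities are hypothetical; within each subgroup the two sources are independent.

```python
def two_source_estimate(n1, n2, m):
    """Two-source capture-recapture estimate of the total number of cases."""
    return n1 * n2 / m


# Hypothetical subgroups: (true size, capture probability by source 1, by source 2)
subgroups = [
    (100, 0.8, 0.8),  # easily captured subgroup
    (100, 0.2, 0.2),  # poorly captured subgroup
]

stratified_total = 0.0
pooled_n1 = pooled_n2 = pooled_m = 0.0
for size, p1, p2 in subgroups:
    n1 = size * p1
    n2 = size * p2
    m = size * p1 * p2  # independence within the subgroup
    stratified_total += two_source_estimate(n1, n2, m)
    pooled_n1 += n1
    pooled_n2 += n2
    pooled_m += m

pooled_total = two_source_estimate(pooled_n1, pooled_n2, pooled_m)
print(f"sum of subgroup estimates = {stratified_total:.1f}")
print(f"estimate ignoring subgroups = {pooled_total:.1f}")
```

The subgroup estimates add up to the true total of 200, whereas the estimate that ignores the subgroups is about 147, even though the sources are independent in each subgroup.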
For these reasons, we strongly disagree with the
authors' advice that rates should only be reported after
formal adjustment for underascertainment. Moreover,
if only ascertainment corrected rates are reported,
readers will lose access to the numbers on which these
adjusted rates were based. They may also lose impor-
tant information about the specific characteristics (or
perhaps: quality) of the registry that yielded those
numbers. We must remember that capture-recapture