Article
Invited Commentary: Simple Models for a Complicated Reality
Boston University, Boston, Massachusetts, United States
American Journal of Epidemiology (Impact Factor: 5.23). 09/2006; 164(4):312-4; discussion 315-6. DOI: 10.1093/aje/kwj238 Source: PubMed
Invited Commentary: Simple Models for a Complicated Reality
Enrique F. Schisterman (1) and Sonia Hernández-Díaz (2,3)
(1) Epidemiology Branch, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD.
(2) Department of Epidemiology, Harvard School of Public Health, Boston, MA.
(3) Slone Epidemiology Center, Boston University, Boston, MA.
Received for publication February 27, 2006; accepted for publication March 6, 2006.
The renowned statistician George E. P. Box famously said
that all models are wrong, but some are useful. Far from an
indictment of statistical models, Box's statement can be
taken to mean that much can be learned even when simple
fitted models do not exactly represent complex realities.
The paper by Basso et al. (1) in this issue of the
Journal provides an opportunity to consider the costs and
benefits that arise from the simplification necessary for
generating statistical models of complex biologic processes.
Considering the relation among birth weight, mortality,
and third factors, Basso et al. postulate that birth weight is
not itself on the causal path to mortality; rather, the relation
between birth weight and mortality might be explained by
a confounding factor. The authors conclude that, to produce
the observed inverse-J shape of the birth weight-specific
mortality curve, the putative confounding factors (matrix
$X = (X_1, X_2)$) must be very rare and have very large
effects.
In reducing complex situations to simple models, assumptions
are made, especially when modeling a biologic process.
For example, parametric models make assumptions regarding
distributions. The flexibility of a model is limited by
that of the assumptions on which it depends. Basso et al.'s
model makes the following assumptions: 1) birth weight
follows a Gaussian distribution, 2) there is a uniform effect
of the confounding factors $X_1$ and $X_2$ on mortality, 3) birth
weight does not cause neonatal mortality, and 4) there is no
interaction between factors $X_1$ and $X_2$ and birth weight.
Notably, some of these assumptions are interrelated, and a
change in one might affect the others.
As previously stated, consideration of the potential limitations
of this model requires further attention to the assumptions
on which it is based. Let us review each of these
assumptions regarding their substance, the possible impact
on the findings if they are violated, and whether they seem
reasonable.
ASSUMPTIONS
Gaussian distribution for birth weight
This assumption requires that birth weight follow a Normal
distribution at all strata of the confounding factor. The
hypothetical birth weights in the left tail of this distribution
may be regarded with some skepticism because of their
questionable compatibility with viability. If the left tail is
indeed truncated, the Gaussian birth weight assumption will
be violated. Moreover, a confounding factor might increase
the proportion of low-birth-weight babies without shifting
the whole birth weight distribution, resulting in a skewed
distribution within that stratum. However, given the relatively
low prevalence of fetal growth-restricted babies, major
deviations from a Gaussian distribution are unusual in
real life. Additionally, a small violation of this assumption
will have little impact on the shape of the overall association
between birth weight and neonatal mortality.
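The size of such a departure can be gauged with a small calculation. The following is a minimal sketch, not part of Basso et al.'s model: it assumes that, within one stratum, a fraction of babies is growth restricted and follows a Gaussian distribution shifted downward, while the remainder follows the unshifted Gaussian. The function name, the mean, the shift, the standard deviation, and the fractions are hypothetical values chosen only for illustration.

```python
import math


def mixture_skewness(fraction, mean=3500.0, shift=-1000.0, sd=500.0):
    """Skewness of (1 - fraction) * N(mean, sd) + fraction * N(mean + shift, sd).

    All default values (grams) are hypothetical, for illustration only.
    """
    weights = [1.0 - fraction, fraction]
    means = [mean, mean + shift]
    overall_mean = sum(w * m for w, m in zip(weights, means))
    deviations = [m - overall_mean for m in means]
    # Central moments of a Gaussian component taken about the overall mean:
    # second moment is sd^2 + d^2, third moment is 3 * sd^2 * d + d^3.
    variance = sum(w * (sd ** 2 + d ** 2) for w, d in zip(weights, deviations))
    third_moment = sum(
        w * (3.0 * sd ** 2 * d + d ** 3) for w, d in zip(weights, deviations)
    )
    return third_moment / variance ** 1.5


if __name__ == "__main__":
    for fraction in (0.0, 0.01, 0.05):
        print(fraction, round(mixture_skewness(fraction), 3))
```

Under these assumed values, the skewness is zero when no babies are shifted and grows in magnitude (with a longer left tail) as the shifted fraction increases, which is the sense in which a low prevalence of growth restriction keeps the stratum close to Gaussian.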
Uniform effect of risk factors
This assumption states that all babies exposed to the putative
confounding factor X have identical shifts in birth weight and
identical elevations of their mortality risk. In their paper,
Basso et al. acknowledge that this
assumption is unlikely to be true. However, modifying
this assumption seems to entail only minor changes in the
noncausal link between birth weight and neonatal mortality.
The authors refined the model by substituting distributions
for the constant effects and obtained similar results.
Correspondence to Dr. Enrique F. Schisterman, Division of Epidemiology, Statistics and Prevention Research, NICHD, NIH, 6100 Executive
Boulevard, 7B03, Rockville, MD 20852 (email: schistee@mail.nih.gov).
312 Am J Epidemiol 2006;164:312–314
American Journal of Epidemiology
Copyright © 2006 by the Johns Hopkins Bloomberg School of Public Health
All rights reserved; printed in U.S.A.
Vol. 164, No. 4
DOI: 10.1093/aje/kwj238
Advance Access publication July 17, 2006
Noncausal effect of birth weight on neonatal mortality
This assumption states that neonatal mortality is independent
of birth weight conditional on the confounding factor
that links them or, stated more simply, that birth weight in
no way has any effect on neonatal mortality. In probability
terms, this assumption implies conditional independence
and can be rewritten as
\[
\Pr(NM = 0 \mid BW = bw, X = x) = \Pr(NM = 0 \mid X = x) = f(x), \tag{1}
\]
where $NM = 0, 1$ indicates neonatal mortality status, BW is
birth weight, X is the unmeasured confounder, and $f(x)$ is
a function that depends only on a value x of factor X. Departure
from this assumption could lead to different conclusions.
As soon as the model allows for a causal effect of
birth weight on neonatal mortality, the risk factors ($X_1$ and
$X_2$, following the notation of Basso et al.) no longer need to
be rare or have large effects.
Considering equation 1, what are the conditions and characteristics
of the putative confounding factor, X, that would
give rise to the observed shape of the curve between birth
weight and neonatal mortality? Such a factor must satisfy
the following equality, which expresses the joint probability
of neonatal mortality status and birth weight, in such a way
that inverse J-shaped curves result:
\[
\Pr(NM = 0, BW = bw) = \int f(x)\,\Pr(BW = bw \mid X = x)\,F_X(dx). \tag{2}
\]
It is worth noting that, from such conditions, the strength
of the effect of the factor X on birth weight plays a crucial
role in determining the shape of the crude association between
birth weight and neonatal mortality under conditional
independence assumptions. Given these circumstances,
Basso et al. show that the confounding factor X cannot be
smoking, or any other factor with similar properties, because
it neither is sufficiently rare nor conveys sufficient
risk to satisfy the conditions set forth by equation 2.
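A discrete version of equation 2 makes the role of rarity and effect size concrete. The following is a minimal sketch under stated assumptions, not Basso et al.'s actual simulation: X takes three mutually exclusive values (unexposed, a rare factor that lowers birth weight, a rare factor that raises it), birth weight is Gaussian within each stratum, and mortality depends on X only. The names `STRATA` and `crude_mortality` and every numeric value are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a Gaussian distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical strata of X:
# (prevalence, mean birth weight in g, sd in g, Pr(NM = 1 | X = x))
STRATA = [
    (0.990, 3500.0, 500.0, 0.001),  # unexposed
    (0.005, 2500.0, 500.0, 0.300),  # rare factor, lowers birth weight
    (0.005, 4500.0, 500.0, 0.050),  # rare factor, raises birth weight
]


def crude_mortality(bw, strata=STRATA):
    """Pr(NM = 1 | BW = bw) when mortality depends on X only (equation 1)."""
    joint_survival = 0.0  # discrete analogue of the right side of equation 2
    density = 0.0         # marginal density of birth weight at bw
    for prevalence, mean, sd, mortality in strata:
        weight = prevalence * normal_pdf(bw, mean, sd)
        joint_survival += (1.0 - mortality) * weight  # f(x) * Pr(BW = bw | x) * Pr(x)
        density += weight
    return 1.0 - joint_survival / density


if __name__ == "__main__":
    for bw in range(1500, 6000, 500):
        print(bw, round(crude_mortality(bw), 4))
```

With these assumed values, the crude curve is high at low birth weights, lowest near the middle of the distribution, and rises again at high birth weights, even though birth weight has no effect on mortality within any stratum. The shape arises only because the exposed strata are rare and carry much higher risk than the unexposed stratum.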
However, if one allows for a departure from the conditional
independence assumption, the added flexibility allows
for a very large range of circumstances and attendant conclusions.
When a causal effect between birth weight and
neonatal mortality is assumed to be possible, the unmeasured
confounder X is not constrained to be rare or have
large effects, and the following applies:
\[
f(bw, x) = \Pr(NM = 0 \mid BW = bw, X = x) \neq \Pr(NM = 0 \mid X = x). \tag{3}
\]
As a result, the equality in equation 2 becomes
\[
\Pr(NM = 0, BW = bw) = \int f(bw, x)\,\Pr(BW = bw \mid X = x)\,F_X(dx). \tag{4}
\]
Adding birth weight to function f creates a large number of
alternative routes to solving the equality represented by
equation 4. Specifically, nonlinear relations and/or interactions
could lead to the observed shape of the birth weight–
neonatal mortality curve. In fact, under the relaxed
assumption, infinitely many models are possible, including one
in which the J-shaped pattern is completely explained by the
effect of birth weight on mortality.
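The last case can be written out as a short worked step. This is a sketch in the notation of equations 3 and 4; the function $g$ is introduced here only for illustration and does not appear in the original argument. Dividing the joint probability by the marginal density of birth weight gives the birth weight-specific survival, and if $f(bw, x) = g(bw)$ for every $x$, the confounder drops out:

```latex
\[
\Pr(NM = 0 \mid BW = bw)
  = \frac{\int f(bw, x)\,\Pr(BW = bw \mid X = x)\,F_X(dx)}
         {\int \Pr(BW = bw \mid X = x)\,F_X(dx)}
  = g(bw)\,
    \frac{\int \Pr(BW = bw \mid X = x)\,F_X(dx)}
         {\int \Pr(BW = bw \mid X = x)\,F_X(dx)}
  = g(bw).
\]
```

In that special case, any observed curve can be reproduced by choosing $g$ accordingly, whatever the prevalence or strength of X.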
FIGURE 1. Neonatal mortality (log scale) per 10,000 births per year as a function of the interaction between birth weight and smoking. Left: birth
weight density by smoking status; right: neonatal mortality and the interaction between birth weight and smoking.
Lack of interaction between factors $X_1$ or $X_2$ and birth weight
In reaching their conclusions, Basso et al. assume no interactive
effect between risk factors and fetal growth on
neonatal mortality. However, in seeking explanations for
the observed shape of the birth weight curves, there are
alternatives that merit consideration as well. We simulated
an interactive effect between a common factor and fetal
growth. To do so, we had to relax another model assumption
by allowing a causal effect of fetal growth on neonatal mortality.
We were able to generate curves that were similar to
the empirically observed data, as shown in figure 1. This
provides an alternative to the proposed rare exposure with
extreme effects on mortality.
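The kind of model described here can be sketched as follows. This is not the simulation behind figure 1, whose specification is not given in the text; it is a hypothetical illustration in which a common binary factor (prevalence 25 percent, standing in for something like smoking) shifts mean birth weight slightly, birth weight has a direct nonlinear effect on mortality, and the factor interacts with birth weight on the logit scale. All names and coefficients are assumptions.

```python
import math

PREVALENCE = 0.25                   # hypothetical prevalence of the common factor
BW_MEAN = {0: 3500.0, 1: 3300.0}    # hypothetical mean birth weight (g) by exposure
BW_SD = 500.0                       # hypothetical standard deviation (g)


def normal_pdf(x, mean, sd):
    """Density of a Gaussian distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mortality(bw, exposed):
    """Pr(NM = 1 | BW = bw, X = exposed), i.e. 1 - f(bw, x) in equation 3."""
    z = (bw - 3500.0) / 500.0
    # Direct nonlinear effect of birth weight plus an exposure-by-weight
    # interaction; every coefficient is hypothetical.
    logit = -6.5 - 0.5 * z + 0.2 * z ** 2 + exposed * (0.4 + 0.25 * z)
    return 1.0 / (1.0 + math.exp(-logit))


def crude_mortality(bw):
    """Pr(NM = 1 | BW = bw), averaging over exposure (discrete equation 4)."""
    w0 = (1.0 - PREVALENCE) * normal_pdf(bw, BW_MEAN[0], BW_SD)
    w1 = PREVALENCE * normal_pdf(bw, BW_MEAN[1], BW_SD)
    return (w0 * mortality(bw, 0) + w1 * mortality(bw, 1)) / (w0 + w1)


if __name__ == "__main__":
    for bw in range(1500, 6000, 500):
        print(bw, round(mortality(bw, 0), 4), round(mortality(bw, 1), 4),
              round(crude_mortality(bw), 4))
```

Under these assumed coefficients, the crude curve has an inverse-J shape, and the exposure-specific curves cross: exposed babies have lower mortality at low birth weights and higher mortality elsewhere. No rare, extreme exposure is needed, but the result depends entirely on the functional form chosen.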
There is biologic plausibility to both scenarios: that of
no direct effect of birth weight on neonatal mortality and that of
at least a small direct effect with or without interactions. In
addition, both scenarios have similar practical implications.
Under the one presented by Basso et al., the putative "rare"
exposures will be difficult to uncover, although perhaps we
could identify them among the causes of death attributed to
low-birth-weight babies on death certificates. Unfortunately,
the same is true under the alternative scenario. If an interaction
between fetal growth and a common factor is responsible
for the observed association, it would be very difficult
to uncover the responsible interactive and perhaps nonlinear
factor.
CONCLUSIONS
After considering the assumptions of Basso et al., several
questions merit deliberation to determine the future course
of research. First, is the assumption of no causal link between
birth weight and neonatal mortality correct? If so,
Basso et al. have shown us that the shape of the birth
weight-specific mortality curve might result from the presence
of very rare confounders with very large effects. This
finding suggests a search for the proverbial needle in the
haystack. Second, if a causal link does exist, what is the
nature of this link? If we relax the model to allow a causal
link, the inverse-J pattern of birth weight-specific mortality
could also be explained by a range of more common and
weaker confounders that might have nonlinear effects and
might interact with other risk factors. That is, speculations
under the relaxed assumption basically take us back to square
one, where, if there is a needle in the haystack, we would
have to accidentally sit on it to find it. Interestingly, perhaps
we have to conclude that unrestricted models are correct, but
often useless.
We have shown how the model used by Basso et al. may
fail because of violations of the assumptions on which it is
based. However, we should heed the lessons of their simple
model, which opens up the appealing possibility of no direct
causal effect between birth weight and neonatal mortality
and shows how, in this context, the putative confounder
would have to be rare and extremely strong. From a methodological
point of view, their findings should change the
way we consider birth weight data when evaluating the effect
of other risk factors on perinatal outcomes (2). More
importantly, if they are right and no causal link exists, our
research efforts to reduce neonatal mortality would be well
advised to shift away from birth weight and toward direct
causes of infant mortality and morbidity and toward broadening
our efforts to acknowledge this possibility.
ACKNOWLEDGMENTS
Supported by the Intramural Research Program of the
National Institutes of Health, National Institute of Child
Health and Human Development.
Conflict of interest: none declared.
REFERENCES
1. Basso O, Wilcox AJ, Weinberg CR. Birth weight and mortality: causality or confounding? Am J Epidemiol 2006;164:303–11.
2. Hernández-Díaz S, Schisterman EF, Hernán MA. The birth weight paradox uncovered? Am J Epidemiol (in press).