Cross-Race Misaggregation: Its Detection, a Mathematical Decomposition, and Simpson's Paradox

Abstract

Researchers sometimes aggregate data, such as combining resident data into state-level means. Doing so can sometimes cause valid individual-level data to be invalid at the group level. We focus on cross-race misaggregation, which can occur when individual-level data are confounded with race. We discuss such misaggregation in the context of Simpson’s Paradox and identify 4 diagnostic indicators: aggregated rates that correlate strongly with the relative size of one or more subgroup(s), unequal sample sizes across subgroups, unequal rates or mean values across subgroups, and aggregated rates that do not correlate with subgroup rates. To illustrate these diagnostic indicators, we decomposed data on the prevalence of sexually transmitted diseases (STDs) to confirm cross-race misaggregation in Parasite Stress U.S.A., an ostensible index of parasite prevalence known to be confounded with the proportion of African American residents per state.
Bryan L. Koenig
Washington University in St. Louis
Florian van Leeuwen
University of Lyon
Justin H. Park
University of Bristol
Keywords: ecological fallacy, parasite-stress theory, population demographics, Simpson’s Paradox, sexually transmitted diseases
Supplemental materials: http://dx.doi.org/10.1037/ebs0000067.supp
Researchers sometimes test hypotheses by analyzing data aggregated at the level of U.S. states or countries. Such analyses face several obstacles to validity: Data points are nonindependent, measures might mean different things in different countries, and individual-level or subgroup relationships cannot be reliably inferred from group-level aggregated data (Pollet, Tybur, Frankenhuis, & Rickard, 2014). We elaborate on one cause of the last obstacle: a validity threat we refer to as misaggregation. By this we mean valid data combined such that the aggregate represents neither what the researcher intended it to represent, nor what it represented at the individual level. Instead, the aggregate is undermined by a confounder variable such that a true effect at the individual or subgroup level is obscured at the aggregate level. Confounder variables can be hard to identify, making misaggregation easy to overlook. We focus on race, a relatively easy-to-identify confounder variable, although many variables are potential confounders across states or countries, such as poverty levels, ethnic groups, and rural-versus-urban residency rates. To help researchers who use aggregated data avoid using invalid aggregated variables, we connect misaggregation to Simpson’s Paradox, identify four red flags that can help researchers detect misaggregation, and show how cross-race misaggregation can occur. Our primary illustration uses Parasite Stress U.S.A., a variable that was intended to be an index of pathogen prevalence for the 50 states of the U.S.A. (Fincher & Thornhill, 2012) but that has been found to be invalid as a result of confounding with race (Hackman & Hruschka, 2013; Hruschka & Hackman, 2014).
Bryan L. Koenig, University College, Washington University in St. Louis; Florian van Leeuwen, Laboratoire Dynamique du Langage, University of Lyon; Justin H. Park, School of Experimental Psychology, University of Bristol.
Correspondence concerning this article should be addressed to Bryan L. Koenig, University College, Washington University in St. Louis, One Brookings Drive, Campus Box 1085, St. Louis, MO 63130. E-mail: bryanleekoenig@gmail.com
This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
Evolutionary Behavioral Sciences © 2016 American Psychological Association
2016, Vol. 10, No. 1, 000 2330-2925/16/$12.00 http://dx.doi.org/10.1037/ebs0000067
Simpson’s Paradox and Cross-Race Misaggregation
A key problem with aggregated data arises when relationships observed at the group level are not the same as the relationships that occur at the subgroup or individual level. This is Simpson’s Paradox¹ (Simpson, 1951). It can be seen in the following example, in which an effect of treatment is present in subgroups of men and women but disappears when their data are aggregated into a single group. Survivorship can be higher for treated men (61%, or 8/13) compared with untreated men (57%, or 4/7) as well as for treated women (44%, or 12/27) compared with untreated women (40%, or 2/5). When aggregated across the sexes, however, the efficacy of the treatment disappears: treated people (50%, or 20/40) survive no better than untreated people (50%, or 6/12; Simpson, 1951). The possibility of divergence in relationships across levels implies that researchers should not infer that relationships observed at the group level hold for subgroups or individuals (i.e., the ecological fallacy; Robinson, 1950). Simpson’s Paradox can occur when the relationship differs across subgroups as a result of a third, confounder variable.
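The arithmetic of this example can be checked directly. The following is a minimal sketch in Python that uses only the counts quoted above; the names (counts, rate, pooled) are ours and purely illustrative.

```python
from fractions import Fraction

# (survived, total) counts taken from the example in the text.
counts = {
    ("men", "treated"): (8, 13),
    ("men", "untreated"): (4, 7),
    ("women", "treated"): (12, 27),
    ("women", "untreated"): (2, 5),
}


def rate(survived, total):
    """Exact survival rate as a fraction."""
    return Fraction(survived, total)


def pooled(arm):
    """Survival rate for one treatment arm after pooling both sexes."""
    survived = sum(s for (_sex, a), (s, _n) in counts.items() if a == arm)
    total = sum(n for (_sex, a), (_s, n) in counts.items() if a == arm)
    return rate(survived, total)


for sex in ("men", "women"):
    treated = rate(*counts[(sex, "treated")])
    untreated = rate(*counts[(sex, "untreated")])
    print(f"{sex}: treated {float(treated):.1%} vs. untreated {float(untreated):.1%}")

print(f"pooled: treated {float(pooled('treated')):.1%} "
      f"vs. untreated {float(pooled('untreated')):.1%}")
```

Within each sex the treated rate is higher, yet the pooled rates are identical because the treated group is weighted toward women, whose survival is lower overall.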
Known methods for detecting Simpson’s Paradox are pertinent to detecting misaggregation. Kievit, Frankenhuis, Waldorp, and Borsboom (2013) provided four methods to detect Simpson’s Paradox. First, if the data are bivariate and continuous, one can look at a scatterplot for any obvious subgroups with different patterns of results. Second, for contingency tables with an observed relationship at the aggregate level, one can compute a chi-square test of independence to see whether the frequency distributions differ across subgroups. If they differ, then the subgroups should be analyzed separately. Third, researchers using regression can check the residuals for systematic (subgroup-based) differences in homoscedasticity, which could reflect the different slopes of the subgroups. The fourth diagnosis technique is to use latent cluster analysis to detect subgroups whose patterns of results diverge. Such clusters are based on their position on a bivariate scatterplot, although the technique can also be applied to multiple regression. (Kievit & Epskamp, 2012, made available an analysis tool that detects diverging clusters and statistically evaluates whether the observed relationship of interest differs across those clusters.)
However, for Parasite Stress U.S.A., a measure of infectious-disease prevalence aggregated at the level of U.S. states (and known to be confounded with race), the diagnostic methods of Kievit and colleagues (2013) do not reliably indicate Simpson’s Paradox (see Supplemental Material S1 available online). This is because the checks assume that each case has only one level on the confounder variable. For example, suppose half of the participants are male and the other half female, and a positive correlation is observed when analyzing all participants together, but a negative correlation is observed within each sex. In such cases the confounder variable might be considered a between-cases confound because each data point is associated with only one level of the confounder variable. Parasite Stress U.S.A. (and likely other variables suffering from cross-race misaggregation) differs systematically in that the confounder can be thought of as a within-cases confound: each data point has within it all levels of the confounder variable. For such data, the methods suggested by Kievit and colleagues (2013) would be unlikely to reveal misaggregation.
¹ Tu, Gunnell, and Gilthorpe (2008) argue that many labels, such as Simpson’s Paradox (Simpson, 1951), Lord’s Paradox (Lord, 1967), or suppression, refer to the same underlying phenomenon, the reversal paradox. We use the best known term, Simpson’s Paradox, to refer to reversals regardless of other features of the situation.
Red Flags for Cross-Race Misaggregation
Given the frequency of research that uses data aggregated across demographic characteristics such as race, researchers could benefit from evaluating their data for the following four red flags, which are suggestive of misaggregation. The first red flag is when aggregated rates correlate strongly with the relative size of one or more subgroups. The second is when sample sizes are unequal across subgroups. The third is when rates or mean values differ across subgroups. The fourth is when aggregated rates correlate weakly or not at all with subgroup rates. Checks for the red flags require access to information about subgroups, which might be difficult to obtain. In many cases, relevant information is available for demographic subgroups, such as the population sizes of different racial groups. Thus the check for the first red flag is relatively easy to do for cross-race misaggregation. The first three checks are suggestive; the fourth is diagnostic (i.e., if total rates correlate with the rates of all subgroups, misaggregation has not occurred across those subgroups). We illustrate these red flags with Parasite Stress U.S.A.
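The four checks can be expressed as a short routine. What follows is a minimal sketch in Python, not the procedure used in this article: it assumes only two subgroups that together make up each unit's population, the cutoffs (strong, weak, ratio) are hypothetical values chosen for illustration, the simple ratio test stands in for a formal test of subgroup differences, and the six data points are synthetic.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean(values):
    return sum(values) / len(values)


def red_flag_checks(share_a, rate_a, rate_b, strong=0.5, weak=0.3, ratio=2.0):
    """Evaluate the four red flags for subgroup A (minority) and B (majority).

    share_a: subgroup A's population share per unit (B's share is 1 - share_a).
    rate_a, rate_b: subgroup rates per unit.
    strong, weak, ratio: hypothetical cutoffs, not taken from the article.
    """
    share_b = [1 - s for s in share_a]
    # Aggregate rate per unit is the share-weighted mix of subgroup rates.
    total = [sa * ra + sb * rb
             for sa, ra, sb, rb in zip(share_a, rate_a, share_b, rate_b)]

    def unequal(u, v):
        high, low = max(mean(u), mean(v)), min(mean(u), mean(v))
        return high / low >= ratio

    weakest = min(abs(pearson_r(total, rate_a)), abs(pearson_r(total, rate_b)))
    return {
        "flag1_total_tracks_subgroup_size": abs(pearson_r(total, share_a)) >= strong,
        "flag2_unequal_subgroup_sizes": unequal(share_a, share_b),
        "flag3_unequal_subgroup_rates": unequal(rate_a, rate_b),
        "flag4_total_ignores_a_subgroup_rate": weakest < weak,
    }


# Synthetic units (not data from the article).
share_a = [0.02, 0.05, 0.10, 0.15, 0.25, 0.30]
rate_a = [2050, 1900, 2100, 1800, 2000, 1950]
rate_b = [160, 150, 170, 155, 165, 158]

flags = red_flag_checks(share_a, rate_a, rate_b)
for name, raised in flags.items():
    print(name, raised)
```

In these synthetic units all four flags are raised: the aggregate rate follows the minority share almost perfectly while being nearly unrelated to either subgroup's own rate.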
Parasite Stress U.S.A.
We build on research that has identified problems with an aggregated index of pathogens, Parasite Stress U.S.A., that has been used in tests of parasite stress theory. Parasite stress theory proposes that norms and practices that reduce the likelihood of pathogenic infection will be heightened among cultural groups situated in regions with more instances of infectious disease (Thornhill & Fincher, 2011). Evidence presented in support of parasite stress theory includes correlations of sociality variables with Parasite Stress U.S.A. or similar pathogen indexes (Fincher & Thornhill, 2012; Shrira, Wisman, & Webster, 2013; Thornhill & Fincher, 2011; Varnum, 2013, 2014). However, these findings may be invalid as Parasite Stress U.S.A. is confounded with the percentage of state populations that was African American, %Black (Hruschka & Hackman, 2014). The confounding of Parasite Stress U.S.A. with %Black is argued to have resulted from African Americans having higher STD rates than non-Hispanic Whites and %Black varying substantially across states. Hruschka and Hackman (2014) provide suggestions for researchers who desire to use aggregated data but avoid the pitfalls, such as replicating with new data and at multiple levels, considering historical and social context, and testing alternative hypotheses.
To further demonstrate and elucidate the confounding of Parasite Stress U.S.A. with %Black, we removed components of the data from the numerator and denominator of the aggregate to evaluate the contributions of the components. This decomposition confirms that unequal STD rates across racial subgroups played a key role. It also highlights the importance of unequal sample sizes (i.e., the proportion of state populations composed of non-Hispanic Whites compared with African Americans). Our decomposition did not address the contribution of variation in %Black; however, that variation was important because if %Black were constant across states, then the numerator would have been driven by the relatively high STD rates of African Americans and the aggregate would have been strongly correlated with STD rates of African Americans rather than with %Black. In short, cross-race misaggregation can result when a minority subgroup has disproportionate influence on the numerator but not the denominator of an aggregate index.
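The role of subgroup shares can be made explicit with a standard identity for aggregate rates. This is a sketch in our own notation rather than a formula from the original analysis: for state s and subgroup g, p denotes the subgroup's population share and r its rate, and the second line assumes, for illustration only, that the population consists of just two subgroups.

```latex
\begin{align*}
\text{rate}_{\text{total},s}
  &= \frac{\text{CG}_{\text{total},s}}{\text{pop}_{\text{total},s}}
   = \sum_{g} \frac{\text{pop}_{g,s}}{\text{pop}_{\text{total},s}}
     \cdot \frac{\text{CG}_{g,s}}{\text{pop}_{g,s}}
   = \sum_{g} p_{g,s}\, r_{g,s}, \\
\text{rate}_{\text{total},s}
  &= r_{\text{white},s}
   + p_{\text{black},s}\,\bigl(r_{\text{black},s} - r_{\text{white},s}\bigr)
   \qquad \text{(two subgroups only).}
\end{align*}
```

Under the two-subgroup assumption, when the gap between subgroup rates is large and varies little across states relative to the variation in the minority share, the aggregate rate is approximately a linear function of that share, which is the pattern described above.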
Method
We obtained Parasite Stress U.S.A. from supplementary materials of Fincher and Thornhill (2012). They derived Parasite Stress U.S.A. from the Summary of Notifiable Diseases, United States for years 1993 to 2007, part of the annual Morbidity and Mortality Weekly Report of the Centers for Disease Control and Prevention (CDC). It is a standardized measure of the total incidence of all notifiable diseases reported by all states for a year divided by the population, calculated separately for each state.
Hackman and Hruschka (2013) demonstrated that Parasite Stress U.S.A. mainly represents STDs, because STD cases dwarf cases of other notifiable diseases. Using data from the CDC WONDER spanning 1998–2009, Hackman and Hruschka developed STD indexes using the two most common STDs, chlamydia and gonorrhea (CG), for the total population and separately for non-Hispanic Whites and African Americans. For the total population (i.e., collapsed over racial subgroups), the CG index was strongly correlated with Parasite Stress U.S.A.; r = .95, N = 50, p < .001. They reasoned that if parasite stress theory is valid, the pattern observed when collapsed over racial subgroups should also occur when analyzing the White and African American subsamples separately. We also used CDC WONDER data (U.S. Department of Health and Human Services, 2011) to recreate CG rates for our analyses. We developed three indexes of CG rates per 100,000 residents: one for the whole population, CG rates_total; another for non-Hispanic Whites, CG rates_white; and one for non-Hispanic African Americans, CG rates_black. For consistency we also used this data source for calculating values of %Black (note that this source for %Black differs from that used by Hackman & Hruschka, 2013, and that their African American population included Hispanics).
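As an illustration of how such indexes are computed from case counts and population sizes, here is a minimal sketch in Python. The record layout (StateCounts and its field names) is hypothetical and does not mirror the CDC WONDER export format, and the numbers in the example are invented.

```python
from dataclasses import dataclass


@dataclass
class StateCounts:
    """Hypothetical per-state record; field names are illustrative only."""
    cases_total: int
    cases_white: int
    cases_black: int
    pop_total: int
    pop_white: int
    pop_black: int


def per_100k(cases, population):
    """Cases per 100,000 residents."""
    return 100_000 * cases / population


def cg_indexes(state):
    """Three rate indexes plus %Black for one state."""
    return {
        "cg_rate_total": per_100k(state.cases_total, state.pop_total),
        "cg_rate_white": per_100k(state.cases_white, state.pop_white),
        "cg_rate_black": per_100k(state.cases_black, state.pop_black),
        "pct_black": 100 * state.pop_black / state.pop_total,
    }


# Invented numbers, for illustration only.
example = StateCounts(
    cases_total=5_000, cases_white=1_200, cases_black=3_000,
    pop_total=1_000_000, pop_white=700_000, pop_black=100_000,
)
indexes = cg_indexes(example)
for name, value in indexes.items():
    print(f"{name}: {value:.2f}")
```

Each race-stratified rate divides a subgroup's cases by that subgroup's own population, whereas the total rate divides all cases by the whole population.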
Results
Parasite Stress U.S.A. Failed All Red Flag Checks
Reproducing Hackman and Hruschka (2013), CG rates_total strongly correlated with Parasite Stress U.S.A., r = .96, N = 50, p < .001. Previous reports have noted the high correlation between Parasite Stress U.S.A. and %Black (Eppig, Fincher, & Thornhill, 2011; Hackman & Hruschka, 2013). Our data showed the same pattern: Parasite Stress U.S.A. correlated strongly with %Black, r = .90, N = 50, p < .001. Thus, Parasite Stress U.S.A. failed the check for the first red flag.
Parasite Stress U.S.A. failed the checks for the second and third red flags because across states non-Hispanic Whites and African Americans diverged substantially in sample size and rates of STDs, respectively. African Americans, on average, constituted only 7.50% of state populations whereas non-Hispanic Whites constituted 60.68% of state populations. In addition, across U.S. states, the African American CG rates per 100,000 (Mdn = 1,810.79, M = 1,954.96, SD = 747.72) were an order of magnitude higher than those of non-Hispanic Whites (Mdn = 157.78, M = 162.95, SD = 54.83), d = 2.45, Wilcoxon signed-ranks test, Z = 6.15, p < .001.
If Parasite Stress U.S.A. were simply a race-independent index of STD rates, it might correlate with race-stratified CG rates. The strongest correlation might be observed with the CG rates of the largest racial subgroup, non-Hispanic Whites. However, Parasite Stress U.S.A. did not correlate with CG rates_white, r = −.03, N = 50, p = .858 (nor was CG rates_total correlated with CG rates_white, r = .13, N = 50, p = .354). Alternatively, given that African Americans accounted for a large number of CG cases in absolute numbers, Parasite Stress U.S.A. might correlate with CG rates_black. This also did not occur, r = .11, N = 50, p = .464 (nor was CG rates_total correlated with CG rates_black, r = .23, N = 50, p = .112). Parasite Stress U.S.A. therefore failed the check for the fourth red flag.
Decomposition of Parasite Stress U.S.A.: Why Aggregate Rates Correlate With %Black
Parasite Stress U.S.A. is uncorrelated with race-stratified CG rates, but is strongly correlated with %Black. This is because the numerator of Parasite Stress U.S.A. was mostly STDs, and a large proportion of CG cases were in the African American subcomponent of state populations. This results in African American CG cases strongly influencing the numerator of CG rates_total across U.S. states (and by implication Parasite Stress U.S.A.), but not its denominator, which is largely determined by members of other races.
If African American CG cases exert such a strong influence on Parasite Stress U.S.A., then removing them from the numerator of CG rates_total should reduce its correlation with Parasite Stress U.S.A. On the other hand, including only African American CG cases in CG rates_total should not reduce the correlation much. These predictions were confirmed (see Table 1). Removing African American CG cases eliminated the significant correlations of CG rates_total with both Parasite Stress U.S.A. and %Black. Conversely, using only African American CG cases hardly changed these correlations. For comparison, we excluded non-Hispanic White CG cases (see Table 1). Doing so had little effect on the relationships of CG rates_total with either Parasite Stress U.S.A. or %Black. By contrast, including only non-Hispanic White CG cases dramatically altered the relationships. These results indicate that African American CG cases were critical for the strong relationships of CG rates_total and Parasite Stress U.S.A. with %Black.
To see whether African American or non-Hispanic White population sizes had a strong influence on the denominator, we excluded each in turn. Removing non-Hispanic Whites from state populations substantially altered correlations of CG rates_total with Parasite Stress U.S.A. and %Black, but removing African Americans from the population size did not (Supplemental Material S2 available online). Therefore, African Americans had little influence over the denominator of CG rates_total, and by implication Parasite Stress U.S.A. In sum, African American CG cases can account for the observed strong relationships of Parasite Stress U.S.A. and CG rates_total with %Black.
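The numerator decomposition can be sketched as follows. This is a minimal illustration in Python on six synthetic states, not a reanalysis of the study data; the tuple layout and variable names are ours, and the modified rates follow the formulas given with Table 1 (the same denominator, pop_total, with subgroup cases removed from or isolated in the numerator).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Synthetic states (not study data):
# (pop_total, pop_black, cases_total, cases_black, cases_white)
states = [
    (1_000_000, 20_000, 2_078, 410, 1_568),
    (1_000_000, 50_000, 2_495, 950, 1_425),
    (1_000_000, 100_000, 3_720, 2_100, 1_530),
    (1_000_000, 150_000, 4_128, 2_700, 1_318),
    (1_000_000, 250_000, 6_343, 5_000, 1_238),
    (1_000_000, 300_000, 7_051, 5_850, 1_106),
]

pct_black = [100 * pop_black / pop_total for pop_total, pop_black, *_ in states]

# Each variant changes only the numerator; the denominator stays pop_total.
numerators = {
    "total": lambda total, black, white: total,
    "no_black": lambda total, black, white: total - black,
    "no_white": lambda total, black, white: total - white,
    "only_black": lambda total, black, white: black,
    "only_white": lambda total, black, white: white,
}

correlations = {}
for name, numerator in numerators.items():
    rates = [100_000 * numerator(ct, cb, cw) / pop_total
             for pop_total, _pop_black, ct, cb, cw in states]
    correlations[name] = pearson_r(rates, pct_black)

for name, r in correlations.items():
    print(f"{name}: r with %Black = {r:.2f}")
```

In these synthetic states, rates built from the full numerator or from the minority subgroup's cases alone track %Black closely, whereas removing that subgroup's cases removes the positive relationship. The negative correlation that remains here is a property of the invented numbers and should not be read as an estimate of the values reported in Table 1.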
Other Examples of Checks for Cross-Race Misaggregation
Misaggregation may have invalidated other aggregated variables used in research on ecological (environmental) effects. For illustration purposes, we identified three for which African Americans and non-Hispanic Whites were likely to differ: incarceration rates (Kruger & De Loney, 2009), life expectancy (Thornhill & Fincher, 2011; Eppig et al., 2011), and homicide rates (Hackman & Hruschka, 2013; Shrira et al., 2013; Thornhill & Fincher, 2011). Table 2 illustrates with these three variables how the identified red flags indicate the presence or absence of cross-race misaggregation. For incarceration rates, the fourth red flag for total, state-level rates strongly suggests they may be confounded with race. Removing African American cases resulted in the correlation between total incarceration rates and %Black becoming negative, r = −.35, p = .016, and including only African American cases increased it, r = .96, p < .001. Thus the decomposition of state-level incarceration rates showed a pattern similar to that observed for CG rates_total and Parasite Stress U.S.A. (see Supplemental Materials S3). This suggests that total incarceration rates indeed suffer from cross-race misaggregation. Despite three suggestive red flags for both life expectancy and homicide rates, total state-level values strongly correlated with both race-stratified rates; therefore, the fourth, diagnostic check indicates no misaggregation for life expectancy or homicide rates. The absence of misaggregation might be attributable to positive correlations for African Americans with non-Hispanic Whites (life expectancy, r(36) = .65, p < .001; homicide rates, r(45) = .38, p = .009).
Discussion
Researchers are sometimes interested in testing whether ecological variables motivate particular kinds of behavior, using this as evidence of context-specific adaptations. If an ecological variable is confounded with demographic variables, then we are mistaking cross-group differences, which might have any number of contextual or historical bases, with the ecological effect that we are specifically hypothesizing. Our analyses showed how data on infectious disease rates are confounded with race, and they illustrated four red flags suggestive of misaggregation. Although identifying confounding variables can be hard (and showing that no confounder is present may be impossible), researchers compiling and using an aggregated variable may check for these red flags. The red flags can be checked for demographic variables other than race (e.g., age, income). When the red flags indicate misaggregation, statistically controlling for the confounder variable (e.g., by including %Black as a predictor) might not solve the problem (e.g., because of multicollinearity). Stratified analyses may provide more valid inferences (Hruschka & Hackman, 2014). If lower-level data (e.g., stratified by race) are unavailable, these can sometimes be estimated from aggregated data (i.e., ecological inference) using methods developed in political science (King, 1997; Rosen, Jiang, King, & Tanner, 2001). Confounders can be hard to identify, so even in the absence of any of the identified red flags we advise researchers to be cautious when using aggregated data.

Table 1
Decomposition of Parasite Stress USA: Correlations Among Parasite Stress USA, Population Percentage African American (%Black), and CG Rates_total

Variable              CG rates_total       No CG_black                      No CG_white                      Only CG_black        Only CG_white
                      CG_total/pop_total   (CG_total − CG_black)/pop_total  (CG_total − CG_white)/pop_total  CG_black/pop_total   CG_white/pop_total
Parasite stress USA   .96***               .26                              .97***                           .93***               .45***
%Black                .80***               .10                              .84***                           .89***               .34
CG rates_total                             .42**                            .97***                           .84***               .20

Note. Modified CG rates_total were produced by excluding African American or non-Hispanic White CG cases from the total number of cases, or including only African American or non-Hispanic White CG cases. The formulas for various calculations of CG rates are shown.
† p < .10. * p < .05. ** p < .01. *** p < .001.
References
Eppig, C., Fincher, C. L., & Thornhill, R. (2011). Parasite prevalence and the distribution of intelligence among the states of the USA. Intelligence, 39, 155–160. http://dx.doi.org/10.1016/j.intell.2011.02.008
Fincher, C. L., & Thornhill, R. (2012). Parasite-stress promotes in-group assortative sociality: The cases of strong family ties and heightened religiosity. Behavioral and Brain Sciences, 35, 61–79. http://dx.doi.org/10.1017/S0140525X11000021
Hackman, J., & Hruschka, D. (2013). Fast life histories, not pathogens, account for state-level variation in homicide, child maltreatment, and family ties in the US. Evolution and Human Behavior, 34, 118–124. http://dx.doi.org/10.1016/j.evolhumbehav.2012.11.002
Harrison, P. M., & Beck, A. J. (2006). Prison and jail inmates at midyear 2005. NCJ, 213133.
Hruschka, D. J., & Hackman, J. (2014). When are cross-group differences a product of a human behavioral immune system? Evolutionary Behavioral Sciences, 8, 265–273. http://dx.doi.org/10.1037/ebs0000013
Kievit, R. A., & Epskamp, S. (2012). Simpsons: Detecting Simpson’s paradox. R package version 0.1.0. Retrieved from http://CRAN.R-project.org/package=Simpsons
Kievit, R. A., Frankenhuis, W. E., Waldorp, L. J., & Borsboom, D. (2013). Simpson’s paradox in psychological science: A practical guide. Frontiers in Psychology, 4, 513.
King, G. (1997). A solution to the ecological inference problem: Reconstructing individual behavior from aggregate data. Princeton, NJ: Princeton University Press.
Kruger, D. J., & De Loney, E. H. (2009). The association of incarceration with community health and racial health disparities. Progress in Community Health Partnerships: Research, Education, and Action, 3, 113–121. http://dx.doi.org/10.1353/cpr.0.0066
Lord, F. M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304–305. http://dx.doi.org/10.1037/h0025105
Pollet, T. V., Tybur, J. M., Frankenhuis, W. E. F., & Rickard, I. J. (2014). What can cross-cultural correlations teach us about human nature? Human Nature, 25, 410–429. http://dx.doi.org/10.1007/s12110-014-9206-3
Table 2
Illustrative Red Flag Checks for Three State-Level Aggregate Variables: Incarceration, Life Expectancy, and Homicide

Variable (total      r with   First red  Mean African      Mean non-Hispanic  Third red  r with African  r with non-Hispanic  Fourth red  Cross-race
state-level rate)    %Black   flag?      American value    White value        flag?      American rates  White rates          flag?       misaggregation?
Incarceration (a)    .65      Yes        2,572.63/100,000  425.23/100,000     Yes (b)    .03             .77                  Yes         Yes
Life expectancy (c)  .57      Yes        74.46 years       78.62 years        Yes (b)    .73             .95                  No          No
Homicide (d)         .70      Yes        22.99/100,000     2.97/100,000       Yes (b)    .62             .58                  No          No

Note. These examples pertain to U.S. states and so all exhibit the second red flag of unequal racial subgroup sizes.
(a) Data from Harrison and Beck, 2006; excludes New Mexico and Wyoming.
(b) Rates differ across races significantly, p < .001, using Wilcoxon signed-rank test.
(c) Life expectancy at birth; data from www.measureofamerica.org, 2010–2011 dataset; correlation sample sizes are N = 45 for %Black, N = 38 for African American mean, and N = 50 for non-Hispanic White mean.
(d) Data from the Uniform Crime Report (Federal Bureau of Investigation 2003, 2005, 2006, 2007, & 2009), stratified by offender race; no data for Florida; following Hackman and Hruschka (2013), we excluded data from New Mexico and Nevada because they had large Hispanic populations but did not distinguish Hispanic from non-Hispanic Whites; other states with large Hispanic populations distinguished between Hispanic and non-Hispanic Whites; N = 49 for total and African American, N = 47 for non-Hispanic White.
p < .001.
Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American Sociological Review, 15, 351–357. http://dx.doi.org/10.2307/2087176
Rosen, O., Jiang, W., King, G., & Tanner, M. A. (2001). Bayesian and frequentist inference for ecological inference: The RC case. Statistica Neerlandica, 55, 134–156. http://dx.doi.org/10.1111/1467-9574.00162
Shrira, I., Wisman, A., & Webster, G. (2013). Guns, germs, and stealing: Exploring the link between infectious disease and crime. Evolutionary Psychology, 11, 270–287. http://dx.doi.org/10.1177/147470491301100124
Simpson, E. H. (1951). The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society Series B. Methodological, 13, 238–241.
Thornhill, R., & Fincher, C. L. (2011). Parasite stress promotes homicide and child maltreatment. Philosophical Transactions of the Royal Society of London Series B, Biological Sciences, 366, 3466–3477. http://dx.doi.org/10.1098/rstb.2011.0052
Tu, Y. K., Gunnell, D., & Gilthorpe, M. S. (2008). Simpson’s Paradox, Lord’s Paradox, and Suppression Effects are the same phenomenon – The reversal paradox. Emerging Themes in Epidemiology, 5, 2. http://dx.doi.org/10.1186/1742-7622-5-2
United States Department of Justice. Federal Bureau of Investigation. (2003). Uniform Crime Reporting Program Data [United States]: Arrests by age, sex, and race, summarized yearly, 2003 (ICPSR27651-v1). Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2010-03-11. http://dx.doi.org/10.3886/ICPSR27651.v1
United States Department of Justice. Federal Bureau of Investigation. (2005). Uniform Crime Reporting Program Data [United States]: Arrests by age, sex, and race, summarized yearly, 2005 (ICPSR04716-v2). Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2009-09-28. http://dx.doi.org/10.3886/ICPSR04716.v2
United States Department of Justice. Federal Bureau of Investigation. (2006). Uniform Crime Reporting Program Data [United States]: Arrests by age, sex, and race, summarized yearly, 2006 (ICPSR22405-v1). Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2009-09-28. http://dx.doi.org/10.3886/ICPSR22405.v1
United States Department of Justice. Federal Bureau of Investigation. (2007). Uniform Crime Reporting Program Data [United States]: Arrests by age, sex, and race, summarized yearly, 2007 (ICPSR25106-v1). Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2009-09-28. http://dx.doi.org/10.3886/ICPSR25106.v1
United States Department of Justice. Federal Bureau of Investigation. (2009). Uniform Crime Reporting Program Data: Arrests by age, sex, and race, summarized yearly, 2009 (ICPSR30762-v1). Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011-09-30. http://dx.doi.org/10.3886/ICPSR30762.v1
U.S. Department of Health and Human Services. (2011, June). Centers for Disease Control and Prevention, National Center for HIV, STD, and TB Prevention (NCHSTP), Division of STD/HIV Prevention, Sexually Transmitted Disease Morbidity for selected STDs by age, race/ethnicity and gender 1996–2009 [CDC WONDER online database]. Retrieved 16 June, 2013, from http://wonder.cdc.gov/std-v2009-race-age.html
Vandello, J. A., & Cohen, D. (1999). Patterns of individualism and collectivism across the United States. Journal of Personality and Social Psychology, 77, 279–292. http://dx.doi.org/10.1037/0022-3514.77.2.279
Varnum, M. E. W. (2013). Frontiers, germs, and nonconformist voting. Journal of Cross-Cultural Psychology, 44, 832–837. http://dx.doi.org/10.1177/0022022112466591
Varnum, M. E. W. (2014). Sources of regional variation in social capital in the United States: Frontiers and pathogens. Evolutionary Behavioral Sciences, 8, 77–85. http://dx.doi.org/10.1037/h0098950
Received May 19, 2015
Revision received November 18, 2015
Accepted November 19, 2015
Article
Full-text available
Several scholars have proposed that behavioral immune responses can account for worldwide human diversity in several behavioral and cognitive domains. Testing such claims generally relies on observational, cross-population data sets, posing challenges for causal inference. In this paper we describe four key pitfalls to using such data to test hypotheses for cross-population diversity based on a behavioral immune system. These issues are associated with (a) representativeness of sampling populations, (b) statistical artifacts of aggregation, (c) clustered data, and (d) spurious associations through unmeasured variables. We argue that these issues can be mitigated through careful attention to research design, analytic strategies, and serious treatments of alternative hypotheses.
Article
Full-text available
Many recent evolutionary psychology and human behavioral ecology studies have tested hypotheses by examining correlations between variables measured at a group level (e.g., state, country, continent). In such analyses, variables collected for each aggregation are often taken to be representative of the individuals present within them, and relationships between such variables are presumed to reflect individual-level processes. There are multiple reasons to exercise caution when doing so, including: (1) the ecological fallacy, whereby relationships observed at the aggregate level do not accurately represent individual-level processes; (2) non-independence of data points, which violates assumptions of the inferential techniques used in null hypothesis testing; and (3) cross-cultural non-equivalence of measurement (differences in construct validity between groups). We provide examples of how each of these gives rise to problems in the context of testing evolutionary hypotheses about human behavior, and we offer some suggestions for future research.
Article
Full-text available
The present study sought to explore the contribution of a number of factors to regional variation among U.S. states in social capital with a focus on the impact of frontier settlement and levels of pathogen prevalence. As predicted, date of statehood was positively correlated with state-level scores on Putnam’s social capital index, as well as generalized trust, number of group memberships, and hours spent volunteering, and parasite stress was negatively correlated with these four variables. Controlling for parasite stress eliminated or reduced the relationship between date of statehood and each of these variables, suggesting that differences in parasite stress may underlie differences between frontier and nonfrontier regions of the U.S. in terms of social capital. The relationship between parasite stress and social capital was quite robust, with parasite stress remaining a significant predictor when controlling for a number of other factors. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
Article
Full-text available
An emerging literature has documented differences in values and behavioral practices (including conformity) between frontiers (areas that were more recently settled) and areas with a longer history of settlement. However, so far there have been few tests of which mechanisms might contribute to the maintenance of such regional differences. The present study provides the first test of the hypothesis that differences in pathogen prevalence might underlie this regional variation. Specifically, the relationship between frontier settlement, pathogen prevalence, and nonconformist voting was explored. Date of statehood, a proxy for recency of settlement, was positively correlated with votes for third-party candidates, and this relationship was partially mediated by pathogen prevalence. Theoretical and practical implications are discussed.
Article
The definition of second order interaction in a (2 × 2 × 2) table given by Bartlett is accepted, but it is shown by an example that the vanishing of this second order interaction does not necessarily justify the mechanical procedure of forming the three component 2 × 2 tables and testing each of these for significance by standard methods.*
Book
This book provides a solution to the ecological inference problem, which has plagued users of statistical methods for over seventy-five years: How can researchers reliably infer individual-level behavior from aggregate (ecological) data? In political science, this question arises when individual-level surveys are unavailable (for instance, local or comparative electoral politics), unreliable (racial politics), insufficient (political geography), or infeasible (political history). This ecological inference problem also confronts researchers in numerous areas of major significance in public policy, and other academic disciplines, ranging from epidemiology and marketing to sociology and quantitative history. Although many have attempted to make such cross-level inferences, scholars agree that all existing methods yield very inaccurate conclusions about the world. In this volume, Gary King lays out a unique--and reliable--solution to this venerable problem. King begins with a qualitative overview, readable even by those without a statistical background. He then unifies the apparently diverse findings in the methodological literature, so that only one aggregation problem remains to be solved. He then presents his solution, as well as empirical evaluations of the solution that include over 16,000 comparisons of his estimates from real aggregate data to the known individual-level answer. The method works in practice. King's solution to the ecological inference problem will enable empirical researchers to investigate substantive questions that have heretofore proved unanswerable, and move forward fields of inquiry in which progress has been stifled by this problem.
Article
Parasite stress theory has recently been used to account for an array of cross-cultural differences in human cognition and social behavior, including in-group bias, interpersonal violence, child maltreatment, and religious adherence. Here, we re-assess the apparently ubiquitous effects of parasite stress on behavior observed in the U.S., using the cross-sectional, cross-population approach implemented by prior pathogen stress studies. Our results raise two challenges to previous findings. First, we show that the observed effects of pathogen stress in the U.S. data are due exclusively to one type of infectious disease – sexually transmitted diseases (STD) – while non-STD infections have no effect. Second, we find that controlling for life history measures of extrinsic risk and a fast life history erases the observed associations with family ties, interpersonal violence, child fatalities, and religious adherence. Thus, after appropriate variable specification, stratification, and control, U.S. cross-state population differences provide no support for the pathogen stress hypothesis in these various domains of behavior. Rather, the findings are more consistent with predictions from life history theory.