Why I Asked the Editors of Criminology to Retract
Johnson, Stewart, Pickett, and Gertz (2011)
Justin T. Pickett
School of Criminal Justice
University at Albany, SUNY
ABSTRACT
My coauthors and I were informed about data irregularities in Johnson, Stewart, Pickett, and
Gertz (2011), and in my coauthors’ other articles. Subsequently, I examined my limited files and
found evidence that we: 1) included hundreds of duplicates, 2) underreported the number of
counties, and 3) somehow added another 316 respondents right before publication (and over a
year after the survey was conducted) without changing nearly any of the reported statistics
(means, standard deviations, regression coefficients). The survey company confirmed that it sent
us only 500 respondents, not the 1,184 reported in the article. I obtained and reanalyzed those
data. This report presents the findings from my reanalysis, which suggest that the sample was not
just duplicated. The data were also altered—intentionally or unintentionally—in other ways, and
those alterations produced the article’s main findings. Additionally, we misreported data
characteristics as well as aspects of our analysis and findings, and we failed to report the use of
imputation for missing data. The following eight findings emerged from my reanalysis:
1) The article reports 1,184 respondents, but actually there are 500.
2) The article reports 91 counties, but actually there are 326.
3) The article describes respondents that differ substantially from those in the data.
4) The article reports two significant interaction effects, but actually there are none.
5) The article reports the effect of Hispanic growth is significant and positive, but actually it
is non-significant and negative.
6) The article reports many other findings that do not exist in the data.
7) The standard errors are stable in our published article, but not in the actual data or in
articles published by other authors using similar modeling techniques with large samples.
8) Although never mentioned in the article, 208 of the 500 respondents in the data (or 42%)
have imputed values.
*Direct correspondence to Justin T. Pickett, School of Criminal Justice, University at Albany,
SUNY, 135 Western Avenue, Albany, NY 12222; e-mail: jpickett@albany.edu.
BACKGROUND
On May 5, 2019, my coauthors and I received an email that identified data irregularities
in our study, Johnson, Stewart, Pickett, and Gertz (2011), and in four other articles by my
coauthors. I asked my coauthors to send me the full data for Johnson et al. (2011), but
encountered difficulties getting them. Consequently, I examined the limited files I already had,
and discovered 500 unique respondents and 500 duplicates, as well as several other problems,
like inexplicable changes in sample size from manuscript draft (N = 868) to published article (N
= 1,184) that did not affect means, standard deviations, or regression coefficients. I sent an email
to my coauthors on June 6 that listed these issues and provided my files (see Appendix A). One
of my coauthors, Dr. Gertz, then contacted the former director of the Research Network, who
confirmed that the survey he ran for us included only 500 respondents. At that point, Dr. Stewart
sent me a copy of the data for our article (N = 500).
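The duplicate check itself is straightforward. Below is a minimal sketch of the kind of check involved, assuming the limited file is the Excel sheet named in Appendix A (justin_voting_data.xls), with one row per respondent and a "case" identifier; it is an illustration, not the exact steps I took:

    import pandas as pd

    # Load the limited data file (file name taken from Appendix A; path is hypothetical)
    df = pd.read_excel("justin_voting_data.xls")

    # Flag rows that are exact copies of an earlier row, ignoring the case number
    dup_mask = df.drop(columns=["case"]).duplicated(keep="first")

    print(dup_mask.sum(), "rows duplicate an earlier respondent")
    print((~dup_mask).sum(), "unique respondents remain")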
In the article, we claim to have 1,184 respondents nested in 91 counties. In the actual
data, there are only 500 respondents, and they are nested in 326 counties. Dr. Stewart
acknowledged that both the sample size and county number reported in the article were wrong.
He said the explanations for the differences were that: 1) he accidentally doubled the sample, and
2) he created 91 “county clusters” for the analysis by grouping together the 326 counties. The
published article never mentions county clusters, or grouping together counties. It is also unclear
how the sample of 500 grew to 1,184 in our article, and then to 1,379 in Dr. Stewart’s later
Social Problems article (Stewart et al., 2015), which uses the same data (with the same 54.8%
response rate, and same $62,700 mean family income). Duplicating the 500 respondents, as Dr.
Stewart said he did, would lead to a sample size of only 1,000.
I should note that my coauthors are working on an erratum. Dr. Stewart now says there
were two surveys conducted for our study, one with 500 respondents and one with 425, and that
the results for the combined sample (N = 925) are similar to those in the published article.
However, I am uncomfortable with the new results for four reasons. First, I have not seen them.
Dr. Stewart has not sent me the data for the second sample, and although he has sent Stata output
for the combined sample to the lead author, Dr. Johnson, he has asked him not to share it.
Second, the published article reports 1,184 respondents, not 925. Third, our published article lists
only one survey company—the Research Network—and one survey. Fourth, Dr. Stewart has
refused to tell me who conducted the second survey, and Dr. Johnson has said he does not know
who conducted it. This lack of transparency and accountability is why I have decided not to wait
for my coauthors to finish their reanalysis before asking for a retraction.
Before I discovered the duplicates in our data, Dr. Stewart had never mentioned a second
survey or any survey company besides the Research Network. The email I have from the
Research Network says it conducted one survey for Dr. Stewart (N = 500), sent that data to him
in January 2008, and then, at his request, sent the county identifiers for 420 of those same
respondents to him in May 2008. It did not conduct a second survey. Therefore, the remainder of
this report focuses on the findings from my reanalysis of the data I can account for (N = 500),
which are from the sample that our article describes (albeit inaccurately) and that Dr. Stewart
initially said he doubled accidentally. I did ask Dr. Stewart for the full sample (with duplicates
included) of 1,184 respondents that he initially said he used in his original analysis for our
article. Unfortunately, he does not have it, because he saved over it after dropping the duplicates.
REANALYSIS FINDINGS
The evidence suggests the data were altered in ways besides duplication. The descriptive
statistics in the published article should match those in the data, even if the 500 respondents were
accidentally doubled. Doubling the sample would increase the sample size, but it would not change
its composition. However, the descriptive statistics in the published article differ substantially
from the actual data. The outcome variable in our analysis is public support for the use of
defendants’ ethnicity in sentencing decisions. The distribution of the outcome variable by
respondents’ race is shown in Figure 1 of the published article (Johnson et al., 2011: 419). Here
is how it compares to the actual data:
Figure 1. Published Article vs. Actual Data
[Bar chart: percentage of respondents supporting the use of ethnicity in punishment, by respondent race. Published Article: White 27%, Black 3%, Hispanic 1%. Actual Data: White 38%, Black 38%, Hispanic 17%.]
Doubling the sample and then adding another 184 respondents (to get the reported sample
size of 1,184) cannot explain these discrepancies. For example, even adding 184 Black
respondents who ALL oppose ethnicity-based sentencing would reduce the percentage of Black
respondents supporting it from 38% to 13%, not to the article's 3%. And this is not the only noteworthy
distributional difference. The table below compares all of the descriptive statistics in the
published article to those in the actual data.
Table 1. Descriptive Statistics: Published Article vs. Actual Data

Variables                           | Published Article Mean (SD) | Actual Data Mean (SD) | p-value for Difference
Use of ethnicity in punishment      | .31 (.46)          | .37 (.48)          | .007
Hispanic criminal threat            | 4.93 (1.66)        | 2.85 (2.67)        | .000
Hispanic economic threat            | 1.72 (1.13)        | 1.64 (1.13)        | .137
Hispanic political threat           | 4.38 (1.41)        | 1.95 (1.59)        | .000
Percent Hispanic                    | .12 (.11)          | 9.52 (11.61)       | .000
Hispanic growth                     | .26 (1.53)         | 3.18 (3.11)        | .000
Homicide rate (per 100,000)         | 3.96 (4.37)        | 3.92 (4.25)        | .819
Concentrated disadvantage           | 1.09 (1.53)        | 1.40 (.94)         | .000
Percent Republican                  | 53.04 (13.02)      | 52.56 (13.07)      | .415
Percent Black                       | .10 (.14)          | 11.63 (11.98)      | .000
Population structure                | 5.39 (.70)         | 5.11 (1.00)        | .000
White                               | .86 (.41)          | .85 (.36)          | .606
Black                               | .10 (.33)          | .10 (.30)          | 1.000
Hispanic                            | .04 (.22)          | .05 (.21)          | .494
Age                                 | 47.12 (19.72)      | 46.41 (16.98)      | .352
Male                                | .47 (.50)          | .46 (.50)          | .591
Married                             | .59 (.31)          | .61 (.49)          | .275
Education level (college graduate)  | .42 (.31)          | .42 (.49)          | .902
Family income                       | $62,700 ($14,210)  | $61,196 ($22,593)  | .137
Employed                            | .46 (.50)          | .55 (.50)          | .000
Political conservative              | .43 (.31)          | .70 (.46)          | .000
Own home                            | .78 (.33)          | .78 (.41)          | .829
Southwest                           | .17 (.41)          | .16 (.37)          | .721
Northeast                           | .15 (.35)          | .15 (.36)          | .802
Midwest                             | .24 (.43)          | .24 (.43)          | .917
West                                | .17 (.38)          | .17 (.38)          | .812
South                               | .44 (.39)          | .43 (.50)          | .787
General punitive attitudes          | 6.84 (2.16)        | 4.69 (3.57)        | .000
The means for 11 of the 28 variables (or 39%) differ significantly between the article and
data. For example, the published article claims that 43% of respondents are political
conservatives. In the actual data, 70% are political conservatives. Again, even doubling the
sample to 1,000, and then adding 184 liberals, would only drop the percentage of conservatives
in the sample to 59%, not to the 43% reported in the article. Similarly, the mean for Hispanic
criminal threat is more than two points higher in the published article than in the actual data (4.93
vs. 2.85), and the mean for general punitive attitudes is likewise more than two points higher (6.84
vs. 4.69). Even doubling the sample and then adding 184 respondents with the highest possible
value (9) on these two variables would only increase their means to 3.81 and 5.36, respectively;
both are still more than a point lower than in the article.
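To make the bounding arithmetic explicit, here are the calculations behind the Figure 1 and Table 1 comparisons above (a rough sketch that assumes the actual data's 10% Black share, roughly 50 of 500 respondents, and treats every added respondent as maximally favorable to the published figures):

    Black support:              (2 × 19 + 0) / (2 × 50 + 184) = 38 / 284 ≈ 13%, not the article's 3%
    Political conservatives:    (2 × 350 + 0 × 184) / 1,184 = 700 / 1,184 ≈ 59%, not 43%
    Hispanic criminal threat:   (2 × 500 × 2.85 + 184 × 9) / 1,184 ≈ 3.81, not 4.93
    General punitive attitudes: (2 × 500 × 4.69 + 184 × 9) / 1,184 ≈ 5.36, not 6.84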
Accidentally doubling the sample of 500 respondents would leave the regression
coefficients unscathed—they would be identical if all respondents were duplicated. However, the
regression results in the published article differ substantially from those in the actual data. Most
notably, the main findings in the article—the interaction effect of perceived Hispanic threat
(criminal and economic) and Hispanic growth—do not emerge with the actual data. Those
findings are reported in Table 3 of Johnson et al. (2011: 423). In the actual data, none of the
coefficients for the interaction terms are statistically significant and they are all much smaller.
This is shown in the table below.
Table 2. Interaction Effects: Published Article vs. Actual Data

Variables                    | Published Article: b / SE / Exp(b) | Actual Data: b / SE / Exp(b)
Perceived Hispanic threat    |                                    |
  Criminal threat            | .183* / .079 / 1.201               | .212*** / .048 / 1.236
  Economic threat            | .272* / .111 / 1.312               | .632*** / .124 / 1.881
  Political threat           | .008 / .116 / 1.009                | –.045 / .072 / .956
Aggregate Hispanic threat    |                                    |
  Percent Hispanic           | –.089 / .766 / .815                | .032* / .015 / 1.032
  Hispanic growth            | .334** / .127 / 1.396              | –.121 / .064 / .886
Interactions                 |                                    |
  Criminal * His. Growth     | .126* / .051 / 1.134               | .026 / .017 / 1.026
  Economic * His. Growth     | .175* / .073 / 1.191               | .015 / .055 / 1.015
  Political * His. Growth    | –.101 / .087 / .904                | .007 / .022 / 1.007
Intercept                    | –.848*** / .119 / ―                | –.776*** / .121 / ―

*p < .05; **p < .01; ***p < .001 (two-tailed).
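The point that exact duplication cannot change coefficients is easy to verify. Below is a minimal simulation sketch using synthetic data (the variable names are illustrative and are not the study's variables): stacking two exact copies of a sample reproduces the original logistic regression coefficients and only shrinks the standard errors.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 500
    X = pd.DataFrame(rng.normal(size=(n, 2)), columns=["threat", "growth"])
    p = 1 / (1 + np.exp(-(0.5 * X["threat"] - 0.3 * X["growth"])))
    y = rng.binomial(1, p)

    X = sm.add_constant(X)
    original = sm.Logit(y, X).fit(disp=0)

    # Stack two exact copies of the sample ("accidental doubling")
    doubled = sm.Logit(np.concatenate([y, y]),
                       pd.concat([X, X], ignore_index=True)).fit(disp=0)

    print(original.params.round(3))  # point estimates
    print(doubled.params.round(3))   # identical to the original estimates
    print(original.bse.round(3))     # standard errors
    print(doubled.bse.round(3))      # smaller by roughly a factor of 1/sqrt(2)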
The main effect of Hispanic growth also fails to replicate in the actual data; indeed, the
coefficient is in the opposite direction. This is the case even when the interaction terms are
removed from the model. In the published article, the main effect of Hispanic growth is shown in
Model 2 of Table 2, and is positive and highly statistically significant (b = .288, p < .01).
actual data, the coefficient is negative and non-significant. The table below compares the
estimates in Johnson et al. (2011: 420) to those from the actual data. The differences are striking,
extending to many other variables besides Hispanic growth. For example, the coefficient for
Black in the published article is negative and significant (b = –.628, p < .05), but it is positive
and significant in the actual data (b = .949, p < .01).
Table 3. Model 2 in Published Article vs. Actual Data

Variables                    | Published Article: b / SE / Exp(b) | Actual Data: b / SE / Exp(b)
Percent Hispanic             | –.104 / .662 / .901                | .021 / .020 / 1.021
Hispanic growth              | .288** / .102 / 1.334              | –.070 / .053 / .933
Homicide rate (per 100,000)  | –.013 / .033 / .987                | –.031 / .030 / .970
Concentrated disadvantage    | .051 / .086 / 1.052                | .033 / .144 / 1.034
Percent Republican           | .001 / .005 / 1.000                | .009 / .009 / 1.009
Percent Black                | –.038 / .173 / .963                | .003 / .012 / 1.003
Population structure         | –.051 / .174 / .950                | .068 / .126 / 1.070
Black                        | –.628* / .248 / .534               | .949** / .361 / 2.582
Hispanic                     | –.982** / .359 / .375              | –.752 / .547 / .472
Age                          | .016** / .005 / .016               | .004 / .006 / 1.004
Male                         | .164** / .056 / 1.178              | –.111 / .199 / .895
Married                      | .208 / .201 / 1.231                | .425* / .210 / 1.529
Education level              | –.099 / .198 / .906                | .215 / .226 / 1.240
Family income                | –.026 / .061 / .974                | .062 / .096 / 1.064
Employed                     | –.191* / .082 / .826               | –.058 / .202 / .944
Political conservative       | .367* / .146 / 1.443               | .360 / .218 / 1.434
Own home                     | –.137 / .246 / .872                | .158 / .281 / 1.171
Southwest                    | –.111 / .274 / .895                | .721 / .369 / 2.057
Northeast                    | –.183 / .281 / .832                | .522 / .334 / 1.686
Midwest                      | .054 / .230 / 1.055                | .102 / .291 / 1.107
West                         | .082 / .264 / 1.085                | –.350 / .362 / .705
General punitive attitudes   | .191** / .062 / 1.210              | .175*** / .031 / 1.191
Intercept                    | –.858*** / .105 / —                | –.640*** / .097 / —

*p < .05; **p < .01; ***p < .001 (two-tailed).
There are other issues with the data that are concerning. For example, one of the
irregularities raised in the email we received was the high degree of stability in standard errors
across the three models in Table 2 of our published article (Johnson et al., 2011: 420-421). In the
article, 21 regressors are included in all three models and 19 of them (or 90%) have standard
errors that are perfectly stable. In the actual data, not a single regressor has a standard error that
is stable across the models. As the table below shows, this is the case regardless of whether the
models are estimated using logistic regression (with clustered standard errors) or multilevel
modeling. In the table, the stable standard errors are those that repeat identically across Models 1, 2, and 3.
Table 4. Standard-Error Stability: Published Article vs. Actual Data

Variables                | Published Article (Model 1 / 2 / 3 SE) | Actual Data, Logistic (Model 1 / 2 / 3 SE) | Actual Data, Multilevel (Model 1 / 2 / 3 SE)
Criminal threat          | ― / ― / .084          | ― / ― / .048          | ― / ― / .049
Economic threat          | ― / ― / .111          | ― / ― / .121          | ― / ― / .117
Political threat         | ― / ― / .107          | ― / ― / .073          | ― / ― / .073
Percent Hispanic         | ― / .662 / .669       | ― / .020 / .016       | ― / .018 / .019
Hispanic growth          | ― / .102 / .101       | ― / .053 / .053       | ― / .057 / .064
Homicide rate            | .033 / .033 / .033    | .030 / .030 / .029    | .028 / .028 / .031
Concentrated dis.        | .086 / .086 / .086    | .134 / .144 / .148    | .129 / .146 / .158
Percent Repub.           | .005 / .005 / .005    | .009 / .009 / .011    | .010 / .010 / .011
Percent Black            | .173 / .173 / .173    | .012 / .012 / .013    | .012 / .013 / .014
Population structure     | .173 / .174 / .185    | .119 / .126 / .141    | .118 / .125 / .144
Black                    | .248 / .248 / .248    | .359 / .361 / .389    | .364 / .365 / .411
Hispanic                 | .359 / .359 / .359    | .540 / .547 / .518    | .604 / .607 / .666
Age                      | .005 / .005 / .005    | .006 / .006 / .007    | .006 / .006 / .007
Male                     | .056 / .056 / .056    | .200 / .199 / .227    | .202 / .204 / .228
Married                  | .201 / .201 / .201    | .206 / .210 / .266    | .226 / .228 / .255
Education level          | .198 / .198 / .198    | .225 / .226 / .247    | .218 / .218 / .243
Family income            | .061 / .061 / .061    | .096 / .096 / .104    | .096 / .097 / .108
Employed                 | .082 / .082 / .082    | .201 / .202 / .228    | .213 / .214 / .237
Political conservative   | .146 / .146 / .146    | .217 / .218 / .260    | .228 / .229 / .264
Own home                 | .246 / .246 / .246    | .277 / .281 / .313    | .278 / .279 / .316
Southwest                | .274 / .274 / .274    | .287 / .369 / .369    | .310 / .377 / .425
Northeast                | .281 / .281 / .281    | .327 / .334 / .384    | .341 / .348 / .387
Midwest                  | .230 / .230 / .230    | .287 / .291 / .332    | .293 / .300 / .333
West                     | .264 / .264 / .264    | .360 / .362 / .379    | .366 / .369 / .413
General punitive         | .062 / .062 / .062    | .031 / .031 / .035    | .033 / .033 / .036
Intercept                | .102 / .105 / .119    | .097 / .097 / .116    | .101 / .102 / .117
NOTES: In the published-article columns, the standard errors for 19 of the 21 regressors included in all three models are identical (stable) across models; in the actual data, none are.
The observed differences in standard-error stability between our published article and the
actual data are so startling that I searched for other articles to compare. I found several published
by other prominent scholars in top journals that, in design and analysis, are comparable to ours.
Specifically, they all have large samples, include a series of multilevel regression models
examining one outcome variable (stepwise, building from a baseline model), and report standard
errors to the third decimal place:
➢ Hagan, Shedd, and Payne (2005), American Sociological Review
➢ Kirk (2008), Demography
➢ Kirk and Matsuda (2011), Criminology
➢ Sampson, Morenoff, and Raudenbush (2005), American Journal of Public Health
➢ Slocum, Taylor, Brick, and Esbensen (2010), Criminology
➢ Xie, Lauritsen, and Heimer (2012), Criminology
I have included the relevant regression tables from all of these articles in Appendix B of this
report. None of the articles exhibit the degree of standard-error stability we report in Johnson et
al. (2011). Instead, they are all consistent with the actual data we have, and show that standard
errors normally vary across stepwise multilevel models, the main exception being standard errors
with two leading zeros (e.g., SE = .005).
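As a further illustration of why third-decimal stability is surprising, here is a minimal simulation sketch with synthetic data (illustrative names, not the study's variables): when a correlated predictor enters a logistic model, the standard error of a regressor already in the model normally moves at the second or third decimal place rather than staying perfectly fixed.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    n = 1000
    x1 = rng.normal(size=n)
    x2 = 0.4 * x1 + rng.normal(size=n)              # moderately correlated with x1
    p = 1 / (1 + np.exp(-(0.4 * x1 + 0.5 * x2)))
    y = rng.binomial(1, p)

    baseline = sm.Logit(y, sm.add_constant(x1)).fit(disp=0)
    expanded = sm.Logit(y, sm.add_constant(np.column_stack([x1, x2]))).fit(disp=0)

    print(baseline.bse.round(3))  # SE of x1 in the baseline model
    print(expanded.bse.round(3))  # SE of x1 typically shifts once x2 enters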
Another issue with our data concerns missing item values. In the file Dr. Stewart sent me, all 500
respondents have complete data on every variable. In most surveys, a substantial number of
respondents have missing values on some of the variables (e.g., income). Closer inspection of the
data reveals that mean imputation was used to impute missing values for 208 respondents (or
42% of the sample). This is problematic because in our published article we never mention
imputation, much less inform readers that we used mean imputation specifically. Dropping cases
with imputed values results in findings that look even less like those reported in the published
article (e.g., the negative coefficient for Hispanic growth becomes larger and statistically
significant [b = –.202, p = .028], opposite the article’s positive and significant coefficient).
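I identified the imputed values by inspecting the raw data. The sketch below illustrates one way such a pattern can be spotted; the file and column names are hypothetical. On items that should only take whole-number (or 0/1) responses, a fractional value that exactly equals the variable's sample mean is a telltale sign of mean imputation.

    import pandas as pd

    df = pd.read_stata("johnson_et_al_2011.dta")            # hypothetical file name
    items = ["family_income", "conservative", "employed"]   # illustrative column names

    imputed = pd.DataFrame(index=df.index)
    for col in items:
        mean_val = df[col].mean()
        # fractional values matching the sample mean almost exactly
        imputed[col] = df[col].sub(mean_val).abs().lt(1e-6) & df[col].mod(1).ne(0)

    print(imputed.any(axis=1).sum(), "of", len(df), "respondents have an apparently mean-imputed value")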
CONCLUSION
There is only one possible conclusion from reanalyzing the data I have: the sample was
not just duplicated in the analysis for the published article; the data were also altered, whether
intentionally or unintentionally, and those alterations produced the article’s main findings. It may
be that appending the data I have and the data Dr. Stewart has for the second sample of 425
respondents would change this conclusion. Unfortunately, the data for the second survey are not
forthcoming; neither is an answer about who conducted that survey. Regardless, our published
article did not report a second survey, or a sample of 925; it reported one survey of 1,184
respondents by the Research Network.
Three other things are incontrovertible. First, we omitted important information that must
be reported to journal referees and readers, like the use of imputation for item missing data.
Second, we misreported data characteristics, like the number of counties—there were at least 326
counties, not 91. Third, if Dr. Stewart grouped counties into “county clusters,” then we
consistently misreported our measurement of variables, analysis, and findings (emphasis added):
“we matched respondents to the 91 counties where they resided to assess objective measures of
aggregate threat characteristics … as well as county-level controls” (p. 412); “model 1 of table 2
provides baseline estimates for the effects of county and demographic controls on the dependent
variable” (p. 418); “we also estimated a series of interactions between Hispanic population
growth and county-level controls” (p. 423); “both criminal and economic ethnic threat measures
became stronger as the Hispanic growth rate increased in the county” (p. 428). It is also unclear
how exactly Dr. Stewart grouped the counties. He has not provided the code for generating
county clusters or any cluster-level data.¹
My coauthors are working on an erratum, and Dr. Stewart’s reanalysis is ongoing.
However, given 1) the number and severity of the data discrepancies, 2) the lack of transparency
about the second sample, which was not reported in the published article, and 3) the misreporting
of data characteristics, methodology, and findings, I believe we should retract our study
altogether. I have asked the editors of Criminology to do so.
¹ There seems to be little reason to use county clusters in our study. Grouping counties together is undesirable because it discards geographic detail and creates "meaningless socio-political entities" (Hagen et al., 2013: 770). Typically, researchers group counties only in historical studies that examine data spanning many decades or centuries, and even in those studies they create county clusters only for the specific counties whose boundaries changed during the period examined (e.g., King et al., 2009; Messner et al., 2005).
REFERENCES
Hagan, John, Carla Shedd, and Monique R. Payne. 2005. Race, Ethnicity, and Youth Perceptions
of Criminal Injustice. American Sociological Review 70:381-407.
Hagen, Ryan, Kinga Makovi, and Peter Bearman. 2013. The influence of political dynamics on
southern lynch mob formation and lethality. Social Forces 92:757-787.
Johnson, Brian D., Eric A. Stewart, Justin Pickett, and Marc Gertz. 2011. Ethnic Threat and
Social Control: Examining Public Support for Judicial Use of Ethnicity in Punishment.
Criminology 49:401-441.
King, Ryan D., Steven F. Messner, and Robert D. Baller. 2009. Contemporary hate crimes, law
enforcement, and the legacy of racial violence. American Sociological Review 74:291-
315.
Kirk, David S. 2008. The Neighborhood Context of Racial and Ethnic Disparities in Arrest.
Demography 45:55-77.
Kirk, David S., and Mauri Matsuda. 2011. Legal Cynicism, Collective Efficacy, and the Ecology
of Arrest. Criminology 49:443-472.
Messner, Steven F., Robert D. Baller, and Matthew P. Zevenbergen. 2005. The legacy of
lynching and southern homicide. American Sociological Review 70:633-655.
Sampson, Robert J., Jeffrey D. Morenoff, and Stephen Raudenbush. 2005. Social Anatomy of
Racial and Ethnic Disparities in Violence. American Journal of Public Health 95:224-
232.
Slocum, Lee A., Terrance J. Taylor, Bradley T. Brick, and Finn-Aage Esbensen. 2010.
Neighborhood Structural Characteristics, Individual-Level Attitudes, and Youths’ Crime
Reporting Intentions. Criminology 48:1063-1100.
Stewart, Eric A., Ramiro Martinez, Jr., Eric P. Baumer, and Marc Gertz. 2015. The Social
Context of Latino Threat and Punitive Latino Sentiment. Social Problems 62:69-92.
Xie, Min, Janet L. Lauritsen, and Karen Heimer. 2012. Intimate Partner Violence in U.S.
Metropolitan Areas: The Contextual Influences of Police and Social Services.
Criminology 50:961-992.
APPENDIX A: “FILES AND CONCERNS” EMAIL
Brian, Eric and Marc,
I have spent the day going back through all of my records for our 2011 Criminology article. I
located files and emails that, without additional data, I have difficulty attributing to any benign
explanation.
Here is the background: the data for our 2011 paper were collected in early 2008, and the paper
was written in 2009. In fall 2009, after the analysis was finished and the paper was written, we
sent it out for feedback from colleagues. One of them suggested that we control for county
political climate. In response, Eric sent me a limited version of the data, so that I could add in
county voting percentages. I was a graduate student at the time. I have attached the limited data
Eric sent me (justin_voting_data.xls), as well as the data I sent back (Voting – Data.xls). Three
things concern me.
The first, and the one that worries me the least, is that there are 1,000 respondents in the data, not
1,184 as we say in the published article. Eric sent me this data in December 2009, the same
month we sent the manuscript to Criminology. Therefore, the sample size should match what we
report in the article.
What is more concerning is that half of these 1,000 respondents are duplicates. That is, the data
include two exact copies of the same 500 respondents. If you open the data set and sort on
“case,” you can see that cases 501 to 1,000 are exact copies of cases 1 to 500. To make it easier
to see, I have included another excel sheet (Case_comparison.xls) with these cases placed next to
each other. This suggests someone may have copied and pasted the rows in the original data, and
then given them new case numbers (501 to 1,000).
Just as concerning, the 500 unique respondents in the data live in 256 counties. In our published
article, however, we say that the sample includes 1,184 respondents nested within 91 counties.
Obviously, that is a big difference. We are claiming in the article that there is an average of 13
respondents per county, when the data actually include fewer than two per county. I also looked
at other studies that have used data from the Research Network. Consistent with our data (but not
with our article), they have reported fewer than two respondents per county, on average (Pickett
et al., 2014; Stupi et al., 2014).
I initially thought this must be the wrong data, even though Eric sent it to me for our paper. Then
I looked at the mean (53.04) and standard deviation (13.02) for the variable I collected
(Rep_2004). They both are identical to those we report in the published paper. Thus, the
variable’s descriptives match the published statistics, even though the data are nothing like what
we claim, either in terms of the number of respondents or the number of counties.
This led me to dig back through all of the emails I have from Eric in my old FSU account to see
if I missed something. I found two emails that worry me. The first is his response when I
apparently asked him (on September 11, 2010) about the large number of counties. He said:
“To answer your question, I created an aggregate shape file that resulted in 91 usable counties.
Additionally, I also had to fix the errors from the data Jake and Gertz gave me for the counties.
The counties and codes were wrong. Once I got things clean, I was better able to use the
data. Anyway that is how I got the 91 counties.”
At the time, as a graduate student, I found this explanation convincing. Now, I do not, for several
reasons. First, a shapefile wouldn’t change the number of counties. Second, although the FIPS
codes (Full_FIPS) included in the data Eric sent were wrong (and missing for 365 respondents),
the county names (county_n) matched both the city names (city) and the core-based statistical
areas (cbsa_tit). When I collected the voting information, I used all of these variables to: 1)
double-check the counties, and then 2) distinguish counties with the same name in different
states (Adams, CO vs. Adams, NE, and Harrison, MS vs. Harrison, WV). Once I did this, the
total number of counties increased from 256 to 292. Now, maybe these county names were
wrong also, and Eric fixed them later that month (December 2009) before we submitted the
paper. If so, the percent Republican (mean, SD) would necessarily have changed when he fixed
them. You cannot go from 292 counties to 91 without some other numbers also changing.
In another email, Eric sent me the 2009 ASC presentation for our paper (attached). I opened it
and looked at the data and findings. In it we say we have a total of 868 Americans, not the 1,000
in the data, or the 1,184 reported in the published article. But many of the descriptive statistics,
regression coefficients, and standard errors are identical to those in the published article. I also
found a draft of the paper from a few months before we sent the manuscript to Criminology (also
attached). It also lists 868 respondents, not 1,184. Again, most of the descriptive statistics,
regression coefficients, and standard errors match those in the published article. I counted. There
are 293 statistics in the published article, and 258 (88%) are perfectly identical to those in the
unpublished draft, even though the article has 316 additional respondents. Where did these
respondents come from? Our published article doesn’t say anything about imputation, or adding
a later survey. Why didn’t the inclusion of 316 new respondents have a larger impact on the
statistics (means, standard deviations, regression coefficients, and standard errors)?
I remain hopeful that one of you will share a copy of the data with me that clears up these issues.
They appear quite serious to me. Without additional data and a convincing explanation, I will
have to retract my name from the article.
Best,
Justin
APPENDIX B: MULTILEVEL REGRESSION TABLES FROM
ARTICLES PUBLISHED BY OTHER AUTHORS