On the minimum strength of (unobserved) covariates to overturn an insignificant result

Danielle Tsao, Ronan Perry, and Carlos Cinelli

Danielle Tsao: PhD Student, Department of Statistics, University of Washington, Seattle, WA, USA. Email: dltsao@uw.edu.
Ronan Perry: PhD Student, Department of Statistics, University of Washington, Seattle, WA, USA. Email: rflperry@uw.edu. URL: rflperry.github.io.
Carlos Cinelli: Assistant Professor, Department of Statistics, University of Washington, Seattle, WA, USA. Email: cinelli@uw.edu. URL: carloscinelli.com.

August 27, 2024
Abstract

We study conditions under which the addition of variables to a regression equation can turn a previously statistically insignificant result into a significant one. Specifically, we characterize the minimum strength of association required for these variables—both with the dependent and independent variables, or with the dependent variable alone—to elevate the observed t-statistic above a specified significance threshold. Interestingly, we show that it is considerably difficult to overturn a statistically insignificant result solely by reducing the standard error. Instead, included variables must also alter the point estimate to achieve such reversals in practice. Our results can be used for sensitivity analysis and for bounding the extent of p-hacking, and may also offer algebraic explanations for patterns of reversals seen in empirical research, such as those documented by Lenz and Sahn (2021).
1 Introduction
Applied researchers are often confronted with unexpected statistically insignificant estimates of linear regression coefficients. This situation may lead to the addition of variables to the regression equation with the intent to reduce standard errors or to account for factors that could be masking the target relationship of interest (Cinelli, Forney, and Pearl, 2022). If statistical significance remains elusive, researchers may naturally speculate whether there exist key variables that remained unmeasured but could have overturned statistical insignificance had they been accounted for in the analysis. As statistical significance is often a key factor for publication, such practices and concerns frequently arise in both experimental and observational studies.
Consider, for example, a randomized controlled trial (RCT) in which a researcher uses ordinary least squares (OLS) to estimate the average effect of a treatment on an outcome. Here, confounding biases do not exist by design, but adjusting for covariates may still help with precision. If the initial result is not statistically significant, this may lead to the inclusion of pre-treatment covariates in the regression equation to potentially attain statistical significance. In observational studies, beyond precision gains, covariate adjustment may be an essential tool for obtaining valid estimates of the target of inference. It may help mitigate confounding biases, block indirect pathways, estimate conditional effects, or address various other methodological concerns—all of which could provide legitimate reasons for introducing covariates that reverse an initially insignificant result, or to ask whether unobserved variables could have done so.
However, despite the many valid reasons for covariate adjustment, applied researchers often fail to adequately justify their choice of control variables. For example, in the American Journal of Political Science, Lenz and Sahn (2021) found that over 30% of articles relied on the inclusion of covariates to turn previously statistically insignificant findings into significant ones. According to Lenz and Sahn (2021), none of the articles justified this choice, nor disclosed these reversals. In fact, the practice of testing various model specifications with the intention of obtaining statistically significant results is commonly referred to as 'p-hacking' (Simonsohn, Nelson, and Simmons, 2014). Extensive surveys and meta-analyses of published p-values suggest that p-hacking may be prevalent across disciplines (Brodeur et al., 2016; Vivalt, 2019).
Under what conditions can such reversals of statistical insignificance occur? Can we establish bounds on the extent of 'p-hacking'? And what observable patterns in the data should emerge when these reversals take place? In this short communication, we provide simple algebraic answers to these questions in the context of OLS. Building on recent results from Cinelli and Hazlett (2020, 2022), we first characterize the maximum change in the t-statistic that covariates with bounded strength can produce. We then derive the minimum strength of association that such covariates must have—whether with both the dependent and independent variables or with the dependent variable alone—to elevate the observed t-statistic above a given statistical significance threshold. Lastly, we provide an empirical example. These results can be applied to conduct sensitivity analyses against unobserved 'suppressors' and to bound the extent of p-hacking arising due to the choice of control variables. They may also offer algebraic explanations for patterns of significance reversals observed in empirical research.
2 Preliminaries
2.1 Problem set-up
Let $Y$ be an $(n \times 1)$ vector containing the dependent variable for $n$ observations; let $D$ be an $(n \times 1)$ independent variable of interest and $X$ be an $(n \times p)$ matrix of observed covariates including a constant. Consider the regression equation

$$Y = \hat{\lambda}_r D + X\hat{\beta}_r + \hat{\epsilon}_r, \tag{1}$$

where $\hat{\lambda}_r, \hat{\beta}_r$ are the OLS estimates of the regression coefficients of $Y$ on $D$ and $X$, and $\hat{\epsilon}_r$ is the corresponding $(n \times 1)$ vector of residuals.
Let $\widehat{\text{se}}(\hat{\lambda}_r)$ be the estimated classical (homoskedastic) standard error of $\hat{\lambda}_r$. Under the classical linear regression model, the t-statistic for testing the null hypothesis $H_0\!: \lambda_r = \lambda_0$, i.e.,

$$t_r := \frac{\hat{\lambda}_r - \lambda_0}{\widehat{\text{se}}(\hat{\lambda}_r)}, \tag{2}$$

follows a t-distribution with $\text{df} := n - p - 1$ degrees of freedom. Denoting by $t^*_{\alpha,\text{df}}$ the $(1-\alpha/2)$ quantile of this distribution, the t-statistic (2) is considered "statistically significant" with significance level $\alpha$ if the absolute value of $t_r$ exceeds $t^*_{\alpha,\text{df}}$. Note that the t-statistic depends on the choice of $\lambda_0$. For simplicity, we use the notation $t_r$ with the understanding that a particular $\lambda_0$ has been chosen.
Now suppose the t-statistic (2) is insignificant. Let $Z$ be an $(n \times 1)$ vector of a (potentially unobserved) covariate whose inclusion in the regression equation we wish to assess. In contrast to the 'restricted' regression in (1), we now consider the long regression equation of $Y$ on $D$ after adjusting for both $X$ and $Z$,

$$Y = \hat{\lambda} D + X\hat{\beta} + \hat{\gamma} Z + \hat{\epsilon}. \tag{3}$$

Here, the t-statistic for testing the null hypothesis $H_0\!:\lambda = \lambda_0$ is

$$t := \frac{\hat{\lambda} - \lambda_0}{\widehat{\text{se}}(\hat{\lambda})}, \tag{4}$$

where $\hat{\lambda}$ and $\widehat{\text{se}}(\hat{\lambda})$ have the same interpretation as before, just now with an additional adjustment for $Z$. We wish to quantify the properties that $Z$ needs to have such that the t-statistic in (4) will be statistically significant.
2.2 Omitted variable bias formulas
Comparing (2) with (4), observe that the (absolute) relative change in the t-statistic can be decomposed as the product of the relative change in the bias and the relative change in the standard error:

$$\left|\frac{t}{t_r}\right| = \left|\frac{\hat{\lambda} - \lambda_0}{\hat{\lambda}_r - \lambda_0}\right| \times \left(\frac{\widehat{\text{se}}(\hat{\lambda}_r)}{\widehat{\text{se}}(\hat{\lambda})}\right). \tag{5}$$

Concretely, for $Z$ to double the t-statistic, it must either double the absolute difference between the point estimate and $\lambda_0$, halve the standard error, or achieve some combination of both.
To characterize these changes in terms of how much residual variation $Z$ explains of $D$ and $Y$, we refer to the following result from Cinelli and Hazlett (2020).

Theorem 1 (Cinelli and Hazlett, 2020). Let $R^2_{YZ|DX}$ denote the sample partial $R^2$ of $Y$ with $Z$ after adjusting for $D$ and $X$, and let $R^2_{DZ|X} < 1$ denote the sample partial $R^2$ of $D$ with $Z$ after adjusting for $X$. Then,

$$|\hat{\lambda}_r - \hat{\lambda}| = \underbrace{\sqrt{\frac{R^2_{YZ|DX}\, R^2_{DZ|X}}{1 - R^2_{DZ|X}}}}_{\text{BF}} \times \widehat{\text{se}}(\hat{\lambda}_r) \times \sqrt{\text{df}} \tag{6}$$

and

$$\widehat{\text{se}}(\hat{\lambda}) = \underbrace{\sqrt{\frac{1 - R^2_{YZ|DX}}{1 - R^2_{DZ|X}}}}_{\text{SEF}} \times \widehat{\text{se}}(\hat{\lambda}_r) \times \sqrt{\frac{\text{df}}{\text{df}-1}}. \tag{7}$$
To aid interpretation, we call the terms BF in (6) and SEF in (7) the "bias factor" and the "standard error factor" of $Z$, respectively.
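To make these formulas easy to experiment with, here is a minimal Python sketch of equations (6) and (7). The function names and arguments are our own illustrative choices; this is not code from the paper or from an existing package.

```python
import numpy as np

def bias_factor(r2_yz: float, r2_dz: float) -> float:
    """BF in equation (6): sqrt(R2_YZ|DX * R2_DZ|X / (1 - R2_DZ|X))."""
    return np.sqrt(r2_yz * r2_dz / (1 - r2_dz))

def se_factor(r2_yz: float, r2_dz: float) -> float:
    """SEF in equation (7): sqrt((1 - R2_YZ|DX) / (1 - R2_DZ|X))."""
    return np.sqrt((1 - r2_yz) / (1 - r2_dz))

def adjusted_estimate_and_se(lambda_r, se_r, df, r2_yz, r2_dz, increase=True):
    """Adjusted point estimate and standard error implied by Theorem 1.

    The partial R2 values pin down only the magnitude of the bias, not its
    direction, so the caller chooses whether the estimate moves up or down.
    """
    bias = bias_factor(r2_yz, r2_dz) * se_r * np.sqrt(df)
    lambda_adj = lambda_r + bias if increase else lambda_r - bias
    se_adj = se_factor(r2_yz, r2_dz) * se_r * np.sqrt(df / (df - 1))
    return lambda_adj, se_adj
```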
We can use Theorem 1 to write the absolute value of the t-statistic (4) as a function of $R^2_{YZ|DX}$ and $R^2_{DZ|X}$, i.e.,

$$t(R^2_{YZ|DX}, R^2_{DZ|X}) = \frac{\left|(\hat{\lambda}_r - \lambda_0) \pm \text{BF} \times \widehat{\text{se}}(\hat{\lambda}_r) \times \sqrt{\text{df}}\right|}{\widehat{\text{se}}(\hat{\lambda}_r) \times \text{SEF} \times \sqrt{\text{df}/(\text{df}-1)}},$$

where the sign of the bias term, denoted by $\pm$, depends on whether $\hat{\lambda}_r > \hat{\lambda}$ or vice-versa. This re-formulation allows us to assess how $Z$ affects inferences for any postulated pair of partial $R^2$ values $\{R^2_{YZ|DX}, R^2_{DZ|X}\}$, and it will help us determine the conditions under which the addition of $Z$ turns a previously statistically insignificant result into a significant one.
To set the stage for upcoming results, we note an immediate but important corollary of Theorem 1: for a fixed observed t-statistic and fixed strength of $Z$, the impact that $Z$ has on the relative bias depends on the sample size, whereas the impact it has on the relative change in standard errors does not. The relative change in the bias is given by

$$\left|\frac{\hat{\lambda} - \lambda_0}{\hat{\lambda}_r - \lambda_0}\right| = \left|1 \pm \frac{\text{BF}}{t_r} \times \sqrt{\text{df}}\right|.$$

Notice that for fixed $t_r$ and fixed $\{R^2_{YZ|DX}, R^2_{DZ|X}\}$ (which sets BF), larger sample sizes yield larger relative changes in the distance of the estimate from the null hypothesis. Conversely, the relative change in the standard error is given by

$$\frac{\widehat{\text{se}}(\hat{\lambda}_r)}{\widehat{\text{se}}(\hat{\lambda})} = \frac{1}{\text{SEF}} \times \sqrt{\frac{\text{df}-1}{\text{df}}} \approx \frac{1}{\text{SEF}} \tag{8}$$

and is thus unaffected by the sample size. As an example, halving standard errors is equally challenging in a sample of 100 as in a sample of 1,000,000; in contrast, doubling point estimates becomes much easier as the sample size grows. This distinction will become clearer as we continue with our analysis.
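A quick numerical illustration of this asymmetry, reusing the Theorem 1 sketch above (the particular strength values here are arbitrary choices of ours):

```python
r2_yz, r2_dz, t_r = 0.05, 0.05, 1.0

for df in (100, 1_000_000):
    bf = bias_factor(r2_yz, r2_dz)
    sef = se_factor(r2_yz, r2_dz)
    rel_bias = 1 + (bf / t_r) * np.sqrt(df)       # relative change in bias
    rel_se = (1 / sef) * np.sqrt((df - 1) / df)   # relative SE change, eq. (8)
    print(f"df={df:>9,}: bias ratio = {rel_bias:7.2f}, SE ratio = {rel_se:.4f}")
# The SE ratio is essentially constant in df; the bias ratio grows like sqrt(df).
```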
3 Results
In this section we present two main results. First, given upper bounds on $R^2_{YZ|DX}$ and $R^2_{DZ|X}$, we derive the 'maximum adjusted t-statistic', which quantifies the maximum possible value that $t(R^2_{YZ|DX}, R^2_{DZ|X})$ can attain after including $Z$ in the regression equation. Then we solve for the minimum upper bound on $\{R^2_{YZ|DX}, R^2_{DZ|X}\}$, hereby referred to as the "strength of $Z$", such that it guarantees that the maximum adjusted t-statistic exceeds the desired significance threshold. In what follows, it is useful to define the quantities

$$f_r := |t_r|/\sqrt{\text{df}} \quad \text{and} \quad f^*_{\alpha,\text{df}} := t^*_{\alpha,\text{df}}/\sqrt{\text{df}},$$

which normalize the observed t-statistic and the critical threshold by the square root of the degrees of freedom. These definitions greatly simplify formulas and derivations.
3.1 On the maximum adjusted t-statistic
We start by defining the maximum value that the t-statistic (4) could attain given $Z$ with bounded strength.

Definition 1 (Maximum adjusted t-statistic). For a fixed null hypothesis $H_0\!:\lambda = \lambda_0$, significance level $\alpha$, and upper bounds on $R^2_{YZ|DX}$ and $R^2_{DZ|X}$, denoted by $\mathcal{R}^2 = \{R^{\max}_Y, R^{\max}_D\}$, we define the maximum adjusted t-statistic as

$$t^{\max}_{\mathcal{R}^2} := \max_{R^2_{YZ|DX},\, R^2_{DZ|X}} t(R^2_{YZ|DX}, R^2_{DZ|X}) \quad \text{s.t.} \quad R^2_{YZ|DX} \le R^{\max}_Y,\ R^2_{DZ|X} \le R^{\max}_D.$$
The solution to the above problem has a simple closed-form characterization.
Theorem 2 (Closed-form solution to $t^{\max}_{\mathcal{R}^2}$). Let $R^{\max}_Y < 1$. Then,

$$t^{\max}_{\mathcal{R}^2} = \frac{f_r\sqrt{1 - R^{2*}_{DZ|X}} + \sqrt{R^{2*}_{YZ|DX}\, R^{2*}_{DZ|X}}}{\sqrt{(1 - R^{2*}_{YZ|DX})/(\text{df}-1)}},$$

such that if $f^2_r < R^{\max}_Y(1 - R^{\max}_D)/R^{\max}_D$, then

$$\{R^{2*}_{YZ|DX},\, R^{2*}_{DZ|X}\} = \{R^{\max}_Y,\, R^{\max}_D\},$$

and otherwise,

$$\{R^{2*}_{YZ|DX},\, R^{2*}_{DZ|X}\} = \left\{R^{\max}_Y,\ \frac{R^{\max}_Y}{f^2_r + R^{\max}_Y}\right\}.$$
If $t^{\max}_{\mathcal{R}^2} < t^*_{\alpha,\text{df}-1}$, then we can be assured that no $Z$ with the specified maximum strength would be able to overturn an insignificant result. On the other hand, if $t^{\max}_{\mathcal{R}^2} > t^*_{\alpha,\text{df}-1}$, we know that there exists at least one $Z$ with strength no greater than $\mathcal{R}^2$ that is capable of bringing the t-statistic above the specified threshold.
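A direct Python transcription of Theorem 2 may help make the closed form concrete. As before, this is our own sketch rather than code from the paper; taking the minimum of the interior-point solution and the bound reproduces the theorem's case analysis, since the objective is concave in $R^2_{DZ|X}$.

```python
import numpy as np

def t_max(t_r: float, df: int, r2y_max: float, r2d_max: float) -> float:
    """Maximum adjusted t-statistic of Theorem 2 (requires r2y_max < 1)."""
    fr = abs(t_r) / np.sqrt(df)
    r2y = r2y_max                                 # always at its bound
    r2d_interior = r2y_max / (fr**2 + r2y_max)    # interior-point solution
    r2d = min(r2d_interior, r2d_max)              # case analysis of Theorem 2
    num = fr * np.sqrt(1 - r2d) + np.sqrt(r2y * r2d)
    den = np.sqrt((1 - r2y) / (df - 1))
    return num / den
```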
Remark 1. Note that the optimal value of $R^2_{YZ|DX}$ always reaches the upper bound $R^{\max}_Y$, while $R^2_{DZ|X}$ may either reach its upper bound $R^{\max}_D$ or result in the interior point solution $R^{\max}_Y/(f^2_r + R^{\max}_Y)$.
Remark 2. It is always necessary to constrain the strength of $Z$ with respect to $Y$ in order to obtain a finite solution for $t^{\max}_{\mathcal{R}^2}$. If $R^{\max}_Y = 1$, then as $R^2_{YZ|DX}$ approaches one, the standard error approaches zero and the t-statistic grows without bound.
Remark 3. In contrast, it is possible to leave $R^2_{DZ|X}$ unconstrained. Increasing $R^2_{DZ|X}$ has two counterbalancing effects on the t-statistic. On one hand, it can change the point estimate, as described by (6), in a direction that is favorable for rejecting $H_0\!:\lambda = \lambda_0$. On the other hand, it also increases the standard error due to the variance inflation factor in (7) (i.e., the denominator of the SEF), which eventually counterbalances and then exceeds the benefit of the change in estimate. Thus, setting $R^{\max}_D = 1$ will always result in an interior point solution for $R^2_{DZ|X}$.
Of course, there could naturally be multiple latent variables instead of a single one, and so one might wonder about the case when $Z$ takes the form of a matrix rather than a vector. The following theorem demonstrates that it is sufficient to consider a single unmeasured latent variable, up to a correction in the degrees of freedom.

Theorem 3 ($t^{\max}_{\mathcal{R}^2}$ for matrix $Z$). Let $Z$ be an $(n \times m)$ matrix of covariates. Then the solution to $t^{\max}_{\mathcal{R}^2}$ is the same as that of Theorem 2, save for the adjustment in the degrees of freedom, which now is $\text{df} - m$.
In what follows, for simplicity we keep $Z$ as a vector, with the understanding that all results still hold for matrix $Z$. But before moving forward, there is an interesting corollary of the previous result: it places limits on the extent of p-hacking given observed covariates $X$.
Corollary 1 (Upper bound on p-hacking). For observed covariates $X$, let $R^2_{YX|D}$, $R^2_{DX}$ denote the strengths of the associations of $X$ with $Y$ and $D$, respectively. Let $t_{YD|X_S}$ denote the t-statistic for the coefficient of the regression of $Y$ on $D$ when adjusting for the subset of covariates $X_S$ (consisting of a subset of the columns of $X$). Then for any $X_S$,

$$t_{YD|X_S} \le t^{\max}_{\mathcal{R}^2},$$

where $t^{\max}_{\mathcal{R}^2}$ is the solution of Theorem 2 using $t_r = t_{YD}$ and $\mathcal{R}^2 = \{R^2_{YX|D}, R^2_{DX}\}$.
When the number of covariates is small, it is feasible to run all possible regressions to identify the exact maximum t-statistic across all specifications. However, when the number of covariates is large, this exhaustive approach becomes impractical. For example, with $p = 40$, there are $2^{40}$ (approximately 1 trillion) possible specifications. In such a case, $t^{\max}_{\mathcal{R}^2}$ offers a simple upper bound on the maximum extent of p-hacking without the need to run all 1 trillion regressions.
Example 1. Let $p = 40$, $\text{df} = 100$, and let the t-statistic of the regression of $Y$ on $D$ be equal to 1. Then, if $\{R^2_{YX|D}, R^2_{DX}\} = \{0.08, 0.08\}$, Corollary 1 assures us that none of the 1 trillion specifications can yield a t-statistic greater than 1.83.
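With the `t_max` sketch above, the bound in Example 1 can be checked in one line; the last digit depends on the degrees-of-freedom convention, and the example's 1.83 is the bound rounded up.

```python
print(t_max(t_r=1.0, df=100, r2y_max=0.08, r2d_max=0.08))
# ≈ 1.825, consistent with Example 1's bound of 1.83 after rounding up
```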
Remark 4. Note that $t^{\max}_{\mathcal{R}^2}$ is achievable when the only constraint on the variables to be included is their maximum explanatory power. For any given set of observed covariates, $t^{\max}_{\mathcal{R}^2}$ is a potentially loose upper bound. It is possible to tighten this bound by applying the corollary iteratively within subsets of regressions. Obtaining tight bounds without running all of the $2^p$ possible regressions remains an open problem.
3.2 On the minimal strength of Z to reverse statistical insignificance
Equipped with the notion of the maximum adjusted t-statistic, we can now characterize the minimum strength of $Z$ necessary to obtain a statistically significant result. Following the convention of Cinelli and Hazlett (2020), we call our metrics "robustness values" for insignificance. They quantify how "robust" an insignificant result is to the inclusion of covariates in the regression equation.
Extreme robustness value for insignificance
As highlighted in Remark 2, the parameter $R^2_{YZ|DX}$ is essential for assessing the potential of $Z$ to bring about a significant result, as it always needs to be bounded. Thus we begin by characterizing the minimal strength of association of $Z$ with $Y$ alone in order to achieve significance.
Definition 2 (Extreme Robustness Value for Insignificance). For fixed $R^{\max}_D \in [0,1]$, the extreme robustness value for insignificance, $\text{XRVI}^{R^{\max}_D}_\alpha$, is the minimum upper bound on $R^2_{YZ|DX}$ such that $t^{\max}_{\mathcal{R}^2}$ is large enough to reject the null hypothesis $H_0\!:\lambda=\lambda_0$ at specified significance level $\alpha$, i.e.,

$$\text{XRVI}^{R^{\max}_D}_\alpha := \min\{\text{XRVI} : t^{\max}_{\{\text{XRVI},\, R^{\max}_D\}} \ge t^*_{\alpha,\text{df}-1}\}.$$
For a fixed bound on $R^2_{DZ|X}$, the $\text{XRVI}^{R^{\max}_D}_\alpha$ describes how robust an insignificant result is in terms of the minimum explanatory power that $Z$ needs to have with $Y$ in order to overturn it. Theorem 7 in the appendix provides an analytical expression for $\text{XRVI}^{R^{\max}_D}_\alpha$ given arbitrary $R^{\max}_D \in [0,1]$. Here we focus on two important cases: $R^{\max}_D = 0$ and $R^{\max}_D = 1$.
Starting with $R^{\max}_D = 0$, we first consider the scenario where the point estimate remains unchanged, and any increase in the t-statistic occurs solely due to a reduction in the standard error. That is, $\text{XRVI}^0_\alpha$ quantifies how much variation a control variable $Z$ that is uncorrelated with $D$ must explain of the dependent variable $Y$ in order to overturn a previously insignificant result. This turns out to have a remarkably simple and insightful characterization.
Theorem 4 (Closed-form expression for $\text{XRVI}^0_\alpha$). Let $f_r > 0$. Then the analytical solution for $\text{XRVI}^0_\alpha$ is

$$\text{XRVI}^0_\alpha = \begin{cases} 0, & \text{if } f^*_{\alpha,\text{df}-1} < f_r, \\ 1 - \left(f_r / f^*_{\alpha,\text{df}-1}\right)^2, & \text{otherwise.} \end{cases}$$

If $f_r = 0$, there is no value of $R^2_{YZ|DX}$ capable of overturning an insignificant result.
Remark 5. It is useful to understand how $\text{XRVI}^0_\alpha$ changes as the sample size grows, when keeping the observed t-statistic and the significance level fixed:

$$\text{XRVI}^0_\alpha \approx 1 - \left(\frac{t_r}{t^*_{\alpha,\text{df}-1}}\right)^2 \;\xrightarrow{\ \text{df}\to\infty\ }\; 1 - \left(\frac{t_r}{z^*_\alpha}\right)^2,$$

where $\xrightarrow{\text{df}\to\infty}$ denotes the limit as df goes to infinity and $z^*_\alpha$ denotes the $(1-\alpha/2)$ quantile of the standard normal distribution. In other words, when considering only a reduction in the standard error, the amount of residual variation that $Z$ must explain of $Y$ in order to overturn an insignificant result depends solely on the ratio of the observed t-statistic to the critical threshold. Apart from changes in the critical threshold due to degrees of freedom—which eventually converges to $z^*_\alpha$—this value remains constant regardless of sample size.
Example 2. Consider testing the null hypothesis of zero effect with an observed t-statistic of 1 and 100 degrees of freedom. The percentage of residual variation of $Y$ that $Z$ needs to explain in order to bring a t-statistic of 1 to the critical threshold of 2, only through a reduction in the standard error, is

$$\text{XRVI}^0_\alpha \approx 1 - \left(\frac{1}{2}\right)^2 = 1 - (1/4) = 3/4 = 75\%.$$

That is, if we are considering a reduction in the standard error alone, $Z$ needs to explain at least 75% of the variation of $Y$ in order to elevate the observed t-statistic to 2. Notably, this number is (virtually) the same across sample sizes, be it $\text{df} = 100$, $\text{df} = 1{,}000$, or $\text{df} = 1{,}000{,}000$. As variables that explain 75% of the variation of $Y$ are rare in most settings, this simple fact suggests that it should also be rare to see a reversal of significance driven by gains in precision when $t_r = 1$. We return to this point in the discussion.
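The exact value can be computed with scipy's t quantiles. The following sketch (ours, with illustrative names) evaluates Theorem 4 across sample sizes:

```python
import numpy as np
from scipy import stats

def xrvi0(t_r: float, df: int, alpha: float = 0.05) -> float:
    """Theorem 4: minimal strength of a Z uncorrelated with D."""
    fr = abs(t_r) / np.sqrt(df)
    f_crit = stats.t.ppf(1 - alpha / 2, df - 1) / np.sqrt(df - 1)
    return 0.0 if f_crit < fr else 1 - (fr / f_crit) ** 2

for df in (100, 1_000, 1_000_000):
    print(f"df={df:>9,}: XRVI0 = {xrvi0(1.0, df):.3f}")
# Prints roughly 0.75, 0.74, 0.74: virtually constant across sample sizes.
```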
The previous result describes how to achieve statistical significance via precision gains. We now move to the case with $R^{\max}_D = 1$. As noted in Remark 3, this is the scenario in which we impose no constraints on the strength of association between $Z$ and $D$. In this sense, $\text{XRVI}^1_\alpha$ computes the bare minimum amount of variation that $Z$ needs to explain of $Y$ in order to reverse an insignificant result. Any variable that does not explain at least $(100 \times \text{XRVI}^1_\alpha)\%$ of the variation of $Y$ is logically incapable of making the t-statistic significant.
Theorem 5 (Closed-form expression for $\text{XRVI}^1_\alpha$). The analytical solution for $\text{XRVI}^1_\alpha$ is:

$$\text{XRVI}^1_\alpha = \begin{cases} 0, & \text{if } f^*_{\alpha,\text{df}-1} < f_r, \\ \dfrac{f^{*2}_{\alpha,\text{df}-1} - f^2_r}{1 + f^{*2}_{\alpha,\text{df}-1}}, & \text{otherwise.} \end{cases}$$
Remark 6. Contrary to the previous case, we observe the following behaviour as the sample size grows,

$$\text{XRVI}^1_\alpha \approx \frac{t^{*2}_{\alpha,\text{df}-1} - t^2_r}{\text{df} + t^{*2}_{\alpha,\text{df}-1}} \;\xrightarrow{\ \text{df}\to\infty\ }\; 0.$$

Therefore, if we allow $Z$ to change point estimates, then for a fixed observed t-statistic, the minimal strength of $Z$ with $Y$ to bring about a reversal tends to zero as the sample size grows to infinity.
Example 3. Consider again testing the null hypothesis of zero effect with an observed t-statistic of 1 and 100 degrees of freedom. If we allow $Z$ to be arbitrarily associated with $D$, it needs only to explain 2.9% of the residual variation of $Y$ in order to bring the t-statistic to 2:

$$\text{XRVI}^1_\alpha \approx \frac{2^2 - 1^2}{100 + 2^2} = \frac{3}{104} = 2.9\%.$$

Also note that any $Z$ that explains less than 2.9% of the variation of $Y$ is logically incapable of bringing about such a change. Corroborating our previous analysis, the $Z$ that achieves this must do so via an increase in point estimate, and not via a decrease in standard errors. As per Theorem 2, the optimal value of the association with $D$ is $R^{2*}_{DZ|X} \approx 74\%$. Notice that $\text{SEF} \approx 1.94$, meaning that the inclusion of $Z$ almost doubles the standard error, instead of reducing it. This, however, is compensated by the fact that $Z$ increases the point estimate by a factor of $1 + \text{BF} \times \sqrt{\text{df}} = 3.88$, thus doubling the t-statistic despite the loss in precision.
Example 4. For the same observed t-statistic of 1, consider a sample size that is an order of magnitude larger, say, $\text{df} = 1{,}000$. The minimum residual variation that $Z$ needs to explain of $Y$ then reduces to 0.29%:

$$\text{XRVI}^1_\alpha \approx \frac{2^2 - 1^2}{1000 + 2^2} = \frac{3}{1004} = 0.29\%.$$

As per Theorem 2, this $Z$ has an association with $D$ of $R^{2*}_{DZ|X} \approx 75\%$. Here, we have the same situation as before: the standard error doubles while the point estimate increases by a factor of four, thus doubling the t-statistic.
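The same kind of sketch evaluates Theorem 5 together with the implied optimal association with $D$. Because it uses exact t critical values rather than the round threshold of 2, the numbers differ slightly from Examples 3 and 4; the helper names are again our own.

```python
import numpy as np
from scipy import stats

def xrvi1(t_r: float, df: int, alpha: float = 0.05) -> float:
    """Theorem 5: minimal strength with Y when R2_DZ|X is unconstrained."""
    fr = abs(t_r) / np.sqrt(df)
    f_crit = stats.t.ppf(1 - alpha / 2, df - 1) / np.sqrt(df - 1)
    return 0.0 if f_crit < fr else (f_crit**2 - fr**2) / (1 + f_crit**2)

for df in (100, 1_000):
    r2y = xrvi1(1.0, df)
    fr = 1.0 / np.sqrt(df)
    r2d = r2y / (fr**2 + r2y)              # optimal R2_DZ|X (Theorem 2)
    sef = np.sqrt((1 - r2y) / (1 - r2d))   # standard error factor
    bf = np.sqrt(r2y * r2d / (1 - r2d))    # bias factor
    print(f"df={df}: XRVI1={r2y:.4f}, R2_DZ|X*={r2d:.2f}, "
          f"SEF={sef:.2f}, bias multiplier={1 + bf * np.sqrt(df):.2f}")
```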
Robustness value for insignificance
While in the previous section we investigated the minimal bound on $R^2_{YZ|DX}$ alone in order to revert an insignificant result, here we investigate the minimal bound on both $R^2_{YZ|DX}$ and $R^2_{DZ|X}$ simultaneously.
Definition 3 (Robustness Value for Insignificance). The robustness value for insignificance, $\text{RVI}_\alpha$, is the minimum upper bound on both $R^2_{YZ|DX}$ and $R^2_{DZ|X}$ such that $t^{\max}_{\mathcal{R}^2}$ is large enough to reject the null hypothesis $H_0\!:\lambda = \lambda_0$ at specified significance level $\alpha$. That is,

$$\text{RVI}_\alpha := \min\{\text{RVI} : t^{\max}_{\{\text{RVI},\,\text{RVI}\}} \ge t^*_{\alpha,\text{df}-1}\}.$$
Note that $\text{RVI}_\alpha$ provides a convenient summary of the minimum strength of association that $Z$ needs to have, jointly with $D$ and $Y$, in order to bring about a statistically significant result. Any $Z$ that has both partial $R^2$ values no stronger than $\text{RVI}_\alpha$ cannot reverse a statistically insignificant finding. On the other hand, we can always find a $Z$ with both partial $R^2$ values at least as strong as $\text{RVI}_\alpha$ that does so. The solution of this problem is given in the following result.
Theorem 6 (Closed-form expression for $\text{RVI}_\alpha$). The analytical solution for $\text{RVI}_\alpha$ is

$$\text{RVI}_\alpha = \begin{cases} 0, & \text{if } f^*_{\alpha,\text{df}-1} < f_r, \\ \frac{1}{2}\left(\sqrt{f^4_\Delta + 4 f^2_\Delta} - f^2_\Delta\right), & \text{if } f_r < f^*_{\alpha,\text{df}-1} < f^{-1}_r, \\ \text{XRVI}^1_\alpha, & \text{otherwise,} \end{cases}$$

where $f_\Delta := f^*_{\alpha,\text{df}-1} - f_r$.
Remark 7. The first case in Theorem 6 occurs when the t-statistic for $H_0\!:\lambda = \lambda_0$ is already statistically significant, even when losing one degree of freedom. The second case occurs when both constraints on $R^2_{DZ|X}$ and $R^2_{YZ|DX}$ are binding. The third case is the interior point solution, as defined in Theorem 5, where only the constraint on $R^2_{YZ|DX}$ is binding. Notice that in the second solution of $\text{RVI}_\alpha$, we have $R^2_{DZ|X} = R^2_{YZ|DX}$; thus, $\text{SEF} = 1$ and standard errors remain unchanged. Therefore, $\text{RVI}_\alpha$ represents the minimal strength of $Z$ needed to achieve statistical significance via a change in the point estimate alone, usefully complementing $\text{XRVI}^0_\alpha$, which quantifies the minimal strength of $Z$ needed solely through a reduction in standard errors.
Remark 8. We recover the XRVI as the solution to RVI if and only if the conditions (1) $f^*_{\alpha,\text{df}-1} > f^{-1}_r$ and (2) $f^*_{\alpha,\text{df}-1} > f_r$ both hold. This rarely occurs. To see this more clearly, note that condition (1) simplifies to $\sqrt{\text{df}-1}\sqrt{\text{df}} < t_r \times t^*_{\alpha,\text{df}-1}$, or, approximately,

$$\text{df} \lesssim t_r \times t^*_{\alpha,\text{df}-1},$$

which, for typical critical thresholds (e.g., 1.96), only occurs when there are few degrees of freedom.
Remark 9. As with $\text{XRVI}^1_\alpha$, for fixed $t_r$ and significance level $\alpha$, we observe the same behaviour for $\text{RVI}_\alpha$ as the sample size grows,

$$\text{RVI}_\alpha \approx \frac{1}{2}\left(\sqrt{\frac{t^4_\Delta}{\text{df}^2} + \frac{4 t^2_\Delta}{\text{df}}} - \frac{t^2_\Delta}{\text{df}}\right) \;\xrightarrow{\ \text{df}\to\infty\ }\; 0,$$

where $t_\Delta = t^*_{\alpha,\text{df}-1} - t_r$. Therefore, as the sample size grows, covariates of ever weaker strength eventually become sufficient to bring about statistical significance through a change in the point estimate.
Remark 10. The statistics we introduced here obey the following ordering,

$$\text{XRVI}^1_\alpha \le \text{RVI}_\alpha \le \text{XRVI}^0_\alpha.$$

This follows directly from their definitions, as each case represents a constrained minimization problem and the constraint becomes stricter as we move from $R^{\max}_D = 1$ to $R^{\max}_D = 0$. Moreover, $\text{XRVI}^{\text{RVI}_\alpha}_\alpha = \text{RVI}_\alpha$.
Example 5. Continuing with the case where the t-statistic is 1, we obtain $\text{RVI}_\alpha \approx 9.5\%$ when $\text{df} = 100$ and $\text{RVI}_\alpha \approx 3\%$ when $\text{df} = 1{,}000$. In both cases, the inflation of the t-statistic by a $Z$ that attains the optimal strength is driven solely by changes in the point estimate.
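Completing the set, here is a sketch of Theorem 6 (ours; `xrvi1` is the helper defined above):

```python
import numpy as np
from scipy import stats

def rvi(t_r: float, df: int, alpha: float = 0.05) -> float:
    """Theorem 6: minimal joint strength with both D and Y."""
    fr = abs(t_r) / np.sqrt(df)
    f_crit = stats.t.ppf(1 - alpha / 2, df - 1) / np.sqrt(df - 1)
    if f_crit < fr:
        return 0.0
    if f_crit < 1 / fr:                  # both constraints binding
        fd = f_crit - fr
        return 0.5 * (np.sqrt(fd**4 + 4 * fd**2) - fd**2)
    return xrvi1(t_r, df, alpha)         # interior-point case

print(round(rvi(1.0, 100), 3), round(rvi(1.0, 1_000), 3))  # 0.095 0.03
```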
4 Empirical Example
We demonstrate the use of our metrics in an empirical example that estimates the effect of a vote-by-mail policy on various outcomes (Amlani and Collitt, 2022). This work includes an analysis of the effect of a US county's vote-by-mail (VBM) policy on the Republican presidential vote share (dependent variable $Y$) in the 2020 election. There are 5 treatment conditions concerning the VBM policy change from 2016 to 2020, of which the authors are specifically interested in the condition: no-excuse-needed (in 2016) to ballots-sent-in (in 2020), which we will refer to as condition-1. The authors fit a difference-in-differences model using OLS, adjusting for various covariates including an indicator for battleground states and the median age and median income of residents in the county. Notably, the interaction term for condition-1 × year (independent variable $D$) is not statistically significant at the 5% level: the t-statistic is $t_r = 0.12$ with 4,307 degrees of freedom.
Estimate   Std. Error   t-statistic   XRVI^1_α   RVI_α    XRVI^0_α
0.103      0.873        0.118         0.089%     2.77%    99.6%

Note: df = 4307, $\lambda_0 = 0$, $\alpha = 0.05$.

Table 1: Robustness values for insignificance for the vote-by-mail policy study.
4.1 Robustness to unobserved suppressors
The authors were concerned that the lack of significance for the coefficient of interest could have been due to suppression effects of unobserved variables. To address this, they use the formulas of Theorem 1 to examine whether different hypothetical values for the strength of $Z$ yield a statistically significant t-statistic. Here we complement their analysis by providing the three proposed metrics: $\text{XRVI}^1_\alpha$, $\text{RVI}_\alpha$, and $\text{XRVI}^0_\alpha$.
The results are displayed in Table 1. We find that any latent variable $Z$ that explains less than 2.77% of the residual variation of both $Y$ and $D$ ($\text{RVI}_\alpha = 2.77\%$) would not be sufficiently strong to make the estimate statistically significant. Moreover, if we impose no constraints on $R^2_{DZ|X}$, then $Z$ needs to explain at least 0.089% of the variation of $Y$ in order to attain such a reversal ($\text{XRVI}^1_\alpha = 0.089\%$). Finally, our analysis shows that a reversal of significance solely due to gains in precision is virtually impossible: a $Z$ orthogonal to $D$ would need to explain a remarkable 99.6% of the variation of $Y$ in order to overturn the insignificant result ($\text{XRVI}^0_\alpha = 99.6\%$). To put these statistics in context, a latent variable with the same strength of association with $D$ and $Y$ as that of the battleground state indicator would only explain 0.6% of the residual variation in $Y$ and 0.27% of the residual variation in $D$. Since both of these numbers are below the $\text{RVI}_\alpha$ value of 2.77%, we can immediately conclude that adjusting for a latent variable $Z$ of similar strength to this observed covariate would not be sufficient to overturn the insignificant result.
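All three entries in Table 1 can be reproduced from the reported t-statistic and degrees of freedom alone, using the sketches of Theorems 4-6 defined earlier:

```python
t_r, df = 0.118, 4307
print(f"XRVI1 = {xrvi1(t_r, df):.5f}")   # 0.00089  -> 0.089%
print(f"RVI   = {rvi(t_r, df):.4f}")     # 0.0277   -> 2.77%
print(f"XRVI0 = {xrvi0(t_r, df):.3f}")   # 0.996    -> 99.6%
```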
4.2 Robustness to subsets of controls
We now illustrate how $t^{\max}_{\mathcal{R}^2}$ can be used to understand whether one can easily rule out the possibility of obtaining a statistically significant t-statistic when adjusting for different subsets of control variables. The authors present three model specifications, all of which found the effect of interest to be statistically insignificant. However, could there be a specification where the results turn out to be significant? Here we consider all possible variations between their base model and an expanded model that includes 12 additional control variables. This amounts to $2^{12} = 4{,}096$ possible regressions. Applying the results of Corollary 1, we obtain $t^{\max}_{\mathcal{R}^2} \approx 20.7$, meaning that we cannot rule out that there exists a specification where the interaction term becomes significant. Given the relatively small number of combinations, we can actually compute the ground truth to verify—and, indeed, there are 510 models that yield a statistically significant result.
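When the data are available, a brute-force verification of this kind takes only a few lines. The sketch below assumes a hypothetical pandas DataFrame `data` with an outcome column, a treatment column, and lists of base and optional control names; all of these names are placeholders of ours, not variables from the original study.

```python
from itertools import combinations
import statsmodels.formula.api as smf

def max_abs_t_over_subsets(data, y, d, base, extras):
    """Refit OLS for every subset of `extras` and track the largest
    absolute t-statistic on `d`. Feasible only when len(extras) is small."""
    best = 0.0
    for k in range(len(extras) + 1):
        for subset in combinations(extras, k):
            rhs = " + ".join([d, *base, *subset])
            fit = smf.ols(f"{y} ~ {rhs}", data=data).fit()
            best = max(best, abs(fit.tvalues[d]))
    return best
```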
5 Discussion
The algebra of OLS both imposes strong limits on, and reveals clear patterns in, how reversals of statistical insignificance occur. The first lesson that emerges from our analysis is that reversals of low t-statistics are unlikely to occur through reductions in standard errors alone. As shown in the first row of Table 2, even with a t-statistic of 1.75, one would need to explain at least 20% of the residual variation in $Y$ to achieve statistical significance at the 5% level solely through a reduction in standard errors. Elevating a t-statistic of 0.5 to statistical significance requires explaining a remarkable 93% of the residual variation of $Y$. Such strong associations with the response variable are uncommon in most empirical applications.
t_r                      0.25   0.50   0.75   1.00   1.25   1.50   1.75
XRVI^0_{α=0.05}          0.98   0.93   0.85   0.74   0.59   0.41   0.20
XRVI^{q.95}_{α=0.05}     0.41   0.32   0.24   0.16   0.10   0.05   0.01

Table 2: Approximate values of XRVI for various values of $t_r$.
A second consequence of our findings is that, in RCTs, it should be difficult to observe reversals of low t-statistics due to covariate adjustments. Since covariates in such trials typically have zero association with the treatment by design (barring sampling errors), their inclusion is unlikely to significantly shift the point estimate. A back-of-the-envelope calculation illustrates this point: if $D$ is randomized, then $\text{df} \cdot R^2_{DZ|X}$ follows an approximate chi-square distribution with one degree of freedom. Thus, letting $q_{.95}$ denote the (approximate) 95th percentile of realizations of $R^2_{DZ|X}$, we can calculate $\text{XRVI}^{q_{.95}}_{0.05}$ for various values of $t_r$. These values are recorded in the second line of Table 2. With the exception of $t_r = 1.75$ and $t_r = 1.5$, the values of $R^2_{YZ|DX}$ required to reverse statistical insignificance remain moderate to large, suggesting that such reversals should be uncommon in practice.
Finally, and perhaps counter-intuitively, even when included variables are highly predictive of the response, reversals of insignificance are still typically driven by shifts in the point estimate rather than by reductions in standard errors. To illustrate, consider the usual critical threshold of $t^*_{\alpha,\text{df}-1} \approx 2$ and any observed t-statistic below 1. Then, if $R^2_{YZ|DX} \le 0.5$, it is impossible to obtain a reversal that is not mainly driven by changes in the point estimate. In other words, any post-mortem analysis of such significance reversals, using decompositions like (5), must necessarily find that the relative change in bias is larger than the relative change in standard errors. Overall, these results closely mirror empirical patterns of reversals of statistical insignificance observed in applied research, such as those documented by Lenz and Sahn (2021), and may offer a purely algebraic explanation for at least some of these patterns.
Appendix: Deferred proofs
Proof of Theorem 2. From (6) and (7), the magnitude of the t-statistic for $H_0\!:\lambda = \lambda_0$ can be written as a function of $R^2_{YZ|DX}$ and $R^2_{DZ|X}$,

$$t(R^2_{YZ|DX}, R^2_{DZ|X}) = \frac{\left|(\hat{\lambda}_r - \lambda_0) \pm \text{BF} \times \widehat{\text{se}}(\hat{\lambda}_r) \times \sqrt{\text{df}}\right|}{\widehat{\text{se}}(\hat{\lambda}_r) \times \text{SEF} \times \sqrt{\text{df}/(\text{df}-1)}}. \tag{9}$$
We wish to maximize $t(R^2_{YZ|DX}, R^2_{DZ|X})$ under the posited bounds $R^2_{YZ|DX} \le R^{\max}_Y$ and $R^2_{DZ|X} \le R^{\max}_D$. That is, we want to solve the constrained maximization problem,

$$t^{\max}_{\mathcal{R}^2} = \max_{R^2_{YZ|DX},\, R^2_{DZ|X}} t(R^2_{YZ|DX}, R^2_{DZ|X}) \tag{10}$$

such that $R^2_{YZ|DX} \le R^{\max}_Y$, $R^2_{DZ|X} \le R^{\max}_D$. First notice that we should choose the direction of the bias that increases the magnitude of the difference $(\hat{\lambda} - \lambda_0)$. If $(\hat{\lambda}_r - \lambda_0) > 0$ then we should add the BF term in (9), whereas if $(\hat{\lambda}_r - \lambda_0) < 0$ then the bias should be subtracted. Both cases yield the same (simplified) objective function, as argued below.
Suppose that $\hat{\lambda}_r > \lambda_0$. Then the absolute value from (10) can be dropped and the objective function becomes

$$\frac{\hat{\lambda}_r - \lambda_0 + \text{BF} \times \widehat{\text{se}}(\hat{\lambda}_r) \times \sqrt{\text{df}}}{\widehat{\text{se}}(\hat{\lambda}_r) \times \text{SEF} \times \sqrt{\text{df}/(\text{df}-1)}} = \frac{f_r \times \widehat{\text{se}}(\hat{\lambda}_r) + \text{BF} \times \widehat{\text{se}}(\hat{\lambda}_r)}{\widehat{\text{se}}(\hat{\lambda}_r) \times \text{SEF} \times \sqrt{1/(\text{df}-1)}}.$$
Now suppose that $\hat{\lambda}_r < \lambda_0$. We again have

$$\frac{\lambda_0 - \hat{\lambda}_r + \text{BF} \times \widehat{\text{se}}(\hat{\lambda}_r) \times \sqrt{\text{df}}}{\widehat{\text{se}}(\hat{\lambda}_r) \times \text{SEF} \times \sqrt{\text{df}/(\text{df}-1)}} = \frac{f_r \times \widehat{\text{se}}(\hat{\lambda}_r) + \text{BF} \times \widehat{\text{se}}(\hat{\lambda}_r)}{\widehat{\text{se}}(\hat{\lambda}_r) \times \text{SEF} \times \sqrt{1/(\text{df}-1)}}.$$
Therefore, after some algebraic manipulation, the maximum t-value in (10) will be of the form

$$t^{\max}_{\mathcal{R}^2} = \frac{f_r\sqrt{1 - R^{2*}_{DZ|X}} + \sqrt{R^{2*}_{YZ|DX}\, R^{2*}_{DZ|X}}}{\sqrt{(1 - R^{2*}_{YZ|DX})/(\text{df}-1)}},$$

where $R^{2*}_{YZ|DX}, R^{2*}_{DZ|X}$ are the values of $R^2_{YZ|DX}$ and $R^2_{DZ|X}$ that optimize (10).
We now find analytical expressions for the optimizers $R^{2*}_{YZ|DX}, R^{2*}_{DZ|X}$. In what follows we write $t_r$ for the objective function with the understanding that it is written in its modified form above. The partial derivative of $t_r/\sqrt{\text{df}-1}$ with respect to $R^2_{YZ|DX}$ is

$$\frac{\partial\, t_r/\sqrt{\text{df}-1}}{\partial R^2_{YZ|DX}} = \frac{f_r\sqrt{(1-R^2_{DZ|X})\,R^2_{YZ|DX}} + \sqrt{R^2_{DZ|X}}}{2\,(1-R^2_{YZ|DX})^{3/2}\sqrt{R^2_{YZ|DX}}}.$$

Since the partial of $t_r$ with respect to $R^2_{YZ|DX}$ is always positive, we have $R^{2*}_{YZ|DX} = R^{\max}_Y$ unconditionally, i.e., $R^{2*}_{YZ|DX}$ always lies on the boundary.
We now turn to the partial derivative of $t_r/\sqrt{\text{df}-1}$ with respect to $R^2_{DZ|X}$:

$$\frac{\partial\, t_r/\sqrt{\text{df}-1}}{\partial R^2_{DZ|X}} = \frac{-f_r\sqrt{R^2_{DZ|X}} + \sqrt{R^2_{YZ|DX}(1-R^2_{DZ|X})}}{2\sqrt{(1-R^2_{YZ|DX})(1-R^2_{DZ|X})\,R^2_{DZ|X}}}.$$

It is straightforward to check that the second derivative of $t_r/\sqrt{\text{df}-1}$ with respect to $R^2_{DZ|X}$ is negative. Thus, when attainable, the zero of the first partial derivative with respect to $R^2_{DZ|X}$ is a maximizer. Solving for the value that makes the first derivative zero yields:

$$R^{2*}_{DZ|X} = \frac{R^{\max}_Y}{f^2_r + R^{\max}_Y}. \tag{11}$$

This interior point solution is only feasible when $f^2_r \ge R^{\max}_Y(1 - R^{\max}_D)/R^{\max}_D$. Otherwise, if

$$f^2_r < R^{\max}_Y(1 - R^{\max}_D)/R^{\max}_D, \tag{12}$$

then the partial with respect to $R^2_{DZ|X}$ is strictly positive for all $R^2_{DZ|X} \le R^{\max}_D$ and so we obtain the boundary solution $R^{2*}_{DZ|X} = R^{\max}_D$.
Proof of Theorem 3. Let $Z$ denote an $(n \times m)$ matrix of unobserved covariates and let $\hat{\gamma}$ denote the coefficient vector of $Z$. We are now interested in the long regression

$$Y = \hat{\lambda} D + X\hat{\beta} + Z\hat{\gamma} + \hat{\epsilon}. \tag{13}$$

Consider the $(n \times 1)$ vector $Z_L := Z\hat{\gamma}$. The regression

$$Y = \hat{\lambda} D + X\hat{\beta} + Z_L + \hat{\epsilon} \tag{14}$$

yields the same value for $\hat{\lambda}$; therefore, the bias induced by $Z$ is equal to that induced by $Z_L$, and $R^2_{Y Z_L|D,X} = R^2_{YZ|D,X}$. On the other hand, since $\hat{\gamma}$ is chosen solely to maximize $R^2_{YZ|D,X}$, we also have that $R^2_{D Z_L|X} \le R^2_{DZ|X}$. Now observe that the standard error formula from (7) holds for multivariate $Z$ if we correctly adjust for the degrees of freedom. Further note that the bias of $Z_L$ is a strictly increasing function of $R^2_{D Z_L|X}$. Thus, the most adversarial choice of $Z$ is such that $R^2_{D Z_L|X} = R^2_{DZ|X}$. We can thus assess the maximum t-statistic of a matrix $Z$ by considering that of a single adversarial vector $Z_L$ and further adjusting for the degrees of freedom.
Proof of Corollary 1. For any subset $X_S$ of the columns of the observed covariate matrix $X$, recall that $R^2_{Y X_S|D} \le R^2_{YX|D}$ and $R^2_{D X_S} \le R^2_{DX}$. Now apply the proof of Theorem 3 with the alteration that the constraint $R^2_{D Z_L|X} \le R^2_{DZ|X}$ is not necessarily tight, since $Z_L$ may not be adversarial for a specific dataset.
For all cases below, consider the following condition for significance:

$$t^*_{\alpha,\text{df}-1} \le t^{\max}_{\mathcal{R}^2}. \tag{15}$$
Proof of Theorem 4. First consider the case in which $f_r = 0$. This only happens if $\hat{\lambda}_r = \lambda_0$. Since here $R^2_{DZ|X} = 0$, the inclusion of $Z$ does not alter the point estimate and we still obtain $\hat{\lambda} = \lambda_0$ after adjusting for $Z$. Therefore, the adjusted t-statistic will be zero regardless of the value of the standard error.

If $f_r > f^*_{\alpha,\text{df}-1}$ then we are already able to reject $H_0$ even if $Z$ has zero explanatory power. Otherwise, given the constraints $R^2_{YZ|DX} \le \text{XRVI}$ and $R^2_{DZ|X} = 0$, the expression for $t^{\max}_{\mathcal{R}^2}$ simplifies to

$$t^{\max}_{\{\text{XRVI},\,0\}} = \max_{R^2_{YZ|DX}} \frac{f_r}{\sqrt{(1 - R^2_{YZ|DX})/(\text{df}-1)}}$$

such that $R^2_{YZ|DX} \le \text{XRVI}$. This is a strictly increasing function of $R^2_{YZ|DX}$ and thus attains its maximum at $R^2_{YZ|DX} = \text{XRVI}$. Thus solving for the minimum value of XRVI that satisfies (15) is equivalent to solving for XRVI at the equality. That is,

$$f^*_{\alpha,\text{df}-1} = \frac{f_r}{\sqrt{1 - \text{XRVI}^0_\alpha}}.$$

Squaring and rearranging terms, we obtain

$$\text{XRVI}^0_\alpha = 1 - \left(\frac{f_r}{f^*_{\alpha,\text{df}-1}}\right)^2,$$

as desired.
Proof of Theorem 5. If $f_r > f^*_{\alpha,\text{df}-1}$ then we are already able to reject $H_0$ even if $Z$ has zero explanatory power, and thus the minimal strength to reject $H_0$ is zero. Otherwise, consider the constraints $R^2_{YZ|DX} \le \text{XRVI}$ and $R^2_{DZ|X} \le 1$. From the proof of Theorem 2 we see that $t^{\max}_{\mathcal{R}^2}$ is an increasing function of XRVI. Thus solving for the minimum value of XRVI that satisfies (15) is equivalent to solving for XRVI at the equality. Notice that (12) is not satisfied here. We therefore plug the interior point solution from Theorem 2 into $t^{\max}_{\mathcal{R}^2}$ and solve for $\text{XRVI}^1_\alpha$. This results in the equation

$$f^*_{\alpha,\text{df}-1} = \sqrt{\frac{f^2_r + \text{XRVI}^1_\alpha}{1 - \text{XRVI}^1_\alpha}},$$

which yields the solution

$$\text{XRVI}^1_\alpha = \frac{f^{*2}_{\alpha,\text{df}-1} - f^2_r}{1 + f^{*2}_{\alpha,\text{df}-1}},$$

as we wanted to show.
Theorem 7 (Closed-form expression for $\text{XRVI}^{R^{\max}_D}_\alpha$). The analytical expression for $\text{XRVI}^{R^{\max}_D}_\alpha$ is

$$\text{XRVI}^{R^{\max}_D}_\alpha = \begin{cases} \dfrac{-b + s\sqrt{b^2 - 4ac}}{2a}, & \text{if } 0 \le f^2_r < \text{XRVI}^1_\alpha \dfrac{1 - R^{\max}_D}{R^{\max}_D}, \\[1ex] \text{XRVI}^1_\alpha, & \text{if } \text{XRVI}^1_\alpha \dfrac{1 - R^{\max}_D}{R^{\max}_D} \le f^2_r \le f^{*2}_{\alpha,\text{df}-1}, \\[1ex] 0, & \text{if } f^{*2}_{\alpha,\text{df}-1} < f^2_r, \end{cases}$$

where

$$a = 1, \tag{16}$$

$$b = -2\left[\frac{f^{*2}_{\alpha,\text{df}-1}(1 - R^{\max}_D) - f^2_r}{f^{*2}_{\alpha,\text{df}-1} + R^{\max}_D} + \frac{2 f^2_r (1 - R^{\max}_D) R^{\max}_D}{(f^{*2}_{\alpha,\text{df}-1} + R^{\max}_D)^2}\right], \tag{17}$$

$$c = \left[\frac{f^{*2}_{\alpha,\text{df}-1}(1 - R^{\max}_D) - f^2_r}{f^{*2}_{\alpha,\text{df}-1} + R^{\max}_D}\right]^2, \tag{18}$$

$$\text{XRVI}^1_\alpha = \frac{f^{*2}_{\alpha,\text{df}-1} - f^2_r}{1 + f^{*2}_{\alpha,\text{df}-1}}, \tag{19}$$

and $s \in \{-1, 1\}$ is chosen to yield the valid quadratic root, i.e., the root for which $t^{\max}_{\{\text{XRVI},\,R^{\max}_D\}} = t^*_{\alpha,\text{df}-1}$.
Proof of Theorem 7. As argued in the proofs of Theorems 4 and 5, if $f^*_{\alpha,\text{df}-1} < f_r$, then the minimum strength of $Z$ necessary to attain significance is zero. Otherwise, solving for the minimum value of XRVI that satisfies (15) is equivalent to solving for XRVI at the equality.

Now let $f_r > 0$. Here we have two cases: either both $R^2$ values reach the bound, or the optimal value of $R^2_{DZ|X}$ is an interior point. In the latter case, the constraint on $R^2_{DZ|X}$ is not binding and thus $\text{XRVI}^{R^{\max}_D}_\alpha$ should equal $\text{XRVI}^1_\alpha$. Recall that $t_r$ is concave down with respect to $R^{\max}_D$; therefore, the optimal value of $R^2_{DZ|X}$ is the interior point solution $R^{2*}_{DZ|X}$, as defined in (11), if and only if $R^{2*}_{DZ|X} < R^{\max}_D$. We can simplify this condition to be of the form

$$\text{XRVI}^1_\alpha \frac{1 - R^{\max}_D}{R^{\max}_D} < f^2_r.$$
It remains to solve for the case where both coordinates reach the bound. Here, the equality in (15) simplifies to

$$t^*_{\alpha,\text{df}-1} = \frac{f_r\sqrt{1 - R^{\max}_D} + \sqrt{R^{\max}_D \cdot \text{XRVI}}}{\sqrt{(1 - \text{XRVI})/(\text{df}-1)}}. \tag{20}$$

We now solve for XRVI in (20) by squaring both sides and taking the valid root of the quadratic equation. The simplified quadratic form is

$$\text{XRVI}^2 - \left[\frac{2\left(f^{*2}_{\alpha,\text{df}-1}(1 - R^{\max}_D) - f^2_r\right)}{f^{*2}_{\alpha,\text{df}-1} + R^{\max}_D} + \frac{4 f^2_r (1 - R^{\max}_D) R^{\max}_D}{(f^{*2}_{\alpha,\text{df}-1} + R^{\max}_D)^2}\right]\text{XRVI} + \left[\frac{f^{*2}_{\alpha,\text{df}-1}(1 - R^{\max}_D) - f^2_r}{f^{*2}_{\alpha,\text{df}-1} + R^{\max}_D}\right]^2 = 0. \tag{21}$$
The expressions for $a, b, c$ in (16)-(18) immediately follow from the Sridharacharya-Bhaskara formula for quadratic equations. Finally, we note that if $f_r = 0$ and $R^{\max}_D > 0$ then the $\text{XRVI}^{R^{\max}_D}_\alpha$ solution is valid. If $f_r = 0$ and $R^{\max}_D = 0$ then we are in the $\text{XRVI}^0_\alpha$ case such that, by Theorem 4, we cannot overturn an insignificant result.
Proof of Theorem 6. The derivation of $\text{RVI}_\alpha$ follows that of $\text{XRVI}^{R^{\max}_D}_\alpha$ very closely (see the proof of Theorem 7). The only difference is that when $f^2_r < 1 - \text{RVI}$, the equality in (15) simplifies to

$$t^*_{\alpha,\text{df}-1} = \frac{f_r\sqrt{1 - \text{RVI}} + \text{RVI}}{\sqrt{1 - \text{RVI}}} \times \sqrt{\text{df}-1},$$

which is equivalent to

$$f^*_{\alpha,\text{df}-1} = f_r + \frac{\text{RVI}}{\sqrt{1 - \text{RVI}}}. \tag{22}$$

Let $f_\Delta = f^*_{\alpha,\text{df}-1} - f_r$. Then (22) further reduces to a quadratic function of RVI with positive root

$$\text{RVI} = \frac{1}{2}\left(\sqrt{f^4_\Delta + 4 f^2_\Delta} - f^2_\Delta\right).$$

It remains to show that (12), i.e., $f^2_r \ge 1 - \text{RVI}$, is equivalent to

$$f_r > \frac{1}{f^*_{\alpha,\text{df}-1}}. \tag{23}$$
This is equivalent to showing that $\text{RVI}_\alpha$ is given by the interior point solution if and only if (23) holds. To see why, recall that for the interior point solution,

$$R^{\max}_Y = R^{\max}_D = \text{RVI}_\alpha = \frac{f^{*2}_{\alpha,\text{df}-1} - f^2_r}{1 + f^{*2}_{\alpha,\text{df}-1}}. \tag{24}$$

Plugging (24) into (12), we have

$$f^2_r \ge 1 - \text{RVI}_\alpha = 1 - \frac{f^{*2}_{\alpha,\text{df}-1} - f^2_r}{1 + f^{*2}_{\alpha,\text{df}-1}},$$

which reduces to $f^*_{\alpha,\text{df}-1} \ge 1/f_r$.
Asymptotic distribution of $R^2_{DZ|X}$. First consider the case without observed covariates $X$. Under i.i.d. sampling, the asymptotic distribution of the sample correlation $R_{DZ}$ is derived in Ferguson (2017, Theorem 8, p. 52) to be

$$\sqrt{\text{df}}\,(R_{DZ} - \rho) \xrightarrow{d} N(0, \gamma^2),$$

where $\xrightarrow{d}$ denotes convergence in distribution, $\rho$ is the population correlation coefficient of $D$ and $Z$, and

$$\gamma^2 = c_1\rho^2 - c_2\rho + \frac{\operatorname{Var}\big((D - E[D])(Z - E[Z])\big)}{\operatorname{Var}(D)\operatorname{Var}(Z)},$$

where the constants $c_1$ and $c_2$ depend on higher order moments of $D$ and $Z$, and $\operatorname{Var}(\cdot)$ denotes the population variance. If $D$ is randomized, we have that $D$ is independent of $Z$ by design. Thus $\rho = 0$ and $\operatorname{Var}((D - E[D])(Z - E[Z])) = \operatorname{Var}(D)\operatorname{Var}(Z)$. This simplifies the expression of the asymptotic variance $\gamma^2$ to 1, and we have

$$\sqrt{\text{df}}\, R_{DZ} \xrightarrow{d} N(0,1), \quad \text{and} \quad \text{df}\, R^2_{DZ} \xrightarrow{d} \chi^2_1.$$

To extend the argument to the sample partial correlation $R_{DZ|X}$, first note that it can be rewritten using the FWL theorem (Frisch and Waugh, 1933; Lovell, 1963) as the sample correlation $R_{DZ|X} = \operatorname{cor}(\tilde{Z}, \tilde{D})$, where $\tilde{Z}$ and $\tilde{D}$ are the sample residuals of the regressions of $Z$ and $D$ on $X$, i.e., $\tilde{Z} := Z - X\hat{\theta}$ and $\tilde{D} := D - X\hat{\delta}$, and $\hat{\theta}$ and $\hat{\delta}$ are the respective OLS coefficient estimates, which are asymptotically normal. Now define the population counterparts $\check{Z} := Z - X\theta$, $\check{D} := D - X\delta$, where we replace sample estimates with their corresponding population values. We note that estimation errors in $\hat{\theta}$ and $\hat{\delta}$ do not affect the asymptotic distribution of $\operatorname{cor}(\tilde{Z}, \tilde{D})$, which is the same as that of $\operatorname{cor}(\check{Z}, \check{D})$—this can be verified by applying standard results in large sample theory to the covariances (and variances) of sample residuals, such as Boos and Stefanski (2013, Theorem 5.28, p. 249). Now $\operatorname{cor}(\check{Z}, \check{D})$ is again a simple bivariate correlation, and we can directly apply the result of Ferguson (2017) above. Note this result does not rely on any parametric distributional assumptions on $\check{D}$ and $\check{Z}$, except for requiring that the relevant moments are finite.
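A quick Monte Carlo check of this chi-square approximation (our own illustration; the sample size and number of replications are arbitrary, and we use $n$ in place of df since there are no covariates here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 500, 20_000
r2 = np.empty(reps)
for i in range(reps):
    d = rng.binomial(1, 0.5, size=n)   # randomized treatment
    z = rng.normal(size=n)             # covariate independent of D
    r2[i] = np.corrcoef(d, z)[0, 1] ** 2
# The 95th percentile of n * R2_DZ should be close to the chi2(1) quantile:
print(np.quantile(n * r2, 0.95), stats.chi2.ppf(0.95, df=1))  # both ~ 3.84
```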
Acknowledgements

This work is supported in part by the Royalty Research Fund at the University of Washington, and by the National Science Foundation under Grant No. 2417955.
References

Amlani, Sharif and Samuel Collitt (2022). "The Impact of Vote-By-Mail Policy on Turnout and Vote Share in the 2020 Election". In: Election Law Journal: Rules, Politics, and Policy 21.2, pp. 135–149.

Boos, Dennis D and Leonard A Stefanski (2013). Essential statistical inference: theory and methods. Vol. 591. Springer.

Brodeur, Abel et al. (2016). "Star wars: The empirics strike back". In: American Economic Journal: Applied Economics 8.1, pp. 1–32.

Cinelli, Carlos, Andrew Forney, and Judea Pearl (2022). "A Crash Course in Good and Bad Controls". In: Sociological Methods & Research 1, p. 34.

Cinelli, Carlos and Chad Hazlett (2020). "Making Sense of Sensitivity: Extending Omitted Variable Bias". In: Journal of the Royal Statistical Society Series B: Statistical Methodology 82.1, pp. 39–67.

Cinelli, Carlos and Chad Hazlett (2022). "An omitted variable bias framework for sensitivity analysis of instrumental variables". In: Available at SSRN 4217915.

Ferguson, Thomas S (2017). A course in large sample theory. Routledge.

Frisch, Ragnar and Frederick V Waugh (1933). "Partial time regressions as compared with individual trends". In: Econometrica: Journal of the Econometric Society, pp. 387–401.

Lenz, Gabriel S and Alexander Sahn (2021). "Achieving statistical significance with control variables and without transparency". In: Political Analysis 29.3, pp. 356–369.

Lovell, Michael C (1963). "Seasonal adjustment of economic time series and multiple regression analysis". In: Journal of the American Statistical Association 58.304, pp. 993–1010.

Simonsohn, Uri, Leif D Nelson, and Joseph P Simmons (2014). "P-curve: a key to the file-drawer". In: Journal of Experimental Psychology: General 143.2, p. 534.

Vivalt, Eva (2019). "Specification searching and significance inflation across time, methods and disciplines". In: Oxford Bulletin of Economics and Statistics 81.4, pp. 797–816.