JSS Journal of Statistical Software
MMMMMM YYYY, Volume VV, Issue II. doi: 10.18637/jss.v000.i00
sensemakr: Sensitivity Analysis Tools for OLS in R and Stata
Carlos Cinelli
University of California, Los Angeles
Jeremy Ferwerda
Dartmouth College
Chad Hazlett
University of California, Los Angeles
Abstract
This paper introduces the package sensemakr for R and Stata, which implements a
suite of sensitivity analysis tools for regression models developed in Cinelli and Hazlett
(2020a). Given a regression model, sensemakr can compute sensitivity statistics for routine
reporting, such as the robustness value, which describes the minimum strength that
unobserved confounders need to have to overturn a research conclusion. The package also
provides plotting tools that visually demonstrate the sensitivity of point estimates and
t-values to hypothetical confounders. Finally, sensemakr implements formal bounds on
sensitivity parameters by means of comparison with the explanatory power of observed
variables. All these tools are based on the familiar “omitted variable bias” framework, do
not require assumptions regarding the functional form of the treatment assignment mechanism
or the distribution of the unobserved confounders, and naturally handle multiple,
non-linear confounders. With sensemakr, users can transparently report the sensitivity
of their causal inferences to unobserved confounding, thereby enabling a more precise,
quantitative debate as to what can be concluded from imperfect observational studies.
Keywords: causal inference, sensitivity analysis, omitted variable bias, robustness value, R,
Stata, bounds.
1. Introduction
Across disciplines, investigators face the perennial challenge of making and defending causal
claims using observational data. The most common identification strategy in these circumstances
is to adjust for a set of observed covariates deemed sufficient to control for confounding,
with linear regression remaining among the most popular statistical methods for making such
adjustments. Researchers who argue that a regression coefficient unbiasedly reflects a causal
relationship must also be able to argue that there are no unobserved confounders, a difficult
or impossible assumption to defend in most applied settings.¹ What value can we draw from
these studies, knowing that this ideal condition is likely to fail? Fortunately, the assumption
of zero unobserved confounding need not hold precisely for an observational study to remain
substantively informative. In these cases, sensitivity analyses play a useful role by allowing
researchers to quantify how strong unobserved confounding needs to be in order to substantially
change a research conclusion, and by aiding in determining whether confounding of such
strength is plausible.
Although numerous methods for sensitivity analyses have been proposed, these tools are still
under-utilized.² As argued in Cinelli and Hazlett (2020a), several reasons may contribute to
the low adoption of these methods. First, many of these methods impose complicated and
strong assumptions regarding the nature of the confounder, which many users cannot or are
not willing to defend. Second, while users routinely report regression tables or coefficient plots,
until recently investigators have lacked “standard” quantities that can easily and correctly
summarize the robustness of a regression coefficient to unobserved confounding. Finally,
connecting the results of a formal sensitivity analysis to a cogent argument about what types
of confounders may exist in one’s research project remains difficult, particularly when there
are no compelling arguments as to why the treatment assignment should be approximately
“ignorable,” “exogenous,” or “as-if random.”
This paper introduces the R and Stata package sensemakr (Cinelli, Ferwerda, and Hazlett
2020b,a), which implements a suite of sensitivity analysis tools proposed in Cinelli and Hazlett
(2020a) to address these challenges. Within the familiar regression framework and without
the need for additional assumptions, sensemakr enables analysts to easily answer a variety of
common sensitivity questions, such as:
• How strong would an unobserved confounder (or a group of confounders) have to be to
change a research conclusion?
• In a worst-case scenario, how robust are the results to all unobserved confounders acting
together, possibly non-linearly?
• How strong would confounding need to be, relative to the strength of observed covariates,
to change the answer by a certain amount?
Specifically, given a full regression model, or simply standard statistics found in conventional
regression tables, sensemakr is able to: (i) compute sensitivity statistics for routine reporting,
such as the robustness value describing the minimum strength that unobserved confounders
would need to have to overturn the research conclusions; (ii) provide graphical tools that
enable users to visually explore the implications of unobserved confounding, such as contour
plots showing adjusted point estimates and t-values under confounding of various strengths,
as well as plots showing adjusted estimates under extreme (pessimistic) scenarios; and (iii)
place formal bounds on the maximum strength of confounding, based on plausibility judgments
regarding how unobserved confounders compare with observed variables. These tools
do not require additional assumptions regarding the functional form of the treatment assignment
mechanism or the distribution of the unobserved confounders, and naturally handle
multiple confounders, possibly acting non-linearly.
¹ This condition is also known as “selection on observables,” “conditional ignorability,” “conditional
exogeneity,” “conditional exchangeability,” or “backdoor admissibility” (Angrist and Pischke 2008;
Pearl 2009; Imbens and Rubin 2015; Hernán and Robins 2020).
² Dating back to at least Cornfield, Haenszel, Hammond, Lilienfeld, Shimkin, and Wynder (1959), a
partial list of sensitivity analysis proposals includes Rosenbaum and Rubin (1983); Robins (1999);
Frank (2000); Rosenbaum (2002); Imbens (2003); Brumback, Hernán, Haneuse, and Robins (2004);
Frank, Sykes, Anagnostopoulos, Cannata, Chard, Krause, and McCrory (2008); Hosman, Hansen, and
Holland (2010); Imai, Keele, Yamamoto et al. (2010); Vanderweele and Arah (2011); Blackwell (2013);
Frank, Maroulis, Duong, and Kelcey (2013); Carnegie, Harada, and Hill (2016); Dorie, Harada,
Carnegie, and Hill (2016); Middleton, Scott, Diakow, and Hill (2016); Oster (2017); Cinelli, Kumor,
Chen, Pearl, and Bareinboim (2019); Franks, D’Amour, and Feller (2019).
In what follows, Section 2 briefly reviews the omitted variable bias framework for sensitivity
analysis developed in Cinelli and Hazlett (2020a), which provides the theoretical foundations
for the tools in sensemakr. Next, Section 3 describes the basic functionality and provides a
practical introduction to sensitivity analysis using sensemakr for R. Section 4 describes
advanced usage of the R package, and shows how to leverage individual functions for customized
sensitivity analyses. Finally, Section 5 describes sensemakr for Stata, and Section 6 concludes
with a brief discussion of what sensitivity analysis can and cannot do in practice.
2. Sensitivity analysis in an omitted variable bias framework
In this section, we briefly review the omitted variable bias (OVB) framework for sensitivity
analysis presented in Cinelli and Hazlett (2020a). This method builds on a scale-free
reparameterization of the OVB formula in terms of partial $R^2$ values, which allows us to: (i) assess the
sensitivity of point estimates, t-values, and confidence intervals under the same conceptual
framework; (ii) easily assess the sensitivity of multiple confounders acting together, possibly
non-linearly; (iii) exploit knowledge of the relative strength of variables to posit plausible
bounds on unobserved confounding; and (iv) construct a set of summary sensitivity statistics
suitable for routine reporting.
2.1. The OVB framework
The starting point of our analysis is a “full” linear regression model of an outcome $Y$ on a
treatment $D$, controlling for a set of covariates given by both $X$ and $Z$,
$$Y = \hat{\tau} D + X\hat{\beta} + \hat{\gamma} Z + \hat{\varepsilon}_{\text{full}} \tag{1}$$
where $Y$ is an $(n \times 1)$ vector containing the outcome of interest for each of the $n$ observations
and $D$ is an $(n \times 1)$ treatment variable (which may be continuous or binary); $X$ is an $(n \times p)$
matrix of observed covariates including the constant; and $Z$ is a single $(n \times 1)$ unobserved
covariate (we discuss how to extend results for a multivariate $Z$ below).
Equation 1 is the regression model that the investigator wished she had run to obtain a valid
causal estimate of the effect of $D$ on $Y$. Nevertheless, $Z$ is unobserved. Therefore, the feasible
regression the investigator is able to estimate is the “restricted” model omitting $Z$, that is,
$$Y = \hat{\tau}_{\text{res}} D + X\hat{\beta}_{\text{res}} + \hat{\varepsilon}_{\text{res}} \tag{2}$$
Given the discrepancy between what we wish to know and what we actually have, the main question
we would like to answer is: how do the observed point estimate and standard error of the
restricted regression, $\hat{\tau}_{\text{res}}$ and $\widehat{\text{se}}(\hat{\tau}_{\text{res}})$, compare to the desired point estimate and standard
error of the full regression, $\hat{\tau}$ and $\widehat{\text{se}}(\hat{\tau})$?
OVB with the partial R2 parameterization
Define as $\widehat{\text{bias}}$ the difference between the restricted and full estimates,
$\widehat{\text{bias}} := \hat{\tau}_{\text{res}} - \hat{\tau}$.
Now let (i) $R^2_{D \sim Z|X}$ denote the share of residual variance of the treatment $D$ explained by
the omitted variable $Z$, after accounting for the remaining covariates $X$; and (ii) $R^2_{Y \sim Z|D,X}$
denote the share of residual variance of the outcome $Y$ explained by the omitted variable $Z$,
after accounting for $X$ and $D$. Cinelli and Hazlett (2020a) have shown that these quantities
are sufficient for determining the bias, adjusted estimate, and adjusted standard errors of the
full regression of Equation 1.
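More precisely, the bias can be written as,
$$|\widehat{\text{bias}}| = \widehat{\text{se}}(\hat{\tau}_{\text{res}}) \sqrt{\frac{R^2_{Y \sim Z|D,X}\, R^2_{D \sim Z|X}}{1 - R^2_{D \sim Z|X}}\,(\text{df})} \tag{3}$$
where df stands for the degrees of freedom of the restricted regression actually run. Moreover,
the estimated standard error of $\hat{\tau}$ can be recovered with,
$$\widehat{\text{se}}(\hat{\tau}) = \widehat{\text{se}}(\hat{\tau}_{\text{res}}) \sqrt{\frac{1 - R^2_{Y \sim Z|D,X}}{1 - R^2_{D \sim Z|X}} \left(\frac{\text{df}}{\text{df} - 1}\right)}. \tag{4}$$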
Given hypothetical values of $R^2_{D \sim Z|X}$ and $R^2_{Y \sim Z|D,X}$, Equations 3 and 4 allow investigators
to examine the sensitivity of point estimates and standard errors (and consequently t-values,
confidence intervals or p-values) to the inclusion of any omitted variable $Z$ with such strengths.
Conversely, given a critical threshold deemed to be problematic, one can find the strength of
confounders capable of bringing about a bias reducing the adjusted effect to that threshold.
Another useful property of the OVB formula with the partial $R^2$ parameterization is that the
effect of $R^2_{Y \sim Z|D,X}$ on the bias is bounded. This allows investigators to contemplate extreme
sensitivity scenarios, in which the parameter $R^2_{Y \sim Z|D,X}$ is set to 1 (or another conservative
value), and see what happens as $R^2_{D \sim Z|X}$ varies.
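As a minimal sketch of how Equations 3 and 4 can be evaluated, the base R snippet below computes the bias, the adjusted standard error, and the implied adjusted estimate and t-value for one hypothetical confounder. The helper names `ovb_bias` and `ovb_adjusted_se` and the confounder strengths (0.05 and 0.10) are illustrative assumptions, not part of sensemakr; the estimate, standard error and degrees of freedom are the rounded values reported in Table 1.

```r
# Sketch of Equations 3 and 4 (helper names are hypothetical, not sensemakr's API).

# Equation 3: absolute bias implied by a confounder of given strength.
ovb_bias <- function(se_res, df, r2dz_x, r2yz_dx) {
  se_res * sqrt(r2yz_dx * r2dz_x / (1 - r2dz_x) * df)
}

# Equation 4: estimated standard error of the full regression.
ovb_adjusted_se <- function(se_res, df, r2dz_x, r2yz_dx) {
  se_res * sqrt((1 - r2yz_dx) / (1 - r2dz_x) * df / (df - 1))
}

# Rounded inputs from the restricted regression.
estimate <- 0.097
se_res   <- 0.023
df       <- 783

# Hypothetical confounder strengths (assumptions for illustration only).
r2dz_x  <- 0.05   # share of residual treatment variance explained by Z
r2yz_dx <- 0.10   # share of residual outcome variance explained by Z

bias_hat <- ovb_bias(se_res, df, r2dz_x, r2yz_dx)
se_adj   <- ovb_adjusted_se(se_res, df, r2dz_x, r2yz_dx)

# Assumes the confounder pulls the (positive) estimate towards zero.
est_adj <- estimate - bias_hat
t_adj   <- est_adj / se_adj

round(c(bias = bias_hat, estimate = est_adj, se = se_adj, t = t_adj), 4)
```

Setting `r2yz_dx = 1` in `ovb_bias()` gives the extreme scenario described above, in which only the strength of the confounder with the treatment varies.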
2.2. Sensitivity statistics for routine reporting
The previous formulas can be used to assess the sensitivity of an estimate to confounders with
any hypothesized strength. However, making sensitivity analyses standard practice benefits
from simple and interpretable sensitivity statistics that can quickly summarize the robustness
of a study result to unobserved confounding. With this in mind, Cinelli and Hazlett (2020a)
propose two main sensitivity statistics for routine reporting: (i) the (observed) partial $R^2$
of the treatment with the outcome, $R^2_{Y \sim D|X}$; and (ii) the robustness value, $RV_{q,\alpha}$. These
statistics serve two main purposes:
1. They can be easily displayed alongside other summary statistics in regression tables,
making sensitivity analysis to unobserved confounding simple, accessible, and standardized;
2. They can be easily computed from quantities found in a regression table, thereby enabling
readers and reviewers to assess the sensitivity of results they see in print, even if
the original authors did not perform sensitivity analyses.
The partial R2 of the treatment with the outcome
In addition to quantifying how much variation of the outcome is explained by the treatment,
the partial $R^2$ of the treatment with the outcome also conveys how robust the point estimate
is to unobserved confounding in an “extreme scenario.” Specifically, suppose the unobserved
confounder $Z$ explains all residual variance of the outcome, that is, $R^2_{Y \sim Z|D,X} = 1$. For
this confounder to bring the point estimate to zero, it must explain at least as much residual
variation of the treatment as the residual variation of the outcome that the treatment currently
explains. Put differently, if $R^2_{Y \sim Z|D,X} = 1$, then we must have that $R^2_{D \sim Z|X} \geq R^2_{Y \sim D|X}$,
otherwise this confounder cannot logically account for all the observed association between
the treatment and the outcome (Cinelli and Hazlett 2020a).
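The following base R sketch shows how this statistic can be recovered from a regression table, using the standard OLS identity $R^2 = t^2/(t^2 + \text{df})$, and then checks the extreme-scenario claim against Equation 3. The helper name `partial_r2_from_t` is hypothetical, and the inputs are the rounded t-value, standard error and degrees of freedom reported for the Darfur example later in the paper.

```r
# Partial R2 of the treatment with the outcome from a t-value and residual df.
# The helper name is hypothetical; the identity R2 = t^2 / (t^2 + df) is the
# usual relation between a coefficient's t-value and its partial R2 in OLS.
partial_r2_from_t <- function(t_value, df) {
  t_value^2 / (t_value^2 + df)
}

t_value <- 4.18    # rounded t-value of the treatment coefficient
df      <- 783     # residual degrees of freedom
se_res  <- 0.023   # rounded standard error of the treatment coefficient

r2yd_x <- partial_r2_from_t(t_value, df)
round(r2yd_x, 3)

# Extreme scenario: Z explains all residual outcome variance (R2 = 1) and
# exactly as much residual treatment variance as r2yd_x. Equation 3 then
# yields a bias of se * |t|, that is, the full magnitude of the estimate.
bias_extreme <- se_res * sqrt(1 * r2yd_x / (1 - r2yd_x) * df)
all.equal(bias_extreme, se_res * abs(t_value))
```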
The Robustness Value
The second sensitivity statistic proposed in Cinelli and Hazlett (2020a) is the robustness value.
The robustness value $RV_{q,\alpha}$ quantifies the minimal strength of association that the confounder
needs to have, both with the treatment and with the outcome, so that a confidence interval
of level $1 - \alpha$ includes a change of $q$% of the current estimated value.
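Let $f_q := q|f_{Y \sim D|X}|$, where $|f_{Y \sim D|X}|$ is the partial Cohen’s $f$ of the treatment with the
outcome multiplied by the percentage reduction $q$ deemed to be problematic.³ Also, let
$|t^*_{\alpha, \text{df}-1}|$ denote the t-value threshold for a t-test with significance level of $\alpha$ and $\text{df} - 1$ degrees
of freedom, and define $f^*_{\alpha, \text{df}-1} := |t^*_{\alpha, \text{df}-1}|/\sqrt{\text{df} - 1}$. Finally, construct $f_{q,\alpha}$, which “deducts”
from $f_{Y \sim D|X}$ both the proportion of reduction $q$ of the point estimate and the boundary below
which statistical significance is lost at the level of $\alpha$. That is, $f_{q,\alpha} := f_q - f^*_{\alpha, \text{df}-1}$. We then
have that $RV_{q,\alpha}$ is given by (Cinelli and Hazlett 2020a,b),
$$RV_{q,\alpha} = \begin{cases} 0, & \text{if } f_{q,\alpha} < 0 \\ \frac{1}{2}\left(\sqrt{f_{q,\alpha}^4 + 4 f_{q,\alpha}^2} - f_{q,\alpha}^2\right), & \text{if } f_q < 1/f^*_{\alpha, \text{df}-1} \\ \dfrac{f_q^2 - f^{*2}_{\alpha, \text{df}-1}}{1 + f_q^2}, & \text{otherwise.} \end{cases} \tag{5}$$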
Any confounder that explains $RV_{q,\alpha}$% of the residual variance of both the treatment and of
the outcome is sufficiently strong to make the adjusted t-test not reject the null hypothesis
$H_0: \tau = (1 - q)|\hat{\tau}_{\text{res}}|$ at the $\alpha$ level (or, equivalently, sufficiently strong to make the adjusted
$1 - \alpha$ confidence interval include $(1 - q)|\hat{\tau}_{\text{res}}|$). Likewise, a confounder with associations lower
than $RV_{q,\alpha}$ is not capable of overturning the conclusion of such a test. Setting $\alpha = 1$ returns
the robustness value for the point estimate. Further details on how to interpret the robustness
value in practice are given in the next sections.
³ The partial Cohen’s $f^2$ can be written as $f^2_{Y \sim D|X} = R^2_{Y \sim D|X}/(1 - R^2_{Y \sim D|X})$.
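As an illustration, Equation 5 can be sketched in a few lines of base R. The function name `rv_sketch` is hypothetical and is not the package's implementation; it assumes a two-sided test, so the threshold is taken as the absolute value of `qt(alpha / 2, df - 1)`, and it uses the fact that the partial Cohen's $f$ equals $|t|/\sqrt{\text{df}}$. The inputs are the rounded t-value and degrees of freedom reported for the Darfur example later in the paper.

```r
# Sketch of Equation 5 (function name is hypothetical, not sensemakr's API).
rv_sketch <- function(t_value, df, q = 1, alpha = 1) {
  # f_q: partial Cohen's f of the treatment with the outcome, scaled by q.
  f_q <- q * abs(t_value) / sqrt(df)
  # f*: critical t-value (two-sided test assumed) over sqrt(df - 1).
  # With alpha = 1 the critical value is 0, giving the RV of the point estimate.
  f_crit <- abs(qt(alpha / 2, df = df - 1)) / sqrt(df - 1)
  f_qa <- f_q - f_crit

  if (f_qa < 0) {
    0
  } else if (f_q < 1 / f_crit) {
    0.5 * (sqrt(f_qa^4 + 4 * f_qa^2) - f_qa^2)
  } else {
    (f_q^2 - f_crit^2) / (1 + f_q^2)
  }
}

rv_point <- rv_sketch(t_value = 4.18, df = 783)                # q = 1, alpha = 1
rv_05    <- rv_sketch(t_value = 4.18, df = 783, alpha = 0.05)  # q = 1, alpha = 0.05
round(c(rv_point = rv_point, rv_05 = rv_05), 3)
```

Because only a t-value and the degrees of freedom are needed, the same calculation can be applied to results taken from a published regression table.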
2.3. Bounds on the strength of confounding using observed covariates
Consider a confounder orthogonal to the observed covariates, i.e., $Z \perp X$, or, equivalently,
consider only the part of $Z$ not linearly explained by $X$. Now denote by $X_j$ a specific covariate
of the set $X$ and define
$$k_D := \frac{R^2_{D \sim Z|X_{-j}}}{R^2_{D \sim X_j|X_{-j}}}, \qquad k_Y := \frac{R^2_{Y \sim Z|X_{-j},D}}{R^2_{Y \sim X_j|X_{-j},D}}, \tag{6}$$
where $X_{-j}$ represents the vector of covariates $X$ excluding $X_j$. That is, the terms $k_D$ and $k_Y$
represent how strong the confounder $Z$ is relative to observed covariate $X_j$, where “strength”
is measured by how much residual variation they explain of the treatment (for $k_D$) and of
the outcome (for $k_Y$). Given $k_D$ and $k_Y$, we can rewrite the strength of the confounders as
(Cinelli and Hazlett 2020a),
$$R^2_{D \sim Z|X} = k_D f^2_{D \sim X_j|X_{-j}}, \qquad R^2_{Y \sim Z|D,X} \leq \eta^2 f^2_{Y \sim X_j|X_{-j},D} \tag{7}$$
where $\eta$ is a scalar which depends on $k_Y$, $k_D$ and $R^2_{D \sim X_j|X_{-j}}$.
These equations allow the investigator to assess the maximum bias that a hypothetical confounder
at most “k times” as strong as a particular covariate $X_j$ could cause. This can be
used to explore the relative strength of confounding necessary for bias to have changed the
research conclusion. Furthermore, when the researcher has domain knowledge to argue that
a certain covariate $X_j$ is particularly important in explaining treatment or outcome variation,
and that omitted variables cannot explain as much residual variance of $D$ or $Y$ as that
observed covariate, these results can be used to set plausible bounds on the total amount of
confounding. The same inequalities hold if one uses a group of variables for benchmarking,
by simply replacing the individual partial $R^2$ with the group partial $R^2$ of those variables
(Cinelli and Hazlett 2020b).
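The treatment-side part of Equation 7 is simple enough to sketch directly in base R. The helper name `bound_r2dz_x` and the benchmark partial $R^2$ of 0.01 are hypothetical values chosen only for illustration. The outcome-side bound is left out because it requires $\eta$, whose expression is not reproduced in this section.

```r
# Treatment-side bound from Equation 7: R2_{D~Z|X} = kD * f2_{D~Xj|X-j}.
# The helper name and the benchmark value below are hypothetical.
bound_r2dz_x <- function(r2dxj_x, kd = 1) {
  # Partial Cohen's f^2 of the benchmark covariate with the treatment.
  f2 <- r2dxj_x / (1 - r2dxj_x)
  bound <- kd * f2
  # A partial R2 cannot exceed 1, so such a combination is not admissible.
  if (bound > 1) stop("implied bound on R2_{D~Z|X} exceeds 1")
  bound
}

r2dxj_x <- 0.01   # assumed partial R2 of the benchmark Xj with the treatment

# Confounders once, twice and three times as strong as the benchmark.
bounds <- sapply(1:3, function(k) bound_r2dz_x(r2dxj_x, kd = k))
names(bounds) <- paste0(1:3, "x")
round(bounds, 4)
```

The resulting values can then be compared with the robustness value, or plugged into Equation 3 together with a bound on the outcome side.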
2.4. Multiple or non-linear confounders
Finally, suppose that, instead of a single unobserved confounder $Z$, there are multiple
unobserved confounders $Z = [Z_1, Z_2, \ldots, Z_k]$. In this case, the regression the investigator wished
she had run becomes:
$$Y = \hat{\tau} D + X\hat{\beta} + Z\hat{\gamma} + \hat{\varepsilon}_{\text{full}}. \tag{8}$$
As Cinelli and Hazlett (2020a) show, the previous results considering a single unobserved
confounder are in fact conservative when considering the impact of multiple confounders,
barring an adjustment in the degrees of freedom of Equation 4. Moreover, since the vector
$Z$ is arbitrary, this can also accommodate non-linear confounders or even misspecification of
the functional form of the observed covariates $X$. In other words, to assess the maximum
bias that multiple, non-linear confounders could cause in our current estimates, it suffices to
think in terms of the maximum explanatory power that $Z$ could have in the treatment and
outcome regressions, as parameterized by $R^2_{D \sim Z|X}$ and $R^2_{Y \sim Z|D,X}$.
3. sensemakr for R: Basic functionality
In this section we illustrate the basic functionality of sensemakr for R. Given that sensitivity
analysis requires contextual knowledge to be properly interpreted, we illustrate these tools
with a real example. We use sensemakr to reproduce all results found in Section 5 of Cinelli
and Hazlett (2020a), which estimates the effects of exposure to violence on attitudes towards
peace, in Darfur, Sudan. Further details about this application and the data can be found in
Hazlett (2019).
3.1. Violence in Darfur: data and research question
In 2003 and 2004, the Sudanese government orchestrated a horrific campaign of violence
against civilians in Darfur, killing an estimated two hundred thousand people. This application asks
whether, on average, being directly injured or maimed in this episode made individuals more
likely to feel “vengeful” and unwilling to make peace with those who perpetrated this violence.
Or, might those who directly suffered such violence be motivated to see it end, supporting
calls for peace?
The sensemakr package provides the data required for this example based on a survey among
Darfurian refugees in eastern Chad (Hazlett 2019). To get started we first need to install the
package. From within R, the sensemakr package can be installed from the Comprehensive R
Archive Network (CRAN).
R> install.packages("sensemakr")
After loading the package, the data can be loaded with the command data("darfur").
R> library(sensemakr)
R> data("darfur")
The “treatment” variable of interest is directlyharmed, which indicates whether the individ-
ual was physically injured or maimed during the attack on her or his village in Darfur. The
main outcome of interest is peacefactor, an index measuring pro-peace attitudes. Other
covariates in the data include: village (a factor variable indicating the original village of
the respondent), female (a binary indicator of gender), age, herder_dar (whether they were
a herder in Darfur), farmer_dar (whether they were a farmer in Darfur), and pastvoted
(whether they report having voted in an earlier election, prior to the conflict). For further
details, see ?darfur.
Hazlett (2019) argues that the purpose of these attacks was to punish civilians from ethnic
groups presumed to support the opposition and to kill or drive these groups out so as to reduce
this support. Violence against civilians included aerial bombardments by the government as
well as assaults by the Janjaweed, a pro-government militia. For this example, suppose a
researcher argues that, while some villages were more or less intensively attacked,
within-village violence was largely indiscriminate. The bombings were crude, could not be finely
targeted below the level of village, and the strategic purpose of the attacks was not to kill
or capture specific individuals. Similarly, the Janjaweed had no reason to target certain
individuals rather than others, and no information with which to do so, with one major
exception: women were targeted and often subjected to sexual violence.

                      Dependent variable: peacefactor
directlyharmed        0.097*** (0.023)
female                0.232*** (0.024)
Observations          1,276
$R^2$                 0.512
Residual Std. Error   0.310 (df = 783)
Note: *p<0.1; **p<0.05; ***p<0.01
Table 1: OLS results for darfur.model. To conserve space, only the results for
directlyharmed and female are shown.
Supported by these considerations, this researcher may argue that adjusting for village and
female is sufficient for control of confounding, and run the following linear regression model
(in which other pre-treatment covariates, although not necessary for identification, are also
included):
R> darfur.model <- lm(peacefactor ~ directlyharmed + village + female +
R+ age + farmer_dar + herder_dar +
R+ pastvoted + hhsize_darfur,
R+ data = darfur)
This regression model results in the estimates shown in Table 1. According to this model,
those who were directly harmed in violence were on average more “pro-peace,” not less.
The threat of unobserved confounders
The previous estimate requires the assumption of no unobserved confounders for unbiasedness.
While supported by the claim that there is no targeting of violence within village and gender
strata, other investigators may challenge this account. For example, although the bombing
was crude, perhaps bombs were still more likely to hit the center of the village, and those in
the center were also likely to hold different attitudes towards peace. Or, it could be the case
that the Janjaweed observed signals that indicate individual characteristics such as wealth,
and targeted using this information. Or perhaps an individual’s (prior) political attitudes
could have led them to take actions that exposed them to greater risk during the attack. To
complicate things, all these factors could interact with each other or otherwise have other
non-linear effects.
These concerns suggest that, instead of the previous linear model (darfur.model), we should
have run a model such as:
R> darfur.complete.model <- lm(peacefactor ~ directlyharmed + village +
R+ female + age + farmer_dar + herder_dar +
R+ pastvoted + hhsize_darfur +
R+ center*wealth*political_attitudes,
R+ data = darfur)
where center*wealth*political_attitudes indicates fully interacted terms for these three
variables. However, trying to fit the model darfur.complete.model will result in an error: none
of the variables center, wealth or political_attitudes were measured.
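The failure can be reproduced without the survey data. The sketch below uses a small simulated data frame (`toy`, with hypothetical columns `y` and `d`) that, like the real data, lacks the variable `wealth`, and captures the resulting error with `tryCatch()`.

```r
# Toy data standing in for a dataset that lacks the confounder of interest
# (all names here are hypothetical and unrelated to the darfur data).
set.seed(1)
toy <- data.frame(y = rnorm(20), d = rbinom(20, 1, 0.5))

# Attempt to adjust for a variable that was never measured.
fit <- tryCatch(
  lm(y ~ d + wealth, data = toy),
  error = function(e) e
)

inherits(fit, "error")   # TRUE: the model cannot be fit
conditionMessage(fit)    # the message names the missing variable
```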
Given an assumption on how strongly omitted variables relate to the treatment and the
outcome, how would including them have changed our inferences regarding the coefficient
of directlyharmed? Or, what is the minimal strength that these unobserved confounders
(or all remaining unobserved confounders) need to have to change our previous conclusions?
Additionally, how can we leverage our contextual knowledge about the attacks to judge how
plausible such confounders are? For instance, given the limited opportunities for targeting
and the special role of gender in this case, if we assumed that unobserved confounding cannot
explain more than female, what would this imply about the maximum possible strength of
confounding? We show next how to use sensemakr to answer each of these questions.
3.2. Violence in Darfur: sensitivity analysis
The main function in sensemakr for R is sensemakr(). This function performs the most
commonly required sensitivity analyses and returns an object of class sensemakr, which
can then be further explored with the print, summary and plot methods (see details in
?print.sensemakr and ?plot.sensemakr). We begin the analysis by applying sensemakr()
to the original regression model, darfur.model.
R> darfur.sensitivity <- sensemakr(model = darfur.model,
R+ treatment = "directlyharmed",
R+ benchmark_covariates = "female",
R+ kd = 1:3,
R+ ky = 1:3,
R+ q = 1,
R+ alpha = 0.05,
R+ reduce = TRUE)
The arguments of this call are:
• model: the lm object with the outcome regression. In our case, darfur.model.
• treatment: the name of the treatment variable. In our case, "directlyharmed".
• benchmark_covariates: the names of covariates that will be used to bound the plausible
strength of the unobserved confounders. Here, we put "female", which one could
argue to be among the main determinants of exposure to violence. It was also found to
be among the strongest determinants of attitudes towards peace empirically. Variables
considered as separate benchmarks can be passed as a single character vector; variables
that should be treated jointly as a group for benchmarks should be passed as a named
list of character vectors.
• kd and ky: these arguments parameterize how many times stronger the confounder is
related to the treatment (kd) and to the outcome (ky) in comparison to the observed
benchmark covariate ("female"). In our example, setting kd = 1:3 and ky = 1:3
means we want to investigate the maximum strength of a confounder once, twice, or
three times as strong as female (in explaining treatment and outcome variation). If only
kd is given, ky will be set equal to it by default.
• q: this allows the user to specify what fraction of the effect estimate would have to be
explained away to be problematic. Setting q = 1 means that a reduction of 100% of the
current effect estimate (i.e., a true effect of zero) would be deemed problematic. The
default is q = 1.
• alpha: significance level of interest for making statistical inferences. The default is
alpha = 0.05.
• reduce: should we consider confounders acting towards increasing or reducing the absolute
value of the estimate? The default is reduce = TRUE, which means we are considering
confounders that pull the estimate towards (or through) zero. Setting reduce
= FALSE will consider confounders that pull the estimate away from zero.
Using the default arguments, one can simplify the previous call to
R> darfur.sensitivity <- sensemakr(model = darfur.model,
R+ treatment = "directlyharmed",
R+ benchmark_covariates = "female",
R+ kd = 1:3)
After running sensemakr(), we can explore the sensitivity analysis results. We note that the
function sensemakr() also has formula and numeric methods. See ?sensemakr for details.
Sensitivity statistics for routine reporting
The print method for sensemakr provides the original (unadjusted) estimate along with three
summary sensitivity statistics suited for routine reporting: (1) the partial $R^2$ of the treatment
with the outcome; (2) the robustness value (RV) required to reduce the estimate entirely to
zero (i.e., $q = 1$); and (3) the RV beyond which the estimate would no longer be statistically
distinguishable from zero at the 5% level ($q = 1$, $\alpha = 0.05$).
R> darfur.sensitivity
Sensitivity Analysis to Unobserved Confounding
Model Formula: peacefactor ~ directlyharmed + village + female + age + farmer_dar +
herder_dar + pastvoted + hhsize_darfur
Unadjusted Estimates of 'directlyharmed ':
Coef. estimate: 0.097
Standard Error: 0.023
t-value: 4.18
Sensitivity Statistics:
Partial R2 of treatment with outcome: 0.022
Robustness Value, q = 1 : 0.139
Robustness Value, q = 1 alpha = 0.05 : 0.076
For more information, check summary.
The package also provides a function that creates a LaTeX or HTML table with these results, as
shown in Table 2 (for the HTML table, simply change the argument to format = "html").
R> ovb_minimal_reporting(darfur.sensitivity, format = "latex")
Outcome: peacefactor
Treatment        Est.    S.E.    t-value   $R^2_{Y \sim D|X}$   $RV_{q=1}$   $RV_{q=1,\alpha=0.05}$
directlyharmed   0.097   0.023   4.184     2.2%                 13.9%        7.6%
df = 783         Bound (1x female): $R^2_{Y \sim Z|X,D}$ = 12.5%, $R^2_{D \sim Z|X}$ = 0.9%
Table 2: Minimal sensitivity analysis reporting.
Together these three sensitivity statistics provide the ingredients for a standard reporting
template proposed in Cinelli and Hazlett (2020a). More precisely:
• The robustness value for bringing the point estimate of directlyharmed exactly to zero
($RV_{q=1}$) is 13.9%. This means that unobserved confounders that explain 13.9% of the
residual variance both of the treatment and of the outcome are sufficiently strong to
explain away all the observed effect. On the other hand, unobserved confounders that
do not explain at least 13.9% of the residual variance both of the treatment and of the
outcome are not sufficiently strong to do so.
• The robustness value for testing the null hypothesis that the coefficient of directlyharmed
is zero ($RV_{q=1,\alpha=0.05}$) falls to 7.6%. This means that unobserved confounders that
explain 7.6% of the residual variance both of the treatment and of the outcome are
sufficiently strong to bring the lower bound of the confidence interval to zero (at the chosen
significance level of 5%). On the other hand, unobserved confounders that do not
explain at least 7.6% of the residual variance both of the treatment and of the outcome
are not sufficiently strong to do so.
• Finally, the partial $R^2$ of directlyharmed with peacefactor means that, in an extreme
scenario, in which we assume that unobserved confounders explain all of the left out
variance of the outcome, these unobserved confounders would need to explain at least
2.2% of the residual variance of the treatment to fully explain away the observed effect.
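The first interpretation can be checked numerically. The base R sketch below, which uses only the rounded quantities printed above, computes the robustness value for $q = 1$ from the closed form in Equation 5 and then plugs it into Equation 3 for both partial $R^2$ values. Under these assumptions the implied bias equals the standard error times the t-value, that is, the point estimate up to rounding.

```r
# Rounded quantities reported for directlyharmed.
se_res <- 0.023
df     <- 783
t_val  <- 4.18

# Robustness value for q = 1 (point estimate), using f^2 = t^2 / df in the
# closed form of Equation 5.
f2 <- t_val^2 / df
rv <- 0.5 * (sqrt(f2^2 + 4 * f2) - f2)

# Equation 3 with both partial R2 values set equal to the robustness value.
bias_at_rv <- se_res * sqrt(rv * rv / (1 - rv) * df)

# The bias matches se * t, the magnitude of the estimate (up to rounding).
round(c(rv = rv, bias = bias_at_rv, se_times_t = se_res * t_val), 3)
```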
These quantities summarize what we need to know in order to safely rule out confounders
that are deemed to be problematic. Researchers can then argue as to whether they fall within
plausible bounds on the maximum explanatory power that unobserved confounders could have
in a given application.
Where investigators are unable to offer strong arguments limiting the absolute strength of
confounding, it can be productive to consider relative claims, for instance, by arguing that
unobserved confounders are likely not multiple times stronger than a certain observed covari-
ate. In our application, this is indeed the case. One could argue that, given the nature of
the attacks, it is hard to imagine that unobserved confounding could explain much more of
the residual variance of targeting than what is explained by the observed variable female.
The lower corner of the table, thus, provides bounds on confounding as strong as female,
$R^2_{Y \sim Z \mid X, D} = 12.5\%$, and $R^2_{D \sim Z \mid X} = 0.9\%$. Since both of those are
below the robustness value, confounders as strong as female are not sufficient to explain
away the observed estimate. Moreover, the bound on $R^2_{D \sim Z \mid X}$ is below the partial
$R^2$ of the treatment with the outcome, $R^2_{Y \sim D \mid X}$. This means that even an extreme
confounder explaining all residual variation of the outcome and as strongly associated with
the treatment as female would not overturn the research conclusions. As noted in Section 2.4,
these results are exact for a single unobserved confounder, and conservative for multiple
confounders, possibly acting non-linearly.
Finally, the summary method for sensemakr provides an extensive report with verbal descrip-
tions of all these analyses. Entering the command summary(darfur.sensitivity) produces
verbose output similar to the text explanations in the last several paragraphs (and thus not
reproduced here), so that researchers can directly cite or include such text in their reports.
Sensitivity contour plots of point estimates and t-values
The minimal report of sensitivity results provided by Table 2 offers a useful summary of how
robust the current estimate is to unobserved confounding. Researchers can extend and refine
sensitivity analyses through plotting methods for sensemakr that visually explore the whole
range of possible estimates that confounders with different strengths could cause. These plots
can also represent different bounds on the plausible strength of confounding based on different
assumptions on how they compare to observed covariates.
We begin by examining the default plot type, contour plots for the point estimate.
R> plot(darfur.sensitivity)
The resulting plot is shown in the left panel of Figure 1. The horizontal axis shows the
residual share of variation of the treatment that is hypothetically explained by unobserved
confounding, $R^2_{D \sim Z \mid X}$. The vertical axis shows the hypothetical partial $R^2$ of
unobserved confounding with the outcome, $R^2_{Y \sim Z \mid X, D}$. The contours show what
estimate for directlyharmed would have been obtained in the full regression model including
unobserved confounders with such hypothetical strengths. Note that the plot is parameterized
in a way that hurts our preferred hypothesis, by pulling the estimate towards zero. Recall
that the direction of the bias was determined by the argument reduce = TRUE of the
sensemakr() call.
Figure 1: Sensitivity contour plots of point estimate (left) and t-value (right). In both
panels the horizontal axis is the partial $R^2$ of confounder(s) with the treatment and the
vertical axis is the partial $R^2$ of confounder(s) with the outcome. Marked points, left
panel: unadjusted (0.097), 1x female (0.075), 2x female (0.053), 3x female (0.03). Marked
points, right panel: unadjusted (4.2), 1x female (3.439), 2x female (2.6), 3x female (1.628),
with a contour at the critical value 1.963.
The bounds on the strength of confounding, determined by the parameter kd = 1:3 in the
call for sensemakr(), are also shown in the plot. The plot reveals that the direction of the
effect (positive) is robust to confounding once, twice or even three times as strong as the
observed covariate female, although in this last case the magnitude of the effect is reduced
to a third of the original estimate.
We now examine the sensitivity of the t-value for testing the null hypothesis of zero effect by
choosing the option sensitivity.of = "t-value" of the plot() method.
R> plot(darfur.sensitivity, sensitivity.of = "t-value")
The resulting plot is shown in the right panel of Figure 1. At the 5% significance level, the
null hypothesis of zero effect would still be rejected given confounders once or twice as
strong as female. However, while the point estimate remains positive, accounting for sampling
uncertainty now means that the null hypothesis of zero effect would not be rejected with the
inclusion of a confounder three times as strong as female.
Sensitivity plots of extreme scenarios
Sometimes researchers may be better equipped to make plausibility judgments about the
strength of determinants of the treatment assignment mechanism, and have less knowledge
about the determinants of the outcome. In those cases, sensitivity plots using extreme
scenarios are a useful option. These are produced with the option type = "extreme". Here one
assumes confounding explains all or some large fraction of the residual variance of the
outcome, then varies how strongly such confounding is hypothetically related to the treatment
to see how this affects the resulting point estimate.
R> plot(darfur.sensitivity, type = "extreme")
Figure 2: Sensitivity analysis to extreme scenarios. The horizontal axis is the partial $R^2$
of confounder(s) with the treatment and the vertical axis is the adjusted effect estimate,
with one curve for each partial $R^2$ of confounder(s) with the outcome (100%, 75%, 50%).
Figure 2 shows the produced plot. By default these plots consider confounding that explains
100%, 75%, and 50% of the residual variance of the outcome, producing three separate curves.
This is equivalent to setting the argument r2yz.dx = c(1, .75, .5). The bounds on the
strength of association of a confounder once, twice or three times as strongly associated with
the treatment as female are shown as red ticks in the horizontal axis. As the plot shows, even
in the most extreme case ($R^2_{Y \sim Z \mid X, D} = 100\%$), confounders would need to be more
than twice as strongly associated with the treatment as female to fully explain away the point
estimate. Moving to the scenarios $R^2_{Y \sim Z \mid X, D} = 75\%$ and
$R^2_{Y \sim Z \mid X, D} = 50\%$, confounders would need to be more than three times as
strongly associated with the treatment as female to fully explain away the point estimate.
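The thresholds read off the extreme-scenario curves can be approximated by hand. The following
is a minimal base R sketch, assuming the omitted variable bias expression of Cinelli and
Hazlett (2020a), |bias| = se * sqrt(dof) * sqrt(R2yz.dx * R2dz.x / (1 - R2dz.x)), and the
rounded summary statistics quoted in this paper rather than the unrounded fit. The helper
r2dz_to_zero() is a hypothetical name introduced for illustration and is not part of
sensemakr; the benchmark values use the "kd times the partial f2 of female" expression as we
read it from that paper.

```r
# Rounded summary statistics for directlyharmed quoted in the text.
estimate <- 0.097
se       <- 0.023
dof      <- 783

# Hypothetical helper: partial R2 with the treatment that a confounder needs
# in order to bring the estimate to zero, given its partial R2 with the
# outcome. Obtained by setting the OVB bias equal to the estimate and solving.
r2dz_to_zero <- function(estimate, se, dof, r2yz.dx) {
  f2_yd <- (estimate / se)^2 / dof   # partial Cohen's f^2 of treatment with outcome
  f2_yd / (r2yz.dx + f2_yd)
}

needed <- r2dz_to_zero(estimate, se, dof, r2yz.dx = c(1, 0.75, 0.5))

# Benchmarks: confounders 1, 2 and 3 times as strong as female in explaining
# the treatment. The t-statistic and the 784 degrees of freedom are taken
# from the treatment regression table (Table 4).
t_female  <- 0.097 / 0.036
f2_female <- t_female^2 / 784
benchmark <- (1:3) * f2_female

print(round(needed, 3))
print(round(benchmark, 3))
```

With these rounded inputs, the 100% scenario requires a treatment partial R2 of about 2.2%,
which lies between the 2x and 3x female benchmarks, while the 75% scenario requires about
2.9%, beyond the 3x benchmark. This is consistent with the reading of Figure 2 given above.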
Group benchmarks
Users can also use a group of variables collectively as benchmarks, by providing a named list of
character vectors to the benchmark_covariates argument. Each character vector of the list
forms its own group. For example, the command below computes bounds on the maximum
strength of confounding once, twice or three times as strong as the combined explanatory
power of the covariates female and pastvoted. The names of the list are used for setting the
benchmark labels in plots and tables.
R> group.sens <- sensemakr(model = darfur.model,
R+ treatment = "directlyharmed",
R+ benchmark_covariates =
R+ list(female_past = c("female", "pastvoted")),
R+ kd = 1:3)
4. sensemakr for R: Advanced use
The standard functionality demonstrated in the previous section will suffice for most users,
most of the time. More flexibility can be obtained when needed by employing additional
functions, particularly:
• functions for computing the bias, adjusted estimates and standard errors: these comprise,
among others, the functions bias(), adjusted_estimate(), adjusted_se() and
adjusted_t(). They take as input the original (unadjusted) estimate (in the form of a
linear model or numeric values) and a pair of sensitivity parameters (the partial $R^2$ of
the omitted variable with the treatment and the outcome), and return the new quantity
adjusted for omitted variable bias.
• functions for computing sensitivity statistics: these comprise, among others, the
functions partial_r2(), robustness_value(), and sensitivity_stats(). These functions
compute sensitivity statistics suited for routine reporting, as proposed in Cinelli
and Hazlett (2020a). They take as input the original (unadjusted) estimate (in the form
of a linear model or numeric values), and return the corresponding sensitivity statistic.
• sensitivity plots: ovb_contour_plot() and ovb_extreme_plot() allow estimation and
plotting of the contour and extreme scenario plots, respectively. The convenience
function add_bound_to_contour() allows the user to place manually computed bounds on
contour plots. All plot functions return invisibly the data needed to replicate the plot,
so users can produce their own plots if preferred. The default options for plots work
best with width and height around 4 to 5 inches.
• bounding functions: ovb_bounds() computes bounds on the maximum strength of
confounding “k times” as strong as certain observed covariates. The auxiliary function
ovb_partial_r2_bound() computes bounds for confounders by passing the values of
the partial $R^2$ of the benchmarks directly.
We demonstrate the use of these functions below through examples chosen to illustrate im-
portant features of sensitivity analysis.
4.1. Formal versus informal benchmarking: customizing bounds
Informal “benchmarking” procedures have been suggested as aids to interpretation for
numerous sensitivity analyses. These approaches are usually described as revealing how an
unobserved confounder Z “not unlike” some observed covariate $X_j$ would alter the results of
a study (Imbens 2003; Blackwell 2013; Hosman et al. 2010; Carnegie et al. 2016; Dorie et al.
2016; Hong, Qin, and Yang 2018). As shown in Cinelli and Hazlett (2020a), these informal
proposals may lead users to erroneous conclusions, even when they make correct suppositions
about how unobserved confounders compare to observed covariates. Here we replicate Section 6.1
of Cinelli and Hazlett (2020a) using sensemakr and provide a numerical example illustrating
the potential for misleading results from informal benchmarking. This example also
demonstrates advanced usage of the package, including how to construct sensitivity contour
plots with customized bounds.
Data and model
We begin by simulating the data generating process which will be used in our example, as
given by Equations 9 to 12 below. Here we have a treatment variable D, an outcome variable
Y, one observed confounder X, and one unobserved confounder Z. All disturbance variables
U are standardized mutually independent normals. Note that in this case, the treatment D
has no causal effect on Y.
Model 1:
\begin{align}
Z &= U_z \tag{9} \\
X &= U_x \tag{10} \\
D &= X + Z + U_d \tag{11} \\
Y &= X + Z + U_y \tag{12}
\end{align}
Also note that, in this model: (i) the unobserved confounder Z is independent of X; and, (ii)
the unobserved confounder Z is exactly like X in terms of its strength of association with the
treatment and the outcome. The code below draws a sample of 100 observations from this data
generating process. We use the function resid_maker() to make sure the residuals are
standardized and orthogonal, so that all properties we describe here hold exactly even in a
finite sample.
R> n <- 100
R> X <- scale(rnorm(n))
R> Z <- resid_maker(n, X)
R> D <- X + Z + resid_maker(n, cbind(X, Z))
R> Y <- X + Z + resid_maker(n, cbind(X, Z, D))
In this example, the investigator knows that she needs to adjust for the confounder Z but,
unfortunately, does not observe Z. Therefore, she is forced to fit the restricted linear model
adjusting for X only.
R> model.ydx <- lm(Y ~ D + X)
Results from this regression are shown in the first column of Table 3, showing a large and
statistically significant coefficient estimate for both D and X.
Formal benchmarks
Suppose the investigator correctly knows that: (i) Z and X have the same strength of
association with D and Y; and, (ii) Z is independent of X. How can she leverage this
information to understand how much bias a confounder Z “not unlike” X could cause? As shown in
Section 2.3, Equation 7 can be used to bound the maximum amount of confounding caused by an
unobserved confounder Z as strongly associated with the treatment D and with the outcome
Y as the observed covariate X.
                      Dependent variable: Y
                      Restricted OLS (1)     Full OLS (2)
D                     0.500∗∗∗ (0.088)       0.000 (0.102)
X                     0.500∗∗∗ (0.152)       1.000∗∗∗ (0.144)
Z                                            1.000∗∗∗ (0.144)
Observations          100                    100
R2                    0.500                  0.667
Residual Std. Error   1.240 (df = 97)        1.020 (df = 96)
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Table 3: First column: results of the restricted regression adjusting for X only. Second
column: results of the full regression adjusting for X and Z.
Separately from the main sensemakr() function, these bounds can be computed with the
function ovb_bounds(). In this function one needs to specify the linear model being used
(model = model.ydx), the treatment of interest (treatment = "D"), the observed variable
used for benchmarking (benchmark_covariates = "X"), and how many times stronger Z
is in explaining treatment (kd = 1) and outcome (ky = 1) variation, as compared to the
benchmark variable X.
R> formal_bound <- ovb_bounds(model = model.ydx,
R+ treatment = "D",
R+ benchmark_covariates = "X",
R+ kd = 1,
R+ ky = 1)
We can now inspect the output of ovb_bounds().
R> formal_bound[1:6]
bound_label r2dz.x r2yz.dx treatment adjusted_estimate adjusted_se
1 1x X 0.5 0.333 D 0 0.102
As we can see, the results of the bounding procedure correctly show that an unobserved
confounder Z that is truly “not unlike” X would: (1) explain 50% of the residual variation of
the treatment and 33% of the residual variation of the outcome; (2) bring the point estimate
exactly to zero; and, (3) bring the standard error to 0.102. This is precisely what one obtains
when running the full regression model adjusting for both X and Z, as shown in the second
column of Table 3.
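To make the bounding step less of a black box, the sketch below recomputes the two partial R2
bounds in base R. It assumes the bound expressions as we understand them from Cinelli and
Hazlett (2020a); the function name bound_partial_r2() is hypothetical, and this is not the
package source. The inputs 1/3 and 0.1 are the partial R2 of X with D (in the regression of D
on X) and with Y (in the regression of Y on D and X) implied by Model 1, worked out by hand
for this illustration.

```r
# Hypothetical re-implementation of the partial R2 bounds for a confounder
# kd / ky times as strong as a benchmark covariate Xj.
#   r2dxj.x  : partial R2 of Xj with the treatment
#   r2yxj.dx : partial R2 of Xj with the outcome (given the treatment)
bound_partial_r2 <- function(r2dxj.x, r2yxj.dx, kd = 1, ky = kd) {
  # Bound on the confounder's partial R2 with the treatment: kd * f^2 of Xj.
  r2dz.x <- kd * r2dxj.x / (1 - r2dxj.x)
  if (any(r2dz.x >= 1)) stop("implied bound on r2dz.x is 1 or larger")
  # Implied partial R2 between the confounder and Xj, given D and the
  # remaining covariates.
  r2zxj.xd <- kd * r2dxj.x^2 / ((1 - kd * r2dxj.x) * (1 - r2dxj.x))
  # Bound on the confounder's partial R2 with the outcome, capped at 1
  # (the cap is a choice made in this sketch).
  eta2 <- (sqrt(ky) + sqrt(r2zxj.xd))^2 / (1 - r2zxj.xd)
  r2yz.dx <- pmin(eta2 * r2yxj.dx / (1 - r2yxj.dx), 1)
  data.frame(r2dz.x = r2dz.x, r2yz.dx = r2yz.dx)
}

b <- bound_partial_r2(r2dxj.x = 1/3, r2yxj.dx = 0.1, kd = 1, ky = 1)
print(round(b, 3))
```

The printed values (0.5 and 0.333) agree with the r2dz.x and r2yz.dx columns of the
ovb_bounds() output above.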
Informal benchmarks
We now demonstrate an “informal benchmark” to show its dangers. Computing the bias due
to the omission of Z requires two sensitivity parameters: its partial $R^2$ with the treatment D
and its partial $R^2$ with the outcome Y. Informal approaches follow from the intuition that we
can simply take the observed associations of X with D and Y, found directly from regressions
for the treatment and the outcome, to “calibrate” the magnitude of the sensitivity parameters
of an unobserved confounder “not unlike” X. Unfortunately, as formalized in Cinelli and
Hazlett (2020a), these observed associations are themselves affected by the omission of
Z, making naive comparisons potentially misleading.
What happens if we nevertheless attempt to use those observed statistics for benchmarking?
To compute the informal benchmarks, we first need to obtain the observed partial $R^2$ of X
with the outcome Y. This can be done using the partial_r2() function of sensemakr in the
model.ydx regression.
R> r2yx.d <- partial_r2(model.ydx, covariates = "X")
We next need to obtain the partial $R^2$ of X with the treatment D. For that, we need to fit a
new regression of the treatment D on the observed covariate X, here denoted by model.dx.
R> model.dx <- lm(D ~ X)
R> r2dx <- partial_r2(model.dx, covariates = "X")
We then determine what would be the implied adjusted estimate due to an unobserved
confounder Z with this pair of partial $R^2$ values. This can be computed using the
adjusted_estimate() function.
R> informal_adjusted_estimate <- adjusted_estimate(model = model.ydx,
R+ treatment = "D",
R+ r2dz.x = r2dx,
R+ r2yz.dx = r2yx.d)
Let us now compare those informal benchmarks with the formal bounds. To prepare, we first
plot sensitivity contours with the function ovb_contour_plot(). Next, we add the informal
benchmark to the contours, using the numeric method of the function add_bound_to_contour().
Finally, we use add_bound_to_contour() again to add the previously computed formal
bounds.
R> # draws sensitivity contours
R> ovb_contour_plot(model = model.ydx,
R+ treatment = "D",
R+ lim = .6)
R>
R> # adds informal benchmark
R> add_bound_to_contour(r2dz.x = r2dx,
R+ r2yz.dx = r2yx.d,
R+ bound_value = informal_adjusted_estimate,
R+ bound_label = "Informal benchmark")
R>
R> # adds formal bound
R> add_bound_to_contour(bounds = formal_bound,
R+ bound_label = "Formal bound")
Figure 3: Informal benchmarking versus proper bounds. Contour plot with the partial $R^2$ of
confounder(s) with the treatment on the horizontal axis and with the outcome on the vertical
axis. Marked points: unadjusted (0.5), informal benchmark (0.31), formal bound (0).
Note how the results from informal benchmarking are misleading: the benchmark point is still
far from zero, which would suggest that an unobserved confounder Z “not unlike” X is unable
to explain away the observed effect, when in fact it is, as shown in Table 3. This incorrect
conclusion occurs despite the investigator correctly assuming both that: (i) Z and X have the
same strength of association with D and Y; and, (ii) Z is independent of X. Therefore, we do
not recommend using informal benchmarks for sensitivity analysis, and suggest researchers
use formal approaches such as the ones provided with ovb_bounds(). For further details and
discussion, see Sections 4.4 and 6.1 of Cinelli and Hazlett (2020a).
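One way to see why the informal benchmark goes wrong is to compare, in a simulation where Z
is available, the partial R2 values observed for X with those that Z has in the full
regressions. The sketch below uses base R only. The helpers orth_noise() and partial_r2_lm()
are hypothetical names written for this illustration; orth_noise() mimics what we take
resid_maker() to do, which is an assumption rather than the package code.

```r
set.seed(2020)  # any seed works; the quantities below are exact by construction

# Hypothetical helper: a standardized draw that is exactly orthogonal,
# in sample, to the intercept and to the columns of `others`.
orth_noise <- function(n, others) {
  e <- residuals(lm(rnorm(n) ~ others))
  as.numeric(scale(e))
}

# Hypothetical helper: partial R2 of one term of a fitted lm, computed
# from its t-statistic and the residual degrees of freedom.
partial_r2_lm <- function(fit, term) {
  t_val <- summary(fit)$coefficients[term, "t value"]
  t_val^2 / (t_val^2 + fit$df.residual)
}

# Model 1: D has no causal effect on Y; Z is the unobserved confounder.
n <- 100
X <- as.numeric(scale(rnorm(n)))
Z <- orth_noise(n, X)
D <- X + Z + orth_noise(n, cbind(X, Z))
Y <- X + Z + orth_noise(n, cbind(X, Z, D))

restricted <- lm(Y ~ D + X)      # what the investigator can fit
full       <- lm(Y ~ D + X + Z)  # infeasible in practice, Z is unobserved

# Informal benchmark: partial R2 of X as observed when Z is omitted.
naive <- c(treatment = partial_r2_lm(lm(D ~ X), "X"),
           outcome   = partial_r2_lm(restricted, "X"))

# Partial R2 that Z actually has in the full regressions.
actual <- c(treatment = partial_r2_lm(lm(D ~ X + Z), "Z"),
            outcome   = partial_r2_lm(full, "Z"))

print(round(rbind(naive, actual), 3))
print(round(c(restricted = unname(coef(restricted)["D"]),
              full       = unname(coef(full)["D"])), 3))
```

In this construction the observed partial R2 of X (about 0.33 with the treatment and 0.10 with
the outcome) understates the partial R2 of Z (0.50 and about 0.33), even though Z and X enter
the data generating process symmetrically. The latter pair is the one recovered by the formal
bound.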
4.2. Assessing the sensitivity of existing regression results
We conclude this section by demonstrating how to replicate Section 3 using only the statistics
found in the regression table along with the individual functions available in the package.
Sensitivity statistics
The robustness value and the partial $R^2$ are key sensitivity statistics, useful for
standardized reporting of sensitivity analyses. Beyond the main sensemakr() function, these
statistics can be computed directly by the user with the functions robustness_value() and
partial_r2(). With a fitted lm model in hand, the most convenient way to compute the RV and
partial $R^2$ is by employing the lm methods for these functions, as in
R> robustness_value(model = darfur.model, covariates = "directlyharmed")
R> partial_r2(model = darfur.model, covariates = "directlyharmed")
However, when one does not have access to the data in order to run this model, simple
summary statistics such as: (i) the point estimate for directlyharmed (0.097); (ii) its
estimated standard error (0.023); and, (iii) the degrees of freedom of the regression (783) are
sufficient to compute the RV and the partial $R^2$.
R> robustness_value(t_statistic = 0.097/0.023, dof = 783)
R> partial_r2(t_statistic = 0.097/0.023, dof = 783)
The convenience function sensitivity_stats() also computes all sensitivity statistics for a
regression coefficient of interest and returns them in a data.frame.
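For readers who want to see what lies behind these two calls, the following is a minimal base
R sketch of the closed-form expressions for the partial R2 and for the robustness value with
q = 1, as given in Cinelli and Hazlett (2020a). The variable names are ours and are chosen for
illustration; the inputs are the rounded values from the regression table, so small
differences from the figures reported in Section 3 are to be expected.

```r
# Rounded regression-table values for directlyharmed quoted in the text.
t_stat <- 0.097 / 0.023
dof    <- 783

# Partial R2 of the treatment with the outcome, from the t-statistic.
partial_r2_hand <- t_stat^2 / (t_stat^2 + dof)

# Robustness value for reducing the estimate by a fraction q (here 100%).
q   <- 1
f_q <- q * abs(t_stat) / sqrt(dof)   # q times the partial Cohen's f
rv_hand <- 0.5 * (sqrt(f_q^4 + 4 * f_q^2) - f_q^2)

print(round(c(partial_r2 = partial_r2_hand, rv = rv_hand), 3))
```

With these inputs the partial R2 is about 0.022 and the robustness value is about 0.14, in
line with the 2.2% and 13.9% discussed earlier.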
Plotting functions
All plotting functions can be called directly with lm objects or numerical data. For example,
the code below uses the function ovb_contour_plot() to replicate Figure 1 (without the
bounds) using only the summary statistics of Table 1.
R> ovb_contour_plot(estimate = 0.097, se = 0.023, dof = 783)
R> ovb_contour_plot(estimate = 0.097, se = 0.023, dof = 783,
R+ sensitivity.of = "t-value")
The extreme scenario plots (as in Figure 2) can also be reproduced from summary statistics
using the function ovb_extreme_plot(),
R> ovb_extreme_plot(estimate = 0.097, se = 0.023, dof = 783)
All plotting functions return (invisibly) the data needed to reproduce them, allowing users to
create their own plots if they prefer.
Adjusted estimates, standard errors and t-values
The functions adjusted_estimate(), adjusted_se() and adjusted_t() allow users to compute
the adjusted estimates given different postulated
degrees of confounding. For instance, suppose a researcher has reasons to believe a confounder
explains 10% of the residual variance of the treatment and 15% of the residual variance of
the outcome. If the underlying data are not available, the investigator can still compute the
adjusted estimate and t-value that one would have obtained in the full regression adjusting
for such confounder.
                      Dependent variable: directlyharmed
female                0.097∗∗∗ (0.036)
Observations          1,276
R2                    0.426
Residual Std. Error   0.476 (df = 784)
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Table 4: Treatment regression for the Darfur example. To conserve space only the results for
female are shown, which will be used for benchmarking.
R> adjusted_estimate(estimate = 0.097, se = 0.023, dof = 783,
R+ r2dz.x = .1, r2yz.dx = 0.15)
[1] 0.0139
R> adjusted_t(estimate = 0.097, se = 0.023, dof = 783,
R+ r2dz.x = .1, r2yz.dx = 0.15)
[1] 0.622
The computation shows that this confounder is not strong enough to bring the estimate to
zero, but it is sufficient to bring the t-value below the usual 5% significance threshold of 1.96.
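The two numbers above can be checked by hand. The sketch below is a base R illustration,
assuming the omitted variable bias and adjusted standard error expressions of Cinelli and
Hazlett (2020a) and a confounder that pulls the estimate towards zero. The variable names
(bias_hand, est_hand, se_hand, t_hand) are ours and are not package objects.

```r
# Summary statistics and postulated confounder strengths from the text.
estimate <- 0.097
se       <- 0.023
dof      <- 783
r2dz.x   <- 0.10   # confounder's partial R2 with the treatment
r2yz.dx  <- 0.15   # confounder's partial R2 with the outcome

# Absolute omitted variable bias implied by the two sensitivity parameters.
bias_hand <- se * sqrt(dof) * sqrt(r2yz.dx * r2dz.x / (1 - r2dz.x))

# Assume the confounder reduces the absolute value of the estimate.
est_hand <- sign(estimate) * (abs(estimate) - bias_hand)

# Standard error in the full regression, which has one fewer degree of freedom.
se_hand <- se * sqrt((1 - r2yz.dx) / (1 - r2dz.x)) * sqrt(dof / (dof - 1))

t_hand <- est_hand / se_hand
print(signif(c(estimate = est_hand, t = t_hand), 3))
```

This reproduces the adjusted estimate of 0.0139 and the adjusted t-value of 0.622 shown
above.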
Computing bounds from summary statistics
Finally, we show how users can compute bounds on the strength of confounding using only
summary statistics, if the paper also provides a treatment regression table, i.e., a regression of
the treatment on the observed covariates. Such regressions are sometimes shown in published
works as part of efforts to describe the “determinants” of the treatment, or as “balance tests”
in which the investigator assesses whether observed covariates predict treatment assignment.
For the Darfur example, this regression is shown in Table 4.
Using the results of Tables 1 and 4 we can compute the bounds on confounding 1, 2 and 3
times as strong as female, as we have done before. First we compute the partial $R^2$ of female
with the treatment and the outcome:
R> r2yxj.dx <- partial_r2(t_statistic = -0.232/0.024, dof = 783)
R> r2dxj.x <- partial_r2(t_statistic = -0.097/0.036, dof = 783)
Next, we compute the bounds on the partial $R^2$ of the unobserved confounder using the
ovb_partial_r2_bound() function.
R> bounds <- ovb_partial_r2_bound(r2dxj.x = r2dxj.x,
R+ r2yxj.dx = r2yxj.dx,
R+ kd = 1:3,
R+ ky = 1:3,
R+ bound_label = paste(1:3, "x", "female"))
Finally, the adjusted_estimate() function computes the estimates implied by these hypo-
thetical confounders.
R> bound.values <- adjusted_estimate(estimate = 0.0973,
R+ se = 0.0232,
R+ dof = 783,
R+ r2dz.x = bounds$r2dz.x,
R+ r2yz.dx = bounds$r2yz.dx)
This information, along with the numeric methods for the plot functions, allows us to reproduce
the contour plots of Figure 1 using only summary statistics. Note that, since we are performing
all calculations manually, appropriate limits of the plot area need to be set by the user.
R> ovb_contour_plot(estimate = 0.0973, se = 0.0232, dof = 783, lim = 0.45)
R> add_bound_to_contour(bounds, bound_value = bound.values)
5. sensemakr for Stata
For Stata users, we have also developed a homonymous package sensemakr, which is available
for download on SSC. The package can be installed as follows:
ssc install sensemakr, replace all
The main function of the Stata package is sensemakr, which is called using the format:
sensemakr depvar covar [if] [in], treat(varlist)
For consistency with the syntax of the well-known regress command, the first variable is
assumed to be the dependent variable, while the subsequent treatment variable and covariates
can appear in any order. The required argument is treat(varlist), which indicates the
treatment variable for which sensitivity analysis is conducted.
By default, sensemakr displays sensitivity statistics for routine reporting, as well as a text
interpretation of the results. Specifically, the output table reports three key values: the partial
$R^2$ of the treatment with the outcome (R2yd.x), the robustness value (RV) required to reduce
the point estimate entirely to zero (if q = 1), and the RV beyond which the estimate would
no longer be statistically distinguishable from zero at the 5% level (q = 1, α = 0.05).
Should users wish to bound the plausible strength of unobserved confounders relative to
existing covariates, they can specify the option benchmark(varlist). benchmark() can accept
multiple covariates from the main specification, including time-series and factor variables.
If a benchmark is specified, sensemakr displays a bounds table. By default, this bounds
table displays estimates for a hypothetical confounder that is 1, 2, and 3 times as strong
as each benchmark covariate in explaining residual variation in both the treatment and the
outcome, as well as adjusted coefficient estimates for the treatment if such a confounder were
present. In addition to these bounds, the table displays treatment coefficients under an
“extreme scenario,” in which the confounder is assumed to have the same relationship to the
treatment (R2dz.x) as each benchmark, but explains all the residual variance of the outcome
(R2yz.dx=1).
5.1. Violence in Darfur
In this section, we briefly demonstrate how to replicate the analysis of Section 3, using the
dataset darfur.dta included with sensemakr for Stata.
Users can investigate the sensitivity of the directlyharmed treatment estimate, as well as
bounds using the benchmark covariate female, via the following call:
. use darfur.dta, clear
. sensemakr peacefactor directlyharmed age farmer herder pastvoted hhsize ///
female i.village_, treat(directlyharmed) benchmark(female)
Grouped benchmarks can be assessed using the gbenchmark(varlist) option. For instance,
the following code adds the joint benchmark female and pastvoted. Note that while the
options gbenchmark() and benchmark() can be used in tandem, only a single grouped
benchmark, consisting of all the variables specified in gbenchmark(), can be evaluated per
sensemakr call.
. sensemakr peacefactor directlyharmed age farmer herder pastvoted hhsize ///
female i.village_, treat(directlyharmed) benchmark(female) ///
gbenchmark(female pastvoted)
Users can modify the output using the following options:
• alpha(real): the significance level. The default is 0.05.
• gname(string): enables the user to specify a custom name for the group benchmark
specified in gbenchmark() (if used). By default, names for grouped benchmarks are
constructed by appending variables with ‘-’.
• kd(numlist) and ky(numlist): these arguments parameterize how many times stronger
the confounder is related to the treatment (kd) and to the outcome (ky), in comparison
to the benchmark covariate. By default, kd and ky are set to (1 2 3), so the command
provides estimates for a hypothetical confounder that is 1, 2, and 3 times as strong as each
benchmark covariate. If only option kd(numlist) is provided, ky will be set equal to
kd by default. If the user opts to specify both kd and ky, the two options must have the
same number of elements.
• latex(filename): saves a condensed version of the reporting outputs in filename.tex.
• noreduce: the default functionality assumes that confounders reduce the absolute value
of the estimate. If the user wishes to assume that confounders pull the estimate away
from zero, they can specify the noreduce flag.
• q(real): this option enables the user to specify what fraction of the effect estimate
would have to be explained away to be problematic. Defaults to 1, implying that a
reduction of 100% of the current effect estimate (true effect of 0) would be problematic.
• r2yz(numlist): allows the user to specify alternative scenarios for the extreme bounds
table. For instance, inputting (.5 .75) would display the expected treatment coefficients
if a confounder explained 50% and 75% of the residual variance of the outcome.
By default r2yz is set to 1.
• suppress: eliminates verbose description of sensitivity statistics.
Should users wish to design their own custom exports, all reported estimates are accessible
within the e() class.
Sensitivity contour plots of point estimates and t-values
Sensitivity plots for point estimates and t-values can be generated by appending the options
contourplot and tcontourplot, respectively, to the sensemakr call. The contour plots can
be customized with the following display options:
• clines: the number of contour lines to display on each plot. Defaults to 7.
• clim(numlist): the symmetric axis limits for the contour plots. Max range is (0 1).
In addition, advanced users can generate their own plots by accessing the raw contour data
within e(contourgrid) or e(tcontourgrid).
Sensitivity plots of extreme scenarios
Plots for extreme confounding scenarios are generated using the extremeplot option. By
default these plots consider confounding that explains 100%, 75%, and 50% of variation in
the residual outcome, producing three separate curves for each scenario. The extreme scenario
plot can be customized with the following display options:
• r2yz(numlist): enables the user to specify custom values for the extreme plot. Users
can specify a maximum of four custom values.
• elim(numlist): adjusts the x-axis limits of the plot. Max range is (0 1). Note that
limits for the y-axis are set automatically to include the critical value.
6. Discussion
We recognize that the tools we present here have the potential to be misused, and that it
may be tempting to use sensitivity analyses as “robustness tests” that should be “passed,”
in a way similar to the current abuse we observe, for instance, with statistical significance
testing (Ziliak and McCloskey 2008; Benjamin, Berger, Johannesson, Nosek, Wagenmakers,
Berk, Bollen, Brembs, Brown, Camerer et al. 2018; Amrhein and Greenland 2018). We thus
conclude the paper with brief remarks regarding the appropriate use of sensitivity analysis in
general and as applied to the tools provided by sensemakr in particular.
What sensitivity analyses can and cannot tell us
The quantities and graphics computed by sensemakr tell us what we need to be prepared to
believe in order to sustain that a given conclusion is not due to confounding. For instance,
in the applied example discussed in this paper, sensemakr reveals that, even in a worst case
scenario where the unobserved confounder explains all the residual variation of the outcome,
this unobserved confounder would need to be more than twice as strongly associated with
the treatment as the covariate female to fully explain away the observed estimated effect
of directlyharmed. This is a true quantitative statement that describes the strength of
confounding needed to overturn the research conclusions.
Note, however, that sensitivity analyses cannot tell us whether such a confounder is likely to
exist. The role of sensitivity analysis is, therefore, to discipline the discussion regarding the
causal interpretation of the effect estimate. Ultimately, this discussion needs to rely on domain
knowledge, and is beyond the realm of statistics alone. To illustrate using our example:
1. A causal interpretation of the research conclusion may be defended by claiming that,
given the way injuries (the “treatment”) occurred, the scope for targeting particular
types of individuals was quite limited; aircraft dropped makeshift and unguided bombs
and other objects over villages, and militia raided without concern for who they would
attack—the only known major exception to this, due to sexual assaults, was targeting
gender, which is also one of the most visually apparent characteristics of an individual.
Thus, a confounder twice as strong as female would be indeed surprising.
2. Similarly, for the causal conclusion to be persuasively dismissed, it does not suffice to
argue that some confounding might exist. Helpful skepticism must articulate why a
confounder that explains more than twice as much of the variation in treatment assignment
as the covariate female is plausible. Otherwise, the putative confounder cannot
logically account for all the observed association, even if it explains all or some large
portion of the residual outcome variation.
Robustness to confounding is thus claimed to the extent one agrees with the arguments
articulated in point 1, while the results can be deemed fragile insofar as alternative stories
meeting the requirements in point 2 can be offered. Both types of arguments need to rely on
domain knowledge as to how the attacks occurred and what could presumably influence the
outcome variable.
In sum, sensitivity analyses should not be used to obviate discussions about confounding by
engaging in automatic procedures; rather, they should be used to stimulate a disciplined,
quantitative argument about confounding, in which such statements are made and debated.
The tools provided by sensemakr allow users to easily and transparently report the sensitivity
of their causal inferences to unobserved confounding, thereby enabling this disciplined
discussion as to what can be concluded from imperfect observational studies.
References
Amrhein V, Greenland S (2018). “Remove, rather than redefine, statistical significance.”
Nature Human Behaviour, 2(1), 4.
Angrist JD, Pischke JS (2008). Mostly harmless econometrics: An empiricist’s companion.
Princeton University Press.
Benjamin DJ, Berger JO, Johannesson M, Nosek BA, Wagenmakers EJ, Berk R, Bollen KA,
Brembs B, Brown L, Camerer C, et al. (2018). “Redefine statistical significance.” Nature
Human Behaviour, 2(1), 6.
Blackwell M (2013). “A selection bias approach to sensitivity analysis for causal effects.”
Political Analysis, 22(2), 169–182.
Brumback BA, Hernán MA, Haneuse SJ, Robins JM (2004). “Sensitivity analyses for unmeasured
confounding assuming a marginal structural model for repeated measures.” Statistics
in Medicine, 23(5), 749–767.
Carnegie NB, Harada M, Hill JL (2016). “Assessing sensitivity to unmeasured confounding
using a simulated potential confounder.” Journal of Research on Educational Effectiveness,
9(3), 395–420.
Cinelli C, Ferwerda J, Hazlett C (2020a). sensemakr for Stata: sensitivity analysis tools for
OLS. Stata package version 1.0.
Cinelli C, Ferwerda J, Hazlett C (2020b). sensemakr: sensitivity analysis tools for OLS.
R package version 0.3, URL https://CRAN.R-project.org/package=sensemakr.
Cinelli C, Hazlett C (2020a). “Making Sense of Sensitivity: Extending Omitted Variable
Bias.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 82(1),
39–67. doi:10.1111/rssb.12348.
Cinelli C, Hazlett C (2020b). “An omitted variable bias framework for sensitivity analysis of
instrumental variables.” Working Paper.
Cinelli C, Kumor D, Chen B, Pearl J, Bareinboim E (2019). “Sensitivity Analysis of Linear
Structural Causal Models.” International Conference on Machine Learning.
Cornfield J, Haenszel W, Hammond EC, Lilienfeld AM, Shimkin MB, Wynder EL (1959).
“Smoking and lung cancer: recent evidence and a discussion of some questions.” Journal of
the National Cancer Institute, (23), 173–203.
Dorie V, Harada M, Carnegie NB, Hill J (2016). “A flexible, interpretable framework for
assessing sensitivity to unmeasured confounding.” Statistics in Medicine, 35(20),
3453–3470.
Frank KA (2000). “Impact of a confounding variable on a regression coefficient.” Sociological
Methods & Research, 29(2), 147–194.
Frank KA, Maroulis SJ, Duong MQ, Kelcey BM (2013). “What would it take to change an
inference? Using Rubin’s causal model to interpret the robustness of causal inferences.”
Educational Evaluation and Policy Analysis, 35(4), 437–460.
Frank KA, Sykes G, Anagnostopoulos D, Cannata M, Chard L, Krause A, McCrory R (2008).
“Does NBPTS certification affect the number of colleagues a teacher helps with instructional
matters?” Educational Evaluation and Policy Analysis, 30(1), 3–30.
Franks A, D’Amour A, Feller A (2019). “Flexible sensitivity analysis for observational studies
without observable implications.” Journal of the American Statistical Association,
(just-accepted), 1–38.
Hazlett C (2019). “Angry or Weary? How Violence Impacts Attitudes toward Peace among
Darfurian Refugees.” Journal of Conflict Resolution, p. 0022002719879217.
Hernán M, Robins J (2020). “Causal inference: What if.” Boca Raton: Chapman & Hall/CRC.
Hong G, Qin X, Yang F (2018). “Weighting-Based Sensitivity Analysis in Causal Mediation
Studies.” Journal of Educational and Behavioral Statistics, 43(1), 32–56.
Hosman CA, Hansen BB, Holland PW (2010). “The Sensitivity of Linear Regression Co-
efficients’ Confidence Limits to the Omission of a Confounder.” The Annals of Applied
Statistics, pp. 849–870.
Imai K, Keele L, Yamamoto T, et al. (2010). “Identification, inference and sensitivity analysis
for causal mediation effects.” Statistical Science, 25(1), 51–71.
Imbens GW (2003). “Sensitivity to exogeneity assumptions in program evaluation.” The
American Economic Review, 93(2), 126–132.
Imbens GW, Rubin DB (2015). Causal inference in statistics, social, and biomedical sciences.
Cambridge University Press.
Middleton JA, Scott MA, Diakow R, Hill JL (2016). “Bias amplification and bias unmasking.”
Political Analysis, 24(3), 307–323.
Oster E (2017). “Unobservable selection and coefficient stability: Theory and evidence.”
Journal of Business & Economic Statistics, pp. 1–18.
Pearl J (2009). Causality. Cambridge University Press.
Robins JM (1999). “Association, causation, and marginal structural models.” Synthese,
121(1), 151–179.
Rosenbaum PR (2002). “Observational studies.” In Observational studies, pp. 1–17. Springer.
Rosenbaum PR, Rubin DB (1983). “Assessing sensitivity to an unobserved binary covariate
in an observational study with binary outcome.” Journal of the Royal Statistical Society.
Series B (Methodological), pp. 212–218.
Vanderweele TJ, Arah OA (2011). “Bias formulas for sensitivity analysis of unmeasured
confounding for general outcomes, treatments, and confounders.” Epidemiology (Cambridge,
Mass.), 22(1), 42–52.
Ziliak S, McCloskey DN (2008). The cult of statistical significance: How the standard error
costs us jobs, justice, and lives. University of Michigan Press.
Affiliation:
Carlos Cinelli
University of California, Los Angeles
Department of Statistics, 8125 Math Sciences Building, Los Angeles, CA 90095, USA.
E-mail: carloscinelli@ucla.edu
URL: http://carloscinelli.com
Jeremy Ferwerda
Dartmouth College
Department of Government, Hanover, NH 03755
E-mail: jeremy.a.ferwerda@dartmouth.edu
URL: http://jeremyferwerda.com/
Chad Hazlett
University of California, Los Angeles
Department of Statistics, 8125 Math Sciences Building, Los Angeles, CA 90095, USA.
E-mail: chazlett@ucla.edu
URL: http://chadhazlett.com
Journal of Statistical Software http://www.jstatsoft.org/
published by the Foundation for Open Access Statistics http://www.foastat.org/
MMMMMM YYYY, Volume VV, Issue II Submitted: yyyy-mm-dd
doi:10.18637/jss.v000.i00 Accepted: yyyy-mm-dd