arXiv:2206.01249v1 [stat.ME] 2 Jun 2022
Single-World Intervention Graphs for Defining, Identifying,
and Communicating Estimands in Clinical Trials
Alex Ocampo1 and Jemar R. Bather2
1Novartis Pharma AG, Basel, Switzerland
2Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
Email: alex.ocampo@novartis.com
Abstract
Confusion often arises when attempting to articulate target estimand(s) of a clinical trial
in plain language. We aim to rectify this confusion by using a type of causal graph called the
Single-World Intervention Graph (SWIG) to provide a visual representation of the estimand
that can be effectively communicated to interdisciplinary stakeholders. These graphs not
only display estimands, but also illustrate the assumptions under which a causal estimand
is identifiable by presenting the graphical relationships between the treatment, intercurrent
events, and clinical outcomes. To demonstrate their usefulness in pharmaceutical research,
we present examples of SWIGs for various intercurrent event strategies specified in the ICH
E9(R1) addendum, as well as an example from a real-world clinical trial for chronic pain.
LaTeX code to generate all the SWIGs shown in this paper is made available. We advocate
that clinical trialists adopt SWIGs in their estimand discussions during the planning
stages of their studies.
Keywords: SWIG, Estimands, Clinical Trials, Causal Inference, Potential Outcomes
1 Introduction
The ICH E9(R1) addendum [1] was a pivotal achievement in aligning all stakeholders (e.g.,
statisticians, clinicians, regulators) on target treatment effect(s) of randomized clinical trials (RCTs).
This guidance came at an ideal time in statistical history as it aligned with work developed by
pioneers in causal inference. Their advances include methods to define, identify, and communicate
causal treatment effects from both observational data and randomized trials [2–8]. Causal
inference has allowed us to define when we can step beyond the mantra "correlation does not
imply causation". This is pertinent to the pharmaceutical industry, where demonstrating the
causal effect of a new treatment is the primary objective.
While the tools of causal inference (e.g., potential outcomes and causal graphs) are not
explicitly mentioned in the addendum, the language of causal inference appears throughout. For
example, Section A.3 states that research questions such as “how the outcome of treatment
compares to what would have happened to the same subjects under alternative treatment” are pivotal
for drug development and licensing. Another example can be found in Section A.3.2: “A scenario
is envisaged in which the intercurrent event would not occur.” These types of research questions
and hypothetical scenarios can be defined using potential outcomes, prompting the application of
causal inference methods. As a result, recent research has mathematically translated estimands
from the ICH E9(R1) addendum into the causal inference framework [9, 10]. This paper continues
this trend by providing stakeholders with an accessible tutorial on how to characterize various
estimands described in the addendum using causal graphs.
Causal graphs were popularized by Pearl [11, 12] in the form of Directed Acyclic Graphs
(DAGs). Figure 1 shows a classic example of a DAG where a treatment A has a causal effect on the
intercurrent event M and outcome Y. Here M also has a causal effect on Y. Through a graphical
representation, DAGs encode the research team’s assumptions about which variables causally affect
one another. DAGs make explicit the independencies (and conditional independencies) between
variables, which can be read off of the graph using the d-separation criterion [13]. For these
reasons, clinical trialists can use DAGs to effectively communicate the relationships among different
variables of interest in the study. As a result, DAGs can serve as an invaluable tool to inform
decision-making prior to performing statistical analyses. They also have the potential advantage
of uncovering novel research questions within established trials, thus mitigating trial expenses while
expanding therapeutic research.
[Figure: DAG with arrows A → M, M → Y, and A → Y]
Figure 1: An example of a DAG with a treatment A, intercurrent event M, and outcome Y
However, DAGs are not without limitation. Their primary shortcoming is that they do not
visualize potential outcomes [14], which are needed to define the estimands outlined in the ICH
E9(R1) addendum. In fact, as we will show in this paper, the DAG in Figure 1 could be associated
with multiple estimands in an RCT, depending on the strategy used for handling intercurrent events.
Luckily, Single-World Intervention Graphs (SWIGs) overcome this limitation by allowing DAGs to
incorporate potential outcomes. As with DAGs, independencies among variables can be identified
using d-separation. Using these properties, we demonstrate how to graphically represent common
target estimands in clinical trials for various intercurrent event strategies using SWIGs.
This paper proceeds as follows. Section 2 provides a brief overview of SWIGs. This gives
the reader the prerequisites to digest the example SWIGs in the subsequent sections. Section
3 provides examples of SWIGs for various intercurrent event strategies outlined in the ICH E9(R1)
addendum. Section 4 provides a case study of a SWIG in the context of a clinical trial for chronic
pain. Lastly, Section 5 concludes with a discussion.
2 Overview of SWIGs
In 2013, SWIGs were introduced by Richardson & Robins to unite the potential outcome and
graphical approaches to causal inference [15, 16]. Their paper makes clear why SWIGs are necessary
to do so, by considering the simplest DAG where treatment A has a causal effect on clinical outcome
Y,
[Figure: DAG with a single arrow A → Y]
Figure 2: Simplest DAG
where interest lies in the estimand for the average treatment effect:
\[ \Delta = E[Y(a=1)] - E[Y(a=0)]. \]
Here Y(a = 1) is the potential outcome if experimental treatment were taken and Y(a = 0)
is the potential outcome if reference treatment were taken. Note that all people in the population
have both potential outcomes Y(a = 1) and Y(a = 0), even though typically only one of them is
observed in practice. For a review of the potential outcomes framework, see [17–20]. Clearly there is a
disconnect between the DAG and the estimand, as the potential outcomes Y(a = 1) and Y(a = 0)
do not appear on the DAG. This is where SWIGs provide added value. To see how, consider the
following SWIGs associated with the DAG in Figure 2:
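[Figure: SWIG with split node A | a = 1 and arrow a = 1 → Y(a = 1)]
(a) SWIG for Experimental Treatment
[Figure: SWIG with split node A | a = 0 and arrow a = 0 → Y(a = 0)]
(b) SWIG for Reference Treatment
Figure 3: Two possible SWIGs for the DAG in Figure 2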
The SWIGs clearly display the potential outcomes Y(a = 1) and Y(a = 0). This is due to the
node-splitting transformation that took variable A on the DAG and split it into two components: A
and a. Splitting the node represents asking the same "what if" questions that the potential outcomes
pose: "What would the outcome be in a world where everyone took experimental treatment (a = 1)
or reference treatment (a = 0)?". Therefore, splitting the node on the DAG results in changing
the downstream variable from the observed outcome Y to its corresponding potential outcomes
Y(a = 1) and Y(a = 0). Since in this scenario there are two possible potential outcomes, there are
two possible SWIGs. In practice, we often collapse the possible SWIGs into one general SWIG for
any arbitrary a, which is shown in Figure 4.
[Figure: SWIG with split node A | a and arrow a → Y(a)]
Figure 4: General graph representing the two SWIGs in Figure 3
Most importantly, we can apply the same d-separation criterion used for DAGs to read
independencies (and conditional independencies) between the variables from the SWIGs as well. In
doing so we see that A is independent of both Y(a = 1) and Y(a = 0), i.e. A ⊥ Y(a). This is
because there is no backdoor path between A and the potential outcomes on the SWIG. Also, note
that the node-splitting blocks the forward path between A and Y(a). This is because we have now
postulated a single world where everyone took treatment a, so that the A we observed in practice
no longer affects Y(a).
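Reading independencies off a graph in this way can be sketched in code. The following is a minimal illustration rather than code from the paper: it checks d-separation using the standard ancestral moral graph construction, and the names `ancestors`, `d_separated`, and the `fixed` argument are illustrative choices. Treating the fixed half of a split node (e.g. a) as always blocking is an assumption of this sketch, intended to mirror how node-splitting severs the path from A.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen

def d_separated(parents, x, y, given=(), fixed=()):
    """Check whether x and y are d-separated given `given`.

    `parents` maps each node to a list of its parents. `fixed` lists the
    fixed halves of split nodes; as an assumption of this sketch they are
    treated like conditioned-on nodes, so paths through them are blocked.
    """
    blocked = set(given) | set(fixed)
    keep = ancestors(parents, {x, y} | blocked)
    # Moralise the ancestral graph: link each node to its parents, link
    # parents that share a child, and drop edge directions.
    adj = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pa, 2):
            adj[p].add(q)
            adj[q].add(p)
    # x and y are d-separated iff no path joins them once blocked nodes are removed.
    seen, stack = set(), [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        if v in seen or v in blocked:
            continue
        seen.add(v)
        stack.extend(adj[v] - seen)
    return True

dag_fig2 = {"Y": ["A"]}        # Figure 2: A -> Y
swig_fig4 = {"Y(a)": ["a"]}    # Figure 4: A | a -> Y(a)

print(d_separated(dag_fig2, "A", "Y"))                    # False
print(d_separated(swig_fig4, "A", "Y(a)", fixed=["a"]))   # True
```

On the DAG, A and Y are connected by the arrow between them, whereas on the SWIG the random half A has no remaining path to Y(a), which is the independence used for identification.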
Given these independencies, we can identify the estimand of the average treatment effect ∆
from the observed data as follows:
\begin{align*}
\Delta &= E[Y(a=1)] - E[Y(a=0)] \\
&= E[Y(a=1) \mid A=1] - E[Y(a=0) \mid A=0] && (Y(a) \perp A) \\
&= E[Y \mid A=1] - E[Y \mid A=0] && \text{(consistency)}
\end{align*}
The independency Y(a) ⊥ A, which was given from our assumed SWIG, allowed us to condition
on A in the expectation in line 2 of the equation. This is intuitive because if A is independent
of Y(a) then we would expect the mean of Y(a) to be the same in any subgroup of A as in
the whole population. In addition, we invoked the consistency assumption in the final line to
change the potential outcome Y(a) to the observed Y. The consistency assumption states that
Y = Y(a = 1)A + Y(a = 0)(1 − A) [21]. In layman’s terms, the consistency assumption states
that the observed outcome under observed treatment is consistent with the underlying potential
outcome for that same treatment. See Table 1 for an example of consistency between potential
outcomes and the observed data for five subjects - data that only God could see. Additionally, the
usual causal assumptions of SUTVA [22] and positivity [20] apply as well.
id   Y(a = 0)   Y(a = 1)   A   Y
1    60         52         1   52
2    45         37         0   45
3    46         38         1   38
4    75         67         1   67
5    21         15         0   21
Table 1: God’s Table
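The consistency assumption can be checked row by row against Table 1. The short sketch below is illustrative only (the names `gods_table` and `observed_outcome` are not from the paper): it rebuilds the observed column Y from the two potential outcomes and the received treatment A, and also averages the individual effects, a quantity available only from the God's-eye view.

```python
# Rows of Table 1: (id, Y(a=0), Y(a=1), A)
gods_table = [
    (1, 60, 52, 1),
    (2, 45, 37, 0),
    (3, 46, 38, 1),
    (4, 75, 67, 1),
    (5, 21, 15, 0),
]

def observed_outcome(y0, y1, a):
    """Consistency: Y = Y(a=1) * A + Y(a=0) * (1 - A)."""
    return y1 * a + y0 * (1 - a)

observed = [observed_outcome(y0, y1, a) for _, y0, y1, a in gods_table]

# Individual effects Y(a=1) - Y(a=0) require both potential outcomes,
# so this average cannot be computed from the observed columns A and Y alone.
true_ate = sum(y1 - y0 for _, y0, y1, _ in gods_table) / len(gods_table)

print(observed)   # [52, 45, 38, 67, 21]
print(true_ate)   # -7.6
```

The reconstructed values match the Y column of the table, which is all that consistency asserts; whether the observed arms also recover the average effect is the separate question answered by the independence Y(a) ⊥ A.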
Like DAGs, SWIGs encode one’s assumptions about the process on which we are collecting
data. It may be possible that two statisticians are involved in the same trial, but draw different
SWIGs because they have different causal assumptions about the real-world process. There may
be differences in the two statisticians' choices of which nodes to split as well. However, the
node-splitting decisions reflect interest in different estimands (or clinical questions) rather than causal
assumptions about variables. This motivates the following section, which demonstrates SWIGs in
the context of the ICH E9(R1) Addendum.
3 Intercurrent Event Strategies from the ICH E9(R1) Addendum in SWIGs
We now explore visualizing various strategies for handling intercurrent events (IEs) using SWIGs.
There are infinitely many ways to explore such estimands. Important work has started taking place in
recent years to characterize specific clinical scenarios using SWIGs that incorporate time, multiple
outcome measurements, and competing risks on the graph [23–26]. We have chosen to focus on
simple cases in this section for pedagogical reasons in hopes that readers can afterwards extend
the basic principles detailed in this paper to their own trial work.
In all subsequent examples, A is a binary treatment indicator that can be set to a ∈ {0, 1}. We
will consider a world where we intervene on A; consequently, this node will always be split in all
subsequent SWIGs and all descendants of a will take on their potential outcome representations.
Also, because we are operating in the context of a randomized trial, A will have no parents in
all SWIGs presented; treatment is purely a consequence of randomization and nothing else. The
potential outcome M(a) represents the value of the intercurrent event in a world where treatment
a is taken. We consider M(a) = 1 if the intercurrent event occurs and M(a) = 0 if not. Lastly,
Y(a) is the clinical outcome of interest in a world where treatment a is taken.
3.1 Treatment Policy
[Figure: SWIG with split node A | a and arrows a → M(a), M(a) → Y(a), and a → Y(a)]
Figure 5: SWIG for an ITT Estimand
In the treatment policy, or intention to treat (ITT), strategy the intercurrent event is ignored.
The corresponding SWIG (Figure 5) reflects this, as there is no node-splitting taking place on
M(a). The treatment effect (i.e. estimand) of interest measures a combination of the direct effect
of the drug on the outcome and the indirect effect mediated through the intercurrent event. In
the SWIG above, we can see that Y(a) is independent of A, i.e. there are no paths between A
and Y(a). Therefore, we can condition on A in the expectations and then use the consistency
assumption to identify the estimand from the observed data as follows:
\begin{align*}
\Delta_{ITT} &= E[Y(1)] - E[Y(0)] \\
&= E[Y(1) \mid A=1] - E[Y(0) \mid A=0] && (Y(a) \perp A) \\
&= E[Y \mid A=1] - E[Y \mid A=0] && \text{(consistency)}
\end{align*}
Identification is complete when the estimand of interest can be expressed as a function of
the observed data alone [27, 28] - i.e., no potential outcome variables remain in the expression.
Note that for the ITT estimand above, the equality E[Y(1)] − E[Y(0)] = E[Y(1, M(1))] −
E[Y(0, M(0))] holds, which clearly demonstrates that the intercurrent event is ignored in this
context. Due to this fact, it would actually be possible to collapse this SWIG into one identical
to Figure 4 and retain the necessary independencies for identification; however, we recommend
including even those IEs that are ignored on the graph because this helps communicate that an
ITT strategy has been adopted.
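A small simulation can illustrate this identification result. The sketch below is hypothetical throughout: the function name `simulate_itt`, the event probabilities, and the effect sizes are assumptions chosen for illustration, not values from any trial. Each subject carries both potential intercurrent events M(a) and both potential outcomes Y(a) = Y(a, M(a)); treatment is randomised, and only Y(A) enters the observed arm means.

```python
import random

def simulate_itt(n=100_000, seed=1):
    rng = random.Random(seed)
    sum_y = {0: 0.0, 1: 0.0}   # observed outcome totals by arm
    count = {0: 0, 1: 0}
    true_effect = 0.0
    for _ in range(n):
        # Potential IEs M(a): assumed probability 0.4 under a=0 and 0.2 under a=1.
        m = {a: int(rng.random() < 0.4 - 0.2 * a) for a in (0, 1)}
        noise = rng.gauss(0, 1)
        # Potential outcomes Y(a) = Y(a, M(a)): direct effect of a plus effect of the IE.
        y = {a: 1.0 * a - 2.0 * m[a] + noise for a in (0, 1)}
        true_effect += y[1] - y[0]          # God's-eye individual effect
        a_obs = int(rng.random() < 0.5)     # randomisation: A has no parents
        sum_y[a_obs] += y[a_obs]            # consistency: Y = Y(A)
        count[a_obs] += 1
    estimate = sum_y[1] / count[1] - sum_y[0] / count[0]
    return true_effect / n, estimate

true_effect, estimate = simulate_itt()
print(true_effect, estimate)
```

Under these assumed parameters the population ITT effect is 1 − 2(0.2 − 0.4) = 1.4, combining the direct effect with the part mediated through the intercurrent event, and the difference in observed arm means should land close to it without any adjustment for M.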
3.2 Hypothetical
The hypothetical strategy envisages a counterfactual scenario - e.g., one in which the intercurrent
event would not occur. A SWIG can make this explicit by splitting the node on the intercurrent
event. Ergo, the "single world" that the SWIG depicts is one where, in addition to treatment
being set to a, the intercurrent event is set to m= 0. As a consequence of the node-splitting, the
potential outcomes of interest are now Y(a, m = 0). This hypothetical estimand can be written
mathematically as:
\[ \Delta_{hypo} = E[Y(a=1, m=0)] - E[Y(a=0, m=0)] \]
We present two examples of SWIGs for a hypothetical estimand. In this first example (Figure
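6), there is an unobserved confounder U between M(a) and Y(a, m). We indicate that this variable
is unobserved by shading in the node.
[Figure: SWIG with split nodes A | a and M(a) | m, outcome Y(a, m), and a shaded (unobserved) node U with arrows into M(a) and Y(a, m)]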
Figure 6: SWIG for a hypothetical estimand with an unobserved confounder
As we can see, Y(a, m) is not independent of M(a) because of the backdoor path through U. In other
words, when two variables share a common cause they are associated. Therefore,
\[ E[Y(a=1, m=0)] - E[Y(a=0, m=0)] \neq E[Y \mid A=1, M=0] - E[Y \mid A=0, M=0] \]
Our attempts to identify the estimand fail because we cannot condition on the observed M in
the expectations. Consequently, we cannot express this causal contrast of potential outcomes as
a function of the observed data. This example highlights the extra layer of information that the
SWIG adds: the causal assumptions that determine whether identification of the estimand using
the observed data is possible. SWIGs go one step beyond defining estimands in plain language or
with tables.
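The failure of the naive contrast under unobserved confounding can be illustrated numerically. The following sketch is hypothetical: the function name `naive_vs_truth` and all probabilities and coefficients are assumptions made for illustration. A binary U raises both the chance of the intercurrent event and the outcome, and the analyst, who cannot see U, compares arm means among subjects with M = 0.

```python
import random

def naive_vs_truth(n=100_000, seed=2):
    rng = random.Random(seed)
    total = {0: 0.0, 1: 0.0}
    count = {0: 0, 1: 0}
    for _ in range(n):
        u = int(rng.random() < 0.5)                          # unobserved common cause
        a = int(rng.random() < 0.5)                          # randomised treatment
        m = int(rng.random() < 0.1 + 0.3 * a + 0.4 * u)      # observed IE, M = M(A)
        y = 1.0 * a - 2.0 * m + 3.0 * u + rng.gauss(0, 1)    # observed Y = Y(A, M)
        if m == 0:                                           # naive: keep subjects without the IE
            total[a] += y
            count[a] += 1
    naive = total[1] / count[1] - total[0] / count[0]
    truth = 1.0   # E[Y(1, m=0)] - E[Y(0, m=0)] under the assumed outcome model
    return naive, truth

naive, truth = naive_vs_truth()
print(naive, truth)
```

Because the distribution of U among subjects with M = 0 differs between arms in this setup, the naive contrast settles near 0.68 rather than the hypothetical effect of 1, even though treatment itself is randomised.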
In the second example, we assume to have observed a rich enough set of confounders between
M(a) and Y(a, m) to obtain the required independence to identify the causal effect, i.e. the
estimand, of interest. We denote conditioning on a variable, or blocking the causal pathway, by
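drawing a square box around this variable on the graph (Figure 7).
[Figure: SWIG with split nodes A | a and M(a) | m, outcome Y(a, m), and a boxed (conditioned-on) node C with arrows into M(a) and Y(a, m)]
Figure 7: SWIG for a hypothetical estimand with no unobserved confounding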
The SWIG shows that we have observed and controlled for C and can therefore induce the
conditional independence Y(a, m) ⊥ M(a) | C. This allows us to reach identification by stratifying,
or integrating, across levels of C:
\begin{align*}
E[Y(a, m)] &= \sum_c E[Y(a, m) \mid C=c]\, P(C=c) \\
&= \sum_c E[Y(a, m) \mid C=c, M(a)=m]\, P(C=c) && (Y(a, m) \perp M(a) \mid C) \\
&= \sum_c E[Y(a, m) \mid C=c, M(a)=m, A=a]\, P(C=c) && (Y(a, m) \perp A) \\
&= \sum_c E[Y \mid C=c, M=m, A=a]\, P(C=c) && \text{(consistency)}
\end{align*}
Substituting this into \Delta_{hypo}:
\begin{align*}
\Delta_{hypo} &= E[Y(a=1, m=0)] - E[Y(a=0, m=0)] \\
&= \sum_c E[Y \mid C=c, M=0, A=1]\, P(C=c) - \sum_c E[Y \mid C=c, M=0, A=0]\, P(C=c)
\end{align*}
Therefore, we have demonstrated that the hypothetical estimand can be identified from the
observed data under the scenario defined by the SWIG in Figure 7. In this example, and in most
hypothetical scenarios, we are making the assumption of no unobserved confounding between the
IE and outcome. This is why, oftentimes, the sister assumption, missing at random (MAR) [29], is
utilized for imputing these "missing" potential outcomes using an imputation model that leverages
baseline covariates C. Of course, these are untestable assumptions, which highlights the importance
of conducting sensitivity analyses when considering a hypothetical estimand. Lastly, those
readers familiar with the mediation literature in causal inference will note that many hypothetical
estimands are examples of controlled direct effects [30].
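The stratification formula for the hypothetical estimand translates directly into a plug-in estimator. The sketch below is a minimal illustration under stated assumptions: C is discrete, each row is a dictionary with the hypothetical keys "C", "A", "M", and "Y", and the function names are illustrative. It is not the authors' implementation, and in practice a regression or imputation model would usually replace the raw cell means.

```python
from collections import defaultdict

def standardised_mean(rows, a, m):
    """Plug-in estimate of sum_c E[Y | C=c, M=m, A=a] P(C=c)."""
    n = len(rows)
    c_counts = defaultdict(int)    # rows per level of C (for P(C=c))
    y_sum = defaultdict(float)     # outcome totals within (C=c, A=a, M=m)
    y_n = defaultdict(int)         # row counts within (C=c, A=a, M=m)
    for r in rows:
        c_counts[r["C"]] += 1
        if r["A"] == a and r["M"] == m:
            y_sum[r["C"]] += r["Y"]
            y_n[r["C"]] += 1
    total = 0.0
    for c, n_c in c_counts.items():
        if y_n[c] == 0:
            # Positivity is needed: every stratum of C must contain such rows.
            raise ValueError(f"no rows with C={c}, A={a}, M={m}")
        total += (y_sum[c] / y_n[c]) * (n_c / n)
    return total

def hypothetical_effect(rows):
    """Estimate E[Y(a=1, m=0)] - E[Y(a=0, m=0)] by standardising over C."""
    return standardised_mean(rows, a=1, m=0) - standardised_mean(rows, a=0, m=0)

# Toy data, invented purely to exercise the functions.
rows = [
    {"C": 0, "A": 1, "M": 0, "Y": 4.0},
    {"C": 0, "A": 0, "M": 0, "Y": 2.0},
    {"C": 0, "A": 1, "M": 1, "Y": 0.0},
    {"C": 0, "A": 0, "M": 0, "Y": 4.0},
    {"C": 1, "A": 1, "M": 0, "Y": 10.0},
    {"C": 1, "A": 0, "M": 0, "Y": 6.0},
]
effect = hypothetical_effect(rows)
print(effect)
```

Note that the weights P(C = c) come from all rows, not only those with M = 0, which is what distinguishes standardisation from simply comparing subjects who did not experience the intercurrent event.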
Of course, it must be plausible that one could in theory imagine such a world. It is possible
to postulate a world in which everyone in your RCT population either receives treatment or does
not receive treatment. However, the ability to conceive of a world in which the IE never happens
will also depend on the specific IE at hand. Depending on the clinical context, it may be plausible
to consider a world in which rescue medication is not taken or treatment discontinuation due to
administrative dropout did not occur. In the event of active drug side effects, this may be less
plausible. In the case of death, this is highly implausible.
3.3 Composite
Composite estimands incorporate the IE into the variable definition. In the simplest case of a
binary outcome Y(a) ∈ {0, 1}, the composite variable is defined as:
\[
U(a) =
\begin{cases}
Y(a) & \text{if } M(a) = 0 \\
0 & \text{if } M(a) = 1
\end{cases}
\]
This corresponds to treating occurrence of an IE as a failure. Moreover, there exist strategies
for creating composite estimands for continuous endpoints [31, 32]. The composite strategy is often
employed in the case where the IE M(a) under consideration is death, and hence the potential
outcome Y(a) doesn’t exist when the IE occurs (i.e. M(a) = 1). We characterize such an example
in Figure 8, which assumes no causal effect of the IE on the outcome and omits the arrow from M(a)
to Y(a). An example of a SWIG for such a composite estimand could then be:
[Figure: SWIG with split node A | a, nodes M(a) and Y(a) with no arrow between them, and composite node U(a) with arrows from M(a) and Y(a)]
Figure 8: SWIG for a composite estimand
It is straightforward to see on the SWIG that U(a) is a combination of the intercurrent event M(a)
and the outcome Y(a). It is also clear from the SWIG that A ⊥ U(a), allowing us to identify the
composite estimand:
\begin{align*}
\Delta_C &= E[U(1)] - E[U(0)] \\
&= E[U(1) \mid A=1] - E[U(0) \mid A=0] && (U(a) \perp A) \\
&= E[U \mid A=1] - E[U \mid A=0] && \text{(by consistency)}
\end{align*}
In the above example, we do not assume that the IE has a causal effect on the outcome - i.e.
no arrow from M(a) to Y(a). This would be the case if the IE were death for example. There are
of course situations where interest lies in a composite estimand, but there is an arrow from M(a)
to Y(a). For example, this would unarguably be the case where M is treatment discontinuation.
Under such scenarios, the composite estimand remains identifiable. In fact, much like the ITT
estimand it would remain identifiable even if there are unobserved confounders between M(a) and
Y(a) because no matter what, the required independence U(a) ⊥ A remains.
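The composite variable and its identified contrast are simple enough to write out. The sketch below is illustrative: the function names and the toy rows are invented, and a missing outcome after the IE (for example after death) is represented by `None` as an assumption of the example.

```python
def composite(y, m):
    """U = Y if the IE did not occur (M = 0), and 0 (failure) if it did (M = 1)."""
    return y if m == 0 else 0

def composite_effect(rows):
    """E[U | A=1] - E[U | A=0]; each row is (A, M, Y) with binary Y, or None after the IE."""
    mean = {}
    for arm in (0, 1):
        u = [composite(y, m) for a, m, y in rows if a == arm]
        mean[arm] = sum(u) / len(u)
    return mean[1] - mean[0]

# Toy trial data: (A, M, Y). Y is undefined (None) whenever M = 1.
rows = [
    (1, 0, 1), (1, 0, 1), (1, 0, 0), (1, 1, None),
    (0, 0, 1), (0, 0, 0), (0, 1, None), (0, 1, None),
]
effect = composite_effect(rows)
print(effect)   # 0.25
```

Every randomised subject contributes a value of U, so no subject is dropped and no assumption about the relationship between M(a) and Y(a) is needed for the arm comparison.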
3.4 Principal Stratum
In the Principal Stratum (PS) strategy, interest lies in a sub-population of patients defined by
whether or not an IE would occur under a particular treatment. We consider one such example:
the population of patients who would not have the IE under active treatment, i.e. M(a= 1) = 0.
This causal contrast, or estimand, of interest is therefore:
\[ \Delta_{PS} = E[Y(a=1) \mid M(a=1)=0] - E[Y(a=0) \mid M(a=1)=0] \]
Consider first the following SWIG, which depicts a world in which every patient took the
treatment and conditions on the stratum of interest.
[Figure: SWIG with split node A | a = 1, conditioned-on node M(a = 1) = 0, and outcome Y(a = 1)]
Figure 9: SWIG where active treatment is given for those who would not have the IE under
treatment (i.e. conditioning on M(a= 1) = 0)
The SWIG reveals two important facts. Firstly, as is true in all randomized trials, A ⊥ Y(a).
Secondly, and most importantly, this makes clear that we are interested in the direct effect of
treatment on outcome, as sub-setting to the principal stratum M(a = 1) = 0 blocks the effect
of treatment mediated through the IE. Using this SWIG, we can identify the expected value of
Y(a = 1) in our principal stratum, i.e. the first term in \Delta_{PS}:
\begin{align*}
E[Y(a=1) \mid M(a=1)=0] &= E[Y(a=1) \mid M(a=1)=0, A=1] && (Y(a) \perp A) \\
&= E[Y \mid M=0, A=1] && \text{(by consistency)}
\end{align*}
This is straightforward, as before. The difficulty in PS estimands comes into play when trying
to identify the second term in \Delta_{PS} from the observed data, i.e. E[Y(a = 0) | M(a = 1) = 0]. If we
consider the SWIG for the untreated world
[Figure: SWIG with split node A | a = 0 and nodes M(a = 0) and Y(a = 0)]
Figure 10: SWIG where reference treatment is given for those who would not have the IE under
treatment M(a = 1) = 0
it becomes evident that identifying the second term requires a cross-world assumption. We
want to know the expected value of Y(a = 0) in the stratum M(a = 1) = 0, but these variables
do not appear on the same SWIG, or in the same world. This is a fundamental challenge of PS
estimands and SWIGs help to make this challenge explicit when communicating the estimand. Of
course, this should not downplay the clinical importance of trying to understand PS estimands and
has not hindered statisticians' attempts to estimate them [33]. The SWIG helps make explicit the
assumptions needed to do so, and then allows us to begin discussing how plausible PS assumptions
may be.
Given the independencies presented on the two SWIGs, consider trying to identify the second
term of this principal stratum estimand from the observed data:
\begin{align*}
E[Y(a=0) \mid M(a=1)=0] &= E[Y(a=0) \mid M(a=1)=0, A=0] && (Y(a) \perp A) \\
&= E[Y \mid M(a=1)=0, A=0] && \text{(by consistency)}
\end{align*}
The challenge arises in identifying the stratum of patients in the reference arm of the trial who
would not have had the IE had they taken experimental treatment, something we never see in
practice. The SWIG can help us to communicate this.
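The cross-world problem can be made concrete with simulated God's-eye data. The sketch below is hypothetical: the function name `principal_stratum_demo`, the latent frailty variable, and all numeric values are assumptions chosen for illustration. Each subject carries M(a = 1), M(a = 0), Y(a = 0), and Y(a = 1), so the true principal stratum effect can be computed directly and compared with a tempting observed-data shortcut that contrasts subjects with M = 0 in each arm.

```python
import random

def principal_stratum_demo(n=100_000, seed=3):
    rng = random.Random(seed)
    ps_sum, ps_n = 0.0, 0       # God's-eye view: subjects with M(a=1) = 0
    trt_sum, trt_n = 0.0, 0     # observed: A = 1 and M = 0
    ctl_sum, ctl_n = 0.0, 0     # observed: A = 0 and M = 0
    for _ in range(n):
        frail = int(rng.random() < 0.5)               # latent prognostic factor
        m1 = int(rng.random() < 0.1 + 0.6 * frail)    # M(a=1)
        m0 = int(rng.random() < 0.1 + 0.2 * frail)    # M(a=0)
        y0 = 3.0 * frail + rng.gauss(0, 1)            # Y(a=0)
        y1 = y0 + 2.0                                 # Y(a=1): constant effect of 2
        if m1 == 0:
            ps_sum += y1 - y0
            ps_n += 1
        a = int(rng.random() < 0.5)                   # randomised treatment
        if a == 1 and m1 == 0:                        # stratum membership is visible here
            trt_sum += y1
            trt_n += 1
        if a == 0 and m0 == 0:                        # only M(a=0) is visible in this arm
            ctl_sum += y0
            ctl_n += 1
    true_ps = ps_sum / ps_n
    naive = trt_sum / trt_n - ctl_sum / ctl_n
    return true_ps, naive

true_ps, naive = principal_stratum_demo()
print(true_ps, naive)
```

In this setup the true effect in the stratum M(a = 1) = 0 is 2, while the shortcut settles near 1.44, because controls with M(a = 0) = 0 are not the same patients as controls with M(a = 1) = 0. The treated-arm term is recovered correctly; only the reference-arm term requires information from another world.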
4 Example from a Clinical Trial for Chronic Pain
We now outline an example of how one can use a SWIG to characterize an estimand in a Phase
II clinical trial for chronic pain. Callegari et al. [34] provide a detailed exposition of the various
attributes necessary to define a slew of clinically relevant target estimands in this context. Herein,
we focus on the primary estimand outlined in their paper. We demonstrate how a SWIG can be
used to define this estimand. This SWIG then allows the independencies between variables to be
read off this graph to demonstrate identification of the estimand. This case study illustrates how
a SWIG can aid in communicating a target estimand from a clinical trial.
To construct the graph, consider the following notation for the relevant variables used to
characterize the estimand so that we can concisely place them on the graph. As above, we denote
A as the random variable representing treatment assignment. In the context of the chronic pain
trial outlined in Callegari et al., this variable can take on two values: A = 1 for active treatment and
A = 0 for placebo. However, because the SWIGs are equivalent under either treatment, we only
need to draw one SWIG wherein we intervene on A and set the value to an arbitrary a ∈ {0, 1}. We
denote the relevant intercurrent events as:
M1 = Intake of short-acting pain relief medication
M2 = Treatment discontinuation due to Adverse Event, Loss of Efficacy, or intake of prohibited medications
M3 = Change of dose of allowed concomitant medication for pain
M4 = Treatment discontinuation due to Administrative or Other reasons
Their primary estimand definition outlines the IE strategy used for each of the intercurrent
events. For M1 and M2, a treatment policy strategy is adopted. That is, the data post IE
are deemed relevant for these IEs and are collected and used in the analysis. For IEs M3 and M4, a
hypothetical strategy is most relevant. Data post IE, even if available, are discarded and treated
as missing. For these IEs, what would have happened had the IEs not occurred (i.e. M3 = M4 = 0)
is most clinically relevant. The potential outcome of interest is Y(a, m3 = 0, m4 = 0), representing
weekly mean pain score change from baseline using the 11-point numerical rating scale that would
be observed had treatment a been taken, and had intercurrent events M3 and M4 not occurred.
Thus, the estimand of interest is:
\[ \Delta_{RCT} = E[Y(a=1, m_3=0, m_4=0)] - E[Y(a=0, m_3=0, m_4=0)] \]
[Figure: SWIG with split nodes A | a, M3(a) | m3, and M4 | m4, nodes M1(a) and M2(a), outcome Y(a, m3, m4), and covariates C]
Figure 11: SWIG for a Clinical Trial in Chronic Pain
It is clear from the SWIG that A, M3, and M4 are d-separated from the potential outcome
Y(a, m3, m4) given that we can condition on C. These are the necessary independencies for
identification. The mathematics for identifying this estimand are shown in supplementary information 1.
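For orientation, the shape of such an identification argument can be sketched by extending the steps of Section 3.2 to two intercurrent events handled hypothetically, written generically here as M_j and M_k. This is our own sketch and not the derivation in the supplementary information: it assumes that C is discrete, that Y(a, m_j, m_k) is jointly independent of (A, M_j(a), M_k(a)) given C, that M_k is not affected by M_j, and that positivity holds in every stratum of C.

```latex
\begin{align*}
E[Y(a, m_j, m_k)]
  &= \sum_c E[Y(a, m_j, m_k) \mid C = c]\, P(C = c) \\
  &= \sum_c E[Y(a, m_j, m_k) \mid C = c, A = a, M_j(a) = m_j, M_k(a) = m_k]\, P(C = c)
     && \text{(assumed independencies)} \\
  &= \sum_c E[Y \mid C = c, A = a, M_j = m_j, M_k = m_k]\, P(C = c)
     && \text{(consistency)}
\end{align*}
```

Setting m_j = m_k = 0 and taking the difference between a = 1 and a = 0 would then give an observed-data expression for the estimand, provided the stated assumptions are judged plausible for the trial.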
5 Discussion
It is easy to get lost in the jargon of estimands; SWIGs provide an escape. These causal graphs
allow a quick view of the clinically relevant treatment effect. Such visuals may appeal to many,
considering that roughly 65% of the population are visual learners [35]. Time is of the essence to
get effective treatments to those in need; tools like SWIGs help hasten estimand discussions.
Additionally, SWIGs can limit mistakes and misalignment between stakeholders as they are succinct
and clear.
SWIGs encode clinical assumptions about how treatment, IEs, confounders, and outcomes all
causally interact with one another. As seen from the examples in this paper, these assumptions
govern whether or not an estimand is identifiable, and thus estimable from the observed data. IEs
often embed observational studies within the RCT, and SWIGs help statisticians understand how
to account for the resulting biases. We recommend clinical trialists adopt SWIGs in preliminary planning
discussions when defining estimands. The visuals are succinct and the rules of identification are
simple to explain to interdisciplinary teams.
Furthermore, defining estimands explicitly using the mathematical notation of potential
outcomes removes ambiguity. For instance, in the case of the clinical trial example in Section 4,
the proposed analysis leads to confusion about what potential outcome is being targeted. The
authors consider treatment discontinuation due to adverse events, loss of efficacy, and intake of
prohibited medications as unfavorable events in their estimand definition. In these scenarios, they
propose using retrieved drop out (RDO) data in the analysis, which is aligned with a treatment
policy strategy. That would imply interest in potential outcome Y(a = 1, m3 = 0, m4 = 0),
which is to ignore M2 as we did in Section 4. However, should RDO data be unavailable, the
authors suggest imputing outcomes based on the placebo data, which rather implies interest in
Y(a = 0, m3 = 0, m4 = 0) among the principal stratum M2(a = 1) = 1. Perhaps it is believed that
these potential outcome distributions are equivalent. Nevertheless, the clinically relevant estimand
needs to be made clear. Furthermore, their primary estimand definition written in section 3.1 does
not align with their characterization of the estimand in Fig 1. This is because the latter proposes
a different IE strategy for M2in the placebo arm, which was not mentioned in section 3.1. The
SWIG removes these ambiguities from the estimand definition by clearly showing the potential
outcome of interest on the graph.
Our work is limited in a number of ways. Firstly, we ignore the "while on treatment" IE
strategy in our presentation of SWIGs because of our own bias that this type of estimand is
difficult to interpret causally. Additionally, previous work detailing causal diagrams in the estimand
framework has added another mediating variable between the IE and outcome representing the
intervention taken post IE [9], but we find this redundant as their relationship is often deterministic
and can be considered jointly in one variable. This may be a simplification on our part. Should
the distinction prove valuable, this or any other variables should be added to the graph. Our
main goal herein is to show the utility of SWIGs in characterizing estimands. In practice, many
nuances specific to the trial will reveal themselves, making the SWIGs more complicated than
those presented in this paper. Lastly, we used differences in expectations throughout to define our
estimands for simplicity, but causal estimands can be contrasts (e.g. ratios, etc.) of any marginal
statistical functional (e.g. hazards, odds, etc.).
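As a brief illustration of this last point, for a binary outcome the same potential outcome means can be contrasted on a ratio scale. The labels \Delta_{RR} and \Delta_{OR} below are our own notation, introduced only for this example.

```latex
\[
\Delta_{RR} = \frac{E[Y(a=1)]}{E[Y(a=0)]},
\qquad
\Delta_{OR} = \frac{E[Y(a=1)] \,/\, \bigl(1 - E[Y(a=1)]\bigr)}
                   {E[Y(a=0)] \,/\, \bigl(1 - E[Y(a=0)]\bigr)}
\]
```

The identification arguments are unchanged, since they operate on each E[Y(a)] separately: under Y(a) ⊥ A and consistency, each term is replaced by E[Y | A = a] before the contrast is formed.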
It is important for clinical trialists–especially statisticians–to learn to speak the language of
causal inference in order to adopt the vast methodological progress made in this area over the
last few decades. We believe incorporating SWIGs into estimand discussions is one such step in
the right direction. To enable our readers to start using SWIGs, all LaTeX code to generate the
graphs in this paper is included in supporting information 2.
References
1. Committee for Human Medicinal Products. ICH E9 addendum on estimands and sensitivity
analysis in clinical trials to the guideline on statistical principles for clinical trials, Step 2b.
European Medicines Evaluation Agency (London, 2017).
2. Robins, J. M., Hernan, M. A. & Brumback, B. Marginal structural models and causal inference
in epidemiology. Epidemiology 11, 550–560 (2000).
3. Robins, J. M., Rotnitzky, A. & Zhao, L. P. Estimation of regression coefficients when some
regressors are not always observed. Journal of the American Statistical Association 89, 846–
866 (1994).
4. Pearl, J. Causal diagrams for empirical research. Biometrika 82, 669–688 (1995).
5. Hernan, M. A., Hernandez-Diaz, S. & Robins, J. M. A structural approach to selection bias.
Epidemiology, 615–625 (2004).
6. Robins, J. M. & Greenland, S. Identifiability and exchangeability for direct and indirect
effects. Epidemiology, 143–155 (1992).
7. Hernan, M. & Robins, J. Causal Inference: What if. Boca Raton: Chapman & Hall/CRC
(2020).
8. Hernan, M. A. Methods of Public Health Research—Strengthening Causal Inference from
Observational Data. New England Journal of Medicine 385, 1345–1348 (2021).
9. Lipkovich, I., Ratitch, B. & Mallinckrodt, C. H. Causal inference and estimands in clinical
trials. Statistics in Biopharmaceutical Research 12, 54–67 (2020).
10. Mallinckrodt, C., Molenberghs, G., Lipkovich, I. & Ratitch, B. Estimands, estimators and
sensitivity analysis in clinical trials (CRC Press, 2019).
11. Pearl, J. Causality (Cambridge University Press, 2009).
12. Greenland, S., Pearl, J. & Robins, J. M. Causal diagrams for epidemiologic research.
Epidemiology, 37–48 (1999).
13. Geiger, D., Verma, T. & Pearl, J. in Machine Intelligence and Pattern Recognition 139–148
(Elsevier, 1990).
14. Splawa-Neyman, J., Dabrowska, D. M. & Speed, T. On the application of probability theory
to agricultural experiments. Essay on principles. Section 9. Statistical Science, 465–472 (1990).
15. Richardson, T. S. & Robins, J. M. Single world intervention graphs (SWIGs): A unification
of the counterfactual and graphical approaches to causality. Center for the Statistics and the
Social Sciences, University of Washington Series. Working Paper 128, 2013 (2013).
16. Richardson, T. S. & Robins, J. M. Single world intervention graphs: a primer in Second UAI
workshop on causal structure learning, Bellevue, Washington (2013).
17. Rosenbaum, P. R. & Rubin, D. B. The central role of the propensity score in observational
studies for causal effects. Biometrika 70, 41–55 (1983).
18. Rubin, D. B. Bayesian inference for causal effects: The role of randomization. The Annals of
Statistics, 34–58 (1978).
19. Holland, P. W. Statistics and causal inference. Journal of the American Statistical Association
81, 945–960 (1986).
20. Robins, J. A new approach to causal inference in mortality studies with a sustained exposure
period—application to control of the healthy worker survivor effect. Mathematical Modelling
7, 1393–1512 (1986).
21. Cole, S. R. & Frangakis, C. E. The consistency statement in causal inference: a definition or
an assumption? Epidemiology 20, 3–5 (2009).
22. Rubin, D. B. Randomization analysis of experimental data: The Fisher randomization test
comment. Journal of the American Statistical Association 75, 591–593 (1980).
23. Young, J. G., Stensrud, M. J., Tchetgen Tchetgen, E. J. & Hernan, M. A. A causal framework
for classical statistical estimands in failure-time settings with competing events. Statistics in
Medicine 39, 1199–1236 (2020).
24. Stensrud, M. J. & Dukes, O. Translating questions to estimands in randomized clinical trials
with intercurrent events. Statistics in Medicine (2022).
25. Parra, C. O., Daniel, R. M. & Bartlett, J. W. Hypothetical estimands in clinical trials: a
unification of causal inference and missing data methods. arXiv preprint arXiv:2107.04392
(2021).
26. Breskin, A., Cole, S. R. & Hudgens, M. G. A practical example demonstrating the utility of
single-world intervention graphs. Epidemiology (Cambridge, Mass.) 29, e20 (2018).
27. Pearl, J. & Bareinboim, E. in Probabilistic and Causal Inference: The Works of Judea Pearl
451–482 (2022).
28. Bareinboim, E. & Pearl, J. Causal inference and the data-fusion problem. Proceedings of the
National Academy of Sciences 113, 7345–7352 (2016).
29. Rubin, D. B. Inference and missing data. Biometrika 63, 581–592 (1976).
30. VanderWeele, T. Explanation in causal inference: methods for mediation and interaction
(Oxford University Press, 2015).
31. Permutt, T. & Li, F. Trimmed means for symptom trials with dropouts. Pharmaceutical
Statistics 16, 20–28 (2017).
32. Ocampo, A., Schmidli, H., Quarg, P., Callegari, F. & Pagano, M. Identifying treatment effects
using trimmed means when data are missing not at random. arXiv preprint arXiv:1908.01044
(2019).
33. Bornkamp, B. et al. Principal stratum strategy: Potential role in drug development.
Pharmaceutical Statistics (2021).
34. Callegari, F. et al. Estimands in a chronic pain trial: challenges and opportunities. Statistics
in Biopharmaceutical Research 12, 39–44 (2020).
35. Bradford, W. C. Reaching the visual learner: teaching property through art. The Law Teacher
11 (2004).
Appendix: Identification of Estimand in Clinical Trial Example
As stated in Section 4, the causal estimand of interest is:
\[
\text{RCT} = E[Y(a = 1, m_3 = 0, m_4 = 0)] - E[Y(a = 0, m_3 = 0, m_4 = 0)]
\]
Using the independencies encoded in the SWIG in Figure X, we can identify the estimand. Because
the procedure for identifying the two expectations of potential outcomes in the contrast above is
identical, we show identification of an arbitrary $E[Y(a, m_3 = 0, m_4 = 0)]$:
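\begin{align*}
E[Y(a, m_3 = 0, m_4 = 0)] &= E[Y(a, m_3 = 0, m_4 = 0) \mid A = a] \\
&= \sum_c E[Y(a, m_3 = 0, m_4 = 0) \mid A = a, C = c] \Pr(C = c) \\
&= \sum_c E[Y(a, m_3 = 0, m_4 = 0) \mid A = a, C = c, M_3 = 0, M_4 = 0] \Pr(C = c) \tag{1} \\
&= \sum_c E[Y \mid A = a, C = c, M_3 = 0, M_4 = 0] \Pr(C = c) \tag{2}
\end{align*}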
This identifies the estimand. In line (1) we use the fact that
$\{M_3, M_4\} \perp Y(a, m_3 = 0, m_4 = 0) \mid C$, which allows us to invoke the consistency
assumption in line (2) and identify the causal effect from the observed data.
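
To make the identifying expression in line (2) concrete, the following is a minimal sketch of a plug-in (standardization) estimate, assuming a discrete covariate C and one record per patient. It is an illustration under these assumptions, not the estimation method used in the paper. The function names `standardized_mean` and `estimate_contrast`, the record layout with keys "A", "C", "M3", "M4", "Y", and the data in `toy_rows` are all hypothetical.

```python
from collections import defaultdict


def standardized_mean(rows, a):
    """Plug-in estimate of sum_c E[Y | A=a, C=c, M3=0, M4=0] * Pr(C=c).

    rows: iterable of dicts with keys "A", "C", "M3", "M4", "Y"
    (a hypothetical layout chosen for this sketch).
    """
    rows = list(rows)
    n = len(rows)

    # Pr(C=c) is estimated from the whole sample, matching the marginal
    # weights Pr(C=c) in the identifying formula.
    c_counts = defaultdict(int)
    for r in rows:
        c_counts[r["C"]] += 1

    # Outcome sums and counts per stratum of C, restricted to patients in
    # arm a who experienced neither intercurrent event (M3 = 0, M4 = 0).
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        if r["A"] == a and r["M3"] == 0 and r["M4"] == 0:
            sums[r["C"]] += r["Y"]
            counts[r["C"]] += 1

    total = 0.0
    for c, n_c in c_counts.items():
        if counts[c] == 0:
            # The formula is undefined for an empty stratum.
            raise ValueError(f"no event-free patients with A={a}, C={c}")
        total += (sums[c] / counts[c]) * (n_c / n)
    return total


def estimate_contrast(rows):
    """Difference of the standardized means for a = 1 versus a = 0."""
    return standardized_mean(rows, 1) - standardized_mean(rows, 0)


# Made-up toy data, for illustration only.
toy_rows = [
    {"C": 0, "A": 1, "M3": 0, "M4": 0, "Y": 4.0},
    {"C": 0, "A": 1, "M3": 1, "M4": 0, "Y": 10.0},  # excluded: M3 = 1
    {"C": 0, "A": 0, "M3": 0, "M4": 0, "Y": 2.0},
    {"C": 0, "A": 0, "M3": 0, "M4": 0, "Y": 4.0},
    {"C": 1, "A": 1, "M3": 0, "M4": 0, "Y": 8.0},
    {"C": 1, "A": 1, "M3": 0, "M4": 0, "Y": 6.0},
    {"C": 1, "A": 0, "M3": 0, "M4": 0, "Y": 5.0},
    {"C": 1, "A": 0, "M3": 0, "M4": 1, "Y": 0.0},  # excluded: M4 = 1
]

contrast = estimate_contrast(toy_rows)
```

The sketch raises an error when a stratum of C contains no event-free patients in arm a, because the conditional expectation in line (2) cannot be estimated there. With a continuous or high-dimensional C, an outcome model would presumably replace the stratum means.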