
Interpretable Deep Causal Learning for Moderation Effects

Alberto Caron 1,2, Gianluca Baio 1, Ioanna Manolopoulou 1

1 Department of Statistical Science, University College London, London, UK. 2 The Alan Turing Institute, London, UK. Correspondence to: Alberto Caron <alberto.caron.19@ucl.ac.uk>.

Workshop on Interpretable ML in Healthcare at International Conference on Machine Learning (ICML). Copyright 2022 by the author(s).

Abstract

In this extended abstract paper, we address the problem of interpretability and targeted regularization in causal machine learning models. In particular, we focus on the problem of estimating individual causal/treatment effects under observed confounders, which can be controlled for and moderate the effect of the treatment on the outcome of interest. Black-box ML models adjusted for the causal setting generally perform well in this task, but they lack interpretable output identifying the main drivers of treatment heterogeneity and their functional relationship. We propose a novel deep counterfactual learning architecture for estimating individual treatment effects that can simultaneously: i) convey targeted regularization on, and quantify uncertainty around, the quantity of interest (i.e., the Conditional Average Treatment Effect); ii) disentangle baseline prognostic and moderating effects of the covariates, and output interpretable score functions describing their relationship with the outcome. Finally, we demonstrate the use of the method via a simple simulated experiment and a real-world application. Code for full reproducibility can be found at https://github.com/albicaron/ICNN.

1. Introduction

In the past years, there has been a growing interest in applying ML methods for causal inference. Disciplines such as precision medicine and the socio-economic sciences inevitably call for highly personalized decision making when designing and deploying policies. In these fields, exploration of policies in the real world through randomized experiments is costly, so, in order to answer counterfactual questions such as "what would have happened if individual i had undertaken medical treatment A instead of treatment B", one can rely on observational data, provided that the confounding factors can be controlled for. Black-box causal ML models proposed in many recent contributions perform remarkably well in the task of estimating individual counterfactual outcomes, but significantly lack interpretability, which is a key component in the design of personalized treatment rules. This is because they jointly model the outcome dependency on the covariates and on the treatment variable. Knowledge of the main moderating factors of a treatment can unequivocally lead to overall better policy design, as moderation effects can be leveraged to achieve higher cumulative utility when deploying the policy (e.g., by avoiding treating patients with uncertain or borderline response, or by better allocating treatment under budget/resource constraints). Another main issue of existing causal ML models, related to that of interpretability, is carefully designed regularization (Nie & Wager, 2020; Hahn et al., 2020; Caron et al., 2022b). Large observational studies generally include measurements on a high number of pre-treatment covariates, and disentangling prognostic effects (i.e., the baseline effects of the covariates on the outcome in the absence of treatment) from moderating effects allows the application of targeted regularization on both, which avoids incurring unintended finite-sample bias and large variance (see Hahn et al. (2020) for a detailed discussion of Regularization Induced Confounding bias). This is useful in many scenarios where the treatment effect is believed to be a sparser and relatively less complex function of the covariates than the baseline prognostic effect, so that it calls for carefully tailored regularization.

1.1. Related Work

Among the most influential and recent contributions on ML regression-based techniques for individualized treatment effect learning, we particularly emphasize the work of Johansson et al. (2016), Shalit et al. (2017) and Yao et al. (2018) on deep learning models, Alaa & van der Schaar (2017; 2018) on Gaussian Processes, Hahn et al. (2020) and Caron et al. (2022b) on Bayesian Additive Regression Trees, and finally the literature on the more general class of Meta-Learner models (Künzel et al., 2017; Nie et al., 2020). We refer the reader to Caron et al. (2022a) for a detailed review of the above methods.


Figure 1. [Causal DAG over (X, A, Y), with a red arrow marking the moderating role of X, together with the structural equations

$$X = f_X(\varepsilon_X), \qquad A = f_A(X, \varepsilon_A), \qquad Y = f_Y(X, A) + \varepsilon_Y.]$$

Caption: Causal DAG and set of structural equations describing a setting that satisfies the backdoor criterion. The underlying assumption is that conditioning on the confounders $X$ is sufficient to identify the causal effect $A \to Y$. Models generally assume a mean-zero additive error term for the outcome equation. The red arrow in the DAG represents the moderating effect of $X$ on the $A \to Y$ relationship.

In particular, we build on top of the contributions by Nie et al. (2020), Hahn et al. (2020) and Caron et al. (2022b), which have previously addressed the issue of targeted regularization in causal ML. Our work proposes a new deep architecture that can separate baseline prognostic and treatment effects and, by borrowing ideas from recent work on Neural Additive Models (NAMs) (Agarwal et al., 2021), a deep learning version of Generalized Additive Models, can output interpretable score functions describing the impact of each covariate in terms of its prognostic and treatment effects.

2. Problem Framework

In this section we briefly introduce the main notation for causal effect identification and estimation under observed confounders, utilizing the framework of Structural Causal Models (SCMs) and do-calculus (Pearl, 2009). We assume we have access to data of observational nature described by the tuple $D_i = \{X_i, A_i, Y_i\} \sim p(\cdot)$, with $i \in \{1, ..., N\}$, where $X_i \in \mathcal{X}$ is a set of covariates, $A_i \in \mathcal{A}$ a binary manipulative variable, and $Y_i \in \mathbb{R}$ is the outcome. We then assume that the causal relationships between the three variables are fully described by the SCM depicted in Figure 1, both in the form of a causal DAG and of a set of structural equations. A causal DAG is a graph made of vertices and edges $(\mathcal{V}, \mathcal{E})$, where vertices represent the observational random variables, while edges represent causal functional relationships. Notice that we assume, in line with most of the literature, a zero-mean additive error structure for the outcome equation. The ultimate goal is to identify and estimate the Conditional Average Treatment Effect (CATE), defined as the effect of intervening on the manipulative variable $A_i$, by setting it equal to some value $a$ (or $do(A_i = a)$ in the do-calculus notation), on the outcome $Y_i$, conditional on the covariates $X_i$ (i.e., conditional on a patient's characteristics). In the case of binary $A_i$, CATE is defined as:

$$\text{CATE:} \quad \tau(x_i) = \mathbb{E}[Y_i \mid do(A_i = 1), X_i = x] - \mathbb{E}[Y_i \mid do(A_i = 0), X_i = x]. \quad (1)$$

In order to identify the quantity in (1) we make two standard assumptions. The first assumption is that there are no unobserved confounders (unconfoundedness) — or equivalently, in Pearl's terminology, that $X_i$ satisfies the backdoor criterion. The second assumption is common support, which states that there is no deterministic selection into either of the treatment arms conditional on the covariates, or equivalently that $p(A_i = 1 \mid X_i = x) \in (0, 1), \; \forall i$. The latter guarantees that we could theoretically observe data points with $X_i = x$ in each of the two arms of $A$. Under these two assumptions, we can identify CATE $\tau(x_i)$ in terms of observed quantities only, replacing the do-operator in (1) with the factual $A_i$, by conditioning on $X_i$:

$$\mathbb{E}[Y_i \mid do(A_i = a), X_i = x] = \mathbb{E}[Y_i \mid A_i = a, X_i = x].$$

Once CATE is identified as above, there are different ways in which it can be estimated in practice. We will briefly describe a few of them in the next section.

3. Targeted CATE estimation

Very early works in the literature on CATE estimation proposed fitting a single model $\hat{f}_Y(X_i, A_i)$ (S-Learners). The main drawback of S-Learners is that they are unable to account for any group-specific distributional difference, which becomes more relevant the stronger the selection bias is. Most of the subsequent contributions instead suggested splitting the sample into treatment subgroups and fitting separate, arm-specific models $\hat{f}_{Y_a}(x_i)$ (T-Learners). While T-Learners are able to account for distributional variation attributable to $A_i$, they are less sample efficient, and prone to CATE overfitting and to regularization induced confounding bias (Künzel et al., 2017; Hahn et al., 2020; Caron et al., 2022b). In addition, they do not produce credible intervals directly on CATE, as a CATE estimator is derived as the difference of two separate models' fits, $\hat{\tau}(x_i) = \hat{f}_1(x_i) - \hat{f}_0(x_i)$, with the induced variance being potentially very large:

$$\mathbb{V}[\hat{\tau}(x_i)] = \mathbb{V}[\hat{f}_1(x_i) - \hat{f}_0(x_i)] = \mathbb{V}[\hat{f}_1(x_i)] + \mathbb{V}[\hat{f}_0(x_i)] - 2\,\mathrm{Cov}[\hat{f}_1(x_i), \hat{f}_0(x_i)].$$
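To make the two baseline strategies concrete, the sketch below fits an S-Learner and a T-Learner on toy synthetic data; the gradient-boosting base learners, the data generating process and the variable names are illustrative assumptions, not the models or the setup used in the experiments of Section 4.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
N, P = 1000, 5
X = rng.normal(size=(N, P))
A = rng.binomial(1, 0.5, size=N)  # toy randomized assignment
Y = X[:, 0] + (1.0 + 0.5 * X[:, 1]) * A + rng.normal(scale=0.5, size=N)

# S-Learner: one model f(X, A); the CATE estimate is the difference between
# its predictions at A=1 and A=0 for the same covariate values.
s_model = GradientBoostingRegressor().fit(np.column_stack([X, A]), Y)
tau_s = (s_model.predict(np.column_stack([X, np.ones(N)]))
         - s_model.predict(np.column_stack([X, np.zeros(N)])))

# T-Learner: two arm-specific models fit on disjoint subsamples;
# the CATE estimate is the difference of the two separate fits.
f1 = GradientBoostingRegressor().fit(X[A == 1], Y[A == 1])
f0 = GradientBoostingRegressor().fit(X[A == 0], Y[A == 0])
tau_t = f1.predict(X) - f0.predict(X)
```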

Finally, some of the most recent additions to the literature (Hahn et al., 2020; Nie et al., 2020; Caron et al., 2022b) proposed using Robinson's (1988) additively separable re-parametrization of the outcome function, which reads:

$$\text{Robinson:} \quad Y_i = \underbrace{\mu(X_i)}_{\text{Prognostic Eff.}} + \underbrace{\tau(X_i)}_{\text{CATE}}\, A_i + \varepsilon_i, \quad (2)$$

where $\mu(x_i) = \mathbb{E}[Y_i \mid do(A_i = 0), X_i = x]$ is the prognostic effect function and $\tau(x_i)$ is the CATE function as defined in (1). Like most contributions, we assume that $\mathbb{E}(\varepsilon_i) = 0$. The distinctive trait of Robinson's parametrization is that the outcome function explicitly includes the function of interest, i.e. CATE $\tau(x_i)$, while in the usual S- or T-Learner parametrizations (and subsequent variations of these) CATE is implicitly obtained post-estimation as $\hat{\tau}(x_i) = \hat{f}_1(x_i) - \hat{f}_0(x_i)$. This means that (2) is able to differentiate between the baseline prognostic effect $\mu(x_i)$ of the covariates (in the absence of treatment) and their moderating effects embedded in the CATE function $\tau(x_i)$. As a consequence, by utilizing (2), one can convey different degrees of regularization when estimating the two functions. This is particularly useful as CATE is usually believed to display simpler patterns than $\mu(x_i)$; so, by estimating it separately, one is able to apply stronger targeted regularization.

3.1. Interpretable Causal Neural Networks

Following Robinson (1988), and the more recent work by Nie et al. (2020), Hahn et al. (2020) and Caron et al. (2022b), we propose a very simple deep learning architecture for interpretable and targeted CATE estimation, based on Robinson's parametrization. The architecture is made of two separable neural net blocks that respectively learn the prognostic function $\mu(x_i)$ and the CATE function $\tau(x_i)$, but are "reconnected" at the end of the pipeline to minimize a single loss function, unlike T-Learners, which instead minimize separate loss functions on $f_1(\cdot)$ and $f_0(\cdot)$. Our target loss function to minimize is generally defined as follows:

$$\text{TCNN:} \quad \min_{\mu(\cdot),\, \tau(\cdot)} \mathcal{L}_y\big(\mu(x) + \tau(x)\,a, \; y\big), \quad (3)$$

where $\mathcal{L}_y(\cdot)$ can be any standard loss function (e.g., MSE, negative log-likelihood, ...). Through its separable block structure, the model allows the design of different NN architectures for learning $\mu(\cdot)$ and $\tau(\cdot)$, while preserving sample efficiency (i.e., avoiding sample splitting as in T-Learners), and produces uncertainty measures around CATE $\tau(\cdot)$ directly. Thus, if $\tau(\cdot)$ is believed to display simple moderating patterns as a function of $X_i$, a shallower NN structure with fewer hidden layers and units, and more aggressive regularization (e.g., a higher regularization rate or dropout probabilities), can be specified, while retaining a higher level of complexity in the $\mu(\cdot)$ block. We generally refer to this model as the Targeted Causal Neural Network (TCNN) from now onwards. Figure 2 provides a simple visual representation. While in this work we focus on binary intervention variables $A_i$ for simplicity, TCNN can easily be extended to multi-category $A_i$ by adding extra blocks to the structure in Figure 2.
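As a concrete illustration of (3), below is a minimal PyTorch sketch of the TCNN idea: two separable MLP blocks for $\mu(\cdot)$ and $\tau(\cdot)$, recombined as $\mu(x) + \tau(x)\,a$ inside a single loss. The block sizes mirror the configuration later described in Section 4, but the dropout rates, optimizer, data and training loop are illustrative placeholders, not the exact configuration of our experiments.

```python
import torch
import torch.nn as nn

def mlp(in_dim, hidden, out_dim=1, p_drop=0.1):
    # Plain MLP with dropout after each hidden layer.
    layers, d = [], in_dim
    for h in hidden:
        layers += [nn.Linear(d, h), nn.ReLU(), nn.Dropout(p_drop)]
        d = h
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

class TCNN(nn.Module):
    """Two separable blocks: mu(x) for the prognostic effect and tau(x)
    for CATE, recombined in a single Robinson-style loss, as in Eq. (3)."""
    def __init__(self, p, mu_hidden=(50, 50), tau_hidden=(20,)):
        super().__init__()
        self.mu = mlp(p, mu_hidden)                      # richer mu(.) block
        self.tau = mlp(p, tau_hidden, p_drop=0.25)       # shallower, more regularized tau(.)

    def forward(self, x, a):
        return self.mu(x).squeeze(-1) + self.tau(x).squeeze(-1) * a

# One joint optimization step: both blocks minimize the same loss,
# unlike T-Learners, which fit two losses on split samples.
model = TCNN(p=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(256, 10); a = torch.randint(0, 2, (256,)).float()
y = torch.randn(256)                                     # placeholder outcome
loss = nn.functional.mse_loss(model(x, a), y)
opt.zero_grad(); loss.backward(); opt.step()
```

Note how the shallower, more heavily regularized $\tau(\cdot)$ block is where the targeted regularization discussed above is realized.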

In addition to the separable structure, and in order to guarantee a higher level of interpretability on prognostic and moderating factors, we also propose using a recently developed neural network version of Generalized Additive Models (GAMs), named Neural Additive Models (NAMs) (Agarwal et al., 2021), as the two $\mu(\cdot)$ and $\tau(\cdot)$ NN building blocks of TCNN. We refer to this particular version of TCNN as the Interpretable Causal Neural Network (ICNN).

Figure 2. [Schematic of the TCNN architecture: the inputs $X$ feed the two blocks $\phi_\mu(\cdot)$ and $\phi_\tau(\cdot)$, whose outputs $\mu(\cdot)$ and $\tau(\cdot)$ are combined with the treatment $A$ in the joint loss $\mathcal{L}_y(\mu(x) + \tau(x)a, y)$.] Caption: Intuitive TCNN structure. The deep architecture is modelled through a sample efficient, tailored loss function based on Robinson's parametrization.

Contrary to standard NNs, which fully "connect" the inputs to every node in the first hidden layer, NAMs "connect" each single input to its own NN structure and thus output input-specific score functions that fully describe the predicted relationship between each input and the outcome. NAM's score functions have an intuitive interpretation as Shapley values (Shapley, 1953): how much of an impact each input has on the final predicted outcome. The structure of the loss function (3) in ICNN thus becomes additive also in the $P$ covariate-specific $\mu_j(\cdot)$ and $\tau_j(\cdot)$ functions:

$$\text{ICNN:} \quad \min_{\mu(\cdot),\, \tau(\cdot)} \mathcal{L}_y\Big(\sum_{j=1}^{P} \mu_j(x_j) + \sum_{j=1}^{P} \tau_j(x_j)\,a, \; y\Big),$$

where the single $\mu_j(x_j)$ score function represents the Shapley value of covariate $x_j$ in terms of its prognostic effect, while $\tau_j(x_j)$ is its Shapley value in terms of its moderating effect. Hence, the NAM architecture in ICNN allows us to estimate the impact of each covariate as a prognostic and moderating factor, and to quantify the uncertainty around these impacts as well. Under ICNN, the outcome function thus becomes twice additively separable:

$$Y_i = \sum_{j=1}^{P} \mu_j(x_{i,j}) + \sum_{j=1}^{P} \tau_j(x_{i,j})\,A_i + \varepsilon_i, \quad (4)$$

where $i \in \{1, ..., N\}$ and $j \in \{1, ..., P\}$. Naturally, the downside of NAMs is that they might miss out on interaction terms among the covariates. These could possibly be constructed and added manually as additional inputs, although this is neither particularly convenient nor computationally ideal.
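To illustrate the ICNN variant, the sketch below replaces each fully connected block with a NAM-style bank of per-covariate subnetworks, so that $\mu(x) = \sum_j \mu_j(x_j)$ and $\tau(x) = \sum_j \tau_j(x_j)$, with each $\tau_j$ directly readable as a moderation score function. It is a simplified, non-authoritative sketch; the per-input layer sizes follow the description in Section 4, while everything else is an illustrative assumption.

```python
import torch
import torch.nn as nn

class NAMBlock(nn.Module):
    """One small subnetwork per covariate; the block output is the sum of
    the P per-input score functions f_j(x_j), as in a NAM/GAM."""
    def __init__(self, p, hidden=(20, 20), p_drop=0.1):
        super().__init__()
        def subnet():
            layers, d = [], 1
            for h in hidden:
                layers += [nn.Linear(d, h), nn.ReLU(), nn.Dropout(p_drop)]
                d = h
            layers.append(nn.Linear(d, 1))
            return nn.Sequential(*layers)
        self.nets = nn.ModuleList([subnet() for _ in range(p)])

    def scores(self, x):
        # Column j is the score function f_j evaluated at x[:, j].
        return torch.cat([net(x[:, j:j + 1]) for j, net in enumerate(self.nets)], dim=1)

    def forward(self, x):
        return self.scores(x).sum(dim=1)

class ICNN(nn.Module):
    """ICNN: NAM blocks for both mu(.) and tau(.), yielding Eq. (4)."""
    def __init__(self, p):
        super().__init__()
        self.mu = NAMBlock(p, hidden=(20, 20))             # prognostic block
        self.tau = NAMBlock(p, hidden=(50,), p_drop=0.25)  # moderation block

    def forward(self, x, a):
        return self.mu(x) + self.tau(x) * a

model = ICNN(p=10)
x = torch.randn(64, 10); a = torch.randint(0, 2, (64,)).float()
y_hat = model(x, a)               # train with the same joint loss as in (3)
tau_scores = model.tau.scores(x)  # per-covariate moderation score functions
```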

3.2. Links to Previous Work

We conclude the section by highlighting similarities and differences between TCNN (and ICNN) and other popular methods employing Robinson's parametrization. Unlike the R-Learner (Nie et al., 2020), TCNN is not a multi-step plug-in (and cross-validated) estimator and does not envisage the use of the propensity score. Instead, similarly to the Bayesian Causal Forest (BCF) (Hahn et al., 2020; Caron et al., 2022b), estimation in TCNN is carried out in a single, more sample efficient step, although BCF is inherently Bayesian and relatively computationally intensive. To obtain better coverage properties in terms of uncertainty quantification in both TCNN and ICNN, we implement the MC dropout technique (Gal & Ghahramani, 2016) in both the $\mu(\cdot)$ and $\tau(\cdot)$ blocks to perform approximate Bayesian inference; that is, we re-sample multiple times from the NN model with dropout layers to build an approximate posterior predictive distribution. This produces credible intervals around CATE estimates $\tau(\cdot)$ in a very straightforward way and, in ICNN specifically, credible intervals around each input's score function, as we will show in the experimental section.
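In code, the MC dropout scheme amounts to keeping the dropout layers active at prediction time and re-sampling the $\tau(\cdot)$ block repeatedly, treating the draws as an approximate posterior over CATE (Gal & Ghahramani, 2016). The helper below is a minimal sketch, assuming a model that exposes a `tau` submodule containing dropout layers, as in the sketches above.

```python
import torch

def mc_dropout_cate(model, x, n_samples=200, alpha=0.05):
    """Approximate posterior draws of tau(x), obtained by resampling the
    network with dropout active at test time; returns the posterior mean
    and (1 - alpha) credible bounds. Call model.eval() afterwards if the
    model is also used for deterministic prediction."""
    model.train()  # keeps nn.Dropout layers stochastic at prediction time
    with torch.no_grad():
        draws = torch.stack([model.tau(x) for _ in range(n_samples)])
    lo = torch.quantile(draws, alpha / 2, dim=0)
    hi = torch.quantile(draws, 1 - alpha / 2, dim=0)
    return draws.mean(dim=0), lo, hi

# Usage with the earlier sketches: tau_mean, lo, hi = mc_dropout_cate(model, x)
```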

4. Experiments

We hereby present results from a simple simulated experiment on CATE estimation, to compare the performance of TCNN and ICNN against some state-of-the-art methods. In addition, we demonstrate how ICNN with MC dropout in particular can be employed to produce highly interpretable score function measures, fully describing the estimated moderating effects of the covariates $x_i$ in $\tau(\cdot)$, and the uncertainty around them. For performance comparison we rely on the root Precision in Estimating Heterogeneous Treatment Effects (PEHE) metric (Hill, 2011), defined as:

$$\sqrt{\text{PEHE}_\tau} = \sqrt{\mathbb{E}\big[(\hat{\tau}(x_i) - \tau(x_i))^2\big]}. \quad (5)$$
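Since in a simulated experiment the true CATE is known, the empirical counterpart of (5) is a one-line computation; a minimal NumPy version might look as follows.

```python
import numpy as np

def root_pehe(tau_hat: np.ndarray, tau_true: np.ndarray) -> float:
    """Root PEHE (Hill, 2011): the RMSE between estimated and true
    per-individual CATE values, the empirical counterpart of Eq. (5)."""
    return float(np.sqrt(np.mean((tau_hat - tau_true) ** 2)))
```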

The list of models we compare includes: the S-Learner version of NNs (S-NN); the T-Learner version of NNs (T-NN); Causal Forest (Wager & Athey, 2018), a particular type of R-Learner (R-CF); a "unique-block", fully connected NN that uses Robinson's parametrization, minimizing the loss function in (3) (R-NN); a "unique-block" NAM, again minimizing the loss function in (3) (R-NAM); our TCNN with fully connected NN blocks; and ICNN. S-NN, T-NN and R-NN all feature two [50, 50] hidden layers. R-NAM features two [20, 20] hidden layers for each input. TCNN features two [50, 50] hidden layers in the $\mu(\cdot)$ block, and one [20] hidden layer in the $\tau(\cdot)$ block. ICNN features two [20, 20] hidden layers in the $\mu(\cdot)$ block, and one [50] hidden layer in the $\tau(\cdot)$ block, for each input.

We simulate $N = 2000$ data points on $P = 10$ correlated covariates, with binary $A_i$ and continuous $Y_i$. The experiment was run for $B = 100$ replications, and results on 70%-30% train-test sets, as average $\sqrt{\text{PEHE}_\tau}$ plus 95% Monte Carlo errors, can be found in Table 1. The full description of the data generating process utilized for this simulated experiment can be found in Appendix Section A.

Table 1. Performance on the simulated experiment, measured as 70%-30% train-test set $\sqrt{\text{PEHE}_\tau}$ (* marks the best performance).

    Model    Train √PEHE_τ      Test √PEHE_τ
    S-NN     1.046 ± 0.007      1.076 ± 0.007
    T-NN     1.021 ± 0.002      1.074 ± 0.002
    R-CF     1.467 ± 0.002      1.494 ± 0.002
    R-NN     0.706 ± 0.003      0.712 ± 0.003
    R-NAM    0.787 ± 0.002      0.787 ± 0.002
    TCNN     0.361 ± 0.001      0.362 ± 0.001
    ICNN     0.328 ± 0.001*     0.331 ± 0.001*

NN models minimizing the Robinson loss function in (3) perform considerably better than the S- and T-Learner baselines on this particular example, especially TCNN and ICNN, which present the additional advantage of conveying targeted regularization. Considering the ICNN model only, we can then access the score functions in the $\tau(\cdot)$ NAM block that describe the moderating effects of the covariates $x_i$. In particular, in Figure 3 we plot the score function of the first covariate $X_{i,1}$ on CATE $\tau(\cdot)$, plus the approximate Bayesian credible intervals generated through MC dropout resampling (Gal & Ghahramani, 2016).

Figure 3. [Plot of the estimated score function against $X_1$.] Caption: Score function output from the ICNN model relative to covariate $X_1$, depicting its moderating effect on CATE, plus MC dropout generated credible intervals.

In this specific simulated example, the CATE function is generated as $\tau(x_i) = 3 + 0.8 X_{i,1}^2$. So only $X_{i,1}$, out of all $P = 10$ covariates, drives the simple heterogeneity patterns in treatment response across individuals, in a quadratic form. As Figure 3 shows, ICNN is able in this example to learn a score function that very closely approximates the underlying true relationship $0.8 X_{i,1}^2$, and quantifies the uncertainty around it. Naturally, in a different simulated setup with strong interaction terms among the covariates, the performance of ICNN would inevitably deteriorate compared to the other versions of NN and models considered here. Thus, performance and interpretability in this type of scenario would certainly constitute a trade-off.

Figure 4. [Grid of per-covariate score function plots.] Caption: Score functions (or Shapley values) with associated MC dropout bands describing the moderation effects of each covariate on estimated CATE: $\tau_j(x_j), \; \forall j \in \{1, ..., P\}$.

4.1. Real-World Example: the ACTG-175 data

Finally, we briefly demonstrate the use of ICNN on a real-world example. Although the focus of the paper so far has been on observational types of studies, we will analyze data from a randomized experiment to show that the methods introduced naturally extend to this setting as well, with the non-negligible additional benefit that both the unconfoundedness and common support assumptions hold by construction (i.e., there is no "causal" arrow going from $X \to A$ in the Figure 1 DAG). The data we use are taken from the ACTG-175 study, a randomized controlled trial comparing standard mono-therapy against a combination of therapies in the treatment of HIV-1-infected patients with CD4 cell counts between 200 and 500. Details of the design can be found in the original contribution by Hammer et al. (1996). The dataset features $N = 2139$ observations and $P = 12$ covariates $X$ (which are listed in Appendix Section B), a binary treatment $A$ (mono-therapy vs multi-therapy) and a continuous outcome $Y$ (the difference in CD4 cell counts between baseline and 20 ± 5 weeks after undertaking the treatment — this is done in order to take into account any individual unobserved time pattern in the CD4 cell count). The aim is to investigate the moderation effects of the covariates in terms of heterogeneity of treatment across patients. In order to do so, we run ICNN and obtain the estimated score functions, together with approximate Bayesian MC dropout bands, for each covariate $X_j$; we report these in Figure 4. The results generally suggest a good degree of treatment heterogeneity, with most of the covariates playing a significant moderating role.

5. Conclusion

In this extended abstract paper, we have addressed the issue of interpretability and targeted regularization in causal machine learning models for the estimation of heterogeneous/individual treatment effects. In particular, we have proposed a novel deep learning architecture (TCNN) that is able to convey regularization and quantify uncertainty when learning the CATE function, and, in its interpretable version (ICNN), to output interpretable score functions describing the estimated prognostic and moderation effects of the covariates $X_i$. We have benchmarked TCNN and ICNN by comparing their performance against some popular methods for CATE estimation on a simple simulated experiment, where we have also illustrated how score functions are very intuitive and interpretable measures for moderation effect analysis. Finally, we have demonstrated the use of ICNN on a real-world dataset based on the ACTG-175 study (Hammer et al., 1996).

References

Abdar, M., Pourpanah, F., Hussain, S., Rezazadegan, D., Liu, L., Ghavamzadeh, M., Fieguth, P., Cao, X., Khosravi, A., Acharya, U. R., Makarenkov, V., and Nahavandi, S. A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Information Fusion, 76:243-297, 2021.

Agarwal, R., Melnick, L., Frosst, N., Zhang, X., Lengerich, B., Caruana, R., and Hinton, G. E. Neural additive models: Interpretable machine learning with neural nets. In Proceedings of the 35th International Conference on Neural Information Processing Systems, volume 34, pp. 4699-4711, 2021.

Alaa, A. and van der Schaar, M. Limits of estimating heterogeneous treatment effects: Guidelines for practical algorithm design. In Proceedings of the 35th International Conference on Machine Learning, pp. 129-138, 2018.

Alaa, A. M. and van der Schaar, M. Bayesian inference of individualized treatment effects using multi-task Gaussian Processes. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, pp. 3427-3435, 2017.

Athey, S. and Wager, S. Policy learning with observational data. Econometrica, 89(1):133-161, January 2021.

Blundell, C., Cornebise, J., Kavukcuoglu, K., and Wierstra, D. Weight uncertainty in neural networks. In Proceedings of the 32nd International Conference on Machine Learning - Volume 37, pp. 1613-1622, 2015.

Caron, A., Baio, G., and Manolopoulou, I. Estimating individual treatment effects using non-parametric regression models: A review. Journal of the Royal Statistical Society: Series A (Statistics in Society), pp. 1-35, 2022a.

Caron, A., Baio, G., and Manolopoulou, I. Shrinkage Bayesian causal forests for heterogeneous treatment effects estimation. Journal of Computational and Graphical Statistics, pp. 1-13, 2022b.

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1):C1-C68, 2018.

Chipman, H. A., George, E. I., and McCulloch, R. E. BART: Bayesian additive regression trees. Annals of Applied Statistics, 4(1):266-298, 2010.

Gal, Y. and Ghahramani, Z. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In Proceedings of the 33rd International Conference on Machine Learning, volume 48, pp. 1050-1059, 2016.

Hahn, P. R., Carvalho, C. M., Puelz, D., and He, J. Regularization and confounding in linear regression for treatment effect estimation. Bayesian Analysis, 13(1):163-182, 2018.

Hahn, P. R., Murray, J. S., and Carvalho, C. M. Bayesian regression tree models for causal inference: Regularization, confounding, and heterogeneous effects. Bayesian Analysis, 15(3):965-1056, 2020.

Hammer, S. M., Katzenstein, D. A., Hughes, M. D., Gundacker, H., Schooley, R. T., Haubrich, R. H., Henry, W. K., Lederman, M. M., Phair, J. P., Niu, M., Hirsch, M. S., and Merigan, T. C. A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England Journal of Medicine, 335:1081-1090, 1996.

Hartford, J., Lewis, G., Leyton-Brown, K., and Taddy, M. Deep IV: A flexible approach for counterfactual prediction. In Proceedings of the 34th International Conference on Machine Learning, volume 70, pp. 1414-1423, 2017.

Hill, J. L. Bayesian nonparametric modeling for causal inference. Journal of Computational and Graphical Statistics, 20(1):217-240, 2011.

Hodson, R. Precision medicine. Nature, 547(7619), 2016.

Horvitz, D. G. and Thompson, D. J. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47(260):663-685, 1952.

Imbens, G. W. and Rubin, D. B. Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press, 2015.

Johansson, F., Shalit, U., and Sontag, D. Learning representations for counterfactual inference. In Proceedings of the 33rd International Conference on Machine Learning, volume 48, pp. 3020-3029, 2016.

Kaddour, J., Zhu, Y., Liu, Q., Kusner, M. J., and Silva, R. Causal effect inference for structured treatments. In Proceedings of the 35th International Conference on Neural Information Processing Systems, volume 34, pp. 24841-24854, 2021.

Kitagawa, T. and Tetenov, A. Who should be treated? Empirical welfare maximization methods for treatment choice. Econometrica, 86(2):591-616, 2018.

Künzel, S., Sekhon, J., Bickel, P., and Yu, B. Meta-learners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116, 2017.

Lakshminarayanan, B., Pritzel, A., and Blundell, C. Simple and scalable predictive uncertainty estimation using deep ensembles. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6405-6416. Curran Associates Inc., 2017.

Nie, X. and Wager, S. Quasi-oracle estimation of heterogeneous treatment effects. Biometrika, 108(2):299-319, 2020.

Nie, X., Brunskill, E., and Wager, S. Learning when-to-treat policies. Journal of the American Statistical Association, 0(ja):1-58, 2020.

Pearce, T., Leibfried, F., and Brintrup, A. Uncertainty in neural networks: Approximately Bayesian ensembling. In Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, volume 108, pp. 234-244, 2020.

Pearl, J. Causality: Models, Reasoning and Inference. Cambridge University Press, USA, 2nd edition, 2009. ISBN 052189560X.

Peters, J., Janzing, D., and Schölkopf, B. Elements of Causal Inference: Foundations and Learning Algorithms. The MIT Press, 2017.

Robinson, P. M. Root-N-consistent semiparametric regression. Econometrica, 56(4):931-954, 1988.

Rubin, D. B. Bayesian inference for causal effects: The role of randomization. Annals of Statistics, 6(1):34-58, 1978.

Shalit, U., Johansson, F. D., and Sontag, D. Estimating individual treatment effect: Generalization bounds and algorithms. In Proceedings of the 34th International Conference on Machine Learning, volume 70, pp. 3076-3085, 2017.

Shapley, L. S. A value for n-person games. In Contributions to the Theory of Games II, pp. 307-317. Princeton University Press, 1953.

Wager, S. and Athey, S. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523):1228-1242, 2018.

Yao, L., Li, S., Li, Y., Huai, M., Gao, J., and Zhang, A. Representation learning for treatment effect estimation from observational data. In Advances in Neural Information Processing Systems 31, pp. 2633-2643, 2018.

Zhang, B., Tsiatis, A. A., Laber, E. B., and Davidian, M. A robust method for estimating optimal treatment regimes. Biometrics, 68(4):1010-1018, 2012.

A. Data Generating Process

In this appendix section we briefly describe the data generating process utilized for the simulated experiment in Section 4. We generated $N = 2000$ data points on $P = 10$ correlated covariates, of which 5 continuous and 5 binary, drawn from a Gaussian copula $C^{\text{Gauss}}_{\Theta}(u) = \Phi_{\Theta}\big(\Phi^{-1}(u_1), ..., \Phi^{-1}(u_P)\big)$, where the covariance matrix is such that $\Theta_{jk} = 0.1^{|j-k|} + 0.1\,\mathbb{I}(j = k)$. The data generating process is fully described by the following quantities:

$$\begin{aligned}
\mu(x_i) &= 6 + 0.3\exp(X_{i,1}) + X_{i,2}^2 + 1.5\,|X_{i,3}| + 0.8\,X_{i,4}, \\
\tau(x_i) &= 3 + 0.8\,X_{i,1}^2, \\
\pi(x_i) &= \Lambda\big(-1.5 + 0.5\,X_{i,1} + \nu_i / 10\big), \\
A_i &\sim \text{Bernoulli}\big(\pi(x_i)\big), \\
Y_i &= \mu(x_i) + \tau(x_i)\,A_i + \varepsilon_i, \quad \text{where } \varepsilon_i \sim \mathcal{N}(0, \sigma^2),
\end{aligned} \quad (6)$$

where: $\Lambda(\cdot)$ is the logistic cumulative distribution function; the error variance is $\sigma^2 = 0.5$; and $\nu_i \sim \text{Uniform}(0, 1)$. More details on the DGP and the models employed can be found at https://github.com/albicaron/ICNN, for full reproducibility.
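For reference, a NumPy sketch of this process might look as follows. It is an approximate reconstruction, not the repository code: the threshold used to dichotomize the last five covariates is our assumption, as is the exact copula step, since the abstract does not specify either.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
N, P, sigma2 = 2000, 10, 0.5

# Gaussian copula draws: correlated normals pushed through the normal CDF.
Theta = 0.1 ** np.abs(np.subtract.outer(np.arange(P), np.arange(P))) + 0.1 * np.eye(P)
Z = rng.multivariate_normal(np.zeros(P), Theta, size=N)
U = norm.cdf(Z)

# First 5 covariates kept continuous, last 5 dichotomized (assumed threshold).
X = np.column_stack([Z[:, :5], (U[:, 5:] > 0.5).astype(float)])

mu = 6 + 0.3 * np.exp(X[:, 0]) + X[:, 1] ** 2 + 1.5 * np.abs(X[:, 2]) + 0.8 * X[:, 3]
tau = 3 + 0.8 * X[:, 0] ** 2
nu = rng.uniform(0, 1, size=N)
pi = 1.0 / (1.0 + np.exp(-(-1.5 + 0.5 * X[:, 0] + nu / 10)))  # logistic CDF Lambda(.)
A = rng.binomial(1, pi)
Y = mu + tau * A + rng.normal(scale=np.sqrt(sigma2), size=N)
```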

B. The ACTG-175 Trial Data

In the table below we report the description of the 12 covariates utilized in the analysis in Section 4.1.

Table 2. ACTG-175 data covariates X

    Variable    Description
    age         Numeric
    wtkg        Numeric
    hemo        Binary (hemophilia = 1)
    homo        Binary (homosexual = 1)
    drugs       Binary (intravenous drug use = 1)
    oprior      Binary (non-zidovudine antiretroviral therapy prior to initiation of study treatment = 1)
    z30         Binary (zidovudine use in the 30 days prior to treatment initiation = 1)
    preanti     Numeric (number of days of previously received antiretroviral therapy)
    race        Binary
    gender      Binary
    str2        Binary: antiretroviral history (0 = naive, 1 = experienced)
    karnof_hi   Binary: Karnofsky score (0 = score < 100, 1 = score = 100)