
A Rate-Distortion Framework for Explaining Black-box Model Decisions

Stefan Kolek¹, Duc Anh Nguyen¹, Ron Levie¹, Joan Bruna², and Gitta Kutyniok¹

¹Department of Mathematics, Ludwig Maximilian University, Munich
²Courant Institute of Mathematical Sciences, New York University, New York

Abstract

We present the Rate-Distortion Explanation (RDE) framework, a mathematically well-founded method for explaining black-box model decisions. The framework is based on perturbations of the target input signal and applies to any differentiable pre-trained model, such as neural networks. Our experiments demonstrate the framework's adaptability to diverse data modalities, particularly images, audio, and physical simulations of urban environments.

1 Introduction

Powerful machine learning models such as deep neural networks are inherently opaque, which has motivated the numerous explanation methods developed by the research community over the last decade [1, 24, 26, 20, 15, 16, 7, 2]. The meaning and validity of an explanation depends on the underlying principle of the explanation framework. A trustworthy explanation framework must therefore align intuition with mathematical rigor while maintaining maximal flexibility and applicability. We believe that the Rate-Distortion Explanation (RDE) framework, first proposed by [16], then extended by [9], as well as the similar framework in [2], meets these desired qualities. In this chapter, we aim to present the RDE framework in a revised and holistic manner. Our generalized RDE framework can be applied to any model (not just classification tasks), supports in-distribution interpretability (by leveraging in-painting GANs), and admits interpretation queries (by considering suitable input signal representations).

The typical setting of a (local) explanation method is given by a pre-trained model Φ : R^n → R^m and a data instance x ∈ R^n. The model Φ can be either a classification model with m class labels or a regression model with m-dimensional output. The model decision Φ(x) is to be explained. In the original RDE framework [16], an explanation for Φ(x) is a set of feature components S ⊂ {1, ..., n} of x that are deemed relevant for the decision Φ(x). The core principle behind the RDE framework is that a set S ⊂ {1, ..., n} contains all the relevant components if Φ(x) remains (approximately) unchanged after modifying x_{S^c}, i.e., the components of x that are not deemed relevant. In other words, S contains all relevant features if they are sufficient for producing the output Φ(x). To convey concise explanatory information, one aims to find the minimal set S ⊂ {1, ..., n} containing all the relevant components. As demonstrated in [16] and [28], the minimal relevant set S ⊂ {1, ..., n} cannot be found combinatorially in an efficient manner for large input sizes. A meaningful approximation can nevertheless be found by optimizing a sparse continuous mask s ∈ [0,1]^n that has no significant effect on the output Φ(x), in the sense that Φ(x) ≈ Φ(x ⊙ s + (1 − s) ⊙ v) should hold for appropriate perturbations v ∈ R^n, where ⊙ denotes componentwise multiplication. Suppose d(Φ(x), Φ(y)) is a measure of distortion (e.g., the ℓ_2-norm) between the model outputs for x, y ∈ R^n, and V is a distribution over appropriate perturbations v ∼ V. An explanation in the RDE framework can then be found as a solution mask s* to the following minimization problem:

s* := argmin_{s ∈ [0,1]^n} E_{v∼V}[ d(Φ(x), Φ(x ⊙ s + (1 − s) ⊙ v)) ] + λ‖s‖_1,

where λ > 0 is a hyperparameter controlling the sparsity of the mask.


We further generalize the RDE framework to abstract input signal representations x = f(h), where f is a data representation function with input h. The philosophy of the generalized RDE framework is that an explanation for a generic input signal x = f(h) should be some simplified version of the signal that is interpretable to humans. This is achieved by demanding sparsity in a suitable representation system h, which ideally optimally represents the class of explanations that are desirable for the underlying domain and interpretation query. This philosophy underpins our experiments on image classification in the wavelet domain, on audio signal classification in the Fourier domain, and on radio map estimation in an urban environment domain. Therein, we demonstrate the versatility of our generalized RDE framework.

2 Related works

To our knowledge, the explanation principle of optimizing a mask s ∈ [0,1]^n was first proposed in [7]. Fong et al. [7] explained image classification decisions by considering one of two “deletion games”: (1) optimizing for the smallest deletion mask that causes the class score to drop significantly, or (2) optimizing for the largest deletion mask that has no significant effect on the class score. The original RDE approach [16] is based on the second deletion game and connects the deletion principle to rate-distortion theory, which studies lossy data compression. Deleted entries in [7] were replaced with either constants, noise, or blurring, and deleted entries in [16] were replaced with noise.

Explanation methods introduced before the “deletion games” principle from [7] were typically based on gradients [24, 26], propagation of activations in neurons [1, 23], surrogate models [20], or game theory [15]. Gradient-based methods such as SmoothGrad [24] lack a principle of relevance beyond local sensitivity. Reference-based methods such as Integrated Gradients [26] and DeepLIFT [23] depend on a reference value, for which there is no clear optimal choice. DeepLIFT and LRP assign relevance by propagating neuron activations, which makes them dependent on the implementation of Φ. LIME [20] uses an interpretable surrogate model that approximates Φ in a neighborhood around x. Surrogate model explanations are inherently limited for complex models Φ (such as image classifiers), as they only admit very local approximations. Generally, explanations that only depend on the model behavior on a small neighborhood U_x of x offer limited insight. Lastly, Shapley-value-based explanations [15] are grounded in Shapley values from game theory. They assign relevance scores as weighted averages of marginal contributions of the respective features. Though Shapley values are mathematically well-founded, the relevance scores cannot be computed exactly for common input sizes such as n ≥ 50, since one exact relevance score generally requires O(2^n) evaluations of Φ [27].

A notable difference between the RDE method and additive feature explanations [15] is that the values in the mask s* do not add up to the model output. The additive property in [15] takes the view that features individually contribute to the model output and that relevance should be reflected by their contributions. We emphasize that the RDE method is designed to look for a set of relevant features, not an estimate of individual relative contributions. This is particularly desirable when only groups of features are interpretable, as, for example, in image classification tasks, where individual pixels do not carry any interpretable meaning. Similarly to Shapley values, the explanation in the RDE framework cannot be computed exactly, as it requires solving a non-convex minimization problem. However, the RDE method can take full advantage of modern optimization techniques. Furthermore, the RDE method is a model-agnostic explanation technique, with a mathematically principled and intuitive notion of relevance as well as enough flexibility to incorporate the model behavior on meaningful input regions of Φ.

The meaning of an explanation based on deletion masks s ∈ [0,1]^n depends on the nature of the perturbations that replace the deleted regions. Random [16, 7] or blurred [7] replacements v ∈ R^n may result in a data point x ⊙ s + (1 − s) ⊙ v that falls out of the natural data manifold on which Φ was trained. This is a subtle though important problem, since such an explanation may depend on evaluations of Φ on data points from undeveloped decision regions. The latter motivates in-distribution interpretability, which considers meaningful perturbations that keep x ⊙ s + (1 − s) ⊙ v in the data manifold. The work [2] was the first to suggest using an inpainting GAN to generate meaningful perturbations for the “deletion games”. The authors of [9] then applied in-distribution interpretability to the RDE method in the challenging modalities of music and physical simulations of urban environments. Moreover, they demonstrated that the RDE method of [16] can be extended to answer so-called “interpretation queries”. For example, the RDE method was applied in [9] to an instrument classifier to answer the global interpretation query “Is magnitude or phase in the signal more important for the classifier?”. Most recently, in [11], we introduced CartoonX as a novel explanation method for image classifiers, answering the interpretation query “What is the relevant piece-wise smooth part of an image?” by applying RDE in the wavelet basis of images.

3 Rate-distortion explanation framework

Based on the original RDE approach from [16], in this section we present a general formulation of the RDE framework and discuss several implementations. While [16] focuses solely on image classification with explanations in the pixel representation, we will apply the RDE framework not only to more challenging domains but also to different input signal representations. Not surprisingly, the combinatorial optimization problem in the RDE framework, even in simpler forms, is extremely hard to solve [16, 28]. This motivates heuristic solution strategies, which will be discussed in Subsection 3.2.

3.1 General formulation

It is well known that in practice there are different ways to describe a signal x ∈ R^n. Generally speaking, x can be represented by a data representation function f : ∏_{i=1}^k R^{d_i} → R^n,

x = f(h_1, ..., h_k),    (1)

for some inputs h_i ∈ R^{d_i}, d_i ∈ N, i ∈ {1, ..., k}, k ∈ N. Note that we do not restrict ourselves to linear data representation functions f. To briefly illustrate the generality of this abstract representation, we consider the following examples.

Example 1 (Pixel representation) An arbitrary (vectorized) image x ∈ R^n can simply be represented pixelwise,

x = (x_1, ..., x_n)^T = f(h_1, ..., h_n),

with h_i := x_i being the individual pixel values and f : R^n → R^n being the identity transform.

Due to its simplicity, this standard basis representation is a reasonable choice when explaining image classification models. However, in many other applications, one requires more sophisticated representations of the signals, such as through a possibly redundant dictionary.

Example 2 Let {ψ_j}_{j=1}^k, k ∈ N, be a dictionary in R^n, e.g., a basis. A signal x ∈ R^n is represented as

x = Σ_{j=1}^k h_j ψ_j,

where h_j ∈ R, j ∈ {1, ..., k}, are appropriate coefficients. In terms of the abstract representation (1), we have d_j = 1 for j ∈ {1, ..., k}, and f is the function that yields the weighted sum over the ψ_j. Note that Example 1 can be seen as a special case of this representation.

The following gives an example of a non-linear representation function f.

Example 3 Consider the discrete inverse Fourier transform, defined as

f : ∏_{j=1}^n R_+ × ∏_{j=1}^n [0, 2π] → C^n,

f(m_1, ..., m_n, ω_1, ..., ω_n)_l := (1/n) Σ_{j=1}^n m_j e^{iω_j} e^{i2πl(j−1)/n},    l ∈ {1, ..., n},

where m_j and ω_j are respectively the magnitude and the phase of the j-th discrete Fourier coefficient c_j := m_j e^{iω_j} ∈ C. Thus, every signal x ∈ R^n ⊆ C^n can be represented in terms of (1) with f being the discrete inverse Fourier transform and h_j, j = 1, ..., k (with k = 2n), being specified as m_{j′} and ω_{j′}, j′ = 1, ..., n.
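As a concrete illustration, the following minimal PyTorch sketch (function and variable names are our own) decomposes a real signal into the magnitude and phase of its Fourier coefficients and reconstructs it with the inverse transform f:

```python
import torch

def fourier_representation(x):
    """Return magnitude m and phase w of the discrete Fourier coefficients of x."""
    c = torch.fft.fft(x)          # coefficients c_j = m_j * exp(i w_j)
    return c.abs(), c.angle()

def f(m, w):
    """Discrete inverse Fourier transform as a function of magnitude and phase."""
    return torch.fft.ifft(m * torch.exp(1j * w)).real

x = torch.randn(1024)             # a real signal
m, w = fourier_representation(x)
assert torch.allclose(f(m, w), x, atol=1e-4)  # x = f(h) with h = (m, w)
```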


Further examples of dictionaries {ψ_j}_{j=1}^k include the discrete wavelet [21], cosine [19], and shearlet [12] representation systems, and many more. In these cases, the coefficients h_i are given by the forward transform, and f is referred to as the backward transform. Note that in the above examples we have d_i = 1, i.e., the inputs h_i are real-valued. In many situations, one is also interested in representations x = f(h_1, ..., h_k) with h_i ∈ R^{d_i} where d_i > 1.

Example 4 Let k = 2 and define f again as the discrete inverse Fourier transform, but as a function of two components: (1) the entire magnitude spectrum and (2) the entire phase spectrum, namely

f : R_+^n × [0, 2π]^n → C^n,

f(m, ω)_l := (1/n) Σ_{j=1}^n m_j e^{iω_j} e^{i2πl(j−1)/n},    l ∈ {1, ..., n}.

Similarly, instead of individual pixel values, one can consider patches of pixels of an image x ∈ R^n from Example 1 as the input vectors h_i to the identity transform f. We will come back to these examples in the experiments in Section 4.

Finally, we would like to remark that our abstract representation x = f(h_1, ..., h_k) also covers the case where the signal is the output of a decoder or generative model f with inputs h_1, ..., h_k as the code or latent variables.

As discussed in the previous sections, the main idea of the RDE framework is to extract the relevant features of the signal based on an optimization over its perturbations, defined through masks. The ingredients of this idea are formally defined below.

Definition 1 (Obfuscations and expected distortion) Let Φ : R^n → R^m be a model and x ∈ R^n a data point with a data representation x = f(h_1, ..., h_k) as discussed above. For every mask s ∈ [0,1]^k, let V_s be a probability distribution over ∏_{i=1}^k R^{d_i}. Then the obfuscation of x with respect to s and V_s is defined as the random vector

y := f(s ⊙ h + (1 − s) ⊙ v),    where v ∼ V_s,

with (s ⊙ h)_i = s_i h_i ∈ R^{d_i} and ((1 − s) ⊙ v)_i = (1 − s_i) v_i ∈ R^{d_i} for i ∈ {1, ..., k}. Furthermore, the expected distortion of x with respect to the mask s and the perturbation distribution V_s is defined as

D(x, s, V_s, Φ) := E_{v∼V_s}[ d(Φ(x), Φ(y)) ],

where d : R^m × R^m → R_+ is a measure of distortion between two model outputs.

In the RDE framework, the explanation is given by a mask that minimizes distortion while remaining relatively sparse. The rate-distortion explanation mask is defined in the following.

Definition 2 (The RDE mask) In the setting of Definition 1, we define the RDE mask as a solution s*(ℓ) to the minimization problem

min_{s ∈ {0,1}^k} D(x, s, V_s, Φ)    s.t.    ‖s‖_0 ≤ ℓ,    (2)

where ℓ ∈ {1, ..., k} is the desired level of sparsity.

Here, the RDE mask is defined as the binary mask that minimizes the expected distortion while keeping the sparsity below a certain threshold. Alternatively, one could define the RDE mask as the sparsest binary mask that keeps the distortion below a given threshold, as in [16]. Geometrically, one can interpret the RDE mask as defining a subspace that is stable under Φ. If x = f(h) is the input signal and s is the RDE mask for Φ(x) on the coefficients h, then the associated subspace R_Φ(s) is defined as the space of feasible obfuscations of x with s under V_s, i.e.,

R_Φ(s) := { f(s ⊙ h + (1 − s) ⊙ v) | v ∈ supp V_s },

where supp V_s denotes the support of the distribution V_s. The model Φ acts similarly on signals in R_Φ(s) due to the low expected distortion D(x, s, V_s, Φ), making the subspace stable under Φ. Note that RDE directly optimizes towards a subspace that is stable under Φ. If, instead, one chose the mask s based on information from the gradient ∇Φ(x) and the Hessian ∇²Φ(x), then only a local neighborhood of x would tend to be stable under Φ, due to the local nature of the gradient and the Hessian. Before discussing practical algorithms to approximate the RDE mask in Subsection 3.2, we review frequently used obfuscation strategies, i.e., choices of the distribution V_s, and measures of distortion.

3.1.1 Obfuscation strategies and in-distribution interpretability.

The meaning of an explanation in RDE depends greatly on the nature of the perturbations v ∼ V_s. A particular choice of V_s defines an obfuscation strategy. Obfuscations are either in-distribution, i.e., the obfuscation f(s ⊙ h + (1 − s) ⊙ v) lies on the natural data manifold that Φ was trained on, or out-of-distribution otherwise. Out-of-distribution obfuscations pose the following problem. The RDE mask (see Definition 2) depends on evaluations of Φ on obfuscations f(s ⊙ h + (1 − s) ⊙ v). If f(s ⊙ h + (1 − s) ⊙ v) does not lie on the natural data manifold that Φ was trained on, then it may lie in undeveloped regions of Φ. In practice, we are interested in explaining the behavior of Φ on realistic data, and an explanation can be corrupted if Φ was not developed in the region of out-of-distribution points f(s ⊙ h + (1 − s) ⊙ v). One can guard against this by choosing V_s so that f(s ⊙ h + (1 − s) ⊙ v) is in-distribution. Choosing V_s in-distribution boils down to modeling the conditional data distribution, a non-trivial task.

Example 5 (In-distribution obfuscation strategy) In light of the recent success of generative adversarial networks (GANs) in generative modeling [8], one can train an in-painting GAN [29]

G(h, s, z) ∈ ∏_{i=1}^k R^{d_i},

where z are the random latent variables of the GAN, such that the obfuscation f(s ⊙ h + (1 − s) ⊙ G(h, s, z)) lies on the natural data manifold (see also [2]). In other words, one can choose V_s as the distribution of v := G(h, s, z), where the randomness comes from the random latent variables z.

Example 6 (Out-of-distribution obfuscation strategies) A very simple obfuscation strategy is Gaussian noise. In that case, one defines V_s for every s ∈ [0,1]^k as

V_s := N(μ, Σ),

where μ and Σ denote a pre-defined mean vector and covariance matrix. In Section 4.1, we give an example of a reasonable choice of μ and Σ for image data. Alternatively, for images in pixel representation (see Example 1), one can replace the deleted pixels by blurred inputs, v = K ∗ x, where K is a suitable blur kernel.

Obfuscation strategy    Perturbation formula    In-distribution
Constant                v ∈ R^d (constant)      –
Noise                   v ∼ N(μ, Σ)             –
Blurring                v = K ∗ x               –
Inpainting-GAN          v = G(h, s, z)          ✓

Table 1: Common obfuscation strategies with their perturbation formulas.

We summarize common obfuscation strategies for a given target signal in Table 1.


3.1.2 Measure of distortion.

Various options exist for the measure d : R^m × R^m → R_+ of the distortion between model outputs. The measure of distortion should be chosen according to the task of the model Φ : R^n → R^m and the objective of the explanation.

Example 7 (Measure of distortion for a classification task) Consider a classification model Φ : R^n → R^m and a target input signal x ∈ R^n. The model Φ assigns to each class j ∈ {1, ..., m} a (pre-softmax) score Φ_j(x), and the predicted label is given by j* := argmax_{j∈{1,...,m}} Φ_j(x). One commonly used measure of the distortion between the outputs at x and another data point y ∈ R^n is given as

d_1(Φ(x), Φ(y)) := (Φ_{j*}(x) − Φ_{j*}(y))².

On the other hand, the vector [Φ_j(x)]_{j=1}^m is usually normalized to a probability vector [Φ̃_j(x)]_{j=1}^m by applying the softmax function, namely Φ̃_j(x) := exp(Φ_j(x)) / Σ_{i=1}^m exp(Φ_i(x)). This, in turn, gives another measure of the distortion between Φ(x), Φ(y) ∈ R^m, namely

d_2(Φ(x), Φ(y)) := (Φ̃_{j*}(x) − Φ̃_{j*}(y))²,

where j* := argmax_{j∈{1,...,m}} Φ_j(x) = argmax_{j∈{1,...,m}} Φ̃_j(x). An important property of the softmax function is its invariance under translation by a vector [c, ..., c]^T ∈ R^m, where c ∈ R is a constant. By definition, only d_2 respects this invariance, while d_1 does not.
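As a small illustration, a sketch of these two measures in PyTorch (function names are our own):

```python
import torch

def d1(phi_x, phi_y):
    """Squared distortion in the pre-softmax score of the predicted label j*."""
    j_star = phi_x.argmax()
    return (phi_x[j_star] - phi_y[j_star]) ** 2

def d2(phi_x, phi_y):
    """Squared distortion in the post-softmax probability of the predicted label j*."""
    j_star = phi_x.argmax()
    p_x, p_y = phi_x.softmax(dim=0), phi_y.softmax(dim=0)
    return (p_x[j_star] - p_y[j_star]) ** 2

phi_x = torch.tensor([2.0, 0.5, -1.0])   # pre-softmax scores at x
phi_y = torch.tensor([1.2, 0.7, -0.5])   # pre-softmax scores at an obfuscation y
print(d1(phi_x, phi_y), d2(phi_x, phi_y))
# translation invariance: d2(phi_x + 3.0, phi_y + 3.0) equals d2(phi_x, phi_y),
# whereas d1 is unchanged too only because the shift cancels in the difference
```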

Example 8 (Measure of distortion for a regression task) Consider a regression model Φ : R^n → R^m and an input signal x ∈ R^n. One can then define the measure of distortion between the outputs at x and another data point y ∈ R^n as

d_3(Φ(x), Φ(y)) := ‖Φ(x) − Φ(y)‖_2².

Sometimes it is reasonable to consider only a certain subset of components J ⊆ {1, ..., m} of the output vectors instead of all m entries. Denoting the vector formed by the corresponding entries by Φ_J(x), the measure of distortion between the outputs can be defined as

d_4(Φ(x), Φ(y)) := ‖Φ_J(x) − Φ_J(y)‖_2².

The measure d_4 will be used in our experiments on radio maps in Subsection 4.3.

3.2 Implementation

The RDE mask from Definition 2 was defined as a solution to

min_{s ∈ {0,1}^k} D(x, s, V_s, Φ)    s.t.    ‖s‖_0 ≤ ℓ.

In practice, we need to relax this problem. We offer the following three approaches.

3.2.1 ℓ_1-relaxation with Lagrange multiplier.

The RDE mask can be approximately computed by finding an approximate solution to the following relaxed minimization problem:

min_{s ∈ [0,1]^k} D(x, s, V_s, Φ) + λ‖s‖_1,    (P1)

where λ > 0 is a hyperparameter controlling the sparsity level. Note that the optimization problem is not necessarily convex, so the solution might not be unique.

The expected distortion D(x, s, V_s, Φ) can typically be approximated by a simple Monte-Carlo estimate, i.e., by averaging over i.i.d. samples from V_s. After estimating D(x, s, V_s, Φ), one can optimize the mask s with stochastic gradient descent (SGD) to solve the optimization problem (P1).
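To make this concrete, the following is a minimal PyTorch sketch of the procedure for a classifier with the Gaussian-noise obfuscation from Example 6 and the measure d_2 from Example 7; the model, input shape, and hyperparameter values are placeholders rather than the exact settings of our experiments:

```python
import torch

def l1_relaxed_rde(model, x, lam=0.6, steps=2000, n_samples=64, lr=3e-3):
    """Approximate the RDE mask by gradient descent on the relaxed objective (P1)."""
    model.eval()
    with torch.no_grad():
        p_x = model(x.unsqueeze(0)).softmax(dim=1)[0]  # post-softmax scores of x
        j_star = p_x.argmax()                          # predicted label j*
    mu, sigma = x.mean(), x.std()                      # adaptive Gaussian noise (Example 6)
    s = torch.ones_like(x, requires_grad=True)         # mask initialized with ones
    opt = torch.optim.Adam([s], lr=lr)
    for _ in range(steps):
        v = mu + sigma * torch.randn((n_samples,) + x.shape)  # i.i.d. samples from V_s
        y = s * x + (1 - s) * v                        # batch of obfuscations
        p_y = model(y).softmax(dim=1)[:, j_star]
        distortion = ((p_x[j_star] - p_y) ** 2).mean() # Monte-Carlo estimate of D
        loss = distortion + lam * s.abs().sum()        # l1-relaxed objective (P1)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            s.clamp_(0.0, 1.0)                         # keep s in [0,1]^k
    return s.detach()
```

The projection step keeps the mask feasible; equivalently, one could parameterize s through a sigmoid.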


3.2.2 Bernoulli relaxation.

By viewing the binary mask entries as Bernoulli random variables, s ∼ Ber(θ), and optimizing over θ, one can guarantee that the expected distortion D(x, s, V_s, Φ) is evaluated on binary masks s ∈ {0,1}^k. To encourage sparsity of the resulting mask, one can still apply ℓ_1-regularization on s, giving rise to the following optimization problem:

min_{θ ∈ [0,1]^k} E_{s∼Ber(θ)}[ D(x, s, V_s, Φ) + λ‖s‖_1 ].    (P2)

Optimizing the parameter θ requires a continuous relaxation to apply SGD. This can be done using the concrete distribution [17], which samples s from a continuous relaxation of the Bernoulli distribution.
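A minimal sketch of this relaxation with PyTorch's RelaxedBernoulli (concrete) distribution; here, `expected_distortion` is a placeholder callable mapping a (relaxed) mask s ∈ [0,1]^k to a differentiable Monte-Carlo estimate of D(x, s, V_s, Φ), as in the previous sketch:

```python
import torch
from torch.distributions import RelaxedBernoulli

def bernoulli_relaxed_rde(expected_distortion, k, lam=50.0, temperature=0.1,
                          steps=10000, lr=1e-4):
    """Optimize Bernoulli parameters theta for the objective (P2)."""
    theta = torch.full((k,), 0.5, requires_grad=True)
    opt = torch.optim.Adam([theta], lr=lr)
    for _ in range(steps):
        dist = RelaxedBernoulli(torch.tensor(temperature),
                                probs=theta.clamp(1e-6, 1 - 1e-6))
        s = dist.rsample()                    # reparameterized sample, differentiable in theta
        loss = expected_distortion(s) + lam * s.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            theta.clamp_(0.0, 1.0)            # keep theta in [0,1]^k
    return theta.detach()
```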

3.2.3 Matching pursuit.

As an alternative, one can also perform matching pursuit [18]. Here, the non-zero entries of s ∈ {0,1}^k are determined sequentially in a greedy fashion so as to minimize the resulting distortion in each step. More precisely, we start with the zero mask s⁰ = 0 and gradually build up the mask, updating s^t at step t by the rule

s^{t+1} = s^t + argmin_{e_j : s^t_j = 0} D(x, s^t + e_j, V_s, Φ).

Here, the minimization is taken over all standard basis vectors e_j ∈ R^k with s^t_j = 0. The algorithm terminates upon reaching a desired error tolerance or after a prefixed number of iterations. While this means that in each iteration we have to test every remaining entry of s, the method is applicable when k is small or when we are only interested in very sparse masks.
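A sketch of this greedy procedure; `expected_distortion` is again a placeholder callable mapping a binary mask in {0,1}^k to a Monte-Carlo estimate of D(x, s, V_s, Φ):

```python
import torch

def matching_pursuit_rde(expected_distortion, k, max_iters=10, tol=1e-3):
    """Greedily select mask entries that minimize distortion (Subsection 3.2.3)."""
    s = torch.zeros(k)
    for _ in range(max_iters):
        best_j, best_d = None, float("inf")
        for j in range(k):                    # test every entry not yet selected
            if s[j] == 1:
                continue
            e_j = torch.zeros(k)
            e_j[j] = 1.0
            d = expected_distortion(s + e_j)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is None:                    # all entries already selected
            break
        s[best_j] = 1.0                       # greedy update s^{t+1} = s^t + e_{j*}
        if best_d <= tol:                     # stop at the desired error tolerance
            break
    return s
```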

4 Experiments

With our experiments, we demonstrate the broad applicability of the generalized RDE framework. Moreover, our experiments illustrate how the different choices of obfuscation strategies, optimization procedures, measures of distortion, and input signal representations discussed in Section 3 can be leveraged in practice. We explain model decisions on various challenging data modalities, tailoring the input signal representation and measure of distortion to the domain and interpretation query. In Section 4.1, we focus on image classification, a common baseline task in the interpretability literature. In Sections 4.2 and 4.3, we consider two further, less commonly explored data modalities. Section 4.2 focuses on audio data, where the underlying task is to classify acoustic instruments based on short audio samples of distinct notes, while in Section 4.3 the underlying task is a regression with data in the form of physical simulations of urban environments. We also believe our explanation framework supports applications beyond interpretability tasks. An example is given in Section 4.3.2, where we add an RDE-inspired regularizer to the training objective of a radio map estimation model.

4.1 Images

We begin with the most common domain in the interpretability literature: image classification. The authors of [16] previously applied RDE to image data by considering pixel-wise perturbations. We refer to this method as Pixel RDE. Other explanation methods [20, 1, 2, 3] have also operated exclusively in the pixel domain. In [11], we challenged this customary practice by successfully applying RDE in a wavelet basis, where sparsity translates into piece-wise smooth images (also called cartoon-like images). The novel explanation method was coined CartoonX [11] and extracts the relevant piece-wise smooth part of an image. First, we review the Pixel RDE method and present experiments on the ImageNet dataset [4], which is commonly considered a challenging classification task. Finally, we present CartoonX and discuss its advantages. For all ImageNet experiments, we use the pre-trained MobileNetV3-Small [10], which achieved a top-1 accuracy of 67.668% and a top-5 accuracy of 87.402%, as the classifier.


Figure 1: Top row: original images correctly classified as (a) snail, (b) male duck, and (c) airplane. Middle row: Pixel RDEs. Bottom row: CartoonX. Notably, CartoonX is roughly piece-wise smooth and overall more interpretable than the jittery Pixel RDEs.

4.1.1 Pixel RDE.

Consider the following pixel-wise representation of an RGB image x ∈ R^{3×n}:

f : ∏_{i=1}^n R^3 → R^{n×3},    x = f(h_1, ..., h_n),

where h_i ∈ R^3 represents the three color channel values of the i-th pixel of the image x, i.e., (x_{i,j})_{j=1,...,3} = h_i. In Pixel RDE, a sparse mask s ∈ [0,1]^n with n entries, one for each pixel, is optimized to achieve low expected distortion D(x, s, V_s, Φ). The obfuscation of an image x with the pixel mask s and a distribution v ∼ V_s on ∏_{i=1}^n R^3 is defined as f(s ⊙ h + (1 − s) ⊙ v). In our experiments, we initialize the mask with ones, i.e., s_i = 1 for every i ∈ {1, ..., n}, and consider Gaussian noise perturbations V_s = N(μ, Σ). We set the noise mean μ ∈ R^{3×n} to the pixel value mean of the original image x and the covariance matrix Σ := σ² Id ∈ R^{3n×3n} to a diagonal matrix, with σ > 0 defined as the pixel value standard deviation of the original image x. We then optimized the pixel mask s for 2000 gradient descent steps on the ℓ_1-relaxation of the RDE objective (see Section 3.2.1), using the Adam optimizer with step size 0.003. We computed the distortion d(Φ(x), Φ(y)) in D(x, s, V_s, Φ) in the post-softmax activation of the predicted label, multiplied by a constant C = 100, i.e.,

d(Φ(x), Φ(y)) := C (Φ̃_{j*}(x) − Φ̃_{j*}(y))².

The expected distortion D(x, s, V_s, Φ) was approximated by a simple Monte-Carlo estimate with 64 noise perturbation samples. For the sparsity level, we set the Lagrange multiplier to λ = 0.6. All images were resized to 256 × 256 pixels. In the middle row of Figure 1, we show three example explanations with Pixel RDE for an image of a snail, a male duck, and an airplane, all from the ImageNet dataset. Pixel RDE highlights as relevant both the snail's inner shell and part of its head, the lower segment of the male duck along with various lines in the water, and the airplane's fuselage and part of its rudder.

4.1.2 CartoonX.

Formally, we represent an RGB image x ∈ [0,1]^{3×n} by its wavelet coefficients h = {h_i}_{i=1}^n ∈ ∏_{i=1}^n R^3 with J ∈ {1, ..., ⌊log₂ n⌋} scales, as x = f(h), where f is the discrete inverse wavelet transform. Each h_i = (h_{i,c})_{c=1}^3 ∈ R^3 contains three wavelet coefficients of the image, one for each color channel, and is associated with a scale k_i ∈ {1, ..., J} and a position in the image. Low scales describe high frequencies and high scales describe low frequencies at the respective image position. We briefly illustrate the wavelet coefficients in Figure 2, which visualizes the discrete wavelet transform of an image.

Figure 2: Discrete wavelet transform of an image: (a) original image, (b) discrete wavelet transform. The coefficients of the largest quadrant in (b) correspond to the lowest scale, and coefficients of smaller quadrants gradually build up to the highest scales, which are located in the four smallest quadrants. Three nested L-shaped quadrants represent horizontal, vertical, and diagonal edges at a resolution determined by the associated scale.

CartoonX [11] is a special case of the generalized RDE framework, particularly a special case of Example 2, and optimizes a sparse mask s ∈ [0,1]^n on the wavelet coefficients (see Figure 3c) so that the expected distortion D(x, s, V_s, Φ) remains small. The obfuscation of an image x with a wavelet mask s and a distribution v ∼ V_s on the wavelet coefficients is f(s ⊙ h + (1 − s) ⊙ v). In our experiments, we used Gaussian noise perturbations and chose the standard deviation and mean adaptively for each scale: the standard deviation and mean for wavelet coefficients of scale j ∈ {1, ..., J} were chosen as the standard deviation and mean of the wavelet coefficients of scale j of the original image. Figure 3d shows the obfuscation f(s ⊙ h + (1 − s) ⊙ v) with the final wavelet mask s after the RDE optimization procedure. In Pixel RDE, the mask itself is the explanation, as it lies in pixel space (see the middle row in Figure 1), whereas the CartoonX mask lies in the wavelet domain. To go back to the natural image domain, we multiply the wavelet mask element-wise with the wavelet coefficients of the original greyscale image and invert this product back to pixel space with the discrete inverse wavelet transform. The inversion is finally clipped to [0,1], as are obfuscations during the RDE optimization, to avoid overflow (we assume here that the pixel values of x are normalized to [0,1]). The clipped inversion in pixel space is the final CartoonX explanation (see Figure 3e).

The following points should be kept in mind when interpreting the final CartoonX explanation, i.e., the inversion of the wavelet coefficient mask: (1) CartoonX provides the relevant piece-wise smooth part of the image. (2) The inversion of the wavelet coefficient mask was optimized to be sparse in the wavelet basis, not in pixel space. (3) A region that is black in the inversion could nevertheless be relevant if it was already black in the original image. This is due to the multiplication of the mask with the wavelet coefficients of the greyscale image before taking the discrete inverse wavelet transform. (4) Bright high-resolution regions are relevant in high resolution, and bright low-resolution regions are relevant in low resolution. (5) It is inexpensive for CartoonX to mark large regions in low resolution as relevant. (6) It is expensive for CartoonX to mark large regions in high resolution as relevant.

In Figure 1, we compare CartoonX to Pixel RDE.


Figure 3: CartoonX machinery: (a) image classified as park bench, (b) discrete wavelet transform of the image, (c) final mask on the wavelet coefficients after the RDE optimization procedure, (d) obfuscation with the final wavelet mask and noise, (e) final CartoonX explanation, (f) Pixel RDE for comparison.

The piece-wise smooth wavelet explanations are more interpretable than the jittery Pixel RDEs. In particular, CartoonX asserts that the snail's shell without the head suffices for the classification, unlike Pixel RDE, which insinuated that both the inner shell and part of the head are relevant. Moreover, CartoonX shows that the water gives the classifier context for the classification of the duck, which one could only have guessed from the Pixel RDE. Both Pixel RDE and CartoonX state that the head of the duck is not relevant. Lastly, CartoonX, like Pixel RDE, confirms that the wings play a subordinate role in the classification of the airplane.

4.1.3 Why explain in the wavelet basis?

Wavelets provide optimal representations for piece-wise smooth 1D functions [5] and also represent 2D piece-wise smooth images, called cartoon-like images [12], efficiently [21]. Indeed, sparse vectors in the wavelet coefficient space encode cartoon-like images reasonably well [25], certainly better than sparse pixel representations. Moreover, the optimization process underlying CartoonX produces sparse vectors in the wavelet coefficient space. Hence CartoonX typically generates cartoon-like images as explanations. This is the fundamental difference to Pixel RDE, which produces rough, jittery, and pixel-sparse explanations. Cartoon-like images are more interpretable and provide a natural model of simplified images. Since the goal of an RDE explanation is to generate an easy-to-interpret simplified version of the input signal, we argue that CartoonX explanations are more appropriate for image classification than Pixel RDEs. Our experiments confirm that CartoonX explanations are roughly piece-wise smooth and overall more interpretable than Pixel RDEs (see Figure 1).

4.1.4 CartoonX implementation.

Throughout our CartoonX experiments, we chose the Daubechies 3 wavelet system, J = 5 levels of scales, and zero padding for the discrete wavelet transform. For the implementation of the discrete wavelet transform, we used the Pytorch Wavelets package, which supports gradient computation in PyTorch. Distortion


Figure 4: Scatter plot of rate-distortion in the pixel basis and the wavelet basis (distortion versus the normalized sparsity ‖s‖₁/n; legend: DWT-based mask, pixel-based mask, and their centers of mass). Each point is an explanation of a distinct image in the ImageNet dataset, with distortion and normalized ℓ₁-norm measured for the final mask. The wavelet mask achieves lower distortion than the pixel mask while using fewer coefficients.

was computed as in the Pixel RDE experiments. The perturbations v ∼ V_s on the wavelet coefficients were chosen as Gaussian noise with standard deviation and mean computed adaptively per scale. As in the Pixel RDE experiments, the wavelet mask was optimized for 2000 steps with the Adam optimizer to minimize the ℓ_1-relaxation of the RDE objective. We used λ = 3 for CartoonX.
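For illustration, a minimal sketch of the masking and inversion step with this package (the masks and the greyscale conversion here are simplified placeholders, not the optimized CartoonX mask):

```python
import torch
from pytorch_wavelets import DWTForward, DWTInverse

# Daubechies 3 wavelets, J = 5 scales, zero padding, as in our experiments
dwt = DWTForward(J=5, wave='db3', mode='zero')
idwt = DWTInverse(wave='db3', mode='zero')

x = torch.rand(1, 3, 256, 256)               # image batch, pixel values in [0,1]
grey = x.mean(dim=1, keepdim=True)           # simple greyscale placeholder
yl, yh = dwt(grey)                           # lowpass and per-scale highpass coefficients

# One mask entry per coefficient; in CartoonX these are optimized, here random placeholders
sl = torch.rand_like(yl)
sh = [torch.rand_like(c) for c in yh]

# Multiply mask with wavelet coefficients and invert back to pixel space
explanation = idwt((sl * yl, [s * c for s, c in zip(sh, yh)]))
explanation = explanation.clamp(0.0, 1.0)    # clip to [0,1] as in the final CartoonX step
```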

4.1.5 Efficiency of CartoonX.

Finally, we compare Pixel RDE to CartoonX quantitatively by analyzing the distortion and sparsity associated with the final explanation mask. Intuitively, we expect the CartoonX method to have an efficiency advantage, since the discrete wavelet transform already encodes natural images sparsely, and hence fewer wavelet coefficients than pixel coefficients are required to represent an image. Our experiments confirm this intuition, as can be seen in the scatter plot in Figure 4.

4.2 Audio

We consider the NSynth dataset [6], a library of short audio samples of distinct notes played on a variety of instruments. We pre-process the data by computing the power-normalized magnitude spectrum and phase information using the discrete Fourier transform on a logarithmic scale from 20 to 8000 Hertz. Each data instance is then represented by the magnitude and the phase of its Fourier coefficients together with the discrete inverse Fourier transform (see Example 3).

4.2.1 Explaining the classifier.

Our model Φ is a network trained to classify acoustic instruments. We compute the distortion with respect to the pre-softmax scores, i.e., we deploy d_1 from Example 7 as the measure of distortion. We follow the obfuscation strategy described in Example 5 and train an inpainter G to generate the obfuscation G(h, s, z). Here, h corresponds to the representation of a signal, s is a binary mask, and z is a normally distributed seed for the generator.

We use a residual CNN architecture for G, with added noise in the input and in the deep features. More details can be found in Section 4.2.3. We train G until the outputs are found to be satisfactory, as exemplified by the outputs in Figure 5.

To compute the explanation maps, we numerically solve (P2), as discussed in Subsection 3.2. In particular, s is a binary mask indicating whether the phase and magnitude information of a certain frequency should be dropped, specified as a Bernoulli variable s ∼ Ber(θ). We chose a regularization parameter of λ = 50 and minimized the corresponding objective using the Adam optimizer with a step size of 10^{-5}


Figure 5: Inpainted bass: example inpainting from G. The bottom plot depicts phase versus frequency and the top plot depicts magnitude versus frequency. The random binary mask is represented by the green parts. The axes for the inpainted signal (black) and the original signal (blue, dashed) are offset to improve visibility. Note how the inpainter generates plausible peaks in the magnitude and phase spectra, especially with regard to rapid (≥ 600 Hz) versus smooth (< 270 Hz) changes in phase.

in 10^6 iterations. For the concrete distribution, we used a temperature of 0.1. Two examples resulting from this process can be seen in Figure 6.

Notice that the method reveals a strong reliance of the classifier on the low frequencies (30 Hz to 60 Hz) to classify the top sample in Figure 6 as a guitar, as only the guitar samples have this low-frequency slope in the spectrum. In contrast, classifying the bass sample relies more on the continuous signal between 100 Hz and 230 Hz.

4.2.2 Magnitude vs Phase.

In the above experiment, we represented the signals by the magnitude and phase information at each frequency; hence the mask s acts on each frequency. Now we consider the interpretation query of whether the entire magnitude spectrum or the entire phase spectrum is more relevant for the prediction. Accordingly, we consider the representation discussed in Example 4 and apply the mask s to turn the whole magnitude spectrum or the whole phase spectrum on or off. Furthermore, we can optimize s not only for one datum but for all samples from a class. This extracts whether magnitude or phase is more important for predicting samples from a specific class.

For this, we again minimized (P2) (averaged over all samples of a class), with θ as the Bernoulli parameter, using the Adam optimizer for 2 × 10^5 iterations with a step size of 10^{-4} and the regularization parameter λ = 30. Again, a temperature of t = 0.1 was used for the concrete distribution.
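A sketch of this two-component mask (the spectra, the perturbations v_m, v_w, and all names are placeholders): following Example 4, the mask has one entry for the entire magnitude spectrum and one for the entire phase spectrum.

```python
import torch
from torch.distributions import RelaxedBernoulli

def obfuscate_mag_phase(m, w, s, v_m, v_w):
    """Example 4 obfuscation: s[0] masks the magnitude spectrum, s[1] the phase spectrum."""
    m_obf = s[0] * m + (1 - s[0]) * v_m       # perturbed magnitude spectrum
    w_obf = s[1] * w + (1 - s[1]) * v_w       # perturbed phase spectrum
    return torch.fft.ifft(m_obf * torch.exp(1j * w_obf)).real

theta = torch.full((2,), 0.5, requires_grad=True)               # one parameter per component
s = RelaxedBernoulli(torch.tensor(0.1), probs=theta).rsample()  # temperature 0.1
```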

Instrument    Magnitude importance    Phase importance
Organ         0.829                   1.0
Guitar        0.0                     0.999
Flute         0.092                   1.0
Bass          1.0                     1.0
Reed          0.136                   1.0
Vocal         1.0                     1.0
Mallet        0.005                   0.217
Brass         0.999                   1.0
Keyboard      0.003                   1.0
String        1.0                     0.0

Table 2: Magnitude importance versus phase importance.

From the results of these computations, shown in Table 2, we can observe that there is a clear difference across instruments in what the classifier bases its decision on.


Figure 6: Interpreting the NSynth model: (a) guitar, (b) bass. The optimized importance parameter θ (green) overlayed on top of the DFT (blue). For each of guitar and bass, the top graph shows the power-normalized magnitude and the bottom the phase. Notice the solid peaks between 30 Hz and 60 Hz for the guitar and between 100 Hz and 230 Hz for the bass. These occur because the model relies on those parts of the spectra for the classification. Notice also how many parts of the spectrum are important even when the magnitude is near zero; this indicates that the model pays attention to whether those frequencies are missing.

The classification of most instruments is largely based on phase information. For the mallet, the values are low for both magnitude and phase, which means that the expected distortion is very low compared to the ℓ_1-norm of the mask, even when the signal is completely inpainted. This underlines that the regularization parameter λ may have to be adjusted for different data instances, especially when measuring distortion in the pre-softmax scores.

4.2.3 Architecture of the inpainting network G.

Here, we briefly describe the architecture of the inpainting network G that was used to generate obfuscations of the target signals. In particular, Figure 7 shows a diagram of the network G, and Table 3 provides information about its layers.

4.3 Radio Maps

In this subsection, we assume a set of transmitting devices (Tx) broadcasting a signal within a city. The received signal strength varies with location and depends on physical factors such as line of sight, reflection, and diffraction. We consider the regression problem of estimating a function that assigns the proper signal strength to each location in the city. Our dataset D is RadioMapSeer [14], containing 700 city maps, 80 Tx per map, and corresponding grayscale labels encoding the signal strength at every location. Our model Φ receives as input x = [x^{(0)}, x^{(1)}, x^{(2)}], where x^{(0)} is a binary map of the Tx locations, x^{(1)} is a noisy binary map of the city (where a few buildings are missing), and x^{(2)} is a grayscale image representing a number of


Layer                 Filter size    Output shape        # Params
Conv1d-1              21             [-1, 32, 1024]      4,736
ReLU-2                               [-1, 32, 1024]      0
Conv1d-3              21             [-1, 64, 502]       43,072
ReLU-4                               [-1, 64, 502]       0
BatchNorm1d-5                        [-1, 64, 502]       128
Conv1d-6              21             [-1, 128, 241]      172,160
ReLU-7                               [-1, 128, 241]      0
BatchNorm1d-8                        [-1, 128, 241]      256
Conv1d-9              21             [-1, 16, 112]       43,024
ReLU-10                              [-1, 16, 112]       0
BatchNorm1d-11                       [-1, 16, 112]       32
ConvTranspose1d-12    21             [-1, 64, 243]       43,072
ReLU-13                              [-1, 64, 243]       0
BatchNorm1d-14                       [-1, 64, 243]       128
ConvTranspose1d-15    21             [-1, 128, 505]      172,160
ReLU-16                              [-1, 128, 505]      0
BatchNorm1d-17                       [-1, 128, 505]      256
ConvTranspose1d-18    20             [-1, 64, 1024]      163,904
ReLU-19                              [-1, 64, 1024]      0
BatchNorm1d-20                       [-1, 64, 1024]      128
Skip connection                      [-1, 103, 1024]     0
Conv1d-21             7              [-1, 128, 1024]     92,416
ReLU-22                              [-1, 128, 1024]     0
Conv1d-23             7              [-1, 2, 1024]       1,794
ReLU-24                              [-1, 2, 1024]       0

Total number of parameters: 737,266

Table 3: Layer table of the inpainting model for the NSynth task.

ground truth measurements of the signal strength at the measured locations, and zero elsewhere. We apply the UNet [22, 14, 13] architecture and train Φ to output an estimation of the signal strength throughout the city that interpolates the input measurements.

Apart from the model Φ, we also have a simpler model Φ₀, which only receives the city map and the Tx locations as inputs and is trained with unperturbed input city maps. This second model Φ₀ will be deployed to inpaint measurements as input to Φ. See Figures 8a, 8b, and 8c for examples of a ground truth map and the estimations of Φ and Φ₀, respectively.

Figure 7: Diagram of the inpainting network for NSynth. Inputs: the magnitude and phase spectrum, the binary mask, and Gaussian noise (injected at the input and at deep features); the network uses skip connections.


Figure 8: Radio map estimations: (a) ground truth, (b) Φ estimation, (c) Φ₀ estimation. The radio map (gray), input buildings (blue), and input measurements (red).

4.3.1 Explaining Radio Map Φ.

Observe that in Figure 8a there is a building missing from the input (the black one), and in Figure 8b, Φ in-fills this building with a shadow. Since Φ is a black-box model, it is unclear why it made this decision. Did it rely on signal measurements or on building patterns? To address this, we consider each building as a cluster of pixels and each measurement as a potential target for our mask s = [s^{(1)}, s^{(2)}], where s^{(1)} acts on buildings and s^{(2)} acts on measurements. We then apply matching pursuit (see Subsection 3.2.3) to find a minimal mask s of critical components (buildings and measurements).

To be precise, suppose we are given a target input signal x = [x^{(0)}, x^{(1)}, x^{(2)}]. Let k₁ denote the number of buildings in x^{(1)} and k₂ the number of measurements in x^{(2)}. Consider the function f₁ that takes as input vectors in {0,1}^{k₁}, indicating the existence of buildings in x^{(1)}, and maps them to the corresponding city map in the original city map format. Analogously, consider the function f₂ that takes as input the measurements in R^{k₂} and maps them to the corresponding grayscale image in the original measurement format. Then f₁ and f₂ encode the locations of the buildings and measurements in the target signal x = [x^{(0)}, f₁(h^{(1)}), f₂(h^{(2)})], where h^{(1)} and h^{(2)} denote the building and measurement representations of x under f₁ and f₂. When s^{(1)} has a zero entry, i.e., a building in h^{(1)} was not selected, we replace the value in the obfuscation with zero (this corresponds to a constant perturbation equal to zero). The obfuscation of the target signal x with a mask s = [s^{(1)}, s^{(2)}] and perturbations v = [v^{(1)}, v^{(2)}] := [0, v^{(2)}] then becomes

y := [x^{(0)}, f₁(s^{(1)} ⊙ h^{(1)}), f₂(s^{(2)} ⊙ h^{(2)} + (1 − s^{(2)}) ⊙ v^{(2)})].

While it is natural to model masking out a building by simply zeroing out the corresponding cluster of pixels, i.e., choosing v^{(1)} = 0, we also need to properly choose v^{(2)} for the entries where the mask s^{(2)} takes the value 0, in order to obtain appropriate obfuscations. For this, we can deploy the second model Φ₀ as an inpainter. We consider the following two extreme obfuscation strategies. The first is to set v^{(2)} to zero as well, i.e., to simply remove the unchosen measurements from the input, with the underlying assumption that any subset of measurements is valid for a city map. In the other extreme case, we inpaint all unchosen measurements by sampling at their locations the estimated radio map obtained by Φ₀ from the buildings selected by s^{(1)}.
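A compact sketch of this structured obfuscation; f₁, f₂, and the inpainting callable are placeholders standing in for the encodings and the model Φ₀ described above:

```python
import torch

def obfuscate_radio_input(x0, h1, h2, s1, s2, f1, f2, inpaint_v2):
    """Obfuscation y = [x0, f1(s1*h1), f2(s2*h2 + (1 - s2)*v2)].

    `f1`, `f2` encode buildings/measurements back to image format, and
    `inpaint_v2` is a placeholder returning perturbation values v2 for the
    unchosen measurements (zeros, or samples of the Phi_0 estimate).
    """
    city = f1(s1 * h1)                     # unchosen buildings are zeroed out (v1 = 0)
    v2 = inpaint_v2(x0, city, h2)          # e.g. zeros, or Phi_0's radio map sampled
                                           # at the measurement locations
    measurements = f2(s2 * h2 + (1 - s2) * v2)
    return [x0, city, measurements]

# the "remove unchosen measurements" extreme corresponds to v2 = 0:
zero_fill = lambda x0, city, h2: torch.zeros_like(h2)
```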

The two extreme measurement completion methods correspond to two extremes of the interpretation query. Filling in the missing measurements with Φ₀ tends to overestimate the strength of the signal, because there are fewer buildings to obstruct the transmissions. The empty mask completes all measurements to the maximal possible signal strength, the free-space radio map. The overestimation of the signal strength is reduced when more measurements and buildings are chosen, resulting in darker estimated radio maps. Thus, this strategy is related to the query of which measurements and buildings are important for darkening the free-space radio map, turning it into the radio map produced by Φ. In the other extreme, adding more measurements to the mask with a fixed set of buildings typically brightens the resulting radio map. This allows us to ask which measurements are most important for brightening the radio map.


Between these two extreme strategies lies a continuum of completion methods, in which a random subset of the unchosen measurements is sampled from Φ₀ while the rest are set to zero. Examples of explanations of a prediction Φ(x) according to these methods are presented in Figure 9. Since we only care about specific small patches, exemplified by the green boxes, the distortion here is measured with respect to the ℓ₂ distance between the output images restricted to the corresponding region (see also Example 8).

Figure 9: Radio map queries and explanations: (a) estimated map, (b) explanation when inpainting all unchosen measurements, (c) explanation when inpainting 2.5% of the unchosen measurements. The radio map (gray), input buildings (blue), input measurements (red), and area of interest (green box). The middle panel represents the query “How to fill in the image with shadows?”, while the right panel represents the query “How to fill in the image both with shadows and bright spots?”. We inpaint with Φ₀.

When the query is how to darken the free-space radio map (Figure 9b), the optimized mask s suggests that samples in the shadow of the missing building are the most influential for the prediction. These dark measurements are supposed to be in line-of-sight of a Tx, which indicates that the network deduced that there is a missing building. When the query is how to fill in the image both with shadows and bright spots (Figure 9c), both samples in the shadow of the missing building and samples right before the building are influential. This indicates that the network used the bright measurements in line-of-sight and avoided predicting an overly large building. To understand the chosen buildings, note that Φ is based on a composition of UNets and can thus be interpreted as a procedure that extracts high-level and global information from the inputs to synthesize the output. The locations of the chosen buildings in Figure 9 reflect this global nature.

4.3.2 Interpretation-Driven Training.

We now discuss an example application of the explanations obtained by the RDE approach described above, called interpretation-driven training. When a missing building is in line-of-sight of a Tx, we would like Φ to reconstruct this building relying on samples in the shadow of the building rather than on patterns in the city. To reduce the reliance of Φ on the city information in this situation, one can add a regularization term to the training loss which promotes explanations relying on measurements. Suppose x = [x^{(0)}, x^{(1)}, x^{(2)}] contains a missing input building in line-of-sight of the Tx location, and denote the subset of pixels of the missing building in the city map by J_x. Denote the prediction of Φ restricted to the subset J_x by Φ_{J_x}. Moreover, define x̃ := [x^{(0)}, 0, x^{(2)}] to be the modification of x with all input buildings masked out. We then define the interpretation loss for x as

ℓ_int(Φ, x) := ‖Φ_{J_x}(x) − Φ_{J_x}(x̃)‖_2².

The interpretation-driven training objective then regularizes Φ during training by adding the interpretation loss for all inputs x that contain a missing input building in line-of-sight of the Tx location. An example comparison between explanations of the vanilla RadioUNet Φ and the interpretation-driven network Φ_int is given in Figure 10.
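A short sketch of this regularizer (the stacking of the input channels and the boolean pixel mask `jx_mask` for J_x are placeholder conventions, not the exact implementation):

```python
import torch

def interpretation_loss(model, x0, x1, x2, jx_mask):
    """l_int(Phi, x) := || Phi_Jx(x) - Phi_Jx(x_tilde) ||_2^2 (Subsection 4.3.2)."""
    pred = model(torch.stack([x0, x1, x2]).unsqueeze(0))       # Phi(x)
    x1_masked = torch.zeros_like(x1)                           # mask out all input buildings
    pred_tilde = model(torch.stack([x0, x1_masked, x2]).unsqueeze(0))  # Phi(x_tilde)
    return ((pred[..., jx_mask] - pred_tilde[..., jx_mask]) ** 2).sum()
```

During training, this term is added to the standard regression loss for all inputs x with a missing building in line-of-sight of the Tx.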


Figure 10: Radio map estimations, interpretation-driven training vs. vanilla training: (a) vanilla Φ estimation, (b) interpretation-driven Φ_int estimation, (c) vanilla Φ explanation, (d) interpretation-driven Φ_int explanation. The radio map (gray), input buildings (blue), input measurements (red), and domain of the missing building (green box).

5 Conclusion

In this work, we presented the Rate-Distortion Explanation (RDE) framework in a revised and comprehensive manner. Our framework is flexible enough to answer various interpretation queries by considering suitable data representations tailored to the underlying domain and query. We demonstrated this, and the overall efficacy of the RDE framework, on an image classification task, on an audio signal classification task, and on a radio map estimation task, a seldom explored regression task.

References

[1] Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE, 10(7):e0130140, 2015.

[2] Chun-Hao Chang, Elliot Creager, Anna Goldenberg, and David Duvenaud. Explaining image classifiers by counterfactual generation. In Proceedings of the 7th International Conference on Learning Representations, ICLR, 2019.

[3] Piotr Dabkowski and Yarin Gal. Real time image saliency for black box classifiers. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NeurIPS, pages 6970–6979, 2017.

[4] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, pages 248–255, 2009.

[5] Ronald A. DeVore. Nonlinear approximation. Acta Numerica, 7:51–150, 1998.

[6] Jesse Engel, Cinjon Resnick, Adam Roberts, Sander Dieleman, Mohammad Norouzi, Douglas Eck, and Karen Simonyan. Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, ICML, volume 70, pages 1068–1077, 2017.

[7] R. C. Fong and A. Vedaldi. Interpretable explanations of black boxes by meaningful perturbation. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), pages 3449–3457, 2017.

[8] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems, NeurIPS, pages 2672–2680, 2014.

[9] Cosmas Heiß, Ron Levie, Cinjon Resnick, Gitta Kutyniok, and Joan Bruna. In-distribution interpretability for challenging modalities. Preprint arXiv:2007.00758, 2020.

[10] Andrew Howard, Mark Sandler, Bo Chen, Weijun Wang, Liang-Chieh Chen, Mingxing Tan, Grace Chu, Vijay Vasudevan, Yukun Zhu, Ruoming Pang, Hartwig Adam, and Quoc Le. Searching for MobileNetV3. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 1314–1324, 2019.

[11] Stefan Kolek, Duc Anh Nguyen, Ron Levie, Joan Bruna, and Gitta Kutyniok. Cartoon explanations of image classifiers. Preprint arXiv:2110.03485, 2021.

[12] Gitta Kutyniok and Wang-Q Lim. Compactly supported shearlets are optimally sparse. Journal of Approximation Theory, 163(11):1564–1589, 2011.

[13] Ron Levie, Cagkan Yapar, Gitta Kutyniok, and Giuseppe Caire. Pathloss prediction using deep learning with applications to cellular optimization and efficient D2D link scheduling. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 8678–8682, 2020.

[14] Ron Levie, Cagkan Yapar, Gitta Kutyniok, and Giuseppe Caire. RadioUNet: Fast radio map estimation with convolutional neural networks. IEEE Transactions on Wireless Communications, 20(6):4001–4015, 2021.

[15] Scott M. Lundberg and Su-In Lee. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NeurIPS, pages 4768–4777, 2017.

[16] Jan Macdonald, Stephan Wäldchen, Sascha Hauch, and Gitta Kutyniok. A rate-distortion framework for explaining neural network decisions. Preprint arXiv:1905.11092, 2019.

[17] Chris J. Maddison, Andriy Mnih, and Yee Whye Teh. The concrete distribution: A continuous relaxation of discrete random variables. Preprint arXiv:1611.00712, 2016.

[18] S. G. Mallat and Zhifeng Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41(12):3397–3415, 1993.

[19] M. Narasimha and A. Peterson. On the computation of the discrete cosine transform. IEEE Transactions on Communications, 26(6):934–936, 1978.

[20] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144. Association for Computing Machinery, 2016.

[21] Justin K. Romberg, Michael B. Wakin, and Richard G. Baraniuk. Wavelet-domain approximation and compression of piecewise smooth images. IEEE Transactions on Image Processing, 15:1071–1087, 2006.

[22] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), volume 9351 of LNCS, pages 234–241, 2015.

[23] Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. Learning important features through propagating activation differences. In Proceedings of the 34th International Conference on Machine Learning, ICML, volume 70, pages 3145–3153, 2017.

[24] Daniel Smilkov, Nikhil Thorat, Been Kim, Fernanda Viégas, and Martin Wattenberg. SmoothGrad: removing noise by adding noise. In Workshop on Visualization for Deep Learning, ICML, 2017.

[25] Stéphane Mallat. Chapter 11.3. In Stéphane Mallat, editor, A Wavelet Tour of Signal Processing (Third Edition), pages 535–610. Academic Press, Boston, third edition, 2009.

[26] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning, ICML, volume 70, pages 3319–3328, 2017.

[27] Jacopo Teneggi, Alexandre Luster, and Jeremias Sulam. Fast hierarchical games for image explanations. Preprint arXiv:2104.06164, 2021.

[28] Stephan Wäldchen, Jan Macdonald, Sascha Hauch, and Gitta Kutyniok. The computational complexity of understanding network decisions. Preprint arXiv:1905.09163, 2019.

[29] Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S. Huang. Generative image inpainting with contextual attention. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR, pages 5505–5514, 2018.