iBreakDown: Uncertainty of Model Explanations for Non-Additive Predictive Models
A PREPRINT
Alicja Gosiewska
Faculty of Mathematics and Information Science
Warsaw University of Technology
alicjagosiewska@gmail.com
https://orcid.org/0000-0001-6563-5742
Przemyslaw Biecek
Faculty of Mathematics, Informatics and Mechanics
University of Warsaw
Faculty of Mathematics and Information Science
Warsaw University of Technology
przemyslaw.biecek@gmail.com
https://orcid.org/0000-0001-8423-1823
March 28, 2019
ABSTRACT
Explainable Artificial Intelligence (XAI) has attracted a lot of attention recently. Explainability is presented as a remedy for the lack of trust in model predictions. Model-agnostic tools such as LIME, SHAP, or Break Down promise instance-level interpretability for any complex machine learning model. But how certain are these explanations? Can we rely on additive explanations for non-additive models? In this paper, we examine the behavior of model explainers in the presence of interactions. We define two sources of uncertainty: model level uncertainty and explanation level uncertainty. We show that including interactions reduces explanation level uncertainty. We introduce a new method, iBreakDown, that generates non-additive explanations with local interactions.
1 Motivation
Machine learning is an integral part of the modern world. Predictive models are used in almost every aspect of our lives: at school, at work, in hospitals, police stations, or dating services. They are useful, yet at the same time they can be a serious threat. Models that make unexplainable predictions may be harmful (O'Neil, 2016). The need for higher transparency and explainability of models has been a hot topic in recent years, both in the machine learning community (Gill and Hall, 2018) and in the legal community, which coined the phrase "Right to Explain" in the discussion around the General Data Protection Regulation (Wachter et al., 2017; Edwards and Veale, 2018). Since models affect our lives so much, we should have the right to know what drives their predictions.
In recent years, a number of tools in the field of interpretable machine learning have been developed for image data (Simonyan et al., 2013), text data (Ribeiro et al., 2018), and tabular data (Molnar, 2019). There are several aspects of predictive models that can be explained (Biecek, 2018a), yet in this article we focus only on local explanations for tabular data. This kind of explanation is designed to illustrate model behavior at the level of a single prediction. Among the best-known local explanations for tabular data are SHAP (Lundberg and Lee, 2017), LIME (Ribeiro et al., 2016), and Break Down (Staniak and Biecek, 2018). Such tools are becoming more and more popular, but little is said about the quality of such explanations (Guidotti and Ruggieri, 2018; Yeh et al., 2019). Since predictive models are complex and good explanations are simple, there is always a trade-off between fidelity and readability of explanations. Sparse explanations are only approximations and simplifications of the underlying model. It is important to assess not only the accuracy of a model but also the certainty of such explanations (Alvarez-Melis and Jaakkola, 2018).
In this article, we distinguish two types of uncertainty: at the model level and at the explanation level. For each type, we provide a definition and a methodology for measuring the uncertainty. We also illustrate these uncertainties with an example random forest (Breiman, 2001) model trained on the Titanic data set (Encyclopedia Titanica, 2019).
A huge variety of a model's characteristics may be used to assess Model Level Uncertainty, for example, performance, interpretability, or robustness (Gosiewska and Biecek, 2018). In this article, we focus on the link between certainty and stability. More flexible models are considered less stable than simpler models. This distinction comes from the fact that complex models require the estimation of thousands of coefficients. Model stability should also be taken into account in the model explanation process. The common technique for assessing model stability is bootstrapping. The novel idea presented in this article is an evaluation of model stability based on differences in local explanations between bootstrap samples. Significant changes in the contributions of variables in local explanations suggest instability of the model.
Explanation Level Uncertainty is related to the fact that explanations are simplifications of a model. The simpler the explanation, the more we lose in fidelity. The main idea behind local explanations is to create an understandable representation of the local behavior of the underlying model. The key issue of local explanations such as SHAP and LIME is that they show additive local representations, while complex models are usually non-additive. Therefore, current methods often turn out to be too imprecise, and we need methods that are more faithful to the underlying model. One possible way of solving this problem is to take into account interactions between features. In this article, we propose a novel algorithm, iBreakDown, that detects local interactions.
The paper is organized as follows. Section 2 describes common local explanations and points out inconsistencies between them. Section 3 contains the methodology for assessing and visualizing model level and explanation level uncertainties. Section 4 provides a description of the iBreakDown algorithm for local variable attributions with interactions. Section 5 contains an example usage of iBreakDown and an intuition about what an interaction in a model is. Conclusions are in Section 6.
2 LOCAL EXPLANATIONS OF MODELS
2.1 RELATED WORK
In this section, we describe the state-of-the-art methods for local model explanations and show the inconsistency in their results on the example of the Titanic data set from Kaggle. This data set provides information on the fate of passengers of the ocean liner "Titanic". We trained a scikit-learn (Pedregosa et al., 2011) random forest model on this data set. Each explanation presented in this section was generated for the same observation (observation number 274). Since some explanation methods are implemented in R, we used the reticulate package (Allaire et al., 2019) to call the Python random forest from R.
One of the most recognized local explanation methods is LIME (Local Interpretable Model-agnostic Explanations) (Ribeiro et al., 2016). The idea of LIME is to fit a locally-weighted interpretable linear model in the neighborhood of a particular observation. Numerical and categorical features are converted into binary vectors as their interpretable representations. Such an interpretable representation may be a binary vector indicating the presence or absence of a word in a text classification task or of a super-pixel in an image classification task. For tabular data and continuous features, quantile-based discretization is performed. A linear model is then fitted on simplified binary variables sampled around the instance of interest. Therefore, the coefficients of this model can be interpreted as variable effects. A graphical presentation of LIME implemented in the Python library lime (https://github.com/marcotcr/lime) is shown in Figure 1.
Figure 1: LIME explanation for an observation from the Titanic data set. The underlying model is a random forest. Blue color indicates the reasons for the passenger's death (Survival = 0), orange indicates reasons for survival (Survival = 1).
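As an illustration, a minimal, self-contained sketch of how such an explanation can be generated with the lime library is given below. It uses synthetic data and a toy survival rule as a stand-in for the Titanic set, so the feature names, the model, and the explained observation are illustrative rather than the exact setup behind Figure 1.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Synthetic stand-in for the Titanic features used in the paper (Age, Sex, Pclass).
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.integers(0, 80, 500),   # Age
    rng.integers(0, 2, 500),    # Sex: 0 = female, 1 = male
    rng.integers(1, 4, 500),    # Pclass: 1, 2, 3
]).astype(float)
y = ((X[:, 1] == 0) | (X[:, 0] < 10)).astype(int)   # toy survival rule
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    feature_names=["Age", "Sex", "Pclass"],
    class_names=["Died", "Survived"],
    discretize_continuous=True,          # continuous features are binned into quantile ranges
)
x_star = np.array([2.0, 1.0, 2.0])       # e.g., a 2-year-old boy in the second class
exp = explainer.explain_instance(x_star, model.predict_proba, num_features=3)
print(exp.as_list())                     # (feature condition, local linear weight) pairs

The printed pairs are the local linear coefficients that LIME plots as the bars in Figure 1.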
There are several modifications of the LIME approach. An improvement aimed at regression problems and tabular data is Local Interpretable Visual Explanations (live) (Staniak and Biecek, 2018). There are two main differences between live and LIME. In live, similar instances around the original observation are generated by perturbing one feature at a time, and the original variables are used as interpretable inputs. Another variant of LIME is localModel (Staniak and Biecek, 2019). In this method, local sampling is based on decision trees and Ceteris Paribus Profiles (Biecek, 2018b).
Categorical variables are dichotomized according to the splits of a decision tree, which models the marginal relationship between the feature and the response. Numerical variables are transformed into binary ones via discretization of the Ceteris Paribus Profile for the observation under consideration. In contrast to other approaches, localModel creates interpretable features on the basis of the model, not only on the basis of the distribution of the underlying data.
SHapley Additive exPlanations (SHAP) (Lundberg and Lee, 2017) are a unification of LIME and several other methods for local explanations, such as DeepLIFT (Shrikumar et al., 2017) and layer-wise relevance propagation (Bach et al., 2015). SHAP is based on Shapley values, a technique from game theory. In this method, the contribution of a variable is calculated as an average of its contributions over all possible orderings of variables. Results of SHAP for the Titanic data set, generated with the Python library shap (https://github.com/slundberg/shap), are presented in Figure 2.
Figure 2: SHAP explanation for an observation from the Titanic data set. The underlying model is a random forest. Blue color indicates features that decrease the probability of survival, red indicates features that increase this probability. Effects of variables and the base value sum up to the output value.
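A hedged sketch of how such values can be obtained with the shap library is shown below; again, the data and model are synthetic stand-ins for the Titanic experiment, and the exact return format of shap_values depends on the installed version of shap.

import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the Titanic features (Age, Sex, Pclass); illustrative only.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.integers(0, 80, 500),   # Age
    rng.integers(0, 2, 500),    # Sex: 0 = female, 1 = male
    rng.integers(1, 4, 500),    # Pclass
]).astype(float)
y = ((X[:, 1] == 0) | (X[:, 0] < 10)).astype(int)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
x_star = np.array([[2.0, 1.0, 2.0]])
shap_values = explainer.shap_values(x_star)
# Depending on the shap version, `shap_values` is a list with one array per class
# or a single array with a trailing class dimension.
print("expected value(s):", explainer.expected_value)
print("attributions:", shap_values)

The attributions plus the expected value (base value) reconstruct the model output, which is the additivity property visualized in Figure 2.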
Another method for local explanations is Break Down (Staniak and Biecek, 2018). The main idea of Break Down is to generate order-specific explanations of features' contributions. It is important to consider the ordering for two reasons.
• For non-additive models, the order of features in the explanation matters. The interpretation of the model's reasoning depends on the order in which the explanation is read. An example of different interpretations is presented in Section 5.
• Setting a proper order helps to increase the understanding of a prediction. Human perception usually associates the prediction with only a few variables. Therefore, it is important to highlight only the most important features and place insignificant variables at the end of the explanation.
In this method, contributions of variables are calculated in a sequential manner. The effect of each consecutive variable is the change in the expected model prediction when that variable is additionally fixed. Two Break Down explanations for the same observation from the Titanic data set are presented in Figure 3. Contributions of variables and their visualizations are generated with the breakDown R package (https://github.com/pbiecek/breakDown). Contributions of variables differ between scenarios because each scenario relies on a different order of variables. For an additive model, the order of variables should be irrelevant: regardless of the order, the contribution values should be equal in each scenario. Changes in values suggest that the model is non-additive, and thus that there is an interaction between variables.
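This order dependence is easy to reproduce with a generic sketch of sequential conditioning (not the breakDown package itself), in which the conditional expectations are estimated by fixing consecutive features of the explained observation in a copy of the data. The toy model below is a pure interaction, so the same feature receives a different contribution depending on its position in the order; all names are illustrative.

import numpy as np

def break_down_contributions(predict_fn, X, x_star, order):
    # Sequential (order-specific) contributions: fix features of x_star one by one
    # and track changes in the average model prediction.
    X_tmp = X.copy()
    baseline = predict_fn(X_tmp).mean()
    contributions, previous = {}, baseline
    for i in order:
        X_tmp[:, i] = x_star[i]                 # condition on x_i = x*_i
        current = predict_fn(X_tmp).mean()
        contributions[i] = current - previous
        previous = current
    return baseline, contributions              # contributions sum to f(x*) - baseline

# A toy non-additive "model": the prediction is the product of two features.
predict_fn = lambda X: X[:, 0] * X[:, 1]
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
x_star = np.array([2.0, 3.0])

print(break_down_contributions(predict_fn, X, x_star, order=[0, 1]))
print(break_down_contributions(predict_fn, X, x_star, order=[1, 0]))

For this product model, the feature fixed first receives a contribution close to zero and the feature fixed second receives almost the whole effect, which is exactly the kind of order dependence visible in Figure 3.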
There are also other approaches to local explainability that are based on rules, for example, Anchors (High-Precision Model-Agnostic Explanations) (Ribeiro et al., 2018). Anchors are rules that describe subspaces of model features where the model prediction is (almost) the same. Lakkaraju et al. (2016) introduced interpretable decision sets of if-then rules. However, there is a trade-off between the simplicity of an explanation and its fidelity. Covering a complex model with a small set of short rules may be too much of a simplification. On the other hand, many sets with complex rules will no longer be interpretable. Additionally, these two methods do not produce numerical effects of variables, which makes the effects of features incomparable.
2.2 UNCERTAINTY OF LOCAL EXPLANATIONS
The common approaches to local explanations consider the effect of each variable separately. However, when interactions occur in the model, relationships between variables should also be taken into account. Omitting the influence of interactions means that we not only lose a part of the information about variable effects, but also add undesired randomness to the evaluation of these effects.
In Section 2.1, we introduced several approaches to local explanations of models. Explanations in Figures 1, 2, and 3 were generated on the basis of the same model and for the same observation.
[Figure 3 values. Scenario 1: intercept 0.407, Sex = 0: −0.19, Pclass = 2: −0.066, Age = 2: +0.812, prediction 0.964. Scenario 2: intercept 0.407, Sex = 0: −0.19, Age = 2: +0.266, Pclass = 2: +0.48, prediction 0.964.]
Figure 3: Two Break Down explanations for the same observation from the Titanic data set. The underlying model is a random forest. The scenarios differ in the order of variables in the Break Down algorithm. The blue bar indicates the difference between the model's prediction and the intercept. The other bars show contributions of variables. Red color means a negative effect on the survival probability, while green color means a positive effect. The order of variables on the y-axis corresponds to their sequence in the Break Down algorithm.
However, the values of contributions differ between LIME, SHAP, and Break Down. The results of these three methods are summarized in Table 1. The sizes of the effects differ between methods; there are even differences in the judgment of whether an impact is positive or negative. It is not clear which explanation should be considered the most reliable.
Table 1: Effects of features calculated with SHAP, LIME, and two Break Down scenarios.

Method                    Age     Sex   Pclass
LIME                    -0.16    0.41    -0.03
SHAP                     0.41   -0.09     0.24
Break Down, Scenario 1   0.81   -0.19    -0.07
Break Down, Scenario 2   0.27   -0.19     0.48
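Note that within each Break Down scenario the attributions are internally consistent: together with the intercept they sum (up to rounding) to the same prediction of 0.964, even though the individual attributions disagree. Using the values from Figure 3:
$$\text{Scenario 1: } 0.407 - 0.190 - 0.066 + 0.812 \approx 0.964, \qquad \text{Scenario 2: } 0.407 - 0.190 + 0.266 + 0.480 \approx 0.964.$$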
LIME approximates the underlying model with a linear model, while SHAP averages across all possible orderings of variable contributions. Break Down calculates contributions on the basis of a specified order of variables. In Figure 3, we see that the values of contributions differ for different orders. The differences between Break Down scenarios lead to the conclusion that the reason for the inconsistency can be an interaction between variables. A detailed explanation is given in Section 5. Visualizing different variable orders in the Break Down method allowed us to identify the source of the differences between LIME and SHAP explanations and thus better explain the model prediction. However, interactions are not included in any of these three methods, so we should not rely on their results.
For additive models, the results of LIME, SHAP, and Break Down would be similar. The cause of the differences and of the uncertainty of additive explanations can be the interaction of variables. Detecting interactions reduces uncertainty. One approach to capturing interactions may be to analyze different orders of features in the Break Down algorithm. However, comparing many scenarios is highly inefficient: as the number of variables increases, the number of cases to review grows factorially. The solution to this problem is iBreakDown, a local explanation method that captures interactions. We introduce iBreakDown in Section 4.
3 UNCERTAINTY ESTIMATION ON THE BASIS OF EXPLANATIONS
3.1 MODEL LEVEL UNCERTAINTY
One of the sources of model uncertainty is model instability. Below, we describe a novel idea of using explanations to assess model stability and therefore its certainty. We apply the bootstrapping technique to approximate the distribution of variable contributions in a local explanation.
We use one baseline model with fixed parameters and the corresponding explanation for a particular instance. Then, we generate m bootstrap samples of the data used to train the baseline model and fit a new model on each sample. As a result, we obtain m new models. For each model, we generate a Break Down explanation of the prediction for the same instance. Explanations are calculated with respect to the variable order in the explanation generated for the baseline model.
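A minimal, self-contained sketch of this procedure on toy data is given below; the function and variable names are illustrative, and the real study used a random forest classifier on the Titanic data with 100 bootstrap replicates, as in Figure 4.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def sequential_contributions(model, X, x_star, order):
    # Break Down style contributions for a fixed feature order (see Section 2).
    X_tmp, out, prev = X.copy(), [], model.predict(X).mean()
    for i in order:
        X_tmp[:, i] = x_star[i]
        cur = model.predict(X_tmp).mean()
        out.append(cur - prev)
        prev = cur
    return np.array(out)

# Toy data as a stand-in for the Titanic set; the feature order is fixed by the baseline model.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = X[:, 0] * X[:, 1] + X[:, 2] + rng.normal(scale=0.1, size=300)
x_star, order = np.array([1.0, -1.0, 0.5]), [0, 1, 2]

baseline_model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
baseline_expl = sequential_contributions(baseline_model, X, x_star, order)

# Bootstrap the training data, refit, and re-explain with the same (baseline) order.
boot_expls = []
for b in range(100):
    idx = rng.integers(0, len(X), len(X))              # bootstrap sample of row indices
    model_b = RandomForestRegressor(n_estimators=50, random_state=b).fit(X[idx], y[idx])
    boot_expls.append(sequential_contributions(model_b, X, x_star, order))
boot_expls = np.array(boot_expls)

# The spread of contributions across bootstrapped models reflects model level uncertainty.
print("baseline explanation:", baseline_expl.round(3))
print("1st and 3rd quartiles:\n", np.percentile(boot_expls, [25, 75], axis=0).round(3))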
An example summary plot of the bootstrap results is presented in Figure 4. The uncertainty level is reflected in the variation of contribution values. Error bars allow us to assess the variation of explanations between models: the wider the error bars, the less certain the explanations. The position of the baseline model explanation within the error bars allows us to assess how representative it is. What is more, since the SHAP method is an average across multiple Break Down scenarios, this plot also shows the uncertainty of a SHAP explanation.
[Figure 4 values: intercept 0.407; Age = 2: +0.25, Sex = 0: −0.03, Fare = 26: −0.014, Pclass = 2: +0.057, Parch = 1: +0.159, SibSp = 1: +0.05; prediction 0.88.]
Figure 4: The summary of Break Down explanations of one baseline and 100 bootstrapped models. The models are random forests trained on the Titanic data. Break Down bars mark contributions calculated for the baseline random forest. Thin brown error bars represent the range of cumulative contribution values for 100 models trained on bootstrapped data. Thick brown error bars show the first and third quartiles.
In this setup, we account for the stability of the model by allowing randomness in the training data and fixing the order of variables in the Break Down algorithm. Therefore, the whole variability is the result of the instability of the model.
3.2 EXPLANATION LEVEL UNCERTAINTY
When generating an explanation for a model, it is important to know how much we can rely on it. Therefore, the uncertainty of the explanation should also be assessed. We propose a methodology for assessing the uncertainty of Break Down explanations. The idea is to use bootstrapping to generate a sample of different explanations and measure the stability of the contribution values.
In this setup, we have one fixed underlying model and one baseline explanation of this model. The first step is to generate m random samples of variable orders. Next, we generate a Break Down explanation with respect to each sampled variable order. As a result, we obtain m new explanations.
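The corresponding sketch for explanation level uncertainty keeps the model fixed and only resamples the variable order; as before, the data and names are illustrative stand-ins for the Titanic setup, not the authors' implementation.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def contributions_for_order(model, X, x_star, order):
    # Break Down style contributions, stored per feature so that different
    # orders can be compared directly.
    X_tmp, prev = X.copy(), model.predict(X).mean()
    out = np.zeros(X.shape[1])
    for i in order:
        X_tmp[:, i] = x_star[i]
        cur = model.predict(X_tmp).mean()
        out[i] = cur - prev
        prev = cur
    return out

# One fixed model on toy non-additive data, many random variable orders.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = X[:, 0] * X[:, 1] + X[:, 2]
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
x_star = np.array([1.0, -1.0, 0.5])

expls = np.array([contributions_for_order(model, X, x_star, rng.permutation(3))
                  for _ in range(100)])

# A wide spread for a feature means its contribution depends on the order,
# i.e., the feature is involved in an interaction.
print("1st and 3rd quartiles per feature:\n", np.percentile(expls, [25, 75], axis=0).round(3))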
An example summary plot of the bootstrapped explanations is presented in Figure 5. Uncertainty appears as variation of contribution values between explanations. Error bars show the range of contribution values for explanations generated with different variable orders. The widths of the error bars indicate the uncertainty of the variables' contributions: the wider the bar, the less certain the contribution.
[Figure 5 values (feature attribution): Age = 2: 0.25, Fare = 26: −0.02, Parch = 1: 0.16, Pclass = 2: 0.059, Sex = 0: −0.03, SibSp = 1: 0.052.]
Figure 5: The summary of contribution values for Break Down explanations. Explanations are generated for a random forest model fitted on the Titanic data set and for one passenger. Green and red bars correspond to contribution values of the baseline explanation. Thin brown error bars represent the range of contribution values for 100 bootstrapped explanations. Thick brown error bars show the first and third quartiles.
In this setup, we impose randomness on the explanations by forcing different variable orders while the model and the explained instance are fixed. The whole variability is thus the result of the uncertainty of the explanation.
Since Break Down is an additive explanation method, high variability of a contribution, visible as wide error bars, is related to the occurrence of an interaction. In order to reduce the uncertainty of the explanation, the interaction should be taken into account.
4 LOCAL EXPLANATIONS WITH INTERACTIONS
If the uncertainty of model explanations is linked with the presence of interactions, then we have to include interactions in the explanations. This way, we obtain more stable and reliable explanations.
In this section, we introduce a novel methodology for the identification of interactions in instance-level explanations. The algorithm works in a similar spirit to SHAP or Break Down but is not restricted to additive effects. The intuition is the following:
1. Calculate a single-step additive contribution for each feature.
2. Calculate a single-step contribution for every pair of features. Subtract the additive contributions to assess the interaction-specific contribution.
3. Order the interaction effects and additive effects in a list that is used to determine sequential contributions.
This simple intuition may be generalized to higher-order interactions. The algorithm is implemented in the iBreakDown package (https://github.com/ModelOriented/iBreakDown).
The notation and methodology behind single-step contributions are introduced in Section 4.1. Details of the last step are described in Section 4.2.
4.1 SINGLE-STEP CONTRIBUTIONS
Let $f: \mathcal{X} \rightarrow \mathbb{R}$ be the predictive model under consideration and $x^* \in \mathcal{X}$ be an observation to explain. For the sake of simplicity, we consider a univariate model output, as in regression or binary classification, but every step can be easily generalized to multiclass classification or multivariate regression.
For a feature $x_i$ we may define a single-step contribution
$$\Delta_i = \mathrm{score}_i(f, x^*) = E[f(x) \mid x_i = x^*_i] - E[f(x)]. \quad (1)$$
The expected model prediction $E[f(x)]$ is sometimes called the baseline or intercept and may be denoted as $\Delta_\emptyset$. The expected value $E[f(x) \mid x_i = x^*_i]$ corresponds to the average prediction of model $f$ if feature $x_i$ is fixed at coordinate $x^*_i$ of the observation to explain $x^*$. $\Delta_i$ measures a naive single-step local variable importance; it indicates how much the average prediction of model $f$ changes if feature $x_i$ is set to $x^*_i$.
For a pair of variables $x_i$, $x_j$ we introduce a single-step contribution
$$\Delta_{ij} = \mathrm{score}_{i,j}(f, x^*) = E[f(x) \mid x_i = x^*_i, x_j = x^*_j] - E[f(x)], \quad (2)$$
and the corresponding interaction-specific contribution
$$\Delta^I_{ij} = E[f(x) \mid x_i = x^*_i, x_j = x^*_j] - E[f(x) \mid x_i = x^*_i] - E[f(x) \mid x_j = x^*_j] + E[f(x)]. \quad (3)$$
It is equivalent to
$$\Delta^I_{ij} = E[f(x) \mid x_i = x^*_i, x_j = x^*_j] - \mathrm{score}_i(f, x^*) - \mathrm{score}_j(f, x^*) - E[f(x)] = \Delta_{ij} - \Delta_i - \Delta_j. \quad (4)$$
A value of $E[f(x) \mid x_i = x^*_i, x_j = x^*_j]$ is the average model output if features $x_i$ and $x_j$ are fixed at $x^*_i$ and $x^*_j$, respectively. $\Delta^I_{ij}$ is the difference between the collective effect of variables $x_i$ and $x_j$, denoted as $\Delta_{ij}$, and their additive effects $\Delta_i$ and $\Delta_j$. Therefore, $\Delta^I_{ij}$ measures the importance of the local lack of additivity (i.e., interaction) between features $i$ and $j$. For additive models, $\Delta^I_{ij}$ is small for any $i$, $j$.
Calculating $\Delta_i$ for each variable is Step 1; computing $\Delta^I_{ij}$ for each pair of variables is Step 2. Note that the contributions $\Delta_i$ do not sum to the final model prediction. We only use them to determine the order of features in which the instance shall be explained.
We need one more symbol, which corresponds to the added contribution of feature $i$ given the set of features $J$:
$$\Delta_{i|J} = E[f(X) \mid x_{J \cup \{i\}} = x^*_{J \cup \{i\}}] - E[f(X) \mid x_J = x^*_J] = \Delta_{J \cup \{i\}} - \Delta_J, \quad (5)$$
and for pairs of features
$$\Delta_{ij|J} = E[f(X) \mid x_{J \cup \{i,j\}} = x^*_{J \cup \{i,j\}}] - E[f(X) \mid x_J = x^*_J] = \Delta_{J \cup \{i,j\}} - \Delta_J. \quad (6)$$
Once the order of single-step importance is determined based on the $\Delta_i$ and $\Delta^I_{ij}$ scores, the final explanation is the attribution to the sequence of $\Delta_{i|J}$ scores. These contributions sum up to the model prediction, because
$$\Delta_{1,2,\ldots,p} = f(x^*) - E[f(X)].$$
This approach can be generalized to interactions between any number of variables.
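A compact sketch of Steps 1 and 2, i.e., Equations (1)-(4), estimated on a toy non-additive function is given below; it illustrates the formulas only and is not the iBreakDown package code.

import numpy as np

# Toy non-additive model: interaction between x_0 and x_1, x_2 purely additive.
predict_fn = lambda X: X[:, 0] * X[:, 1] + X[:, 2]
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))            # sample used to estimate the expectations
x_star = np.array([2.0, 3.0, 1.0])        # observation to explain

def delta(feats):
    # Delta_J: change of the average prediction after fixing features J at x*.
    X_tmp = X.copy()
    X_tmp[:, feats] = x_star[feats]
    return predict_fn(X_tmp).mean() - predict_fn(X).mean()

d0, d1, d01 = delta([0]), delta([1]), delta([0, 1])
print(round(d0, 2), round(d1, 2))         # single-step contributions, both close to 0
print(round(d01 - d0 - d1, 2))            # Delta^I_01 (Eq. 4), close to 6: strong interaction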
4.2 THE ALGORITHM
In this section, we show how to calculate the scores introduced in Section 4.1 and how to calculate the final feature attributions.
Algorithm 1 is a procedure for the calculation of $\Delta_i$, i.e., single-step contributions for each feature.
Algorithm 2 is a procedure for the calculation of $\Delta_{ij}$ and $\Delta^I_{ij}$, i.e., single-step contributions and interactions for each pair.
Algorithm 3 applies consecutive conditioning to the ordered variables. It consists of setting a path according to the calculated effects and then calculating contributions.
The introduced method takes into account the interactions between variables. A large difference between the sum of consecutive effects of features and the effect of a pair of features indicates an interaction.
Lundberg et al. (2018) present a similar idea of calculating differences between the sum of independent effects of variables and their joint effect to compute SHAP interaction values. However, their approach is based on averaging contributions.
Algorithm 1 Single-step contributions of features
1: Input: X (n × p) - data; f - model; x* - new observation
2: Calculate the average model response
3:   Δ_∅ = mean(f(X))
4: for i in {1, 2, ..., p} do
5:   Calculate the contribution of the i-th feature
6:   avg_yhat = mean(f(X with x_i set to x*_i))
7:   Δ_i = avg_yhat − Δ_∅
8: [Δ_1, ..., Δ_p] contains the contributions of features
Algorithm 2 Single-step contributions of pairs of features
1: Input: X (n × p) - data; f - model; x* - new observation; Δ_i - vector of single-step contributions
2: for i in {1, 2, ..., p} do
3:   for j in {1, 2, ..., p} \ {i} do
4:     Calculate the contribution of the pair (i, j)
5:     avg_yhat = mean(f(X with x_i set to x*_i and x_j set to x*_j))
6:     Δ_ij = avg_yhat − Δ_∅
7:     Δ^I_ij = Δ_ij − Δ_i − Δ_j
8: Δ^I contains the matrix of interaction contributions of pairs of features
Algorithm 3 Sequential explanations
1: Input: X (n × p) - data; f - model; x* - new observation; Δ_i - vector of single-step feature contributions; Δ^I_ij - table of single-step feature interactions
2: Calculate Δ*, which is a sorted union of Δ_i and Δ^I_ij
3: feat - table of features and pairs in the order corresponding to Δ*
4: open = {1, 2, ..., p}
5: for candidates in feat do
6:   if candidates in open then
7:     path = append(path, candidates)
8:     open = setdiff(open, candidates)
9:     yhat = mean(f(X with x_¬open set to x*_¬open))
10:    avg_yhats = append(avg_yhats, yhat)
11: The explanation order is determined by the path vector
12: history = ∅
13: for I in path do
14:   I is a single variable or a pair of variables
15:   attribution_I = Δ_{I | history}
16:   history = history ∪ I
17: Explanations are in the attribution vector
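The three algorithms can be combined into a short, self-contained sketch, shown below; it simplifies tie handling and higher-order terms relative to the iBreakDown package, and the toy model and names are illustrative.

import numpy as np

def conditional_mean(predict_fn, X, x_star, feats):
    # Estimate E[f(x) | x_feats = x*_feats] by overwriting columns of X.
    X_tmp = X.copy()
    X_tmp[:, list(feats)] = x_star[list(feats)]
    return predict_fn(X_tmp).mean()

def ibreakdown_attributions(predict_fn, X, x_star):
    # Simplified sketch of Algorithms 1-3: score single features and pairs,
    # order them, then attribute changes of the conditional mean along the path.
    p = X.shape[1]
    baseline = predict_fn(X).mean()                      # Delta_empty

    delta = {(i,): conditional_mean(predict_fn, X, x_star, (i,)) - baseline
             for i in range(p)}                          # Step 1: Delta_i
    for i in range(p):                                   # Step 2: Delta^I_ij
        for j in range(i + 1, p):
            d_ij = conditional_mean(predict_fn, X, x_star, (i, j)) - baseline
            delta[(i, j)] = d_ij - delta[(i,)] - delta[(j,)]

    # Step 3: sorted union of scores; each feature enters the path only once.
    path, open_features = [], set(range(p))
    for feats in sorted(delta, key=lambda k: -abs(delta[k])):
        if set(feats) <= open_features:
            path.append(feats)
            open_features -= set(feats)

    attributions, prev, X_tmp = {}, baseline, X.copy()
    for feats in path:                                   # consecutive conditioning
        X_tmp[:, list(feats)] = x_star[list(feats)]
        cur = predict_fn(X_tmp).mean()
        attributions[feats] = cur - prev                 # Delta_{I | history}
        prev = cur
    return baseline, attributions                        # attributions sum to f(x*) - baseline

# Toy model with an interaction between features 0 and 1 and an additive feature 2.
predict_fn = lambda X: X[:, 0] * X[:, 1] + X[:, 2]
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
x_star = np.array([2.0, 3.0, 1.0])
print(ibreakdown_attributions(predict_fn, X, x_star))

For this toy product model, the pair (0, 1) enters the path as a single interaction term and receives almost the whole effect, which mirrors the joint Age:Pclass term shown later in Figure 6.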
5 TITANIC EXAMPLE
We broaden the Break Down example for the Titanic data, explain the interaction in the underlying model, and show an iBreakDown explanation.
In our example, the training data set consists of 4 variables:
• Survival - binary variable indicating whether the passenger survived, 1 for survival and 0 for death.
• Age - numerical variable, age in years.
• Sex - binary variable, 0 for female and 1 for male.
• Pclass - categorical variable, ticket class: 1, 2, or 3.
We fit a random forest model to predict whether a passenger survived, then we explain the model's prediction for a 2-year-old boy who travels in the second class. The model predicts survival with a probability of 0.964. We would like to explain this probability and understand which factors drive this prediction.
In Figure 3, we showed two Break Down explanations. Each of them may be interpreted in a different way.
Scenario 1: The passenger is a boy, and this feature alone decreases the chances of survival. He traveled in the second class, which also lowers the survival probability. Yet, he is very young, which makes the odds higher. The reasoning behind such an explanation is that most passengers in the second class are adults, therefore a kid from the second class has high chances of survival.
Scenario 2: The passenger is a boy, and this feature alone decreases the survival probability. However, he is very young, therefore the odds are higher than for adult men. The explanation in the last step says that he traveled in the second class, which makes the odds of survival even higher. The interpretation of this explanation is that most kids are in the third class, and being a child in the second class should increase the chances of survival.
Note that the effect of the second class is negative in the explanation for Scenario 1 but positive in the explanation for Scenario 2. The two interpretations of the above scenarios imply the existence of an interaction between age and ticket class. The algorithm introduced in the previous section finds this interaction. The corresponding explanation is presented in Figure 6.
Scenario 3 (with interactions): The passenger is a boy in the second class, which increases the chance of survival, because the effect of age depends on the passenger class.
[Figure 6 (Scenario 3) values: intercept 0.407, Age:Pclass = 2:2: +0.535, Sex = 0: +0.022, prediction 0.964.]
Figure 6: Explanation of the non-additive random forest model for a 2-year-old boy who travels in the second class. Bars show contributions of the feature Sex and of the interaction between Age and Pclass.
6 DISCUSSION
In this article, we introduced two types of uncertainty: model level uncertainty and explanation level uncertainty. The first is related to data sampling: flexible models carry a risk of overfitting that results in unstable predictions. The second is related to the process of explanation distillation: simple explanations may omit some important parts of model behavior.
We introduced procedures to measure and visualize both types of uncertainty. Results are presented for three popular methods for model explanation: LIME, SHAP, and Break Down. For the same random forest model, each method generates different explanations, sometimes even with opposite signs.
As we showed, some of this uncertainty is linked with the lack of additivity in the model, which cannot be grasped by additive explanations.
To solve this problem, we introduced a new method, iBreakDown, which generates not-only-additive explanations. The theoretical backbone of this algorithm is similar to the SHAP and Break Down methods, yet, in contrast to them, we also consider pairwise interactions. The algorithm and diagnostic plots are implemented and available as open source in the iBreakDown package (https://github.com/ModelOriented/iBreakDown).
Finally, we presented a test case that illustrates differences between methods for instance-level explanations.
6.1 CONCLUSIONS
For tabular data, most of the local explanation methods are additive. Applying them to non-additive models results in an increase in the uncertainty of such explanations. Tools in the area of Interpretable Machine Learning are developed to explain complex black-box models. We cannot assume that such complex models will be additive; we should expect, identify, and handle interactions in these models. One of the possible solutions for handling interactions and explaining the uncertainty linked with feature contributions is the iBreakDown algorithm.
6.2 FUTURE WORK
Well-established statistical methods, like generalized linear models, are equipped with methods for handling model level uncertainty. For example, we can use the asymptotic properties of the estimators of model coefficients. More complex methods need other approaches. In this paper we used the bootstrap, but more research is needed to better understand how to communicate uncertainty that results from potentially biased sampling.
The presented methodology for assessing model level uncertainty is based on differences between explanations for one instance. A comprehensive evaluation of a model requires the analysis of many observations. The approach presented in this article can be extended by a method of sampling observations for the study of stability or by a method for aggregating the explanations.
The presented approach for handling explanation level uncertainty also needs further examination. The inclusion of interactions in the explanation improves its certainty, yet at the same time explanations may become more difficult to understand than additive representations. It is a field for extensive cognitive studies of the visual presentation of explanations.
The iBreakDown method identifies interactions and measures their contributions. However, the main effects of variables and the interaction between them are currently presented as a single value. It would be desirable to separate the main effects from the contribution of an interaction and present deeper visual cues that help to understand the role of an interaction.
7 Acknowledgements
We would like to acknowledge Mateusz Staniak for valuable discussions.
Alicja Gosiewska was financially supported by NCN Opus grant 2017/27/B/ST6/01307.
References
J. Allaire, K. Ushey, and Y. Tang. reticulate: Interface to 'Python', 2019. URL https://CRAN.R-project.org/package=reticulate. R package version 1.11.
D. Alvarez-Melis and T. S. Jaakkola. On the Robustness of Interpretability Methods. 2018. URL http://arxiv.org/abs/1806.08049.
S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek. On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation. PLOS ONE, 10(7):1–46, 07 2015. URL https://doi.org/10.1371/journal.pone.0130140.
P. Biecek. DALEX: Explainers for Complex Predictive Models in R. Journal of Machine Learning Research, 19(84):1–5, 2018a. URL http://jmlr.org/papers/v19/18-416.html.
P. Biecek. Ceteris Paribus Profiles, 2018b. URL https://pbiecek.github.io/ceterisParibus/. R package version 0.3.1.
L. Breiman. Random Forests. Machine Learning, 45(1):5–32, Oct. 2001. URL https://www.doi.org/10.1023/A:1010933404324.
L. Edwards and M. Veale. Enslaving the Algorithm: From a "Right to an Explanation" to a "Right to Better Decisions"? IEEE Security and Privacy, 16(3):46–54, 2018. URL https://doi.org/10.1109/msp.2018.2701152.
Encyclopedia Titanica. Titanic Facts, History and Biography, 2019. URL https://www.encyclopedia-titanica.org/.
N. Gill and P. Hall. An Introduction to Machine Learning Interpretability. 2018. URL https://www.oreilly.com/library/view/an-introduction-to/9781492033158/.
A. Gosiewska and P. Biecek. auditor: an R Package for Model-Agnostic Visual Validation and Diagnostic. 2018. URL https://arxiv.org/abs/1809.07763.
R. Guidotti and S. Ruggieri. Assessing the Stability of Interpretable Models. 2018. URL http://arxiv.org/abs/1810.09352.
H. Lakkaraju, S. H. Bach, and J. Leskovec. Interpretable Decision Sets: A Joint Framework for Description and Prediction. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1675–1684, 2016. URL http://doi.acm.org/10.1145/2939672.2939874.
S. M. Lundberg and S.-I. Lee. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30, pages 4765–4774. 2017. URL https://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.
S. M. Lundberg, G. G. Erion, and S. Lee. Consistent Individualized Feature Attribution for Tree Ensembles. 2018. URL http://arxiv.org/abs/1802.03888.
C. Molnar. Interpretable Machine Learning. 2019. https://christophm.github.io/interpretable-ml-book/.
C. O'Neil. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. 2016.
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
M. T. Ribeiro, S. Singh, and C. Guestrin. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144, 2016. URL https://www.kdd.org/kdd2016/papers/files/rfp0573-ribeiroA.pdf.
M. T. Ribeiro, S. Singh, and C. Guestrin. Anchors: High-Precision Model-Agnostic Explanations. In AAAI Conference on Artificial Intelligence (AAAI), 2018. URL https://aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16982.
A. Shrikumar, P. Greenside, and A. Kundaje. Learning Important Features Through Propagating Activation Differences. 2017. URL http://arxiv.org/abs/1704.02685.
K. Simonyan, A. Vedaldi, and A. Zisserman. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. 2013. URL http://arxiv.org/abs/1312.6034.
M. Staniak and P. Biecek. Explanations of Model Predictions with live and breakDown Packages. The R Journal, 10(2):395–409, 2018. URL https://doi.org/10.32614/RJ-2018-072.
M. Staniak and P. Biecek. LIME-based Explanations With Interpretable Inputs Based on Ceteris Paribus Profiles, 2019. URL https://modeloriented.github.io/localModel/. R package version 0.3.10.
S. Wachter, B. D. Mittelstadt, and C. Russell. Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR. 2017. URL http://arxiv.org/abs/1711.00399.
C. Yeh, C. Hsieh, A. S. Suggala, D. I. Inouye, and P. Ravikumar. How Sensitive are Sensitivity-Based Explanations? 2019. URL http://arxiv.org/abs/1901.09392.