(Paper accepted at IJCNN 2024)
Viewing the process of generating counterfactuals
as a source of knowledge: a new approach for
explaining classifiers
Vincent Lemaire
Orange Innovation, Lannion, France
Email: vincent.lemaire@orange.com
Victor Guyomard
Orange Innovation, Lannion, France
Email: victor.guyomard@orange.com
Nathan Le Boudec
Orange Innovation, Lannion, France
Email: nathan.leboudec@orange.com
Françoise Fessant
Orange Innovation, Lannion, France
Email: francoise.fessant@orange.com
Abstract—There are now many explainable AI methods for
understanding the decisions of a machine learning model. Among
these are those based on counterfactual reasoning, which involve
simulating feature changes and observing the impact on the
prediction. This article proposes to view this simulation process
as a source of knowledge that can be stored and used, later, in
different ways. This process is illustrated in the case of additive
models and, more specifically, in the
case of the naive Bayes classifier, whose interesting properties
for this purpose are shown.
I. INTRODUCTION
Machine learning, one of the branches of artificial intelli-
gence, has enjoyed many successes in recent years. The decisions
made by these models are increasingly accurate, but the models
themselves are also increasingly complex. Some of these models
behave like black boxes: their decisions are difficult, if not
impossible, to explain [1]. This lack of explainability can lead
to a number of undesirable consequences: lack of user confidence,
reduced usability of the models, presence of biases, etc. These
concerns have given rise to the field of XAI (eXplainable AI).
XAI [2], [3] is a branch of artificial intelligence that aims to
make the decisions made by machine learning models intelligible
and understandable to users.
Among XAI methods, some of them are based on counter-
factual reasoning. Counterfactual reasoning is a concept from
psychology and social sciences [4], which involves examining
possible alternatives to past events [5]. Humans often reason
counterfactually by imagining what would have happened if an
event had not occurred. When applied to artificial intelligence,
the question becomes, for example, "Why did the model make this
decision instead of another?" (counterfactual explanation) or
"How would the decision have differed if a certain condition
had been changed?". This reasoning can take the form of a
counterfactual or semi-factual explanation.
A counterfactual explanation might be “If your income
had been $10000 higher, then your credit would have been
accepted" [6], [7], [8]. A semi-factual is a special case of the
counterfactual in that it conveys possibilities that "counteract"
what actually happened, even if the outcome does not change
[9]: “Even if your income had been $5000 higher your credit
would still be denied” (but closer to being accepted), see
Figure 1.
Fig. 1. Illustration of a counterfactual and a semi-factual. The red dots
represent initial examples (X). The orange dot represents a semi-factual, the
purple dot represents a counterfactual and the white line represents the
decision boundary between the red and green classes.
Within the framework of counterfactual reasoning, this
article proposes to consider feature changes and their impacts
on prediction as a source of knowledge that can be stored and
exploited in various ways. This process is illustrated in the case
of additive models and in particular in the case of the naive
Bayes classifier, whose interesting properties for this purpose
are shown.
The rest of this paper is organized as follows: Section II
explains (i) how to build a knowledge base from a classifier by
using counterfactual reasoning, (ii) how to derive explanations
from this knowledge base and (iii) what other additional
knowledge can be exploited. Section III presents a concrete
implementation for building the knowledge base in the case of
the naive Bayes classifier. Section IV illustrates, by means of a
churn (unsubscribing) problem, (i) how clustering applied to this
database generates new knowledge and (ii) how the database can be
used to generate trajectories that transform an initial example
into a counterfactual or a semi-factual. Finally, the last
section concludes the paper.
TABLE I
ILLUSTRATION OF THE KNOWLEDGE BASE, HERE IN THE CASE OF TWO VARIABLES AND TWO EXAMPLES.
Variable X_1 is discretised into two intervals (I1, I2) and X_2 into three intervals (I1, I2, I3);
each cell stores the change in predicted probability when the corresponding variable of the example is moved to that interval.

| Example | X1: I1 | X1: I2 | X2: I1 | X2: I2 | X2: I3 |
| X^1 | Δ(X^1_{1,*→1}) | Δ(X^1_{1,*→2}) | Δ(X^1_{2,*→1}) | Δ(X^1_{2,*→2}) | Δ(X^1_{2,*→3}) |
| X^2 | Δ(X^2_{1,*→1}) | Δ(X^2_{1,*→2}) | Δ(X^2_{2,*→1}) | Δ(X^2_{2,*→2}) | Δ(X^2_{2,*→3}) |
II. CREATION AND USE OF A KNOWLEDGE BASE
In this section, we introduce the notations needed to un-
derstand the contribution. We then explain how to build a
knowledge base from a classifier using counterfactual reason-
ing. Finally, we present the different types of explanations that
can be derived from the knowledge base.
A. Background

Let f be a predictive model trained on N examples, each described
by a set of d explanatory variables (a vector X = {X_1, ..., X_d}
drawn from the input distribution), so as to predict a categorical
target variable Y taking values in {y_1, ..., y_K}.

We assume that the dataset taken as input by the classifier is
discretised, meaning that each variable X_i falls into an interval
of values I_q, where q denotes the q-th interval of values.
B. Creation of the knowledge base

Our goal is to build a table that stores each variable change
and its effect on the prediction. As the dataset is discretised,
each variable change is represented by membership in a new
interval of values. To measure the difference in prediction, we
use the predicted probability returned by the classifier. Table I
gives an example of such a table in the case of two variables X_1
and X_2, discretised (or grouped) respectively into 2 and 3
intervals (groups) of values (I).

For each individual l, on each row of the table, we store the
differences in predicted probability, where Δ(X^l_{i,*→m}, X^l) is
the difference in the predicted probability of class 1 when the
value of variable i goes from its initial value to the value of
interval (group) m (simplified to Δ(X^l_{i,*→m}) in the table).
To exploit this knowledge base effectively, the classifier
used must have an additivity property with regard to pre-
diction. This means that individual differences in predicted
probabilities can be added together to obtain the overall
difference in predicted probability. More formally, suppose X_i
and X_j are two variables and m, p are two new interval values
for these variables, respectively.

Then a classifier is considered additive if, for each individual l:

\Delta(X^l_{i,*\to m}, X^l) + \Delta(X^l_{j,*\to p}, X^l) = \Delta(X^l_{(i,*\to m)(j,*\to p)}, X^l)

where Δ(X^l_{(i,*→m)(j,*→p)}, X^l) is the difference in predicted
probability of class 1 when both variables i and j are changed.
We will show in section III that this property is valid for the
naive Bayes classifier. However, this choice is not restrictive,
since any other classifier with this property can be used.
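To make the construction of this knowledge base concrete, the sketch below builds the Δ table for any probabilistic classifier exposed through a predict_proba-style callable. The function name build_knowledge_base, the encoding of discretised values as integer interval indices and the dictionary layout are assumptions made for this illustration, not part of the method as stated above.

```python
import numpy as np

def build_knowledge_base(predict_proba, X, n_intervals):
    """Store, for every example and every (variable, target interval) pair,
    the change in predicted probability of class 1 when that single variable
    is moved from its factual interval to the target interval.

    predict_proba : callable mapping an (n, d) array of interval indices to
                    an (n,) array of P(class 1 | x).
    X             : (n, d) array of discretised examples (interval indices).
    n_intervals   : list giving the number of intervals of each variable.
    """
    n, d = X.shape
    p_ref = predict_proba(X)                  # factual predicted probabilities
    table = {}                                # (variable, interval) -> vector of deltas
    for i in range(d):
        for m in range(n_intervals[i]):
            X_mod = X.copy()
            X_mod[:, i] = m                   # move variable i to interval m for everyone
            table[(i, m)] = predict_proba(X_mod) - p_ref
    return table
```

Cells where m is the factual interval of variable i are zero by construction and need not be stored.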
C. Deriving explanations from the knowledge base
Once this knowledge base has been built up, different types
of explanations can be derived, such as counterfactual or semi-
factual explanations. In addition, we introduce two new types
of explanation, namely trajectories and profile clustering.
D. Counterfactual explanation
In the context of explainable AI, a counterfactual explana-
tion is defined as the smallest change in feature values that
changes a model’s prediction towards a given outcome. An
example of such a counterfactual explanation might be "if your
monthly income had been $500 higher, your credit would have been
accepted". The new example, obtained by increasing the income
feature by $500, is called the "counterfactual".
Many methods have been proposed to generate counterfactu-
als, focusing on specific properties such as realism [10], [11],
[12], actionability [13], [14], sparsity [15], [16], or robustness
[17].
1) Generating sparse counterfactuals: The sparsity property
requires counterfactuals with the smallest possible number of
modified variables. Sparse explanations are interesting because
they are more comprehensible for humans [18] and also more
actionable, i.e. the change proposed by the counterfactual is
easier to implement. Such explanations can easily be obtained by
exploiting our knowledge base. For a given individual X, we
simply read off the largest Δ value, then, thanks to the
additivity property, the largest Δ value for a second variable,
and so on. At each step we check whether the modified example
crosses the decision threshold (f(X') > 0.5). If it does, the
counterfactual has been found¹.

¹We could, on the other hand, maximize the number of changes, but this
is often of little practical interest.
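A minimal sketch of this greedy reading of the table is given below; it reuses the hypothetical table layout of the previous snippet. Thanks to additivity, the predicted probability of the modified example can be tracked by summing Δ values instead of re-querying the classifier.

```python
def sparse_counterfactual(table, X, l, p_init, threshold=0.5):
    """Greedy sparse counterfactual search for example l.

    table  : dict (variable, interval) -> array of deltas (one per example).
    X      : (n, d) array of discretised examples.
    p_init : initial predicted probability of class 1 for example l.
    Returns the list of changes [(variable, new_interval), ...] or None.
    """
    d = X.shape[1]
    best = []
    for i in range(d):
        # best achievable delta for variable i (one change per variable at most)
        cands = [(delta[l], i, m) for (j, m), delta in table.items() if j == i]
        best.append(max(cands))
    best.sort(reverse=True)                   # most helpful variables first

    p, changes = p_init, []
    for delta, i, m in best:
        if delta <= 0:
            break                             # no further improvement possible
        p += delta                            # additivity: deltas accumulate
        changes.append((i, m))
        if p > threshold:
            return changes                    # counterfactual found
    return None                               # no sparse counterfactual reachable
```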
2) Taking into account other properties: In other cases, the
objective may be to find the closest counterfactual, but under
'business constraints' defined by the user. For example, the
search for counterfactuals could be restricted to making changes
only in adjacent intervals (e.g. intervals of close values).
Given Table I, we would be allowed to move people from interval 1
to 2 for the second variable, but not from 1 to 3. The user can
also constrain the search by requiring that one of the variables
must always be changed to a certain value, and so on. This type
of constraint can easily be taken into account and integrated
into a counterfactual search algorithm using the proposed
knowledge table.
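As a hedged illustration, such a 'business constraint' can be expressed as a simple filter on the candidate moves before running the greedy search of the previous sketch; the helper below, with its assumed table layout, only keeps moves to intervals adjacent to the factual one.

```python
def restrict_to_adjacent(table, X, l):
    """Keep only the candidate moves of example l that land in an interval
    adjacent to its factual interval (one possible 'business constraint')."""
    allowed = {}
    for (i, m), delta in table.items():
        factual = X[l, i]                 # factual interval index of variable i
        if abs(m - factual) == 1:         # only moves to a neighbouring interval
            allowed[(i, m)] = delta
    return allowed
```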
The literature on counterfactuals sets out some interesting
properties on the subject, such as (i) the notion of minimality:
having a counterfactual that differs as little as possible from
the original example; (ii) realism: the generated counterfactual
must not contain changes that do not make sense from the
point of view of the data (e.g. a decrease in the “age” of an
individual), also known as plausibility [19]; (iii) generation
of counterfactuals that are similar to real examples or in
dense regions of the class of interest, in order to have robust
counterfactuals [17].
It should be noted that a large proportion of these properties
can easily be accommodated by exploiting the knowledge base in a
different way, since the user can select the variables on which
they wish to intervene, according to the criterion of their choice.
E. Additional usable knowledge
1) Preventive and reactive actions: So far, we have mainly talked
about creating counterfactuals to explain the model's decisions
(as mentioned in the introduction to this article), but they can
also be used to take reactive actions. For example, if a bank
customer is predicted to "leave" (churn), the counterfactual
example indicates one or more actions to be taken to try to
retain them: these actions are known as "reactive" actions.
Conversely, the study of counterfactual trajectories a poste-
riori is of great interest, as it also makes it possible to identify
when a trajectory is approaching the frontier (see Figure 2).
Fig. 2. Illustration of two counterfactuals: one achieved in 1 step, the second
in 3 steps
In such situations, reactive measures can be taken to reverse the
trend and avoid undesirable outcomes (for example by observing
the second step of the trajectory at the bottom of Figure 2).
This approach is particularly relevant when it comes to
predicting churn, for example, as it enables us to identify
customers who are "starting" to churn. By being proactive, it is
possible to implement targeted strategies to retain these
customers and bring them back to a quality service.

Finally, our knowledge base can also be used to carry out
"preventive" actions. Going back to Figure 1, we can try to
create a (negative) semi-factual which moves away from the
decision frontier (see Figure 3): "The customer is not expected
to leave, but he is nevertheless close to the decision frontier".
In this case, all we need to do is look at the negative values of
Δ and move away from them according to the user's wishes. For
example, we could easily identify all the people who are one step
away from crossing the decision boundary and act accordingly.
Fig. 3. Illustration of a counterfactual (top) and a "negative" semi-factual
(bottom) which moves away from the decision frontier.
2) Profile creation: The last use of the knowledge base that we
describe here² is to carry out an exploratory analysis using a
clustering technique.

Using the knowledge base, it is possible to group individuals
according to the impact of each possible change on the
prediction. Analysis of the clusters created can be a source of
learning. This is illustrated in the next section.

²The reader can imagine others: descriptive statistics of the table, the
number of individuals at 1, 2, 3, ... steps from the decision frontier,
visualization of the trajectories, etc.
Note: all the uses of the knowledge base presented in this paper
can be on-demand, local, global or in-between, so that both
counterfactual and semi-factual explanations can be generated,
allowing a wide range of uses.
III. COMPUTING Δ IN THE CASE OF THE NAIVE BAYES CLASSIFIER
A. Reminders on the naive Bayes classifier
The naive Bayes classifier (NB) is a widely-used tool in
supervised classification problems. It has the advantage of
being efficient for many real data sets [20]. However, the
naive assumption of conditional independence of the variables
can, in some cases, degrade the classifier’s performance.
This is why variable selection methods have been developed
[21]. They mainly consist of variable addition and deletion
heuristics to select the best subset of variables maximizing
a classifier performance criterion, using a wrapper-type ap-
proach [22]. It has been shown in [23] that averaging a large
number of selective naive Bayes classifiers, performed with
different subsets of variables, amounts to considering only
one model with a weighting on the variables. Bayes’ formula
under the assumption of independence of the input variables
conditionally to the class variable becomes:
P(C_k|X) = \frac{P(C_k) \prod_i P(X_i|C_k)^{W_i}}{\sum_{j=1}^{K} P(C_j) \prod_i P(X_i|C_j)^{W_i}}     (1)

where W_i ∈ [0, 1] is the weight of variable i. The predicted
class is the one that maximizes the conditional probability
P(C_k|X). The probabilities P(X_i|C_k) can be estimated by
intervals using discretization for numerical variables; a
Gaussian naive Bayes could also be considered. For categorical
variables, this estimation can be done directly if the variable
takes few distinct modalities, or after grouping (of values)
otherwise.
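For concreteness, a direct transcription of Equation 1 is sketched below. It assumes the class priors, the per-interval conditional probabilities and the variable weights have already been estimated (for instance by the Khiops library used later in the paper); the data structures and function name are assumptions for this example, and the computation is done in log space for numerical stability.

```python
import numpy as np

def weighted_nb_posterior(priors, cond_probs, weights, x):
    """Posterior P(C_k | x) of a weighted naive Bayes classifier (Equation 1).

    priors     : (K,) array of class priors P(C_k).
    cond_probs : list of d arrays of shape (n_intervals_i, K) with P(X_i | C_k).
    weights    : (d,) array of variable weights W_i in [0, 1].
    x          : (d,) array of interval indices for one example.
    """
    log_score = np.log(priors).astype(float)                   # log P(C_k)
    for i, xi in enumerate(x):
        log_score += weights[i] * np.log(cond_probs[i][xi])    # + W_i log P(X_i | C_k)
    log_score -= log_score.max()                               # stabilise before exponentiation
    score = np.exp(log_score)
    return score / score.sum()                                 # normalised posterior over the K classes
```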
B. Criteria to be optimized in the search for counterfactuals

Let X be an example and let C_1 and C_2 be the two classes. The
search for a counterfactual consists in increasing the
probability of belonging to the target class C_1 when X is
initially predicted by the model to belong to C_2 (and vice
versa). To do this, one could develop a greedy algorithm, which
is expensive in terms of computation time and does not
necessarily have the additivity property described above. We
propose instead to pose the problem differently, rewriting
Equation 1 and looking at how to increase the probability of
belonging to a particular class of interest. To achieve this
goal, and to maximize P(C_j|X') with respect to the initial value
of P(C_j|X), we exploit the following proposition:

If X and X' are two elements of the input space, we show that,
for a two-class classification problem, searching for
counterfactuals of X amounts to examining the evolution of the
value of Δ(X, X') when we change some of the values of X to
obtain X', such that:

\Delta(X, X') = \left( \sum_{i=1}^{d} W_i \,(\log P(X'_i|C_1) - \log P(X'_i|C_2)) \right) - \left( \sum_{i=1}^{d} W_i \,(\log P(X_i|C_1) - \log P(X_i|C_2)) \right)     (2)
Proof: Starting again from Equation 1,

P(C_j|X) = \frac{P(C_j) \prod_{i=1}^{d} P(X_i|C_j)^{W_i}}{\sum_{z} P(C_z) \prod_{i=1}^{d} P(X_i|C_z)^{W_i}}     (3)

and posing

L_j(X) = \log\left( P(C_j) \prod_{i=1}^{d} P(X_i|C_j)^{W_i} \right) = \log P(C_j) + \sum_{i=1}^{d} W_i \log P(X_i|C_j)     (4)

we have:

P(C_j|X) = \frac{e^{L_j(X)}}{\sum_{z} e^{L_z(X)}} = \frac{1}{\sum_{z} e^{L_z(X) - L_j(X)}} = \frac{1}{1 + \sum_{z \neq j} e^{L_z(X) - L_j(X)}}     (5)

and so, in the case of two classes (denoting by \bar{j} the class other than j):

P(C_j|X) = \frac{1}{1 + e^{L_{\bar{j}}(X) - L_j(X)}}     (6)
We can see that to get closer to the class C_j, all we have to do
is reduce (by modifying X) the quantity L_{\bar{j}}(X) - L_j(X),
and thus reduce:

\log P(C_{\bar{j}}) + \sum_{i=1}^{d} W_i \log P(X_i|C_{\bar{j}}) - \log P(C_j) - \sum_{i=1}^{d} W_i \log P(X_i|C_j)     (7)

Since P(C_j) and P(C_{\bar{j}}) are constant, this is equivalent
to decreasing:

\sum_{i=1}^{d} W_i \log P(X_i|C_{\bar{j}}) - \sum_{i=1}^{d} W_i \log P(X_i|C_j)

and therefore to taking an interest in the quantity that measures
this decrease between X and a modified example X':

\Delta(X, X') = \left( \sum_{i=1}^{d} W_i \,(\log P(X'_i|C_j) - \log P(X'_i|C_{\bar{j}})) \right) - \left( \sum_{i=1}^{d} W_i \,(\log P(X_i|C_j) - \log P(X_i|C_{\bar{j}})) \right)     (8)

If Δ is positive then we are getting closer to the decision
frontier (or even crossing it). If Δ is negative then we are
moving away from the decision frontier and therefore away from
the desired objective. The counterfactual search algorithm
becomes straightforward. Simply calculate, for a given example X,
the value of Δ for each explanatory variable and for each value
of this explanatory variable. Then, given these values, iterate
the successive changes in order to obtain a counterfactual
example. These variable-by-variable changes have the property of
being additive.
Indeed, consider four examples X^0, X^1, X^2, X^3 of the input
space: an initial example X^0; the same example where only one
explanatory variable l has been modified, giving X^1; the same
example where only one explanatory variable m has been modified,
giving X^2; and an example X^3 that cumulates the two univariate
modifications l and m, such that:

∃! l such that X^1_l ≠ X^0_l,
∃! m such that X^2_m ≠ X^0_m and m ≠ l,

and

X^3_k = X^1_l if k = l;  X^3_k = X^2_m if k = m;  X^3_k = X^0_k otherwise.

Then it follows directly, from the additivity over all the
variables in Equation 2, that Δ(X^0, X^3) = Δ(X^0, X^1) + Δ(X^0, X^2).
Modifying one variable and then the other is equivalent to
modifying them simultaneously in the calculation of Δ. It should
also be noted that this additivity is demonstrated from
Equation 6, so we can be sure of increasing the normalized value
of the probability of the class of interest, P(C|X), which is a
plus.
Note: the list of Δ values can potentially be very large if the
number of distinct values of the explanatory variables is large.
Nevertheless, it is common for naive Bayes classifiers (except
the Gaussian version) [24], [25] to discretize the numerical
variables and group the modalities of the categorical variables,
in a supervised manner, in order to obtain an estimate of the
conditional densities P(X_i|C), which are then used in the
calculation of P(C|X). This is what has been done in this
article, using the methods of [26] and [27] for numerical and
categorical variables respectively. These supervised
discretization and grouping operations often produce a limited
number of intervals or groups of modalities, which makes it
possible to obtain a reasonable number of Δ values to test.
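The quantity Δ of Equation 8 then reduces to a difference of weighted log-likelihood ratios. The sketch below, under the same assumed data structures as the previous snippet, computes the contribution of a single univariate change; by construction these contributions sum over variables, which is the additivity property used throughout the paper.

```python
import numpy as np

def delta_univariate(cond_probs, weights, x, i, m, target=0, other=1):
    """Delta of Equation 8 for moving variable i of example x from its
    factual interval to interval m, with 'target' the class of interest."""
    def log_odds(interval):
        # W_i (log P(X_i = interval | C_target) - log P(X_i = interval | C_other))
        return weights[i] * (np.log(cond_probs[i][interval][target])
                             - np.log(cond_probs[i][interval][other]))
    return log_odds(m) - log_odds(x[i])   # positive = moving toward the target class

def delta_multivariate(cond_probs, weights, x, changes, target=0, other=1):
    """Additivity: the delta of several simultaneous univariate changes is
    the sum of the univariate deltas (Equation 2)."""
    return sum(delta_univariate(cond_probs, weights, x, i, m, target, other)
               for i, m in changes)
```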
IV. ILLUSTRATION ON AN UNSUBSCRIBE CASE
A. Dataset and classifier used
This section uses the "Telco Customer Churn" dataset, which
describes a fictitious telecommunications company providing home
telephone and internet services to 7043 customers in California.
The aim is to classify people who may or may not leave the
company. Each customer is described by 20 descriptive variables
(3 numerical and 17 categorical) plus the class variable 'churn',
with two modalities (yes/no) and an unbalanced distribution (75%
non-churn). This dataset can be downloaded from Kaggle [28]. We
use 80% of the data for learning and 20% for testing. The naive
Bayes classifier is produced using the Khiops library, which is
available on GitHub [29]; the rest of the computation is
straightforward using Equation 8.
During the learning process, only 10 of the variables were
retained in the model. Below are all the intervals of values or
groups of modalities obtained during the pre-processing step (the
value in brackets gives the weight W_i of the variable in the
model, see Equation 1, with values from 0 to 1; the variables
marked with an asterisk are those discarded before the clustering
stage, see Section IV-B):

1 * - Tenure (W1 = 0.67): [0-0.5], ]0.5-1.5], ]1.5-5.5], ]5.5-17.5], ]17.5-42.5], ]42.5-58.5], ]58.5-71.5], ]71.5-72]
2 - InternetService (W2 = 0.78): [Fiber optic], [DSL], [No]
3 - Contract (W3 = 0.37): [Month-to-month], [Two year], [One year]
4 - PaymentMethod (W4 = 0.29): [Mailed check], [Credit card (automatic), Electronic check, Bank transfer (automatic)]
5 - OnlineSecurity (W5 = 0.15): [No], [Yes], [No internet service]
6 - TotalCharges (W6 = 0.29): [18.8;69.225], ]69.225;91.2], ]91.2;347.9], ]347.9;1182.8], ]1182.8;2065.7], ]2065.7;3086.8], ]3086.8;7859], ]7859;8684.8]
7 * - PaperlessBilling (W7 = 0.40): [Yes], [No]
8 - TechSupport (W8 = 0.04): [No], [Yes], [No internet service]
9 * - SeniorCitizen (W9 = 0.28): [0], [1]
10 * - Dependents (W10 = 0.10): [Yes], [No]

For all 10 variables, there are a total of 36 intervals/groupings
and therefore 26 values of Δ to calculate in our knowledge base.
Indeed, for each individual and each variable, one Δ value is
null: the one corresponding to its factual interval, which
therefore does not need to be calculated.
B. Classifier analysis stage

Before carrying out the clustering stage, it is important to
examine the variables retained during the training of the
classification model. For example, although it may be interesting
to analyze the 'Tenure' variable, it is clearly not an actionable
variable: it is not possible to change a customer's seniority in
order to make him less likely to churn. The same applies to the
'SeniorCitizen' and 'Dependents' variables. We have also removed
the 'PaperlessBilling' variable, which has very little impact on
the clustering results described below. As a result, these 4
variables are not retained during the clustering stage below;
only the informative, influential and actionable variables are
retained³ (see Section II-E1).

³All the variables could have been retained, but the clustering would have
been biased by variables that are uninteresting from the point of view of
creating counterfactual examples.
C. Exploratory analysis using clustering

1) Clustering performed: The clustering performed is standard:
(i) we use the table of Δ values calculated on the test set,
(ii) we learn a k-means [30] with the L2 distance for different
values of k (k ∈ {2, ..., 12}), (iii) and finally we retain the
k-means whose value of k corresponds to the 'elbow point' [31] of
the curve representing the global reconstruction distance versus
the value of k, here k = 4.
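A possible implementation of this exploratory step is sketched below with scikit-learn's KMeans (an assumption; any k-means implementation would do): the rows of the Δ table are clustered for k from 2 to 12 and the inertia curve is inspected for its elbow.

```python
from sklearn.cluster import KMeans

def cluster_delta_table(delta_matrix, k_min=2, k_max=12, random_state=0):
    """Run k-means (L2 distance) on the knowledge base for several k.

    delta_matrix : (n_individuals, n_deltas) array, one column per possible
                   univariate change, one row per individual of the test set.
    Returns the fitted models and their inertias, from which the 'elbow'
    (k = 4 in the paper's experiment) can be chosen by inspection.
    """
    models, inertias = {}, {}
    for k in range(k_min, k_max + 1):
        km = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        km.fit(delta_matrix)
        models[k], inertias[k] = km, km.inertia_
    return models, inertias
```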
2) Results: The resulting clusters are shown in Figure 4.

Fig. 4. Average profile of the individuals in each cluster, represented as
histograms. The names of the values on the abscissa refer to the number of
the variable and the number of the interval (group) described above. For
example, '3I2' refers to the third variable ('Contract') and its second
interval / group of values ('Two year'). The ordinate values are the mean Δ
values of the cluster.

An analysis of these 4 clusters, combined with the predictions of
the classifier, shows that:

Cluster 1 (10% of the global population, containing 2% of
customers predicted to be churners): individuals who can be made
less likely to churn mainly by means of variable 3 ('Contract'),
i.e. by trying to get them to take out an annual contract ('Two
year' or 'One year'); NB - this marketing action is fairly
difficult to carry out.

Cluster 2 (24% of the global population, containing no customer
predicted to be a churner): people who are largely insensitive to
the changes that would make them less likely to churn (mostly
negative mean Δ values). They would not be targeted by a
'reactive' marketing campaign (which is in line with the
classifier's predictions), but rather by a preventive campaign
using the 'Contract' variable or the 'PaymentMethod' variable
(payment by card or direct debit).

Cluster 3 (45% of the global population, containing 47% of
customers predicted to be churners, i.e. almost all of the
individuals predicted to be churners): some similarities with the
individuals of cluster 1 for the 'Contract' variable. On the
other hand, we can see that the 5th ('OnlineSecurity') and 8th
('TechSupport') variables have a 'leverage effect' in reducing
churn: offering a security or support option is very attractive
to these individuals.

Cluster 4 (21% of the population, containing no customer
predicted to be a churner): individuals who are partially
opposite to those in the first cluster, for example for the
'Contract' variable; they should not be offered a 'two-year
contract' in this case.
The analysis of the clusters obtained here is not exhaustive.
Indeed, it is an exploratory analysis where the data scientist
and the business expert will spend the time needed to refine
their joint analyses. However, the analysis carried out here
allows us to identify interesting ‘reactive’ actions to be taken
with individuals in cluster 3 or preventive actions with indi-
viduals in cluster 2.
D. Examples of trajectories

For this Telco problem, a trajectory could be a customer going
from "no churn" to "churn", conversely from "churn" to "no
churn", or staying a non-churner or a churner. By understanding
the trajectories approaching churn (or no churn), it is possible
to take preventive (respectively reactive) action (see Section
II-E1) to slow down the trend towards churn (or to win back the
customer).

The trajectories are presented as tables where the first line
gives the initial profile of the customer (the values of its
input variables). The following lines then present each
univariate change (highlighted cell), one per step of the
trajectory. The number of lines differs from one customer to
another, since some variables may not influence the Δ value and
are therefore not included in the table. The columns of the table
are the 6 variables kept in the analysis (out of the 10 included
in the model, see Section IV-B).
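A trajectory of this kind can be produced by chaining the univariate changes in decreasing order of their Δ values and recording the predicted probability after each step. The sketch below reuses the hypothetical helpers and table layout introduced earlier, applies at most one change per variable, and stops when the decision boundary is crossed or no useful change remains.

```python
def build_trajectory(predict_proba, table, X, l, stop_at=0.5):
    """Chain univariate changes, best Delta first, one variable at most once,
    and log the intermediate probabilities (one table row per step)."""
    x = X[l:l + 1].copy()
    steps = [(None, float(predict_proba(x)[0]))]           # initial profile
    used = set()
    # candidate moves for this individual, sorted by decreasing delta
    moves = sorted(((delta[l], i, m) for (i, m), delta in table.items()),
                   reverse=True)
    for delta, i, m in moves:
        if delta <= 0 or i in used:
            continue                                        # keep only useful, unused variables
        x[0, i] = m
        used.add(i)
        p = float(predict_proba(x)[0])
        steps.append(((i, m), p))                           # one row of the trajectory table
        if p > stop_at:
            break                                           # decision boundary crossed
    return steps
```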
We give here two representative trajectories extracted from
our ‘knowledge base’, one from “no churn” toward “churn”
and one from ”churn” to “no churn”:
Trajectory 1 - "no churn" toward "churn": In this case a customer
moves from a "no churn" situation to another "no churn"
situation, but one where he is closer to the decision border (see
Table II). For this customer, 4 of the 6 variables allow him to
'walk' toward the border. At the end of the trajectory the
customer remains 'loyal' (see the last column, which indicates
the probability of being a churner). The 'customer' in the last
line of this table is therefore the semi-factual of the original
one. Detecting his movement towards the border can lead to
preventive action.
TABLE II
Illustration of a semi-factual (Trajectory 1, "no churn" toward "churn").
Abbreviations used for column values: 'M-t-M' for 'Month-to-month', 'M-c' for 'Mailed check'.
Abbreviations used for column names: InternetSer = InternetService, OnlineSecu = OnlineSecurity, TechSup = TechSupport, PaymentM = PaymentMethod.

| InternetSer | OnlineSecu | TechSup | Contract | PaymentM | TotalCharges | Prob 'Yes' |
| DSL         | Yes        | No      | One year | M-c      | 1889.5       | 0.07 |
| DSL         | Yes        | No      | M-t-M    | M-c      | 1889.5       | 0.14 |
| DSL         | No         | No      | M-t-M    | M-c      | 1889.5       | 0.17 |
| DSL         | No         | No      | M-t-M    | M-c      | 80.275       | 0.28 |
| Fiber optic | No         | No      | M-t-M    | M-c      | 80.275       | 0.48 |

TABLE III
Illustration of a counterfactual (Trajectory 2, "churn" to "no churn").
Abbreviations used for column values: 'M-t-M' for 'Month-to-month', 'B-t' for 'Bank transfer', 'E-c' for 'Electronic check'.
Abbreviations used for column names: InternetSer = InternetService, OnlineSecu = OnlineSecurity, TechSup = TechSupport, PaymentM = PaymentMethod.

| InternetSer | OnlineSecu | TechSup | Contract | PaymentM | TotalCharges | Prob 'Yes' |
| Fiber optic | No         | No      | M-t-M    | E-c      | 80.275       | 0.59 |
| Fiber optic | No         | No      | M-t-M    | B-t      | 80.275       | 0.50 |
| Fiber optic | No         | No      | M-t-M    | B-t      | 5036.3       | 0.33 |
Trajectory 2 - "churn" to "no churn": In this case a customer
moves from a "churn" situation to a "no churn" situation, i.e. he
crosses the decision border (see Table III). For this customer,
changing 2 of the 6 variables is enough to make him a
non-churner. At the end of the trajectory the customer becomes
'loyal' (see the last column, which indicates the probability of
being a churner). One sees that, for him, the variable with the
biggest impact is 'PaymentMethod', while the one with the lowest
impact is 'TotalCharges'. It could be surprising that, for this
customer, a bigger value of TotalCharges results in a lower
probability of churning, but an analysis of this variable
confirms this interaction: indeed, the probability of churning
when TotalCharges = 80.275 (a value belonging to the interval
]69.225;91.2]) is larger than when TotalCharges = 5036.3 (a value
belonging to the interval ]3086.8;7859]) after the preprocessing
used by the classifier (see Section IV-A)⁴. The 'customer' in the
last line of this table is a counterfactual of the original one.
The highlighted values are the pieces of information needed to
carry out reactive actions.

⁴We do not give here all the statistics of the dataset, but the reader may
easily compute them.
Note: a future work will be to check whether the examples in the
rows of Tables II and III, which are semi-factuals or
counterfactuals, belong to the density of the training examples,
i.e. that their P(X) is not outside (or much lower than) the
range of P(X) values of the factual examples. In the case of the
naive Bayes classifier, P(X) can simply be computed as
P(X) = \sum_{j=1}^{K} P(C_j) \prod_i P(X_i|C_j)^{W_i} (the
denominator of Equation 1).
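Under the weighted naive Bayes model, this density check is a by-product of the quantities already estimated. A sketch, with the same assumed data structures as before, is given below; whether this unnormalised P(X) and the quantile threshold used here are an adequate plausibility filter is precisely the open question raised in the note above.

```python
import numpy as np

def unnormalised_px(priors, cond_probs, weights, x):
    """P(X) as the denominator of Equation 1:
    sum_j P(C_j) * prod_i P(X_i | C_j)^W_i."""
    log_terms = np.log(priors).astype(float)                    # log P(C_j)
    for i, xi in enumerate(x):
        log_terms += weights[i] * np.log(cond_probs[i][xi])     # + W_i log P(X_i | C_j)
    return float(np.exp(log_terms).sum())

def is_plausible(px_candidate, px_factuals, quantile=0.05):
    """Flag a counterfactual as plausible if its P(X) is not much lower than
    the P(X) values observed on factual (training) examples."""
    return px_candidate >= np.quantile(px_factuals, quantile)
```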
V. CONCLUSION

In the context of methods for explaining the results of a machine
learning model, this article has proposed to consider the process
of generating counterfactual examples as a source of knowledge
that can be stored and then exploited in different ways. This
process has been illustrated in the case of additive models and
in particular in the case of the naive Bayes classifier, whose
interesting properties for this purpose have been shown. We have
also suggested the quantities that can be stored and the
different ways of exploiting them. Some of the results have been
illustrated on a churn problem, but the approach is equally
exploitable in other application domains, such as the medical
domain. One perspective of this work would be to clarify how it
differs from traditional ontology-based systems and to outline
its unique features and advantages in the context of machine
learning explainability, for example by focusing on the semantic
aspects of the provided knowledge base.
REFERENCES
[1] F. Bodria, F. Giannotti, R. Guidotti, F. Naretto, D. Pedreschi, and S. Rinzivillo, "Benchmarking and survey of explanation methods for black box models," Data Mining and Knowledge Discovery, vol. 37, no. 5, pp. 1719–1778, 2023.
[2] W. Saeed and C. Omlin, "Explainable AI (XAI): A systematic meta-survey of current challenges and future opportunities," Knowledge-Based Systems, vol. 263, p. 110273, 2023.
[3] G. I. Allen, L. Gan, and L. Zheng, "Interpretable machine learning for discovery: Statistical challenges & opportunities," arXiv preprint arXiv:2308.01475, 2023.
[4] T. Miller, "Explanation in artificial intelligence: Insights from the social sciences," Artificial Intelligence, vol. 267, pp. 1–38, 2019.
[5] I. Stepin, J. M. Alonso, A. Catala, and M. Pereira-Fariña, "A survey of contrastive and counterfactual explanation generation methods for explainable artificial intelligence," IEEE Access, vol. 9, pp. 11781–11803, 2021.
[6] S. Wachter, B. Mittelstadt, and C. Russell, "Counterfactual explanations without opening the black box: Automated decisions and the GDPR," Harvard Journal of Law and Technology, vol. 31, no. 2, pp. 841–887, 2018.
[7] V. Lemaire, C. Hue, and O. Bernier, Data Mining in Public and Private Sectors: Organizational and Government Applications. IGI Global, 2010, ch. Correlation Analysis in Classifiers, pp. 204–218.
[8] ——, "Correlation explorations in a classification model," in Workshop Data Mining Case Studies, SIGKDD, 2009, p. 126.
[9] S. Aryal and M. T. Keane, "Even if explanations: Prior work, desiderata & benchmarks for semi-factual XAI," arXiv, 2023. [Online]. Available: https://arxiv.org/abs/2301.11970
[10] M. Pawelczyk, K. Broelemann, and G. Kasneci, "Learning model-agnostic counterfactual explanations for tabular data," in Proceedings of The Web Conference (WWW'20), 2020, pp. 3126–3132.
[11] A. Van Looveren and J. Klaise, "Interpretable counterfactual explanations guided by prototypes," in Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD), 2021, pp. 650–665.
[12] V. Guyomard, F. Fessant, and T. Guyet, "VCNet: A self-explaining model for realistic counterfactual generation," in Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 2022, pp. 437–453.
[13] B. Ustun, A. Spangher, and Y. Liu, "Actionable recourse in linear classification," in Proceedings of the Conference on Fairness, Accountability, and Transparency (FAccT), 2019, pp. 10–19.
[14] R. Poyiadzi, K. Sokol, R. Santos-Rodriguez, T. De Bie, and P. Flach, "FACE: Feasible and actionable counterfactual explanations," in Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 2020, pp. 344–350.
[15] D. Brughmans, P. Leyman, and D. Martens, "NICE: An algorithm for nearest instance counterfactual explanations," arXiv, vol. v2, 2021. [Online]. Available: https://arxiv.org/abs/2104.07411
[16] S. Wachter, B. D. Mittelstadt, and C. Russell, "Counterfactual explanations without opening the black box: Automated decisions and the GDPR," Harvard Journal of Law and Technology, vol. 31, no. 2, pp. 841–887, 2018.
[17] V. Guyomard, F. Fessant, T. Guyet, T. Bouadi, and A. Termier, "Generating robust counterfactual explanations," in European Conference on Machine Learning, 2023.
[18] T. Miller, "Explanation in artificial intelligence: Insights from the social sciences," Artificial Intelligence, vol. 267, pp. 1–38, 2018.
[19] D. Nemirovsky, N. Thiebaut, Y. Xu, and A. Gupta, "CounteRGAN: Generating counterfactuals for real-time recourse and interpretability using residual GANs," in Conference on Uncertainty in Artificial Intelligence, ser. Machine Learning Research. PMLR, 2022, pp. 1488–1497.
[20] D. J. Hand and K. Yu, "Idiot's Bayes - not so stupid after all?" International Statistical Review, vol. 69, no. 3, pp. 385–398, 2001.
[21] P. Langley and S. Sage, "Induction of selective Bayesian classifiers," in Proceedings of the Tenth International Conference on Uncertainty in Artificial Intelligence. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1994, pp. 399–406.
[22] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," J. Mach. Learn. Res., vol. 3, pp. 1157–1182, 2003.
[23] M. Boullé, "Compression-based averaging of selective naive Bayes classifiers," Journal of Machine Learning Research, vol. 8, pp. 1659–1685, 2007.
[24] Y. Yang and G. Webb, "A comparative study of discretization methods for naive-Bayes classifiers," in Proceedings of PKAW, 2002.
[25] ——, "Discretization for naive-Bayes learning: Managing discretization bias and variance," Machine Learning, vol. 74, pp. 39–74, 2009.
[26] M. Boullé, "MODL: A Bayes optimal discretization method for continuous attributes," Machine Learning, vol. 65, no. 1, pp. 131–165, 2006.
[27] ——, "A Bayes optimal approach for partitioning the values of categorical attributes," Journal of Machine Learning Research, vol. 6, pp. 1431–1452, 2005.
[28] Kaggle, "Telco customer churn dataset," 2023, [https://www.kaggle.com/datasets/blastchar/telco-customer-churn], last visited 08/22/2023.
[29] Khiops, "Github Khiops," 2023, [https://github.com/KhiopsML/khiops], last visited 08/22/2023.
[30] J. A. Hartigan and M. A. Wong, "A k-means clustering algorithm," JSTOR: Applied Statistics, vol. 28, no. 1, pp. 100–108, 1979.
[31] R. L. Thorndike, "Who belongs in the family?" Psychometrika, vol. 18, pp. 267–276, 1953. The method can be traced to speculation by Robert L. Thorndike. [Online]. Available: https://api.semanticscholar.org/CorpusID:120467216