All content in this area was uploaded by Gyunam Park on May 31, 2023


Explainable Predictive Decision Mining for Operational Support

Gyunam Park1, Aaron Küsters2, Mara Tews2, Cameron Pitsch2, Jonathan Schneider2, and Wil M. P. van der Aalst1

1Process and Data Science Group (PADS), RWTH Aachen University, Germany

{gnpark,wvdaalst}@pads.rwth-aachen.de

2RWTH Aachen University, Germany

{aaron.kuesters,mara.tews,cameron.pitsch,lennart.schneider}@rwth-aachen.de

Abstract. Several decision points exist in business processes (e.g., whether a purchase order needs a manager's approval or not), and different decisions are made for different process instances based on their characteristics (e.g., a purchase order higher than €500 needs a manager's approval). Decision mining in process mining aims to describe/predict the routing of a process instance at a decision point of the process. By predicting the decision, one can take proactive actions to improve the process. For instance, when a bottleneck is developing in one of the possible decisions, one can predict the decision and bypass the bottleneck. However, despite its huge potential for such operational support, existing techniques for decision mining have focused largely on describing decisions but not on predicting them, deploying decision trees to produce logical expressions to explain the decision. In this work, we aim to enhance the predictive capability of decision mining to enable proactive operational support by deploying more advanced machine learning algorithms. Our proposed approach provides explanations of the predicted decisions using SHAP values to support the elicitation of proactive actions. We have implemented a Web application to support the proposed approach and evaluated the approach using the implementation.

Keywords: Process Mining · Decision Mining · Machine Learning · Operational Support · Proactive Action

1 Introduction

A process model represents the control-ﬂow of business processes, explaining the

routing of process instances. It often contains decision points, e.g., XOR-split gateways

in BPMN. The routing at such decision points depends on the data attributes of

the process instance. For instance, in a loan application process, the assessment of

a loan application depends on the amount of the loan, e.g., if the amount is higher

than €5000, it requires advanced assessment and, otherwise, simple assessment.

Decision mining in process mining aims to discover a decision model that rep-

resents the routing in a decision point of a business process [8]. The discovered

decision model can be used for 1) describing how decisions have been made and 2)

2 Park et al.

predicting how decisions will be made for future process instances. While the focus

has been on the former in the literature, the latter is essential to enable proactive

actions to actually improve business processes [13]. Imagine we have a bottleneck in

advanced assessment due to, e.g., the lack of resources. By predicting the decision

of a future loan application, we can take proactive action (e.g., suggesting to lower

the loan amount to conduct simple assessment), thus facilitating the process.

To enable such operational support, a decision model needs to be both 1)

predictive (i.e., the model needs to provide reliable predictions on undesired decisions)

and 2) descriptive (i.e., domain experts should be able to interpret how the

decision is made to elicit a proactive action). Fig. 1 demonstrates these require-

ments. Fig. 1(a) shows a decision point in a loan application process, and there is

a bottleneck in advanced assessment. Our goal is to accurately predict that a loan

application with the amount of €5500 and interest of 1.5% needs advanced assessment,

which is undesired due to the bottleneck, and recommend actions to avoid the

bottleneck. Fig. 1(b) shows four different scenarios. First, if we predict a desired decision

(i.e., predicting the simple assessment), no action is required since the simple

assessment has no operational issues. Second, if we predict an undesired decision

incorrectly (e.g., incorrectly predicting the advanced assessment), we recommend

an inadequate action. Third, if we predict the undesired decision correctly but no

explanations are provided, no action can be elicited due to the lack of explanations.

Finally, if we predict the undesired decision, and the corresponding explanations

are provided (e.g., the amount/interest of the loan has a positive/negative eﬀect on

the probability of conducting the advanced assessment, respectively), we can come

up with relevant actions (e.g., lowering the amount or increasing the interest rate).

Fig. 1: (a) Decision point in a process model. (b) Diﬀerent Scenarios showing that

decision mining needs to be predictive and descriptive to enable operational support.

Existing work has focused on the descriptive capability of decision models by deploying

highly interpretable machine learning algorithms such as decision trees [8, 11, 15].

However, this limits the predictive capability due to the limitations of decision trees,

such as overfitting and instability (i.e., adding a new data point results in

regeneration of the overall tree) [16]. In this work, we aim to enhance the predictive

capabilities of decision mining, while providing explanations of predicted decisions.

To this end, we estimate the decision model by using machine learning algorithms

Explainable Predictive Decision Mining 3

such as support vector machines, random forests, and neural networks. Next, we

produce explanations of the prediction of the decision model by using SHAP values.

We have implemented the approach as a standalone web application. Using

the implementation, we have evaluated the accuracy of predicted decisions using

real-life event logs. Moreover, we have evaluated the reliability of explanations of

predicted decisions by conducting controlled experiments using simulation models.

This paper is structured as follows. First, we discuss related work on decision

mining and explainability in Sec. 2. Next, we introduce process models and event

logs in Sec. 3. In Sec. 4, we provide our proposed approach. In Sec. 5, we explain

the implementation of a web application based on the approach. Sec. 6 evaluates

the approach based on the implementation using simulated and real-life event logs.

We conclude this paper in Sec. 7.

2 Related Work

Several approaches have been proposed to learn decision models from event logs.

Rozinat et al. [15] suggest a technique based on Petri nets. It discovers a Petri net

from an event log, identiﬁes decision points, and employs classiﬁcation techniques

to determine decision rules. De Leoni et al. [8] extend [15] by dealing with invisible

transitions of a Petri net and non-conforming process instances using alignments.

These methods assume that decision-making is deterministic and all factors af-

fecting decisions exist in event logs. To handle non-determinism and incomplete

information, Mannhardt et al. [11] propose a technique to discover overlapping

decision rules. In [2], a framework is presented to derive decision models using

Decision Model and Notation (DMN) and BPMN. All existing approaches deploy

decision trees due to their interpretability. To the best of our knowledge, no ad-

vanced machine learning algorithms have been deployed to enhance the predictive

capabilities of decision models along with explanations.

Although advanced machine learning approaches provide more accurate pre-

dictions compared to conventional white-box approaches, they lack explainability

due to their black-box nature. Recently, various approaches have been proposed to

explain such black-box models. Gilpin et al. [4] provide a systematic literature sur-

vey to provide an overview of explanation approaches. The explanation approaches

are categorized into global and local methods. First, global explanation approaches

aim to describe the average behavior of a machine learning model by analyzing

the whole data. Such approaches include Partial Dependence Plot (PDP) [6], Ac-

cumulated Local Eﬀects (ALE) Plot [1], and global surrogate models [3]. Next,

local explanation approaches aim to explain individual predictions by individually

examining the instances. Such approaches include Individual Conditional Expec-

tation (ICE) [5], Local Surrogate (LIME) [14], and Shapley Additive Explanations

(SHAP) [10]. In this work, we use SHAP to explain the predictions produced by

decision models due to its solid theoretical foundation in game theory and the

availability of global interpretations by combining local interpretations [10].


3 Preliminaries

Given a set $X$, we denote the set of all multi-sets over $X$ with $\mathcal{B}(X)$. $f{\upharpoonright}_X$ is the function $f$ projected on $X$: $dom(f{\upharpoonright}_X) = dom(f) \cap X$ and $f{\upharpoonright}_X(x) = f(x)$ for $x \in dom(f{\upharpoonright}_X)$.

3.1 Process Models

Decision mining techniques are independent of the formalism representing process

models, e.g., BPMN, YAWL, and UML activity diagrams. In this work, we use

Petri nets as the formalism to model the process.

First, a Petri net is a directed bipartite graph of places and transitions. A

labeled Petri net is a Petri net with the transitions labeled.

Definition 1 (Labeled Petri Net). Let $\mathcal{U}_{act}$ be the universe of activity names. A labeled Petri net is a tuple $N = (P, T, F, l)$ with $P$ the set of places, $T$ the set of transitions, $P \cap T = \emptyset$, $F \subseteq (P \times T) \cup (T \times P)$ the flow relation, and $l \in T \to \mathcal{U}_{act}$ a labeling function.

Fig. 2: An example of Petri nets highlighted with decision points

Fig. 2 shows a Petri net, $N_1 = (P_1, T_1, F_1, l_1)$, where $P_1 = \{p_1, \ldots, p_6\}$, $T_1 = \{t_1, \ldots, t_7\}$, $F_1 = \{(p_1, t_1), (t_1, p_2), \ldots\}$, $l_1(t_1) = \text{Create purchase order}$, $l_1(t_2) = \text{Request standard approval}$, etc.

The state of a Petri net is defined by its marking. A marking $M_N \in \mathcal{B}(P)$ is a multiset of places. For instance, $M_{N_1} = [p_1]$ represents a marking with a token in $p_1$. A transition $tr \in T$ is enabled in marking $M_N$ if each of its input places contains at least one token. An enabled transition may fire by removing one token from each of its input places and producing one token in each of its output places. For instance, $t_1$ is enabled in $M_{N_1}$, and firing it leads to $M'_{N_1} = [p_2]$.
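The enabling and firing rule can be sketched in a few lines of Python. This is only an illustration and not part of the authors' implementation: markings are modeled as `collections.Counter` multisets, the helper names (`preset`, `postset`, `is_enabled`, `fire`) are assumptions, and only the two arcs of $F_1$ that are listed explicitly in the text are used.

```python
from collections import Counter

# The two arcs of F1 that the text lists explicitly: (p1, t1) and (t1, p2).
FLOW = {("p1", "t1"), ("t1", "p2")}


def preset(flow, t):
    """Input places of transition t."""
    return {src for (src, dst) in flow if dst == t}


def postset(flow, t):
    """Output places of transition t."""
    return {dst for (src, dst) in flow if src == t}


def is_enabled(flow, marking, t):
    """t is enabled if every input place holds at least one token."""
    return all(marking[p] >= 1 for p in preset(flow, t))


def fire(flow, marking, t):
    """Consume one token per input place and produce one per output place."""
    if not is_enabled(flow, marking, t):
        raise ValueError(f"{t} is not enabled")
    new_marking = Counter(marking)
    new_marking.subtract(preset(flow, t))
    new_marking.update(postset(flow, t))
    return +new_marking  # unary plus drops places left with zero tokens


m = Counter({"p1": 1})        # marking [p1]
m_next = fire(FLOW, m, "t1")  # marking [p2]
```

A `Counter` is used because a marking is a multiset: a place may hold more than one token, which a plain set could not express.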

Definition 2 (Decision Points). Let $N = (P, T, F, l)$ be a labeled Petri net. For $p \in P$, $p\bullet = \{t \in T \mid (p, t) \in F\}$ denotes its outgoing transitions. $p \in P$ is a decision point if $|p\bullet| > 1$.

For instance, $p_2$ is a decision point in $N_1$ since $p_2\bullet = \{t_2, t_3\}$ and $|p_2\bullet| > 1$.
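As a minimal sketch of this definition, decision points can be found by counting the outgoing transitions of every place. The function names and the small net fragment below are assumptions made for illustration; the fragment contains only the arcs around $p_2$ that the text mentions.

```python
def outgoing_transitions(flow, transitions, place):
    """Transitions t with an arc (place, t) in the flow relation."""
    return {t for t in transitions if (place, t) in flow}


def decision_points(places, transitions, flow):
    """Places with more than one outgoing transition."""
    return {
        p for p in places
        if len(outgoing_transitions(flow, transitions, p)) > 1
    }


# Illustrative fragment of N1 around p2 (not the complete net of Fig. 2).
PLACES = {"p1", "p2"}
TRANSITIONS = {"t1", "t2", "t3"}
FLOW = {("p1", "t1"), ("t1", "p2"), ("p2", "t2"), ("p2", "t3")}

points = decision_points(PLACES, TRANSITIONS, FLOW)
```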

3.2 Event Logs

Definition 3 (Event Logs). Let $\mathcal{U}_{event}$ be the universe of events, $\mathcal{U}_{attr}$ the universe of attribute names ($\{case, act, time, res\} \subseteq \mathcal{U}_{attr}$), and $\mathcal{U}_{val}$ the universe of attribute values. An event log is a tuple $L = (E, \pi)$ with $E \subseteq \mathcal{U}_{event}$ as the set of events and $\pi \in E \to (\mathcal{U}_{attr} \nrightarrow \mathcal{U}_{val})$ as the value assignments of the events.


Table 1: An example of event logs

| case id | activity | timestamp | resource | total-price | vendor |
|---|---|---|---|---|---|
| PO92 | Create Purchase Order | 09:00 05.Oct.2022 | Adams | 1000 | Apple |
| PO92 | Request Standard Order | 11:00 07.Oct.2022 | Pedro | 1000 | Apple |
| PO93 | Create Purchase Order | 13:00 07.Oct.2022 | Peter | 1500 | Samsung |
| ... | ... | ... | ... | ... | ... |

Table 1 shows a part of an event log $L_1 = (E_1, \pi_1)$. $e_1 \in E_1$ represents the event in the first row, i.e., $\pi_1(e_1)(case) = \text{PO92}$, $\pi_1(e_1)(act) = \text{Create Purchase Order}$, $\pi_1(e_1)(time) = \text{09:00 05.Oct.2022}$, $\pi_1(e_1)(res) = \text{Adams}$, $\pi_1(e_1)(\text{total-price}) = 1000$, and $\pi_1(e_1)(vendor) = \text{Apple}$.

4 Explainable Predictive Decision Mining

In this section, we introduce an approach to explainable predictive decision mining.

As shown in Fig. 3, the proposed approach consists of two phases: oﬄine and

online phases. The former aims to derive decision models of decision points, while

the latter aims at predicting decisions for running process instances along with

explanations. In the oﬄine phase, we compute situation tables based on historical

event logs and estimate decision models using the situation tables. In the online

phase, we predict decisions for ongoing process instances and explain the decision.

Fig. 3: An overview of the proposed approach

4.1 Oﬄine Phase

First, we compute situation tables from event logs. Each record in a situation table consists of features (e.g., total price of an order) and a decision in a decision point (e.g., $t_2$ at decision point $p_2$ in Fig. 2), describing how the decision has been historically made (e.g., at decision point $p_2$ in Fig. 2, standard approval (i.e., $t_2$) was performed when the total price of an order was €1000).

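Definition 4 (Situation Table). Let $\mathcal{U}_{feature}$ be the universe of feature names and $\mathcal{U}_{fmap} = \mathcal{U}_{feature} \nrightarrow \mathcal{U}_{val}$ the universe of feature mappings. Let $N = (P, T, F, l)$ be a labeled Petri net and $p \in P$ a decision point. $sit_p \in \mathcal{U}_L \to \mathcal{B}(\mathcal{U}_{fmap} \times p\bullet)$ maps event logs to situation tables (i.e., multi-sets of feature mappings and decisions), where $\mathcal{U}_L$ denotes the universe of event logs. $S_p = \{sit_p(L) \mid L \in \mathcal{U}_L\}$ denotes the set of all possible situation tables of $p$.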


Fig. 4: An example of the proposed approach

The table in Fig. 4(a) represents a situation table of $p_2$ in Fig. 2 derived from the event log depicted in Table 1. For instance, the first row in Fig. 4(a) describes that request standard approval ($t_2$) was executed when human resource Adams performed create purchase order (i.e., res-CPO) for the order of €1000 (i.e., total-price) with Apple (i.e., vendor). Formally, $s_1 = (fmap_1, t_2) \in sit_{p_2}(L_1)$ where $fmap_1 \in \mathcal{U}_{fmap}$ such that $fmap_1 = \{(\text{res-CPO}, \text{Adams}), (\text{vendor}, \text{Apple}), (\text{total-price}, 1000)\}$. Note that $s_1$ corresponds to event $e_2$ in Table 1 and $fmap_1$ is derived from all historical events of PO92.
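The derivation of such a record can be sketched as follows. The paper abstracts from how features are defined, so everything here is an assumption made for illustration: events are plain dictionaries, timestamps are rewritten as sortable ISO strings, the feature rules in `features_from` are hypothetical, and so is the mapping from activity labels to the transitions $t_2$ and $t_3$.

```python
# The three events listed in Table 1 (timestamps rewritten as ISO strings).
EVENT_LOG = [
    {"case": "PO92", "act": "Create Purchase Order", "time": "2022-10-05T09:00",
     "res": "Adams", "total-price": 1000, "vendor": "Apple"},
    {"case": "PO92", "act": "Request Standard Order", "time": "2022-10-07T11:00",
     "res": "Pedro", "total-price": 1000, "vendor": "Apple"},
    {"case": "PO93", "act": "Create Purchase Order", "time": "2022-10-07T13:00",
     "res": "Peter", "total-price": 1500, "vendor": "Samsung"},
]

# Assumed mapping from activity labels to the outgoing transitions of p2.
# Table 1 lists the second activity as "Request Standard Order"; it is
# treated here as the event that corresponds to t2.
DECISION_ACTIVITIES = {
    "Request Standard Order": "t2",
    "Request Manager Approval": "t3",
}


def features_from(history):
    """Hypothetical feature rules: latest vendor and total-price, plus the
    resource that performed Create Purchase Order (res-CPO)."""
    fmap = {}
    for event in history:
        if event["act"] == "Create Purchase Order":
            fmap["res-CPO"] = event["res"]
        fmap["vendor"] = event["vendor"]
        fmap["total-price"] = event["total-price"]
    return fmap


def situation_table(event_log, decision_activities):
    """Return a list (used as a multiset) of (feature mapping, decision)."""
    by_case = {}
    for event in sorted(event_log, key=lambda e: e["time"]):
        by_case.setdefault(event["case"], []).append(event)

    table = []
    for events in by_case.values():
        history = []  # events of the case seen before the current one
        for event in events:
            decision = decision_activities.get(event["act"])
            if decision is not None:
                table.append((features_from(history), decision))
            history.append(event)
    return table


table_p2 = situation_table(EVENT_LOG, DECISION_ACTIVITIES)
```

Case PO93 contributes no record because the excerpt contains no decision event for it yet.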

A decision model provides the likelihood of each transition in a decision point based on a given feature, e.g., when the total price of an order (i.e., feature) is €1800, standard approval will be performed with the likelihood of 0.2 and manager approval with the likelihood of 0.8.

Definition 5 (Decision Model). Let $N = (P, T, F, l)$ be a labeled Petri net and $p \in P$ a decision point. Let $dmap_p \in p\bullet \to [0, 1]$ be a decision mapping that maps decisions to likelihoods such that the likelihoods add up to 1, i.e., $\sum_{t' \in p\bullet} dmap_p(t') = 1$. $D_p$ denotes the set of all possible decision mappings. $DM_p = \mathcal{U}_{fmap} \to D_p$ is the set of all possible decision models of $p$ that map feature mappings to decision mappings.

We estimate decision models based on situation tables.

Definition 6 (Estimating Decision Models). Let $N = (P, T, F, l)$ be a labeled Petri net and $p \in P$ a decision point. $estimate_p \in S_p \to DM_p$ is a function estimating a decision model from a situation table.

The estimation function can be built using many machine learning algorithms

such as neural networks, support vector machines, random forests, etc.
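To make the interface of $estimate_p$ concrete, the sketch below uses a deliberately simple k-nearest-neighbour vote as a stand-in. It is not one of the algorithms named above and not the authors' implementation; it only shows how a situation table is turned into a function that maps a feature mapping to likelihoods summing to 1. The toy situations, the parameter `k`, and the single numeric feature are assumptions.

```python
from collections import Counter


def estimate(situation_table, decisions, k=3, feature="total-price"):
    """Return a decision model: feature mapping -> {decision: likelihood}.

    Stand-in estimator: the likelihood of a decision is its share among
    the k situations whose `feature` value is closest to the query.
    """
    def decision_model(fmap):
        nearest = sorted(
            situation_table,
            key=lambda situation: abs(situation[0][feature] - fmap[feature]),
        )[:k]
        votes = Counter(decision for _, decision in nearest)
        return {d: votes[d] / len(nearest) for d in decisions}

    return decision_model


# Hypothetical situation table for decision point p2 (toy data).
SITUATIONS = [
    ({"total-price": 800}, "t2"),
    ({"total-price": 1000}, "t2"),
    ({"total-price": 1200}, "t2"),
    ({"total-price": 1700}, "t3"),
    ({"total-price": 2000}, "t3"),
    ({"total-price": 2500}, "t3"),
]

dm_p2 = estimate(SITUATIONS, decisions=("t2", "t3"))
dmap = dm_p2({"total-price": 1800})        # decision mapping for one instance
final_decision = max(dmap, key=dmap.get)   # decision with highest likelihood
```

Any classifier that outputs class probabilities could take the place of the nearest-neighbour vote without changing this interface.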

4.2 Online Phase

Using the decision model derived from the oﬄine phase, we predict the decision of a

running process instance and explain the prediction. Using the feature of a running


process instance depicted in Fig. 4(b), a decision model may produce the prediction shown in Fig. 4(c), leading to the final decision of request manager approval that has the highest likelihood. Next, we compute an explanation for the decision (i.e., the effect of each feature on the prediction) as shown in Fig. 4(d), e.g., total-price has a positive effect of 0.6 while vendor has a negative effect of 0.2. In other words, total-price increases the likelihood of predicting the decision of request manager approval by a magnitude of 0.6, whereas vendor decreases it by a magnitude of 0.2.

In this work, we use SHAP values [10] to provide explanations of decisions.

SHAP values are based on Shapley values. The concept of Shapley values comes

from game theory. It deﬁnes two elements: a game and some players. In the context

of predictions, the game is to reproduce the outcome of the model, and the players

are the features used to learn the model. Intuitively, Shapley values quantify the

amount that each player contributes to the game, and SHAP values quantify the

contribution that each feature brings to the prediction made by the model.

Definition 7 (Explaining Decisions). Let $fmap \in \mathcal{U}_{fmap}$ be a feature mapping and $F = \{f_1, f_2, \ldots, f_i, \ldots\} = dom(fmap)$ denote the domain of $fmap$. Let $N = (P, T, F, l)$ be a labeled Petri net, $p \in P$ a decision point, and $dm_p$ a decision model. Let $t \in p\bullet$ be a target transition. The SHAP value of feature $f_i$ for predicting $t$ is defined as:

$$\psi^t_{f_i} = \sum_{F' \subseteq F \setminus \{f_i\}} \frac{|F'|! \, (|F| - |F'| - 1)!}{|F|!} \left( dm_p(fmap{\upharpoonright}_{F' \cup \{f_i\}})(t) - dm_p(fmap{\upharpoonright}_{F'})(t) \right)$$

For $fmap$, $exp_{dm_p,t}(fmap) = \{(f_1, \psi^t_{f_1}), (f_2, \psi^t_{f_2}), \ldots\}$ is the explanation of predicting $t$ using $dm_p$.
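The formula can be evaluated exactly by enumerating all subsets $F'$, which is feasible only for a handful of features. The sketch below does this for a hypothetical decision model `toy_dm` that accepts partial feature mappings; both the model and its numbers are assumptions chosen so that the result can be checked by hand. Practical SHAP implementations typically approximate the effect of missing features with background data instead of evaluating the model on projected mappings.

```python
from itertools import combinations
from math import factorial


def shap_values(dm, fmap, target):
    """Exact SHAP values of all features in fmap for predicting `target`."""
    features = list(fmap)
    n = len(features)
    values = {}
    for fi in features:
        others = [f for f in features if f != fi]
        total = 0.0
        for size in range(len(others) + 1):
            # Weight |F'|! (|F| - |F'| - 1)! / |F|! for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {f: fmap[f] for f in subset}   # fmap projected on F'
                with_fi = {**without, fi: fmap[fi]}      # ... on F' plus f_i
                total += weight * (dm(with_fi)[target] - dm(without)[target])
        values[fi] = total
    return values


def toy_dm(fmap):
    """Hypothetical decision model for p2 that tolerates missing features."""
    score = 0.3                                 # likelihood of t3 without features
    if fmap.get("total-price", 0) > 1500:
        score += 0.5
    if fmap.get("vendor") == "Apple":
        score -= 0.1
    return {"t2": 1 - score, "t3": score}


instance = {"total-price": 1800, "vendor": "Apple"}
explanation = shap_values(toy_dm, instance, target="t3")
```

Because `toy_dm` is additive, each feature's SHAP value equals its own contribution (0.5 for total-price and -0.1 for vendor), and the values sum to the difference between the prediction with all features and the prediction with none.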

As shown in Fig. 4(d), for feature mapping $fmap'$ described in Fig. 4(b), the explanation of predicting $t_3$ (i.e., request manager approval) using decision model $dm'_{p_2}$ is $exp_{dm'_{p_2},t_3}(fmap') = \{(\text{total-price}, 0.6), (\text{vendor}, -0.2)\}$. In other words, total-price has a positive effect with the magnitude of 0.6 on the decision of $t_3$ and vendor has a negative effect with the magnitude of 0.2.

Moreover, we can provide a global explanation of a decision model by aggre-

gating SHAP values of multiple running instances. For instance, by aggregating

all SHAP values of total-price for predicting $t_3$, e.g., with the mean absolute value,

we can compute the global effect of total price on the prediction.
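A minimal sketch of this aggregation, assuming each local explanation is a dictionary from feature names to SHAP values: the first explanation reuses the values of Fig. 4(d), while the second one is hypothetical and the function name `global_effect` is an assumption.

```python
def global_effect(explanations):
    """Mean absolute SHAP value per feature over several local explanations."""
    features = {f for explanation in explanations for f in explanation}
    return {
        f: sum(abs(explanation.get(f, 0.0)) for explanation in explanations)
        / len(explanations)
        for f in features
    }


LOCAL_EXPLANATIONS = [
    {"total-price": 0.6, "vendor": -0.2},   # values from Fig. 4(d)
    {"total-price": -0.4, "vendor": 0.1},   # hypothetical second instance
]

effects = global_effect(LOCAL_EXPLANATIONS)
```

Taking absolute values before averaging prevents positive and negative local effects from cancelling out, so the result ranks features by overall influence rather than by direction.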

5 Implementation

We have implemented a Web application to support explainable decision mining

with a dedicated user interface. Source code and user manuals are available at

https://github.com/aarkue/eXdpn. The application comprises three functional

components as follows.

Discovering Process Models. This component supports the discovery of process

models based on the inductive miner [7]. The input is event data in the standard

XES format. The discovered accepting Petri net is visualized along with its decision points.


Decision Mining. This component supports the computation of situation tables

from event logs and the estimation of decision models from the computed situation

table. First, it computes situation tables with the following three types of features:

– Case features: Case features are on a case-level and used for predicting all decisions related to that case.

– Event features: Event features are specific to an event and used for predicting decisions after the occurrence of the event.

– Performance features: Performance features are derived from the log. They include the elapsed time of a case (i.e., time duration since the case started) and the time since the last event (i.e., time duration since the previous event occurred).
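The two performance features can be computed from the event timestamps of a case, as the following sketch suggests. The function name, the feature keys, and the unit (seconds) are assumptions; the first two timestamps are those of case PO92 in Table 1, and the third one is made up to show that the two features can differ.

```python
from datetime import datetime


def performance_features(timestamps):
    """Compute time-based features at the latest event of a single case.

    `timestamps` holds the event times of the case in chronological order,
    ending with the most recent event.
    """
    first, current = timestamps[0], timestamps[-1]
    previous = timestamps[-2] if len(timestamps) > 1 else current
    return {
        "elapsed_time": (current - first).total_seconds(),
        "time_since_last_event": (current - previous).total_seconds(),
    }


po92 = [
    datetime(2022, 10, 5, 9, 0),    # Create Purchase Order (Table 1)
    datetime(2022, 10, 7, 11, 0),   # second event of PO92 (Table 1)
    datetime(2022, 10, 7, 12, 0),   # hypothetical follow-up event
]

features = performance_features(po92)
```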

Next, the estimation of decision models uses the following machine learning

algorithms: Random Forests, XGBoost, Support Vector Machines (SVMs), and

Neural Networks.

Visualizing Decisions and Explanations. This component visualizes the F1

score of diﬀerent machine learning algorithms and suggests the best technique based

on the score. Moreover, it visualizes the explanation of the decision both at local

and global levels. Local explanations are visualized with force plot (cf. Fig. 5(a)),

decision plot (cf. Fig. 5(b)), and beeswarm plot (cf. Fig. 5(c)), whereas global

explanations are visualized with bar plot (cf. Fig. 5(d)), force plot (cf. Fig. 5(e)), and

beeswarm plot (cf. Fig. 5(f)). We refer readers to [9] for the details of the different plots.

6 Evaluation

In this section, we evaluate the approach by conducting experiments using the

implementation. Speciﬁcally, we are interested in answering the following research

questions.

– RQ1: Does the advanced machine learning algorithm efficiently predict the decisions?

– RQ2: Does the approach provide reliable explanations for the predictions?

6.1 RQ1: Prediction Accuracy

In order to answer RQ1, we conduct experiments using real-life event logs: Business

Process Intelligence Challenge (BPIC) 2012³ and BPIC 2019⁴. For each event log,

we ﬁrst discover a process model and determine decision points. Then we estimate

diﬀerent decision models for each decision point and compare the performance of

the decision models using 5-fold cross-validation. To measure the performance of

the decision model, we use F1 scores. Each model is instantiated with suitable,

event-log-speciﬁc parameters, which have largely been obtained from a parameter

grid search on each decision point as well as manual test runs. For decision tree

algorithms, we apply pruning steps to avoid excessive splits, which would make the

resulting decision trees too complex to interpret in practice.
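The evaluation protocol can be illustrated with a small standard-library sketch of the F1 score and of a 5-fold split. The text does not state how F1 is averaged over classes or how folds are drawn, so the choices below (F1 for a single positive class, unshuffled interleaved folds) and the toy labels are assumptions, not the authors' setup.

```python
def f1_score(y_true, y_pred, positive):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists; fold i tests items i, i + k, ..."""
    for i in range(k):
        test = list(range(i, n, k))
        train = [j for j in range(n) if j % k != i]
        yield train, test


# Toy labels for one decision point with outgoing transitions t2 and t3.
y_true = ["t3", "t3", "t2", "t2", "t3"]
y_pred = ["t3", "t2", "t2", "t3", "t3"]
score = f1_score(y_true, y_pred, positive="t3")

folds = list(k_fold_indices(10, k=5))
```

In a full evaluation, a decision model would be estimated on each `train` part and scored on the matching `test` part, and the scores of the five folds would then be averaged.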

³ doi:10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f

⁴ doi.org/10.4121/uuid:d06aff4b-79f0-45e6-8ec8-e19730c248f1


Fig. 5: Local explanations: (a) Force plot, (b) Decision plot, and (c) Beeswarm plot

explain how the model arrived at the decision of a running instance (i.e., request manager

approval with the likelihood of 0.98). For instance, (a) visualizes the positive (red-colored)

and negative (blue-colored) features with increasing magnitudes. Global explanations:

(d) Bar plot, (e) Beeswarm plot, and (f) Force plot explain how the model arrived at the

decision of all running instances (both on request standard approval and request manager

approval). For instance, (d) visualizes the mean absolute SHAP value for each feature on

predicting request standard approval (blue-colored bar) and request manager approval

(red-colored bar), showing that total-price has the highest impact on both predictions.

Table 2 shows the F1 score of different machine learning algorithms in different real-life event logs⁵. The top two scores for each decision point are highlighted in bold.

⁵ The experimental results are reproducible in https://github.com/aarkue/eXdpn/tree/main/quantitative_analysis along with the corresponding process model.

Table 2: F1 scores of applying different machine learning algorithms in different decision points. The bold font shows the top two results in each decision point. The first six decision points belong to BPI Challenge 2012 (only Offers), the last four to BPI Challenge 2019 (filtered).

| Algorithm | 2012 p4 | 2012 p6 | 2012 p12 | 2012 p14 | 2012 p16 | 2012 p19 | 2019 p3 | 2019 p4 | 2019 p8 | 2019 p11 |
|---|---|---|---|---|---|---|---|---|---|---|
| Decision Tree | 0.6888 | 0.7545 | 0.7955 | 0.9633 | **0.9612** | 0.9263 | 0.9555 | 0.9948 | 0.8135 | **1.0000** |
| XGBoost | **0.7189** | **0.7897** | **0.8004** | 0.9697 | **0.9612** | **0.9407** | **0.9632** | 0.9948 | **0.8293** | **1.0000** |
| Support Vector Machine | 0.7151 | 0.7799 | **0.8023** | **0.9701** | **0.9612** | **0.9414** | **0.9649** | **0.9950** | 0.8096 | 0.9997 |
| Neural Network | **0.725** | **0.8048** | 0.7955 | **0.9698** | 0.9607 | 0.9317 | 0.9583 | **0.9981** | **0.8191** | 0.9949 |

XGBoost is among the top two scores for all decision points except $p_{14}$ in BPIC 2012 and $p_4$ in BPIC 2019. The scores for Support Vector Machine belong to the top two scores for most of the decision points except $p_4$ and $p_6$ in BPIC 2012 and $p_8$ and $p_{11}$ in BPIC 2019, whereas the ones for Neural Network belong to the top two scores in $p_4$, $p_6$, and $p_{14}$ in BPIC 2012 and $p_4$ and $p_8$ in BPIC 2019. Decision Tree shows the top two scores only for $p_{16}$ in BPIC 2012 and $p_{11}$ in BPIC 2019.

6.2 RQ2: Reliability of Explanations

To answer RQ2, we design a simulation model to simulate a Purchase-To-Pay

(P2P) process using CPN tools [12]. The simulation model allows us to fully deﬁne

the decision logic of decision points. Based on the decision logic, we qualitatively

evaluate if the generated explanation is reliable. Fig. 6 shows the Petri net dis-

covered using inductive miner [7] from an event log generated by the simulation

model, with highlighted decision points. Decision point (c) describes the decision

of whether the purchase order is held at customs or not. The decision logic in the

simulation model is as follows: If 1) a purchase order originates from outside the

EU and 2) the base price per item is higher than €50, the order is held at customs.
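Written as code, this decision logic is a single conjunction. The sketch below is a Python paraphrase of the rule, not the CPN Tools model itself; the function and parameter names are assumptions.

```python
def held_at_customs(origin_non_eu: bool, base_price_per_item: float) -> bool:
    """Ground-truth rule of decision point (c) in the simulated P2P process.

    An order is held at customs only if it originates from outside the EU
    and its base price per item exceeds 50 euros.
    """
    return origin_non_eu and base_price_per_item > 50


# Both conditions must hold for the order to be held at customs.
examples = {
    (True, 60.0): held_at_customs(True, 60.0),
    (True, 40.0): held_at_customs(True, 40.0),
    (False, 60.0): held_at_customs(False, 60.0),
}
```

Since the rule is known exactly, a reliable explanation should attribute the prediction to the origin and to the price-related features, which is what the qualitative analysis checks.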

Fig. 6: Petri net discovered from the simulated P2P event logs

The beeswarm plot in Fig. 7(a) explains the decision at decision point (c). The Non-EU origin (high value of origin Non EU) has a strong positive impact on the probability of being held at customs according to the decision model. Moreover, the existence of items in category Odds and Ends, which have low base prices, has a negative impact on the probability, whereas the existence of items in category Electronics, which have high base prices, has a positive impact on the probability. When the individual product names, categories, and vendors are excluded (see Fig. 7(b)), the four most impactful features that remain are exactly the ones used in the logic of the underlying simulation model: the EU or Non-EU origin, the total price, and the number of items in the order. Overall, the decision logic as interpretable through the plots corresponds to the underlying logic applied in the simulation model, showing that the explanation obtained is reliable.


(a) Beeswarm plot visualizing the impact of high or low feature values on the model

probability of being held at customs. The Non-EU origin (high value of origin Non EU)

has a strong positive impact on the probability of being held at customs.

(b) Bar plot visualizing the mean absolute SHAP value of each selected feature,

per output class

Fig. 7: Qualitative Analysis showing the explanation plots of decision point (c) using

a neural network model.

7 Conclusions

In this paper, we proposed an approach to explainable predictive decision mining.

In the oﬄine phase of the approach, we derive decision models for diﬀerent decision

points. In the online phase, we predict decisions for running process instances

with explanations. We have implemented the approach as a web application and

evaluated the prediction accuracy using real-life event logs and the reliability of

explanations with a simulated business process.

This paper has several limitations. First, the explanation generated by the

proposed approach is less expressive than the logical expression generated by tradi-

tional decision mining techniques. Also, we abstract from the deﬁnition of features

that can be used to construct the situation tables, focusing on explaining several

possible features in the implementation. In future work, we plan to extend the

approach with a taxonomy of features to be used for the comprehensive construction

of situation tables. Moreover, we plan to connect the explainable predictive

insights to actual actions to improve the process.

Acknowledgment

The authors would like to thank the Alexander von Humboldt (AvH) Stiftung for

funding this research.

References

1. Apley, D.W., Zhu, J.: Visualizing the effects of predictor variables in black box supervised learning models. CoRR abs/1612.08468 (2016)

2. Bazhenova, E., Weske, M.: Deriving decision models from process models by enhanced decision mining. In: Reichert, M., Reijers, H.A. (eds.) BPM Workshop 2015. vol. 256, pp. 444–457. Springer (2015)

3. Frosst, N., Hinton, G.E.: Distilling a neural network into a soft decision tree. CoRR abs/1711.09784 (2017)

4. Gilpin, L.H., Bau, D., Yuan, B.Z., Bajwa, A., Specter, M.A., Kagal, L.: Explaining explanations: An approach to evaluating interpretability of machine learning. CoRR abs/1806.00069 (2018)

5. Goldstein, A., Kapelner, A., Bleich, J., Pitkin, E.: Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics 24(1), 44–65 (2015)

6. Greenwell, B.M., Boehmke, B.C., McCarthy, A.J.: A simple and effective model-based variable importance measure. CoRR abs/1805.04755 (2018)

7. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured process models from event logs - A constructive approach. In: Colom, J.M., Desel, J. (eds.) PETRI NETS 2013. vol. 7927, pp. 311–329. Springer (2013)

8. de Leoni, M., van der Aalst, W.M.P.: Data-aware process mining: discovering decisions in processes using alignments. In: Shin, S.Y., Maldonado, J.C. (eds.) 28th Annual ACM Symposium on Applied Computing. pp. 1454–1461. ACM (2013)

9. Lundberg, S.: SHAP library documentation. https://shap.readthedocs.io/en/latest/index.html#, accessed: 05.Aug.2022

10. Lundberg, S.M., Lee, S.: A unified approach to interpreting model predictions. In: Guyon, I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., Garnett, R. (eds.) NeurIPS 2017. pp. 4765–4774 (2017)

11. Mannhardt, F., de Leoni, M., Reijers, H.A., van der Aalst, W.M.P.: Decision mining revisited - discovering overlapping rules. In: Nurcan, S., Soffer, P., Bajec, M., Eder, J. (eds.) CAiSE 2016. vol. 9694, pp. 377–392. Springer (2016)

12. Park, G., van der Aalst, W.M.P.: Towards reliable business process simulation: A framework to integrate ERP systems. In: Augusto, A., Gill, A., Nurcan, S., Reinhartz-Berger, I., Schmidt, R., Zdravkovic, J. (eds.) BPMDS 2021. pp. 112–127

13. Park, G., van der Aalst, W.M.P.: Action-oriented process mining: bridging the gap between insights and actions. Progress in Artificial Intelligence (2022)

14. Ribeiro, M.T., Singh, S., Guestrin, C.: "Why should I trust you?": Explaining the predictions of any classifier. In: Krishnapuram, B., Shah, M., Smola, A.J., Aggarwal, C.C., Shen, D., Rastogi, R. (eds.) 22nd SIGKDD. pp. 1135–1144. ACM (2016)

15. Rozinat, A., van der Aalst, W.M.P.: Decision mining in ProM. In: Dustdar, S., Fiadeiro, J.L., Sheth, A.P. (eds.) BPM 2006. vol. 4102, pp. 420–425. Springer (2006)

16. Safavian, S.R., Landgrebe, D.A.: A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 21(3), 660–674 (1991)