
Discovering Hierarchical Processes

Using Flexible Activity Trees for Event Abstraction

Xixi Lu∗, Avigdor Gal†, and Hajo A. Reijers∗

∗Dept. of Information and Computing Sciences

Utrecht University, Utrecht, The Netherlands

x.lu@uu.nl, h.a.reijers@uu.nl

†Faculty of Industrial Engineering and Management

Technion – Israel Institute of Technology, Haifa, Israel

avigal@technion.ac.il

Abstract—Processes, such as patient pathways, can be very complex, comprising hundreds of activities and dozens of interleaved subprocesses. While existing process discovery algorithms have proven to construct high-quality models from clean logs of structured processes, applying them to logs of complex processes remains a challenge.

The creation of a multi-level, hierarchical representation of a

process can help to manage this complexity. However, current

approaches that pursue this idea suffer from a variety of

weaknesses. In particular, they do not deal well with interleaving

subprocesses. In this paper, we propose FlexHMiner, a three-

step approach to discover processes with multi-level interleaved

subprocesses. We implemented FlexHMiner in the open source

Process Mining toolkit ProM. We used seven real-life logs to

compare the qualities of hierarchical models discovered using

domain knowledge, random clustering, and ﬂat approaches. Our

results indicate that the hierarchical process models that the

FlexHMiner generates compare favorably to approaches that do

not exploit hierarchy.

Index Terms—Automated Process Discovery; Process Mining;

Event Abstraction; Model Abstraction; Hierarchical Process

Discovery

I. INTRODUCTION

Complex processes often comprise hundreds of activities and dozens of subprocesses that run in parallel [1], [2]. For

example, in healthcare, a patient follows a certain process that

consists of several procedures (e.g., lab test and surgery) for

treating a medical condition. Studies have shown qualitatively

that using hierarchical process models to represent complex

processes can improve understandability and simplicity by

“hiding less relevant information” [3]. In particular, modularized, hierarchical process models, in which process activities are collected into subprocesses, can be presented in a condensed, higher-level view that reveals more details upon request.

Process discovery, a prominent task of process mining, aims

at automatically constructing a process model from an event

log. Over the years, dozens of discovery algorithms have been

proposed [4]–[7]. These have proven to perform very well on

relatively clean logs of structured processes. However, given

the complexity of real-life processes, these algorithms tend to

discover overﬁtted or overgeneralized models that are difﬁcult

to comprehend [8]. Existing discovery algorithms tend to

disregard any hierarchical decomposition, whereas discovering

hierarchical process models may help decrease the complexity

of the discovery task.

A few hierarchical process discovery algorithms have been proposed [9]–[14]. While some approaches, such as [12], aim

to automatically detect subprocesses, these algorithms are rigid

in their assumptions and require extensive knowledge encoding

for every event in the log regarding the causalities between the

activities. Other approaches assume that subprocesses do not

interleave [9], [14], which lead to discovering inaccurately,

overly segmented models (see Fig. 1). It is also unclear how

these approaches compare to conventional, ﬂat models in terms

of quality measures such as ﬁtness and precision.

In this work, we propose FlexHMiner (FH), a three-step approach for the discovery of hierarchical models. We formalize

the concept of activity tree and event abstraction, which allows

us to be ﬂexible in the ways of computing the process hierar-

chy. We illustrate this ﬂexibility by proposing three different

techniques to discover an activity tree: (1) a fully domain-

based approach (DK-FH), (2) a random approach (RC-FH),

and (3) a fall-back, ﬂat activity tree (F-FH). After obtaining

an activity tree, the second step of our approach is to compute

the logs for each subprocess using log abstraction and log

projection. Finally, FlexHMiner discovers a subprocess model

for each subprocess by leveraging the capabilities of existing

discovery algorithms. Using the domain-based approach as

the gold standard and the flat tree approach as baseline, we

compare the three ways of discovering an activity tree using

seven real-life logs.

The main contribution of this work is a novel approach to

discover hierarchical process models from event logs, which

(1) can handle multi-level interleaving subprocesses and (2)

is flexible in its data requirements, so that it can adapt to situations both with and without domain knowledge. We evaluated FlexHMiner using seven real-life,

benchmark event logs1.

In the remainder, we deﬁne the research problem in Sect. II.

The proposed approach is described in Sect. III. The evaluation

1The source code and the results can be found at: github.com/xxlu/prom-FlexHMiner

arXiv:2010.08302v1 [cs.DB] 16 Oct 2020

(a) The sequential subprocesses

discovered by SCM [14].

(b) The interleaving subprocesses

discovered by our DK-FH.

Fig. 1: Difference in the discovered root models for the

BPIC2012 log.

TABLE I: An example of an event log of a healthcare process.

Event  Patient  Description   Subprocess   Act  Timestamp
e1     101      Visit         Contact (C)  Vi   10-10-2019
e2     101      Calcium       Labtest (L)  Ca   11-10-2019
e3     101      Register      Contact (C)  Re   12-10-2019
e4     101      Glucose       Labtest (L)  Gl   13-10-2019
e5     101      Consultation  Contact (C)  Cs   14-10-2019
e6     101      Consultation  Contact (C)  Cs   15-10-2019
e7     102      Register      Contact (C)  Re   16-10-2019
e8     102      Glucose       Labtest (L)  Gl   17-10-2019
...    ...      ...           ...          ...  ...

results are presented in Sect. IV. Sect. V discusses related

work, and Sect. VI concludes the paper.

II. ACTIVITY TREE AND PROCESS HIERARCHY

Preliminaries: Let σ = ⟨a1, ..., an⟩ ∈ Σ* be a sequence of activities, also called a trace. Let a multiset L ∈ B(Σ*) of traces be an event log. Tab. I shows a list of events; Fig. 2(a) shows the sequential traces in a graphical representation, where each square represents an event labeled with an activity. Given an event log, a process discovery algorithm D automatically constructs a process model M (e.g., in Petri net notation). The quality of such a model M can be assessed with respect to the log L using four quality dimensions: fitness, precision, generalization, and complexity [8].

To discover a hierarchical process model, we leverage a well-known concept that represents the hierarchical information of a process, which we call an activity tree. We define an activity tree as the hierarchical relations between the activities of a process. Formally, an activity tree is a non-overlapping, hierarchical clustering of activities.

Definition 1 (Activity tree). Let Σ be a set of activities and L an event log over Σ. Let A be a superset of Σ, i.e., Σ ⊂ A. Function γ : A → P(A) is a mapping that maps each activity x ∈ A to a set of activities X ⊂ A as the children of x. An activity tree (A, γ) is valid for L if and only if:

1) The children of any two labels do not overlap, i.e., for each x, y ∈ A, x ≠ y ⇔ γ(x) ∩ γ(y) = ∅.

2) The union of the leaves is Σ, i.e., the set of activities that occurred in log L: {x ∈ A | γ(x) = ∅} = Σ.

3) The tree (A, γ) is connected, i.e., for each x ∈ A, either there is y ∈ A such that x ∈ γ(y), or x is the root.

Note that constraints (2) and (3) imply that for all x ∈ A, x ∉ γ(x). Furthermore, it is also possible to define an activity tree as a directed acyclic graph in which each node represents an activity and has exactly one incoming edge, except the root node, which has no incoming edge.

Example 1. Fig. 2(b) exemplifies the activity tree (A, γ), where A = {root, C, L, S, Vi, Cs, Re, Ca, Gl, Cr, Or, Pr, Op} and, for example, γ(C) = {Vi, Cs, Re}.

We call the node that has no parent in γ the root node, and the corresponding process the root process. Each a ∈ A with γ(a) ≠ ∅ is called a (sub)process a, with γ(a) as its activities.

We define the height of a node in the activity tree, which is later used to recursively compute the abstracted logs bottom-up. Let height : A → N0 be a function that maps each x ∈ A to the height (a non-negative integer) of x in the activity tree (A, γ). For each x ∈ A, if γ(x) = ∅, then height(x) = 0; otherwise height(x) = 1 + max_{c∈γ(x)} height(c). A process model whose activity tree has a maximal height of 1 is called flat.
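To make Definition 1 and the height function concrete, the following is a minimal Python sketch (our own naming, not FlexHMiner code) that checks tree validity and computes heights on a fragment of the running example:

```python
# A minimal sketch (our own naming) of an activity tree: gamma maps each
# node to the set of its children; leaves map to the empty set.
def height(gamma, x):
    """Height per Sect. II: 0 for leaves, else 1 + max height of children."""
    children = gamma[x]
    return 0 if not children else 1 + max(height(gamma, c) for c in children)

def is_valid_tree(gamma, root, sigma):
    """Check the three conditions of Definition 1."""
    seen = set()                          # all nodes observed as a child so far
    for x in gamma:
        if gamma[x] & seen:               # (1) children of two nodes overlap
            return False
        seen |= gamma[x]
    leaves = {x for x in gamma if not gamma[x]}
    if leaves != sigma:                   # (2) leaves must equal Sigma
        return False
    return seen == set(gamma) - {root}    # (3) every node but the root has a parent

# Fragment of the running example: root -> {C, L}, C -> {Vi, Cs, Re}, L -> {Ca, Gl}
gamma = {
    "root": {"C", "L"},
    "C": {"Vi", "Cs", "Re"},
    "L": {"Ca", "Gl"},
    "Vi": set(), "Cs": set(), "Re": set(), "Ca": set(), "Gl": set(),
}
```

Here height(gamma, "root") is 2 and height(gamma, "C") is 1, so this fragment is not flat.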

III. THE FLEXHMINER ALGORITHM

In this section, we discuss the three steps of the FlexHMiner

approach: (1) compute an activity tree, (2) compute abstracted

logs, and (3) compute subprocess models.

A. Computing Activity Tree

For the ﬁrst step, computing an activity tree, we present

three different methods: one supervised, using domain knowledge; one automated, using random clustering; and one fall-back, using a flat tree.

1) Using Domain Knowledge: Domain knowledge can be

used, as a gold standard, to create an activity tree. This can

be done manually. In other situations, the log itself may

already contain an encoding of human knowledge that can

be utilized in creating the activity tree. In Tab. I, the column

Act demonstrates such an encoding of hierarchy. For instance, the activity label C_Vi indicates that the event belongs to subprocess C. Following the same strategy as the State Chart Miner (SCM) [14], a simple text parser splits the activity labels and converts the domain knowledge into an activity tree. For our experiments, we found seven real-life logs of two processes in which such hierarchy information is readily encoded in the activity labels2,3.

[Fig. 2: FlexHMiner applied on the running example: (a) log L1 of three traces; (b) the activity tree (A, γ); (c)-(e) the abstracted logs L2, L3, and L4 obtained by applying AL ← f↑(L1, "C"), AL ← f↑(L2, "L"), and AL ← f↑(L3, "S"); (f) the root process.]

2) Using Random Clustering: Let L ∈ B(Σ*) be an event log. Let maxSize ≥ 2 be the indicated maximal size of subprocesses. The random clustering algorithm, listed in Algorithm 1, takes Σ and maxSize as inputs and creates random subprocesses of size at most maxSize. It first initializes the activity tree (A, γ) using Σ (see Line 1). It then uses maxSize to determine the number n of subprocesses at the current height (see Lines 2-5). Next, it creates n parent nodes P and randomly assigns each activity c ∈ C to a cluster p ∈ P (see Lines 6-12), while updating A and γ accordingly (see Lines 8 and 11). To decide whether to continue with the next height, the current set of parent nodes P becomes the set of child nodes C (see Line 13): if the size of C (i.e., |C|) is greater than maxSize, the algorithm creates another level of parent nodes (as abstracted processes); otherwise, the while loop terminates and the root node is added (see Lines 15 and 16). In this paper, maxSize is set to 10 by default. We use random clustering to show an unsupervised way to create an activity tree and to investigate a possible baseline for the quality of hierarchical models.
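The level-by-level clustering of Algorithm 1 can be sketched in Python as follows (a sketch under our own naming; the generated parent labels and the seed parameter are illustrative, not part of the paper):

```python
import random

# A sketch of Algorithm 1: randomly cluster activities into parent nodes of
# at most max_size children, level by level, until at most max_size nodes
# remain; those become the children of the root.
def random_tree(activities, max_size=10, seed=None):
    rng = random.Random(seed)
    gamma = {a: set() for a in activities}       # leaves of the tree
    children = list(activities)
    level = 0
    while len(children) > max_size:
        n = (len(children) - 1) // max_size + 1  # number of parent clusters
        parents = [f"p{level}_{i}" for i in range(n)]
        for p in parents:
            gamma[p] = set()
        for c in children:
            # pick a random parent that still has room for another child
            p = rng.choice([q for q in parents if len(gamma[q]) < max_size])
            gamma[p].add(c)
        children = parents
        level += 1
    gamma["root"] = set(children)                # attach the remaining nodes
    return gamma
```

For example, 25 activities with max_size = 10 yield three random clusters at height 1, which together form the children of the root.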

3) Using Flat Tree: Given a log L over activities Σ, a flat activity tree (A, γ) (with height 1) is constructed as follows.

2 See the description of the BPI Challenge 2012 at https://www.win.tue.nl/bpi/doku.php?id=2012:challenge: “The event log is a merger of three intertwined sub processes. The first letter of each task name identifies from which sub process (source) it originated.”

3 See the description of the BPI Challenge 2015 at https://doi.org/10.4121/uuid:31a308ef-c844-48da-948c-305d167a0ec1: “The first two digits as well as the characters [of the activity label] indicate the subprocess the activity belongs to. For instance ... 01_BB_xxx indicates the ‘objections and complaints’ (‘Beroep en Bezwaar’ in Dutch) subprocess.”

We have A = Σ ∪ {root}. For all a ∈ Σ, γ(a) = ∅; and for root ∈ A, γ(root) = Σ. We use the flat tree approach as a fall-back.

B. Computing Abstracted Logs and Models

Given an activity tree (A, γ) and a log L, the second step uses the projection (f↓) and abstraction (f↑) functions to recursively compute sublogs for each non-leaf node in the tree. We first discuss the projection and abstraction functions, after which we present the algorithm.

1) Projection: Given a log L, an activity tree (A, γ), and a subprocess sp ∈ A, the projection of L on the activities γ(sp) of sp is rather straightforward and standard, and allows us to create a corresponding log for sp. The projection function simply retains the events of activities γ(sp) and removes the rest. Formally, given a trace σ = ⟨e⟩ · σ′: if e ∈ γ(sp), then f↓(σ, sp) = ⟨e⟩ · f↓(σ′, sp); otherwise, f↓(σ, sp) = ⟨⟩ · f↓(σ′, sp). We overload the function for any log L: f↓(L, sp) = ⋃σ∈L f↓(σ, sp), if f↓(σ, sp) ≠ ⟨⟩.

Example 2. For example, given the (simplified) trace σ1 = ⟨Vi, Ca, Re, Gl, Cs, Cs⟩, the subprocess C, and the activity hierarchy γ where γ(C) = {Vi, Cs, Re} (as shown in Fig. 2(a) and (b)), the trace of C is computed as f↓(σ1, C) = ⟨Vi, Re, Cs, Cs⟩.
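A minimal Python sketch of f↓, treating a trace as a list of activity labels (function names are ours, for illustration):

```python
# A sketch of the projection f↓: keep only the events whose activity
# belongs to the subprocess, in their original order, and drop the rest.
def project(trace, sp_activities):
    return [e for e in trace if e in sp_activities]

def project_log(log, sp_activities):
    """Project every trace; traces that become empty are dropped."""
    projected = [project(t, sp_activities) for t in log]
    return [t for t in projected if t]
```

On the trace of Example 2, project(["Vi", "Ca", "Re", "Gl", "Cs", "Cs"], {"Vi", "Cs", "Re"}) yields ["Vi", "Re", "Cs", "Cs"], matching f↓(σ1, C).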

2) Abstraction: The abstraction function f↑ returns the abstracted, intermediate log after abstracting (removing) the internal behavior of a subprocess sp. Essentially, the abstraction function hides irrelevant internal behavior by not retaining the detailed events of a subprocess; it only keeps the relevant behavior. In this paper, we consider both the start and the end events of a subprocess as the relevant behavior. In this way, the abstracted log only records when a subprocess starts and ends4. If a subprocess x has different start activities (e.g., Vi and Re), they are abstracted into a single start activity xs

4 Note that other abstraction functions can be conceived, e.g., a function that only retains the start or the end event of the subprocess.

Algorithm 1 Compute random tree (A, γ)
Input: the input log L over Σ, and the maximal size of a process maxSize
Output: activity tree (A, γ)
1: A ← Σ; for a ∈ Σ do γ(a) ← ∅   {initialize A and the leaves of γ}
2: C ← Σ   {use C to represent the child activities at the current height}
3: while |C| > maxSize do
4:   {create parent clusters P to ensure the size of a process is at most maxSize, and apply clustering to assign each c ∈ C to a parent cluster p}
5:   n ← ⌊(|C| − 1)/maxSize⌋ + 1   {calculate the number of parents}
6:   Create labels p1, ..., pn   {create n random parent nodes/processes}
7:   Initiate P ← {p1, ..., pn}; for p ∈ P do γ(p) ← ∅
8:   A ← A ∪ P
9:   for c ∈ C do
10:     Select a random p ∈ P where |γ(p)| < maxSize
11:     γ(p) ← γ(p) ∪ {c}
12:   end for   {all children have been assigned to a parent process}
13:   C ← P
14: end while   {|C| ≤ maxSize}
15: A ← A ∪ {root}   {create the root node/process and add it to A}
16: for c ∈ C do γ(root) ← γ(root) ∪ {c}   {assign each c ∈ C to the root process}
17: return (A, γ)

(e.g., Cs, see Fig. 2(c)). The same holds for the end activities, which are abstracted into a single end activity xe.

Let SP = γ(sp) denote the set of activities of subprocess sp. Let Is denote the index of the first event of sp in σ, i.e., Is = min_{e∈σ ∧ e∈SP} σ[e]. Similarly, we use Ie to refer to the index of the last event of sp in σ, i.e., Ie = max_{e∈σ ∧ e∈SP} σ[e].

We define f↑ as follows, for a trace σ = ⟨e⟩ · σ′:

f↑(σ, SP) =
  ⟨e⟩ · f↑(σ′, SP)     if e ∉ SP
  ⟨sps⟩ · f↑(σ′, SP)   if e ∈ SP ∧ σ[e] = Is
  ⟨spe⟩ · f↑(σ′, SP)   if e ∈ SP ∧ σ[e] = Ie
  ⟨⟩ · f↑(σ′, SP)      if e ∈ SP ∧ Is < σ[e] < Ie        (1)

We overload the function for any log L, i.e., f↑(L, sp) = ⋃σ∈L f↑(σ, sp)5.

Example 3. For example, after abstracting subprocess C from σ1, we have f↑(σ1, C) = ⟨Cs, Ca, Gl, Ce⟩. Similarly, the trace after abstracting subprocess L is f↑(σ1, L) = ⟨Vi, Ls, Re, Le, Cs, Cs⟩. To obtain the parent trace, we apply f↑ recursively; for instance, f↑(f↑(σ1, C), L) = f↑(f↑(σ1, L), C) = ⟨Cs, Ls, Le, Ce⟩ = σ″1, see Fig. 2(d).
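The abstraction f↑ of Eq. (1) can be sketched in Python as follows (our own naming; the start/end labels are formed by appending "s"/"e" purely for illustration, which reproduces Example 3):

```python
# A sketch of the abstraction f↑ (Eq. 1): replace the subprocess's first
# event by its start label and its last event by its end label, and drop
# the events in between.
def abstract(trace, sp, sp_activities):
    idx = [i for i, e in enumerate(trace) if e in sp_activities]
    if not idx:                      # subprocess does not occur in this trace
        return list(trace)
    out = []
    for i, e in enumerate(trace):
        if e not in sp_activities:
            out.append(e)            # event of another subprocess: keep
        elif i == idx[0]:
            out.append(sp + "s")     # abstracted start event sp_s
        elif i == idx[-1]:
            out.append(sp + "e")     # abstracted end event sp_e
        # internal events of the subprocess are dropped
    return out
```

Applying it twice to σ1 (for C, then L) yields ["Cs", "Ls", "Le", "Ce"], the parent trace σ″1 of Example 3, illustrating that the order of abstracting the two non-related subprocesses does not matter.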

The abstraction function f↑ is commutative and associative for non-related subprocesses6. These two properties allow Algorithm 2 to use the abstraction function in a recursive, bottom-up manner, going iteratively through the nodes. The algorithm creates an abstracted log at each height of the activity tree, as discussed in the next section.

3) Recursion: The algorithm computes the log mapping α and the model mapping β from three inputs: (1) an event log L, (2) an activity tree (A, γ), and (3) a flat process discovery algorithm D. It applies the projection function bottom-up for each subprocess on the abstracted log AL, starting at height 1 and stopping before the root process (see Lines 2-13). For every subprocess p ∈ P of the same height, the algorithm applies the projection function to obtain the sublog Lp for p (Line 8) and updates the log mapping and model mapping (see Lines 9 and 10). The algorithm then computes the intermediate, abstracted log AL using the abstraction function f↑ (see Line 11). The abstracted log AL is updated after each p ∈ P (see Lines 5-12) at height i. The algorithm terminates once it has processed the root process. It returns a hierarchical model (A, γ, α, β), where the log mapping α maps each (sub)process p ∈ A with γ(p) ≠ ∅ to the associated log Lp, and the model mapping β maps each p to the associated model Mp = D(Lp) of (sub)process p.

Example 4. Given the log L1 and the activity tree (A, γ) shown in Fig. 2(a) and (b), respectively, the three subprocesses C, L, and S have height 1. Fig. 2(c), (d), and (e) show the abstracted log AL obtained after each iteration of the inner loop at Lines 5-11. For example, in the first iteration (see Fig. 2(c)), applying f↑(L1, C) on log L1, we obtain AL = L2, where

5 For the sake of brevity, we consider the labels a = as = ae for both the projection and abstraction functions and only distinguish them when we apply a discovery algorithm.

6 Two nodes are non-related if they do not share ancestors or descendants in the activity tree.

TABLE II: Statistical information of the event logs.

Data         #acts  #evts    #case   #dpi   avg e/c  max e/c
BPIC12 *     36     262,200  13,087  4,366  20       175
BPIC17f *    18     337,995  21,861  1,024  18       32
BPIC15_1f *  70     21,656   902     295    24       50
BPIC15_2f *  82     24,678   681     420    36       63
BPIC15_3f *  62     43,786   1,369   826    32       54
BPIC15_4f *  65     29,403   860     451    34       54
BPIC15_5f *  74     30,030   975     446    31       61

the activities of C are removed from L2 and only the start and the end of C are retained (as discussed in Example 3). In the second iteration, the algorithm continues with log AL and applies f↑(AL, L); see Fig. 2(d).
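The bottom-up recursion of Algorithm 2 can be sketched as follows (a simplified sketch under our own naming; project and abstract are local helpers mirroring f↓ and f↑, discover stands in for any flat discovery algorithm D, and, for illustration, a child's abstracted start/end labels are matched alongside its own label, in the spirit of footnote 5):

```python
# A simplified sketch of Algorithm 2 (not the paper's ProM implementation):
# walk the activity tree bottom-up, project a sublog for each subprocess,
# then abstract that subprocess away before moving up the tree.
def discover_hierarchy(log, gamma, root, discover=lambda sublog: sublog):
    def height(x):
        return 0 if not gamma[x] else 1 + max(height(c) for c in gamma[x])

    def project(trace, labels):
        return [e for e in trace if e in labels]

    def abstract(trace, sp, labels):
        idx = [i for i, e in enumerate(trace) if e in labels]
        if not idx:
            return list(trace)
        out = []
        for i, e in enumerate(trace):
            if e not in labels:
                out.append(e)
            elif i == idx[0]:
                out.append(sp + "s")      # abstracted start event
            elif i == idx[-1]:
                out.append(sp + "e")      # abstracted end event
        return out

    alpha, beta = {}, {}                  # log mapping and model mapping
    al = [list(t) for t in log]           # AL, the abstracted log so far
    for h in range(1, height(root)):      # every height below the root
        for p in [x for x in gamma if x != root and gamma[x] and height(x) == h]:
            # a child c may already appear abstracted as "cs"/"ce" in AL
            labels = {v for c in gamma[p] for v in (c, c + "s", c + "e")}
            sublog = [t for t in (project(t, labels) for t in al) if t]
            alpha[p], beta[p] = sublog, discover(sublog)
            al = [abstract(t, p, labels) for t in al]
    alpha[root], beta[root] = al, discover(al)
    return alpha, beta
```

On the trace σ1 of Example 2 with subprocesses C and L, the sublog of C is [["Vi", "Re", "Cs", "Cs"]] and the root log becomes [["Cs", "Ls", "Le", "Ce"]], matching Examples 2-4.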

IV. EMPIRICAL EVALUATION

We implemented the FlexHMiner approach in the process

mining toolkit ProM7. We evaluated the quality of the models

created by FlexHMiner using seven real-life data sets and

compared the results.

In the following, we discuss the data sets and the experimental setup, followed by our empirical analysis. All experiments were run on an HP EliteBook with an Intel Core i7-8550U 1.80 GHz processor and 16 GB of memory, running Windows 10 Enterprise.

A. Data sets

An overview of the statistical information of the seven logs, including the number of distinct activities (acts), events (evts), cases, distinct sequences (dpi), and events per case (e/c), is given in Tab. II. These logs are the filtered logs used in [8] as a benchmark8, which allows us to compare our results with the qualities of the flat models reported in [8].

7 http://www.promtools.org/. The source code and results: github.com/xxlu/prom-FlexHMiner

8https://doi.org/10.4121/uuid:adc42403-9a38- 48dc-9f0a- a0a49bfb6371

Algorithm 2 Compute log hierarchy α and models β
Input: Log L, activity tree (A, γ), and discovery algorithm D
Output: Log mapping α and model mapping β
1: maxHeight ← max_{c∈A} height(c)   {calculate the maximal height of the tree}
2: AL ← L
3: for i = 1 to maxHeight − 1 do
4:   P ← {p ∈ A | height(p) = i}
5:   for p ∈ P do
6:     {iteratively go through each p ∈ P at the same height i and perform log projection and abstraction}
7:     C ← γ(p)   {get the activities C of subprocess p}
8:     Lp ← f↓(AL, C)   {project the log so far on the activities C}
9:     α(p) ← Lp
10:    β(p) ← D(Lp)
11:    AL ← f↑(AL, p)   {abstract the log so far using the activities C}
12:  end for   {the log so far AL has been abstracted from P to height i}
13: end for   {i == maxHeight}
14: α(root) ← AL   {map the root to the resulting abstracted log}
15: β(root) ← D(AL)   {map the root to the discovered model}
16: return (A, γ, α, β)

B. Experimental Setups

For computing the activity tree, we used the three methods: random (RC-FH), flat tree (F-FH), and domain knowledge (DK-FH). For computing hierarchical models, we use two state-of-the-art process discovery algorithms as D, namely the Inductive Miner with 0.2 path filtering (IMf) and the Split Miner (SM) [7] with standard parameter settings. According to the recent work of Augusto et al. [8], these two algorithms outperform others in terms of the four quality measures and execution time.

We ran these six configurations on the seven data sets. For each log and each of the two flat discovery algorithms, we also report the results from [8], which we denote by F*. Thus, there are eight rows in total for each data set.

C. Model Quality Measures

We assess the quality of a model M, with respect to a log L, using the following four dimensions:

• For fitness fi(L, M) ∈ [0, 1], we use the alignment-based fitness defined by Adriansyah et al. [15], also used in [8].

• For precision pr(L, M) ∈ [0, 1], we use the measure defined in [16], known as ETC-align.

• We also compute the F1-score, which is the harmonic mean of fitness and precision: f1(M, L) = 2 · fi(L, M) · pr(L, M) / (fi(L, M) + pr(L, M)).

• For generalization, we again follow the approach adopted in [8]. We divide the log into k = 3 subsets: L is randomly divided into L1, L2, L3, and ge(L, M) = (1/3) Σ_{1≤i≤3} 2 · fi(Li, Mi) · pr(L, Mi) / (fi(Li, Mi) + pr(L, Mi)), where Mi corresponds to Li.

• Finally, for complexity, we report the size (number of nodes) and the Control-Flow Complexity (CFC) of a model [8]. Let PN = (P, T, F, l, mi, mf) be a Petri net. Size(PN) = |P| + |T|, and

CFC(PN) = Σ_{t∈T ∧ (|•t|>1 ∨ |t•|>1)} 1 + Σ_{p∈P ∧ (|•p|>1 ∨ |p•|>1)} |p•|

Let (A, γ, α, β) be a returned hierarchical model. Let q ∈ {fi, pr, F1, Ge, CFC, Size} be any quality measure that takes a model M and/or a log L as inputs and returns the quality value of the model. We calculate and report the average quality measure q as follows: q(A, γ, α, β) = avg_{a∈A ∧ γ(a)≠∅} q(β(a), α(a)). For example, fi(A, γ, α, β) is calculated as the average of the individual fitness values of each subprocess in the hierarchical model.
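As a quick numeric check of the F1 formula and the subprocess averaging (a sketch, not the evaluation code; function names are ours):

```python
# The F1-score as the harmonic mean of fitness and precision, and the
# plain average used to aggregate a measure over all (sub)process models.
def f1(fi, pr):
    return 2 * fi * pr / (fi + pr)

def avg_quality(values):
    """Average a quality measure over the (sub)process models."""
    return sum(values) / len(values)

# The F* row for BPIC12 with IMf in Tab. III has fi = 0.98 and pr = 0.50,
# which gives F1 = 2 * 0.98 * 0.50 / 1.48 ≈ 0.66, as reported.
```
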

The computation of ﬁtness, precision, and generalization

uses a state-of-the-art technique known as alignment [15]9. As

mentioned, we also list the quality values reported by Augusto

et al. using “F*” [8]. Since we were unable to compute all

qualities of the ﬂat models, and for the sake of consistency,

we mainly discuss our results with respect to the results of

“F*”.

9A 60 minute time-out limit is set for computing the alignment of each

model. When the particular quality measurement could not be obtained due

to either syntactical or behavioral issues in the discovered model or a timeout

when computing the quality measures, we record it using the “−” symbol.

TABLE III: Evaluation results of models for the seven logs.

Data       CAlg   DAlg  Fi    Pr    F1    Ge    CFC  Size  #SPs  #Act
BPIC12     F-FH   IMf   -     -     -     -     66   115   1     24
           F*     IMf   0.98  0.50  0.66  0.66  37   59    1     -
           DK-FH  IMf   0.96  0.78  0.86  0.85  20   36    4     8
           RC-FH  IMf   0.97  0.78  0.86  0.88  20   35    4     8
           F-FH   SM    0.97  0.55  0.70  0.69  53   89    1     24
           F*     SM    0.75  0.76  0.75  0.76  32   53    1     -
           DK-FH  SM    0.89  0.94  0.91  0.91  10   22    4     8
           RC-FH  SM    0.92  0.90  0.90  0.90  14   28    3     8
BPIC17f    F-FH   IMf   0.98  0.57  0.72  0.73  20   51    1     18
           F*     IMf   0.98  0.70  0.82  0.82  20   35    1     -
           DK-FH  IMf   0.97  0.98  0.97  0.97  6    17    4     6
           RC-FH  IMf   0.98  0.90  0.94  0.92  9    22    3     7
           F-FH   SM    0.98  0.63  0.76  0.76  23   47    1     18
           F*     SM    0.95  0.85  0.90  0.90  17   32    1     -
           DK-FH  SM    0.96  0.98  0.97  0.97  6    18    4     6
           RC-FH  SM    0.98  0.92  0.95  0.95  8    20    3     7
BPIC15_1f  F-FH   IMf   0.96  0.36  0.52  0.51  128  217   1     70
           F*     IMf   0.97  0.57  0.72  0.72  108  164   1     -
           DK-FH  IMf   0.99  0.87  0.91  0.93  9    19    15    6
           RC-FH  IMf   0.97  0.84  0.88  0.88  16   29    9     10
           F-FH   SM    0.93  0.87  0.90  0.90  64   152   1     70
           F*     SM    0.90  0.88  0.89  0.89  43   110   1     -
           DK-FH  SM    0.96  0.98  0.97  0.97  4    14    15    6
           RC-FH  SM    0.88  0.99  0.93  0.93  8    22    9     10
BPIC15_2f  F-FH   IMf   -     -     -     -     183  313   1     82
           F*     IMf   0.93  0.56  0.70  0.70  123  193   1     -
           DK-FH  IMf   0.97  0.91  0.93  0.93  7    16    21    5
           RC-FH  IMf   0.95  0.85  0.89  0.88  16   33    10    10
           F-FH   SM    0.83  0.88  0.85  0.85  72   198   1     82
           F*     SM    0.77  0.90  0.83  0.82  41   122   1     -
           DK-FH  SM    0.94  0.99  0.97  0.97  4    13    21    5
           RC-FH  SM    0.83  0.96  0.89  0.89  11   26    10    10
BPIC15_3f  F-FH   IMf   -     -     -     -     136  244   1     62
           F*     IMf   0.95  0.55  0.70  0.69  108  159   1     -
           DK-FH  IMf   0.97  0.94  0.95  0.95  6    15    17    5
           RC-FH  IMf   0.94  0.87  0.90  0.89  16   33    8     10
           F-FH   SM    0.81  0.92  0.86  0.87  50   135   1     62
           F*     SM    0.78  0.94  0.85  0.85  29   90    1     -
           DK-FH  SM    0.95  0.99  0.97  0.97  3    12    17    5
           RC-FH  SM    0.87  0.98  0.92  0.92  10   23    7     9
BPIC15_4f  F-FH   IMf   -     -     -     -     121  242   1     65
           F*     IMf   0.96  0.58  0.72  0.72  111  162   1     -
           DK-FH  IMf   0.98  0.96  0.97  0.97  5    13    21    4
           RC-FH  IMf   0.97  0.81  0.87  0.87  18   35    8     10
           F-FH   SM    0.79  0.92  0.85  0.85  51   147   1     65
           F*     SM    0.73  0.91  0.81  0.80  31   96    1     -
           DK-FH  SM    0.95  0.99  0.97  0.97  2    11    21    4
           RC-FH  SM    0.83  0.98  0.90  0.89  8    24    8     10
BPIC15_5f  F-FH   IMf   -     -     -     -     142  255   1     74
           F*     IMf   0.94  0.18  0.30  0.61  95   134   1     -
           DK-FH  IMf   0.98  0.95  0.97  0.96  5    14    21    5
           RC-FH  IMf   0.98  0.80  0.87  0.87  19   34    9     10
           F-FH   SM    0.85  0.91  0.88  0.88  54   163   1     74
           F*     SM    0.79  0.94  0.86  0.85  30   102   1     -
           DK-FH  SM    0.96  1.00  0.98  0.98  2    11    20    5
           RC-FH  SM    0.83  0.98  0.90  0.90  8    23    9     10

On the seven starred data sets, there are two subprocesses for RC-FH-SM and one for DK-FH-SM (and none for DK-FH-IMf and RC-FH-IMf) for which the returned alignments were flagged as unreliable. As a result, the fitness of these models is −1, and consequently their precision and generalization cannot be computed. The results of these models are excluded when computing the average quality scores.

D. Results and Discussion

Tab. III summarises the results of our evaluation. For each data set (Data) and each of the two discovery algorithms (DAlg), we obtained four (hierarchical) models and report on the results. To also provide concrete examples, Fig. 4 shows the root process and three subprocesses obtained by applying DK-FH-IMf on the BPIC15_1f log, while the other subprocesses are hidden (abstracted). By contrast, Fig. 3 shows the flat model discovered by F-FH-IMf on the same log.

Fitness. Column Fi (Tab. III) reports the average fitness of the discovered models. For both IMf and SM, in five of the seven logs, the hierarchical models returned by DK-FH achieved a better fitness score than F*. More specifically, for IMf there is on average an increase of 0.015 (2%) in fitness when comparing F* to DK-FH. For SM, the improvement in fitness is more significant: the average increase is 17%, from 0.810 to 0.946. The moderate improvement for IMf is due to the fact that IMf tends to optimize fitness. As the average fitness of the seven models is already very high (0.959), there is little room for improvement.

Precision. Column Pr (Tab. III) lists the average precision of the models. When comparing the average precision scores of the subprocesses (DK-FH) to the flat models (F*), DK-FH achieves a considerable improvement in precision, especially for IMf. For all seven logs, the subprocesses obtained by both DK-FH-IMf and DK-FH-SM achieve an average precision higher than the F* approaches: for DK-FH-IMf, the average precision increases by 75.4% (from 0.520 to 0.912); for DK-FH-SM, there is an 11.2% increase (from 0.883 to 0.982).

An explanation for these significant improvements in average precision can be the following. As the subprocesses are relatively small and the sublogs do not contain interleaved behavior of other concurrent subprocesses, the discovery algorithm can discover rather sequential models of high precision while maintaining high fitness (see Fig. 4(b), (c), and (e)). When a subprocess itself contains much concurrent behavior (for example, see Fig. 4(d)), the highly concurrent, flexible behavior is localized within that one subprocess, without affecting the models of other subprocesses.

F1 score. For both IMf and SM and for all seven logs, the average F1 scores of the subprocesses of DK-FH outperform those of F*. This is mainly due to the improvement in fitness (for SM) and precision (for IMf), which leads to an increase in their harmonic mean. On average, DK-FH achieved an increase in F1 score over the seven logs of 41.9% for IMf (from 0.660 to 0.936) and 14.4% for SM (from 0.842 to 0.962).

Fig. 5 shows the distribution of the F1-scores of the models of each approach in more detail (IMf on the left and SM on the right). Compared to the F1 scores of the models by F-FH (see purple lines), DK-FH (blue boxplot and dots) always scores higher. Compared to the F* results, there are only three exceptional cases, all for IMf. We looked into these IMf models and found that many activities are put in parallel. Thus, fitness was very high (e.g., 0.972) but precision was very low (e.g., 0.335). Interestingly, for the exact same subprocess, SM was able to discover a much more sequential model, which still has a high fitness (0.958) but a much higher precision (1.00) and also a high generalization (0.979). This points to interesting future work: since DK-FH allows for the use of any discovery algorithm, one can design an algorithm that selects the best algorithm (model) for each subprocess to further increase the quality of the hierarchical models.

Generalization. Column Ge (Tab. III) reports the generalization scores (using 3-fold cross-validation). Over all seven logs, for both IMf and SM, there is an overall increase of 23.2% (from 0.771 to 0.950) in the generalization scores (33.3% for IMf and 14.8% for SM) when comparing DK-FH to F*.

Complexity. Since the subprocesses are much smaller than the flat model, the average complexity scores of the subprocesses of DK-FH are, as expected, significantly lower than those of F-FH or F*. For DK-FH-IMf, the CFC decreases by 90.6% (from 86.0 to 8.1), and for DK-FH-SM by 86.3% (from 31.9 to 4.4), when compared to the CFC scores of the F* models on the seven logs. The improvement is even more significant when compared to the models by F-FH.

Fig. 6 shows the distribution of CFC of the submodels by

two approaches in more detail: DK-FH (blue boxplots) and

F-FH (purple lines). It shows the signiﬁcant decrease in CFC

when using DK-FH.

Discussion. Overall, the results on the real-life logs show that the quality of the submodels returned by the DK-FH approach is higher than that returned by either the random clustering approach or the flat discovery algorithms. For example, compared to F*, DK-FH achieved an average increase of 0.07 in fitness, 0.25 in precision, and 0.18 in generalization. As expected, DK-FH outperforms RC-FH.

Interestingly, the random activity clustering approach (RC-FH) also shows relatively consistent improvements in the quality of the obtained submodels, compared to the flat tree approach and the flat discovery (F*), as listed in Tab. III and shown in Fig. 5. This may suggest that discovering hierarchical models has a certain beneficial effect on model quality, compared to discovering a complex, flat model.

Additional experiments are needed to validate this observation.

However, if this is indeed the case, this may allow future

process discovery algorithms to focus on small processes (say

less than 10-20 activities), while designing other algorithms for

clustering the activities and discovering the process hierarchy.

A related, interesting observation is that for some subprocesses and their sublogs, SM was able to discover better models than IMf, while in other cases IMf found better models than SM. Since FlexHMiner allows for the use of any discovery algorithm, this suggests that, as future work, one can build in a strategy that selects the best model for each subprocess to further increase the quality of the hierarchical models.

We also observed that it is rather trivial to flatten the full model or certain subprocesses of interest, while abstracting the others; see Fig. 4. For example, in Fig. 2(c) and (d), the logs L2 and L3 respectively show the abstraction of one or two subprocesses. Applying a discovery algorithm on L2 returns a model in which subprocess C is abstracted; for L3, a model is returned in which subprocesses C and L are abstracted, while the detailed activities of subprocess S are retained.
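The selective abstraction described above can be sketched as a simple trace projection. This is a simplified illustration, not the paper's exact projection function: events of an abstracted subprocess are collapsed into a single high-level event at the position of the subprocess's first event (reflecting the single-instance assumption), while events of the other subprocesses keep their original granularity.

```python
def abstract_trace(trace, activity_to_sub, abstracted):
    """Project a trace, abstracting only the selected subprocesses.

    `trace` is a sequence of activity labels, `activity_to_sub` maps
    each low-level activity to its subprocess, and `abstracted` is the
    set of subprocess labels to collapse into high-level events.
    """
    result, seen = [], set()
    for activity in trace:
        sub = activity_to_sub.get(activity)
        if sub in abstracted:
            if sub not in seen:      # first event of this subprocess
                seen.add(sub)
                result.append(sub)   # emit one high-level event
        else:
            result.append(activity)  # keep the low-level activity
    return result
```

Note that the interleaved events of an abstracted subprocess (e.g., two C-events separated by an S-event) are merged into one high-level event, which is precisely what allows a discovery algorithm applied to the projected log to return a model with that subprocess abstracted.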

Fig. 3: The flat model discovered using F-FH-IMf on the BPIC15_1f log.

Fig. 4: The root process and three subprocesses discovered by DK-FH-IMf on the BPIC15_1f log: (a) the root process; (b) subprocess "01_HOOFD"; (c) subprocess "01_HOOFD_0"; (d) subprocess "01_HOOFD_1"; (e) subprocess "01_HOOFD_2".

Fig. 5: Significantly higher F1-scores achieved by DK-FH (blue), followed by RC-FH (grey), compared to F-FH (purple) and the results reported in [8] (purple), on the seven benchmark data sets (IMf on the left, SM on the right).

V. RELATED WORK

In this section, we discuss related work regarding hierarchical process discovery and event abstraction, specifically.

Following the unsupervised strategy, Bose and van der Aalst [9] were among the first to propose detecting recurring consecutive activity sequences as subprocesses. This approach treats concurrent subprocesses as running sequentially and thus does not assume truly concurrent/interleaving subprocesses. Later, Tax et al. [17] proposed an approach to generate frequent subprocesses, also known as local process models. This approach enables detecting interleaving subprocesses in an unsupervised manner, which is used to create abstracted logs.

Fig. 6: A significant decrease in the complexity (CFC) achieved by DK-FH (blue), compared to F-FH (purple lines), for the 16 data sets.

Using the supervised strategy, the State Chart Miner (SCM) [14] extends the Inductive Miner (IM) and uses information in the activity labels to create a hierarchy and discover hierarchical models. SCM also focuses on sequential subprocesses, where non-consecutive events of the same instance are cut into separate traces. This assumption leads to the discovery of models that are overly segmented and fail to capture concurrent behavior, as shown in Fig. 1a. Furthermore, SCM can only use IM for discovering subprocesses.

Mannhardt et al. [13] require users to define complete behavioral models of subprocesses and their relations to compute abstracted logs. This prerequisite of specifying the full behavior of the low-level processes puts a considerable burden on the users. Moreover, the approach uses the alignment technique to compute high-level logs, which is computationally very expensive and can be nondeterministic.

Other approaches assume additional attributes that indicate the hierarchical information. Using a relational database as input, Conforti et al. [12] assume that each event has a set of primary and foreign keys that can be used as the subprocess instance identifier, in order to determine subprocesses and multi-instances. However, such event attributes may not be common in most event logs. As an alternative, Wang et al. [11] assume that the events contain start and complete timestamps and have explicit information on the follow-up events (i.e., the next intended activity, which they call "transfer" attributes). Senderovich et al. [18] propose the use of patient schedules in a hospital as an approximation of the actual life-cycle of a visit. These approaches cannot be applied in our case, since we do not find such attributes (explicit causal relations between events, start and complete timestamps, or the scheduling of activities) in our logs.

A very recent literature review conducted by van Zelst et al. [19] provides an extensive overview and taxonomy for classifying different event abstraction approaches. In [19], van Zelst et al. studied 21 articles in depth and compared them along seven dimensions. One dimension is particularly important to distinguish our approach, namely the supervision strategy. None of the 21 methods supports both supervised and unsupervised strategies, whereas we have shown that we can use both supervised (domain knowledge) and unsupervised (random clustering) methods to discover an activity tree for event abstraction.

Our evaluation has shown FlexHMiner to be flexible and applicable in practice. It is worth mentioning that our approach does assume that, for each case, each subprocess is executed at most once (i.e., single-instance), while a subprocess can contain loops. As future work, we can extend the algorithm to include, in addition to the abstraction and projection functions, multi-instance detection and segmentation techniques to handle multi-instance subprocesses.

VI. CONCLUSION

In this work, we investigated the hierarchical process discovery problem. We presented FlexHMiner, a general approach that supports flexible ways to compute a process hierarchy using the notion of an activity tree. We demonstrated this flexibility by proposing three methods for computing the hierarchy, which range from fully supervised, using domain knowledge, to fully automated, using a random approach. We investigated the quality of the hierarchical models discovered using these different methods. The empirical evidence, based on seven real-life logs, demonstrates that the supervised approach outperforms random clustering in terms of the four quality dimensions, while both methods outperform the flat model approaches, which clearly demonstrates the strengths of the FlexHMiner approach.

For future work, we plan to investigate different algorithms for computing activity trees to further improve the quality of hierarchical models. Also, the concept of activity trees can be extended to data-aware or context-aware activity trees. For example, an activity node can be associated with a data constraint (context), so that events labeled with the same activity but having different data attributes (contexts) can be abstracted into different subprocesses.

ACKNOWLEDGMENT

This research was supported by the NWO TACTICS project (628.011.004) and Lunet Zorg in the Netherlands. We would also like to thank the experts from the VUMC for their extremely valuable assistance and feedback in the evaluation.

REFERENCES

[1] S. J. J. Leemans, D. Fahland, and W. M. P. van der Aalst, "Scalable process discovery and conformance checking," Software and Systems Modeling, vol. 17, no. 2, pp. 599–631, 2018.
[2] X. Lu, S. A. Tabatabaei, M. Hoogendoorn, and H. A. Reijers, "Trace clustering on very large event data in healthcare using frequent sequence patterns," in BPM, ser. Lecture Notes in Computer Science, vol. 11675. Springer, 2019, pp. 198–215.
[3] H. A. Reijers, J. Mendling, and R. M. Dijkman, "Human and automatic modularizations of process models to enhance their comprehension," Inf. Syst., vol. 36, no. 5, pp. 881–897, 2011.
[4] S. J. J. Leemans, D. Fahland, and W. M. P. van der Aalst, "Discovering block-structured process models from event logs containing infrequent behaviour," in BPM Workshops 2013, Beijing, China, August 26, 2013, 2013, pp. 66–78.
[5] J. Carmona and J. Cortadella, "Process discovery algorithms using numerical abstract domains," IEEE Trans. Knowl. Data Eng., vol. 26, no. 12, pp. 3064–3076, 2014.
[6] A. J. M. M. Weijters and J. T. S. Ribeiro, "Flexible heuristics miner (FHM)," in CIDM. IEEE, 2011, pp. 310–317.
[7] A. Augusto, R. Conforti, M. Dumas, and M. La Rosa, "Split miner: Discovering accurate and simple business process models from event logs," in ICDM. IEEE Computer Society, 2017, pp. 1–10.
[8] A. Augusto, R. Conforti, M. Dumas, M. La Rosa, F. M. Maggi, A. Marrella, M. Mecella, and A. Soo, "Automated discovery of process models from event logs: Review and benchmark," IEEE Trans. Knowl. Data Eng., vol. 31, no. 4, pp. 686–705, 2019.
[9] R. J. C. Bose and W. M. P. van der Aalst, "Abstractions in process mining: A taxonomy of patterns," in BPM, ser. Lecture Notes in Computer Science, vol. 5701. Springer, 2009, pp. 159–175.
[10] F. M. Maggi, T. Slaats, and H. A. Reijers, "The automated discovery of hybrid processes," in BPM, ser. Lecture Notes in Computer Science, vol. 8659. Springer, 2014, pp. 392–399.
[11] Y. Wang, L. Wen, Z. Yan, B. Sun, and J. Wang, "Discovering BPMN models with sub-processes and multi-instance markers," in OTM Conferences, ser. Lecture Notes in Computer Science, vol. 9415. Springer, 2015, pp. 185–201.
[12] R. Conforti, M. Dumas, L. García-Bañuelos, and M. La Rosa, "BPMN miner: Automated discovery of BPMN process models with hierarchical structure," Inf. Syst., vol. 56, pp. 284–303, 2016.
[13] F. Mannhardt, M. de Leoni, H. A. Reijers, W. M. P. van der Aalst, and P. J. Toussaint, "From low-level events to activities - A pattern-based approach," in BPM, ser. Lecture Notes in Computer Science, vol. 9850. Springer, 2016, pp. 125–141.
[14] M. Leemans, W. M. P. van der Aalst, and M. G. J. van den Brand, "Recursion aware modeling and discovery for hierarchical software event log analysis," in SANER. IEEE Computer Society, 2018, pp. 185–196.
[15] A. Adriansyah, B. F. van Dongen, and W. M. P. van der Aalst, "Conformance checking using cost-based fitness analysis," in Enterprise Distributed Object Computing Conference (EDOC), 2011 15th IEEE International. IEEE, 2011, pp. 55–64.
[16] J. Munoz-Gama and J. Carmona, "A fresh look at precision in process conformance," in BPM, ser. Lecture Notes in Computer Science, vol. 6336. Springer, 2010, pp. 211–226.
[17] N. Tax, B. Dalmas, N. Sidorova, W. M. P. van der Aalst, and S. Norre, "Interest-driven discovery of local process models," Inf. Syst., vol. 77, pp. 105–117, 2018.
[18] A. Senderovich, M. Weidlich, A. Gal, A. Mandelbaum, S. Kadish, and C. A. Bunnell, "Discovery and validation of queueing networks in scheduled processes," in International Conference on Advanced Information Systems Engineering. Springer, 2015, pp. 417–433.
[19] S. J. van Zelst, F. Mannhardt, M. de Leoni, and A. Koschmider, "Event abstraction in process mining: literature review and taxonomy," Granular Computing, May 2020.