OPerA: Object-Centric Performance Analysis
Gyunam Park[0000-0001-9394-6513], Jan Niklas Adams[0000-0001-8954-4925], and
Wil M. P. van der Aalst[0000-0002-0955-6940]
Process and Data Science Group (PADS), RWTH Aachen University
{gnpark,niklas.adams,wvdaalst}@pads.rwth-aachen.de
Abstract. Performance analysis in process mining aims to provide in-
sights on the performance of a business process by using a process model
as a formal representation of the process. Such insights are reliably in-
terpreted by process analysts in the context of a model with formal
semantics. Existing techniques for performance analysis assume that a
single case notion exists in a business process (e.g., a patient in a healthcare
process). However, in reality, different objects might interact (e.g.,
order, item, delivery, and invoice in an O2C process). In such a setting,
traditional techniques may yield misleading or even incorrect insights on
performance metrics such as waiting time. More importantly, by consid-
ering the interaction between objects, we can define object-centric per-
formance metrics such as synchronization time, pooling time, and lagging
time. In this work, we propose a novel approach to performance analy-
sis considering multiple case notions by using object-centric Petri nets
as formal representations of business processes. The proposed approach
correctly computes existing performance metrics, while supporting the
derivation of newly-introduced object-centric performance metrics. We
have implemented the approach as a web application and conducted a
case study based on a real-life loan application process.
Keywords: Performance Analysis · Object-Centric Process Mining · Object-Centric Petri Net · Actionable Insights · Process Improvement
1 Introduction
Process mining provides techniques to extract insights from event data recorded
by information systems, including process discovery, conformance checking, and
performance analysis [1]. Especially performance analysis provides techniques to
analyze the performance of a business process, e.g., bottlenecks, using process
models as representations of the process [6].
Existing techniques for performance analysis have been developed, assum-
ing that a single case notion exists in business processes, e.g., a patient in a
healthcare process [5, 6, 10, 14, 16, 17, 19]. Such a case notion correlates the events of
a process instance and represents them as a single sequence, e.g., a sequence of
events of a patient. However, in real-life business processes supported by ERP
systems such as SAP and Oracle, multiple objects (i.e., multiple sequences of
events) exist in a process instance [2,7] and they share events (i.e., sequences are
overlapping).
arXiv:2204.10662v2 [cs.AI] 27 Jun 2022
Fig. 1(a) shows a process instance in a simple blood test process as
multiple overlapping sequences. The red sequence represents the event sequence
of test T1, whereas the blue sequences indicate the event sequences of samples
S1 and S2, respectively. The objects share the conduct test event (e4), i.e., all the
sequences overlap, and the samples share the transfer samples event (e6), i.e., the
sample sequences overlap.
Fig. 1. A motivating example showing misleading insights from existing approaches to
performance analysis and the proposed object-centric performance analysis
The goal of object-centric performance analysis is to analyze performance in
such “object-centric” processes with multiple overlapping sequences using 1) ex-
isting performance measures and 2) new performance measures considering the
interaction between objects. Fig. 1(b)(1) visualizes existing performance measures
related to the event conduct test. The waiting time of conduct test is the time
spent before conducting the test after preparing test T1 and samples S1 and
S2, the service time is the time spent for conducting the test, and the sojourn
time is the sum of waiting time and service time. Furthermore, Fig. 1(b)(2)
shows new performance measures considering the interaction between objects.
First, synchronization time is the time spent for synchronizing different objects,
i.e., samples S1 and S2 with test T1 to conduct the test. Next, pooling time is
the time spent for pooling all objects of an object type, e.g., the pooling time of
OPerA: Object-Centric Performance Analysis 3
conduct test w.r.t. sample is the time taken to pool the second sample. Third,
lagging time is the time spent due to the lag of an object type, e.g., the lagging
time of conduct test w.r.t. sample is the time taken due to the lag of the second
sample. Finally, flow time is the sum of sojourn time and synchronization time.
A natural way to apply existing techniques to multiple overlapping sequences
is to flatten them into a single sequence. To this end, we select one or more object
types as a case notion, remove events that involve no object of the selected types,
and replicate events that involve multiple objects of the selected types [2]. For
instance, Fig. 1(a) is flattened to Fig. 1(c) by using test as a case notion, to
Fig. 1(d) by using sample as a case notion, and to Fig. 1(e) by using both test
and sample as a case notion.
However, depending on the selection, flattening results in misleading insights.
Fig. 1(f) summarizes the correctness of object-centric performance analysis on
flattened sequences. 1) Flattening on test provides a misleading waiting time,
measured as the time difference between the complete time of prepare test and
the start time of conduct test, and, thus, a misleading sojourn time. 2) Flattening
on sample results in misleading insights on the service time since two service
times are measured despite the single occurrence of the event. 3) By flattening
on both test and sample, the waiting time for take sample is measured in relation
to prepare test although they are independent events from different object types.
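The flattening step can be sketched as follows. The event encoding below is our own minimal illustration (not the OCEL standard), and the start timestamp of e1 is assumed for completeness:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    eid: str
    activity: str
    start: int
    complete: int
    omap: dict  # object type -> set of object identifiers

def flatten(events, case_type):
    """Flatten an object-centric log on one object type: drop events
    without objects of that type and replicate each remaining event
    once per object of that type (the object id becomes the case id)."""
    flat = []
    for e in events:
        for oid in sorted(e.omap.get(case_type, set())):
            flat.append((oid, e.eid, e.activity, e.start, e.complete))
    return flat

# Blood-test fragment: e4 involves one test and two samples.
log = [
    Event("e1", "prepare test", 0, 15, {"test": {"T1"}}),
    Event("e4", "conduct test", 180, 240, {"test": {"T1"}, "sample": {"S1", "S2"}}),
]
# Flattening on "sample" drops e1 and replicates e4 for S1 and S2,
# which is why a single conduct test is counted as two services (Fig. 1(f)).
print(flatten(log, "sample"))
```

Replication and removal are exactly the two distortions discussed above: duplicated service times for shared events and lost dependencies between objects of different types.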
In this work, we suggest a novel approach to object-centric performance analysis.
The approach uses an Object-Centric Event Log (OCEL) that stores multiple
overlapping sequences without flattening (cf. Fig. 1(g)) as input. Moreover, we
use Object-Centric Petri Nets (OCPNs) [2] as a formalism to represent process
models, and the object-centric performance is analyzed in the context of these
models. With the formal semantics of OCPNs, we can reliably compute and
interpret performance analysis results, considering concurrency, loops, etc. [?]
More in detail, we first discover an OCPN that formally represents a process
model from the OCEL. Next, we replay the OCEL on the discovered OCPN to
produce token visits and event occurrences. Finally, we compute object-centric
performance measures using the token visits and event occurrences. For instance,
in the proposed approach, the waiting time of conduct test is computed as the
difference between e4's start and e3's complete. The synchronization time is
computed as the time difference between e3's complete and e1's complete.
In summary, we provide the following contributions.
1. Our approach correctly calculates existing performance measures in an object-
centric setting.
2. Our approach supports novel object-centric performance metrics taking the
interaction between objects into account, such as synchronization time.
3. The proposed approach has been implemented as a web application1 and
a case study with a real-life event log has been conducted to evaluate the
effectiveness of the approach.
The remainder is organized as follows. We discuss the related work in Sec. 2.
Next, we present the preliminaries, including OCELs and OCPNs in Sec. 3. In
1 A demo video, sources, and manuals are available at https://github.com/gyunamister/OPerA
Sec. 4, we explain the approach to object-centric performance analysis. After-
ward, Sec. 5 introduces the implementation of the proposed approach and a case
study using real-life event data. Finally, Sec. 6 concludes the paper.
2 Related Work
2.1 Performance Analysis in Process Mining
Performance analysis has been widely studied in the context of process min-
ing. Table 1 compares existing work and our proposed work in different criteria:
1) if formal semantics exist to analyze performance in the context of process
models, 2) if aggregated measures, e.g., mean and median, are supported, 3)
if frequency analysis is covered, 4) if time analysis is covered, and 5) if mul-
tiple case notions are allowed to consider the interactions of different objects.
Existing algorithms/techniques assume a single case notion, not considering the
interaction among different objects.
Table 1. Comparison of algorithms/techniques for performance analysis
Author                   Technique                 Form.  Agg.  Freq.  Perf.  Obj.
Maté et al. [17]         Business Strategy Model     -     X     X      X      -
Denisov et al. [10]      Performance Spectrum        -     X     X      X      -
Hornix [14]              Petri Nets                  X     X     X      X      -
Rogge-Solti et al. [19]  Stochastic Petri Nets       X     X     -      X      -
Leemans et al. [16]      Directly Follows Model      X     X     X      X      -
Adriansyah et al. [6]    Robust Performance          X     X     X      X      -
Adriansyah [5]           Alignments                  X     X     X      X      -
Our work                 Object-Centric              X     X     X      X      X
2.2 Object-Centric Process Mining
Traditionally, methods in process mining have the assumption that each event
is associated with exactly one case, viewing the event log as a set of isolated
event sequences. Object-centric process mining breaks with this assumption,
allowing one event to be associated with multiple cases and, thus, having shared
events between event sequences. An event log format has been proposed to store
object-centric event logs [13], as well as a discovery technique for OCPNs [2]
and a conformance checking technique to determine precision and fitness of the
net [4]. Furthermore, Esser and Fahland [11] propose a graph database as a
storage format for object-centric event data, enabling a user to use queries to
calculate different statistics. A study on performance analysis is, so far, missing
in the literature, with only limited metrics being supported in [2] by flattening
event logs and replaying them. However, object-centric performance metrics are
needed to accurately assess performance in processes where multiple case notions
occur.
The literature contains several notable approaches to deal with multiple case
notions. Proclets [12] were the first modeling technique introduced to describe
interacting workflow processes; later, artifact-centric modeling [9] extended
this approach. DB nets [18] are a modeling technique based on colored Petri
nets. OCBC [3] is a newly proposed technique that includes the evolution of a
database into an event log, allowing for the tracking of multiple objects. Object-
centric process mining aims to alleviate the weaknesses of these techniques. The
approaches and their weaknesses are more deeply discussed in [2].
3 Background
3.1 Object-Centric Event Data
Definition 1 (Universes). Let U_ei be the universe of event identifiers, U_act the
universe of activity names, U_time the universe of timestamps, U_ot the universe of
object types, and U_oi the universe of object identifiers. type ∈ U_oi → U_ot assigns
precisely one type to each object identifier. U_omap = {omap ∈ U_ot ⇸ P(U_oi) |
∀ot ∈ dom(omap) ∀oi ∈ omap(ot): type(oi) = ot} is the universe of all object
mappings, indicating which object identifiers are included per type. U_event =
U_ei × U_act × U_time × U_time × U_omap is the universe of events.
Given e = (ei, act, st, ct, omap) ∈ U_event, π_ei(e) = ei, π_act(e) = act, π_st(e) = st,
π_ct(e) = ct, and π_omap(e) = omap. Note that we assume an event has start and
complete timestamps.
Fig. 1(b) describes a fraction of a simple object-centric event log with two
types of objects. For the event in the fourth row, denoted as e4, π_ei(e4) = e4,
π_act(e4) = conduct test, π_st(e4) = 180, π_ct(e4) = 240, π_omap(e4)(test) = {T1},
and π_omap(e4)(sample) = {S1, S2}. Note that the timestamps in the example are
simplified using a relative scale.
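Definition 1 can be illustrated with a hypothetical plain-Python encoding of events as tuples (ei, act, st, ct, omap); the dictionary obj_type plays the role of the function type, and all names are our own:

```python
# type: assigns precisely one type to each object identifier.
obj_type = {"T1": "test", "S1": "sample", "S2": "sample"}

def is_valid_omap(omap):
    """The U_omap constraint: every object id listed under type ot
    must actually be of type ot."""
    return all(obj_type[oi] == ot for ot, oids in omap.items() for oi in oids)

# The event in the fourth row of the example log.
e4 = ("e4", "conduct test", 180, 240, {"test": {"T1"}, "sample": {"S1", "S2"}})

# Projection functions on events (Definition 1).
pi_ei   = lambda e: e[0]
pi_act  = lambda e: e[1]
pi_st   = lambda e: e[2]
pi_ct   = lambda e: e[3]
pi_omap = lambda e: e[4]

print(pi_st(e4), pi_ct(e4), sorted(pi_omap(e4)["sample"]))  # 180 240 ['S1', 'S2']
```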
Definition 2 (Object-Centric Event Log (OCEL)). An object-centric event
log is a tuple L = (E, ≼_E), where E ⊆ U_event is a set of events and ≼_E ⊆ E × E
is a total order underlying E. U_L is the set of all possible object-centric event
logs.
3.2 Object-Centric Petri Nets
A Petri net is a directed graph having places and transitions as nodes and flow
relations as edges. A labeled Petri net is a Petri net where the transitions can
be labeled.
Definition 3 (Labeled Petri Net). A labeled Petri net is a tuple N = (P, T, F, l)
with P the set of places, T the set of transitions, P ∩ T = ∅, F ⊆ (P × T) ∪ (T × P)
the flow relation, and l ∈ T ⇸ U_act a labeling function.
Each place in an OCPN is associated with an object type to represent inter-
actions among different object types. Besides, variable arcs represent the con-
sumption/production of a variable amount of tokens in one step.
Definition 4 (Object-Centric Petri Net). An object-centric Petri net is a
tuple ON = (N, pt, F_var) where N = (P, T, F, l) is a labeled Petri net, pt ∈ P →
U_ot maps places to object types, and F_var ⊆ F is the subset of variable arcs.
Fig. 3(a) depicts an OCPN, ON_1 = (N, pt, F_var) with N = (P, T, F, l) where
P = {p1, . . . , p9}, T = {t1, . . . , t6}, F = {(p1, t1), (p2, t2), . . . }, l(t1) = prepare test,
etc., pt(p1) = test, pt(p2) = sample, etc., and F_var = {(p4, t3), (t3, p6), . . . }.
Definition 5 (Marking). Let ON = (N, pt, F_var) be an object-centric Petri net,
where N = (P, T, F, l). Q_ON = {(p, oi) ∈ P × U_oi | type(oi) = pt(p)} is the set of
possible tokens. A marking M of ON is a multiset of tokens, i.e., M ∈ B(Q_ON).
For instance, marking M1 = [(p3, T1), (p4, S1), (p4, S2)] denotes three tokens,
among which place p3 has one token referring to object T1 and p4 has two
tokens referring to objects S1 and S2.
A binding describes the execution of a transition, consuming objects from its
input places and producing objects for its output places. A binding (t, b) is a
tuple of a transition t and a function b mapping the object types of the surrounding
places to sets of object identifiers. For instance, (t3, b1) describes the execution
of transition t3 with b1 where b1(test) = {T1} and b1(sample) = {S1, S2}, where
test and sample are the object types of its surrounding places (i.e., p3, p4, p5, p6).
A binding (t, b) is enabled in marking M if all the objects specified by b exist
in the input places of t. For instance, (t3, b1) is enabled in marking M1 since T1,
S1, and S2 exist in its input places, i.e., p3 and p4.
A new marking M′ is reached by executing the enabled binding (t, b) at M,
denoted by M −(t,b)→ M′. As a result of executing (t3, b1), T1 is removed from
p3 and added to p5. Besides, S1 and S2 are removed from p4 and added to p6,
resulting in the new marking M′ = [(p5, T1), (p6, S1), (p6, S2)].
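The firing rule can be sketched with markings as multisets of (place, object) tokens, using Python's Counter; the t3_in/t3_out tables are our own shorthand for the input and output places of t3 per object type in Fig. 3(a):

```python
from collections import Counter

# Markings are multisets of (place, object id) tokens (Definition 5).
M1 = Counter({("p3", "T1"): 1, ("p4", "S1"): 1, ("p4", "S2"): 1})

# Input/output places of t3 per object type, as in Fig. 3(a).
t3_in  = {"test": ["p3"], "sample": ["p4"]}
t3_out = {"test": ["p5"], "sample": ["p6"]}

def tokens(arcs, binding):
    """Tokens consumed from (or produced in) the given places per type."""
    return Counter((p, oi) for ot, places in arcs.items()
                   for p in places for oi in binding[ot])

def fire(M, t_in, t_out, binding):
    """Execute a binding: consume the bound objects from the input
    places and produce them in the output places."""
    consumed = tokens(t_in, binding)
    if consumed - M:  # some required token is missing -> not enabled
        raise ValueError("binding not enabled")
    return M - consumed + tokens(t_out, binding)

b1 = {"test": {"T1"}, "sample": {"S1", "S2"}}
M2 = fire(M1, t3_in, t3_out, b1)
print(sorted(M2))  # [('p5', 'T1'), ('p6', 'S1'), ('p6', 'S2')]
```

Attempting to fire a binding whose objects are not all present in the input places raises an error, mirroring the enabledness condition above.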
4 Object-Centric Performance Analysis
This section introduces an approach to object-centric performance analysis.
Fig. 2 shows an overview of the proposed approach. First, we discover an OCPN
based on an OCEL. Next, we replay the OCEL with timestamps on the dis-
covered OCPN to connect events in the OCEL to the elements of OCPN and
compute event occurrences and token visits. Finally, we measure various
object-centric performance metrics based on the event occurrences and token visits. The
discovery follows the general approach presented in [2]. In the following subsec-
tions, we focus on explaining the rest.
Fig. 2. An overview of the proposed approach.
4.1 Replaying OCELs on OCPNs
First, we couple events in an OCEL to an OCPN by “playing the token game”
using the formal semantics of OCPNs. Note that most business processes are
not sequential, and, thus, simply relating an event to its directly-following event
does not work. By using the semantics of OCPNs, we can reliably relate events
to process models by considering concurrency and loops, and correctly identify
relationships between events.
As a result of the replay, a collection of event occurrences is annotated to
each visible transition, and a collection of token visits is recorded for each place.
First, an event occurrence represents the occurrence of an event in relation to a
transition.
Definition 6 (Event Occurrence). Let ON = (N, pt, F_var) be an object-centric
Petri net, where N = (P, T, F, l). An event occurrence eo ∈ T × U_event is a tuple
of a transition and an event. O_ON is the set of possible event occurrences of ON.
For instance, (t3, e4) ∈ O_ON1 is a possible event occurrence in ON_1 shown
in Fig. 3(a). It indicates that t3 is associated with the occurrence of event e4. In
other words, t3 is fired by 1) consuming tokens (p3, T1) from p3 and (p4, S1)
and (p4, S2) from p4 at 180 and 2) producing tokens (p5, T1) to p5 and (p6, S1)
and (p6, S2) to p6 at 240. Note that we derive the consumed and produced tokens
by using the transition and the event, i.e., we are aware of the input and output
places of the transition and the involved objects of the event. Moreover, we know
when the event starts and completes.
A token visit describes the "visit" of a token to the corresponding place, with
the begin time of the visit, i.e., the timestamp at which the token is produced,
and the end time of the visit, i.e., the timestamp at which the token is consumed.
Definition 7 (Token Visit). Let ON = (N, pt, F_var) be an object-centric Petri
net, where N = (P, T, F, l). Q_ON = {(p, oi) ∈ P × U_oi | type(oi) = pt(p)} is the set
of possible tokens. A token visit tv ∈ Q_ON × U_time × U_time is a tuple of a token,
a begin time, and an end time. TV_ON is the set of possible token visits of ON.
Given token visit tv = ((p, oi), bt, et), π_p(tv) = p, π_oi(tv) = oi, π_bt(tv) = bt, and
π_et(tv) = et. For instance, ((p3, T1), 15, 180) ∈ TV_ON1 is a possible token visit
Fig. 3. An example of replaying object-centric event logs on an object-centric Petri net
in ON_1 shown in Fig. 3. It represents that token (p3, T1) ∈ Q_ON1 is produced
in place p3 at 15 and consumed at 180.
Given an OCEL, a replay function produces event occurrences and token
visits of an OCPN, connecting events in the log to the OCPN.
Definition 8 (Replay). Let ON be an object-centric Petri net. A replay function
replay_ON ∈ U_L → P(O_ON) × P(TV_ON) maps an event log to a set of event
occurrences and a set of token visits.
Fig. 3(b) shows the result of replaying the events of L1 shown in Fig. 3(a) on
the model ON_1 depicted in Fig. 3(a). The dark gray boxes represent the event
occurrences O1 and the light gray boxes represent the token visits V1, where
replay_ON1(L1) = (O1, V1). For instance, replaying events e1 and e4 in L1 produces
the event occurrences (t1, e1) and (t3, e4), respectively, and the token visit
((p3, T1), 15, 180), where 15 is the time when e1 completes and 180 is the time
when e4 starts.
In this work, we instantiate the replay function based on the token-based
replay approach described in [8]. We first flatten an OCEL to a traditional event
log and project an OCPN to an accepting Petri net for each object type. Next,
we apply the token-based replay for each log and Petri net, as introduced in [2].
The replay function needs to be instantiated to ignore non-fitting events to deal
with logs with non-perfect fitness. To simplify matters, we assume the flattened
logs perfectly fit the projected Petri nets (i.e., no missing or remaining tokens).
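The full token-based replay of [8] is more involved; the sketch below makes strong simplifying assumptions: the log fits perfectly, each object's events are strictly sequential, and the place connecting two consecutive activities of an object is known from the discovered OCPN (the place_between table is our own shorthand, and the start times of e1, e2, and e3 are assumed for illustration). A token visit then begins when the producing event completes and ends when the consuming event starts:

```python
from collections import defaultdict

# Events: (eid, activity, start, complete, {type: {ids}}).
# Assumed mapping from consecutive activities of an object to the
# connecting place, taken from the OCPN of Fig. 3(a).
place_between = {("prepare test", "conduct test"): "p3",
                 ("take sample", "conduct test"): "p4"}

def replay(events):
    """Simplified replay for perfectly fitting, per-object sequential
    behavior: one event occurrence per event, and one token visit between
    each pair of consecutive events of the same object."""
    per_obj = defaultdict(list)
    for e in events:
        for oids in e[4].values():
            for oi in oids:
                per_obj[oi].append(e)
    occurrences = {(e[1], e[0]) for e in events}  # (transition label, event id)
    visits = []
    for oi, seq in per_obj.items():
        seq.sort(key=lambda e: e[2])  # order each object's events by start time
        for prev, nxt in zip(seq, seq[1:]):
            p = place_between[(prev[1], nxt[1])]
            visits.append(((p, oi), prev[3], nxt[2]))  # (token, begin, end)
    return occurrences, visits

log = [("e1", "prepare test", 0, 15, {"test": {"T1"}}),
       ("e2", "take sample", 100, 120, {"sample": {"S1"}}),
       ("e3", "take sample", 130, 150, {"sample": {"S2"}}),
       ("e4", "conduct test", 180, 240, {"test": {"T1"}, "sample": {"S1", "S2"}})]
occ, tv = replay(log)
print(sorted(tv))  # includes (('p3', 'T1'), 15, 180), as in Fig. 3(b)
```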
4.2 Measuring Object-Centric Performance Measures
We compute object-centric performance measures per event occurrence. For
instance, we compute the synchronization, pooling, lagging, and waiting time of
(t3, e4), which analyzes an event of conduct test. For meaningful insights, we may
aggregate all waiting time measures of conduct test events into the average,
median, maximum, or minimum waiting time of conduct test.
To this end, we first relate an event occurrence to the token visits 1) associ-
ated with the event occurrence’s transition and 2) involving the objects linked
to the event occurrence’s event.
Definition 9 (Relating an Event Occurrence to Token Visits). Let L be
an object-centric event log and ON an object-centric Petri net. Let eo = (t, e) ∈ O
be an event occurrence. OI(eo) = ⋃_{ot ∈ dom(π_omap(e))} π_omap(e)(ot) denotes the
set of objects related to the event occurrence. rel_ON ∈ O_ON × P(TV_ON) → P(TV_ON)
is a function mapping an event occurrence and a set of token visits to the set of
the token visits related to the event occurrence, s.t., for any eo ∈ O_ON and V ⊆ TV_ON,
rel_ON(eo, V) = ⋃_{oi ∈ OI(eo)} argmax_{tv ∈ {tv′ ∈ V | π_p(tv′) ∈ •t ∧ π_oi(tv′) = oi}} π_bt(tv).
Fig. 4(a) shows the token visits related to eo1 = (t3, e4): rel_ON1(eo1, V1) =
{tv1 = ((p3, T1), 15, 180), tv2 = ((p4, S1), 120, 180), tv3 = ((p4, S2), 150, 180)}
since p3, p4 ∈ •t3, {T1, S1, S2} ⊆ OI(eo1), and each token visit has the latest
begin time among the token visits of the corresponding object, e.g., tv1 is the
latest token visit of T1.
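Per object, Definition 9 reduces to an argmax over the begin times of that object's visits to the input places of the transition. A minimal sketch (the earlier visit of S1 with begin time 0 is invented purely to show the argmax selecting the latest one):

```python
# Token visits: ((place, object id), begin, end), as produced by replay.
V1 = [(("p3", "T1"), 15, 180), (("p4", "S1"), 120, 180),
      (("p4", "S2"), 150, 180),
      (("p4", "S1"), 0, 60)]  # assumed earlier visit of S1, for illustration

def rel(event_objects, input_places, V):
    """Definition 9: for each object of the event, keep the token visit
    with the latest begin time among its visits to the input places."""
    related = []
    for oi in event_objects:
        cand = [tv for tv in V if tv[0][0] in input_places and tv[0][1] == oi]
        related.append(max(cand, key=lambda tv: tv[1]))  # argmax over begin time
    return related

# eo1 = (t3, e4): objects {T1, S1, S2}, input places of t3 are {p3, p4}.
print(rel(["T1", "S1", "S2"], {"p3", "p4"}, V1))
# [(('p3', 'T1'), 15, 180), (('p4', 'S1'), 120, 180), (('p4', 'S2'), 150, 180)]
```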
A measurement function computes a performance measure of an event oc-
currence by using the related token visits.
Definition 10 (Measurement). Let ON be an object-centric Petri net. measure ∈
O_ON × P(TV_ON) → ℝ is a function mapping an event occurrence and its related
token visits to a performance value. U_m denotes the set of all such functions.
In this paper, we introduce seven measurement functions to compute object-centric
performance measures, as shown in Fig. 4(c). With L an OCEL, ON
an OCPN, and (O, V) = replay_ON(L), we introduce the functions with formal
definitions and examples below:
- flow ∈ U_m computes the flow time, i.e., the time difference between the completion of the event and the earliest token visit related to the event. Formally, for any eo = (t, e) ∈ O, flow(eo, V) = π_ct(e) − min(T) with T = {π_bt(tv) | tv ∈ rel_ON(eo, V)}. In Fig. 4(c), the flow time of eo1 is the time difference between the completion of the event, i.e., the completion time of e4 (240), and the earliest related token visit, i.e., the begin time of tv1 (15). Note that the flow time is equal to the sum of synchronization time and sojourn time.
- sojourn ∈ U_m computes the sojourn time, i.e., the time difference between the completion of the event and the latest token visit related to the event. Formally, for any eo = (t, e) ∈ O, sojourn(eo, V) = π_ct(e) − max(T) with T = {π_bt(tv) | tv ∈ rel_ON(eo, V)}. In Fig. 4(c), the sojourn time of eo1 is the time difference between the completion of the event, i.e., the completion time of e4 (240), and the latest related token visit, i.e., the begin time of tv3 (150). Note that the sojourn time is equal to the sum of waiting time and service time.

Fig. 4. An example of the corresponding token visits of an event occurrence and the object-centric performance measures of the event occurrence

- wait ∈ U_m computes the waiting time, i.e., the time difference between the start of the event and the latest token visit related to the event. Formally, for any eo = (t, e) ∈ O, wait(eo, V) = π_st(e) − max(T) with T = {π_bt(tv) | tv ∈ rel_ON(eo, V)}. In Fig. 4(c), the waiting time of eo1 is the time difference between its start, i.e., the start time of e4 (180), and the latest related token visit, i.e., the begin time of tv3 (150).
- service ∈ U_m computes the service time, i.e., the time difference between the completion of the event and the start of the event. Formally, for any eo = (t, e) ∈ O, service(eo, V) = π_ct(e) − π_st(e). In Fig. 4(c), the service time of eo1 is the time difference between the completion time of e4 (240) and the start time of e4 (180).
- sync ∈ U_m computes the synchronization time, i.e., the time difference between the latest token visit and the earliest token visit related to the event. Formally, for any eo = (t, e) ∈ O, sync(eo, V) = max(T) − min(T) with T = {π_bt(tv) | tv ∈ rel_ON(eo, V)}. In Fig. 4(c), the synchronization time of eo1 is the time difference between the latest token visit, i.e., the begin time of tv3 (150), and the earliest token visit, i.e., the begin time of tv1 (15). Note that the synchronization time consists of pooling time and lagging time.
- pool_ot ∈ U_m computes the pooling time w.r.t. object type ot, i.e., the time difference between the latest token visit of ot and the earliest token visit of ot related to the event. Formally, for any eo = (t, e) ∈ O, pool_ot(eo, V) = max(T) − min(T) with T = {π_bt(tv) | tv ∈ rel_ON(eo, V) ∧ type(π_oi(tv)) = ot}. In Fig. 4(c), the pooling time of eo1 w.r.t. sample is the time difference between the latest token visit of sample, i.e., the begin time of tv3 (150), and the earliest token visit of sample, i.e., the begin time of tv2 (120). Note that the pooling time can be the same as the synchronization time.
- lag_ot ∈ U_m computes the lagging time w.r.t. object type ot, i.e., the time difference between the latest token visit of ot and the earliest token visit of the other object types related to the event. Formally, for any eo = (t, e) ∈ O, lag_ot(eo, V) = max(T′) − min(T) if max(T′) > min(T) and 0 otherwise, with T′ = {π_bt(tv) | tv ∈ rel_ON(eo, V) ∧ type(π_oi(tv)) = ot} and T = {π_bt(tv) | tv ∈ rel_ON(eo, V) ∧ type(π_oi(tv)) ≠ ot}. In Fig. 4(c), the lagging time of eo1 w.r.t. sample is the time difference between the latest token visit of sample, i.e., the begin time of tv3 (150), and the earliest token visit of the other object types, i.e., the begin time of tv1 (15). Note that, in some cases, the lagging time is the same as the synchronization time.
Non-temporal performance measures are trivial to compute given object-centric
event data but still provide valuable insights. They include the object frequency,
i.e., the number of objects involved in the event, and the object type frequency,
i.e., the number of object types involved in the event. In Fig. 4(c), the object
frequency of e4 is 3 (T1, S1, and S2) and the object type frequency of e4 is 2
(test and sample).
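For the running example eo1 = (t3, e4), all seven measures can be computed from the related token visits of Fig. 4(a) in a few lines; the encoding is our own, the timestamps are those of the example:

```python
# Related token visits of eo1 = (t3, e4): ((place, object id), begin, end).
visits = [(("p3", "T1"), 15, 180), (("p4", "S1"), 120, 180),
          (("p4", "S2"), 150, 180)]
obj_type = {"T1": "test", "S1": "sample", "S2": "sample"}
st, ct = 180, 240  # start and complete timestamps of e4

begins = [b for (_, b, _) in visits]
T = lambda ot: [b for ((p, oi), b, _) in visits if obj_type[oi] == ot]

service = ct - st                      # completion minus start
wait    = st - max(begins)             # start minus latest begin
sojourn = ct - max(begins)             # = wait + service
sync    = max(begins) - min(begins)    # latest minus earliest begin
flow    = ct - min(begins)             # = sync + sojourn
pool    = lambda ot: max(T(ot)) - min(T(ot))
def lag(ot):                           # lag of ot behind the other types
    others = [b for ((p, oi), b, _) in visits if obj_type[oi] != ot]
    return max(0, max(T(ot)) - min(others)) if others else 0

print(service, wait, sojourn, sync, flow, pool("sample"), lag("sample"))
# 60 30 90 135 225 30 135
```

The identities flow = sync + sojourn and sojourn = wait + service can be read off directly from the printed values.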
5 Evaluation
In this section, we present the implementation of the proposed approach and
evaluate the effectiveness of the approach by applying it to a real-life event log.
5.1 Implementation
The approach discussed in Sec. 4 has been fully implemented as a web applica-
tion2 with a dedicated user interface. We ship it as a Docker container,
structuring the functional components into a coherent set of microservices. The fol-
lowing functions are supported:
- Importing object-centric event logs in different formats, including OCEL JSON, OCEL XML, and CSV.
- Discovering OCPNs based on the general approach presented in [2] with the Inductive Miner directly-follows process discovery algorithm [15].
- Replaying tokens with timestamps on OCPNs based on the token-based replay approach suggested in [8].
- Computing object-centric performance measures based on the replay results, i.e., event occurrences and token visits.
- Visualizing OCPNs with the object-centric performance measures.

2 A demo video, sources, and manuals are available at https://github.com/gyunamister/OPerA
Fig. 5. A screenshot of the application: importing Object-Centric Event Logs
(OCELs). (1) Importing OCELs in OCEL JSON, OCEL XML, and CSV formats.
(2) Preprocessing OCELs. (3) Displaying OCELs.
5.2 Case Study: Loan Application Process
Using the implementation, we conduct a case study on a real-life loan application
process of a Dutch Financial Institute3. Two object types exist in the process:
application and offer. An application can have one or more offers. First, a cus-
tomer creates an application by visiting the bank or using an online system. In
the former case, submit activity is skipped. After the completion and acceptance
of the application, the bank offers loans to the customer by sending the offer to
the customer and making a call. An offer is either accepted or canceled.
3 doi.org/10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f
Fig. 6. A screenshot of the application: analyzing and visualizing object-centric
performance measures. (1) Selecting object-centric performance measures, aggregations,
and a time period to analyze. (2) An object-centric Petri net visualizing the computed
performance measures. (3) Visualizing the detailed performance measures of an activity
selected from the model.
In this case study, we focus on the offers canceled due to various reasons. We
filter infrequent behaviors by selecting the ten most frequent types of process
executions. Moreover, we remove redundant activities, e.g., status updates such
as Completed after Complete application. The resulting event log, available in
the GitHub repository, contains 20,478 events of 1,682 applications and 3,573
offers.
First, we compare our approach to a traditional technique for performance
analysis based on alignments [5]. To apply the traditional technique, we first
flatten the log using the application and offer as a case notion. Fig. 7(a) shows the
performance analysis results from Inductive Visual Miner in ProM framework4.
As shown in 1
, 1,799 applications repeat activity Send. In reality, as shown in
1, no repetition occurs while the activity is conducted once for each offer except
92 offers skipping it. Furthermore, the average sojourn time for the activity is
computed as around 2 days and 23 hours, whereas, in reality, it is around 15
minutes as shown in 1.
Furthermore, 2
shows that activity Cancel application is repeated 1891
times, but it occurs, in reality, 1,682 times for each application, as depicted
in 2. In addition, the average sojourn time for the activity is measured as
around 12 days and 22 hours, but in fact, it is around 31 days and 22 hours, as
shown in 2.
Next, we analyze the newly-introduced object-centric performance measures,
including synchronization, lagging, and pooling time. As described in (3′), the
average synchronization time of activity Cancel application is around 4 days
and 11 hours.
Moreover, the average lagging time of applications is 3 days and 15 hours
and the lagging time of offers is 19 hours, i.e., applications are lagging more
severely than offers. Furthermore, the pooling time of offers is almost the same
as the synchronization time, indicating that the application is ready to be canceled
almost at the same time as the first offer, and the second offer is ready in around
4 days and 11 hours.
6 Conclusion
In this paper, we proposed an approach to object-centric performance analysis,
supporting the correct computation of existing performance measures and the
derivation of new performance measures that consider the interaction between
objects. To that end, we first replay OCELs on OCPNs to couple events to process
models, producing event occurrences and token visits. Next, we measure object-centric
performance metrics per event occurrence by using the corresponding
token visits of that event occurrence. We have implemented the approach as a
web application and conducted a case study using a real-life loan application
process of a financial institution.
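The coupling step can be pictured with a small sketch. The data structures below (TokenVisit, EventOccurrence, and the place and object names) are invented for illustration and do not mirror our implementation: a replayed event becomes an event occurrence pointing to the token visits it consumes, and per-event measures follow directly from the recorded arrival times.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class TokenVisit:
    object_id: str
    place: str
    arrival: datetime  # when the token became available in the input place

@dataclass
class EventOccurrence:
    activity: str
    timestamp: datetime
    consumed: list  # TokenVisit instances consumed by this event

    def waiting_time(self) -> timedelta:
        # time from the first required token's arrival until the event fires
        return self.timestamp - min(v.arrival for v in self.consumed)

    def synchronization_time(self) -> timedelta:
        # time between the first and the last required token's arrival
        arrivals = [v.arrival for v in self.consumed]
        return max(arrivals) - min(arrivals)

t0 = datetime(2022, 1, 1)
occ = EventOccurrence(
    "Cancel application", t0 + timedelta(days=5),
    [TokenVisit("a1", "p_app", t0),
     TokenVisit("o1", "p_offer", t0 + timedelta(days=4, hours=11))],
)
print(occ.waiting_time())           # 5 days, 0:00:00
print(occ.synchronization_time())   # 4 days, 11:00:00
```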
4 https://www.promtools.org

Fig. 7. (a) Performance analysis results based on the Inductive Visual Miner in
the ProM framework and (b) performance analysis results based on our proposed
approach. We compare markers 1, 2, and 3 in (a) with markers 1, 2, and 3 in (b),
respectively. Marker 4 shows the results for the newly-introduced performance
measures.

The proposed approach has several limitations. First, our approach relies on
the quality of the discovered process model. Discovering process models that
can be easily interpreted and comprehensively reflect reality is a remaining
challenge. Second, non-conforming behavior in event data w.r.t. a process
model can lead to misleading insights. For instance, if Transfer samples is missing
for a sample in an event log although the process model specifies that it always
occurs for samples, the performance measures of Clear sample w.r.t. that sample
will be computed based on the wrong timestamps, i.e., those from Conduct Test.
In the implementation, we use process discovery techniques that guarantee a
perfectly-fitting process model, thereby removing the issue of non-conforming behavior.
As future work, we plan to extend the approach to support reliable performance
analysis of non-conforming event logs. Moreover, we plan to develop an approach
to object-centric performance analysis that works on event data directly,
independently of process models. Another direction of future work is to define
and compute further performance metrics that consider the interaction between objects.
References
1. van der Aalst, W.M.P.: Process Mining - Data Science in Action, Second Edition. Springer (2016). https://doi.org/10.1007/978-3-662-49851-4
2. van der Aalst, W.M.P., Berti, A.: Discovering object-centric Petri nets. Fundam. Informaticae 175(1-4), 1–40 (2020). https://doi.org/10.3233/FI-2020-1946
3. van der Aalst, W.M.P., Li, G., Montali, M.: Object-centric behavioral constraints. CoRR abs/1703.05740 (2017). http://arxiv.org/abs/1703.05740
4. Adams, J.N., van der Aalst, W.M.P.: Precision and fitness in object-centric process mining. In: Ciccio, C.D., Francescomarino, C.D., Soffer, P. (eds.) 3rd International Conference on Process Mining, ICPM 2021, Eindhoven, The Netherlands, October 31 - November 4, 2021. pp. 128–135. IEEE (2021). https://doi.org/10.1109/ICPM53251.2021.9576886
5. Adriansyah, A.: Aligning observed and modeled behavior. Ph.D. thesis, Mathematics and Computer Science (2014). https://doi.org/10.6100/IR770080
6. Adriansyah, A., van Dongen, B., Piessens, D., Wynn, M., Adams, M.: Robust performance analysis on YAWL process models with advanced constructs. Journal of Information Technology Theory and Application 12(3), 5–26 (2011)
7. Bayomie, D., Ciccio, C.D., Rosa, M.L., Mendling, J.: A probabilistic approach to event-case correlation for process mining. In: Laender, A.H.F., Pernici, B., Lim, E., de Oliveira, J.P.M. (eds.) Conceptual Modeling - 38th International Conference, ER 2019, Salvador, Brazil, November 4-7, 2019, Proceedings. Lecture Notes in Computer Science, vol. 11788, pp. 136–152. Springer (2019). https://doi.org/10.1007/978-3-030-33223-5_12
8. Berti, A., van der Aalst, W.M.P.: A novel token-based replay technique to speed up conformance checking and process enhancement. Trans. Petri Nets Other Model. Concurr. 15, 1–26 (2021). https://doi.org/10.1007/978-3-662-63079-2_1
9. Cohn, D., Hull, R.: Business artifacts: A data-centric approach to modeling business operations and processes. IEEE Data Eng. Bull. 32(3), 3–9 (2009). http://sites.computer.org/debull/A09sept/david.pdf
10. Denisov, V., Fahland, D., van der Aalst, W.M.P.: Unbiased, fine-grained description of processes performance from event data. In: Weske, M., Montali, M., Weber, I., vom Brocke, J. (eds.) Business Process Management - 16th International Conference, BPM 2018, Sydney, NSW, Australia, September 9-14, 2018, Proceedings. Lecture Notes in Computer Science, vol. 11080, pp. 139–157. Springer (2018). https://doi.org/10.1007/978-3-319-98648-7_9
11. Esser, S., Fahland, D.: Multi-dimensional event data in graph databases. J. Data Semant. 10(1-2), 109–141 (2021). https://doi.org/10.1007/s13740-021-00122-1
12. Fahland, D.: Describing behavior of processes with many-to-many interactions. In: PETRI NETS 2019. vol. 11522, pp. 3–24. Springer (2019). https://doi.org/10.1007/978-3-030-21571-2_1
13. Ghahfarokhi, A.F., Park, G., Berti, A., van der Aalst, W.M.P.: OCEL standard. http://ocel-standard.org/
14. Hornix, P.T.: Performance analysis of business processes through process mining. Master's thesis, Mathematics and Computer Science (2007). https://doi.org/10.6100/IR770080
15. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Scalable process discovery and conformance checking. Softw. Syst. Model. 17(2), 599–631 (2018). https://doi.org/10.1007/s10270-016-0545-x
16. Leemans, S.J.J., Poppe, E., Wynn, M.T.: Directly follows-based process mining: Exploration & a case study. In: International Conference on Process Mining 2019. pp. 25–32. IEEE (2019). https://doi.org/10.1109/ICPM.2019.00015
17. Maté, A., Trujillo, J., Mylopoulos, J.: Conceptualizing and specifying key performance indicators in business strategy models. In: Atzeni, P., Cheung, D.W., Ram, S. (eds.) Conceptual Modeling - 31st International Conference, ER 2012, Florence, Italy, October 15-18, 2012. Proceedings. Lecture Notes in Computer Science, vol. 7532, pp. 282–291. Springer (2012). https://doi.org/10.1007/978-3-642-34002-4_22
18. Montali, M., Rivkin, A.: DB-nets: On the marriage of colored Petri nets and relational databases. Trans. Petri Nets Other Model. Concurr. 12, 91–118 (2017). https://doi.org/10.1007/978-3-662-55862-1_5
19. Rogge-Solti, A., Weske, M.: Prediction of remaining service execution time using stochastic Petri nets with arbitrary firing delays. In: Basu, S., Pautasso, C., Zhang, L., Fu, X. (eds.) Service-Oriented Computing - 11th International Conference, ICSOC 2013, Berlin, Germany, December 2-5, 2013, Proceedings. Lecture Notes in Computer Science, vol. 8274, pp. 389–403. Springer (2013). https://doi.org/10.1007/978-3-642-45005-1_27