Content uploaded by Nicola Zannone
Author content
All content in this area was uploaded by Nicola Zannone on Jun 25, 2016
Content may be subject to copyright.
Constructing Probable Explanations of Nonconformity:
A Data-aware and History-based Approach
Mahdi Alizadeh
Eindhoven University of Technology
Email: m.alizadeh@tue.nl
Massimiliano de Leoni
Eindhoven University of Technology
Email: m.d.leoni@tue.nl
Nicola Zannone
Eindhoven University of Technology
Email: n.zannone@tue.nl
Abstract—Auditing the execution of business processes is be-
coming a critical issue for organizations. Conformance checking
has been proposed as a viable approach to analyze process
executions with respect to a process model. In particular, align-
ments provide a robust approach to conformance checking in
that they are able to pinpoint the causes of nonconformity.
Alignment-based techniques usually rely on a predefined cost
function that assigns a cost to every possible deviation. Defining
such a cost function, however, is not trivial and is prone to
imperfection that can result in inaccurate diagnostic information.
This paper proposes an alignment-based approach to construct
probable explanations of nonconformity. In particular, we show
how cost functions can be automatically computed based on
historical logging data and taking into account multiple process
perspectives. We implemented our approach as a plug-in of
the ProM framework. Experimental results show that our ap-
proach provides more accurate diagnostics compared to existing
alignment-based techniques.
I. INTRODUCTION
Risk management is a critical part of the business processes
of any organization. Several regulations and guidelines (e.g.,
Sarbanes-Oxley Act, Basel III, COBIT) have been defined
for operational risk management. These regulations require
organizations to implement internal controls and ensure a
transparent audit trail of IT-related activities. In particular,
organizations are required to monitor their internal processes in
order to detect operations that violate the prescribed behavior.
Conformance checking [1] has been proposed to analyze
the events recorded by an IT system against a prescribed
behavior described as a process model. In particular, align-
ments [2] provide a robust approach to conformance checking
by allowing the detection and analysis of nonconformity be-
tween the observed and prescribed behavior. Alignment-based
techniques construct alignments based on a cost function that
assigns a cost to every possible deviation. In particular, they
construct alignments that have the least cost with respect to the
employed cost function. As a consequence, the quality of di-
agnostic information provided by alignment-based techniques
depends on the cost function used to construct the alignments.
Alignment-based techniques usually expect cost functions
to be defined by business analysts based on their background
knowledge and beliefs. In particular, business analysts have to
define the cost of every possible deviation that can occur, and
this cost is often specific to the purpose of the intended analysis
and only based on the activities executed. The definition of
such a cost function is, thus, fully based on human judgment
and prone to imperfections. These imperfections ultimately
lead to alignments that are optimal, according to the provided
cost function, but that may not provide accurate diagnostics.
To illustrate these issues, consider the process model in
Fig. 1. The model describes a process for handling credit
requests at a financial institution. After a client requests a loan
(a), the client's financial data are verified (b). If the result
of the verification is negative, the client is informed about
the result (f or g) and the process terminates. In particular, if
the initial of the client's name is A-L, activity g is performed;
otherwise activity f is performed. If the verification is positive,
either advanced (c) or simple (d) assessment is performed
depending on the amount of the requested loan. If the request is
approved, a credit loan is opened (h) and the client is informed
about the result (f or g). If the assessment is negative, the
client can renegotiate the loan (e). Several data attributes might
be associated with these activities. During credit request, the
client’s name (R) and amount of the requested loan (A) should
be provided. The result of the verification (V) and assessment
(D) are also recorded by the system. After renegotiation of the
request, the amount of the loan is updated.
Suppose an analyst has to verify the compliance of the
process execution represented by the following event trace¹
with the process model in Fig. 1 and determine the causes of
nonconformity if the trace is not compliant.

σ1 = ⟨(b, {V = true}), (h, {}), (g, {})⟩

It is easy to observe that σ1 does not conform to the process
model, i.e. the trace does not correspond to any path allowed
by the model. Activity a was not executed. Moreover, a
credit loan was opened without an assessment of the request.
These deviations can have various explanations. A possible
explanation is that the request was rejected and, thus, h should
not have been performed. Another possible explanation is that
the assessment (c or d) was skipped and the credit loan was
opened correctly.
Which explanation is constructed by alignment-based techniques
depends on the cost assigned to deviations. For instance,
suppose that, for a certain analysis, the analyst assigns a cost to
the suppression of c and d (i.e., these activities are not executed
when they should have been) higher than the cost of inserting h (i.e.,
executing the activity when it should not have been executed).
In this case, the first explanation is returned. Later, the analyst
defines a new cost function for a new type of analysis, this
time assigning a lower cost to the suppression of c or d. In this
case, the second explanation is returned, making the analyst
uncertain about what actually happened.

¹ Notation (act, {attr1 = val1, ..., attrn = valn}) is used to denote
the occurrence of activity act in which data attributes attr1, ..., attrn are
assigned values val1, ..., valn respectively.

Figure 1: A process model for handling credit requests. Green
boxes represent the transitions associated with process activities
while black boxes represent invisible transitions. The text
below the activities represents the label, which is shortened
with a single letter as indicated inside the transitions.
This example reveals that determining what actually happened
should be independent of the purpose of the analysis. We
advocate that the analysis of event traces should be divided
into two phases: first, nonconformities should be identified
along with their root causes based on objective factors; then,
they should be analyzed with respect to the purpose
of the analysis. This work focuses on the identification and
explanation of nonconformity.
In our previous work [3], we proposed an alternative way
to define a cost function for the detection of nonconformity,
where the human judgment is put aside and only objective
factors are considered. The cost function is automatically con-
structed by looking at the logging data and, more specifically,
at the past process executions that are compliant with the
process model. The intuition is that one should look
at the past history of process executions and learn from it how
the process is typically executed. A probable explanation of
nonconformity of a certain process execution can be obtained
by analyzing the behavior observed for such a process execu-
tion in each and every state against the behavior observed for
other conforming traces when they were in the same state.
However, in our previous work, we used a concept of state
that only considers the activities that are executed. Suppose,
in the example above, that the analyst has to determine
whether advanced (c) or simple (d) assessment should have
been performed. Knowledge of the amount of the requested
loan provides an indication of the type of assessment that
should have been executed. In fact, an advanced assessment
is typically performed when a large amount is requested,
whereas requests for a low amount are typically verified using
a simple assessment. Thus, considering the data attributes
manipulated by activities provides additional information about
which activities could have been executed and, thus, can help
determine more accurate explanations of nonconformity.
This paper extends the work in [3] by taking into account
multiple process perspectives to compute a cost function for
the construction of probable explanations of nonconformity.
In particular, we extend the concept of state to incorporate the
data attributes manipulated by activities, improving the accu-
racy of the cost function. It is worth noting that information
about data attributes is often not provided with the process
model. Therefore, our approach constructs alignments using
only the activities executed, whereas data attributes are used
to define a cost function that enables the construction of align-
ments which give probable explanations of nonconformity.
The new technique has been implemented as a plug-in of
the ProM framework. An evaluation of our approach shows
that it provides more accurate explanations of nonconformity
compared to existing alignment-based techniques.
The remainder of the paper is structured as follows. The
next section presents the basic notation and background about
alignments. Section III introduces the notion of state represen-
tation. Section IV presents our approach to compute the cost
of deviations based on logging data and construct probable
alignments. Section V presents the results of our experiments.
Finally, Section VI discusses related work, and Section VII
concludes the paper with directions for future work.
II. BACKGROUND AND NOTATION
Process models describe how activities in the process must
be performed. In this work, we represent process models in the
form of Labeled Petri nets. Our approach is extendable to any
modeling language for which a translation to Labeled Petri [?]
nets is available.
Definition 1 (Labeled Petri Net). A Labeled Petri net is a tuple
(P, T, F, A, ℓ, m_i, m_f) where
• P is a set of places;
• T is a set of transitions;
• F ⊆ (P × T) ∪ (T × P) is the flow relation between places
and transitions (and between transitions and places);
• A is the set of labels for transitions;
• ℓ : T → A is a function that associates a label with
every transition in T;
• m_i is the initial marking;
• m_f is the final marking.
Hereafter, for simplicity, the term Petri net is used to refer
to Labeled Petri Net. In a Petri net, transitions represent events
and places represent states. Multiple transitions can have the
same label. Such transitions are called duplicate transitions.
We distinguish two types of transitions, namely invisible and
visible transitions. Visible transitions are labeled with activity
names. Invisible transitions are used for routing purposes or
related to activities that are not observed by the IT system.
Given a Petri net N, the set of activity labels associated with
invisible transitions is denoted with Inv_N ⊆ A.
A marking is a multiset of tokens and represents a state of
the Petri net. Tokens reside in places. A transition is enabled if
at least one token exists in each input place of the transition.
By executing (i.e., firing) a transition, a token is consumed
from each of its input places and a token is produced for each of
its output places. A complete firing sequence is a sequence of
transitions from the initial marking to the final marking. The
set of all complete firing sequences of a Petri net N is denoted
by Ψ_N. Hereafter, we call process trace a firing sequence in
which transitions are mapped to their activity label.
Definition 2 (Event, Event Trace, Event Log). Let N =
(P, T, F, A, ℓ, m_i, m_f) be a labeled Petri net and Inv_N ⊆ A
the set of invisible transitions in N. Let V be a set of data
attributes. Let us denote the domain of a data attribute v ∈ V
with U(v) and the union of all attribute domains with U,
i.e. U = ⋃_{v∈V} U(v). An event e = (a, ϕ_e) consists of an
executed activity a ∈ A \ Inv_N and a partial function ϕ_e that
assigns values to attributes: ϕ_e ∈ V ↛ U s.t. ∀v ∈ dom(ϕ_e),
ϕ_e(v) ∈ U(v).² The set of events is denoted by E. An event
trace σ ∈ E* is a sequence of events. An event log L ∈ B(E*)
is a multiset of event traces.³
Given an event e = (a, ϕ_e), we use act(e) to denote
the activity label associated with e, i.e. act(e) = a. This
notation extends to event traces. Given an event trace σ =
⟨e_1, ..., e_n⟩ ∈ E*, act(σ) denotes the sequence of activities
obtained from the projection of the events in σ to their activity
label, i.e. act(σ) = ⟨act(e_1), ..., act(e_n)⟩.
Not all event traces in an event log can be reproduced by a
Petri net, i.e. not all event traces may correspond to a process
trace. If an event trace perfectly fits the net, each “move” in
the event trace (i.e., an event observed in the trace) can be
mimicked by a “move” in the model (i.e., an activity in the net).
After all events in the event trace are mimicked, the net reaches
its final marking. In cases where deviations occur, some moves
in the event trace cannot be mimicked by the net or vice versa.
We explicitly denote “no move” by ≫.
Definition 3 (Legal move). Let N = (P, T, F, A, ℓ, m_i, m_f)
be a Petri net, S_L = E ∪ {≫} and S_M = A ∪ {≫}. A legal
move is a pair (m_L, m_M) ∈ (S_L × S_M) \ {(≫, ≫)} such that
• (m_L, m_M) is a synchronous move if m_L ∈ E, m_M ∈ A
and act(m_L) = m_M,
• (m_L, m_M) is a move on log if m_L ∈ E and m_M = ≫,
• (m_L, m_M) is a move on model if m_L = ≫ and m_M ∈ A.
Σ_N denotes the set of legal moves for a Petri net N.
A sequence σ′ is a prefix of sequence σ″ if there exists a
sequence σ‴ such that σ″ = σ′ ⊕ σ‴, where ⊕ denotes the
concatenation operator. Hereafter, σ′ ∈ prefix(σ″) is used to
denote such a relation.
Definition 4 (Alignment). Let Σ_N be the set of legal moves for
a Petri net N = (P, T, F, A, ℓ, m_i, m_f). An alignment of an
event trace σ and N is a sequence γ ∈ Σ*_N such that, ignoring
all occurrences of ≫, the projection on the first element yields
σ and the projection on the second element yields a process
trace ⟨a_1, ..., a_n⟩ such that there exists a firing sequence ψ′ =
⟨t_1, ..., t_n⟩ ∈ prefix(ψ) for some ψ ∈ Ψ_N where, for every
1 ≤ i ≤ n, ℓ(t_i) = a_i. If ψ′ ∈ Ψ_N, γ is called a complete
alignment of σ and N.
Fig. 2 shows four alignments of event trace σ1 and the
net in Fig. 1. The first and second columns in each alignment,
ignoring ≫, show the event trace and the process trace to
which the event trace is aligned, respectively.

² We denote the domain of a function f with dom(f).
³ B(X) represents the set of all multisets over X.
(a)
Event Trace | Process Trace
≫ | a
(b, {V = true}) | b
≫ | c
≫ | Inv2
(h, {}) | h
(g, {}) | g
≫ | Inv5

(b)
Event Trace | Process Trace
≫ | a
(b, {V = true}) | b
≫ | d
≫ | Inv2
(h, {}) | h
(g, {}) | g
≫ | Inv5

(c)
Event Trace | Process Trace
≫ | a
(b, {V = true}) | b
≫ | Inv1
(h, {}) | ≫
(g, {}) | ≫
≫ | f
≫ | Inv5

(d)
Event Trace | Process Trace
≫ | a
(b, {V = true}) | b
≫ | Inv1
(h, {}) | ≫
(g, {}) | g
≫ | Inv5

Figure 2: Examples of complete alignments of the net in Fig. 1
and σ1 = ⟨(b, {V = true}), (h, {}), (g, {})⟩
As shown in Fig. 2, there can be several (possibly infinitely many)
explanations of nonconformity. The quality of an alignment is
measured based on a cost function. A cost function defines the
cost of every move that can occur in an alignment. The cost
of an alignment is the sum of the cost of the moves in the
alignment. An optimal alignment is an alignment that has the
least cost according to the cost function.
As an example, consider a cost function that penalizes
all moves on model for visible transitions and moves on log
equally. This cost function does not penalize synchronous
moves and moves on model for invisible transitions. If moves
on model for invisible transitions are ignored, the alignments
in Fig. 2a and Fig. 2b have two moves on model, the alignment
in Fig. 2c has two moves on log and two moves on model, and
the alignment in Fig. 2d has one move on log and one move
on model. Thus, according to our example cost function, the
alignments in Fig. 2a, Fig. 2b, and Fig. 2d are three optimal
alignments of σ1and the process model in Fig. 1.
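
The arithmetic behind this example can be checked with a short Python sketch. The encoding is an assumption made for illustration: each alignment of Fig. 2 is a list of (log move, model move) pairs, `">>"` stands for "no move", log moves are reduced to their activity label (data attributes are omitted), and `standard_cost` is a hypothetical name for the example cost function described above.

```python
NO_MOVE = ">>"                       # stands for the "no move" symbol
INVISIBLE = {"Inv1", "Inv2", "Inv5"}  # invisible transitions used in Fig. 2

def standard_cost(alignment):
    """Cost 1 for moves on log and moves on model for visible transitions,
    cost 0 for synchronous moves and moves on model for invisible ones."""
    cost = 0
    for log_move, model_move in alignment:
        if log_move != NO_MOVE and model_move != NO_MOVE:
            continue  # synchronous move (legal moves guarantee equal labels)
        if log_move == NO_MOVE and model_move in INVISIBLE:
            continue  # move on model for an invisible transition
        cost += 1
    return cost

FIG2 = {
    "a": [(">>", "a"), ("b", "b"), (">>", "c"), (">>", "Inv2"),
          ("h", "h"), ("g", "g"), (">>", "Inv5")],
    "b": [(">>", "a"), ("b", "b"), (">>", "d"), (">>", "Inv2"),
          ("h", "h"), ("g", "g"), (">>", "Inv5")],
    "c": [(">>", "a"), ("b", "b"), (">>", "Inv1"), ("h", ">>"),
          ("g", ">>"), (">>", "f"), (">>", "Inv5")],
    "d": [(">>", "a"), ("b", "b"), (">>", "Inv1"), ("h", ">>"),
          ("g", "g"), (">>", "Inv5")],
}

costs = {name: standard_cost(moves) for name, moves in FIG2.items()}
print(costs)  # {'a': 2, 'b': 2, 'c': 4, 'd': 2}
```

The three alignments with cost 2 are tied, which is exactly why such a cost function cannot tell which explanation is the most probable one.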
III. STATE REPRESENTATION
The goal of this work is to construct alignments that
give probable explanations of nonconformity based on the
information in historical logging data. To this end, we aim
at the definition of a cost function that accounts for the
probability of executing (or never executing) a certain activity
in the current state of the process execution based on the
past process executions. In this section we define the state
representation used to define such a cost function. In the next
section, we define the cost function and present our approach
for constructing probable alignments.
The state of a process represents its execution at a given
time, i.e. the performed activities along with the value of data
attributes. A state is formally defined as follows:
Definition 5 (State). Let A be the set of activities, V the set of
attributes and U the attributes' domain. A state s for an event
trace σ is a pair (act(σ), ϕ_s) where ϕ_s denotes a function that
associates a value to each attribute, i.e. ϕ_s : V → U ∪ {⊥}
such that for all v ∈ V, ϕ_s(v) ∈ U(v) ∪ {⊥} (where ⊥ indicates
undefined). The initial state is denoted s_I = (⟨⟩, ϕ_I) where ϕ_I
is the initial assignment of values to attributes.
Trace | #
(a,{R='bob',A=1000}), (b,{V=false}), (g,{}) | 200
(a,{R='bob',A=1000}), (b,{V=true}), (d,{D=true}), (g,{}), (h,{}) | 600
(a,{R='bob',A=4100}), (b,{V=true}), (d,{D=false}), (g,{}) | 100
(a,{R='bob',A=4200}), (b,{V=true}), (d,{D=false}), (g,{}) | 100
(a,{R='bob',A=4500}), (b,{V=true}), (d,{D=false}), (g,{}) | 100
(a,{R='tim',A=5500}), (b,{V=false}), (f,{}) | 500
(a,{R='tim',A=5500}), (b,{V=true}), (c,{D=true}), (f,{}), (h,{}) | 200
(a,{R='tim',A=5500}), (b,{V=true}), (c,{D=false}), (f,{}) | 50
(a,{R='tim',A=5500}), (b,{V=true}), (c,{D=false}), (e,{A=1000}), (b,{V=false}), (f,{}) | 150
(a) Event traces

ID | Executed Activities | R | A | V | D | Next Activity | #
s1 | ⟨⟩ | ⊥ | ⊥ | ⊥ | ⊥ | a | 2000
s2 | ⟨a⟩ | bob | 1000 | ⊥ | ⊥ | b | 800
s3 | ⟨a⟩ | bob | 4100 | ⊥ | ⊥ | b | 100
s4 | ⟨a⟩ | bob | 4200 | ⊥ | ⊥ | b | 100
s5 | ⟨a⟩ | bob | 4500 | ⊥ | ⊥ | b | 100
s6 | ⟨a⟩ | tim | 5500 | ⊥ | ⊥ | b | 900
s7 | ⟨a, b⟩ | bob | 1000 | false | ⊥ | g | 200
s8 | ⟨a, b⟩ | bob | 1000 | true | ⊥ | d | 600
s9 | ⟨a, b⟩ | bob | 4100 | true | ⊥ | d | 100
s10 | ⟨a, b⟩ | bob | 4200 | true | ⊥ | d | 100
s11 | ⟨a, b⟩ | bob | 4500 | true | ⊥ | d | 100
s12 | ⟨a, b⟩ | tim | 5500 | true | ⊥ | c | 400
s13 | ⟨a, b⟩ | tim | 5500 | false | ⊥ | f | 500
s14 | ⟨a, b, g⟩ | bob | 1000 | false | ⊥ | - | 200
s15 | ⟨a, b, d⟩ | bob | 1000 | true | true | g | 600
... | ... | ... | ... | ... | ... | ... | ...
(b) State representation (columns R, A, V, D are the data attributes of the state)

Figure 3: Sample event traces and their state representation
The initial attribute assignment ϕ_I represents the value of
the attributes in V before the process is executed. Moreover,
ϕ_I comprises the value of case attributes, i.e. attributes that
characterize the event trace as a whole. In some cases, a value
may not be (initially) defined for some attributes. We use
symbol “⊥” to denote the undefined value.
The execution of an event trace changes the state of the
process. We first define a state transition, which is a change
of one state to another state due to the effect of an event, and
then we extend this definition to event traces.
Definition 6 (State Transition). Let V be the set of attributes.
Given an event e = (a, ϕ_e) and the state s = (act(σ), ϕ_s) for
an event trace σ, e transforms s into a state s′ = (act(σ′), ϕ_s′)
such that σ′ = σ ⊕ a and for every v ∈ V

\[
\varphi_{s'}(v) =
\begin{cases}
\varphi_e(v) & \text{if } v \in dom(\varphi_e) \\
\varphi_s(v) & \text{otherwise}
\end{cases}
\tag{1}
\]

We denote by $s \xrightarrow{e} s'$ the state transition given by e.
Intuitively, (1) states that an event can update some data
attributes, leaving the other attributes in ϕ_s unchanged.
Definition 7 (Trace Execution). Given an event trace σ =
⟨e_1, ..., e_n⟩ ∈ E*, σ transforms the initial state s_I into a state
s if there exist states s_0, s_1, ..., s_n such that

\[
s_I = s_0 \xrightarrow{e_1} s_1 \xrightarrow{e_2} \dots \xrightarrow{e_n} s_n = s
\]

We denote by state(σ) the state yielded by an event trace σ.
Alignments are constructed based on the probability of
performing (or never performing) a certain activity when the
process is in a given state. To this end, we determine the states
that are reached by executing the (prefix) event traces in the
historical logging data (i.e., past process executions) together
with their occurrence.
Example 1. Fig. 3a shows a sample of event traces in the
historical logging data for the net in Fig. 1 along with their
occurrence. The states yielded by the prefixes of these traces,
their occurrence and the activity executed after reaching that
state are shown in Fig. 3b. Data attributes are initialized with
the undefined value (⊥). Consider the (prefix) event trace σ2 =
⟨(a, {R = 'bob', A = 1000}), (b, {V = true})⟩. This trace
yields a state s_σ2 = (⟨a, b⟩, ϕ) s.t. ϕ(R) = 'bob', ϕ(A) =
1000, ϕ(V) = true, ϕ(D) = ⊥. Based on the traces in Fig. 3a,
s_σ2 was reached 600 times.
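
Definitions 5 to 7 can be illustrated with a minimal Python sketch. The encoding is an assumption made for illustration: a state is a pair (tuple of executed activities, attribute dictionary), `None` stands for ⊥, an event is a pair (activity, dictionary of updated attributes), and the function names are hypothetical.

```python
UNDEFINED = None                     # stands for the undefined value ⊥
ATTRIBUTES = ("R", "A", "V", "D")     # data attributes of the running example

def initial_state():
    """Initial state: no executed activity, every attribute undefined."""
    return ((), {v: UNDEFINED for v in ATTRIBUTES})

def transition(state, event):
    """State transition: append the activity and overwrite only the
    attributes the event assigns; all others keep their previous value."""
    activities, phi = state
    activity, phi_e = event
    new_phi = dict(phi)
    new_phi.update(phi_e)
    return (activities + (activity,), new_phi)

def state_of(trace):
    """State yielded by an event trace, starting from the initial state."""
    s = initial_state()
    for event in trace:
        s = transition(s, event)
    return s

sigma2 = [("a", {"R": "bob", "A": 1000}), ("b", {"V": True})]
print(state_of(sigma2))
# (('a', 'b'), {'R': 'bob', 'A': 1000, 'V': True, 'D': None})
```

The printed state corresponds to the state reached by σ2 in Example 1, with D still undefined.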
We have now the machinery to analyze historical logging
data. However, to compute probable explanations of noncon-
formity we also have to determine the state in which the
process is supposed to be after the execution of a given
sequence of events. Essentially, we have to determine the
state yielded by an alignment. The projection of an alignment
over the net gives us only the sequence of activities that
were performed. We also need to determine the value of data
attributes in the state yielded by the alignment.
However, this information might be missing. For instance,
moves on model, which represent activities that are assumed
to be executed, do not provide any information about the data
attributes that are expected to be updated by the activity. The
problem of missing attributes should be taken into account
when the state of the process is determined.
Example 2. Consider state s = (⟨a, b, d, e⟩, ϕ) s.t. ϕ(V) =
true (all other data attributes are equal to ⊥) and the move
on model (≫, b). If the problem of missing data attributes is
not considered, we simply obtain state s′ = (⟨a, b, d, e, b⟩, ϕ′)
with ϕ′ = ϕ, i.e. the value of V is not updated. However,
by executing activity b the value of attribute V, indicating the
result of the verification, might be updated to a new value and,
thus, the process could be in a state in which V could be either
true or false.
As shown in the example above, missing data attributes
introduce uncertainty about the reached state: different states
could have been reached. To capture this, we introduce the
notion of state subsumption. In particular, state subsumption
is used to determine the possible states of the process that
could have been yielded by a (non-fitting) event trace.
Definition 8 (State Subsumption). Given two states s =
(r_s, ϕ_s) and s′ = (r_s′, ϕ_s′), we say that s subsumes s′, denoted
s ⊒ s′, if and only if (i) r_s = r_s′ and (ii) for all v ∈ V s.t.
ϕ_s(v) ≠ ⊥, ϕ_s′(v) = ϕ_s(v).
Let us provide an explanatory example:
Example 3. Consider a state s = (⟨a, b⟩, ϕ) s.t. ϕ(R) = ⊥,
ϕ(A) = ⊥, ϕ(V) = true, ϕ(D) = ⊥. Given the states in
Fig. 3b, state s subsumes states s8, ..., s12. Thus, we can
conclude that there are 1300 event traces in Fig. 3a whose
prefix yields a state subsumed by s.
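
The count in Example 3 can be reproduced with a small Python sketch of state subsumption. As an illustrative assumption, states are pairs (tuple of activities, attribute dictionary) with `None` standing for ⊥, the helper `st` is hypothetical, and only the states s7 to s13 of Fig. 3b (those reached after ⟨a, b⟩) are encoded.

```python
UNDEFINED = None  # stands for the undefined value ⊥

def subsumes(s, other):
    """s subsumes other: same executed activities, and every attribute
    that is defined in s has the same value in other."""
    (acts, phi), (acts2, phi2) = s, other
    return acts == acts2 and all(
        phi2.get(v, UNDEFINED) == value
        for v, value in phi.items()
        if value is not UNDEFINED
    )

def st(name, amount, verified):
    """Hypothetical helper: a state reached after <a, b>."""
    return (("a", "b"), {"R": name, "A": amount, "V": verified, "D": UNDEFINED})

# States s7..s13 of Fig. 3b with the number of traces reaching them.
STATES = {
    "s7": (st("bob", 1000, False), 200),
    "s8": (st("bob", 1000, True), 600),
    "s9": (st("bob", 4100, True), 100),
    "s10": (st("bob", 4200, True), 100),
    "s11": (st("bob", 4500, True), 100),
    "s12": (st("tim", 5500, True), 400),
    "s13": (st("tim", 5500, False), 500),
}

s = (("a", "b"), {"R": UNDEFINED, "A": UNDEFINED, "V": True, "D": UNDEFINED})
subsumed = [name for name, (state, _) in STATES.items() if subsumes(s, state)]
total = sum(n for state, n in STATES.values() if subsumes(s, state))
print(subsumed, total)  # ['s8', 's9', 's10', 's11', 's12'] 1300
```

Leaving R and A undefined in s is what lets a single state stand for all five concrete states in which the verification was positive.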
Now we show how a trace can be extracted from an
alignment. Hereafter, we use notation ϕ̄_a to indicate the set
of attributes that are typically updated by an activity a. Such
a set can be defined by a statistical analysis of the events in
historical logging data, for instance by setting a threshold on
the percentage of occurrences in which an attribute is updated
by an activity.
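Definition 9 (From Alignments to Event Traces). Let Σ_N be
the set of legal moves for a Petri net N, Inv_N the set of
invisible transitions in N, E the set of events and V the set of
attributes. We define a function α : Σ*_N → E* that transforms
an alignment into an event trace as follows:

\[
\alpha(\langle\rangle) = \langle\rangle, \qquad
\alpha(\gamma \oplus (m_L, m_M)) =
\begin{cases}
\alpha(\gamma) & m_M \in Inv_N \cup \{\gg\} \\
\alpha(\gamma) \oplus (m_M, \varphi_{(m_L, m_M)}) & \text{otherwise}
\end{cases}
\tag{2}
\]

where the attribute assignment ϕ_(m_L, m_M) in the event obtained
from move (m_L, m_M) is defined for every v ∈ V as follows:

\[
\varphi_{(m_L, m_M)}(v) =
\begin{cases}
\varphi_{m_L}(v) & act(m_L) = m_M \text{ and } v \in dom(\varphi_{m_L}) \\
\bot & act(m_L) = m_M \text{ and } v \notin dom(\varphi_{m_L}) \text{ and } v \in \bar{\varphi}_{m_M} \\
\bot & m_L = \gg \text{ and } v \in \bar{\varphi}_{m_M}
\end{cases}
\tag{3}
\]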
Intuitively, (2) projects an alignment over the process
model and (3) determines the data attributes updated by the
event reconstructed from a move. In particular, (3) assigns the
undefined value (⊥) to missing data attributes.
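
A minimal Python sketch of this transformation is given below. The encoding is an assumption made for illustration: moves are (log move, model move) pairs, `">>"` stands for "no move", a log move is an event (activity, attribute dictionary), `None` stands for ⊥, and `typically_updated` is a hypothetical table playing the role of the sets of attributes typically updated by each activity.

```python
NO_MOVE = ">>"  # stands for the "no move" symbol

def to_event_trace(alignment, invisible, typically_updated):
    """Project an alignment over the model, rebuilding one event per
    visible model move; expected but unobserved attributes become None."""
    trace = []
    for log_move, model_move in alignment:
        if model_move == NO_MOVE or model_move in invisible:
            continue  # moves on log and invisible transitions yield no event
        observed = log_move[1] if log_move != NO_MOVE else {}
        # Attributes the activity typically updates default to undefined ...
        phi = {v: None for v in typically_updated.get(model_move, ())}
        # ... and are overwritten by the values actually observed in the log.
        phi.update(observed)
        trace.append((model_move, phi))
    return trace

# Hypothetical sets of typically updated attributes for the running example.
typically_updated = {"a": ("R", "A"), "b": ("V",), "d": ("D",)}
gamma = [
    (NO_MOVE, "a"),                 # move on model
    (("b", {"V": True}), "b"),      # synchronous move
    (NO_MOVE, "d"),                 # move on model
    (NO_MOVE, "Inv2"),              # move on model, invisible transition
    (("h", {}), NO_MOVE),           # move on log
]
result = to_event_trace(gamma, {"Inv1", "Inv2", "Inv5"}, typically_updated)
print(result)
# [('a', {'R': None, 'A': None}), ('b', {'V': True}), ('d', {'D': None})]
```

Feeding the reconstructed trace to a state computation then yields a state in which R, A and D are undefined, which is where state subsumption comes into play.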
IV. CONSTRUCTION OF PROBABLE ALIGNMENTS
This section presents our approach to construct probable
alignments, namely alignments between a process model and
a log trace that give likely explanations of deviations based
on objective facts, i.e. the historical logging data. To construct
these alignments, we use the A-star algorithm [5]. Section IV-A
discusses the basic foundation of the A-star algorithm and
how it can be employed to build these alignments, whereas
Section IV-B focuses on how to automatically define a cost
function that enables their construction.
A. Usage of A-star for constructing probable alignments
The A-star algorithm [5] aims to find a path in a graph Q
from a given source node q_0 to any node q ∈ Q in a target set.
Every node q of graph Q is associated with a cost determined
by an evaluation function f(q) = g(q) + h(q), where
• g : Q → ℝ⁺₀ is a function that returns the smallest path
cost from q_0 to q;
• h : Q → ℝ⁺₀ is a heuristic function that estimates the
path cost from q to its preferred target node.
A-star is guaranteed to find a path with the overall lowest cost
if h is admissible, i.e. if it returns a value that underestimates
the distance of a path from a node q′ to its preferred target
node q″: g(q′) + h(q′) ≤ g(q″).
The A-star algorithm keeps a priority queue of nodes to
be visited: higher priority is given to nodes with lower costs.
At each step, the node with the lowest cost is chosen from the
priority queue and is expanded with all of its successors in the
graph. The search continues until a target node is reached.
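
The search loop described above can be sketched generically in Python. This is not the ProM implementation: the function signature, the toy graph and the zero heuristic (which is trivially admissible) are assumptions made for illustration.

```python
import heapq
from itertools import count

def a_star(source, is_target, successors, h):
    """Return (cost, path) of a cheapest path from source to a target node.
    successors(node) yields (next_node, step_cost); h must be admissible."""
    tie = count()  # tie-breaker so the heap never has to compare nodes
    queue = [(h(source), next(tie), 0, source, [source])]
    best_g = {source: 0}
    while queue:
        _, _, g, node, path = heapq.heappop(queue)
        if is_target(node):
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale queue entry: a cheaper path was found meanwhile
        for nxt, step_cost in successors(node):
            g_next = g + step_cost
            if g_next < best_g.get(nxt, float("inf")):
                best_g[nxt] = g_next
                heapq.heappush(
                    queue, (g_next + h(nxt), next(tie), g_next, nxt, path + [nxt])
                )
    return None  # no target node is reachable

# Hypothetical toy graph; q3 is the only target node.
GRAPH = {
    "q0": [("q1", 1), ("q2", 4)],
    "q1": [("q2", 1), ("q3", 5)],
    "q2": [("q3", 1)],
    "q3": [],
}
result = a_star("q0", lambda q: q == "q3", lambda q: GRAPH[q], lambda q: 0)
print(result)  # (3, ['q0', 'q1', 'q2', 'q3'])
```

For alignments, nodes would be alignment prefixes, successors would append one legal move, and the step cost would be the cost of that move.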
We employ A-star to find the optimal alignments between
an event trace σ ∈ L and a Petri net N. To apply A-star,
an appropriate search space needs to be defined. We associate
every node q ∈ Q with an alignment that is a prefix of some
complete alignment of σ and N. The source node is the empty
alignment q_0 = ⟨⟩, and the set of target nodes includes every
complete alignment of σ and N.
Given a node/alignment γ ∈ Q, the search-space successors
of γ include all alignments γ′ ∈ Q obtained from γ by
concatenating exactly one move. Given an alignment γ ∈ Q,
the cost of a path from the initial node to node γ is:
g(γ) = ‖α(γ)‖ + K(γ)
where ‖σ‖ denotes the length of a sequence σ and K(γ) is
the cost of alignment γ. Section IV-B discusses how the cost
function is constructed.
It is easy to check that, given two complete alignments
γ′_C and γ″_C, K(γ′_C) < K(γ″_C) iff g(γ′_C) < g(γ″_C) and
K(γ′_C) = K(γ″_C) iff g(γ′_C) = g(γ″_C). Therefore, an optimal
solution returned by A-star coincides with an optimal alignment.
Various heuristic functions can be adopted. For lack
of space, we refer to [3] for an example of a heuristic function.
B. Probability-based Cost Function
The cost of an alignment is equal to the sum of the costs of
its constituent moves. A move on model (≫, m_M) indicates
that activity m_M was expected to be executed but was not. To
account for this behavior, we compute the cost based on the
probability of executing m_M as the next activity. On the other
hand, a move on log (m_L, ≫) indicates that event m_L was
not supposed to occur and that its occurrence does not conform to
the process specification. Thus, we compute the cost of moves
on log based on the probability that the activity will never
eventually occur in the subsequent part of the trace.
These probabilities can be computed by analyzing the past
process executions recorded in an event log. Note that we only
consider the traces in an event log L that fit the process model
with respect to the control-flow, denoted L_fit. This is because
using non-fitting traces might lead to the construction of a
cost function that is biased by behaviors that should not be
permitted. Moreover, error correction for non-fitting traces is
not trivial and can result in overfitting of the training data
[6]. In practice, using only fitting traces as historical logging
data is not an unrealistic assumption as, in many application
domains, there is typically a sufficient number of fitting traces.
It is worth noting that we do not impose any constraint on the
historical logging data with respect to data attributes. In fact, as
we assume that information on data attributes is not provided
in the process model, this information cannot be used to reason
about the conformance of event traces.
Based on the observations above, we compute the con-
ditional probabilities that activities occur or never eventually
occur when being in a certain state:
Definition 10. Let L be an event log and L_fit ⊆ L the subset
of traces that fit with a Petri net N. The probability that an
activity a occurs after reaching state s with respect to L_fit,
denoted by P_{L_fit}(a | s), is the ratio between the number of
event traces in L_fit in which a is executed after reaching a
state subsumed by s and the total number of event traces in
L_fit that traverse a state subsumed by s:

\[
P_{\mathcal{L}_{fit}}(a \mid s) =
\frac{\left|\{\sigma \in \mathcal{L}_{fit} : \exists \sigma' \in prefix(\sigma).\; s \sqsupseteq state(\sigma') \wedge act(\sigma') \oplus \langle a \rangle \in prefix(act(\sigma))\}\right|}
{\left|\{\sigma \in \mathcal{L}_{fit} : \exists \sigma' \in prefix(\sigma).\; s \sqsupseteq state(\sigma')\}\right|}
\tag{4}
\]
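Definition 11. Let L be an event log and L_fit ⊆ L the
subset of traces that fit with a Petri net N. The probability
that an activity a will never eventually occur in a process
execution after reaching state s with respect to L_fit, denoted
by P_{L_fit}(ā | s), is the ratio between the number of event traces
in L_fit in which a is never eventually executed after reaching
a state subsumed by s and the total number of event traces in
L_fit that traverse a state subsumed by s:

\[
P_{\mathcal{L}_{fit}}(\overline{a} \mid s) =
\frac{\left|\{\sigma \in \mathcal{L}_{fit} : \exists \sigma' \in prefix(\sigma).\; s \sqsupseteq state(\sigma') \wedge \forall \sigma'' \in \mathcal{E}^{*}\; act(\sigma' \oplus \sigma'') \oplus \langle a \rangle \notin prefix(act(\sigma))\}\right|}
{\left|\{\sigma \in \mathcal{L}_{fit} : \exists \sigma' \in prefix(\sigma).\; s \sqsupseteq state(\sigma')\}\right|}
\tag{5}
\]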
To clarify, consider the following example:
Example 4. Consider state s = (⟨a, b⟩, ϕ) s.t. ϕ(R) = ⊥,
ϕ(A) = ⊥, ϕ(V) = true, ϕ(D) = ⊥ and the history in Fig. 3a.
As shown in Example 3, there are 1300 traces in the history
whose prefix yields a state subsumed by s. In 400 of these
traces, activity c is executed after reaching a state subsumed
by s. Thus, the probability that c occurs after s is 400/1300 ≈ 0.31.
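
The two probabilities can be computed directly from the history of Fig. 3a, as in the following Python sketch. The encoding is an assumption made for illustration: the log is a list of (trace, occurrences) pairs, an event is (activity, attribute dictionary), attributes absent from a dictionary are undefined, and all function names are hypothetical.

```python
def ev(activity, **attrs):
    return (activity, attrs)

def tr(name, amount, *rest):
    """Hypothetical helper: a trace starting with the credit request a."""
    return [ev("a", R=name, A=amount), *rest]

# Event traces of Fig. 3a with their occurrences.
LOG = [
    (tr("bob", 1000, ev("b", V=False), ev("g")), 200),
    (tr("bob", 1000, ev("b", V=True), ev("d", D=True), ev("g"), ev("h")), 600),
    (tr("bob", 4100, ev("b", V=True), ev("d", D=False), ev("g")), 100),
    (tr("bob", 4200, ev("b", V=True), ev("d", D=False), ev("g")), 100),
    (tr("bob", 4500, ev("b", V=True), ev("d", D=False), ev("g")), 100),
    (tr("tim", 5500, ev("b", V=False), ev("f")), 500),
    (tr("tim", 5500, ev("b", V=True), ev("c", D=True), ev("f"), ev("h")), 200),
    (tr("tim", 5500, ev("b", V=True), ev("c", D=False), ev("f")), 50),
    (tr("tim", 5500, ev("b", V=True), ev("c", D=False), ev("e", A=1000),
        ev("b", V=False), ev("f")), 150),
]

def state_of(prefix):
    acts, phi = [], {}
    for activity, attrs in prefix:
        acts.append(activity)
        phi.update(attrs)
    return tuple(acts), phi

def subsumes(s, other):
    (acts, phi), (acts2, phi2) = s, other
    return acts == acts2 and all(
        phi2.get(v) == x for v, x in phi.items() if x is not None
    )

def positions(s, trace):
    """Prefix lengths whose state is subsumed by s."""
    return [i for i in range(len(trace) + 1) if subsumes(s, state_of(trace[:i]))]

def probability(s, log, holds):
    """Weighted share of traces traversing s for which holds(trace, i) is true
    at some matching prefix length i."""
    total = hits = 0
    for trace, n in log:
        pos = positions(s, trace)
        if pos:
            total += n
            hits += n if any(holds(trace, i) for i in pos) else 0
    return hits / total if total else 0.0

def prob_next(a, s, log):   # activity a is executed right after the state
    return probability(s, log, lambda t, i: i < len(t) and t[i][0] == a)

def prob_never(a, s, log):  # activity a never occurs after the state
    return probability(s, log, lambda t, i: all(x != a for x, _ in t[i:]))

s = (("a", "b"), {"V": True})
print(round(prob_next("c", s, LOG), 2))  # 0.31
```

With the same state, `prob_next("d", s, LOG)` evaluates to 900/1300 and `prob_never("h", s, LOG)` to 500/1300.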
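The cost of an alignment move is defined as follows:
Definition 12 (Move Cost). Let Σ_N be the set of legal moves
for a Petri net N, L an event log and L_fit ⊆ L the subset of
traces that fit with N. The cost of a legal move (m_L, m_M) ∈
Σ_N with respect to a state s and L_fit is

\[
\kappa((m_L, m_M), s) =
\begin{cases}
0 & act(m_L) = m_M \\
0 & m_L = \gg \text{ and } m_M \in Inv_N \\
1 + \log \dfrac{1}{P_{\mathcal{L}_{fit}}(m_M \mid s)} & m_L = \gg \text{ and } m_M \notin Inv_N \\
1 + \log \dfrac{1}{P_{\mathcal{L}_{fit}}(\overline{act(m_L)} \mid s)} & m_M = \gg
\end{cases}
\tag{6}
\]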
The choice of the formulation 1 + log(1/p) is motivated
in [3] where, through experiments, we show that this
formulation provides more accurate results compared with
other formulations.
The cost of an alignment is the sum of the cost of all moves
in the alignment:
Definition 13 (Alignment Cost). Let Σ_N be the set of legal
moves for a Petri net N. The cost of an alignment γ ∈ Σ*_N is

\[
K(\gamma) =
\begin{cases}
0 & \gamma = \langle\rangle \\
\kappa((m_L, m_M), s_{\gamma'}) + K(\gamma') & \gamma = \gamma' \oplus (m_L, m_M)
\end{cases}
\tag{7}
\]

where s_γ′ is the state yielded by γ′: s_γ′ = state(α(γ′)).
A probable alignment is an alignment with the least
cost according to the cost function given in Definition 13.
Below we present an example of how probable alignments
are constructed.
Example 5. Suppose that an analyst wants to construct
the probable alignments of event trace
$\sigma_3 = \langle (b, \{V = true\}), (h, \{\}), (g, \{\}) \rangle$
and the net in Fig. 1, where the traces
in Fig. 3a are used as historical logging data, i.e. $L_{fit}$.
Assume that the algorithm constructed the partial alignment
in Fig. 4a after analyzing the prefix trace $\langle (b, \{V = true\}) \rangle$.
The corresponding state is presented in Fig. 4b. The alignment
can be extended by move on log $(h, \gg)$ or by moves on model
$(\gg, c)$ or $(\gg, d)$. The cost of these moves is shown in Fig. 4c.
In the current state, move $(\gg, d)$ has the least cost. As there is
no alignment with lower cost than $\gamma \oplus (\gg, d)$, this alignment
is selected for the next iteration.

(a) Alignment $\gamma$
    Event trace          Process
    $\gg$                a
    (b, {V = true})      b

(b) State of alignment $\gamma$
    $s_\gamma = (\langle a, b \rangle, \varphi)$ s.t.
    $\varphi(V) = true$, $\varphi(R) = \bot$, $\varphi(A) = \bot$, $\varphi(D) = \bot$

(c) Cost of alignment moves, $\gamma' = \gamma \oplus m$
    $m = (h, \gg)$:  $\kappa((h, \gg), s_\gamma) = 1.41$
    $m = (\gg, c)$:  $\kappa((\gg, c), s_\gamma) = 1.51$
    $m = (\gg, d)$:  $\kappa((\gg, d), s_\gamma) = 1.15$
    ...

Figure 4: Construction of the alignment between event trace
$\sigma_3 = \langle (b, \{V = true\}), (h, \{\}), (g, \{\}) \rangle$ and the net in Fig. 1.
The traces in Fig. 3a are used as $L_{fit}$.
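The selection step of Example 5 can be illustrated with a priority queue of partial alignments, assuming a best-first strategy in which the cheapest open alignment is expanded next. The move costs are the ones listed in Fig. 4c; the cost `k_gamma` of the partial alignment built so far is a made-up value, and all identifiers are hypothetical.

```python
import heapq

NO_MOVE = ">>"  # placeholder for the "no move" symbol

# Partial alignment of Fig. 4a (activity names only) and its cost so far.
gamma = [(NO_MOVE, "a"), ("b", "b")]
k_gamma = 1.0  # hypothetical value, not given in the example

# Costs of the candidate extensions in the current state, from Fig. 4c.
extension_costs = {
    ("h", NO_MOVE): 1.41,  # move on log
    (NO_MOVE, "c"): 1.51,  # move on model
    (NO_MOVE, "d"): 1.15,  # move on model
}

# Open list: (total cost, partial alignment), ordered by total cost.
open_list = []
for move, cost in extension_costs.items():
    heapq.heappush(open_list, (k_gamma + cost, gamma + [move]))

# The cheapest extension is taken up in the next iteration.
best_cost, best_alignment = heapq.heappop(open_list)
print(best_alignment[-1])  # ('>>', 'd')
```

The two more expensive extensions stay on the open list, so they can still be picked up later if the chosen alignment turns out to be costly to complete.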
V. EXPERIMENTS
We have implemented the proposed approach as a plug-in of
the ProM framework (http://www.promtools.org). The plug-in
takes as inputs a process model and two event logs. It computes
the probable alignments of each trace in the first event log and
the process model based on the traces in the second event log
(historical logging data).
We have evaluated the proposed approach using synthetic
data. The aim of the evaluation is to assess the influence
of data attributes in determining the correct explanation of
nonconformity. To this end, in the experiments we artificially
added noise to the traces and assessed the ability of the
approach to reconstruct the original traces.
Setting: Based on the process model in Fig. 1, we generated
20000 traces with 136138 events using CPN Tools
(http://cpntools.org). We introduced different percentages of
noise (10%, 20%, 30%, 40%) to 20% of the traces in the event
log by adding or removing some events. The remaining traces
were used as historical logging data to compute the cost function.
We computed the probable alignments for the manipulated
traces. Then, by projecting alignments over the process model,
we reconstructed the traces. To evaluate the accuracy of the
approach, we considered the percentage of reconstructed traces
that coincide with the original traces and the Levenshtein
distance [7] between the original and reconstructed traces.
The Levenshtein distance between two sequences computes
the minimal number of changes required to transform one
sequence into the other, indicating to what extent the approach
is able to reconstruct the original traces.
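For reference, the Levenshtein distance between two traces can be computed with the standard dynamic program below. It is a generic textbook sketch rather than the code used in the experiments; traces are represented as lists of activity names, and the two example traces are made up.

```python
def levenshtein(a, b):
    """Minimal number of insertions, deletions and substitutions
    needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


original = ["a", "b", "d", "g"]       # hypothetical original trace
reconstructed = ["a", "b", "c", "g"]  # hypothetical reconstruction
print(levenshtein(original, reconstructed))  # 1
```

A distance of 0 means that the reconstructed trace coincides with the original one.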
In the experiments we compared our approach with two
existing alignment-based techniques: one technique [3] constructs
optimal alignments based on historical logging data,
like the approach proposed in this work, but only considers
the control-flow to construct optimal alignments; the other
technique [2] penalizes all deviations equally and thus returns
an alignment with the minimum number of deviations.

In our case study, we considered continuous data attributes. This may
have an impact on the state representation and, thus, bias the
computation of the probabilities used for the definition of the
cost function. To mitigate this, we used various data discretization
techniques [8] to partition the domain of continuous attributes
into a finite number of intervals. In particular, we used
the unsupervised and supervised methods provided by Weka
(http://www.cs.waikato.ac.nz/ml/weka/). For unsupervised
discretization, we used the equal-width method by partitioning
the domain of continuous data attributes into 10 equal-width
bins. Fig. 5 shows the discretization of the loan amount
obtained using the equal-width and the supervised discretization
methods of Weka.⁴

(a) Partition of the domain into 10 equal-width bins.
    Interval boundaries: -inf, 2239, 4323, 6407, 8491, 10575, 12658, 14742, 16826, 18910, +inf
(b) Intervals obtained using the supervised discretization method.
    Interval boundaries: -inf, 1001, 5000, 10937, 18292, +inf
Figure 5: Discretization of data attribute amount using supervised
and unsupervised methods. Symbol “×” is used to represent
the interval cut used for the generation of event traces.
Results: Table I reports the average results over five runs
for different levels of noise. When a supervised method was
used for the discretization of data attributes, our approach,
on average, computed the correct alignments for 12% and
18.25% more traces than the techniques proposed in [3] and [2]
respectively. In addition, on average, the Levenshtein distance
is improved by 54% compared to [3] and by 57% compared to
[2]. These results show that the performed activities depend
not only on the activities previously executed but also on data
attributes. Thus, considering data attributes provides more
accurate explanations of nonconformity. In particular, when
the amount of noise is low, our approach is able to reconstruct
the correct alignments for most of the traces (99% at 10% noise
with supervised discretization).
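One reading of these averages can be checked against the entries of Table I: the gain in correct alignments as the mean difference in percentage points over the four noise levels, and the Levenshtein improvement as the mean relative reduction per noise level. The sketch below assumes exactly that reading; the function names are hypothetical. It reproduces the 12 points and the 54% against [3] and the 57% against [2]. For correct alignments against [2], the rounded table entries give 15.75 points, so the 18.25% quoted above may rest on unrounded values or on a different aggregation.

```python
# Entries of Table I: noise level (%) -> (CA, LD) for each technique.
TABLE = {
    "supervised":   {10: (99, 18), 20: (94, 223), 30: (82, 735),  40: (70, 1334)},
    "unsupervised": {10: (97, 80), 20: (91, 310), 30: (80, 827),  40: (67, 1424)},
    "[3]":          {10: (89, 298), 20: (81, 571), 30: (69, 1132), 40: (58, 1778)},
    "[2]":          {10: (86, 344), 20: (78, 635), 30: (64, 1256), 40: (54, 1854)},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def ca_gain(ours, baseline):
    """Mean difference in correct alignments, in percentage points."""
    return mean(TABLE[ours][n][0] - TABLE[baseline][n][0] for n in TABLE[ours])


def ld_improvement(ours, baseline):
    """Mean relative reduction of the Levenshtein distance, in percent."""
    return 100 * mean(
        (TABLE[baseline][n][1] - TABLE[ours][n][1]) / TABLE[baseline][n][1]
        for n in TABLE[ours]
    )


for baseline in ("[3]", "[2]"):
    print(baseline,
          ca_gain("supervised", baseline),
          round(ld_improvement("supervised", baseline), 1))
# [3] 12.0 53.7
# [2] 15.75 57.3
```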
The results also show that data discretization has a signifi-
cant impact on the accuracy of the constructed explanations of
nonconformity. Intuitively, if interval cuts are defined in such
a way that event traces with similar behavior yield the same
state, the approach can construct alignments more accurately.
For the generation of the event log used for the experiments,
we assumed that loan requests with an amount lower than
5000 are more likely assessed using simple assessment while
the other requests are more likely assessed using advanced
assessment. From Fig. 5, we can observe that the equal-width
discretization method does not partition the domain of the
data attribute into intervals that capture the behavior of the
process properly. In fact, loan requests with an amount between
4323 and 6407 fall in the same interval. On the other hand,
the supervised discretization method was able to identify the
correct interval cut for the data attribute. Thus, event traces
with similar behaviors are grouped more precisely, resulting
in more accurate alignments compared to the case where the
equal-width method was used for data discretization.

⁴ The discretization is performed w.r.t. the minimal and maximal value for
the data attribute observed in the training dataset. However, for the sake of
generality, in Fig. 5 we replace them with −inf and +inf respectively.

         Supervised        Unsupervised
         discretization    discretization    [3]             [2]
Noise    CA (%)   LD       CA (%)   LD       CA (%)   LD     CA (%)   LD
10%      99       18       97       80       89       298    86       344
20%      94       223      91       310      81       571    78       635
30%      82       735      80       827      69       1132   64       1256
40%      70       1334     67       1424     58       1778   54       1854

Table I: Results of experiments on synthetic data. CA indicates
the percentage of correct alignments; LD indicates the Levenshtein
distance between the original traces and the projection
of the alignments over the process.
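The effect of the interval cuts can be made concrete with the boundaries read off Fig. 5. The sketch below maps a loan amount to the index of its interval; the helper name `bin_index`, the two sample amounts and the convention that a boundary value belongs to the upper interval are assumptions made for illustration.

```python
from bisect import bisect_right

# Interior cut points from Fig. 5 (the outer bounds are -inf and +inf).
EQUAL_WIDTH_CUTS = [2239, 4323, 6407, 8491, 10575, 12658, 14742, 16826, 18910]
SUPERVISED_CUTS = [1001, 5000, 10937, 18292]


def bin_index(amount, cuts):
    """Index of the interval containing amount, given sorted cut points."""
    return bisect_right(cuts, amount)


# Two hypothetical requests on opposite sides of the 5000 threshold
# used when generating the event log.
low, high = 4500, 5500

print(bin_index(low, EQUAL_WIDTH_CUTS), bin_index(high, EQUAL_WIDTH_CUTS))  # 2 2
print(bin_index(low, SUPERVISED_CUTS), bin_index(high, SUPERVISED_CUTS))    # 1 2
```

With equal-width bins both requests end up in the same interval and therefore contribute to the same state, although they tend to follow different assessment paths. The supervised cuts separate them.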
VI. RELATED WORK
Several approaches have been proposed in the literature for
conformance checking [2], [9], [10], [11]. For instance, token-
based replay techniques [9], [11] measure the conformance of
an event log and a Petri net based on the number of missing
and remaining tokens after replaying traces. However, these
techniques fail to pinpoint the causes of nonconformity. This
limitation is overcome by alignment-based techniques [2], [12],
[13], [14], [15]. However, these techniques typically require
analysts to define a cost function based on their background
knowledge and beliefs, which can result in imperfections
leading to wrong or unlikely explanations of nonconformity.
Similar to this work, Alizadeh et al. [3] propose to compute
the cost function automatically based on the analysis of past
process executions. However, the technique in [3] only uses the
control-flow perspective to compute the cost function whereas
other process perspectives are not considered. As shown by
our experiments, this provides less accurate explanations of
nonconformity compared to the case where the data attributes
manipulated by activities are also used.
This is not the first work that uses multiple process perspectives
for conformance checking. Some techniques [16], [17],
[18] use rules to restrict a process execution using various
process perspectives. These techniques, however, can only
identify whether a process execution violates a particular rule.
Instead of using a set of compliance rules, our approach
uses an imperative process model that explicitly specifies
how tasks should be executed, providing complete diagnostic
information. Banescu et al. [19] propose an approach based
on sequence distance metrics in which multiple perspectives
(e.g., data accessed by an activity, the user performing the
activity) are used for the detection and quantification of privacy
deviations. However, this approach is not efficient and does not
support process models that allow infinitely many traces, e.g.
process models with loops.
Only a few approaches [20], [21] extend alignment-based techniques
to support conformance checking based on multiple process
perspectives. In [20], alignments are built by considering
only the control-flow and, then, other process perspectives are
used to refine the computed alignments. Mannhardt et al. [21]
propose a multi-perspective approach to construct optimal
alignments. Both techniques represent process models as Petri
nets with data and rely on guards to construct alignments. In
the case guards are not provided together with the process
model, they have to be inferred. In [22], de Leoni et al.
propose an approach to mine the guards of process activities.
In principle, the guard-discovery approach is based on the
probability of an activity to be executed in a given state: if
the guard does not hold, the probability is 0; otherwise, it is
1. The main drawback of such an approach is that guards are
supposed to be disjoint: in a given state, only one guard is
true and, hence, only one activity is enabled. In many real-
life cases, in a certain state multiple activities are possible
with different probabilities; in those cases, the guard-discovery
approach would fail to discover any reliable rules, making
it impossible to compute a probable alignment. Rather than
constructing alignments based on multiple process perspectives,
our approach constructs alignments on the basis of the control-
flow, whereas other process perspectives are used to define the
cost function. This provides the flexibility necessary to deal
with situations in which well-defined guards are not provided
with the process model and/or cannot be inferred.
VII. CONCLUSION
This paper has presented an alignment-based technique
to determine probable explanations of nonconformity when
a process execution deviates from the prescribed process
behavior. The approach exploits information about both the
control-flow and data perspective of past process executions
to define a cost function that enables the construction of
probable alignments. An evaluation using synthetic data shows
that our approach provides more accurate explanations of
nonconformity compared to alignment-based techniques that
only rely on the control-flow. For future work, we plan to
perform more extensive experiments to verify whether our
conclusions also hold for real-life logs.
In this work, we only focused on the identification of the
causes of nonconformity. To support analysts in the analy-
sis of event traces, our approach should be complemented
with methods and metrics for the analysis and possibly the
quantification of the identified causes of nonconformity with
respect to the application domain of interest and purpose
of the analysis. Moreover, alignments only show low-level
deviations, i.e. moves on model and log. While these deviations
indicate where the process deviates, assessing the criticality
of deviations requires understanding the actual high-level
deviations which occurred. To this end, we are studying how
low-level deviations can be correlated and combined into
high-level deviations.
Acknowledgments: This work has been partially funded by
the NWO CyberSecurity programme under the PriCE project,
the Dutch national program COMMIT under the THeCS
project and the European Community’s Seventh Framework
Program FP7 under grant agreement n. 603993 (CORE).
REFERENCES
[1] W. van der Aalst, Process Mining: Discovery, Conformance and En-
hancement of Business Processes. Springer, 2011.
[2] A. Adriansyah, B. F. van Dongen, and W. van der Aalst, “Memory-
efficient alignment of observed and modeled behavior,” BPMcenter.org,
BPM Center Report BPM-03-03, 2013.
[3] M. Alizadeh, M. de Leoni, and N. Zannone, “History-based Construc-
tion of Log-Process Alignments for Conformance Checking: Discover-
ing What Really Went Wrong,” in Proceedings of the 4th International
Symposium on Data-driven Process Discovery and Analysis, ser. CEUR
Workshop Proceedings 1293. CEUR-ws.org, 2014, pp. 1–15.
[4] W. van der Aalst, M. H. Schonenberg, and M. Song, “Time prediction
based on process mining,” Information Systems, vol. 36, no. 2, pp. 450–
475, 2011.
[5] R. Dechter and J. Pearl, “Generalized best-first search strategies and the
optimality of A*,” Journal of the ACM, vol. 32, pp. 505–536, 1985.
[6] S. Geman, E. Bienenstock, and R. Doursat, “Neural networks and the
bias/variance dilemma,” Neural Comput., vol. 4, no. 1, pp. 1–58, 1992.
[7] G. Navarro, “A guided tour to approximate string matching,” ACM
Comput. Surv., vol. 33, no. 1, pp. 31–88, 2001.
[8] S. Garcia, J. Luengo, J. A. Sáez, V. López, and F. Herrera, “A
survey of discretization techniques: Taxonomy and empirical analysis
in supervised learning,” TKDE, vol. 25, no. 4, pp. 734–750, 2013.
[9] S. Banescu, M. Petkovic, and N. Zannone, “Measuring privacy com-
pliance using fitness metrics,” in Business Process Management, ser.
LNCS. Springer, 2012, vol. 7481, pp. 114–119.
[10] J. E. Cook and A. L. Wolf, “Software Process Validation: Quantitatively
Measuring the Correspondence of a Process to a Model,” TOSEM,
vol. 8, pp. 147–176, 1999.
[11] A. Rozinat and W. van der Aalst, “Conformance checking of processes
based on monitoring real behavior,” Information Systems, vol. 33, no. 1,
pp. 64–95, 2008.
[12] W. M. P. van der Aalst, A. Adriansyah, and B. F. van Dongen,
“Replaying history on process models for conformance checking and
performance analysis,” Wiley Interdisc. Rew.: Data Mining and Knowl-
edge Discovery, vol. 2, no. 2, pp. 182–192, 2012.
[13] A. Adriansyah, B. F. van Dongen, and N. Zannone, “Controlling
break-the-glass through alignment,” in Proceedings of International
Conference on Social Computing. IEEE, 2013, pp. 606–611.
[14] ——, “Privacy analysis of user behavior using alignments,” it - Infor-
mation Technology, vol. 55, no. 6, pp. 255–260, 2013.
[15] S. Suriadi, C. Ouyang, W. van der Aalst, and A. H. ter Hofstede,
“Root cause analysis with enriched process logs,” in Business Process
Management Workshops. Springer, 2013, pp. 174–186.
[16] A. Awad, M. Weidlich, and M. Weske, “Specification, verification and
explanation of violation for data aware compliance rules,” in Service-
Oriented Computing. Springer, 2009, pp. 500–515.
[17] E. Ramezani, D. Fahland, and W. van der Aalst, “Where did I
misbehave? Diagnostic information in compliance checking,” in Business
Process Management. Springer, 2012, pp. 262–278.
[18] E. R. Taghiabadi, V. Gromov, D. Fahland, and W. van der Aalst,
“Compliance checking of data-aware and resource-aware compliance
requirements,” in OTM. Springer, 2014, pp. 237–257.
[19] S. Banescu and N. Zannone, “Measuring privacy compliance with
process specifications,” in Proceedings of International Workshop on
Security Measurements and Metrics. IEEE, 2011, pp. 41–50.
[20] M. de Leoni and W. van der Aalst, “Aligning event logs and process
models for multi-perspective conformance checking: An approach based
on integer linear programming,” in Business Process Management.
Springer, 2013, pp. 113–129.
[21] F. Mannhardt, M. de Leoni, H. Reijers, and W. van der Aalst, “Balanced
multi-perspective checking of process conformance,” Computing, 2015.
[22] M. de Leoni and W. van der Aalst, “Data-aware process mining:
discovering decisions in processes using alignments,” in Proceedings
of Symposium on Applied Computing. ACM, 2013, pp. 1454–1461.