The ProM Framework:
A New Era in Process Mining Tool Support
B.F. van Dongen, A.K.A. de Medeiros, H.M.W. Verbeek, A.J.M.M. Weijters,
and W.M.P. van der Aalst
Department of Technology Management, Eindhoven University of Technology,
P.O. Box 513, NL-5600 MB, Eindhoven,
Abstract. Under the umbrella of buzzwords such as “Business Activ-
ity Monitoring” (BAM) and “Business Process Intelligence” (BPI) both
academic (e.g., EMiT, Little Thumb, InWoLvE, Process Miner, and
MinSoN) and commercial tools (e.g., ARIS PPM, HP BPI, and ILOG
JViews) have been developed. The goal of these tools is to extract knowl-
edge from event logs (e.g., transaction logs in an ERP system or audit
trails in a WFM system), i.e., to do process mining. Unfortunately, tools
use diﬀerent formats for reading/storing log ﬁles and present their re-
sults in diﬀerent ways. This makes it diﬃcult to use diﬀerent tools on
the same data set and to compare the mining results. Furthermore, some
of these tools implement concepts that can be very useful in the other
tools but it is often diﬃcult to combine tools. As a result, researchers
working on new process mining techniques are forced to build a mining
infrastructure from scratch or test their techniques in an isolated way,
disconnected from any practical applications. To overcome these kind of
problems, we have developed the ProM framework, i.e., an “pluggable”
environment for process mining. The framework is ﬂexible with respect
to the input and output format, and is also open enough to allow for
the easy reuse of code during the implementation of new process mining
ideas. This paper introduces the ProM framework and gives an overview
of the plug-ins that have been developed.
The research domain process mining is relatively new. A complete overview of
recent process mining research is beyond the scope of this paper. Therefore, we
limit ourselves to a brief introduction to this topic and refer to [3, 4] and the
http://www.processmining.org web page for a more complete overview.
The goal of process mining is to extract information about processes from
transaction logs. It assumes that it is possible to record events such that (i) each
event refers to an activity (i.e., a well-deﬁned step in the process), (ii) each event
refers to a case (i.e., a process instance), (iii) each event can have a performer
also referred to as originator (the actor executing or initiating the activity), and
(iv) events have a timestamp and are totally ordered. Table 1 shows an example
G. Ciardo and P. Darondeau (Eds.): ICATPN 2005, LNCS 3536, pp. 444–454, 2005.
Springer-Verlag Berlin Heidelberg 2005
The ProM Framework: A New Era in Process Mining Tool Support 445
Table 1 . An event log (audit trail)
case id activity id originator case id activity id originator
case 1 activity A John case 5 activity A Sue
case 2 activity A John case 4 activity C Carol
case 3 activity A Sue case 1 activity D Pete
case 3 activity B Carol case 3 activity C Sue
case 1 activity B Mike case 3 activity D Pete
case 1 activity C John case 4 activity B Sue
case 2 activity C Mike case 5 activity E Clare
case 4 activity A Sue case 5 activity D Clare
case 2 activity B John case 4 activity D Pete
case 2 activity D Pete
of a log involving 19 events, 5 activities, and 6 originators. In addition to the
information shown in this table, some event logs contain more information on
the case itself, i.e., data elements referring to properties of the case. For example,
the case handling system FLOWer logs every modiﬁcation of some data element.
Event logs such as the one shown in Table 1 are used as the starting point
for mining. We distinguish three diﬀerent perspectives: (1) the process perspec-
tive, (2) the organizational perspective and (3) the case perspective. The process
perspecti ve focuses on the control-ﬂow, i.e., the ordering of activities, as shown
in Figure 1(a). The goal of mining this perspective is to ﬁnd a good charac-
terization of all possible paths, e.g., expressed in terms of a Petri net  or
Event-driven Process Chain (EPC) [11, 12]. The organizational perspective fo-
cuses on the originator ﬁeld, i.e., which performers are involved and how are
they related. The goal is to either structure the organization by classifying peo-
ple in terms of roles and organizational units (Figure 1(b)) or to show relation
between individual performers (i.e., build a social network as described in  and
references there, and as shown in Figure 1(c)). The case perspectiv e focuses on
properties of cases. Cases can be characterized by their path in the process or by
the originators working on a case. However, cases can also be characterized by
the values of the corresponding data elements. For example, if a case represents
a replenishment order, it is interesting to know the supplier or the number of
Orthogonal to the three perspectives (process, organization, and case), the
result of a mining eﬀort may refer to logical issues and/or performance issues.
For example, process mining can focus on performance issues such as ﬂow time,
the utilization of performers or execution frequencies.
After developing ad hoc tools for the mining of the process perspective (e.g.,
EMiT  and Little Thumb ) and other ad hoc tools (e.g., MinSoN ) for
the other mining perspectives we started the design of a ﬂexible framework in
which diﬀerent algorithms for each of the perspectives can be plugged in.
446 B.F. van Dongen et al.
(a) The control-flow structure expressed in terms of a Petri net.
(b) The organizational structure expressed in
terms of a activity-role-performer diagram.
John Sue Mike Carol Pete Clare
role X role Y role Z
(c) A sociogram based on transfer of work.
Fig. 1. Some mining results for the process perspective (a) and organizational (b and
c) perspective based on the event log shown in Table 1
As indicated in the introduction, the basis for all process mining techniques is
aprocess log. Such a log is a ﬁle generated by some information system, with
information about the execution of a process. Since each information system has
its own format for storing log ﬁles, we have developed a generic XML format
for the ProM framework to store a log in. This format was based on a thorough
comparison of the input needs of various existing (ad-hoc) process mining tools
and the information typically contained in an audit trail or transaction log of
some complex information system (e.g., an ERP or a WFM system).
Another important feature of the ProM framework is that it allows for in-
teraction between a large number of so-called plug-ins. A plug-in is basically
the implementation of an algorithm that is of some use in the process mining
area, where the implementation agrees with the framework. Such plug-ins can
be added to the entire framework with relative ease: Once the plug-in is ready
it can be added to the framework by adding its name to some ini-ﬁle. Note that
there is no need to modify the ProM framework (e.g., recompiling the code)
when adding new plug-ins, i.e., it is a truly “pluggable” environment. This in
contradiction to open-source initiatives, such as the data mining software Weka1.
In Figure 2, we show an overview of the framework that we developed. It
explains the relations between the framework, the process log format, and the
plug-ins. As Figure 2 shows, the ProM framework can read ﬁles in the XML
format through the Log ﬁlte r component. This component is able to deal with
large data sets and sorts the events within a case on their timestamps before
1Weka is available from http://www.cs.waikato.ac.nz/∼ml/weka/
The ProM Framework: A New Era in Process Mining Tool Support 447
Aris Graph Format
(Aris AML Format)
Heuristic Net PNML
Aris Graph format TPN
NetMiner file Agna file
Aris PPM Instances DOT
Comma Seperated Values …
Fig. 2. Overview of the ProM framework
the actual mining starts. (If no timestamps are present, the order in the XML
ﬁle is preserved.) Through the Import plug-ins a wide variety of models can
be loaded ranging from a Petri net to logical formulas. The Mining plug-ins
do the actual mining and the result is stored in memory, and in a window on
the ProM desktop. The framework allows plug-ins to operate on each others
results in a standardized way. Typically, the mining results contain some kind of
visualization, e.g., displaying a Petri net , an EPC  or a Social network
, or further analysis or conversion. The Analysis plug-ins take a mining result
an analyze it, e.g., calculating a place invariant for a resulting Petri net. The
Conversion plug-ins take a mining result and transform it into another format,
e.g., transforming an EPC into a Petri net. In the remainder of this section, we
describe both the process log format and the plug-ins.
2.1 Process Log Format
Figure 3(a) visualizes the XML schema that speciﬁes the process log format.
The root element is a WorkﬂowLog element. (The name “workﬂow log” is cho-
sen for backwards compatibility and we prefer to talk about process log.) The
WorkﬂowLog element contains (in the given order) an optional Data element, an
optional Source element, and a number of Process elements. A Data element al-
lows for storing arbitrary textual data, and contains a list of Attribute elements.
ASource element can be used to store information about the information system
this log originated from. A Process element refers to a speciﬁc process in an in-
formation system. Since most information systems typically control several pro-
cesses, multiple Process elements may exist in a log ﬁle. A ProcessInst ance is an
instance of the process, i.e., a case. An AuditTrailEntry may refer to an activity
(WorkﬂowModelElement), an eventtype (Eventtype), a timestamp (Timestamp),
and a person that executed the activity (Originator).
448 B.F. van Dongen et al.
(a) Process log XML format
(b) Transactional model for EventType
Fig. 3. Process log XML format (a) and transactional model (b)
As will be clear from what was mentioned earlier, a log ﬁle typically contains
information about events that took place in a system. Such events typically refer
to a case and a speciﬁc activity within that case. Examples of such events are:
–The activity send message is now ready to be executed.
–The activity wait for incoming transmission has not been started for three
–The case with ID 203453 was aborted.
In order to be able to talk about these events in a standard way, we developed
a transactional model that shows the events that we assume can appear in a
log. Again this model is based on analyzing the diﬀerent types of logs in real-life
systems (e.g., Staﬀware, SAP, FLOWer, etc.) Figure 3(b) shows the transactional
When an activity is created, it is either scheduled or skipped automatically
(autoskip). Scheduling an activity means that the control over that activity is put
into the information system. The information system can now assign this activity
to a certain person or group of persons. It is possible to reassign an assigned
activity to another person or group of persons. This can be done by the system,
or by a user. A user can start working on an activity that was assigned to him, or
some user can decide to withdraw the activity or skip it manually (manualskip),
which can even happen before the activity was assigned. The main diﬀerence
between a withdrawal and a manual skip is the fact that after the manual skip
the activity has been executed correctly, while after a withdrawal it is not. The
user that started an activity can suspend and resume the activity several times,
but in the end he either has to complet e or abort (ate abort) it. Note the activity
can get aborted (pi abort) during its entire life cycle.
The ProM Framework: A New Era in Process Mining Tool Support 449
We do not claim that we have captured all possible behavior of all systems.
However, we have veriﬁed our transactional model against several commercial
systems and they all seem to ﬁt nicely. Nonetheless, in the XML format, we
allow for other event types to be deﬁned on the ﬂy.
In this section, we provide an overview of the plug-ins as currently implemented
in the context of the ProM framework. For more technical documentation and
scientiﬁc publications, we refer to our website http://www.processmining.org.
As shown in Figure 2 there are ﬁve kinds of plug-ins:
Mining plug-ins which implement some mining algorithm, e.g., mining algo-
rithms that construct a Petri net based on some event log.
Export plug-ins which implement some “save as” functionality for some ob-
jects (such as graphs). For example, there are plug-ins to save EPCs, Petri
nets (e.g., in PNML format ), spreadsheets, etc.
Import plug-ins which implement an “open” functionality for exported ob-
jects, e.g., load instance-EPCs from ARIS PPM.
Analysis plug-ins which typically implement some property analysis on some
mining result. For example, for Petri nets there is a plug-in which constructs
place invariants, transition invariants, and a coverability graph. However,
there are also analysis plug-ins to compare a log and a model (i.e., confor-
mance testing) or a log and an LTL formula.
Conversion plug-ins which implement conversions between diﬀerent data for-
mats, e.g., from EPCs to Petri nets.
The current version of the framework contains a large set of plug-ins. A detailed
description of these plug-ins is beyond the scope of this paper. Currently, there
are nine export plug-ins, four import plug-ins, seven analysis plug-ins, and three
conversion plug-ins. Therefore, we only mention some of the available mining
plug-ins. For each of the three perspectives which were mentioned in the intro-
duction, there are diﬀerent mining plug-ins.
For the process perspective, four plug-ins are available:
α-algorithm which implements the α-algorithm  and its extensions as devel-
oped by the authors. The α-algorithm constructs a Petri net which models
the process recorded in the log.
Tshinghua-αalgorithm which uses timestamps in the log ﬁles to construct
a Petri net. It is related to the αalgorithm, but uses a diﬀerent approach.
Details can be found in . It is interesting to note that this mining plug-in
was the ﬁrst plug-in developed by researchers outside of our research group.
Researchers from Tshinghua University in China (Jianmin Wang and Wen
Lijie) were able to develop and integrate this plug-in without any help or
changes to the framework.
450 B.F. van Dongen et al.
Genetic algorithm which uses genetic algorithms to tackle possible noise in
the log ﬁle as described in . Its output format is a heuristics net (which
can be converted into an EPC or a Petri net).
Multi-phase mining which implements a series of process mining algorithms
that use instance graphs (comparable to runs) as an intermediate format.
The two-phase approach resembles the aggregation process in Aris PPM.
For the organizational perspective, one plug-in is available:
Social network miner which uses the log ﬁle to determine a social network of
people . It requires the log ﬁle to contain the Originator element.
Finally, for the case perspective, also one plug-in is available:
Case data extraction which can be used for interfacing with a number of
standard knowledge discovering tools, e.g., Viscovery and SPSS AnswerTree.
Sometimes a collection of plug-ins is needed to achieve the desired functionality.
An example is the LTL-checker which checks whether logs satisfy some Linear
Temporal Logic (LTL) formula. For example, the LTL-checker can be used to
check the “four eyes” principle, i.e., two activities within the same case should
not be executed by the same person to avoid possible fraud. The LTL-checker
combines a mining plug-in (to get the log), an import plug-in (to load the ﬁle with
predeﬁned LTL formulas), and an analysis plug-in (to do the actual checking).
Since the ProM framework contains a large number of plug-ins, it is impossible
to discuss them all in detail. Therefore, we only present some screenshots of a
few plug-ins that we applied to the example of Table 1. In Figure 4, we show the
result of applying the α-mining plug-in to the example. The default settings of
the plug-in were used, and the result is a Petri net that is behaviorally equivalent
to the one presented in Figure 1. In Figure 5, we show the result of the social
network mining plug-in. We used the handover of work setting, considering only
direct succession, to generate this ﬁgure. Comparing it to Figure 1(c) shows that
the result is an isomorphic graph (i.e. the result is the same).
Petri nets are not the only modelling language supported by the framework.
Instead, we also have built-in support for EPCs (Event-driven Process Chains).
In Figure 6, we show the result of the multi-phase mining plug-in. The result is
an aggregated EPC describing the behavior of all cases. Note that it allows for
more behavior than the Petri net, since the connectors are of the type logical or.
In Figure 7 we show the user interface of the analysis plug-in that can be used
for the veriﬁcation of EPCs.
In this section, we have shown some screenshots to provide an overview of
the framework. We would like to stress that we only showed a few plug-ins of
the many that are available. We would also like to point out that most plug-ins
allow for user interaction. The latter it important because process mining is often
an interactive process where human interpretation is important and additional
knowledge can be used to improve the mining result.
The ProM Framework: A New Era in Process Mining Tool Support 451
Fig. 4. The α-mining plug-in Fig. 5. The social network mining plug-in
Fig. 6. The discovered EPC Fig. 7. Analyzing the EPC for correctness
4 Related Work
Process mining can be seen as a tool in the context of Business Activity Mon-
itoring (BAM) and Business (Process) Intelligence (BPI). In  a BPI toolset
on top of HP’s Process Manager is described. The BPI tools set includes a so-
called “BPI Process Mining Engine”. However, this engine does not provide any
techniques as discussed before. Instead it uses generic mining tools such as SAS
Enterprise Miner for the generation of decision trees relating attributes of cases
to information about execution paths (e.g., duration). In  the PISA tool is
described which can be used to extract performance metrics from workﬂow logs.
Similar diagnostics are provided by the ARIS Process Performance Manager
(PPM) . The latter tool is commercially available and a customized version
of PPM is the Staﬀware Process Monitor (SPM)  which is tailored towards
mining Staﬀware logs.2
2Note that the ProM Framework interfaces with Staﬀware, SPM, ARIS Toolset, and
452 B.F. van Dongen et al.
Given the many papers on mining the process perspective it is not possible to
give a complete overview. Instead we refer to [3, 5]. Historically, Cook et al. 
and Agrawal et al.  started to work on the problem addressed in this paper.
Herbst et al.  took an alternative approach which allows for dealing with
duplicate activities. The authors of this paper have been involved in diﬀerent
variants of the so-called α-algorithm [1, 5, 17]. Each of the approaches has its
pros and its cons. Most approaches that are able to discover concurrency have
problems dealing with issues such as duplicate activities, hidden activities, non-
free-choice constructs, noise, and incompleteness.
The ProM framework subsumes process mining tools like EMiT , Lit-
tle Thumb  and MinSon . Most of these tools had their own format to
store log ﬁles in, and had their own limitations. The tool EMiT for example
was unable to deal with log ﬁles of more than 1000 cases. To be able to use
all these tools together in an interactive way, we developed the ProM frame-
work, which can be seen as a successor of all these tools. The framework allows
researchers to seamlessly combine their own algorithms with algorithms from
other people. Furthermore, using the framework allows you to interface with
many existing tools, both commercial and public. These tools include: the Aris
Toolset, Aris PPM, Woﬂan, The Petri net kernel, Netminer, Agna, Dot, Viscov-
The ProM framework integrates the functionality of several existing process
mining tools and provides many additional process mining plug-ins. The ProM
framework supports multiple formats and multiple languages, e.g., Petri nets,
EPCs, Social Networks, etc. The plug-ins can be used in several ways and
combined to be applied in real-life situations. We encourage developers and re-
searchers to use the ProM framework for implementing new ideas. It is easy to
add a new plug-in. For adding new plug-ins it suﬃces to add a few lines to the
conﬁguration ﬁles and no changes to the code are necessary, i.e., new mining
plug-ins can be added without re-compiling the source code. Experiences with
adding the Thingua-αplug-in and the Social network miner show that this is
indeed rather straightforward.
The authors would like to thank all people that have been involved in the de-
velopment and implementation of the ProM framework. In particular we would
like to thank Minseok Song, Jianmin Wang and Wen Lijie for their contribu-
tions. Furthermore, we would like to thank IDS Scheer for providing us with
Aris PPM and the Aris toolset. Last, but certainly not least, we would like to
thank Peter van den Brand for doing the major part of the implementation work
The ProM Framework: A New Era in Process Mining Tool Support 453
1. W.M.P. van der Aalst and B.F. van Dongen. Discovering Workﬂow Performance
Models from Timed Logs. In Y. Han, S. Tai, and D. Wikarski, editors, International
Conference on Engineering and Deployment of Cooperative Information Systems
(EDCIS 2002), volume 2480 of Lecture Notes in Computer Science, pages 45–63.
Springer-Verlag, Berlin, 2002.
2. W.M.P. van der Aalst and M. Song. Mining Social Networks: Uncovering interac-
tion patterns in business processes. In J. Desel, B. Pernici, and M. Weske, editors,
International Conference on Business Process Management (BPM 2004),volume
3080 of Lecture Notes in Computer Science, pages 244–260. Springer-Verlag, Berlin,
3. W.M.P. van der Aalst, B.F. van Dongen, J. Herbst, L. Maruster, G. Schimm, and
A.J.M.M. Weijters. Workﬂow Mining: A Survey of Issues and Approaches. Data
and Knowledge Engineering, 47(2):237–267, 2003.
4. W.M.P. van der Aalst and A.J.M.M. Weijters, editors. Process Mining,Special
Issue of Computers in Industry, Volume 53, Number 3. Elsevier Science Publishers,
5. W.M.P. van der Aalst, A.J.M.M. Weijters, and L. Maruster. Workﬂow Mining:
Discovering Process Models from Event Logs. IEEE Transactions on Knowledge
and Data Engineering, 16(9):1128–1142, 2004.
6. R. Agrawal, D. Gunopulos, and F. Leymann. Mining Process Models from Work-
ﬂow Logs. In Sixth International Conference on Extending Database Technology,
pages 469–483, 1998.
7. J. Billington and et. al. The Petri Net Markup Language: Concepts, Technology,
and Tools. In W.M.P. van der Aalst and E. Best, editors, Application and Theory
of Petri Nets 2003, volume 2679 of Lecture Notes in Computer Science,pages
483–506. Springer-Verlag, Berlin, 2003.
8. J.E. Cook and A.L. Wolf. Discovering Models of Software Processes from Event-
Based Data. ACM Transactions on Software Engineering and Methodology,
9. D. Grigori, F. Casati, U. Dayal, and M.C. Shan. Improving Business Process Qual-
ity through Exception Understanding, Prediction, and Prevention. In P. Apers,
P. Atzeni, S. Ceri, S. Paraboschi, K. Ramamohanarao, and R. Snodgrass, ed-
itors, Proceedings of 27th Internati on al C onference on Very Large Dat a Ba se s
(VLDB’01), pages 159–168. Morgan Kaufmann, 2001.
10. J. Herbst. A Machine Learning Approach to Workﬂow Management. In P roceedings
11th European Conference on Machine Learning, volume 1810 of Lecture Notes in
Computer Science, pages 183–194. Springer-Verlag, Berlin, 2000.
11. IDS Scheer. ARIS Process Performance Manager (ARIS PPM): Measure, Ana-
lyze and Optimize Your Business Process Performance (whitepaper). IDS Scheer,
Saarbruecken, Gemany, http://www.ids-scheer.com, 2002.
12. G. Keller and T. Teufel. SAP R/3 Process Oriented Implementation. Addison-
Wesley, Reading MA, 1998.
13. A.K.A. de Medeiros, A.J.M.M. Weijters, and W.M.P. van der Aalst. Using Ge-
netic Algorithms to Mine Process Models: Representation, Operators and Results.
BETA Working Paper Series, WP 124, Eindhoven University of Technology, Eind-
454 B.F. van Dongen et al.
14. M. zur M¨uhlen and M. Rosemann. Workﬂow-based Process Monitoring and Con-
trolling - Technical and Organizational Issues. In R. Sprague, editor, Proceed in gs
of the 33rd Hawaii International Conference on System Science (HICSS-33),pages
1–10. IEEE Computer Society Press, Los Alamitos, California, 2000.
15. W. Reisig and G. Rozenberg, editors. Lectures on Petri Nets I: Basic Models,
volume 1491 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1998.
16. Staﬀware. Staﬀware Process Monitor (SPM). http://www.staﬀware.com, 2002.
17. A.J.M.M. Weijters and W.M.P. van der Aalst. Rediscovering Workﬂow Models
from Event-Based Data using Little Thumb. Integrated Computer-Aided Engi-
neering, 10(2):151–162, 2003.
18. L. Wen, J. Wang, W.M.P. van der Aalst, Z. Wang, and J. Sun. A Novel Approach
for Process Mining Based on Event Types. BETA Working Paper Series, WP 118,
Eindhoven University of Technology, Eindhoven, 2004.