Chapter

Complex Event Processing for Event-Based Process Querying


Abstract

Process querying targets the filtering and transformation of business process representations, such as event data recorded by information systems. This paper argues for the application of models and methods developed in the general field of Complex Event Processing (CEP) to process querying. Specifically, if event data is generated continuously during process execution, CEP techniques may help to filter and transform process-related information by evaluating queries over event streams. The paper motivates the use of such event-based process querying and discusses common challenges and techniques for the application of CEP to process querying, focusing in particular on event-activity correlation, automated query derivation, and diagnostics for query matches.
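
To make the notion of evaluating queries over event streams concrete, the following is a minimal sketch (assuming a hypothetical event format of case identifier, activity label, and timestamp; this is not the chapter's own formalism) that matches a simple sequence pattern per process instance within a time window. Real CEP engines offer far richer pattern languages, selection strategies, and windowing semantics.

```python
# Hypothetical event schema: (case_id, activity, timestamp in seconds),
# assumed to arrive ordered by timestamp.
stream = [
    ("c1", "register claim", 0),
    ("c2", "register claim", 5),
    ("c1", "assess claim", 40),
    ("c2", "assess claim", 900),
]

def seq_within(events, first, second, window):
    """Report cases where `second` follows `first` within `window` seconds."""
    pending = {}  # case_id -> timestamp of the earliest unmatched `first` event
    for case, activity, ts in events:
        if activity == first:
            pending.setdefault(case, ts)
        elif activity == second and case in pending:
            if ts - pending.pop(case) <= window:
                yield case, ts

print(list(seq_within(stream, "register claim", "assess claim", 300)))
# -> [('c1', 40)]; case c2 exceeds the 300-second window
```
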

References

Article
Nowadays, business processes are increasingly supported by IT services that produce massive amounts of event data during the execution of a process. These event data can be used to analyze the process using process mining techniques to discover the real process, to measure conformance to a given process model, or to enhance existing models with performance information. Mapping the produced events to activities of a given process model is essential for conformance checking, annotation, and understanding of process mining results. In order to accomplish this mapping with low manual effort, we developed a semi-automatic approach that maps events to activities using insights from behavioral analysis and label analysis. The approach extracts Declare constraints from both the log and the model to build matching constraints to efficiently reduce the number of possible mappings. These mappings are further reduced using techniques from natural language processing, which allow for a matching based on labels and external knowledge sources. The evaluation with synthetic and real-life data demonstrates the effectiveness of the approach and its robustness toward non-conforming execution logs.
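
As a minimal sketch of the kind of constraint extraction described above (the log format and the restriction to a single Declare template are assumptions, not the authors' implementation), the following computes the support of a response(a, b) constraint, i.e., that every occurrence of a is eventually followed by b in the same trace:

```python
def response_support(log, a, b):
    """Fraction of traces satisfying response(a, b):
    every occurrence of `a` is eventually followed by an occurrence of `b`."""
    satisfied = 0
    for trace in log:
        last_a = max((i for i, x in enumerate(trace) if x == a), default=-1)
        # The constraint holds vacuously if `a` never occurs in the trace.
        if last_a == -1 or b in trace[last_a + 1:]:
            satisfied += 1
    return satisfied / len(log)

log = [["a", "c", "b"], ["a", "c"], ["c", "b"]]
print(response_support(log, "a", "b"))  # 2/3: the second trace violates the rule
```
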
Article
The volume of process-related data is growing rapidly: more and more business operations are being supported and monitored by information systems. Industry 4.0 and the corresponding industrial Internet of Things are about to generate new waves of process-related data, next to the abundance of event data already present in enterprise systems. However, organizations often fail to convert such data into strategic and tactical intelligence. This is due to the lack of dedicated technologies that are tailored to effectively manage the information on processes encoded in process models and process execution records. Process-related information is a core organizational asset which requires dedicated analytics to unlock its full potential. This paper proposes a framework for devising process querying methods, i.e., techniques for the (automated) management of repositories of designed and executed processes, as well as models that describe relationships between processes. The framework is composed of generic components that can be configured to create a range of process querying methods. The motivation for the framework stems from use cases in the field of Business Process Management. The design of the framework is informed by and validated via a systematic literature review. The framework structures the state of the art and points to gaps in existing research. Process querying methods need to address these gaps to better support strategic decision-making and provide the next generation of Business Intelligence platforms.
Conference Paper
Process model matching provides the basis for many process analysis techniques such as inconsistency detection and process querying. The matching task refers to the automatic identification of correspondences between activities in two process models. Numerous techniques have been developed for this purpose, all of which share a focus on process-level information. In this paper, we introduce instance-based process matching, which specifically focuses on information related to instances of a process. In particular, we introduce six similarity metrics that each use a different type of instance information stored in the event logs associated with processes. The proposed metrics can be used as standalone matching techniques or to complement existing process model matching techniques. A quantitative evaluation on real-world data demonstrates that the use of information from event logs is essential in identifying a considerable amount of correspondences.
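
To illustrate what an instance-based similarity signal could look like (a hypothetical metric for illustration, not one of the six from the paper), the following compares two activities by the attribute values recorded for their events in the respective logs:

```python
def attribute_value_similarity(events1, events2, act1, act2):
    """Jaccard similarity of the attribute-value pairs observed for two
    activities in two event logs; one conceivable instance-level signal."""
    def values(events, act):
        return {(k, v) for e in events if e["activity"] == act
                for k, v in e.items() if k != "activity"}
    v1, v2 = values(events1, act1), values(events2, act2)
    return len(v1 & v2) / len(v1 | v2) if v1 | v2 else 0.0

log_a = [{"activity": "pay", "channel": "bank", "amount": "high"}]
log_b = [{"activity": "payment", "channel": "bank", "amount": "low"}]
print(attribute_value_similarity(log_a, log_b, "pay", "payment"))  # 1/3
```
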
Article
Pattern queries are widely used in complex event processing (CEP) systems. Existing pattern matching techniques, however, can provide only limited performance for expensive queries in real-world applications, which may involve Kleene closure patterns, flexible event selection strategies, and events with imprecise timestamps. To support these expensive queries with high performance, we begin our study by analyzing the complexity of pattern queries, with a focus on the fundamental understanding of which features make pattern queries more expressive and at the same time more computationally expensive. This analysis allows us to identify performance bottlenecks in processing those expensive queries, and provides key insights for us to develop a series of optimizations to mitigate those bottlenecks. Microbenchmark results show superior performance of our system for expensive pattern queries while most state-of-the-art systems suffer from poor performance. A thorough case study on Hadoop cluster monitoring further demonstrates the efficiency and effectiveness of our proposed techniques.
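
The cost of Kleene closure patterns can be made tangible with a naive matcher for SEQ(a, b+, c) under the skip-till-any-match selection strategy (the event format and the choice of strategy are assumptions for illustration): every non-empty subset of intermediate b-events yields a distinct match, so the result set grows exponentially.

```python
from itertools import combinations

def kleene_matches(events, a, b, c):
    """Enumerate matches of SEQ(a, b+, c) under skip-till-any-match:
    any non-empty subset of the b-events between an a and a later c
    qualifies, so the result grows exponentially with the b-events."""
    matches = []
    for i, (ea, _) in enumerate(events):
        if ea != a:
            continue
        for k, (ec, _) in enumerate(events):
            if ec != c or k <= i:
                continue
            bs = [j for j in range(i + 1, k) if events[j][0] == b]
            for r in range(1, len(bs) + 1):
                matches += [(i, combo, k) for combo in combinations(bs, r)]
    return matches

events = [("A", 0), ("B", 1), ("B", 2), ("B", 3), ("C", 4)]
print(len(kleene_matches(events, "A", "B", "C")))  # 7 = 2**3 - 1 matches
```
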
Conference Paper
The analysis of business processes is often challenging not only because of intricate dependencies between process activities but also because of various sources of faults within the activities. The automated detection of potential business process anomalies could immensely help business analysts and other process participants detect and understand the causes of process errors. This work focuses on temporal anomalies, i.e., anomalies concerning the runtime of activities within a process. To detect such anomalies, we propose a Bayesian model that can be automatically inferred from the Petri net representation of a business process. Probabilistic inference on the above model allows the detection of non-obvious and interdependent temporal anomalies.
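
As a drastically simplified stand-in for the paper's Bayesian model (a sketch assuming that the historical durations of a single activity are roughly normally distributed), temporal outliers can be flagged with a z-score test:

```python
import statistics

def temporal_anomalies(history, observed, z_threshold=3.0):
    """Flag activity durations that deviate strongly from historical
    behaviour; a simplified stand-in for inference on a full Bayesian model."""
    mean = statistics.mean(history)
    std = statistics.stdev(history)
    return [d for d in observed if abs(d - mean) > z_threshold * std]

history = [10, 12, 11, 9, 10, 11, 10, 12]    # past runtimes of one activity
print(temporal_anomalies(history, [11, 48]))  # -> [48]
```
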
Conference Paper
The design of concurrent software systems, in particular process-aware information systems, involves behavioral modeling at various stages. Recently, approaches to behavioral analysis of such systems have been based on declarative abstractions defined as sets of behavioral relations. However, these relations are typically defined in an ad-hoc manner. In this paper, we address the lack of a systematic exploration of the fundamental relations that can be used to capture the behavior of concurrent systems, i.e., co-occurrence, conflict, causality, and concurrency. Besides the definition of the spectrum of behavioral relations, which we refer to as the 4C spectrum, we also show that our relations give rise to implication lattices. We further provide operationalizations of the proposed relations, starting by proposing techniques for computing relations in unlabeled systems, which are then lifted to become applicable in the context of labeled systems, i.e., systems in which state transitions have semantic annotations. Finally, we report on experimental results on the efficiency of the proposed computations.
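
Computed over traces rather than over the unlabeled and labeled systems treated in the paper, a rough approximation of such relations is the classic log footprint derived from the directly-follows relation (a sketch, not the paper's operationalization):

```python
def footprint(log):
    """Classify distinct activity pairs by the directly-follows relation
    of an event log: causality, concurrency, or conflict."""
    df = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}
    acts = {a for t in log for a in t}
    rel = {}
    for x in acts:
        for y in acts:
            if x == y:
                continue
            fwd, bwd = (x, y) in df, (y, x) in df
            rel[x, y] = ("concurrency" if fwd and bwd else
                         "causality" if fwd else
                         "reverse causality" if bwd else
                         "conflict")
    return rel

log = [["a", "b", "c"], ["a", "c", "b"], ["a", "d"]]
rel = footprint(log)
print(rel["a", "b"], rel["b", "c"], rel["b", "d"])
# causality concurrency conflict
```
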
Conference Paper
Complex Event Processing (CEP) systems aim at processing large flows of events to discover situations of interest. In CEP, the processing takes place according to user-defined rules, which specify the (causal) relations between the observed events and the phenomena to be detected. We claim that the complexity of writing such rules is a limiting factor for the diffusion of CEP. In this paper, we tackle this problem by introducing iCEP, a novel framework that learns, from historical traces, the hidden causality between the received events and the situations to detect, and uses it to automatically generate CEP rules. The paper introduces three main contributions. It provides a precise definition for the problem of automated CEP rules generation. It discusses a general approach to this research challenge that builds on three fundamental pillars: decomposition into subproblems, modularity of solutions, and ad-hoc learning algorithms. It provides a concrete implementation of this approach, the iCEP framework, and evaluates its precision in a broad range of situations, using both synthetic benchmarks and real traces from a traffic monitoring scenario.
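
One of the subproblems such rule learning decomposes into, learning value predicates, can be sketched as follows (the data format and the single-attribute restriction are assumptions; this is a toy version, not the iCEP algorithm): from event windows that preceded the situation of interest, learn the tightest interval covering the attribute.

```python
def learn_value_predicate(positive_windows, attribute):
    """Learn the tightest interval covering an attribute across all
    positive examples: a toy form of learning rules from historical traces."""
    values = [e[attribute] for window in positive_windows for e in window]
    return min(values), max(values)

# Windows of sensor readings that preceded a detected traffic jam (assumed data).
positives = [
    [{"speed": 12}, {"speed": 18}],
    [{"speed": 9}, {"speed": 15}],
]
lo, hi = learn_value_predicate(positives, "speed")
print(f"rule: detect when {lo} <= speed <= {hi}")  # 9 <= speed <= 18
```
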
Article
A large number of distributed applications require continuous and timely processing of information as it flows from the periphery to the center of the system. Examples include intrusion detection systems, which analyze network traffic in real-time to identify possible attacks; environmental monitoring applications, which process raw data coming from sensor networks to identify critical situations; and applications performing online analysis of stock prices to identify trends and forecast future values. Traditional DBMSs, which need to store and index data before processing it, can hardly fulfill the requirements of timeliness coming from such domains. Accordingly, during the last decade, different research communities developed a number of tools, which we collectively call Information flow processing (IFP) systems, to support these scenarios. They differ in their system architecture, data model, rule model, and rule language. In this article, we survey these systems to help researchers, who often come from different backgrounds, in understanding how the various approaches they adopt may complement each other. In particular, we propose a general, unifying model to capture the different aspects of an IFP system and use it to provide a complete and precise classification of the systems and mechanisms proposed so far.
Conference Paper
Business process design is primarily driven by process improvement objectives. However, the role of control objectives stemming from regulations and standards is becoming increasingly important for businesses in light of recent events that led to some of the largest scandals in corporate history. As organizations strive to meet compliance agendas, there is an evident need to provide systematic approaches that assist in the understanding of the interplay between (often conflicting) business and control objectives during business process design. In this paper, our objective is twofold. We first present a research agenda in the space of business process compliance, identifying major technical and organizational challenges. We then tackle a part of the overall problem space, which deals with the effective modeling of control objectives and subsequently their propagation onto business process models. Control objective modeling is proposed through a specialized modal logic based on normative systems theory, and the visualization of control objectives on business process models is achieved procedurally. The proposed approach is demonstrated in the context of a purchase-to-pay scenario.
Conference Paper
Process-aware information systems support business operations as they are typically defined in a normative process model. Often these systems do not directly execute the process model, but provide the flexibility to deviate from the normative model. This paper proposes a method for monitoring control-flow deviations during process execution. Our contribution is a formal technique to derive monitoring queries from a process model, such that they can be directly used in a complex event processing environment. Furthermore, we also introduce an approach to filter and aggregate query results to provide compact feedback on deviations. Our technique is applied in a case study within the IT service industry.
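
To convey the flavour of such query derivation (the paper's formal technique handles full process models; this sketch assumes a purely sequential model and a hypothetical stream format), one monitoring rule per model edge suffices to detect control-flow deviations:

```python
def derive_expected_successor(model):
    """From a purely sequential process model, derive one monitoring rule
    per edge: after activity a, the next event of the case must be b."""
    return dict(zip(model, model[1:]))

def monitor(stream, rules):
    """Replay a stream of (case, activity) events and report deviations."""
    expected = {}  # case -> activity the rules require next
    for case, activity in stream:
        if case in expected and activity != expected[case]:
            yield case, expected[case], activity
        expected[case] = rules.get(activity)

model = ["receive ticket", "diagnose", "resolve"]
stream = [("c1", "receive ticket"), ("c1", "resolve")]  # skips "diagnose"
print(list(monitor(stream, derive_expected_successor(model))))
# -> [('c1', 'diagnose', 'resolve')]
```
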
Article
Analysis of behavioural consistency is an important aspect of software engineering. In process and service management, consistency verification of behavioural models has manifold applications. For instance, a business process model used as system specification and a corresponding workflow model used as implementation have to be consistent. Another example would be the analysis to what degree a process log of executed business operations is consistent with the corresponding normative process model. Typically, existing notions of behaviour equivalence, such as bisimulation and trace equivalence, are applied as consistency notions. Still, checking these notions requires exponential computation and yields only a Boolean result. In many cases, however, a quantification of behavioural deviation is needed along with concepts to isolate the source of deviation. In this article, we propose causal behavioural profiles as the basis for a consistency notion. These profiles capture essential behavioural information, such as order, exclusiveness, and causality between pairs of activities of a process model. Consistency based on these profiles is weaker than trace equivalence, but can be computed efficiently for a broad class of models. In this article, we introduce techniques for the computation of causal behavioural profiles using structural decomposition techniques for sound free-choice workflow systems if unstructured net fragments are acyclic or can be traced back to S- or T-nets. We also elaborate on the findings of applying our technique to three industry model collections.
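
In the spirit of behavioural profiles, though computed here from traces for brevity rather than from workflow systems as in the article, the relations follow from the weak (eventually-follows) order between activities; note the contrast with the directly-follows footprint sketched earlier:

```python
def behavioural_profile(log):
    """Classify distinct activity pairs by the weak order (x eventually
    before y) observed in traces: strict order, exclusiveness, interleaving."""
    weak = {(t[i], t[j]) for t in log
            for i in range(len(t)) for j in range(i + 1, len(t))}
    acts = {a for t in log for a in t}
    profile = {}
    for x in acts:
        for y in acts:
            if x == y:
                continue
            fwd, bwd = (x, y) in weak, (y, x) in weak
            profile[x, y] = ("interleaving" if fwd and bwd else
                             "strict order" if fwd else
                             "reverse strict order" if bwd else
                             "exclusiveness")
    return profile

log = [["a", "b", "c"], ["a", "c", "b"], ["a", "d"]]
p = behavioural_profile(log)
print(p["a", "b"], p["b", "c"], p["b", "d"])
# strict order, interleaving, exclusiveness
```
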
Article
Complex event processing (CEP) matches patterns over a continuous stream of events to detect situations of interest. Yet, the definition of an event pattern that precisely characterises a particular situation is challenging: there are manifold dimensions to correlate events, including time windows and value predicates. In the presence of historic event data that is labelled with the situation to detect, event patterns can be learned automatically. To cope with the combinatorial explosion of pattern candidates, existing approaches work on a type-level and discover patterns based on predefined event abstractions, aka event types. Hence, discovery is limited to patterns of a fixed granularity and users face the burden to manually select appropriate event abstractions. We present IL-Miner, a system that discovers event patterns by genuinely working on the instance-level, not assuming a priori knowledge on event abstractions. In a multi-phase process, IL-Miner first identifies relevant abstractions for the construction of event patterns. The set of events explored for pattern discovery is thereby reduced, while still providing formal guarantees on correctness, minimality, and completeness of the discovery result. Experiments using real-world datasets from diverse domains show that IL-Miner discovers a much broader range of event patterns compared to the state-of-the-art in the field.
Book
This is the second edition of Wil van der Aalst’s seminal book on process mining, which now discusses the field also in the broader context of data science and big data approaches. It includes several additions and updates, e.g. on inductive mining techniques, the notion of alignments, a considerably expanded section on software tools and a completely new chapter of process mining in the large. It is self-contained, while at the same time covering the entire process-mining spectrum from process discovery to predictive analytics. After a general introduction to data science and process mining in Part I, Part II provides the basics of business process modeling and data mining necessary to understand the remainder of the book. Next, Part III focuses on process discovery as the most important process mining task, while Part IV moves beyond discovering the control flow of processes, highlighting conformance checking, and organizational and time perspectives. Part V offers a guide to successfully applying process mining in practice, including an introduction to the widely used open-source tool ProM and several commercial products. Lastly, Part VI takes a step back, reflecting on the material presented and the key open challenges. Overall, this book provides a comprehensive overview of the state of the art in process mining. It is intended for business process analysts, business consultants, process managers, graduate students, and BPM researchers.
Conference Paper
Process mining is a rapidly developing field that aims at automated modeling of business processes based on data coming from event logs. In recent years, advances in tracking technologies, e.g., Real-Time Locating Systems (RTLS), put forward the ability to log business process events as location sensor data. To apply process mining techniques to such sensor data, one needs to overcome an abstraction gap, because location data recordings do not relate to the process directly. In this work, we solve the problem of mapping sensor data to event logs based on process knowledge. Specifically, we propose interactions as an intermediate knowledge layer between the sensor data and the event log. We solve the mapping problem via optimal matching between interactions and process instances. An empirical evaluation of our approach shows its feasibility and provides insights into the relation between ambiguities and deviations from process knowledge on the one hand, and the accuracy of the resulting event log on the other.
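
The optimal matching step can be illustrated with a standard assignment solver (the cost matrix below is hypothetical; deriving it from mismatches in time and location between sensor data and process knowledge is where the actual approach does its work):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical cost of assigning each sensed interaction (rows) to each
# candidate process instance (columns).
cost = np.array([
    [0.1, 0.9, 0.8],
    [0.7, 0.2, 0.9],
    [0.8, 0.6, 0.3],
])
rows, cols = linear_sum_assignment(cost)  # minimum-cost assignment
print(cols.tolist())  # [0, 1, 2]: interaction i maps to instance cols[i]
```
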
Article
In recent years, monitoring the compliance of business processes with relevant regulations, constraints, and rules during runtime has evolved into a major concern in literature and practice. Monitoring not only refers to continuously observing possible compliance violations, but also includes the ability to provide fine-grained feedback and to predict possible compliance violations in the future. The body of literature on business process compliance is large and approaches specifically addressing process monitoring are hard to identify. Moreover, proper means for the systematic comparison of these approaches are missing. Hence, it is unclear which approaches are suitable for particular scenarios. The goal of this paper is to define a framework for Compliance Monitoring Functionalities (CMF) that enables the systematic comparison of existing and new approaches for monitoring compliance rules over business processes during runtime. To define the scope of the framework, at first, related areas are identified and discussed. The CMFs are harvested based on a systematic literature review and five selected case studies. The appropriateness of the selection of CMFs is demonstrated in two ways: (a) a systematic comparison with pattern-based compliance approaches and (b) a classification of existing compliance monitoring approaches using the CMFs. Moreover, the application of the CMFs is showcased using three existing tools that are applied to two realistic data sets. Overall, the CMF framework provides a powerful means to position existing and future compliance monitoring approaches.
Article
A growing number of enterprises use complex event processing for monitoring and controlling their operations, while business process models are used to document working procedures. In this work, we propose a comprehensive method for complex event processing optimization using business process models. Our proposed method is based on the extraction of behavioral constraints that are used, in turn, to rewrite patterns for event detection, and select and transform execution plans. We offer a set of rewriting rules that is shown to be complete with respect to the all, seq, and any patterns. The effectiveness of our method is demonstrated in an experimental evaluation with a large number of processes from an insurance company. We illustrate that the proposed optimization leads to significant savings in query processing. By integrating the optimization in state-of-the-art systems for event pattern matching, we demonstrate that these savings materialize in different technical infrastructures and can be combined with existing optimization techniques.
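
A toy version of such constraint-based rewriting (the paper's rule set and pattern algebra are considerably richer): if the process model guarantees that one activity always precedes another, an order-insensitive all pattern can be evaluated as the cheaper seq pattern.

```python
def rewrite(pattern, order_constraints):
    """Rewrite all(A, B) into seq(A, B) when the process model guarantees
    that A always occurs before B; a toy form of constraint-based rewriting."""
    op, a, b = pattern
    if op == "all" and (a, b) in order_constraints:
        return ("seq", a, b)
    if op == "all" and (b, a) in order_constraints:
        return ("seq", b, a)
    return pattern

constraints = {("receive", "pay")}  # order constraint extracted from the model
print(rewrite(("all", "pay", "receive"), constraints))
# -> ('seq', 'receive', 'pay')
```
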
Article
A key aspect in any process-oriented organisation is the evaluation of process performance for the achievement of its strategic and operational goals. Process Performance Indicators (PPIs) are a key asset to carry out this evaluation, and, therefore, having an appropriate definition of these PPIs is crucial. After a careful review of the related literature and a study of the current picture in different real organisations, we conclude that no existing proposal allows PPIs to be defined in a way that is unambiguous and highly expressive, understandable by technical and non-technical users, and traceable to the Business Process (BP). In addition, like other activities carried out during the BP lifecycle, the management of PPIs is considered time-consuming and error-prone. Therefore, providing automated support for it is very appealing from a practical point of view. In this paper, we propose the PPINOT metamodel, which allows such an advanced definition of PPIs and is independent of the language used to model the business process. Furthermore, we provide an automatic semantic mapping from the metamodel to Description Logics (DL) that allows the implementation of design-time analysis operations in such a way that DL reasoners' facilities can be leveraged. These operations provide information that can assist process analysts in the definition and instrumentation of PPIs. Finally, to validate the usefulness of our proposal, we have used the PPINOT metamodel at the core of a software tool called the PPINOT Tool Suite and we have applied it in several real scenarios.
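
How a time-measure PPI might be evaluated over a log can be sketched as follows (the log format and the aggregation are assumptions; PPINOT defines such measures declaratively rather than as code):

```python
def cycle_time_ppi(log, start, end):
    """Average duration between two activities across traces: an illustrative
    evaluation of a PPINOT-style time measure over an assumed log format."""
    durations = []
    for trace in log:  # trace: list of (activity, timestamp-in-hours) pairs
        times = dict(trace)
        if start in times and end in times:
            durations.append(times[end] - times[start])
    return sum(durations) / len(durations)

log = [[("order", 0), ("deliver", 48)], [("order", 0), ("deliver", 24)]]
print(cycle_time_ppi(log, "order", "deliver"))  # 36.0 hours on average
```
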
Article
Some of the significant advancements and challenges faced in log analysis are discussed. The content and format of logs can vary widely from one system to another and among components within a system. The simplest and most common use for a debug log is to grep for a specific message. In many cases, however, it is difficult to figure out what to search for, as there is no well-defined mapping between log messages and observed symptoms. Logging usually implies some internal synchronization, which can complicate the debugging of multi-threaded systems by changing the thread-interleaving pattern and obscuring the problem. Another key observation is that a program behaves nondeterministically only at certain execution points, such as clock interrupts and I/O.
Conference Paper
Previous studies have presented convincing arguments that a frequent pattern mining algorithm should not mine all frequent patterns but only the closed ones, because the latter leads not only to a more compact yet complete result set but also to better efficiency. However, most of the previously developed closed pattern mining algorithms work under the candidate maintenance-and-test paradigm, which is inherently costly in both runtime and space usage when the support threshold is low or the patterns become long. We present BIDE, an efficient algorithm for mining frequent closed sequences without candidate maintenance. We adopt a novel sequence closure checking scheme called bidirectional extension, and prune the search space more deeply than previous algorithms by using the BackScan pruning method and the Scan-Skip optimization technique. A thorough performance study with both sparse and dense real-life data sets has demonstrated that BIDE significantly outperforms the previous algorithms: it consumes order(s) of magnitude less memory and can be more than an order of magnitude faster. It is also linearly scalable in terms of database size.
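
Not BIDE itself, which avoids candidate maintenance, but a naive baseline makes the notion of a closed frequent sequence concrete: a frequent sequence is closed if no frequent supersequence has the same support (a sketch for tiny inputs only; the enumeration is exponential):

```python
from itertools import combinations

def is_subseq(s, t):
    """True if s is a (not necessarily contiguous) subsequence of t."""
    it = iter(t)
    return all(x in it for x in s)

def closed_frequent_sequences(db, min_sup):
    """Naive closed sequential pattern mining: enumerate subsequences,
    keep the frequent ones, and drop any pattern that has a frequent
    supersequence with the same support. (BIDE avoids this enumeration.)"""
    candidates = {tuple(s[i] for i in idx)
                  for s in db for r in range(1, len(s) + 1)
                  for idx in combinations(range(len(s)), r)}
    support = {c: sum(is_subseq(c, s) for s in db) for c in candidates}
    frequent = {c: n for c, n in support.items() if n >= min_sup}
    return {c: n for c, n in frequent.items()
            if not any(c != t and is_subseq(c, t) and m == n
                       for t, m in frequent.items())}

db = [["a", "b", "c"], ["a", "b"], ["a", "c"]]
print(sorted(closed_frequent_sequences(db, 2).items()))
# -> [(('a',), 3), (('a', 'b'), 2), (('a', 'c'), 2)]
```
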
Business Process Querying
  • A. Polyvyanyy