About
255 Publications · 125,869 Reads
8,135 Citations
Introduction

Additional affiliations
November 2013 - March 2015

Education
May 2008 - September 2011
April 2007 - July 2008
September 2005 - March 2006
Ecole d'Ingenieurs des Technologies de l'Information et du Management
Field of study: Computer Science
Publications (255)
The continued success of Large Language Models (LLMs) and other generative artificial intelligence approaches highlights the advantages that large information corpora can have over rigidly defined symbolic models, but also serves as a proof-point of the challenges that purely statistics-based approaches have in terms of safety and trustworthiness....
Schedules define how resources process jobs in diverse domains, reaching from healthcare to transportation, and, therefore, denote a valuable starting point for analysis of the underlying system. However, publishing a schedule may disclose private information on the considered jobs. In this paper, we provide a first threat model for published sched...
Maximal subgraph mining is increasingly important in various domains, including bioinformatics, genomics, and chemistry, as it helps identify common characteristics among a set of graphs and enables their classification into different categories. Existing approaches for identifying maximal subgraphs typically rely on traversing a graph lattice. How...
Complex event processing (CEP) detects situations of interest by evaluating queries over event streams. Once CEP is used in networked applications, the distribution of query evaluation among the event sources enables performance optimization. Instead of collecting all events at one location for query evaluation, sub-queries are placed at network no...
Anonymization of event logs facilitates process mining while protecting sensitive information of process stakeholders. Existing techniques, however, focus on the privatization of the control-flow. Other process perspectives, such as roles, resources, and objects are neglected or subject to randomization, which breaks the dependencies between the pe...
Event logs recorded during the execution of business processes provide a valuable starting point for operational monitoring, analysis, and improvement. Specifically, measures that quantify any deviation between the recorded operations and organizational goals enable the identification of operational issues. The data to compute such process-specific...
Advances in deep neural network (DNN) architectures have enabled new prediction techniques for stock market data. Unlike other multivariate time-series data, stock markets show two unique characteristics: (i) multi-order dynamics, as stock prices are affected by strong non-pairwise correlations (e.g., within the same industry); and (ii)...
Process Mining allows for the data-driven analysis of business processes based on logs that contain fine-granular data from the process' execution. However, such logs can potentially be exploited to extract sensitive information about process participants. To mitigate this risk, techniques that anonymize event logs to guarantee the privacy of proce...
We present Deep MinCut (DMC), an unsupervised approach to learn node embeddings for graph-structured data. It derives node representations based on their membership in communities. As such, the embeddings directly provide insights into the graph structure, so that a separate clustering step is no longer needed. DMC learns both node embeddings and...
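Since only the opening of this abstract is shown, the following Python snippet is a generic illustration of the kind of cut-based community objective that approaches like DMC build on, not the actual DMC model; the toy graph and soft assignment matrix are invented for the example.

    import numpy as np

    def soft_mincut_score(adj, assign):
        # Ratio of within-community edge weight to total community volume
        # for a soft assignment; a learner would maximize this (or minimize
        # its negation). A generic cut relaxation, not the DMC objective.
        degree = np.diag(adj.sum(axis=1))
        within = np.trace(assign.T @ adj @ assign)
        volume = np.trace(assign.T @ degree @ assign)
        return within / volume

    # Toy graph: two triangles connected by a single bridge edge.
    A = np.zeros((6, 6))
    for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
        A[i, j] = A[j, i] = 1.0
    S = np.array([[1, 0]] * 3 + [[0, 1]] * 3, dtype=float)
    print(soft_mincut_score(A, S))  # close to 1 for a clean two-community split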
Traditional residential incentive-based demand response (DR) programs use fixed incentive structures that do not incorporate closed-loop feedback to compensate for non-compliance by participants. In practice, such programs may not reliably meet their event goals. To address this challenge, real-time feedback can be used to adaptively modify the par...
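As a minimal sketch of such closed-loop feedback (assuming a simple proportional controller; the gain, bounds, and numbers are invented and not the paper's actual control scheme), the incentive could be adapted after each event based on the observed shortfall:

    def update_incentive(incentive, target_kw, achieved_kw, gain=0.05,
                         min_incentive=0.0, max_incentive=2.0):
        # Proportional feedback: raise the per-kWh incentive when the
        # achieved load reduction falls short of the event target, and
        # lower it otherwise. Constants are illustrative assumptions.
        shortfall = target_kw - achieved_kw
        incentive += gain * shortfall / max(target_kw, 1e-9)
        return min(max_incentive, max(min_incentive, incentive))

    # Example: an event targeting 100 kW of reduction achieves only 70 kW,
    # so the incentive is nudged upward for the next interval.
    print(update_incentive(incentive=0.50, target_kw=100.0, achieved_kw=70.0))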
Process mining is a set of techniques to analyze business processes based on event logs extracted from information systems. Existing process mining techniques are designed for intra-organizational settings, as they assume that the entire event log of a process is available for analysis at once. In an intra-organizational process, each party only ha...
The propagation of rumours on social media poses a serious threat to societies, and various techniques for rumour detection have therefore been proposed recently. Yet, existing work focuses on what entities constitute a rumour, but provides little support to understand why the entities have been classified as such. This prevents an effective evaluatio...
By relating observed and modelled behaviour, conformance checking unleashes the full power of process mining. Techniques from this discipline enable the analysis of the quality of a process model discovered from event data, the identification of potential deviations, and the projection of real traces onto process models. This way, the insights gain...
Today’s social networks continuously generate massive streams of data, which provide a valuable starting point for the detection of rumours as soon as they start to propagate. However, rumour detection faces tight latency bounds, which cannot be met by contemporary algorithms, given the sheer volume of high-velocity streaming data emitted by social...
Social networks continuously generate massive streams of data at very high velocity. Such streams exceed any reasonable latency limit for rumour detection algorithms. Indeed, the input buffer of a rumour detector is significantly smaller than the whole social network, and the detection latency is extremely important: rumours...
Predictive process monitoring is a family of techniques to analyze events produced during the execution of a business process in order to predict the future state or the final outcome of running process instances. Existing techniques in this field are able to predict, at each step of a process instance, the likelihood that it will lead to an undesi...
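For illustration, a minimal outcome-prediction pipeline might encode running-case prefixes and train an off-the-shelf classifier. The bag-of-activities encoding, the toy activity alphabet, and the labels below are assumptions for the sketch, not the technique proposed in the paper.

    from collections import Counter
    from sklearn.ensemble import RandomForestClassifier

    ACTIVITIES = ["create", "approve", "reject", "ship"]  # toy alphabet

    def encode_prefix(prefix):
        # Bag-of-activities encoding of a running case's activity prefix;
        # one of many possible encodings, chosen here only for brevity.
        counts = Counter(prefix)
        return [counts[a] for a in ACTIVITIES]

    # Toy labelled prefixes: 1 = the case ended in an undesired outcome.
    train = [
        (["create", "approve", "ship"], 0),
        (["create", "approve"], 0),
        (["create", "reject"], 1),
        (["create", "reject", "create"], 1),
    ]
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit([encode_prefix(p) for p, _ in train], [y for _, y in train])

    # Estimated probability that a running case with this prefix ends badly.
    print(clf.predict_proba([encode_prefix(["create", "reject"])])[0][1])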
Hundreds of thousands of rumours emerge every day. Algorithmic models shall therefore support users of social platforms and provide alerts to prevent users from accidentally spreading rumours. However, existing alerting mechanisms are limited to post-hoc classification, and rumours are often detected after the damage has been done. This paper prese...
To enable process analysis based on an event log without compromising the privacy of individuals involved in process execution, a log may be anonymized. Such anonymization strives to transform a log so that it satisfies provable privacy guarantees, while largely maintaining its utility for process analysis. Existing techniques perform anonymization...
Business Process Management targets the design, execution, and optimization of business operations. This includes techniques for process querying, i.e., methods to filter and transform business process representations. Some of these representations may assume the form of event data, with an event denoting an execution of an activity as part of a sp...
Process mining enables the analysis of complex systems using event data recorded during the execution of processes. Specifically, models of these processes can be discovered from event logs, i.e., sequences of events. However, the recorded events are often too fine-granular and result in unstructured models that are not meaningful for analysis. Log...
Graph embedding aims at learning a vector-based representation of vertices that incorporates the structure of the graph. This representation then enables inference of graph properties. Existing graph embedding techniques, however, do not scale well to large graphs. While several techniques to scale graph embedding using compute clusters have been p...
Today’s scientific data analysis very often requires complex Data Analysis Workflows (DAWs) executed over distributed computational infrastructures, e.g., clusters. Much research effort is devoted to the tuning and performance optimization of specific workflows for specific clusters. However, an arguably even more important problem for accelerating...
The comparison of a model of a process against event data recorded during its execution, known as conformance checking, is an important means in process analysis. Yet, common conformance checking techniques are computationally expensive, which makes a complete analysis infeasible for large logs. To mitigate this problem, existing techniques levera...
Privacy-preserving process mining enables the analysis of business processes using event logs, while giving guarantees on the protection of sensitive information on process stakeholders. To this end, existing approaches add noise to the results of queries that extract properties of an event log, such as the frequency distribution of trace variants,...
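A common building block behind such noise-based guarantees is the Laplace mechanism from differential privacy. The sketch below applies it to a trace-variant frequency distribution; the variant names, the epsilon value, and the unit sensitivity (each individual contributing at most one trace) are assumptions for illustration, not the exact mechanism of these approaches.

    import numpy as np

    def laplace_noisy_counts(variant_counts, epsilon=1.0, sensitivity=1.0):
        # Add Laplace noise to each trace-variant count; with sensitivity 1
        # (each individual contributes at most one trace), the released
        # frequency distribution satisfies epsilon-differential privacy.
        rng = np.random.default_rng()
        return {
            variant: max(0, round(count + rng.laplace(0.0, sensitivity / epsilon)))
            for variant, count in variant_counts.items()
        }

    # Hypothetical variants of a simple order-handling process.
    counts = {"create, approve, ship": 120, "create, reject": 35}
    print(laplace_noisy_counts(counts, epsilon=0.5))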
The behavioural comparison of systems is an important concern of software engineering research. For example, the areas of specification discovery and specification mining are concerned with measuring the consistency between a collection of execution traces and a program specification. This problem is also tackled in process mining with the help of...
This paper presents the idea of a compendium of process technologies, i.e., a concise but comprehensive collection of techniques for process model analysis that support research on the design, execution, and evaluation of processes. The idea originated from observations on the evolution of process-related research disciplines. Based on these observ...
Process performance indicators (PPIs) are metrics to quantify the degree to which organizational goals defined based on business processes are fulfilled. They exploit the event logs recorded by information systems during the execution of business processes, thereby providing a basis for process monitoring and subsequent optimization. However, PPI...
Queries to detect isomorphic subgraphs are important in graph-based data management. While the problem of subgraph isomorphism search has received considerable attention for the static setting of a single query, or a batch thereof, existing approaches do not scale to a dynamic setting of a continuous stream of queries. In this paper, we address the...
Complex event processing (CEP) evaluates queries over streams of event data to detect situations of interest. If the event data are produced by geographically distributed sources, CEP may exploit in-network processing that distributes the evaluation of a query among the nodes of a network. To this end, a query is modularized and individual query op...
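As a toy illustration of in-network operator placement (a greedy heuristic over an invented three-node topology, not the optimization developed in the paper), each operator can be assigned to the node that minimizes the transfer cost of its input streams:

    def place_operators(operators, nodes, cost):
        # Greedily place each operator at the node that minimizes the total
        # transfer cost of its input streams. Real placement algorithms also
        # weigh operator selectivity, node load, and reconfiguration costs.
        return {
            op: min(nodes, key=lambda n: sum(cost(src, n) for src in sources))
            for op, sources in operators.items()
        }

    # Invented three-node line topology A - B - C with per-hop unit cost.
    hops = {("A", "B"): 1, ("B", "A"): 1, ("B", "C"): 1, ("C", "B"): 1,
            ("A", "C"): 2, ("C", "A"): 2}
    cost = lambda s, d: 0 if s == d else hops[(s, d)]
    print(place_operators({"SEQ(x, y)": ["A", "C"]}, ["A", "B", "C"], cost))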
The heterogeneity of today's Web sources requires information retrieval (IR) systems to handle multi-modal queries. Such queries define a user's information needs by different data modalities, such as keywords, hashtags, user profiles, and other media. Recent IR systems answer such a multi-modal query by considering it as a set of separate uni-moda...
This book constitutes thoroughly reviewed and selected short papers presented at the 25th East-European Conference on Advances in Databases and Information Systems, ADBIS 2021, as well as papers presented at the doctoral consortium and the ADBIS 2021 workshops. Due to COVID-19, the conference and satellite events were held in hybrid mode. The 11 full p...
Event logs that are recorded by information systems provide a valuable starting point for the analysis of processes in various domains, reaching from healthcare, through logistics, to e-commerce. Specifically, behavioral patterns discovered from an event log enable operational insights, even in scenarios where process execution is rather unstructur...
Conformance checking has been receiving increasing attention in recent years. This is due to several reasons, which can be summarized in two: the explosion of digital information describing processes, and the need to use this data in order to monitor and improve processes in organizations. Naturally, conformance checking addresses this by provid...
Implementing regulatory documents is a recurring, mostly manual and time-consuming task for companies. To establish and ensure regulatory compliance, constraints need to be extracted from the documents and integrated into process models capturing existing operational practices. Since regulatory documents and processes are subject to frequent change...
The Internet of Things (IoT) refers to a network of connected devices that collects and exchanges data through the Internet. These things can be artificial or natural and interact as autonomous agents that form a complex system. In turn, business process management (BPM) was established to analyze, discover, design, implement, execute, monitor, and...
Conformance checking enables organizations to automatically assess whether their business processes are executed according to their specification. State-of-the-art conformance checking algorithms perform this task by establishing alignments between behaviour recorded by IT systems and a process model capturing desired behaviour. While such alignment...
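To give a flavour of alignments: against a single linear run of a model, an optimal alignment reduces to an edit distance in which deletions are "log moves" and insertions are "model moves". The Python sketch below shows only this reduction; real conformance checkers align a trace against all runs of a process model.

    def alignment_cost(trace, model_run):
        # Optimal alignment cost between a recorded trace and ONE linear
        # run of a model: an edit distance where substitutions are
        # disallowed, so a mismatch costs a log move plus a model move.
        n, m = len(trace), len(model_run)
        d = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            d[i][0] = i
        for j in range(m + 1):
            d[0][j] = j
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                step = 0 if trace[i - 1] == model_run[j - 1] else 2
                d[i][j] = min(d[i - 1][j] + 1,         # log move
                              d[i][j - 1] + 1,         # model move
                              d[i - 1][j - 1] + step)  # synchronous move
        return d[n][m]

    # One model move is needed to explain the skipped "approve" activity.
    print(alignment_cost(["create", "ship"], ["create", "approve", "ship"]))  # 1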
Event logs capture the execution of business processes in terms of executed activities and their execution context. Since logs contain potentially sensitive information about the individuals involved in the process, they should be pre-processed before being published to preserve the individuals’ privacy. However, existing techniques for such pre-pr...
Process mining is no longer limited to the one-off analysis of static event logs extracted from a single enterprise system. Rather, process mining may strive for immediate insights based on streams of events that are continuously generated by diverse information systems. This requires online algorithms that, instead of keeping the whole history of...
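A minimal sketch of this idea, assuming events arrive as (case, activity) pairs: keep only the most recent activity per case and update directly-follows counts incrementally, rather than storing the whole history. Real online miners additionally age out stale cases and counts to bound memory; that part is omitted here.

    from collections import Counter

    class StreamingDirectlyFollows:
        # Incrementally counts directly-follows relations from an event
        # stream without storing full traces.
        def __init__(self):
            self.last = {}       # case id -> most recent activity
            self.df = Counter()  # (activity_a, activity_b) -> frequency

        def observe(self, case, activity):
            prev = self.last.get(case)
            if prev is not None:
                self.df[(prev, activity)] += 1
            self.last[case] = activity

    miner = StreamingDirectlyFollows()
    for case, act in [(1, "create"), (2, "create"), (1, "approve"), (1, "ship")]:
        miner.observe(case, act)
    print(miner.df)  # Counter({('create', 'approve'): 1, ('approve', 'ship'): 1})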
This paper presents a command-line tool, called Entropia, that implements a family of conformance checking measures for process mining founded on the notion of entropy from information theory. The measures allow quantifying classical non-deterministic and stochastic precision and recall quality criteria for process models automatically discovered f...
While supporting the execution of business processes, information systems record event logs. Conformance checking relies on these logs to analyze whether the recorded behavior of a process conforms to the behavior of a normative specification. A key assumption of existing conformance checking techniques, however, is that all events are associated w...
Event logs recorded during the execution of business processes constitute a valuable source of information. Applying process mining techniques to them, event logs may reveal the actual process execution and enable reasoning on quantitative or qualitative process properties. However, event logs often contain sensitive information that could be relat...
Process mining is a family of techniques for analyzing business processes based on event logs extracted from information systems. Mainstream process mining tools are designed for intra-organizational settings, insofar as they assume that an event log is available for processing as a whole. The use of such tools for inter-organizational process anal...
The open nature of the Web enables users to produce and propagate any content without authentication, which has been exploited to spread thousands of unverified claims via millions of online documents. Maintenance of credible knowledge bases thus has to rely on fact checking that constructs a trusted set of facts through credibility assessment. Due...
Recognising patterns that correlate multiple events over time becomes increasingly important in applications that exploit the Internet of Things, reaching from urban transportation through surveillance monitoring to business workflows. In many real-world scenarios, however, timestamps of events may be erroneously recorded, and events may be dropped...
Discovery plays a key role in data-driven analysis of business processes. The vast majority of contemporary discovery algorithms aims at the identification of control-flow constructs. The increase in data richness, however, enables discovery that incorporates the context of process execution beyond the control-flow perspective. A “control-flow firs...
Network alignment is the problem of pairing nodes between two graphs such that the paired nodes are structurally and semantically similar. A well-known application of network alignment is to identify which accounts in different social networks belong to the same person. Existing alignment techniques, however, lack scalability, cannot incorporate mu...
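As a toy baseline for intuition only (not the scalable, multi-modal technique the paper proposes), nodes of two graphs could be greedily paired by cosine similarity of precomputed node embeddings; all names and vectors below are hypothetical.

    import numpy as np

    def greedy_align(emb_g, emb_h):
        # Greedily pair nodes across two graphs by cosine similarity of
        # their embedding vectors, consuming each target node at most once.
        def cos(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        pairs, free_h = [], set(emb_h)
        for u, vec_u in emb_g.items():
            if not free_h:
                break
            v = max(free_h, key=lambda w: cos(vec_u, emb_h[w]))
            pairs.append((u, v))
            free_h.remove(v)
        return pairs

    emb_g = {"u1": np.array([1.0, 0.0]), "u2": np.array([0.0, 1.0])}
    emb_h = {"v1": np.array([0.9, 0.1]), "v2": np.array([0.1, 0.9])}
    print(greedy_align(emb_g, emb_h))  # [('u1', 'v1'), ('u2', 'v2')]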
Process mining provides a rich set of techniques to discover valuable knowledge of business processes based on data that was recorded in different types of information systems. It enables analysis of end-to-end processes to facilitate process re-engineering and process improvement. Process mining techniques rely on the availability of data in the f...
Privacy regulations for data can be seen as a major driver for data sovereignty measures. A specific example for that is the case of event data that is recorded by information systems during the processing of entities in domains such as e-commerce or healthcare. Since such data, typically available in the form of event log files, contains personali...