Matthias Weidlich

Humboldt-Universität zu Berlin | HU Berlin · Department of Computer Science

About

236
Publications
102,141
Reads
6,026
Citations
Additional affiliations
November 2013 - March 2015
Imperial College London
Position
  • Research Associate
Education
May 2008 - September 2011
Hasso Plattner Institute
Field of study
  • Computer Science
April 2007 - July 2008
Hasso Plattner Institute
Field of study
  • Computer Science
September 2005 - March 2006
Ecole d'Ingenieurs des Technologies de l'Information et du Management
Field of study
  • Computer Science

Publications (236)
Conference Paper
Full-text available
Process mining is a set of techniques to analyze business processes based on event logs extracted from information systems. Existing process mining techniques are designed for intra-organizational settings, as they assume that the entire event log of a process is available for analysis at once. In an inter-organizational process, each party only ha...
Technical Report
Full-text available
The propagation of rumours on social media poses an important threat to societies, so that various techniques for rumour detection have been proposed recently. Yet, existing work focuses on what entities constitute a rumour, but provides little support to understand why the entities have been classified as such. This prevents an effective evaluatio...
Preprint
Full-text available
The propagation of rumours on social media poses an important threat to societies, so that various techniques for rumour detection have been proposed recently. Yet, existing work focuses on what entities constitute a rumour, but provides little support to understand why the entities have been classified as such. This prevents an effec...
Article
The propagation of rumours on social media poses an important threat to societies, so that various techniques for rumour detection have been proposed recently. Yet, existing work focuses on what entities constitute a rumour, but provides little support to understand why the entities have been classified as such. This prevents an effective evaluatio...
Article
Full-text available
Today’s social networks continuously generate massive streams of data, which provide a valuable starting point for the detection of rumours as soon as they start to propagate. However, rumour detection faces tight latency bounds, which cannot be met by contemporary algorithms, given the sheer volume of high-velocity streaming data emitted by social...
Technical Report
Full-text available
Social networks continuously generate massive streams of data very fast. Such high-velocity streams exceed any reasonable limit for any rumour detection algorithm in terms of latency. Indeed, the input buffer of a rumour detector is significantly small compared to the whole social network, and its detection latency is extremely important: rumours...
Article
Full-text available
Predictive process monitoring is a family of techniques to analyze events produced during the execution of a business process in order to predict the future state or the final outcome of running process instances. Existing techniques in this field are able to predict, at each step of a process instance, the likelihood that it will lead to an undesi...
Technical Report
Full-text available
Hundreds of thousands of rumours emerge every day. Algorithmic models shall therefore support users of social platforms and provide alerts to prevent users from accidentally spreading rumours. However, existing alerting mechanisms are limited to post-hoc classification, and rumours are often detected after the damage has been done. This paper prese...
Chapter
To enable process analysis based on an event log without compromising the privacy of individuals involved in process execution, a log may be anonymized. Such anonymization strives to transform a log so that it satisfies provable privacy guarantees, while largely maintaining its utility for process analysis. Existing techniques perform anonymization...
Chapter
Business Process Management targets the design, execution, and optimization of business operations. This includes techniques for process querying, i.e., methods to filter and transform business process representations. Some of these representations may assume the form of event data, with an event denoting an execution of an activity as part of a sp...
Preprint
Process mining enables the analysis of complex systems using event data recorded during the execution of processes. Specifically, models of these processes can be discovered from event logs, i.e., sequences of events. However, the recorded events are often too fine-granular and result in unstructured models that are not meaningful for analysis. Log...
Article
Graph embedding aims at learning a vector-based representation of vertices that incorporates the structure of the graph. This representation then enables inference of graph properties. Existing graph embedding techniques, however, do not scale well to large graphs. While several techniques to scale graph embedding using compute clusters have been p...
Article
Today’s scientific data analysis very often requires complex Data Analysis Workflows (DAWs) executed over distributed computational infrastructures, e.g., clusters. Much research effort is devoted to the tuning and performance optimization of specific workflows for specific clusters. However, an arguably even more important problem for accelerating...
Conference Paper
The comparison of a model of a process against event data recorded during its execution, known as conformance checking, is an important means in process analysis. Yet, common conformance checking techniques are computationally expensive, which makes a complete analysis infeasible for large logs. To mitigate this problem, existing techniques levera...
Preprint
Privacy-preserving process mining enables the analysis of business processes using event logs, while giving guarantees on the protection of sensitive information on process stakeholders. To this end, existing approaches add noise to the results of queries that extract properties of an event log, such as the frequency distribution of trace variants,...
Conference Paper
Full-text available
The behavioural comparison of systems is an important concern of software engineering research. For example, the areas of specification discovery and specification mining are concerned with measuring the consistency between a collection of execution traces and a program specification. This problem is also tackled in process mining with the help of...
Conference Paper
Full-text available
This paper presents the idea of a compendium of process technologies, i.e., a concise but comprehensive collection of techniques for process model analysis that support research on the design, execution, and evaluation of processes. The idea originated from observations on the evolution of process-related research disciplines. Based on these observ...
Preprint
To enable process analysis based on an event log without compromising the privacy of individuals involved in process execution, a log may be anonymized. Such anonymization strives to transform a log so that it satisfies provable privacy guarantees, while largely maintaining its utility for process analysis. Existing techniques perform anonymization...
Chapter
Process performance indicators (PPIs) are metrics to quantify the degree to which organizational goals defined based on business processes are fulfilled. They exploit the event logs recorded by information systems during the execution of business processes, thereby providing a basis for process monitoring and subsequent optimization. However, PPI...
Preprint
Full-text available
Process performance indicators (PPIs) are metrics to quantify the degree to which organizational goals defined based on business processes are fulfilled. They exploit the event logs recorded by information systems during the execution of business processes, thereby providing a basis for process monitoring and subsequent optimization. However, PPI...
Conference Paper
Process performance indicators (PPIs) are metrics to quantify the degree to which organizational goals defined based on business processes are fulfilled. They exploit the event logs recorded by information systems during the execution of business processes, thereby providing a basis for process monitoring and subsequent optimization. However, PPI...
Article
Complex event processing (CEP) evaluates queries over streams of event data to detect situations of interest. If the event data are produced by geographically distributed sources, CEP may exploit in-network processing that distributes the evaluation of a query among the nodes of a network. To this end, a query is modularized and individual query op...
Article
The heterogeneity of today's Web sources requires information retrieval (IR) systems to handle multi-modal queries. Such queries define a user's information needs by different data modalities, such as keywords, hashtags, user profiles, and other media. Recent IR systems answer such a multi-modal query by considering it as a set of separate uni-moda...
Article
Queries to detect isomorphic subgraphs are important in graph-based data management. While the problem of subgraph isomorphism search has received considerable attention for the static setting of a single query, or a batch thereof, existing approaches do not scale to a dynamic setting of a continuous stream of queries. In this paper, we address the...
Book
This book constitutes the thoroughly reviewed and selected short papers presented at the 25th East-European Conference on Advances in Databases and Information Systems, ADBIS 2021, as well as papers presented at the doctoral consortium and ADBIS 2021 workshops. Due to the COVID-19 pandemic, the conference and satellite events were held in hybrid mode. The 11 full p...
Technical Report
Full-text available
The heterogeneity of today's Web sources requires information retrieval (IR) systems to handle multi-modal queries. Such queries define a user's information needs by different data modalities, such as keywords, hashtags, user profiles, and other media. Recent IR systems answer such a multi-modal query by considering it as a set of separate uni-moda...
Technical Report
Full-text available
Queries to detect isomorphic subgraphs are important in graph-based data management. While the problem of subgraph isomorphism search has received considerable attention for the static setting of a single query, or a batch thereof, existing approaches do not scale to a dynamic setting of a continuous stream of queries. In this paper, we address the...
Chapter
Conformance checking has received increasing attention in recent years. This is due to several reasons, which can be summarized as two: the explosion of digital information that talks about processes, and the need to use this data in order to monitor and improve processes in organizations. Naturally, conformance checking addresses this by provid...
Chapter
Implementing regulatory documents is a recurring, mostly manual and time-consuming task for companies. To establish and ensure regulatory compliance, constraints need to be extracted from the documents and integrated into process models capturing existing operational practices. Since regulatory documents and processes are subject to frequent change...
Article
Full-text available
The Internet of Things (IoT) refers to a network of connected devices that collects and exchanges data through the Internet. These things can be artificial or natural and interact as autonomous agents that form a complex system. In turn, business process management (BPM) was established to analyze, discover, design, implement, execute, monitor, and...
Article
Conformance checking enables organizations to automatically assess whether their business processes are executed according to their specification. State-of-the-art conformance checking algorithms perform this task by establishing alignments between behaviour recorded by IT systems to a process model capturing desired behaviour. While such alignment...
Chapter
Event logs capture the execution of business processes in terms of executed activities and their execution context. Since logs contain potentially sensitive information about the individuals involved in the process, they should be pre-processed before being published to preserve the individuals’ privacy. However, existing techniques for such pre-pr...
Conference Paper
Process mining is no longer limited to the one-off analysis of static event logs extracted from a single enterprise system. Rather, process mining may strive for immediate insights based on streams of events that are continuously generated by diverse information systems. This requires online algorithms that, instead of keeping the whole history of...
Preprint
Full-text available
This paper presents a command-line tool, called Entropia, that implements a family of conformance checking measures for process mining founded on the notion of entropy from information theory. The measures allow quantifying classical non-deterministic and stochastic precision and recall quality criteria for process models automatically discovered f...
Preprint
While supporting the execution of business processes, information systems record event logs. Conformance checking relies on these logs to analyze whether the recorded behavior of a process conforms to the behavior of a normative specification. A key assumption of existing conformance checking techniques, however, is that all events are associated w...
Preprint
Event logs capture the execution of business processes in terms of executed activities and their execution context. Since logs contain potentially sensitive information about the individuals involved in the process, they should be pre-processed before being published to preserve the individuals' privacy. However, existing techniques for such pre-pr...
Conference Paper
Event logs capture the execution of business processes in terms of executed activities and their execution context. Since logs contain potentially sensitive information about the individuals involved in the process, they should be pre-processed before being published to preserve the individuals' privacy. However, existing techniques for such pre-pr...
Conference Paper
Event logs recorded during the execution of business processes constitute a valuable source of information. Applying process mining techniques to them, event logs may reveal the actual process execution and enable reasoning on quantitative or qualitative process properties. However, event logs often contain sensitive information that could be relat...
Chapter
Full-text available
Event logs recorded during the execution of business processes constitute a valuable source of information. Applying process mining techniques to them, event logs may reveal the actual process execution and enable reasoning on quantitative or qualitative process properties. However, event logs often contain sensitive information that could be relat...
Article
Full-text available
The behavioural comparison of systems is an important concern of software engineering research. For example, the areas of specification discovery and specification mining are concerned with measuring the consistency between a collection of execution traces and a program specification. This problem is also tackled in process mining with the help of...
Article
While supporting the execution of business processes, information systems record event logs. Conformance checking relies on these logs to analyze whether the recorded behavior of a process conforms to the behavior of a normative specification. A key assumption of existing conformance checking techniques, however, is that all events are associated w...
Conference Paper
Process mining is a family of techniques for analyzing business processes based on event logs extracted from information systems. Mainstream process mining tools are designed for intra-organizational settings, insofar as they assume that an event log is available for processing as a whole. The use of such tools for inter-organizational process anal...
Preprint
Full-text available
The open nature of the Web enables users to produce and propagate any content without authentication, which has been exploited to spread thousands of unverified claims via millions of online documents. Maintenance of credible knowledge bases thus has to rely on fact checking that constructs a trusted set of facts through credibility assessment. Due...
Article
Recognising patterns that correlate multiple events over time becomes increasingly important in applications that exploit the Internet of Things, reaching from urban transportation through surveillance monitoring to business workflows. In many real-world scenarios, however, timestamps of events may be erroneously recorded, and events may be dropped...
Article
Full-text available
Process mining provides a rich set of techniques to discover valuable knowledge of business processes based on data that was recorded in different types of information systems. It enables analysis of end‐to‐end processes to facilitate process re‐engineering and process improvement. Process mining techniques rely on the availability of data in the f...
Article
Discovery plays a key role in data-driven analysis of business processes. The vast majority of contemporary discovery algorithms aim at the identification of control-flow constructs. The increase in data richness, however, enables discovery that incorporates the context of process execution beyond the control-flow perspective. A “control-flow firs...
Preprint
Full-text available
Event logs recorded during the execution of business processes constitute a valuable source of information. Applying process mining techniques to them, event logs may reveal the actual process execution and enable reasoning on quantitative or qualitative process properties. However, event logs often contain sensitive information that could be relat...
Preprint
Full-text available
Network alignment is the problem of pairing nodes between two graphs such that the paired nodes are structurally and semantically similar. A well-known application of network alignment is to identify which accounts in different social networks belong to the same person. Existing alignment techniques, however, lack scalability, cannot incorporate mu...
Article
Full-text available
Privacy regulations for data can be seen as a major driver for data sovereignty measures. A specific example for that is the case of event data that is recorded by information systems during the processing of entities in domains such as e-commerce or healthcare. Since such data, typically available in the form of event log files, contains personali...
Preprint
Full-text available
Graph embedding aims at learning a vector-based representation of vertices that incorporates the structure of the graph. This representation then enables inference of graph properties. Existing graph embedding techniques, however, do not scale well to large graphs. We therefore propose a framework for parallel computation of a graph embedding using...
Conference Paper
Full-text available
The increasing volume of event data that is recorded by information systems during the execution of business processes creates manifold opportunities for process analytics. Specifically, conformance checking compares the behaviour as recorded by an information system to a model of desired behaviour. Unfortunately, state-of-the-art conformance chec...
Poster
Full-text available
The privacy of an organization’s workers represents a crucial concern in process mining settings, where data on an individual’s performance is recorded and possibly shared for analysis. To enable users to appropriately deal with privacy concerns in process mining, this paper introduces ELPaaS (Event Log Privacy as a Service), a web application th...
Article
Full-text available
Time prediction is an essential component of decision making in various Artificial Intelligence application areas, including transportation systems, healthcare, and manufacturing. Predictions are required for efficient resource allocation and scheduling, optimized routing, and temporal action planning. In this work, we focus on time prediction in c...
Conference Paper
Full-text available
The privacy of an organization's workers represents a crucial concern in process mining settings, where data on an individual's performance is recorded and possibly shared for analysis. To enable users to appropriately deal with privacy concerns in process mining, this paper introduces ELPaaS (Event Log Privacy as a Service), a web application that...