Book

Event Stream Processing with BeepBeep 3: Log Crunching and Analysis Made Easy

Abstract and Figures

Event logs and event streams can be found in software systems of very diverse kinds. For instance, workflow management systems and ERP platforms produce event logs in some common format based on XML. Financial transaction systems also keep a log of their operations in some standardized and documented format, as is the case for web servers such as Apache and Microsoft IIS. Network monitors also receive streams of packets whose various headers and fields can be analyzed. Recently, even the world of video games has seen an increasing trend towards the logging of players’ real-time activities. Analyzing the wealth of information contained in these logs can serve multiple purposes. Business process logs can be used to reconstruct a workflow based on a sample of its possible executions; financial database logs can be audited for compliance with regulations; suspicious or malicious activity can be detected by studying patterns in network or server logs. However, the available tools to process logs or streams of events are often large systems that are hard to set up, and even simple examples seem needlessly complicated. In this book, you will learn about BeepBeep, a versatile Java library intended to make the processing of event streams both fun and simple. Through more than a hundred simple, illustrated code examples, you will see how running event processing tasks can be done in just a few lines of code; what is more, it is code that you actually understand. From generating plots to computing statistics and evaluating temporal logic specifications, BeepBeep can prove a handy addition to a developer’s toolbox.
... Given a CSV file, this task extracts the numerical value in each line, computes the average of each set of n successive values and checks that it is below some threshold t (similar to the example we discussed earlier). Computing an aggregation over a sliding window is a common task in the field of event stream processing [26] and runtime verification [4], and is also provided by most statistical software, such as R's smooth package. It can be seen as a basic form of trend deviation detection [41], where the end result of the calculation is an "alarm" indicating that the expected trend has not been followed across the whole data file; a classical example of this is the detection of temperature peaks in a server rack [40]. ...
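The sliding-window task described in this excerpt can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the cited paper's implementation and not BeepBeep code: the function names, the choice of CSV column (`column=0`), and the strict comparison with t are all hypothetical.

```python
import csv
from collections import deque
from typing import Iterable, Iterator


def extract_values(lines: Iterable[str], column: int = 0) -> Iterator[float]:
    """Yield the number found in one column of each CSV line.

    The column index is an assumption; the excerpt does not say where
    the numerical value sits in each line.
    """
    for row in csv.reader(lines):
        if row:  # skip blank lines
            yield float(row[column])


def window_averages(values: Iterable[float], n: int) -> Iterator[float]:
    """Yield the average of each run of n successive values."""
    if n <= 0:
        raise ValueError("window width n must be positive")
    window = deque(maxlen=n)
    total = 0.0
    for value in values:
        if len(window) == n:
            total -= window[0]  # this value is evicted by the append below
        window.append(value)
        total += value
        if len(window) == n:
            yield total / n


def all_below(values: Iterable[float], n: int, t: float) -> bool:
    """Return True if every window average is strictly below threshold t."""
    return all(avg < t for avg in window_averages(values, n))
```

A call such as `all_below(extract_values(open("data.csv")), 3, 50.0)` would then play the role of the "alarm" check, with `data.csv` standing in for any input file. Note that an input with fewer than n values produces no window at all, so the check passes trivially.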
... Explainability functionalities could also easily be retrofitted into existing (Java) software, with minimal interference with their current code. Case in point, we already identified the Cornipickle web testing tool [24] and the BeepBeep event stream processing engine [26] as some of the first targets for the addition of explainability based on Petit Poucet. A lineage-aware version of the GRAL plotting library is also considered. ...
Chapter
Explainability is the process of linking part of the inputs given to a calculation to its output, in such a way that the selected inputs somehow “cause” the result. We establish the formal foundations of a notion of explainability for arbitrary abstract functions manipulating nested data structures. We then establish explanation relationships for a set of elementary functions, and for compositions thereof. A fully functional implementation of these concepts is finally presented and experimentally evaluated.
... These concepts have been concretely implemented as an extension to an event stream processing engine called BeepBeep [9], which is described in Section V. Experiments with a number of different scenarios show that the multi-monitor adds constant memory overhead and linear time overhead over an input trace, which means that it can scale to large traces and large monitors (10^6 events and more than 10^9 states). Furthermore, we show that some types of data degradation can only be accounted for in related works by an over-approximation of uncertainty, which has a significant negative impact on the precision of a monitor's verdict and its performance, compared to the finer modeling presented in this paper. ...
... An implementation of propositional machines has been realized in the form of a Java library that extends the BeepBeep event stream processing engine [9]. The library is open source and publicly available. ...
... These formal concepts have been concretely implemented in an existing event stream processing library, called BeepBeep [14], which has been extended such that the output produced by a query can be precisely traced back to the individual data elements of the log that contribute to (i.e. "explain") the result. ...
... A detailed description of BeepBeep is out of the scope of this paper, due to space restrictions. For further details, the reader is referred to a complete textbook describing the system [14]. ...
Conference Paper
Added value can be extracted from event logs generated by business processes in various ways. However, although complex computations can be performed over event logs, the result of such computations is often difficult to explain; in particular, it is hard to determine what parts of an input log actually matter in the production of that result. This paper describes a framework to provide explainable results for queries executed over sequences of events, where individual output values can be precisely traced back to the data elements of the log that contribute to (i.e. “explain”) the result. This framework has been implemented in the BeepBeep event processing engine and empirically evaluated on various queries.
... However, none of these systems consider the special problem of explainability for event stream processing; in contrast, existing event stream processing systems provide very little in the way of lineage and explainability, leaving a gap that needs to be filled. In this paper, we describe how an existing log processing library, called BeepBeep [16], can be extended in order to provide a form of explanation mechanism: the output produced by a query can be precisely traced back to the individual data elements of the log that contribute to (i.e. "explain") the result. ...
... A detailed description of BeepBeep is out of the scope of this paper, due to space restrictions. For further details, the reader is referred to a complete textbook describing the system [16]. ...
Preprint
Added value can be extracted from event logs generated by business processes in various ways. However, although complex computations can be performed over event logs, the result of such computations is often difficult to explain; in particular, it is hard to determine what parts of an input log actually matter in the production of that result. This paper describes how an existing log processing library, called BeepBeep, can be extended in order to provide a form of provenance: individual output events produced by a query can be precisely traced back to the data elements of the log that contribute to (i.e. "explain") the result.
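To make the idea of tracing an output event back to the log concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the mechanism implemented in BeepBeep: the query (maximum over a sliding window), the `Explained` record and the function name are all hypothetical. Each output carries the positions of the input events that "explain" it.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Explained(NamedTuple):
    """An output value together with the log positions that explain it."""
    value: float
    sources: Tuple[int, ...]


def window_max_with_lineage(log: Sequence[float], n: int) -> List[Explained]:
    """Compute the maximum of each window of n events, keeping lineage.

    For a maximum, only the positions holding the maximal value are
    reported as the explanation; other queries would need other rules.
    """
    if n <= 0:
        raise ValueError("window width n must be positive")
    results = []
    for start in range(len(log) - n + 1):
        window = log[start:start + n]
        best = max(window)
        sources = tuple(start + i for i, v in enumerate(window) if v == best)
        results.append(Explained(best, sources))
    return results
```

For the log `[3, 1, 4, 1, 5]` and `n = 2`, the second output is `Explained(value=4, sources=(2,))`: the value 4 is explained by the event at position 2 alone, and the event at position 1 plays no part in it.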
... An appealing advantage of such a setup is the capacity to query this blockchain for a variety of properties of interest that guarantee the correct and efficient operation of the supply chain. The second goal of this chapter, thus, is to define several kinds of such properties and implement them through the use of the BeepBeep stream monitor [35]. Firstly, we define what we call correctness properties. ...
... Over the past few years, BeepBeep has been involved in a variety of case studies [12,29,30,31,39,68], and provides built-in support for writing domain-specific languages [32]. A complete description of BeepBeep is out of the scope of this paper; the reader is referred to a recent tutorial [28] or to a recent textbook about the system [35]. ...
Chapter
The combination of the Internet of Things and blockchain-based technologies represents a real opportunity for supply chain and logistics protagonists, who need more dynamic, trustworthy and transparent tracking systems in order to improve their efficiency and strengthen customer confidence. In parallel, hyperconnected logistics promise more efficient and sustainable goods handling and delivery. This chapter shows how the Ethereum blockchain and smart contracts can be used to implement a shareable and secured tracking system for hyperconnected logistics. A simulation using the well-known AnyLogic software tool provides insights on the monitoring of properties depicting shipment lifecycle constraints through a stream of blockchain log events processed by BeepBeep 3, an open source stream processing engine.
... This pattern is a generic and high-level chain of basic event processing units, combined together to perform a precise computation. Although such a pattern can be implemented in any event stream processing system, in this paper, we focus on its definition using stream processors provided by the BeepBeep event stream engine [24]. First, we describe the static prediction workflow, which allows an event stream to be used as the basis for a prediction based on a static and predefined function, computed over a sliding window of recent events. ...
... Over the past few years, BeepBeep has been used in a variety of case studies [6], [22], [27], [54]. For a complete description of BeepBeep, the reader is referred to a recent textbook on the system [24]. ...
... An in-depth presentation of BeepBeep is out of the scope of this paper; the reader is referred to a recent book for details [63]. However, for the sake of self-containedness, we shall describe in the following the key concepts of this system that are necessary to understand our contribution. ...
... The symbolic names INPUT and OUTPUT refer to the (only) input or output of a processor, and stand for the value 0. The other possible way to create processor chains is through the definition of a grammar for a domain-specific language (DSL), coupled with an interpreter converting expressions of this DSL into processor chains. BeepBeep provides facilities for defining such custom-made languages in a few lines of code (see for example Chapter 8 of the user manual [63] or a recent publication showing examples in the context of Runtime Verification [60]). ...
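The notion of a processor chain, where the output of one processor feeds the input of the next, can be pictured with a toy Python model. This is not BeepBeep's Java API: the `Processor` class, `connect`, `filter_by` and `apply_function` below are hypothetical names written for this sketch, and only the convention that INPUT and OUTPUT stand for 0 is taken from the excerpt.

```python
from typing import Callable, Iterable, Iterator

INPUT = 0   # index of the (only) input of a unary processor
OUTPUT = 0  # index of the (only) output of a unary processor


class Processor:
    """Toy processor with one input and one output (hypothetical)."""

    def __init__(self, fn: Callable[[Iterator], Iterator]):
        self._fn = fn
        self._upstream = None

    def feed(self, events: Iterable) -> "Processor":
        """Attach an event source; it is only read when output is pulled."""
        self._upstream = events
        return self

    def __iter__(self) -> Iterator:
        if self._upstream is None:
            raise RuntimeError("no input connected")
        return self._fn(iter(self._upstream))


def connect(source: Processor, out_index: int,
            target: Processor, in_index: int) -> Processor:
    """Pipe output out_index of source into input in_index of target."""
    if out_index != OUTPUT or in_index != INPUT:
        raise ValueError("this sketch only models unary processors")
    return target.feed(source)


def filter_by(predicate: Callable) -> Processor:
    """Processor letting through only events that satisfy predicate."""
    return Processor(lambda events: (e for e in events if predicate(e)))


def apply_function(fn: Callable) -> Processor:
    """Processor applying fn to every event."""
    return Processor(lambda events: (fn(e) for e in events))
```

With these definitions, `connect(filter_by(is_even).feed([1, 2, 3, 4]), OUTPUT, apply_function(double), INPUT)` yields a chain that can be iterated to pull output events one by one, where `is_even` and `double` are any user-supplied functions.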
Article
Information systems produce different types of event logs; in many situations, it may be desirable to look for trends inside these logs. We show how trends of various kinds can be computed over such logs in real time, using a generic framework called the trend distance workflow. Many common computations on event streams turn out to be special cases of this workflow, depending on how a handful of workflow parameters are defined. This process has been implemented and tested in a real-world event stream processing tool, called BeepBeep. Experimental results show that deviations from a reference trend can be detected in real time for streams producing up to thousands of events per second.
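One way to picture a workflow that detects deviations from a reference trend is the generic Python sketch below. The abstract does not list the workflow's parameters, so the ones used here (window width, trend function, reference trend, distance function, threshold) are assumptions made for illustration only, as is the function name.

```python
from collections import deque
from statistics import mean
from typing import Callable, Iterable, Iterator, List


def trend_deviations(events: Iterable[float],
                     width: int,
                     trend: Callable[[List[float]], float],
                     reference: float,
                     distance: Callable[[float, float], float],
                     threshold: float) -> Iterator[bool]:
    """Yield True for each full window whose trend deviates too much.

    A trend is computed over the last `width` events, compared with the
    reference trend through `distance`, and flagged when the distance
    exceeds `threshold`. All parameter names are hypothetical.
    """
    if width <= 0:
        raise ValueError("window width must be positive")
    window = deque(maxlen=width)
    for event in events:
        window.append(event)
        if len(window) == width:
            yield distance(trend(list(window)), reference) > threshold


def absolute_difference(a: float, b: float) -> float:
    """A simple distance between two scalar trends."""
    return abs(a - b)
```

Choosing `trend=mean` and `distance=absolute_difference` gives a moving-average alarm; swapping in another trend function (for example a maximum or a count) changes the computation without touching the surrounding loop, which is the sense in which several computations become special cases of one skeleton.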
... The CEP system that has been chosen for the present study is BeepBeep [11,14], which is available under an open source license. It was chosen for its relative ease of use, and most importantly, its capacity to compose together processing units that use widely differing specification languages. Also, our problem requires the handling of events of very varied types (e.g. ...
... This is done by instantiating processors for various operations, and by connecting the output of a processor into the input of another. A complete description of the available processors in BeepBeep is out of the scope of this paper; the reader is referred to a recent monograph on the subject [14]. Figure 2 shows the first part of the chain of event processors used to recognize the appliances that are turned on and off. ...
Conference Paper
In the past 10 years, the question of the care and well-being of the elderly has become a priority for modern societies. The number of people over the age of 65 is increasing, while at the same time, resources such as caregivers and funds remain stable. It is in this context that several researchers proposed solutions based on Ambient Intelligence in order to provide targeted assistance according to the needs of the elderly. In this paper, we show that it is possible to recognize electrical appliances in use, based on readings from a single sensor installed at the main electrical panel of a home. Moreover, when the system observes an unknown appliance being turned on, it can recommend a possible appliance based on the characteristics of its power consumption. An experimental evaluation of the system on real appliances shows a recognition and recommendation rate close to 100%.
... To compare Palisade with Beep Beep 3, we constructed a Beep Beep 3 stream processor that mimicked the behavior of both the RangeCheck and LossDetect Palisade processors. Figure 11 shows an outline of this processor, along with Beep Beep 3 programs for reading and printing events [56]. Figure 11 uses the official Beep Beep 3 drawing guide to show how events are read, transmitted, filtered, retransmitted, and printed. ...
Article
In this article, we propose Palisade, a distributed framework for streaming anomaly detection. Palisade is motivated by the need to apply multiple detection algorithms for distinct anomalies in the same scenario. Our solution blends low latency detection with deployment flexibility and ease-of-modification. This work includes a thorough description of the choices made in designing Palisade and the reasons for making those choices. We carefully define symptoms of anomalies that may be detected, and we use this taxonomy in characterizing our work. The article includes two case studies using a variety of anomaly detectors on streaming data to demonstrate the effectiveness of our approach in an embedded setting.
Chapter
Runtime enforcement is an effective method to ensure the compliance of a program with user-defined security policies. In this paper we show how the stream event processor tool BeepBeep can be used to monitor the security properties of Java programs. The proposed approach relies on AspectJ to generate a trace capturing the program’s runtime behavior. This trace is then processed by BeepBeep, a complex event processing tool that allows complex data-driven policies to be stated and verified with ease. Depending on the result returned by BeepBeep, AspectJ can then be used to halt the execution or take other corrective action. The proposed method offers multiple advantages, notably flexibility in devising and stating expressive user-defined security policies.
Conference Paper
We present an extension to the BeepBeep 3 event stream engine that allows the use of multiple threads during the evaluation of a query. Compared to the single-threaded version of BeepBeep, the allocation of just a few threads to specific portions of a query provides improvement in terms of throughput.
Chapter
This tutorial presents an overview of the field referred to as runtime verification. Runtime Verification is the study of algorithms, data structures, and tools focused on analyzing executions of systems. The performed analysis aims at improving the confidence in system behavior, either by improving program understanding, or by checking conformance to specifications or algorithms. This chapter focuses specifically on checking execution traces against requirements formalized in terms of monitors. It is first shown on examples how such monitors can be written using aspect-oriented programming, exemplified by AspectJ. Subsequently four monitoring systems are illustrated on the same examples. The systems cover such formalisms as regular expressions, temporal logics, state machines, and rule-based programming, as well as the distinction between external and internal DSLs.
Article
Many problems in Computer Science can be framed as the computation of queries over sequences, or "streams" of data units called events. The field of Complex Event Processing (CEP) relates to the techniques and tools developed to efficiently process these queries. However, most CEP systems developed so far have concentrated on relatively narrow types of queries, which consist of sliding windows, aggregation functions, and simple sequential patterns computed over events that have a fixed tuple structure. Many of them boast high throughput, but in exchange, they are difficult to set up and cumbersome to extend with user-defined elements. This paper describes a variety of use cases taken from real-world scenarios that present features seldom considered in classical CEP problems. It also provides a broad review of current solutions, which includes tools and techniques going beyond typical surveys on CEP. From a critical analysis of these solutions, design principles for a new type of event stream processing system are derived. The paper proposes a simple, generic and extensible framework for the processing of event streams of diverse types; it describes in detail a stream processing engine, called BeepBeep, that implements these principles. BeepBeep's modular architecture, which borrows concepts from many other systems, is complemented with an extensible query language, called eSQL. The end result is an open, versatile, and reasonably efficient query engine that can be used in situations that go beyond the capabilities of existing systems.
Conference Paper
This paper describes the design and implementation of an SQL-like language for performing complex queries on event streams. The Event Stream Query Language (eSQL) aims at providing a simple, intuitive and fully non-procedural syntax, while still preserving backwards compatibility with traditional SQL. More importantly, eSQL's core syntax is designed to be extended by user-defined grammatical constructs. These new constructs can form domain-specific sub-languages, with eSQL being used as the “glue” to form very expressive queries. These concepts have been implemented in BeepBeep 3, an open source event stream query engine.
Article
This open source computing framework unifies streaming, batch, and interactive big data workloads to unlock new applications.
Conference Paper
We explore the use of the tool BeepBeep, a monitor for the temporal logic LTL-FO+, in interpreting assembly traces, focusing on security-related applications. LTL-FO+ is an extension of LTL, which includes first-order quantification. We show that LTL-FO+ is a sufficiently expressive formalism to state a number of interesting program behaviors, and demonstrate experimentally that BeepBeep can efficiently verify the validity of the properties on assembly traces in tractable time.
Chapter
In this paper and its accompanying tutorial, we discuss the topic of runtime verification for linear-time temporal logic specifications. We recall the idea of runtime verification, give ideas about specification languages for runtime verification and develop a solid theory for linear-time temporal logic. Concepts like monitors, impartiality, and anticipation are explained based on this logic.
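As a rough illustration of what a monitor for a linear-time temporal logic property looks like, the Python sketch below evaluates two simple patterns, "globally p" and "eventually p", over a finite prefix of a trace. It is a simplified reading, not the construction developed in the paper: the three-valued `Verdict` type and the function names are assumptions. The inconclusive verdict reflects impartiality (no definite answer while some continuation of the trace could still change it), and returning a definite verdict as soon as no continuation can change it reflects anticipation.

```python
from enum import Enum
from typing import Callable, Iterable


class Verdict(Enum):
    TRUE = "true"
    FALSE = "false"
    INCONCLUSIVE = "?"


def monitor_globally(prefix: Iterable, p: Callable[[object], bool]) -> Verdict:
    """Verdict for 'globally p' after reading a finite prefix.

    One violating event falsifies the property for every continuation;
    otherwise a later event could still violate it, so no verdict is given.
    """
    for event in prefix:
        if not p(event):
            return Verdict.FALSE
    return Verdict.INCONCLUSIVE


def monitor_eventually(prefix: Iterable, p: Callable[[object], bool]) -> Verdict:
    """Verdict for 'eventually p' after reading a finite prefix.

    One satisfying event settles the property for every continuation;
    otherwise a later event could still satisfy it, so no verdict is given.
    """
    for event in prefix:
        if p(event):
            return Verdict.TRUE
    return Verdict.INCONCLUSIVE
```

In this sketch, "globally p" can never be reported as true and "eventually p" can never be reported as false on a finite prefix, since in both cases a longer trace could always overturn such a verdict.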