BLF: A Blockchain Logging Framework for Mining Blockchain Data
Paul Beck¹, Hendrik Bockrath¹, Tom Knoche¹, Mykola Digtiar¹, Tobias Petrich¹, Daniil Romanchenko¹, Richard Hobeck², Luise Pufahl², Christopher Klinkmüller³ and Ingo Weber²
¹ Technische Universitaet Berlin, Berlin, Germany
² Software & Business Engineering, Technische Universitaet Berlin, Berlin, Germany
³ CSIRO Data61, Sydney, Australia
Abstract
Blockchain technology is increasingly used to realize decentralized applications and execute cross-organizational processes. Understanding how an application is used and how partners and users participate is essential to avoid failures and plan improvements. This understanding can be built by analyzing logs. However, although the data is in principle available in the immutable ledger, log extraction is currently inconvenient, slow, and subject to interpretation. In this demo, we present BLF, an extensible logging framework for decentralized applications deployed on a blockchain. The framework is realized for Ethereum and Hyperledger and has been tested for applications on those networks, but it is extensible to other blockchains. Practitioners can use it to analyze their blockchain applications, and BPM researchers can use it to explore a new type of event data: event logs from blockchain applications.
Keywords
Logging, Blockchain Application, Process Mining
1. Introduction
Blockchain technology enables a new generation of applications, commonly referred to as decentralized applications (DApps), which can, e.g., support the execution of cross-organizational business processes [1]. Whereas DApp developers have full control over their DApp's features, the shared nature of the networks on which the DApps are deployed limits the developers' influence on when, where, and under what circumstances they are executed. Thus, the analysis of DApp behavior based on logs is essential for avoiding failures and planning improvements. Here, process mining techniques [2] support the analysis of events over time and can provide useful insights, as shown, e.g., in [3, 4].
In this demo, we present the Blockchain Logging Framework (BLF), whose main components are summarized in Fig. 1. At its heart, BLF enables the generation of logs from DApp data by allowing users to define an extract-transform-load (ETL) pipeline. To this end, users have to specify a manifest using BLF's Blockchain Query Language (BcQL). To ease the definition of manifests, BLF's validator can verify their correctness and inform users about potential specification errors. The framework itself is an extension of the Ethereum Logging Framework (ELF) [4, 5], which was developed and tested for Ethereum [6]. BLF, by contrast, is designed as a generic framework for generating logs from DApps on any blockchain platform. While BLF currently provides adapters for Ethereum and Hyperledger [7], it is extensible so that other platforms can be supported in the future. In the remainder, we present BLF's main functionality, its query language, and the possibility to extend it. We conclude by outlining case studies and a small demonstration for a Hyperledger DApp.
Figure 1: Overview of the blockchain logging framework. (Diagram labels: DApp, Source Code, Blockchain, State, Transactions, Transaction Receipts, Blockchain Logging Framework (BLF), Manifest (BcQL), Validator, Extractor, Output, TXT, XES, CSV; arrows labeled "execute" and "emit".)
Proceedings of the Demonstration & Resources Track, Best BPM Dissertation Award, and Doctoral Consortium at BPM 2021 co-located with the 19th International Conference on Business Process Management, BPM 2021, Rome, Italy, September 6-10, 2021
firstname.lastname@tu-berlin.de (R. Hobeck); firstname.lastname@tu-berlin.de (L. Pufahl); christopher.klinkmueller@data61.csiro.au (C. Klinkmüller); firstname.lastname@tu-berlin.de (I. Weber)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073
Paul Beck et al. CEUR Workshop Proceedings 1–5
2. Main Functionality of BLF
BLF is written in Java and its source code is publicly available¹. It consists of three parts: BcQL, the validator, and the extractor. In the following, we briefly present how BcQL can be applied to specify manifest files and thus ETL pipelines for DApps. After that, we summarize the extractor and validator functionality, which builds upon manifest files (see Fig. 1). Lastly, we elaborate on possibilities to extend the framework.
Manifest and BcQL. A manifest file defines how logs are generated from DApp data. This includes details regarding which blockchain to connect to, what data to extract, how to structure the data, and where and in which format to store it. To this end, BcQL is designed as a declarative query language that abstracts away low-level extraction details such as data decoding and the composition of API calls, so that developers can focus on defining the actual ETL process.
¹ https://github.com/TU-ADSP/Blockchain-Logging-Framework
Figure 2: Manifest structure
Fig. 2 shows the structure of a manifest document with its mandatory and optional elements. In each manifest, the user first has to define the blockchain context (e.g., “ETHEREUM” or “HYPERLEDGER”), a connection to a blockchain node, and an output folder. Additionally, BcQL gives the option to specify (1) filters (e.g., block filters, transaction filters), which allow users to narrow down the DApp data to be extracted, (2) expression statements, which provide transformation and logic operators that can be used to process the data, and (3) emit statements for formatting data in a specific target format. Furthermore, the user may configure the emission mode and the error handling strategy. By default, the output files are emitted as soon as all data of a blockchain application has been extracted, transformed, and loaded. The safe batching and streaming modes instead emit data for each processed block: safe batching continuously updates the main output files, whereas streaming produces new files for each block. Regarding error handling, BLF can handle runtime errors in two distinct ways: errors are either ignored, or they abort the current execution. Either way, the errors are printed to the console and written to an error log file. A detailed step-by-step tutorial on how to write a manifest is provided on the project website².
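The emission modes and error handling strategies described above can be illustrated with a minimal Python sketch. It is not BLF's implementation (BLF is written in Java) and does not use BcQL syntax; the names `EmissionMode`, `ErrorHandling`, and `run_pipeline`, the file names, and the behavior of an aborted run in default mode are assumptions made only for this illustration.

```python
from enum import Enum


class EmissionMode(Enum):
    DEFAULT = "default"              # emit once, after all blocks are processed
    SAFE_BATCHING = "safe batching"  # update the main output after each block
    STREAMING = "streaming"          # write a new output for each block


class ErrorHandling(Enum):
    IGNORE = "ignore"  # log the error and continue with the next block
    ABORT = "abort"    # log the error and stop the run


def run_pipeline(blocks, transform, mode, on_error):
    """Process (block_number, events) pairs and return (outputs, error_log).

    `outputs` maps a hypothetical file name to the rows written to it.
    """
    outputs, error_log, all_rows = {}, [], []
    aborted = False
    for number, events in blocks:
        try:
            new_rows = [transform(event) for event in events]
        except Exception as exc:  # runtime error while processing a block
            message = f"block {number}: {exc}"
            print(message)             # errors are printed to the console ...
            error_log.append(message)  # ... and recorded in an error log
            if on_error is ErrorHandling.ABORT:
                aborted = True
                break
            continue  # simplification: a failing block contributes no rows
        all_rows.extend(new_rows)
        if mode is EmissionMode.SAFE_BATCHING:
            outputs["log.csv"] = list(all_rows)
        elif mode is EmissionMode.STREAMING:
            outputs[f"log_{number}.csv"] = new_rows
    # Assumption of this sketch: an aborted run in default mode emits nothing.
    if mode is EmissionMode.DEFAULT and not aborted:
        outputs["log.csv"] = all_rows
    return outputs, error_log
```

In this sketch, safe batching and streaming differ only in where the rows of a block end up: in the single main output that grows with every block, or in a separate output per block.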
² https://github.com/TU-ADSP/Blockchain-Logging-Framework/wiki/Manifest
Figure 3: Listeners to Extend BLF to new Blockchain technologies.
Validator and Extractor. The main components of BLF are the validator and the extractor, which process user-defined manifest files (see Fig. 1). Here, we briefly summarize these two components. The interested reader can find more information on these functions in [5].
The validator supports the user by checking the manifest for specification errors. In a first step, a parser generated with the parser generator library ANTLR4 from a specification of BcQL's grammar parses the textual manifest file. This step produces an intermediate representation of the manifest and identifies syntactic errors. Semantic analysis of the intermediate representation is implemented as a set of custom rules that, e.g., check whether filters are correctly nested and whether variable, parameter, and literal types are compatible.
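The two kinds of semantic rules named above, filter nesting and type compatibility, can be sketched as follows. This is a hedged Python illustration, not the validator's actual rule set: the classes `Filter` and `Assignment`, the functions `check_nesting` and `check_types`, the type names, and in particular the nesting table `ALLOWED_PARENTS` are assumptions chosen for the example.

```python
from dataclasses import dataclass, field

# Assumed nesting rules for this sketch: the parents each filter kind may have.
# None stands for the top level of the manifest.
ALLOWED_PARENTS = {
    "block": {None},
    "transaction": {"block"},
    "log_entry": {"block", "transaction"},
}

# Assumed mapping from manifest type names to Python types.
PYTHON_TYPES = {"int": int, "string": str, "bool": bool}


@dataclass
class Filter:
    kind: str
    children: list = field(default_factory=list)  # nested Filter objects


@dataclass
class Assignment:
    variable: str
    declared_type: str
    literal: object


def check_nesting(node, parent_kind=None, path="manifest"):
    """Return error messages for filters that appear in a disallowed place."""
    here = f"{path}/{node.kind}"
    errors = []
    if parent_kind not in ALLOWED_PARENTS.get(node.kind, set()):
        where = parent_kind or "the top level"
        errors.append(f"{here}: '{node.kind}' filter not allowed inside {where}")
    for child in node.children:
        errors.extend(check_nesting(child, node.kind, here))
    return errors


def check_types(assignments):
    """Return error messages for literals that do not match the declared type."""
    errors = []
    for a in assignments:
        expected = PYTHON_TYPES.get(a.declared_type)
        if expected is None:
            errors.append(f"{a.variable}: unknown type '{a.declared_type}'")
        elif type(a.literal) is not expected:
            found = type(a.literal).__name__
            errors.append(f"{a.variable}: expected {a.declared_type}, got {found}")
    return errors
```

Both functions collect all violations instead of stopping at the first one, so that a user could be informed about every specification error in a single validation run.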
With a validated manifest, the extractor is able to extract, transform, and load data from DApps. The framework extracts data block by block in historical order, i.e., in the order in which the blocks were created and included in the blockchain. During the extraction, the specified filters are applied. For transformation, BLF provides a basic set of operators and additionally allows users to integrate custom operators at compile time of the framework. Finally, data can be formatted and exported as textual application logs, or in the comma-separated values (CSV) and the eXtensible Event Stream (XES) [2] formats.
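The extract, transform, and load steps summarized above can be sketched in a few lines of Python. The sketch works on in-memory toy blocks instead of a blockchain node; the functions `extract` and `to_csv`, the dictionary layout of blocks and log entries, and the column names are hypothetical and only illustrate the idea of block-by-block extraction with filters and a CSV export.

```python
import csv
import io


def extract(blocks, block_range, event_name, transform):
    """Yield transformed rows for matching log entries, block by block."""
    first, last = block_range
    # Blocks are visited in historical order, i.e., by ascending block number.
    for block in sorted(blocks, key=lambda b: b["number"]):
        if block["number"] not in range(first, last + 1):  # block filter
            continue
        for entry in block["log_entries"]:
            if entry["event"] == event_name:               # log entry filter
                yield transform(block, entry)


def to_csv(rows, fieldnames):
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    toy_blocks = [
        {"number": 2, "log_entries": [{"event": "Birth", "id": 7}]},
        {"number": 1, "log_entries": [{"event": "Birth", "id": 3}]},
    ]
    rows = extract(
        toy_blocks, (1, 2), "Birth",
        lambda block, entry: {"case": entry["id"], "block": block["number"]},
    )
    print(to_csv(rows, ["case", "block"]))
```

Because `extract` is a generator, rows are produced while the blocks are being traversed, which is the property that per-block emission modes rely on.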
Extending BLF. The interaction between BLF and a blockchain node takes place through a standardized interface called BaseBlockchainListener (see Fig. 3). By implementing this interface for specific blockchain platforms, e.g., for Corda R3 or EOS, developers can add support for additional platforms. Currently, BLF provides standard implementations for Hyperledger and Ethereum. Besides implementing functionality to extract data from blocks, transactions, log entries, and the blockchain state, developers must declare the default variables that these entities have. For example, on Ethereum, blocks are identified by a block number, i.e., the position of the block in the blockchain, while transactions and log entries are identified by indices that encode their position within a block. Developers can follow the step-by-step tutorial³ to integrate further platforms.
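The extension mechanism can be pictured as an adapter pattern: one abstract interface, one implementation per platform. The following Python sketch is only an analogue of that idea; it does not reproduce the Java interface BaseBlockchainListener, and the class names, method names, variable names, and the registry functions `register` and `connect` are hypothetical.

```python
from abc import ABC, abstractmethod


class BlockchainListener(ABC):
    """Hypothetical platform adapter: data access plus default variables."""

    @abstractmethod
    def default_variables(self):
        """Map each entity kind to the variables it exposes by default."""

    @abstractmethod
    def blocks(self, first, last):
        """Yield the blocks numbered first..last in ascending order."""


class ToyChainListener(BlockchainListener):
    """Adapter for an in-memory toy chain given as {block_number: block}."""

    def __init__(self, chain):
        self._chain = chain

    def default_variables(self):
        # Blocks are identified by their number; transactions and log
        # entries by an index that encodes their position within a block.
        return {
            "block": ["block.number"],
            "transaction": ["tx.index"],
            "log_entry": ["log.index"],
        }

    def blocks(self, first, last):
        for number in range(first, last + 1):
            if number in self._chain:
                yield self._chain[number]


ADAPTERS = {}


def register(name, factory):
    """Make an adapter available under a blockchain context name."""
    ADAPTERS[name.upper()] = factory


def connect(name, *args):
    """Create the adapter registered for the given blockchain context."""
    return ADAPTERS[name.upper()](*args)
```

With such a design, supporting a further platform means adding one subclass and one `register` call, while the code that drives extraction only depends on the abstract interface.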
3. Demonstration and Maturity
ELF, the predecessor of BLF, has already been used in several case studies, among others to examine the popular Ethereum game CryptoKitties [4] and the Ethereum DApp Augur [3], a popular prediction and betting market. These case studies demonstrate real-world applications of the framework and show how it can be used. After reworking ELF into the extensible BLF, we again wrote a BcQL manifest for Augur, ran BLF, and thereby demonstrated that BLF retains the capability to extract event logs from Ethereum DApps.
Hyperledger use cases from production environments are rarely available, because Hyperledger is used as a private permissioned blockchain and has no public blockchain system. Thus, we implemented our own DApp on a Hyperledger node: HyperKitties⁴, a Hyperledger reimplementation of the CryptoKitties Ethereum smart contract. By porting CryptoKitties to Hyperledger, we wanted to test whether log generation from Hyperledger results in reasonable event logs, as observed in previous case studies. We used HyperKitties to write events into our private Hyperledger blockchain. We then created a manifest file that lets BLF connect to the local Hyperledger node, extract the HyperKitties events from the blocks, and generate an event log. We opened the log in Disco, where we could confirm that it is a reasonable event log similar to the Ethereum result. Examples of how to write a manifest and a screencast are provided on the main project website (see Footnote 1).
³ https://github.com/TU-ADSP/Blockchain-Logging-Framework/wiki/Adding-a-new-Blockchain-to-the-BLF
This demo presented a framework for logging data of blockchain-based applications. It provides functions to extract, transform, and load data. The framework additionally supports different modes for data emission and error handling. It has already been applied in larger case studies and offers BPM researchers a new source of data for process mining. Although BLF is usable on production systems, its maturity should still be classified as that of a fully functional research prototype. In the future, we want to extend it to other blockchain technologies and improve the usability of BcQL's grammar.
References
[1] X. Xu, I. Weber, M. Staples, Architecture for blockchain applications, Springer, 2019.
[2] W. van der Aalst, Process mining - Data science in action, Springer, 2016.
[3] R. Hobeck, C. Klinkmüller, H. D. Bandara, I. Weber, W. van der Aalst, Process mining on blockchain data: A case study of Augur, in: BPM 2021, accepted, Springer, 2021.
[4] C. Klinkmüller, A. Ponomarev, A. B. Tran, I. Weber, W. van der Aalst, Mining blockchain processes: Extracting process mining data from blockchain applications, in: BPM 2019: Blockchain Forum, Springer, 2019, pp. 71–86.
[5] C. Klinkmüller, I. Weber, A. Ponomarev, A. B. Tran, W. van der Aalst, Efficient logging for blockchain applications, arXiv preprint arXiv:2001.10281 (2020).
[6] G. Wood, et al., Ethereum: A secure decentralised generalised transaction ledger, Ethereum project yellow paper 151 (2014) 1–32.
[7] E. Androulaki, A. Barger, V. Bortnikov, C. Cachin, K. Christidis, A. De Caro, et al., Hyperledger Fabric: a distributed operating system for permissioned blockchains, in: EuroSys conference, 2018, pp. 1–15.
⁴ https://github.com/TU-ADSP/HyperKitties