Jens Teubner

TU Dortmund University | TUD · Faculty of Computer Science

PhD

About

Publications: 109
Reads: 7,273
Citations: 3,444
Additional affiliations
April 2013 - present
TU Dortmund University
Position
  • Professor (Full)
May 2001 - May 2005
University of Konstanz
Position
  • Research Assistant
August 2008 - March 2013
ETH Zurich
Position
  • PostDoc Position

Publications

Publications (109)
Chapter
Full-text available
Large-scale data processing forms the core of modern online services, such as social media and e-commerce, calling for an ever-increasing performance with predictable service quality. Even though emerging hardware platforms can deliver the required performance, actually harnessing it and guaranteeing a certain service quality is still a challenge f...
Chapter
Full-text available
Feed-Forward Networks (FFNs), or multilayer perceptrons, are fundamental network structures for deep learning. Although feed-forward networks are structurally uncomplicated, their training procedure is computationally expensive. It is challenging to design customized hardware for training due to the diversity of operations in forward- and backward-pr...
Chapter
Full-text available
The increasing demand for time-predictable machine learning applications, e.g., object detection in autonomous driving systems, poses several new challenges for resource synchronization in real-time systems, especially when hardware accelerators like Graphics Processing Units (GPUs) are considered as shared resources. When the sha...
Article
Full-text available
Query compilation is a processing technique that achieves very high processing speeds but has the disadvantage of introducing additional compilation latencies. These latencies cause an overhead that is relatively high for short-running and high-complexity queries. In this work, we present Flounder IR and ReSQL, our new approach to query compilation...
Article
Query compilation has proven to be one of the most efficient query processing techniques. Despite its fast processing speed, the additional compilation times of the technique limit its applicability. This is because the approach is most beneficial only when the improvements in processing time clearly exceed the additional compilation time. Recently...
Article
Full-text available
Emerging hardware platforms are characterized by large degrees of parallelism, complex memory hierarchies, and increasing hardware heterogeneity. Their theoretical peak data processing performance can only be unleashed if the different pieces of systems software collaborate much more closely and if their traditional dependencies and interfaces are...
Article
In response to physical limitations, hardware has changed significantly during the past two decades. As the database community, we have no choice but to adapt to those changes in order to benefit from these and further hardware advances.
Chapter
Due to the growing demand for processing power and energy efficiency by today’s data-intensive applications, developers have to deal with heterogeneous hardware platforms composed of specialized computing resources. These are highly efficient for certain workloads but difficult to handle from the software engineering perspective. Even state-of-the-ar...
Article
Graphics processing units (GPUs) promise spectacular performance advantages when used as database coprocessors. Their massive compute capacity, however, is often hampered by control flow divergence caused by non-uniform data distributions. When data-parallel work items demand different amounts or types of processing, instructions execute with l...
Article
The working group "Datenbanken und Informationssystem" (Databases and Information Systems) represents the field in research and teaching at TU Dortmund University. This article gives an overview of the chair's activities in both areas.
Conference Paper
Query processing on GPU-style coprocessors is severely limited by the movement of data. With teraflops of compute throughput in one device, even high-bandwidth memory cannot provision enough data for a reasonable utilization. Query compilation is a proven technique to improve memory efficiency. However, its inherent tuple-at-a-time processing style...
Article
As the operating costs of today’s data centres continue to increase and processor manufacturers are forced to meet thermal design power constraints when designing new hardware, the energy efficiency of a main-memory database management system becomes more and more important. Plus, lots of database workloads are more memory-intensive than compute-in...
Article
Genome analysis enables researchers to detect mutations within genomes and deduce their consequences. Researchers need reliable analysis platforms to ensure reproducible and comprehensive analysis results. Database systems provide vital support to implement the required sustainable procedures. Nevertheless, they are not used throughout the complete...
Article
To escape a number of physical limitations (e.g., bandwidth and thermal issues), hardware technology is strongly trending toward heterogeneous system designs, where a large share of the application work can be off-loaded to accelerators, such as graphics or network processors. In the database domain, field-programmable gate arrays (FPGAs) were re...
Conference Paper
Technology limitations are making the use of heterogeneous computing devices much more than an academic curiosity. In fact, the use of such devices is widely acknowledged to be the only promising way to achieve the application speedups that users urgently need and expect. However, building a robust and efficient query engine for heterogeneous co-proces...
Article
Parallelism is currently seen as a mechanism to minimize the impact of the power and heat dissipation problems encountered in modern hardware. Data parallelism—based on partitioning the data—and pipeline parallelism—based on partitioning the computation—are the two main approaches to leverage parallelism on a wide range of hardware platforms. Unfor...
Article
For several decades, the roles in developing IT systems remained clearly separated. It was the responsibility of the hardware community to embrace and leverage the latest technology trends. The resulting hardware would become faster—but the interfaces it exposes to software remained basically unchanged for many years. The role of the software commu...
Article
This work revisits the processing of stream joins on modern hardware architectures. Our work is based on the recently proposed handshake join algorithm, which is a mechanism to parallelize the processing of stream joins in a NUMA-aware and hardware-friendly manner. Handshake join achieves high throughput and scalability, but it suffers from a high...
Article
Existing main-memory hash join algorithms for multi-core can be classified into two camps. Hardware-oblivious hash join variants do not depend on hardware-specific parameters. Rather, they consider qualitative characteristics of modern hardware and are expected to achieve good performance on any technologically similar platform. The assumption behi...
Article
While offering unique performance and energy-saving advantages, the use of Field-Programmable Gate Arrays (FPGAs) for database acceleration has demanded major concessions from system designers. Either the programmable chips have been used for very basic application tasks (such as implementing a rigid class of selection predicates) or their circuit...
Article
In this paper we experimentally study the performance of main-memory, parallel, multi-core join algorithms, focusing on sort-merge and (radix-)hash join. The relative performance of these two join approaches has been a topic of discussion for a long time. With the advent of modern multi-core architectures, it has been argued that sort-merge join i...
Conference Paper
In this demonstration, we present Ibex, a novel storage engine featuring hybrid, FPGA-accelerated query processing. In Ibex, an FPGA is inserted along the path between the storage devices and the database engine. The FPGA acts as an intelligent storage engine supporting query off-loading from the query engine. Apart from significant performance imp...
Article
Roughly a decade ago, power consumption and heat dissipation concerns forced the semiconductor industry to radically change its course, shifting from sequential to parallel computing. Unfortunately, improving performance of applications has now become much more difficult than in the good old days of frequency scaling. This is a...
Conference Paper
Due to stagnant clock speeds and high power consumption of commodity microprocessors, database vendors have started to explore massively parallel co-processors such as FPGAs to further increase performance. A typical approach is to push simple but compute-intensive operations (e.g., pre-filtering, (de)compression) to FPGAs for acceleration. In this...
Article
In this paper we experimentally study the performance of main-memory, parallel, multi-core join algorithms, focusing on sort-merge and (radix-)hash join. The relative performance of these two join approaches has been a topic of discussion for a long time. With the advent of modern multicore architectures, it has been argued that sort-merge join is...
Conference Paper
The architectural changes introduced with multicore CPUs have triggered a redesign of main-memory join algorithms. In the last few years, two diverging views have appeared. One approach advocates careful tailoring of the algorithm to the architectural parameters (cache sizes, TLB, and memory bandwidth). The other approach argues that modern hardwar...
Chapter
We mainly looked at FPGAs from the hardware technology side so far. Clearly, the use and “programming” of FPGAs is considerably different to the programming models that software developers are used to. In this chapter, we will show how entire system designs can be derived from a given application context, and we will show how those designs can be m...
Chapter
So far, we have mainly highlighted advantages of FPGAs with respect to performance, e.g., for low-latency and/or high-volume stream processing. An entirely different area where FPGAs provide a number of benefits is secure data processing. Especially in the cloud computing era, guaranteeing data confidentiality, privacy, etc., are increasingly impo...
Chapter
In this chapter, we give an overview of the technology behind field-programmable gate arrays (FPGAs). We begin with a brief history of FPGAs before we explain the key concepts that make (re)programmable hardware possible. We do so in a bottom-up approach, that is, we first discuss the very basic building blocks of FPGAs, and then gradually zoom o...
Chapter
FPGA technology can be leveraged in modern computing systems by using FPGAs as co-processors (or “accelerators”) in a heterogeneous computing architecture, where CPUs, FPGAs, and possibly further hardware components are used jointly to solve application problems.
Chapter
In the previous chapter, we illustrated various ways of applying FPGAs to stream processing applications. In this chapter, we illustrate that FPGAs also have the potential to accelerate more classical data processing tasks by exploiting various forms of parallelism inherent to FPGAs. In particular, we will discuss FPGA-acceleration for two differen...
Chapter
Before we delve into core FPGA technology in Chapter 3, we need to familiarize ourselves with a few basic concepts of hardware design. As we will see, the process of designing a hard-wired circuit, a so-called application-specific integrated circuit (ASIC), is not that different from implementing the same circuit on an FPGA. In this chapter, we will...
Article
The increasing number of cores and the rich instruction sets of modern hardware are opening up new opportunities for optimizing many traditional data mining tasks. In this paper we demonstrate how to speed up the performance of the computation of frequent items by almost one order of magnitude over the best published results by matching the algorit...
Article
While the performance opportunities of field-programmable gate arrays (FPGAs) for high-volume query processing are well-known, system makers still have to compromise between desired query expressiveness and high compilation effort. The cost of the latter is the primary limitation in building efficient FPGA/CPU hybrids. In this work we re...
Article
We demonstrate MXQuery/H, a modified version of MXQuery that uses hardware acceleration to speed up XML processing. The main goal of this demonstration is to give an interactive example of hardware/software co-design and show how system performance and energy efficiency can be improved by off-loading tasks to FPGA hardware. To this end, we equipped...
Article
Computer architectures are quickly changing toward heterogeneous many-core systems. Such a trend opens up interesting opportunities but also raises immense challenges since the efficient use of heterogeneous many-core systems is not a trivial problem. Software-configurable microprocessors and FPGAs add further diversity but also increase complexity...
Book
The architectural changes introduced with multi-core CPUs have triggered a redesign of main-memory join algorithms. In the last few years, two diverging views have appeared. One approach advocates careful tailoring of the algorithm to the architectural parameters (cache sizes, TLB, and memory bandwidth). The other approach argues that modern hardwa...
Article
Computing frequent items is an important problem by itself and as a subroutine in several data mining algorithms. In this paper, we explore how to accelerate the computation of frequent items using field-programmable gate arrays (FPGAs) with a threefold goal: increase performance over existing solutions, reduce energy consumption over CPU-based sys...
Conference Paper
In spite of the omnipresence of parallel (multi-core) systems, the predominant strategy to evaluate window-based stream joins is still strictly sequential, mostly just straightforward along the definition of the operation semantics. In this work we present handshake join, a way of describing and executing window-based stream joins that is highly a...
Conference Paper
We demonstrate a hardware implementation of a complex event processor, built on top of field-programmable gate arrays (FPGAs). Compared to CPU-based commodity systems, our solution shows distinctive advantages for stream monitoring tasks, e.g., wire-speed processing and predictable performance. The demonstration is based on a query-to-hardware comp...
Conference Paper
Field-programmable gate arrays (FPGAs) are chip devices that can be runtime-reconfigured to realize arbitrary processing tasks directly in hardware. Industrial products [Net, Xtr] as well as research prototypes [MTA09, MVB+09, SLS+10, TMA11] demonstrated how this capability can be exploited to build highly efficient processors for data warehous...
Article
Complex event detection is an advanced form of data stream processing where the stream(s) are scrutinized to identify given event patterns. The challenge for many complex event processing (CEP) systems is to be able to evaluate event patterns on high-volume data streams while adhering to real-time constraints. To solve this problem, in this paper w...
Conference Paper
Field-programmable gate arrays (FPGAs) are a promising technology that can be used in database systems. In this demonstration we show Glacier, a library and a compiler that can be employed to implement streaming queries as hardware circuits on FPGAs. Glacier consists of a library of compositional hardware modules that represent stream processing op...
Conference Paper
Field-programmable gate arrays (FPGAs) can provide performance advantages with a lower resource consumption (e.g., energy) than conventional CPUs. In this paper, we show how to employ FPGAs to provide an efficient and high-performance solution for the frequent item problem. We discuss three design alternatives, each one of them exploiting different...
Conference Paper
In line with the insight that "one size" of databases will not fit all application needs [19], the database community is currently exploring various alternatives to commodity, CPU-based system designs. One particular candidate in this trend are field-programmable gate arrays (FPGAs), programmable chips that allow tailor-made hardware designs optimiz...
Conference Paper
Full-text available
As network infrastructures with 10 Gb/s bandwidth and beyond have become pervasive and as cost advantages of large commodity-machine clusters continue to increase, research and industry strive to exploit the available processing performance for large-scale database processing tasks. In this work we look at the use of high-speed networks for distrib...
Article
Given the tremendous versatility of relational database implementations toward a wide range of database problems, it seems only natural to consider them as back-ends for XML data processing. Yet, the assumptions behind the language XQuery are considerably different to those in traditional RDBMSs. The underlying data model is a tree, data and result...
Article
Computer architectures are quickly changing toward heterogeneous many-core systems. Such a trend opens up interesting opportunities but also raises immense challenges since the efficient use of heterogeneous many-core systems is not a trivial problem. In this paper, we explore how to program data processing operators on top of field-programmable ga...
Article
Taking advantage of many-core, heterogeneous hardware for data processing tasks is a difficult problem. In this paper, we consider the use of FPGAs for data stream processing as coprocessors in many-core architectures. We present Glacier, a component library and compositional compiler that transforms continuous queries into logic circuits by compos...
Conference Paper
While there seems to be a general agreement that next years' systems will include many processing cores, it is often overlooked that these systems will also include an increasing number of different cores (we already see dedicated units for graphics or network processing). Orchestrating the diversity of processing functionality is going to be a maj...
Conference Paper
Full-text available
By leveraging modern networking hardware (RDMA-enabled network cards), we can shift priorities in distributed database processing significantly. Complex and sophisticated mechanisms to avoid network traffic can be replaced by a scheme that takes advantage of the bandwidth and low latency offered by such interconnects. We illustrate this phenomenon wit...
Conference Paper
Full-text available
We introduce a controlled form of recursion in XQuery, an inflationary fixed point operator, familiar from the context of relational databases. This operator imposes restrictions on the expressible types of recursion, but it is sufficiently versatile to capture a wide range of interesting use cases, including Regular XPath and its core transitive closure...
Article
Taking advantage of many-core, heterogeneous hardware for data processing tasks is a difficult problem. In this paper, we consider the use of FPGAs for data stream processing as co-processors in many-core architectures. We present Glacier, a component library and compositional compiler that transforms continuous queries into logic circuits by compo...
Article
Full-text available
By leveraging modern networking hardware (RDMA-enabled network cards), we can shift priorities in distributed database processing significantly. Complex and sophisticated mechanisms to avoid network traffic can be replaced by a scheme that takes advantage of the bandwidth and low latency offered by such interconnects. We illustrate this phenomenon wit...
Article
The Systems Group, together with the Enterprise Computing Center (ECC), has taken initiatives at the ETH Zurich department of computer science to support both academic and industrial research. The goal of systems groups is to redefine, restructure, and reorganize systems research to avoid the trap and complex problems from a single, isolated perspectiv...
Article
Full-text available
Though inevitable for effective cost-based query rewriting, the derivation of meaningful cardinality estimates has remained a notoriously hard problem in the context of XQuery. By basing the estimation on a relational representation of the XQuery syntax, we show how existing cardinality estimation techniques for XPath and proven relational estima...
Conference Paper
XML Schema awareness has been an integral part of the XQuery language since its early design stages. Matching XML data against XML types is the main operation that backs up XQuery type expressions, such as typeswitch, instance of, or certain XPath operators. This interaction is particularly vital in data-centric XQuery applications, where data co...
Article
Full-text available
The Pathfinder project makes inventive use of relational database technology, originally developed to process data of strictly tabular shape, to construct efficient database-supported XML and XQuery processors. Pathfinder targets database engines that implement a set-oriented mode of query execution: many off-the-shelf traditional database systems...
Article
Full-text available
We introduce a controlled form of recursion in XQuery, inflationary fixed points, familiar in the context of relational databases. This imposes restrictions on the expressible types of recursion, but we show that inflationary fixed points nevertheless are sufficiently versatile to capture a wide range of interesting use cases, including the semanti...
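As a rough illustration of the inflationary fixed point idea mentioned in this abstract, the following Python sketch iterates a body function and keeps accumulating its results until nothing new appears. It is not the operator defined in the paper: the function names, the set-based data model, and the example graph `EDGES` are all assumptions made for this sketch.

```python
from typing import Callable, FrozenSet

NodeSet = FrozenSet[str]


def inflationary_fixed_point(seed: NodeSet,
                             body: Callable[[NodeSet], NodeSet]) -> NodeSet:
    """Apply `body` repeatedly, never discarding earlier results.

    The result only grows from one round to the next ("inflationary"),
    so on a finite domain the loop terminates once a round adds nothing.
    """
    result = body(seed)
    while True:
        extended = result | body(result)
        if extended == result:
            return result
        result = extended


# Hypothetical example graph: node -> set of direct successors.
EDGES = {"a": {"b"}, "b": {"c"}, "c": {"b"}, "d": {"a"}}


def successors(nodes: NodeSet) -> NodeSet:
    """One recursion step: all direct successors of the given nodes."""
    return frozenset(t for n in nodes for t in EDGES.get(n, ()))


# Transitive closure of the edge relation, starting from node "a".
reachable = inflationary_fixed_point(frozenset({"a"}), successors)
print(sorted(reachable))  # ['b', 'c']
```

The cycle between `b` and `c` does not cause an endless loop, because the iteration stops as soon as a round contributes no new nodes.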
Conference Paper
Full-text available
We explore the design and implementation of Rover, a post-mortem debugger for XQuery. Rather than being based on the traditional breakpoint model, Rover acknowledges XQuery's nature as a functional language: the debugger follows a declarative debugging paradigm in which a user is enabled to observe the values of selected XQuery subexpressions....
Conference Paper
Full-text available
To compensate for the inherent impedance mismatch between the relational data model (tables of tuples) and XML (ordered, unranked trees), tree join algorithms have become the prevalent means to process XML data in relational databases, most notably the TwigStack (6), structural join (1), and staircase join (13) algorithms. However, the addition...
Conference Paper
Full-text available
The Pathfinder XQuery compiler has been enhanced by a new code generator that can target any SQL:1999-compliant relational database system (RDBMS). This code generator marks an important next step towards truly relational XQuery processing, a branch of database technology that aims to turn RDBMSs into highly efficient XML and XQuery processors wi...
Conference Paper
Full-text available
There are more spots than immediately obvious in XQuery expressions where order is immaterial for evaluation - this affects most notably, but not exclusively, expressions in the scope of unordered {} and the argument of fn:unordered(). Clearly, performance gains are lurking behind such expression contexts but the prevalent impact of order on the XQ...
Conference Paper
Relational database systems are highly efficient hosts to table-shaped data. It is all the more interesting to see how a careful inspection of both the XML tree structure and the W3C XQuery language definition can turn relational databases into fast and scalable XML processors. This work shows how the deliberate choice of a relational tree...
Conference Paper
Full-text available
Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-based encoding of XML documents into relational table...
Conference Paper
Full-text available
Relational XQuery processors aim at leveraging mature relational DBMS query processing technology to provide scalability and efficiency. To achieve this goal, various storage schemes have been proposed to encode the tree structure of XML documents in flat relational tables. Basically, two classes can be identified: (1) encodings using fixed-length...
Conference Paper
Full-text available
lowed: based on the extensible relational database kernel MonetDB [2], Pathfinder provides highly efficient and scalable XQuery technology that scales beyond 10 GB XML input instances on commodity hardware. Pathfinder requires only local extensions to the underlying DBMS's kernel, such as the staircase join operator [7, 9]. A join recognition logic...
Article
Full-text available
Various techniques have been proposed for efficient evaluation of XPath expressions, where the XPath location steps are rooted in a single sequence of context nodes. Among these techniques, the staircase join makes it possible to evaluate XPath location steps along arbitrary axes in at most one scan over the XML document, exploiting the XPath accelerator enco...
Article
Full-text available
Pathfinder/MonetDB is a collaborative effort of the University of Konstanz, the University of Twente, and the Centrum voor Wiskunde en Informatica (CWI) in Amsterdam to develop an XQuery compiler that targets an RDBMS back-end. The author of this abstract is a student at the University of Konstanz and spent six months as an intern at the CWI, designi...
Chapter
The XPath accelerator encodes the tree structure of an XML document using unique pairs of integer values, the nodes' preorder and postorder traversal ranks. If these ranks are used to place the document nodes in the two-dimensional pre/post plane, it becomes apparent that the encoding preserves an important property. Any context node v divides the...
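The pre/post encoding described in this abstract can be tried out with a small Python sketch. It assigns each node its preorder and postorder rank and then answers the four major XPath axes with plain rank comparisons, which is the usual reading of the pre/post plane. The tree representation as `(tag, children)` tuples, the assumption of unique tags, and the names `encode`, `AXES`, and `axis` are made up for this illustration and do not come from the publication.

```python
def encode(tree):
    """Map each tag to its (preorder rank, postorder rank).

    `tree` is a nested tuple (tag, [children]); tags are assumed unique.
    """
    ranks = {}
    counters = {"pre": 0, "post": 0}

    def visit(node):
        tag, children = node
        pre = counters["pre"]          # rank when the node is first reached
        counters["pre"] += 1
        for child in children:
            visit(child)
        ranks[tag] = (pre, counters["post"])  # rank when the node is left
        counters["post"] += 1

    visit(tree)
    return ranks


# Each axis is a region of the pre/post plane relative to context node v.
AXES = {
    "descendant": lambda v, c: c[0] > v[0] and c[1] < v[1],
    "ancestor":   lambda v, c: c[0] < v[0] and c[1] > v[1],
    "following":  lambda v, c: c[0] > v[0] and c[1] > v[1],
    "preceding":  lambda v, c: c[0] < v[0] and c[1] < v[1],
}


def axis(ranks, context, name):
    """Return the tags on the given axis of `context`, in document order."""
    in_region = AXES[name]
    v = ranks[context]
    hits = [tag for tag, r in ranks.items() if in_region(v, r)]
    return sorted(hits, key=lambda tag: ranks[tag][0])


# Hypothetical document: <a><b><c/><d/></b><e/></a>
doc = ("a", [("b", [("c", []), ("d", [])]), ("e", [])])
ranks = encode(doc)
print(axis(ranks, "b", "descendant"))  # ['c', 'd']
print(axis(ranks, "b", "following"))   # ['e']
```

Because every axis test is a pair of integer comparisons, such an encoding lends itself to evaluation with ordinary relational range predicates.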
Article
Relational database systems may be turned into efficient XML and XPath processors if the system is provided with a suitable relational tree encoding. This paper extends this relational XML processing stack and shows that an RDBMS can also serve as a highly efficient XQuery runtime environment. Our approach is purely relational: XQuery expressions are c...
Article
The syntactic wellformedness constraints of XML (opening and closing tags nest properly) imply that XML processors face the challenge to efficiently handle data that takes the shape of ordered, unranked trees. Although RDBMSs have originally been designed to manage table-shaped data, we propose their use as XML and XPath processors. In our setup, th...
Article
Full-text available
This work may be seen as a further proof of the versatility of the relational database model. Here, we add XQuery to the catalog of languages which RDBMSs are able to "speak" fluently. Given suitable relational encodings of sequences and ordered, unranked trees
Article
Full-text available
This article is a proposal for a database index structure, the XPath accelerator, that has been specifically designed to support the evaluation of XPath path expressions. As such, the index is capable of supporting all XPath axes (including ancestor, following, preceding-sibling, descendant-or-self, etc.). This feature lets the index stand out among r...
Chapter
Relational query processors derive much of their effectiveness from the awareness of specific table properties like sort order, size, or absence of duplicate tuples. This chapter applies (and adapts) this successful principle to database-supported XML and XPath processing: the relational system is made tree aware, i.e., tree properties like subtree...
Article
This article is a proposal for a database index structure, the XPath accelerator, that has been specifically designed to support the evaluation of XPath path expressions. As such, the index is capable of supporting all XPath axes (including ancestor, following, preceding-sibling, descendant-or-self, etc.). This feature lets the index stand out among r...
Article
The W3 Consortium is currently developing the XQuery specification to query XML data.
Article
Full-text available
Relational query processors derive much of their effectiveness from the awareness of specific table properties like sort order, size, or absence of duplicate tuples. This text applies (and adapts) this successful principle to database-supported XML and XPath processing: the relational system is made tree aware, i.e., tree properties like subtree si...
