Hannes Voigt

Neo4j

Dr.-Ing.

About

79
Publications
12,928
Reads
935
Citations

Publications (79)
Article
Full-text available
Property graphs have reached a high level of maturity, witnessed by multiple robust graph database systems as well as the ongoing ISO standardization effort aiming at creating a new standard Graph Query Language (GQL). Yet, despite documented demand, schema support is limited both in existing systems and in the first version of the GQL Standard. It...
Preprint
Full-text available
Property graphs have reached a high level of maturity, witnessed by multiple robust graph database systems as well as the ongoing standardization effort aiming at creating a new standard Graph Query Language (GQL). Yet, despite documented demand, schema support is limited in existing systems. It is anticipated that the second version of the GQL S...
Conference Paper
As graph databases become widespread, the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) have approved a project to create GQL, a standard property graph query language. This complements the SQL/PGQ project, which specifies how to define graph views over a SQL tabular schema, and to run...
Preprint
Full-text available
As graph databases become widespread, JTC1 -- the committee in joint charge of information technology standards for the International Organization for Standardization (ISO), and International Electrotechnical Commission (IEC) -- has approved a project to create GQL, a standard property graph query language. This complements a project to extend SQL...
Preprint
Full-text available
The scientific community has been studying graph data models for decades. Their high expressiveness and elasticity led the scientific community to design a variety of graph data models and graph query languages, and the practitioners to use them to model real-world cases and extract useful information. Recently, property graphs and, in particular,...
Article
Full-text available
Ensuring the success of big graph processing for the next decade and beyond.
Preprint
Full-text available
Graphs are by nature unifying abstractions that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads understand these abstractions, future problems will require new abstractions and systems. What needs to happen in the...
Article
Full-text available
The ability to efficiently analyze changing data is a key requirement of many real-time analytics applications. In prior work, we have proposed general dynamic Yannakakis (GDyn), a general framework for dynamically processing acyclic conjunctive queries with \(\theta \)-joins in the presence of data updates. Whereas traditional approaches face a tr...
Article
The ability to efficiently analyze changing data is a key requirement of many real-time analytics applications. Traditional approaches to this problem were developed around the notion of Incremental View Maintenance (IVM), and are based either on the materialization of subresults (to avoid their recomputation) or on the recomputation of subresults...
Chapter
Despite the maturity of commercial graph databases, little consensus has been reached so far on the standardization of data definition languages (DDLs) for property graphs (PG). Discussion on the characteristics of PG schemas is ongoing in many standardization and community groups. Although some basic aspects of a schema are already present in most...
Conference Paper
Today, most commercial database systems provide some support for the management of temporal data, but the index support for efficiently accessing such data is rather limited. Existing access paths neglect the fact that time intervals are located on the timeline and have a duration, two important pieces of information for querying temporal data. In...
Article
The paper describes the present and the future of graph updates in Cypher, the language of the Neo4j property graph database and several other products. Update features include those with clear analogs in relational databases, as well as those that do not correspond to any relational operators. Moreover, unlike SQL, Cypher updates can be arbitraril...
Conference Paper
Full-text available
The ability to efficiently analyze changing data is a key requirement of many real-time analytics applications. Traditional approaches to this problem were developed around the notion of Incremental View Maintenance (IVM), and are based either on the materialization of subresults (to avoid their recomputation) or on the recomputation of subresults...
Preprint
Modern application domains such as Composite Event Recognition (CER) and real-time Analytics require the ability to dynamically refresh query results under high update rates. Traditional approaches to this problem are based either on the materialization of subresults (to avoid their recomputation) or on the recomputation of subresults (to avoid the...
Preprint
Despite the maturity of commercial graph databases, little consensus has been reached so far on the standardization of data definition languages (DDLs) for property graphs (PG). The discussion on the characteristics of PG schemas is ongoing in many standardization and community groups. Although some basic aspects of a schema are already present in...
Article
Full-text available
Agile software development allows us to continuously evolve and run a software system. However, this is not possible in databases, as established methods are very expensive, error-prone, and far from agile. We present InVerDa, a multi-schema-version database management system (MSVDB) for agile database development. MSVDBs realize co-existing schema...
Conference Paper
Full-text available
We report on a community effort between industry and academia to shape the future of graph query languages. We argue that existing graph database management systems should consider supporting a query language with two key characteristics. First, it should be composable, meaning, that graphs are the input and the output of queries. Second, the graph...
Conference Paper
Modern application domains such as Composite Event Recognition (CER) and real-time Analytics require the ability to dynamically refresh query results under high update rates. Traditional approaches to this problem are based either on the materialization of subresults (to avoid their recomputation) or on the recomputation of subresults (to avoid the...
Article
Modern application domains such as Composite Event Recognition (CER) and real-time Analytics require the ability to dynamically refresh query results under high update rates. Traditional approaches to this problem are based either on the materialization of subresults (to avoid their recomputation) or on the recomputation of subresults (to avoid the...
Article
Full-text available
Software developers adapt to the fast-moving nature of software systems with agile development techniques. However, database developers lack the tools and concepts to keep the pace. Whenever the current database schema is evolved, the already existing data needs to be evolved as well. This is usually realized with manually written SQL scripts, whic...
Chapter
Throughout the book we have highlighted open research challenges. In this final chapter we collect and consolidate these challenges, providing an overview of what we see as important open problems for the graph query processing research community, toward a shared research agenda for next-generation graph database systems.
Chapter
Graph-shaped data differs from structured data mainly because of the lack of an underlying schema and metadata. Graph datasets typically blend values with metadata information without a clear distinction among them. An important class of metadata is given by integrity constraints and dependencies, whose goal is to impose the adherence to a specifie...
Chapter
In this chapter we give a presentation of property graph query languages. We begin with the core language functionalities of graph navigation queries and (unions of) conjunctions of navigational queries. Our approach is then to give a presentation of major graph query language functionalities as restrictions or extensions of the recently proposed R...
Chapter
This chapter discusses how graph-centric features used in the graph query languages of Chapter 3 introduce new challenges in physical query evaluation. We focus particularly on the design and implementation of operators used in physical query plans for declarative graph queries.
Chapter
We describe in this chapter graph query specification techniques to help users formulate path queries from examples provided as input or via graph exploration. This problem amounts to learning queries from examples and reverse-engineering queries starting from examples that users want or do not want. The complexity of these problems has been studie...
Chapter
In this chapter, we introduce the property graph model. The property graph model is important for graph-based data management as it is implemented in many systems and used as a reference model for various research work. Our aim in this chapter is two-fold. First, we introduce the basic concepts of the property graph model, following the LDBC’s Grap...
Chapter
The diversity of applications in which graphs are used as primary data models led to a proliferation of a variety of graph processing tasks. For example, in social networks, one might be interested in looking for simple patterns in relationships between people such as finding persons with shared interests or discovering common friends. On the other...
Chapter
A property graph is a complex structure requiring some care to be represented in the linear memory model of computers. A memory representation for property graphs should be: (1) concise, i.e., represent a given graph with a small memory footprint; and (2) access-efficient, i.e., allow queries reading and writing as little data as possible to proce...
Article
Full-text available
We report on a community effort between industry and academia to shape the future of graph query languages. We argue that existing graph database management systems should consider supporting a query language with two key characteristics. First, it should be composable, meaning, that graphs are the input and the output of queries. Second, the graph...
Conference Paper
Pattern matching on large graphs is the foundation for a variety of application domains. The continuously increasing size of the underlying graphs requires highly parallel in-memory graph processing engines that need to consider non-uniform memory access (NUMA) and concurrency issues to scale up on modern multiprocessor systems. To tackle these asp...
Conference Paper
Graphs have become a ubiquitous type of data, increasing the desire and need to perform analytics on graph data. In this article, we review the fundamental concepts that form the common basis of most declarative graph querying languages. The article conveys a general understanding of these concepts which will help the reader to learn a specific gr...
Article
Driven by a multitude of use cases, graph data analytics has become a hot topic in research and industry. Particularly on big graphs, performing complex analytical queries efficiently to derive new insights is a challenging task. Systems that aim at solving the technical part of this challenge are often referred to as graph processing systems. They...
Conference Paper
Regular path queries (RPQs) are a fundamental part of recent graph query languages like SPARQL and PGQL. They allow the definition of recursive path structures through regular expressions in a declarative pattern matching environment. We study the use of the K2-tree graph compression technique to materialize RPQ results with low memory consumption...
Conference Paper
Full-text available
We introduce end-to-end support of co-existing schema versions within one database. While it is state of the art to run multiple versions of a continuously developed application concurrently, it is hard to do the same for databases. In order to keep multiple co-existing schema versions alive -- which are all accessing the same data set -- developer...
Conference Paper
Traditional modeling approaches and information systems assume static entities that represent all information and attributes at once. However, due to the evolution of information systems to increasingly context-aware and self-adaptive systems, this assumption no longer holds. To cope with the required flexibility, the role concept was introduced. A...
Article
Full-text available
We present InVerDa, a tool for end-to-end support of co-existing schema versions within one database. While it is state of the art to run multiple versions of a continuously developed application concurrently, the same is hard for databases. In order to keep multiple co-existing schema versions alive, that all access the same data set, developers u...
Conference Paper
In the increasingly dynamic realities of today’s software systems, it is no longer feasible to always expect human developers to react to changing environments and changing conditions immediately. Instead, software systems need to be self-aware and autonomously adapt their behavior according to their experiences gathered from their environment. Cur...
Patent
Full-text available
Disclosed herein are system, method, and computer program product embodiments for performing ad-hoc analytical queries of graph data. An embodiment operates by receiving a graph pattern for a subgraph of interest. The facts of interest are then selected from graph data based on the received graph pattern. Dimensions are then defined based on a dime...
Conference Paper
Full-text available
Software developers adapt to the fast-moving nature of software systems with agile development techniques. However, database developers lack the tools and concepts to keep pace. Data, already existing in a running product, needs to be evolved accordingly, usually by manually written SQL scripts. A promising approach in database research is to use a...
Conference Paper
Full-text available
We investigate the possibility to use update propagation methods for optimizing the evaluation of continuous queries. Update propagation allows for the efficient determination of induced changes to derived relations resulting from an explicitly performed base table update. In order to simplify the computation process, we propose the propagation of...
Conference Paper
Full-text available
Relational database management systems build on the closed world assumption requiring upfront modeling of a usually stable schema. However, a growing number of today's database applications are characterized by self-descriptive data. The schema of self-descriptive data is very dynamic and prone to frequent changes; a situation which is always troub...
Conference Paper
Full-text available
Currently, there is a mismatch between the conceptual model of an information system and its implementation in a database management system (DBMS). Most of the conceptual modeling languages relate their conceptual entities with relationships, but relational database management systems solely rely on the notion of relations to model both, entities a...
Article
Driven by novel application domains and hardware trends, database research and development set off to many novel and specialized architectures. Particularly in the area of physical data layout, specialized solutions have shown exceptional performance for specific applications. This trend is great for research and development and for those in need of...
Conference Paper
Software product lines (SPLs) allow creating a multitude of individual but similar products based on one common software model. Software components can be developed independently and new products can be generated easily. Inevitably, software evolves, a new version has to be deployed, and the data already existing in the database has to be transform...
Conference Paper
Full-text available
The past few years have seen a tremendous increase in often irregularly structured data that can be represented most naturally and efficiently in the form of graphs. Making sense of incessantly growing graphs is not only a key requirement in applications like social media analysis or fraud detection but also a necessity in many traditional enterpri...
Conference Paper
Full-text available
An increasing number of application fields represent dynamic and open discourses characterized by high mutability, variety, and pluralism in data. Data in dynamic and open discourses typically exhibits an irregular schema. Such data cannot be directly represented in the traditional relational data model. Mapping strategies allow representation but...
Article
Full-text available
Database Management Systems (DBMS) are used by software applications, to store, manipulate, and retrieve large sets of data. However, the requirements of current software systems pose various challenges to established DBMS. First, most software systems organize their data by means of objects rather than relations leading to increased maintenance, r...
Thesis
Full-text available
With the ongoing expansion of information technology, new fields of application requiring data management emerge virtually every day. In our knowledge culture increasing amounts of data and work force organized in more creativity-oriented ways also radically change traditional fields of application and question established assumptions about data ma...
Conference Paper
Full-text available
In an increasing number of use cases, databases face the challenge of managing irregularly structured data. Irregularly structured data is characterized by a quickly evolving variety of entities without a common set of attributes. These entities do not show enough regularity to be captured in a traditional database schema. A common solution is to c...
Article
In an increasing number of use cases, databases face the challenge of managing heterogeneous data. Heterogeneous data is characterized by a quickly evolving variety of entities without a common set of attributes. These entities do not show enough regularity to be captured in a traditional database schema. A common solution is to centralize the dive...
Conference Paper
Full-text available
As databases accumulate growing amounts of data at an increasing rate, adaptive indexing becomes more and more important. At the same time, applications and their use get more agile and flexible, resulting in less steady and less predictable workload characteristics. Being inert and coarse-grained, state-of-the-art index tuning techniques become le...
Conference Paper
Full-text available
Main memory database management systems are used more often and in a wide spread of application scenarios. To take significant advantage of the main memory read performance, most techniques known from traditional disk-centric database systems have to be adapted and re-designed. In the field of indexing, many main-memory-optimized index structures h...
Article
Full-text available
As databases accumulate growing amounts of data at an increasing rate, adaptive indexing becomes more and more important. At the same time, applications and their use get more agile and flexible, resulting in less steady and less predictable workload characteristics. Being inert and coarse-grained, state-of-the-art index tuning techniques become le...
Conference Paper
Full-text available
With rapidly increasing datasets and more dynamic workloads, adaptive partial indexing becomes an important way to keep indexing efficient. During times of changing workloads, the query performance suffers from inefficient table scans while the index tuning mechanism adapts the partial index. In this paper we present the Adaptive Index Buffer. T...
Conference Paper
Full-text available
Data management is not limited anymore to towering data silos full of perfectly structured, well integrated data. Today, we need to process and make sense of data from diverse sources (public and on-premise), in different application contexts, with different schemas, and with varying degrees of structure and quality. Because of the necessity to def...
Conference Paper
In modern IT landscapes, databases are subject to a major role change. Especially in Service-Oriented Architectures, databases are more and more frequently dedicated to a single application. Therefore, it is even more important to reflect the application requirements in their design. Software developers and application experts formulate application...
Conference Paper
Full-text available
Today's information systems are often built on the foundation of service-oriented environments. Although the fundamental purpose of an information system is the processing of data and information, the service-oriented architecture (SOA) does not treat data as a core first class citizen. Current SOA technologies support neither the explicit modeling...
Conference Paper
Full-text available
Today, service orientation is a well established concept in modern IT infrastructures. Web services and WS-BPEL as the two key technologies handle large structured data sets very inefficiently because they process the whole data set at once. In this demo, we present a framework to build standing business processes. Standing business processes rely...
Conference Paper
Full-text available
Service-oriented architectures (SOA) based on Web service technology play an increasingly important role in many different application areas. The current service invocation methodology suffers from performance problems and heavy resource consumption when services are used to process large amounts of data. A number of solutions to this problem have...
Conference Paper
Full-text available
Physical design has always been an important part of database administration. Today's commercial database management systems offer physical design tools, which recommend a physical design for a given workload. However, these tools work only with static workloads and ignore the fact that workloads, and physical designs, may change over time. Resear...
Conference Paper
Full-text available
Physical design has always been an important part of database administration. Today's commercial database management systems offer physical design tools, which recommend a physical design for a given workload. However, these tools work only with static workloads and ignore the fact that workloads, and physical designs, may change over time. Researc...
Article
Full-text available
The database research is always on the move. In order to integrate novel concepts, the significance of the database programmability aspect more and more increases. The programmability aspect focuses on internal components as well as on principle to push-down application logic to the database system. In this paper, we propose a novel database progra...
