## About

451

Publications

49,443

Reads

**How we measure 'reads'**

A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more

19,589

Citations

Citations since 2016

## Publications

Publications (451)

Structural decomposition methods, such as generalized hypertree decompositions, have been successfully used for solving constraint satisfaction problems (CSPs). As decompositions can be reused to solve CSPs with the same constraint scopes, investing resources in computing good decompositions is beneficial, even though the computation itself is hard...

Modern applications combine information from a great variety of sources. Oftentimes, some of these sources, like machine-learning systems, are not strictly binary but associated with some degree of (lack of) confidence in the observation. We propose MV-Datalog and $\mathrm{MV-Datalog}^\pm$ as extensions of Datalog and $\mathrm{Datalog}^\pm$ , respe...

We investigate the computational complexity of mining guarded clauses from clausal datasets through the framework of inductive logic programming (ILP). We show that learning guarded clauses is NP-complete and thus one step below the Sigma2-complete task of learning Horn clauses on the polynomial hierarchy. Motivated by practical applications on lar...

Constraint Satisfaction Problems (CSP) are notoriously hard. Consequently, powerful decomposition methods have been developed to overcome this complexity. However, this poses the challenge of actually computing such a decomposition for a given CSP instance, and previous algorithms have shown their limitations in doing so. In this paper, we present...

The chase procedure, originally introduced for checking implication of database constraints, and later on used for computing data exchange solutions, has recently become a central algorithmic tool in rule-based ontological reasoning. In this context, a key problem is non-uniform chase termination: does the chase of a database w.r.t. a rule-based on...

Vadalog is a system for performing complex reasoning tasks such as those required in advanced knowledge graphs. The logical core of the underlying Vadalog language is the warded fragment of tuple-generating dependencies (TGDs). This formalism ensures tractable reasoning in data complexity, while a recent analysis focusing on a practical implementat...

Modern applications combine information from a great variety of sources. Oftentimes, some of these sources, like Machine-Learning systems, are not strictly binary but associated with some degree of (lack of) confidence in the observation. We propose MV-Datalog and MV-Datalog+- as extensions of Datalog and Datalog+-, respectively, to the fuzzy seman...

Following the recent successful examples of large technology companies, many modern enterprises seek to build Knowledge Graphs to provide a unified view of corporate knowledge, and to draw deep insights using machine learning and logical reasoning. There is currently a perceived disconnect between the traditional approaches for data science, typica...

Hypertree decompositions (HDs), as well as the more powerful generalized hypertree decompositions (GHDs), and the yet more general fractional hypertree decompositions (FHDs) are hypergraph decomposition methods successfully used for answering conjunctive queries and for solving constraint satisfaction problems. Every hypergraph H has a width relati...

This work investigates the decidability and complexity of database query answering under guarded existential rules with nonmonotonic negation according to the classical stable model semantics. In this setting, existential quantification is interpreted via Skolem functions, and the unique name assumption is adopted. As a first result, we show the de...

We investigate the computational complexity of mining guarded clauses from clausal datasets through the framework of inductive logic programming (ILP). We show that learning guarded clauses is NP-complete and thus one step below the $\sigma^P_2$-complete task of learning Horn clauses on the polynomial hierarchy. Motivated by practical applications...

To cope with the intractability of answering Conjunctive Queries (CQs) and solving Constraint Satisfaction Problems (CSPs), several notions of hypergraph decompositions have been proposed—giving rise to different notions of width, noticeably, plain, generalized, and fractional hypertree width (hw, ghw, and fhw). Given the increasing interest in usi...

We concentrate on ontology-mediated queries (OMQs) expressed using guarded Datalog\(^\exists \) and conjunctive queries. Guarded Datalog\(^\exists \) is a rule-based knowledge representation formalism inspired by the guarded fragment of first-order logic, while conjunctive queries represent a prominent database query language that lies at the core...

Modern trends in data collection are bringing current mainstream techniques for database query processing to their limits. Consequently, various novel approaches for efficient query processing are being actively studied. One such approach is based on hypertree decompositions (HDs), which have been shown to carry great potential to process complex q...

Constraint Satisfaction Problems (CSPs) play a central role in many applications in Artificial Intelligence and Operations Research. In general, solving CSPs is NP-complete. The structure of CSPs is best described by hypergraphs. Therefore, various forms of hypergraph decompositions have been proposed in the literature to identify tractable fragmen...

This work deals with the problem of semantic optimization of the central class of conjunctive queries (CQs). Since CQ evaluation is NP-complete, a long line of research has focussed on identifying fragments of CQs that can be efficiently evaluated. One of the most general restrictions corresponds to generalized hypetreewidth bounded by a fixed cons...

The problem of querying RDF data is a central issue for the development of the Semantic Web. The query language SPARQL has become the standard language for querying RDF since its W3C standardization in 2008. However, the 2008 version of this language missed some important functionalities: reasoning capabilities to deal with RDFS and OWL vocabularie...

Constraint Satisfaction Problems (CSPs) play a central role in many applications in Artificial Intelligence and Operations Research. In general, solving CSPs is NP-complete. The structure of CSPs is best described by hypergraphs. Therefore, various forms of hypergraph decompositions have been proposed in the literature to identify tractable fragmen...

To cope with the intractability of answering Conjunctive Queries (CQs) and solving Constraint Satisfaction Problems (CSPs), several notions of hypergraph decompositions have been proposed -- giving rise to different notions of width, noticeably, plain, generalized, and fractional hypertree width (hw, ghw, and fhw). Given the increasing interest in...

Constraint satisfaction problems (CSPs) are an important formal framework for the uniform treatment of various prominent AI tasks, e.g., coloring or scheduling problems. Solving CSPs is, in general, known to be NP-complete and fixed-parameter intractable when parameterized by their constraint scopes. We give a characterization of those classes of C...

Fractional (hyper-)graph theory is concerned with the specific problems that arise when fractional analogues of otherwise integer-valued (hyper-)graph invariants are considered. The focus of this paper is on fractional edge covers of hypergraphs. Our main technical result generalizes and unifies previous conditions under which the size of the suppo...

Constraint satisfaction problems (CSPs) are an important formal framework for the uniform treatment of various prominent AI tasks, e.g., coloring or scheduling problems. Solving CSPs is, in general, known to be NP-complete and fixed-parameter intractable when parameterized by their constraint scopes. We give a characterization of those classes of C...

Constraint Satisfaction Problems (CSP) are notoriously hard. Consequently, powerful decomposition methods have been developed to overcome this complexity. However, this poses the challenge of actually computing such a decomposition for a given CSP instance, and previous algorithms have shown their limitations in doing so. In this paper, we present...

Guarded existential rules form a robust rule-based language for modelling ontologies. The central problem of ontology-based query answering, as well as the notion of polynomial combined rewritability, have been extensively studied during the last years for this formalism. However, the relevant setting where the underlying signature is considered to...

The introduction of novel Datalog +/- fragments with good theoretical properties, together with the growing use of enterprise knowledge graphs motivated the development of Vadalog, a knowledge graph management system developed at the University of Oxford. It adopts Warded Datalog +/- as the core of its language for knowledge representation and reas...

Hypertree decompositions (HDs), as well as the more powerful generalized hypertree decompositions (GHDs), and the yet more general fractional hypertree decompositions (FHDs) are hypergraph decomposition methods successfully used for answering conjunctive queries and for solving constraint satisfaction problems. Every hypergraph $H$ has a width rela...

We reconsider the problem of containment of monadic datalog (MDL) queries in unions of conjunctive queries (UCQs). Prior work has dealt with special cases of the problem but has left the precise complexity characterization open. In addition, the complexity of one important special case, that of containment under access patterns, was not known befor...

Background
Data scientists spend considerable amounts of time preparing data for analysis. Data preparation is labour intensive because the data scientist typically takes fine grained control over each aspect of each step in the process, motivating the development of techniques that seek to reduce this burden.
Results
This paper presents an archit...

Vadalog is a system for performing complex reasoning tasks such as those required in advanced knowledge graphs. The logical core of the underlying Vadalog language is the warded fragment of tuple-generating dependencies (TGDs). This formalism ensures tractable reasoning in data complexity, while a recent analysis focusing on a practical implementat...

To cope with the intractability of answering Conjunctive Queries (CQs) and solving Constraint Satisfaction Problems (CSPs), several notions of hypergraph decompositions have been proposed - giving rise to different notions of width, noticeably, plain, generalized, and fractional hypertree width (hw, ghw, and fhw). Given the increasing interest in u...

Data-driven websites are mostly accessed through search interfaces. Such sites follow a common publishing pattern that, surprisingly, has not been fully exploited for unsupervised data extraction yet: the result of a search is presented as a paginated list of result records. Each result record contains the main attributes about one single object, a...

Vadalog is a logic-based reasoning language for modern AI applications, in particular for knowledge graph systems. In this paper, we present recent advances and applications, with a focus on the Vadalog language itself. We first give an easy-to-access self-contained introduction to Warded Datalog+/−, the logical core of Vadalog. We then discuss som...

Answering Conjunctive Queries (CQs) and solving Constraint Satisfaction Problems (CSPs) are arguably among the most fundamental tasks in Computer Science. They are classical NP-complete problems. Consequently, the search for tractable fragments of these problems has received a lot of research interest over the decades. This research has traditional...

To cope with the intractability of answering Conjunctive Queries (CQs) and solving Constraint Satisfaction Problems (CSPs), several notions of hypergraph decompositions have been proposed -- giving rise to different notions of width, noticeably, plain, generalized, and fractional hypertree width (hw, ghw, and fhw). Given the increasing interest in...

The problem of querying RDF data is a central issue for the development of the Semantic Web. The query language SPARQL has become the standard language for querying RDF since its W3C standardization in 2008. However, the 2008 version of this language missed some important functionalities: reasoning capabilities to deal with RDFS and OWL vocabularie...

Vadalog is a system for performing complex reasoning tasks such as those required in advanced knowledge graphs. The logical core of the underlying Vadalog language is the warded fragment of tuple-generating dependencies (TGDs). This formalism ensures tractable reasoning in data complexity, while a recent analysis focusing on a practical implementat...

With the introduction of its Knowledge Graph [20], Google has coined the name for a new generation of knowledge-based systems that go beyond what was previously expected of areas that include graph databases, knowledge bases, machine-learning systems and rule-based logical reasoners.

Following the recent successful examples of large technology companies, many modern enterprises seek to build knowledge graphs to provide a unified view of corporate knowledge and to draw deep insights using machine learning and logical reasoning. There is currently a perceived disconnect between the traditional approaches for data science, typical...

Over the past years, there has been a resurgence of Datalog-based systems in the database community as well as in industry. In this context, it has been recognized that to handle the complex knowl\-edge-based scenarios encountered today, such as reasoning over large knowledge graphs, Datalog has to be extended with features such as existential quan...

Two paradigmatic restrictions that have been studied for ensuring the decidability of query answering under existential rules are guardedness and stickiness. With the aim of consolidating these restrictions, a flexible condition, called tameness, has been proposed a few years ago, which relies on hybrid reasoning, i.e., a combination of forward and...

Over the past years, there has been a resurgence of Datalog-based systems in the database community as well as in industry. In this context, it has been recognized that to handle the complex knowl- edge-based scenarios encountered today, such as reasoning over large knowledge graphs, Datalog has to be extended with features such as existential quan...

Hypertree decompositions, as well as the more powerful generalized hypertree decompositions (GHDs), and the yet more general fractional hypertree decompositions (FHD) are hypergraph decomposition methods successfully used for answering conjunctive queries and for the solution of constraint satisfaction problems. Every hypergraph H has a width relat...

Most modern web scrapers use an embedded browser to render web pages and to simulate user actions. Such scrapers (or wrappers) are therefore expensive to execute, in terms of time and network traffic. In contrast, it is magnitudes more resource-efficient to use a "browserless" wrapper which directly accesses a web server through HTTP requests, and...

Duplicates in data management are common and problematic. In this work, we present a translation of Datalog under bag semantics into a well-behaved extension of Datalog (the so-called warded Datalog+-) under set semantics. From a theoretical point of view, this allows us to reason on bag semantics by making use of the well-established theoretical f...

Tree projections provide a unifying framework to deal with most structural decomposition methods of constraint satisfaction problems (CSPs). Within this framework, a CSP instance is decomposed into a number of sub-problems, called views, whose solutions are either already available or can be computed efficiently. The goal is to arrange portions of...

Many modern companies wish to maintain knowledge in the form of a corporate knowledge graph and to use and manage this knowledge via a knowledge graph management system (KGMS). We formulate various requirements for a fully fledged KGMS. In particular, such a system must be capable of performing complex reasoning tasks but, at the same time, achieve...

This paper gives a short overview of specific logical approaches to data extraction, data management, and reasoning about data. In particular, we survey theoretical results and formalisms that have been obtained and used in the context of the Lixto Project at TU Wien, the DIADEM project at the University of Oxford, and the VADA project, which is cu...

Hypertree decompositions, the more powerful generalized hypertree decompositions (GHDs), and the yet more general fractional hypertree decompositions (FHD) are hypergraph decomposition methods successfully used for answering conjunctive queries and for solving constraint satisfaction problems. Each hypergraph H has a width relative to each of these...

We claim it is realistic to assume that a database management system provides access to the active domain via built-in relations. Therefore, product databases, i.e., databases that include designated predicates that hold the active domain, form a natural notion that deserves our attention. An important issue then is to look at the consequences of p...

In the database context, the hypertree decomposition method is used for query optimization, whereby conjunctive queries having a low degree of cyclicity can be recognized and decomposed automatically, and efficiently evaluated. Hypertree decompositions were introduced at ACM PODS 1999. The present paper reviews' in form of questions and answers' th...

A conjunctive query (CQ) is semantically acyclic if it is equivalent to an acyclic one. Semantic acyclicity has been studied in the constraint-free case, and deciding whether a query enjoys this property is NP-complete. However, in case the database is subject to constraints such as tuple-generating dependencies (tgds) that can express, e.g., inclu...

UML class diagrams (UCDs) are a widely adopted formalism for modeling the intensional structure of a software system. Although UCDs are typically guiding the implementation of a system, it is common in practice that developers need to recover the class diagram from an implemented system. This process is known as reverse engineering. A fundamental p...

This tutorial, which is a continuation of the tutorial “Datalog and Its Extensions for Semantic Web Databases” presented in the Reasoning Web 2012 Summer School, discusses recent advances in the Datalog\(^\pm \) family of languages for knowledge representation and reasoning. These languages extend plain Datalog with key modeling features such as ex...

Figure 3 was left out in the article (Gottlob et al. 2013). The figure that is given there as figure 3 should in fact be figure 4. Further, the reference to figure 3 on the 4 th line of page 889 should be a reference to figure 4. The correct figure 3 (missing in the paper) is supplied below. We apologise for this error.

The web is overflowing with implicitly structured data, spread over hundreds of thousands of sites, hidden deep behind search forms, or siloed in marketplaces, only accessible as HTML. Automatic extraction of structured data at the scale of thousands of websites has long proven elusive, despite its central role in the "web of data".
Through an exte...

We give a solution to the succinctness problem for the size of first-order rewritings of conjunctive queries in ontology-based data access with ontology languages such as OWL 2 QL , linear Datalog±Datalog± and sticky Datalog±Datalog±. We show that positive existential and nonrecursive datalog rewritings, which do not use extra non-logical symbols (...

The hypergraph duality problem DUAL is defined as follows: given two simple hypergraphs G and H, decide whether H consists precisely of all minimal transversals of G (in which case we say that G is the dual of H). This problem is equivalent to decide whether two given non-redundant monotone DNFs are dual. It is known that co-DUAL, the complementary...

In this work, we present LoCo, a fragment of classical first-order logic carefully tailored for expressing technical product configuration problems. The core feature of LoCo is that the number of components used in configurations does not have to be finitely bounded explicitly, but instead is bounded implicitly through the axioms. Computing configu...

The problem of querying RDF data is a central issue for the development of the Semantic Web. The query language SPARQL has become the standard language for querying RDF, since its standardization in 2008. However, the 2008 version of this language missed some important functionalities: reasoning capabilities to deal with RDFS and OWL vocabularies,...

The so-called existential rules have recently gained attention, mainly due to their adequate expressiveness for ontological query answering. Several decidable fragments of such rules have been introduced, employing restriction such as various forms of guardedness to ensure decidability. Some of the more well-known languages in this arena are (weakl...

Ontological queries are evaluated against a knowledge base consisting of an
extensional database and an ontology (i.e., a set of logical assertions and
constraints which derive new intensional knowledge from the extensional
database), rather than directly on the extensional database. The evaluation and
optimization of such queries is an intriguing...

Evaluating a Boolean conjunctive query Q against a guarded first-order theory
F is equivalent to checking whether "F and not Q" is unsatisfiable. This
problem is relevant to the areas of database theory and description logic.
Since Q may not be guarded, well known results about the decidability,
complexity, and finite-model property of the guarded...

The recently introduced Datalog+ / − family of ontology languages is especially useful for representing and reasoning over lightweight ontologies, and is set to play a central role in the context of query answering and information extraction for the Semantic Web. Recently, it has become apparent that it is necessary to develop a principled way to h...

Motivated by considerations in quantum mechanics, we introduce the class of robust constraint satisfaction problems in which the question is whether every partial assignment of a certain length can be extended to a solution, provided the partial assignment does not violate any of the constraints of the given instance. We explore the complexity of s...

Combinatorial auctions allow bidders to bid on bundles of items rather than just on single items. The winner determination problem in combinatorial auctions is the problem of determining the allocation of items to bidders such that the sum of the accepted bid prices is maximized. This problem is equivalent to the well-known maximum-weight set packi...

We study the problem of answering a union of Boolean conjunctive queries q against a database Δ, and a logical theory ϕ which falls in the guarded fragment with transitive guards (GF + TG). We trace the frontier between decidability and undecidability of the problem under consideration. Surprisingly, we show that query answering under GF2 + TG, i.e...

Existential rules are Datalog rules extended with existential quantifiers in rule-heads. Three fundamental restriction paradigms that have been studied for ensuring decidability of query answering under existential rules are weak-acyclicity, guardedness and stickiness. Towards the identification of even more expressive decidable languages, several...