About
27 Publications
4,298 Reads
792 Citations
Publications (27)
Materialisation facilitates Datalog reasoning by precomputing all consequences of the facts and the rules so that queries can be directly answered over the materialised facts. However, storing all materialised facts may be infeasible in practice, especially when the rules are complex and the given set of facts is large. We observe that for certain...
Datalog reasoning based on the seminaive evaluation strategy evaluates rules using traditional join plans, which often leads to redundancy and inefficiency in practice, especially when the rules are complex. Hypertree decompositions help identify efficient query plans and reduce similar redundancy in query answering. However, it is unclear how this...
Datalog reasoning based on the seminaïve evaluation strategy evaluates rules using traditional join plans, which often leads to redundancy and inefficiency in practice, especially when the rules are complex. Hypertree decompositions help identify efficient query plans and reduce similar redundancy in query answering. However, it is unclear how th...
Following the recent successful examples of large technology companies, many modern enterprises seek to build Knowledge Graphs to provide a unified view of corporate knowledge, and to draw deep insights using machine learning and logical reasoning. There is currently a perceived disconnect between the traditional approaches for data science, typica...
Datalog is a rule-based formalism that can axiomatise recursive properties such as reachability and transitive closure. Datalog implementations often materialise (i.e., precompute and store) all facts entailed by a datalog program and a set of explicit facts. Queries can thus be answered directly in the materialised facts, which is beneficial to th...
When RDF datasets become too large to be managed by centralised systems, they are often distributed in a cluster of shared-nothing servers, and queries are answered using a distributed join algorithm. Although such solutions have been extensively studied in relational and RDF databases, we argue that existing approaches exhibit two drawbacks. First...
A fruitful application of Semantic Technologies in the field of healthcare data analysis has emerged from the collaboration between Oxford and Kaiser Permanente, a US healthcare provider (HMO). US HMOs have to annually deliver measurement results on their quality of care to US authorities. One of these sets of measurements is defined in a specificat...
Evaluating joins over RDF data stored in a shared-nothing server cluster is key to processing truly large RDF datasets. To the best of our knowledge, the existing approaches use a variant of the data exchange operator that is inserted into the query plan statically (i.e., at query compile time) to shuffle data between servers. We argue that such ap...
This paper describes the outcomes of an ongoing collaboration between Siemens and the University of Oxford, with the goal of facilitating the design of ontologies and their deployment in applications. Ontologies are often used in industry to capture the conceptual information models underpinning applications. We start by describing the role that su...
We study the problem of rewriting a Disjunctive Datalog program into an equivalent plain Datalog program (i.e., one that entails the same facts for every dataset). We show that a Disjunctive Datalog program is Datalog rewritable if and only if it can be rewritten into a linear program (i.e., having at most one IDB body atom in each rule), thus prov...
Answering conjunctive queries over ontology-enriched datasets is a core reasoning task for many applications. Query answering is, however, computationally very expensive, which has led to the development of query answering procedures that sacrifice either expressive power of the ontology language, or the completeness of query answers in order to im...
We present RDFox—a main-memory, scalable, centralised RDF store that supports materialisation-based parallel datalog reasoning and SPARQL query answering. RDFox uses novel and highly-efficient parallel reasoning algorithms for the computation and incremental update of datalog materialisations with efficient handling of owl:sameAs. In this system de...
Materialisation precomputes all consequences of a set of facts and a datalog program so that queries can be evaluated directly (i.e., independently from the program). Rewriting optimises materialisation for datalog programs with equality by replacing all equal constants with a single representative; and incremental maintenance algorithms can effici...
Datalog-based systems often materialise all consequences of a datalog program and the data, allowing users' queries to be evaluated directly in the materialisation. This process, however, can be computationally intensive, so most systems update the materialisation incrementally when input data changes. We argue that existing solutions, such as the...
Rewriting is widely used to optimise owl:sameAs reasoning in materialisation-based OWL 2 RL systems. We investigate issues related to both the correctness and efficiency of rewriting, and present an algorithm that guarantees correctness, improves efficiency, and can be effectively parallelised. Our evaluation shows that our approach can reduce reas...
We study the closely related problems of rewriting disjunctive datalog programs and non-Horn DL ontologies into plain datalog programs that entail the same facts for every dataset. We first propose the class of markable disjunctive datalog programs, which is efficiently recognisable and admits polynomial rewritings into datalog. Markability natural...
We present an enhanced hybrid approach to OWL query answering that combines an RDF triple-store with an OWL reasoner in order to provide scalable pay-as-you-go performance. The enhancements presented here include an extension to deal with arbitrary OWL ontologies, and optimisations that significantly improve scalability. We have implemented these t...
We present a novel approach to parallel materialisation (i.e., fixpoint computation) of datalog programs in centralised, main-memory, multi-core RDF systems. Our approach comprises an algorithm that evenly distributes the workload to cores, and an RDF indexing data structure that supports efficient, 'mostly' lock-free parallel updates. Our empirica...
We study the problem of rewriting a disjunctive datalog program into plain datalog. We show that a disjunctive program is rewritable if and only if it is equivalent to a linear disjunctive program, thus providing a novel characterisation of datalog rewritability. Motivated by this result, we propose weakly linear disjunctive datalog, a novel rule-...
In our previous work, we showed how a scalable OWL 2 RL reasoner can be used to compute both lower and upper bound query answers over very large datasets and arbitrary OWL 2 ontologies. However, when these bounds do not coincide, there still remain a number of possible answer tuples whose status is not determined. In this paper, we show how in the...
We consider the quantifier-free languages, Bc and Bc0, obtained by augmenting the signature of Boolean algebras with a unary predicate representing, respectively, the property of being connected, and the property of having a connected interior. These languages are interpreted over the regular closed sets of n-dimensional Euclidean space (n greater...
We investigate (quantifier-free) spatial constraint languages with equality, contact and connectedness predicates as well as Boolean operations on regions, interpreted over low-dimensional Euclidean spaces. We show that the complexity of reasoning varies dramatically depending on the dimension of the space and on the type of regions considered. For...
By a Euclidean logic, we understand a formal language whose variables range over subsets of Euclidean space, of some fixed dimension, and whose non-logical primitives have fixed meanings as geometrical properties, relations and operations involving those sets. In this paper, we consider first-order Euclidean logics with primitives for the propertie...
We present a complete axiomatization of a logic denoted by MTML (Mereo-Topological Modal Logic) based on the following set of mereotopological relations: part-of, overlap, underlap, contact, dual contact and interior part-of. We prove completeness theorems for MTML with respect to several classes of models including the standard topological mod-...