Article

Structural characterizations of the navigational expressiveness of relation algebras on a tree


Abstract

Given a document D in the form of an unordered node-labeled tree, we study the expressiveness on D of various basic fragments of XPath, the core navigational language on XML documents. Working from the perspective of these languages as fragments of Tarski's relation algebra, we give characterizations, in terms of the structure of D, for when a binary relation on its nodes is definable by an expression in these algebras. Since each pair of nodes in such a relation represents a unique path in D, our results therefore capture the sets of paths in D definable in each of the fragments. We refer to this perspective on language semantics as the "global view." In contrast with this global view, there is also a "local view" where one is interested in the nodes to which one can navigate starting from a particular node in the document. In this view, we characterize when a set of nodes in D can be defined as the result of applying an expression to a given node of D. All these definability results, both in the global and the local view, are obtained by using a robust two-step methodology, which consists of first characterizing when two nodes cannot be distinguished by an expression in the respective fragments of XPath, and then bootstrapping these characterizations to the desired results.
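
The two views described in the abstract can be illustrated with a minimal Python sketch. The toy four-node tree and all names (`PARENT`, `DOWN`, `compose`, `converse`, `local_view`, `UP_THEN_DOWN`) are assumptions made for illustration and are not taken from the paper. An expression denotes a binary relation on the nodes (the global view); applying it to one node yields a set of nodes (the local view).

```python
# Hypothetical toy document: node -> parent (the root has parent None).
PARENT = {"r": None, "a": "r", "b": "r", "c": "a"}

# Child edge relation as a set of (parent, child) pairs.
DOWN = {(p, n) for n, p in PARENT.items() if p is not None}


def compose(r, s):
    """Relational composition: (x, z) whenever x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}


def converse(r):
    """Converse relation: swap the two components of every pair."""
    return {(y, x) for (x, y) in r}


def local_view(rel, node):
    """Nodes reachable from `node` through `rel` (the local view)."""
    return {y for (x, y) in rel if x == node}


# Global view: the whole relation "go up one step, then down one step".
UP_THEN_DOWN = compose(converse(DOWN), DOWN)

# Local view: the same expression applied to the single node "a".
SIBLINGS_OR_SELF_OF_A = local_view(UP_THEN_DOWN, "a")
```

In this sketch the global view is the set of pairs `UP_THEN_DOWN`, and the local view at node `a` is the set of nodes paired with `a` in that relation.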

... Notice in particular that the separation results on graphs of Fletcher et al. do not necessarily also apply to trees. In addition, the expressiveness results for several XPath fragments [2,12,19,20,24,25,27] in the context of XML do not provide a complete picture of the relative expressive power of the navigational query languages we consider here. As a first step towards a complete picture of the relative expressive power, we study the expressive power of downward fragments: these are navigational query languages that only allow downward navigation in the tree via parent-child relations. ...
... Lastly, the XPath algebra of Gyssens et al. [12], when restricted to the downward fragment, corresponds to the navigational query language N (π, ∩, −). This work studied the expressiveness of various XPath algebra fragments with respect to a given tree, whereas we study the expressive power with respect to the class of labeled and unlabeled trees and chains. ...
Article
Motivated by the continuing interest in the tree data model, we study the expressive power of downward navigational query languages on trees and chains. Basic navigational queries are built from the identity relation and edge relations using composition and union. We study the effects on relative expressiveness when we add transitive closure, projections, coprojections, intersection, and difference; this for boolean queries and path queries on labeled and unlabeled structures. In all cases, we present the complete Hasse diagram. In particular, we establish, for each query language fragment that we study on trees, whether it is closed under difference and intersection.
... Relation algebraic methods have also been useful in proving metamathematical results, for example that for all n ≥ 3 there are sentences involving only 3 variables whose formal proofs require n variables [14]. In addition, relation algebras and their generalisations have numerous practical applications in computer science, for example in verification [9], computation tasks involving finite topologies [2], and navigation of XML documents [6], to name just a few. ...
Article
Using a variation of the rainbow construction and various pebble and colouring games, we prove that RRA, the class of all representable relation algebras, cannot be axiomatised by any first-order relation algebra theory of bounded quantifier depth. We also prove that the class At(RRA) of atom structures of representable, atomic relation algebras cannot be defined by any set of sentences in the language of RA atom structures that uses only a finite number of variables.
Article
We study the definability problem for first-order logic, denoted by FO-Def. The input of FO-Def is a relational database instance I and a relation R; the question to answer is whether there exists a first-order query Q (or, equivalently, a relational algebra expression Q) such that Q evaluated on I gives R as an answer. Although the study of FO-Def dates back to 1978, when the decidability of this problem was shown, the exact complexity of FO-Def remains as a fundamental open problem. In this article, we provide a polynomial-time algorithm for solving FO-Def that uses calls to a graph-isomorphism subroutine (or oracle). As a consequence, the first-order definability problem is found to be complete for the class GI of all problems that are polynomial-time Turing reducible to the graph isomorphism problem, thus closing the open question about the exact complexity of this problem. The technique used is also applied to a generalized version of the problem that accepts a finite set of relation pairs, and whose exact complexity was also open; this version is also found to be GI-complete.
Conference Paper
Motivated by the continuing interest in the tree data model, we study the expressive power of downward fragments of navigational query languages on trees. The basic navigational query language we consider expresses queries by building binary relations from the edge relations and the identity relation, using composition and union. We study the effects on the expressive power when we add transitive closure, projections, coprojections, intersection, and difference. We study expressiveness at the level of boolean queries and path queries, on labeled and unlabeled trees, and on labeled and unlabeled chains. In all these cases, we are able to present the complete Hasse diagram of relative expressiveness. In particular, we were able to decide, for each fragment of the navigational query languages that we study, whether it is closed under difference and intersection when applied on trees.
Conference Paper
We study the satisfiability problem associated with XPath in the presence of DTDs. This is the problem of determining, given a query p in an XPath fragment and a DTD D, whether or not there exists an XML document T such that T conforms to D and the answer of p on T is nonempty. We consider a variety of XPath fragments widely used in practice, and investigate the impact of different XPath operators on satisfiability analysis. We first study the problem for negation-free XPath fragments with and without upward axes, recursion and data-value joins, identifying which factors lead to tractability and which to NP-completeness. We then turn to fragments with negation but without data values, establishing lower and upper bounds in the absence and in the presence of upward modalities and recursion. We show that with negation the complexity ranges from PSPACE to EXPTIME. Moreover, when both data values and negation are in place, we find that the complexity ranges from NEXPTIME to undecidable. Finally, we give a finer analysis of the problem for particular classes of DTDs, exploring the impact of various DTD constructs, identifying tractable cases, as well as providing the complexity in the query size alone.
Article
Motivated by applications in databases, this paper considers various fragments of the calculus of binary relations. The fragments are obtained by leaving out, or keeping in, some of the standard operators, along with some derived operators such as set difference, projection, coprojection, and residuation. For each considered fragment, a characterization is obtained for when two given binary relational structures are indistinguishable by expressions in that fragment. The characterizations are based on appropriately adapted notions of simulation and bisimulation.
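
As a rough illustration of the derived operators named above, the following Python sketch models binary relations as sets of pairs. The function names and the three-element example are assumptions made here for illustration; the definitions follow the usual reading of projection (elements that have an outgoing or incoming pair) and coprojection (elements that have none), returned as subsets of the identity relation.

```python
def projection1(rel):
    """Identity pairs (x, x) for every x that has an outgoing pair in rel."""
    return {(x, x) for (x, _) in rel}


def projection2(rel):
    """Identity pairs (y, y) for every y that has an incoming pair in rel."""
    return {(y, y) for (_, y) in rel}


def coprojection1(rel, universe):
    """Identity pairs (x, x) for every x in universe with no outgoing pair."""
    sources = {x for (x, _) in rel}
    return {(x, x) for x in universe if x not in sources}


def coprojection2(rel, universe):
    """Identity pairs (y, y) for every y in universe with no incoming pair."""
    targets = {y for (_, y) in rel}
    return {(y, y) for y in universe if y not in targets}


def difference(r, s):
    """Set difference of two relations."""
    return r - s


# Hypothetical example: a chain 1 -> 2 -> 3.
UNIVERSE = {1, 2, 3}
EDGES = {(1, 2), (2, 3)}
```

On the chain, `projection1(EDGES)` selects the nodes with a successor and `coprojection1(EDGES, UNIVERSE)` selects the single node without one.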
Conference Paper
We study structural properties of each of the main sublanguages of XPath [8] commonly used in practice. First, we characterize the expressive power of these language fragments in terms of both logics and tree patterns. Second, we investigate closure properties, focusing on the ability to perform basic Boolean operations while remaining within the fragment. We give a complete picture of the closure properties of these fragments, treating XPath expressions both as functions of arbitrary nodes in a document tree, and as functions that are applied only at the root of the tree. Finally, we provide sound and complete axiom systems and normal forms for several of these fragments. These results are useful for simplification of XPath expressions and optimization of XML queries.
Conference Paper
Supporting efficient access to XML data using XPath [3] continues to be an important research problem [6, 12]. XPath queries are used to specify node-labeled trees which match portions of the hierarchical XML data. In XPath query evaluation, indices similar to those used in relational database systems - namely, value indices on tags and text values - are first used, together with structural join algorithms [1, 2, 19]. This approach turns out to be simple and efficient. However, the structural containment relationships native to XML data are not directly captured by value indices.
Conference Paper
Given a document D in the form of an unordered labeled tree, we study the expressibility on D of various fragments of XPath, the core navigational language on XML documents. We give characterizations, in terms of the structure of D, for when a binary relation on its nodes is definable by an XPath expression in these fragments. Since each pair of nodes in such a relation represents a unique path in D, our results therefore capture the sets of paths in D definable in XPath. We refer to this perspective on the semantics of XPath as the "global view." In contrast with this global view, there is also a "local view" where one is interested in the nodes to which one can navigate starting from a particular node in the document. In this view, we characterize when a set of nodes in D can be defined as the result of applying an XPath expression to a given node of D. All these definability results, both in the global and the local view, are obtained by using a robust two-step methodology, which consists of first characterizing when two nodes cannot be distinguished by an expression in the respective fragments of XPath, and then bootstrapping these characterizations to the desired results.
Conference Paper
Access control for XML documents is a non-trivial topic, as can be witnessed from the number of approaches presented in the literature. Trying to compare these, we discovered the need for a simple, clear and unambiguous language to state the declarative semantics of an access control policy. All current approaches state the semantics in natural language, which has none of the above properties. This makes it hard to assess whether the proposed algorithms are correct (i.e., really implement the described semantics). It is also hard to assess the proposed policy on its merits, and to compare it to others (for file systems for instance). This paper shows how XPath can be used to specify the semantics of an access control policy for XML documents. Using XPath has great advantages: it is standard technology, widely used and it has clear and easy syntax and semantics. We use the developed framework to give a formal specification of the five most prominent approaches of access control for XML documents from the literature.
Conference Paper
In recent years there has been an increased interest in managing data that does not conform to traditional data models, like the relational or object oriented model. The reasons for this non-conformance are diverse. On the one hand, data may not conform to such models at the physical level: it may be stored in data exchange formats, fetched from the Web, or stored as structured files. On the other hand, it may not conform at the logical level: data may have missing attributes, some attributes may be of different types in different data items, there may be heterogeneous collections, or the schema may be too complex or change too often. The term semistructured data has been used to refer to such data. The semistructured data model consists of an edge-labeled graph, in which nodes correspond to objects and edges to attributes or values. Figure 1 illustrates a semistructured database providing information about a city.
Conference Paper
We study the expressiveness of a positive fragment of path queries, denoted Path+, on node-labeled tree documents. The expressiveness of Path+ is studied from two angles. First, we establish that Path+ is equivalent in expressive power to a particular sub-fragment as well as to the class of tree queries, a sub-class of the first-order conjunctive queries defined over label, parent-child, and child-parent predicates. The translation algorithm from tree queries to Path+ yields a normal form for Path+ queries. Using this normal form, we can decompose a Path+ query into sub-queries that can be expressed in a very small sub-fragment of Path+ for which efficient evaluation strategies are available. Second, we characterize the expressiveness of Path+ in terms of its ability to resolve nodes in a document. This result is used to show that each tree query can be translated to a unique, equivalent, and minimal tree query. The combination of these results yields an effective strategy to evaluate a large class of path queries on documents.
Book
Finite model theory studies the expressive power of logics on finite models. Classical model theory, on the other hand, concentrates on infinite structures: its origins are in mathematics, and most objects of interest in mathematics are infinite, e.g., the sets of natural numbers, real numbers, etc. Typical examples of interest to a model-theorist would be algebraically closed fields (e.g., 〈ℂ, +, •〉), real closed fields (e.g., 〈ℝ, +, •
Article
Recent studies have proposed structural summary techniques for path-query evaluation on semi-structured data sources. One major line of this research has been the introduction of the DataGuide, 1-index, 2-index, and A(k) indices, and subsequent investigations and generalizations. Another recent study has considered structural characterizations of fragments of XPath, the standard path navigation language for XML documents. In this paper we provide a methodology on XPath query processing that couples these two areas of research on structural indices and query languages. To illustrate this methodology, we apply it to couple an upward-only XPath fragment with the A(k) and P(k) structural indices. With an eye towards applying this result to XPath query processing, we (1) show how upward-only XPath expressions can be evaluated directly on the corresponding indices; (2) develop a labeling scheme for A(k) and P(k) partition blocks, using algebraic expressions; and (3) leverage these results to develop generic techniques for making effective use of A(k) and P(k) indices for more general, frequently occurring XPath expressions.
Article
We study the expressiveness of a positive fragment of path queries, denoted Path+, on documents that can be represented as node-labeled trees. The expressiveness of Path+ is studied from two angles. First, we establish that Path+ is equivalent in expressive power to two particular subfragments, as well as to the class of tree queries, a subclass of the first-order conjunctive queries defined over the label, parent–child and child–parent predicates. The translation algorithm from tree queries to Path+ yields a normal form for Path+ queries. Using this normal form, we can decompose a Path+ query into subqueries that can be expressed in a very small fragment of Path+ for which efficient evaluation strategies are available. Second, we characterize the expressiveness of Path+ in terms of its ability to resolve nodes in a document. This result is used to show that each tree query can be translated to a unique, equivalent and minimal tree query. The combination of these results yields an effective strategy to evaluate a large class of path queries on documents.
Article
Motivated by reasoning tasks in the context of XML languages, the satisfiability problem of logics on data trees is investigated. The nodes of a data tree have a label from a finite set and a data value from a possibly infinite set. It is shown that satisfiability for two-variable first-order logic is decidable if the tree structure can be accessed only through the child and the next sibling predicates and the access to data values is restricted to equality tests. From this main result decidability of satisfiability and containment for a data-aware fragment of XPath and of the implication problem for unary key and inclusion constraints is concluded.
Article
XPath is a language for navigating an XML document and selecting a set of element nodes. XPath expressions are used to query XML data, describe key constraints, express transformations, and reference elements in remote documents. This article studies the containment and equivalence problems for a fragment of the XPath query language, with applications in all these contexts. In particular, we study a class of XPath queries that contain branching, label wildcards and can express descendant relationships between nodes. Prior work has shown that languages that combine any two of these three features have efficient containment algorithms. However, we show that for the combination of features, containment is coNP-complete. We provide a sound and complete algorithm for containment that runs in exponential time, and study parameterized PTIME special cases. While we identify one parameterized class of queries for which containment can be decided efficiently, we also show that even with some bounded parameters, containment remains coNP-complete. In response to these negative results, we describe a sound algorithm that is efficient for all queries, but may return false negatives in some cases.
Article
Relation algebras are algebras arising from the study of binary relations. They form a part of the field of algebraic logic, and have applications in proof theory, modal logic, and computer science. This research text uses combinatorial games to study the fundamental notion of representations of relation algebras. Games allow an intuitive and appealing approach to the subject, and permit substantial advances to be made. The book contains many new results and proofs not published elsewhere. It should be invaluable to graduate students and researchers interested in relation algebras and games. After an introduction describing the authors' perspective on the material, the text proper has six parts. The lengthy first part is devoted to background material, including the formal definitions of relation algebras, cylindric algebras, their basic properties, and some connections between them. Examples are given. Part 1 ends with a short survey of other work beyond the scope of the book. In part 2, games are introduced, and used to axiomatise various classes of algebras. Part 3 discusses approximations to representability, using bases, relation algebra reducts, and relativised representations. Part 4 presents some constructions of relation algebras, including Monk algebras and the 'rainbow construction', and uses them to show that various classes of representable algebras are non-finitely axiomatisable or even non-elementary. Part 5 shows that the representability problem for finite relation algebras is undecidable, and then in contrast proves some finite base property results. Part 6 contains a condensed summary of the book, and a list of problems. There are more than 400 exercises. The book is generally self-contained on relation algebras and on games, and introductory text is scattered throughout. Some familiarity with elementary aspects of first-order logic and set theory is assumed, though many of the definitions are given. 
Chapter 2 introduces the necessary universal algebra and model theory, and more specific model-theoretic ideas are explained as they arise.
Conference Paper
XML and other semi-structured data may have partially specified or missing schema information, motivating the use of a structural summary which can be automatically computed from the data. These summaries also serve as indices for evaluating the complex path expressions common to XML and semi-structured query languages. However, to answer all path queries accurately, summaries must encode information about long, seldom-queried paths, leading to increased size and complexity with little added value. We introduce the A(k)-indices, a family of approximate structural summaries. They are based on the concept of k-bisimilarity, in which nodes are grouped based on local structure, i.e., the incoming paths of length up to k. The parameter k thus smoothly varies the level of detail (and accuracy) of the A(k)-index. For small values of k, the size of the index is substantially reduced. While smaller, the A(k) index is approximate, and we describe techniques for efficiently extracting exact answers to regular path queries. Our experiments show that, for moderate values of k, path evaluation using the A(k)-index ranges from being very efficient for simple queries to competitive for most complex queries, while using significantly less space than comparable structures
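
The grouping idea behind k-bisimilarity can be sketched in a few lines of Python. This is an illustrative partition-refinement sketch, not the authors' index construction: the function name `ak_partition`, the input encoding (a label per node plus a list of directed edges), and the example graph are assumptions. Nodes start grouped by label, and each of the k rounds splits a block when its members' parents fall into different sets of blocks from the previous round, so that only incoming paths of length up to k are taken into account.

```python
def ak_partition(labels, edges, k):
    """Partition nodes by k-bisimilarity over incoming paths (a sketch).

    labels: dict mapping node -> label
    edges:  iterable of (source, target) pairs
    k:      number of refinement rounds (0 groups by label only)
    """
    parents = {n: set() for n in labels}
    for src, dst in edges:
        parents[dst].add(src)

    # Round 0: the block of a node is simply its label.
    block = dict(labels)
    for _ in range(k):
        # Signature: own previous block plus the set of parents' previous blocks.
        sig = {
            n: (block[n], frozenset(block[p] for p in parents[n]))
            for n in labels
        }
        # Renumber signatures with small integers to keep them compact.
        ids = {}
        block = {n: ids.setdefault(sig[n], len(ids)) for n in labels}

    groups = {}
    for n, b in block.items():
        groups.setdefault(b, set()).add(n)
    return {frozenset(g) for g in groups.values()}


# Hypothetical example: two "c" nodes whose parents carry different labels.
LABELS = {"r": "root", "a1": "a", "b1": "b", "c1": "c", "c2": "c"}
EDGES = [("r", "a1"), ("r", "b1"), ("a1", "c1"), ("b1", "c2")]
```

With `k = 0` the two `c` nodes share a block; with `k = 1` they are separated because their parents carry different labels, which shows how larger k gives a finer and larger summary.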
Article
Our experimental analysis of several popular XPath processors reveals a striking fact: Query evaluation in each of the systems requires time exponential in the size of queries in the worst case. We show that XPath can be processed much more efficiently, and propose main-memory algorithms for this problem with polynomial-time combined query evaluation complexity. Moreover, we show how the main ideas of our algorithm can be profitably integrated into existing XPath processors. Finally, we present two fragments of XPath for which linear-time query processing algorithms exist and another fragment with linear-space/quadratic-time query processing.
Article
Monadic query languages over trees currently receive considerable interest in the database community, as the problem of selecting nodes from a tree is the most basic and widespread database query problem in the context of XML. Partly a survey of recent work done by the authors and their group on logical query languages for this problem and their expressiveness, this paper provides a number of new results related to the complexity of such languages over so-called axis relations (such as "child" or "descendant") which are motivated by their presence in the XPath standard or by their utility for data extraction (wrapping).
Article
We study the satisfiability problem associated with XPath in the presence of DTDs. This is the problem of determining, given a query p in an XPath fragment and a DTD D, whether or not there exists an XML document T such that T conforms to D and the answer of p on T is nonempty. We consider a variety of XPath fragments widely used in practice, and investigate the impact of different XPath operators on the satisfiability analysis. We first study the problem for negation-free XPath fragments with and without upward axes, recursion and data-value joins, identifying which factors lead to tractability and which to NP-completeness. We then turn to fragments with negation but without data values, establishing lower and upper bounds in the absence and in the presence of upward modalities and recursion. We show that with negation the complexity ranges from PSPACE to EXPTIME. Moreover, when both data values and negation are in place, we find that the complexity ranges from NEXPTIME to undecidable. Furthermore, we give a finer analysis of the problem for particular classes of DTDs, exploring the impact of various DTD constructs, identifying tractable cases, as well as providing the complexity in the query size alone. Finally, we investigate the problem for XPath fragments with sibling axes, exploring the impact of horizontal modalities on the satisfiability analysis.
Article
We survey expressivity results for navigational fragments of XPath 1.0 and 2.0, as well as Regular XPath≈. We also investigate algebras for these fragments.
Article
We study structural properties of each of the main sublanguages of navigational XPath (W3C Recommendation) commonly used in practice. First, we characterize the expressive power of these language fragments in terms of both logics and tree patterns. Second, we investigate closure properties, focusing on the ability to perform basic Boolean operations while remaining within the fragment. We give a complete picture of the closure properties of these fragments, treating XPath expressions both as functions of arbitrary nodes in a document tree, and as functions that are applied only at the root of the tree. Finally, we provide sound and complete axiom systems and normal forms for several of these fragments. These results are useful for simplification of XPath expressions and optimization of XML queries.
Article
Based on the observation that graphs play an important role in the representation of databases, an algebra is presented for the manipulation of binary relations, i.e., of directed unlabeled graphs. This so-called Tarski algebra is based on early work by Tarski. The key notion that has been added to it here is tagging, which is needed to provide enough querying power. Moreover, tagging can also be seen as a value-based counterpart to object creation in object-oriented data models. We present tagging in a general formal framework that incorporates several specific tagging strategies as a special case. We show that each of these strategies allows for the simulation in the Tarski model of various other database models, in particular of the relational model. Finally, we discuss the genericity of tagging and show that the Tarski algebra augmented with multiple assignments and a while-construct is a computationally complete database language.
Article
Several established and novel applications motivate us to study the expressive power of navigational query languages on graphs, which represent binary relations. Our basic language has only the operators union and composition, together with the identity relation. Richer languages can be obtained by adding other features such as other set operators, projection and coprojection, converse, and the diversity relation. In this paper, we show that, when evaluated at the level of boolean queries with an unlabeled input graph (i.e., a single relation), adding transitive closure to the languages with coprojection adds expressive power, while this is not the case for the basic language to which none, one, or both of projection and the diversity relation are added. In combination with earlier work [10], these results yield a complete understanding of the impact of transitive closure on the languages under consideration.
Article
Coinduction is a method for specifying and reasoning about infinite data types and automata with infinite behaviour. In recent years, it has come to play an ever more important role in the theory of computing. It is studied in many disciplines, including process theory and concurrency, modal logic and automata theory. Typically, coinductive proofs demonstrate the equivalence of two objects by constructing a suitable bisimulation relation between them. This collection of surveys is aimed at both researchers and Master's students in computer science and mathematics and deals with various aspects of bisimulation and coinduction, with an emphasis on process theory. Seven chapters cover the following topics: history, algebra and coalgebra, algorithmics, logic, higher-order languages, enhancements of the bisimulation proof method, and probabilities. Exercises are also included to help the reader master new material.
Article
How difficult is it to decide whether two finite structures can be distinguished in a given logic? For first order logic, this question is equivalent to the graph isomorphism problem with its well-known complexity theoretic difficulties. Somewhat surprisingly, the situation is much clearer when considering the fragments L^k of first-order logic whose formulae contain at most k (free or bound) variables (for some k ≥ 1). We show that for each k ≥ 2, equivalence in the k-variable logic L^k is complete for polynomial time under quantifier-free reductions (a weak form of NC_0 reductions). Moreover, we show that the same completeness result holds for the powerful extension C^k of L^k with counting quantifiers (for every k ≥ 2).
Article
The importance of performing efficient XML query processing increases along with its usage and pervasiveness. Studying the properties of important fragments of XML query languages and designing accurate structural summaries (including indexes and statistical summaries) are all critical ingredients in solving this problem. However, up to this point there has been a gap between the theoretical and engineering efforts taken in the context of XML. We draw from research methodologies used in relational query languages and database design and apply them to the study of XPath and the design of structural summaries for XML. In particular, we study the roles various fragments of XPath algebra play in distinguishing data components in an XML document, and leverage the results in designing novel structural indexes and statistical summaries for more efficient XML query processing and more accurate result size estimation.
Article
A variable-free, equational logic \mathcal{L}^\times based on the calculus of relations (a theory of binary relations developed by De Morgan, Peirce, and Schröder during the period 1864–1895) is shown to provide an adequate framework for the development of all of mathematics. The expressive and deductive powers of \mathcal{L}^\times are equivalent to those of a system of first-order logic with just three variables. Therefore, three-variable first-order logic also provides an adequate framework for mathematics. Finally, it is shown that a variant of \mathcal{L}^\times may be viewed as a subsystem of sentential logic. Hence, there are subsystems of sentential logic that are adequate to the task of formalizing mathematics.
Article
We provide complete axiomatizations for several fragments of Core XPath, the navigational core of XPath 1.0 introduced by Gottlob, Koch, and Pichler. A complete axiomatization for a given fragment is a set of equivalences from which every other valid equivalence is derivable; equivalences can be thought of as (undirected) rewrite rules. Specifically, we axiomatize single-axis fragments of Core XPath as well as full Core XPath. Our completeness proofs use results and techniques from modal logic.
Conference Paper
It is well known that two structures \mathcal{A} and \mathcal{B} are indistinguishable by sentences of the infinitary logic with k variables L^k_{\infty\omega} iff Duplicator wins the Barwise game on \mathcal{A} and \mathcal{B} with k pebbles. The complexity of deciding who wins the game is in general unknown if k is part of the input. We prove that the problem is in PTIME for some special classes of structures such as finite directed trees and infinite regular trees. More specifically, we show an algorithm running in time \log(k) (|A| + |B|)^{O(1)}. The algorithm for regular trees is based on a characterization of the winning pairs (\mathcal{A}, \mathcal{B}) which is valid also for a more general case of (potentially infinite) rooted trees.
Conference Paper
We give semantic characterizations of the expressive power of navigational XPath (a.k.a. Core XPath) in terms of first-order logic. XPath can be used to specify sets of nodes and sets of paths in an XML document tree. We consider both uses. For sets of nodes, XPath is equally expressive as first-order logic in two variables. For paths, XPath can be defined using four simple connectives, which together yield the class of first-order definable relations which are safe for bisimulation. Furthermore, we give a characterization of the XPath expressible paths in terms of conjunctive queries.
Conference Paper
We introduce a new methodology for coupling language-induced partitions and index-induced partitions on XML documents that is aimed for the benefit of efficient evaluation of XPath queries. In particular, we identify XPath fragments which are ideally coupled with the newly introduced P(k)-partition which has its definition grounded in the well-known A(k) structural index and its associated partition. We then utilize these couplings to investigate fundamental questions about the use of structural indexes in XPath query evaluation.
Conference Paper
Motivated by both established and new applications, we study navigational query languages for graphs (binary relations). The simplest language has only the two operators union and composition, together with the identity relation. We make more powerful languages by adding any of the following operators: intersection; set difference; projection; coprojection; converse; transitive closure; and the diversity relation. All these operators map binary relations to binary relations. We compare the expressive power of all resulting languages, both for binary-relation queries as well as for boolean queries. In the absence of transitive closure, a complete Hasse diagram of relative expressiveness has already been established [8]. Moreover, it has already been shown that for boolean queries over a single edge label, transitive closure does not add any expressive power when only projection and diversity may be present [11]. In the present article, we now complete the Hasse diagram in the presence of transitive closure, both for the case of a single edge label, as well as for the case of at least two edge labels. The main technical results are the following:
  • (1) In contrast to the above-stated result [11], transitive closure does add expressive power when coprojection is present.
  • (2) Transitive closure also adds expressive power as soon as converse is present.
  • (3) Conversely, converse adds expressive power in the presence of transitive closure. In particular, the converse elimination result from [8] no longer works in the presence of transitive closure.
  • (4) As a corollary, we show that the converse elimination result from [8] necessitates an exponential blow-up in the degree of the expressions.
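The operators listed in this abstract can be sketched over finite binary relations represented as Python sets of pairs. This is a minimal illustration under assumed conventions: the function names are hypothetical, and projection is taken here to mean the first projection (identity pairs on nodes that have an outgoing pair), which may differ in detail from the definitions in the cited article.

```python
def compose(r, s):
    """Composition: (a, c) whenever (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def converse(r):
    """Converse: reverse every pair."""
    return {(b, a) for (a, b) in r}


def identity(nodes):
    """Identity relation on the node set."""
    return {(n, n) for n in nodes}


def diversity(nodes):
    """Diversity relation: all pairs of distinct nodes."""
    return {(a, b) for a in nodes for b in nodes if a != b}


def projection(r):
    """Assumed first projection: (a, a) for every node a with an outgoing pair in r."""
    return {(a, a) for (a, _) in r}


def coprojection(r, nodes):
    """Complement of the projection within the identity relation."""
    return identity(nodes) - projection(r)


def transitive_closure(r):
    """Smallest transitive relation containing r, by iterated composition."""
    closure = set(r)
    while True:
        extra = compose(closure, r) - closure
        if not extra:
            return closure
        closure |= extra


# Hypothetical three-node chain 1 -> 2 -> 3.
nodes = {1, 2, 3}
edges = {(1, 2), (2, 3)}
reach = transitive_closure(edges)
leaves = coprojection(edges, nodes)
siblings_or_self = compose(converse(edges), edges)
```

Union, intersection, and set difference are Python's built-in `|`, `&`, and `-` on sets, so they need no helper here.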
Conference Paper
In this paper, we have studied the problem of completeness of relational query languages. We have established a criterion for completeness. A query language was proved to be complete. Since this language is the one used as a standard for completeness, this result gives a strong theoretical basis to Codd's definition of completeness. There are, however, some limitations to this notion of completeness: for instance, there is no first-order formula describing the transitive closure r* of a binary relation r! While this seems at first to be in contradiction with theorem 2, one should recall that the notion of completeness we have introduced is static. Therefore, for any configuration consisting of a binary relation r there exists a formula of the first-order calculus describing the transitive closure of r, but this formula depends on r, and there is no formula describing the mapping which associates with a binary relation its transitive closure.
Article
The concept of “reasonable” queries on relational data bases is investigated. We provide an abstract characterization of the class of queries which are computable, and define the completeness of a query language as the property of being precisely powerful enough to express the queries in this class. This definition is then compared with other proposals for measuring the power of query languages. Our main result is the completeness of a simple programming language which can be thought of as consisting of the relational algebra augmented with the power of iteration.
Article
We survey expressivity results for navigational fragments of XPath 1.0 and 2.0, as well as Regular XPath^≈. We also investigate algebras for these fragments.
Article
XPath 1.0 is a variable free language designed to specify paths between nodes in XML documents. Such paths can alternatively be specified in first-order logic. The logical abstraction of XPath 1.0, usually called Navigational or Core XPath, is not powerful enough to express every first-order definable path. In this article, we show that there exists a natural expansion of Core XPath in which every first-order definable path in XML document trees is expressible. This expansion is called Conditional XPath. It contains additional axis relations of the form (child::n[F])+, denoting the transitive closure of the path expressed by child::n[F]. The difference with XPath's descendant::n[F] is that the path (child::n[F])+ is conditional on the fact that all nodes in between the start and end node of the path should also be labeled by n and should make the predicate F true. This result can be viewed as the XPath analogue of the expressive completeness of the relational algebra with respect to first-order logic.
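The difference between (child::n[F])+ and descendant::n[F] described above can be made concrete on a small tree. The sketch below is an illustration only: the dictionary encoding of the tree, the function names, and the example labels are assumptions, and the predicate F is modeled as an arbitrary Python function on nodes.

```python
def transitive_closure(rel):
    """Smallest transitive relation containing rel."""
    closure = set(rel)
    while True:
        extra = {(a, d) for (a, b) in closure for (c, d) in rel if b == c} - closure
        if not extra:
            return closure
        closure |= extra


def conditional_plus(children, labels, n, pred):
    """(child::n[F])+ : every node after the start must be labeled n and satisfy F."""
    step = {(p, c) for p, cs in children.items() for c in cs
            if labels[c] == n and pred(c)}
    return transitive_closure(step)


def descendant_step(children, labels, n, pred):
    """descendant::n[F] : only the end node must be labeled n and satisfy F."""
    child = {(p, c) for p, cs in children.items() for c in cs}
    return {(a, b) for (a, b) in transitive_closure(child)
            if labels[b] == n and pred(b)}


# Hypothetical chain 0 -> 1 -> 2 -> 3 with labels r, a, b, a.
children = {0: [1], 1: [2], 2: [3], 3: []}
labels = {0: "r", 1: "a", 2: "b", 3: "a"}
always = lambda node: True  # stands in for a trivially true predicate F

cond = conditional_plus(children, labels, "a", always)
desc = descendant_step(children, labels, "a", always)
```

Here the pair (0, 3) belongs to `desc` but not to `cond`, because node 2 on the path from 0 to 3 is labeled b and therefore breaks the conditional path.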
Article
This survey gives an overview of formal results on the XML query language XPath. We identify several important fragments of XPath, focusing on subsets of XPath 1.0. We then give results on the expressiveness of XPath and its fragments compared to other formalisms for querying trees, algorithms, and complexity bounds for evaluation of XPath queries, as well as static analysis of XPath queries.
Article
The logical theory which is called the calculus of (binary) relations, and which will constitute the subject of this paper, has had a strange and rather capricious line of historical development. Although some scattered remarks regarding the concept of relations are to be found already in the writings of medieval logicians, it is only within the last hundred years that this topic has become the subject of systematic investigation. The first beginnings of the contemporary theory of relations are to be found in the writings of A. De Morgan, who carried out extensive investigations in this domain in the fifties of the Nineteenth Century. De Morgan clearly realized the inadequacy of traditional logic for the expression and justification, not merely of the more intricate arguments of mathematics and the sciences, but even of simple arguments occurring in every-day life; witness his famous aphorism, that all the logic of Aristotle does not permit us, from the fact that a horse is an animal, to conclude that the head of a horse is the head of an animal. In his effort to break the bonds of traditional logic and to expand the limits of logical inquiry, he directed his attention to the general concept of relations and fully recognized its significance. Nevertheless, De Morgan cannot be regarded as the creator of the modern theory of relations, since he did not possess an adequate apparatus for treating the subject in which he was interested, and was apparently unable to create such an apparatus. His investigations on relations show a lack of clarity and rigor which perhaps accounts for the neglect into which they fell in the following years.
Conference Paper
The Tarski algebra, an algebraic foundation for object-based query languages, is presented. While maintaining physical data independence, the Tarski algebra is shown to be both simple and powerful enough to express all reasonable queries. It is shown how queries expressed in a graph-oriented query language (based on the functional data model) can be translated into the Tarski algebra. The graphical representation of queries in combination with the Tarski algebra is shown to be a convenient mechanism for effective query optimization
Article
Central to any XML query language is a path language such as XPath which operates on the tree structure of the XML document. We demonstrate in this paper that the tree structure can be effectively compressed and manipulated using techniques derived from symbolic model checking. Specifically, we show first that succinct representations of document tree structures based on sharing subtrees are highly effective. Second, we show that compressed structures can be queried directly and efficiently through a process of manipulating selections of nodes and partial decompression.
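The idea of sharing subtrees can be sketched by interning each distinct (label, children) shape once, which turns a tree into a directed acyclic graph. This is a simplified illustration and not the representation used in the cited paper; the tuple encoding of trees and the function name `compress` are assumptions.

```python
def compress(tree, table):
    """Return the id of the shared node for `tree`, adding it to `table` if new.

    `tree` is a pair (label, list_of_subtrees); `table` maps each distinct
    (label, child_ids) shape to a small integer id.
    """
    label, subtrees = tree
    key = (label, tuple(compress(sub, table) for sub in subtrees))
    if key not in table:
        table[key] = len(table)
    return table[key]


# Hypothetical document: root 'a' with two identical 'b' subtrees.
doc = ("a", [("b", [("c", [])]), ("b", [("c", [])])])
table = {}
root_id = compress(doc, table)
```

The example document has five nodes, but only three distinct subtree shapes, so the shared representation stores three entries.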
Article
Access control for XML documents is a non-trivial topic, as can be witnessed from the number of approaches presented in the literature. Trying to compare these, we discovered the need for a simple, clear and unambiguous language to state the declarative semantics of an access control policy. All current approaches state the semantics in natural language, which has none of the above properties. This makes it hard to assess whether the proposed algorithms are correct (i.e., really implement the described semantics). It is also hard to assess the proposed policy on its merits, and to compare it to others (for file systems for instance).
A. Tarski, S. Givant, A formalization of set theory without variables, volume 41 of Colloquium Publications, American Mathematical Society, Providence, Rhode Island, 1987.
M. Gyssens, L. V. Saxton, D. Van Gucht, Tagging as an alternative to object creation, in: J. C. Freytag, D. Maier, G. Vossen (Eds.), Query Processing for Advanced Database Systems, Morgan Kaufmann, San Mateo, CA, USA, 1994, pp. 201–242.
M. Benedikt, C. Koch, XPath leashed, ACM Comput. Surv. 41 (2009) 3:1–3:54.