Reasoning about keys for XML

Department of Computer Science, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA
DOI: 10.1007/3-540-46093-4_8
Source: DBLP


We study absolute and relative keys for XML, and investigate their associated decision problems. We argue that these keys are important to many forms of hierarchically structured data including XML documents. In contrast to other proposals of keys for XML, we show that these keys are always (finitely) satisfiable, and their (finite) implication problem is finitely axiomatizable. Furthermore, we provide a polynomial time algorithm for determining (finite) implication in the size of keys. Our results also demonstrate, among other things, that the analysis of XML keys is far more intricate than its relational counterpart.



Available from: Susan Davidson
  • "For the class of XML keys, the finite and the unrestricted implication problem coincide [10]. For numerical constraints, however, the situation is already different for the satisfiability problem."
    ABSTRACT: Boundaries occur naturally in everyday life. This paper introduces numerical constraints into the framework of XML to take advantage of the benefits that result from the explicit specification of such boundaries. Roughly speaking, numerical constraints restrict the number of elements in an XML data fragment based on the data values of selected subelements. Efficient reasoning about numerical constraints provides effective means for predicting the number of answers to XQuery and XPath queries, the number of updates when using the XQuery update facility, and the number of encryptions or decryptions when using XML encryption. Moreover, numerical constraints can help to optimise XQuery and XPath queries, to exclude certain choices of indices from the index selection problem, and to generate views for efficient processing of common queries and updates. We investigate decision problems associated with numerical constraints in order to capitalise on the range of applications in XML data processing. To begin with we demonstrate that the implication problem is strongly coNP-hard for several classes of numerical constraints. These sources of potential intractability direct our attention towards the class of numerical keys that permit the specification of positive upper bounds. Numerical keys are of interest as they are reminiscent of cardinality constraints that are widely used in conceptual data modelling. At the same time, they form a natural generalisation of XML keys that are popular in XML theory and practice. We show that numerical keys are finitely satisfiable and establish a finite axiomatisation for their implication problem. Finally, we propose an algorithm that decides numerical key implication in quadratic time using shortest path methods.
    Information and Computation 05/2010; 208(5):521-544. DOI:10.1016/j.ic.2008.09.004 · 0.83 Impact Factor
  • "Updates in XML (EDBT Workshop Proceedings), March 22, 2010, Lausanne, Switzerland. integrity constraints ([7] [6]), general XML functional dependencies and more specifically XML keys ([4] [3] [5] [1] [16] [19] [17]). All these proposals differ the ones from the others depending on (a) they are or not independent from any schema such as DTDs or XSDs, (b) the ways they are accessing and comparing XML elements, and (c) their expressiveness and tractability."
    ABSTRACT: Given an XML functional dependency fd and a class of updates U, we say that fd is independent with respect to U if and only if every XML document that satisfies fd before an update q of U still satisfies it after q. In this paper we study the following problem: can we detect whether an XML functional dependency fd is independent with respect to a class of updates U? We address this problem when the functional dependency and the class of updates are specified in the same formalism: regular tree patterns. We first show that regular tree patterns unify most of the known approaches for expressing XML functional dependencies, while also capturing some constraints that were not previously expressible. Then we show that in general the addressed problem is PSPACE-hard, but we exhibit a sufficient condition, testable in polynomial time, that ensures the independence of a functional dependency with respect to a class of updates.
    Proceedings of the 2010 EDBT/ICDT Workshops, Lausanne, Switzerland, March 22-26, 2010; 03/2010
  • "Note that the definition of val differs slightly from that in (Buneman et al., 2001a) since it was extended to define on complex element nodes. The reason for this is that to include the complex element node in the path definitions and able to compare elements by node identity, i.e node equality as illustrates in the following example:"
    ABSTRACT: Functional dependency (FD) is one of the integrity constraints for any data model. In the relational data model, FDs are well studied and are widely used in normalization theory and in key algorithms. In recent years, XML has emerged as a widely used data representation and storage format on the World Wide Web. The growing use of XML has made it necessary to give XML documents stronger semantics. XML functional dependencies are one way to make XML data semantically richer. In this paper, we propose an XML functional dependency (XFD) specifically for the purpose of XML data transformation for semantic integration of schemas with integrity constraints. In doing so, we show how an XFD is defined on the XML Document Type Definition (DTD) and is satisfied by XML documents. We introduce the novel concept of a 'tuple', which produces the semantically correct tuples in the XML document during XFD satisfaction. We show that an XML key is a special case of an XFD. We also discuss the advantages of our proposal over previous XFD definitions.
    Semantic Computing, 2009. ICSC '09. IEEE International Conference on; 10/2009