Conference Paper

Predicate-based Filtering of XPath Expressions.

University of Toronto, Canada
DOI: 10.1109/ICDE.2006.115
Conference: Proceedings of the 22nd International Conference on Data Engineering, ICDE 2006, 3-8 April 2006, Atlanta, GA, USA
Source: DBLP

ABSTRACT The XML/XPath filtering problem has attracted widespread interest. In this paper, we propose a novel algorithm for solving it. Our approach encodes XPath expressions (XPEs) as ordered sets of predicates and translates XML documents into sets of tuples, which are evaluated over these predicates. Predicates representing overlapping portions of XPEs are stored and processed only once, thus fully exploiting potential overlap among XPEs. We experimentally evaluate the performance of our algorithm, demonstrating its scalability to millions of XPEs, with matching performance in the millisecond range. We also show interesting trade-offs relative to alternative approaches.
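
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the predicate encoding, so the sketch assumes a deliberately simple one, where an XPE built only from child steps (such as /a/b or /a/*/c) becomes an ordered list of (position, tag) predicates, and each root-to-leaf document path becomes a set of (position, tag) tuples. The class name PredicateFilter and the helper leaf_paths are hypothetical names chosen for illustration.

```python
from xml.etree import ElementTree as ET


def leaf_paths(elem, prefix=()):
    """Yield every root-to-leaf path of the document as a tuple of tag names."""
    path = prefix + (elem.tag,)
    children = list(elem)
    if not children:
        yield path
    for child in children:
        yield from leaf_paths(child, path)


class PredicateFilter:
    """Toy filter for child-axis-only XPEs (an assumed, simplified encoding)."""

    def __init__(self):
        # Each distinct (position, tag) predicate is stored exactly once,
        # no matter how many XPEs share it.
        self.predicate_ids = {}
        # XPE string -> ordered list of predicate ids.
        self.expressions = {}

    def add(self, xpe):
        steps = xpe.strip("/").split("/")
        ids = []
        for position, tag in enumerate(steps, start=1):
            key = (position, tag)
            ids.append(self.predicate_ids.setdefault(key, len(self.predicate_ids)))
        self.expressions[xpe] = ids

    def match(self, xml_text):
        """Return the set of stored XPEs matched by the document."""
        root = ET.fromstring(xml_text)
        matched = set()
        for path in leaf_paths(root):
            # Translate the path into (position, tag) tuples and evaluate
            # every shared predicate at most once for this path.
            satisfied = set()
            for position, tag in enumerate(path, start=1):
                # A "*" predicate is satisfied by any tag at that position.
                for key in ((position, tag), (position, "*")):
                    pid = self.predicate_ids.get(key)
                    if pid is not None:
                        satisfied.add(pid)
            # An XPE matches when all of its predicates hold on one path.
            for xpe, ids in self.expressions.items():
                if satisfied.issuperset(ids):
                    matched.add(xpe)
        return matched
```

In this sketch, adding /a/b, /a/b/c, /a/c, and /a/*/c stores five distinct predicates for ten steps in total, because the shared prefix predicates are interned. The final loop over all expressions is kept linear for brevity; a filter intended for millions of XPEs would need an index from predicates to the expressions that contain them.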

  • Source
    EDBT 2011, 14th International Conference on Extending Database Technology, Uppsala, Sweden, March 21-24, 2011, Proceedings; 01/2011
  • Source
    ABSTRACT: Distributed content-based publish/subscribe systems suffer from performance degradation and poor scalability caused by uneven load distributions typical in real-world applications. The reason for this shortcoming is the lack of a load balancing scheme. This article proposes a load balancing solution specifically tailored to the needs of content-based publish/subscribe systems that is distributed, dynamic, adaptive, transparent, and accommodates heterogeneity. The solution consists of three key contributions: a load balancing framework, a novel load estimation algorithm, and three offload strategies. A working prototype of our solution is built on an open-sourced content-based publish/subscribe system and evaluated on PlanetLab, a cluster testbed, and in simulations. Real-life experiment results show that the proposed load balancing solution is efficient with less than 0.2% overhead; effective in distributing and balancing load originating from a single server to all available servers in the network; and capable of preventing overloads to preserve system stability, availability, and quality of service.
    ACM Transactions on Computer Systems 12/2010; 28:9. DOI:10.1145/1880018.1880020 · 0.62 Impact Factor
  • Source
    ABSTRACT: Many XML filtering systems have emerged in recent years identifying XML data that structurally match XPath queries in an efficient way. However, apart from structural matching, it is considered equally important to deal with value-based predicates. In this paper, we propose methods to combine both structural and value XML filtering in a distributed environment based on distributed hash tables. Structural matching is performed using automata, while we study different methods for evaluating value-based predicates. As a result, our algorithms scale in both the size of the query set and the number of the predicates per query. We perform an experimental evaluation and demonstrate the strengths and weaknesses of the proposed methods in both a controlled environment of a cluster and on a real testbed provided by the PlanetLab network.
    Proceedings of the Fourth ACM International Conference on Distributed Event-Based Systems, DEBS 2010, Cambridge, United Kingdom, July 12-15, 2010; 01/2010

