Conference Paper

Predicate-based Filtering of XPath Expressions.

University of Toronto, Canada
DOI: 10.1109/ICDE.2006.115 Conference: Proceedings of the 22nd International Conference on Data Engineering, ICDE 2006, 3-8 April 2006, Atlanta, GA, USA
Source: DBLP

ABSTRACT The XML/XPath filtering problem has attracted widespread interest. In this paper, we propose a novel algorithm for solving it. Our approach encodes XPath expressions (XPEs) as ordered sets of predicates and translates XML documents into sets of tuples, which are evaluated over these predicates. Predicates representing overlapping portions of XPEs are stored and processed only once, thus fully exploiting potential overlap among XPEs. We experimentally evaluate the performance of our algorithm, demonstrating its scalability to millions of XPEs, with matching times in the millisecond range. We also show the trade-offs between our algorithm and alternative approaches.
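The idea of shared predicates can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes XPEs are restricted to absolute paths of child steps (such as /a/b/c), and it models a predicate as a hypothetical (depth, tag) pair. The names PredicateIndex and root_to_leaf_paths are invented for this illustration. Each distinct predicate is stored once, and each document path becomes a set of (depth, tag) tuples against which the stored predicates are checked.

```python
import xml.etree.ElementTree as ET


def root_to_leaf_paths(root):
    """Yield one set of (depth, tag) tuples per root-to-leaf path."""
    stack = [(root, ((1, root.tag),))]
    while stack:
        node, tuples = stack.pop()
        children = list(node)
        if not children:
            yield set(tuples)
        for child in children:
            stack.append((child, tuples + ((len(tuples) + 1, child.tag),)))


class PredicateIndex:
    """Illustrative index: every distinct predicate is stored exactly once."""

    def __init__(self):
        self.pred_ids = {}  # (depth, tag) -> predicate id, shared by all XPEs
        self.xpes = {}      # XPE string -> ordered list of predicate ids

    def add(self, xpe):
        # Assumption: xpe is an absolute path of child steps, e.g. "/a/b".
        ids = []
        for depth, tag in enumerate(xpe.strip("/").split("/"), start=1):
            pid = self.pred_ids.setdefault((depth, tag), len(self.pred_ids))
            ids.append(pid)
        self.xpes[xpe] = ids

    def match(self, xml_text):
        matched = set()
        for path in root_to_leaf_paths(ET.fromstring(xml_text)):
            # Each stored predicate is evaluated once per document path,
            # no matter how many XPEs contain it.
            satisfied = {pid for pred, pid in self.pred_ids.items() if pred in path}
            for xpe, ids in self.xpes.items():
                if all(pid in satisfied for pid in ids):
                    matched.add(xpe)
        return matched


index = PredicateIndex()
for expression in ("/a/b", "/a/b/c", "/a/d"):
    index.add(expression)
result = index.match("<a><b><c/></b></a>")
```

In this sketch the three expressions share the predicates for /a and /a/b, so only four predicates are stored rather than seven. A real filter would also need to handle descendant axes, wildcards, and attribute tests, which this simplification omits.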

  • ABSTRACT: More and more XML data is generated and used for data exchange. In this paper, we address the problem of filtering XML documents against a large number of XPath expressions, which may contain ‘ancestor’ and ‘parent’ axes. XPath expressions with these axes give users a more powerful and flexible way to describe their interests in publish/subscribe systems. First, we analyze the characteristics of the ‘parent’ axis and propose a series of rules to eliminate it from XPath expressions. Then we propose a new index structure called NIndex, which is designed to efficiently store and index a large number of XPath expressions. NIndex offers several features that make it especially attractive for large-scale selective dissemination of information, including the ability to handle complex XPath expressions with ‘ancestor’ and ‘parent’ axes, and efficient pruning. Based on NIndex, we design a new low-complexity filtering algorithm for this problem. Our experimental results show that the algorithm performs well across a range of XPath expressions and documents.
    Information Sciences 11/2012; 210:41–54. · 3.89 Impact Factor
  • ABSTRACT: Current digital libraries suffer from the information overload problem, which prevents effective access to knowledge. This is particularly true for scientific digital libraries, where a growing number of scientific articles can be explored by users with different needs, backgrounds, and interests. Recommender systems can tackle this limitation by filtering resources according to specific user needs. This paper introduces a content-based recommendation approach for enhancing access to scientific digital libraries, in which a keyphrase extraction module is used to produce a rich description of both the content of papers and user interests.
    Digital Libraries and Archives - 7th Italian Research Conference, IRCDL 2011, Pisa, Italy, January 20-21, 2011. Revised Selected Papers; 01/2011
  • ABSTRACT: Access control on form-based Web information systems has become one of the useful methods for implementing client systems in a service-oriented architecture. In particular, the XForms language is being adopted in many systems as a description language for XML-based user interfaces and server interactions. In this paper, we propose an efficient algorithm for the evaluation of XPath-based access rules for XForms pages. In this model, an XForms page is a sequence of queries, and the client system performs user interface realization along with XPath rule evaluations. XPath rules have instance-dependent predicates, most of which are shared between rules. For the efficient evaluation of shared predicate expressions in access control rules, we propose a predicate graph model that reuses previously evaluated results for the same context node. This approach guarantees that each predicate expression is evaluated only once for each relevant XML node. We present our approach and current implementation status.
    TENCON 2008 - 2008 IEEE Region 10 Conference (TENCON); 11/2008
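
The reuse of predicate results described in the last abstract can be sketched as a memo table keyed by predicate and context node. This is only an illustration under stated assumptions, not the predicate graph model from that paper: the class PredicateCache, the function rule_allows, the predicate names, and the dictionary standing in for an XML node are all hypothetical.

```python
class PredicateCache:
    """Memoize predicate results per (predicate name, context node)."""

    def __init__(self):
        self._results = {}
        self.evaluations = 0  # counts real evaluations, for illustration

    def evaluate(self, name, predicate, node):
        key = (name, id(node))  # assumes the node object outlives the cache
        if key not in self._results:
            self.evaluations += 1
            self._results[key] = predicate(node)
        return self._results[key]


def rule_allows(predicate_names, predicates, node, cache):
    """A rule holds when every predicate it names holds for the node."""
    return all(cache.evaluate(n, predicates[n], node) for n in predicate_names)


# Hypothetical predicates shared between two access rules.
predicates = {
    "is_admin": lambda n: n["role"] == "admin",
    "in_hr": lambda n: n["dept"] == "hr",
}
rules = {"read": ["is_admin"], "write": ["is_admin", "in_hr"]}

node = {"role": "admin", "dept": "hr"}  # stand-in for an XML context node
cache = PredicateCache()
decisions = {name: rule_allows(preds, predicates, node, cache)
             for name, preds in rules.items()}
```

Both rules use is_admin, but the cache evaluates it only once for this node, so two evaluations serve three predicate references.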
