Cost-based optimization in DB2 XML.

IBM Almaden Research Center, 650 Harry Road, San Jose, California 95120, USA
IBM Systems Journal (Impact Factor: 1.29). 01/2006; 45:299-320. DOI: 10.1147/sj.452.0299
Source: DBLP

ABSTRACT: DB2 XML is a hybrid database system that combines the relational capabilities of DB2 Universal Database™ (UDB) with comprehensive native XML support. DB2 XML augments DB2® UDB with a native XML store, XML indexes, and query processing capabilities for both XQuery and SQL/XML that are integrated with those of SQL. This paper presents the extensions made to the DB2 UDB compiler, and especially its cost-based query optimizer, to support XQuery and SQL/XML queries, using much of the same infrastructure developed for relational data queried by SQL. It describes the challenges to the relational infrastructure that supporting XQuery and SQL/XML poses and provides the rationale for the extensions that were made to the three main parts of the optimizer: the plan operators, the cardinality and cost model, and statistics collection.

  • ABSTRACT: As XML plays a crucial role in web services, databases, and document processing, efficient processing of XML queries has become an important issue. In addition, as the number of users grows, high throughput is required to execute tens of thousands of XML queries in a short time. Given the great success of GPGPU (general-purpose computation on graphics processing units), we propose a GPU-based parallel XML query model, which mainly consists of two efficient task distribution strategies, to improve the efficiency and throughput of XML queries. We have developed a parallel, simplified XPath language using the Compute Unified Device Architecture (CUDA) on the GPU, and we evaluate our model on a recent NVIDIA GPU in comparison with its counterpart on an eight-core CPU. The experimental results show that our model achieves both higher throughput and higher efficiency than CPU-based XML query processing.
  • ABSTRACT: Even though an effective cost-based query optimizer is of utmost importance for the efficient evaluation of XQuery expressions in native XML database systems, such a component is currently out of sight, because former approaches either do not pay attention to the latest advances in the area of physical operators (e.g., Holistic Twig Joins and advanced indexes) or focus on only some of them. To support the development of native XML query optimizers, we introduce an extensible cost-based optimization framework that integrates the cutting-edge XML query evaluation operators into a single system. Using the well-known plan generation techniques from the relational world and a novel set of plan equivalences (which allows for the generation of alternative query plans consisting of Structural Joins, Holistic Twig Joins, and numerous indexes, especially path indexes and content-and-structure indexes), our optimizer can now benefit from the knowledge on native XML query evaluation to speed up query execution significantly.
    Fourteenth International Database Engineering and Applications Symposium (IDEAS 2010), August 16-18, 2010, Montreal, Quebec, Canada; 01/2010
  • ABSTRACT: Bloom filters are widely used in many applications, including database management systems. With a certain allowable error rate, this data structure provides an efficient solution for membership queries. The error rate is inversely proportional to the size of the Bloom filter. Currently, Bloom filters are stored in main memory because the low locality of operations makes them impractical on secondary storage. In multi-user database management systems, where there is high contention for the shared memory heap, the limited memory available for allocating a Bloom filter may cause a high rate of false positives. In this paper we propose a technique to reduce the memory requirement for Bloom filters with the help of solid-state storage devices (SSDs). By using a limited memory space for buffering the read/write requests, we can afford a larger SSD space for the actual Bloom filter bit vector. In our experiments we show that, with a significantly smaller memory requirement and fewer hash functions, the proposed technique reduces the false positive rate effectively. In addition, the proposed data structure runs faster than traditional Bloom filters by grouping the inserted records with respect to their locality on the filter.
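
The relationship between filter size and error rate mentioned in the last abstract can be illustrated with a minimal sketch. This is a plain in-memory Bloom filter, not the SSD-backed structure the abstract proposes; the class name, the double-hashing scheme over SHA-256, and all parameter values are assumptions chosen for illustration. The helper uses the standard textbook approximation (1 - e^(-kn/m))^k for the false positive rate with m bits, k hash functions, and n inserted items.

```python
import hashlib
import math


class BloomFilter:
    """Minimal in-memory Bloom filter (illustrative; names are hypothetical)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from one SHA-256 digest via double hashing
        # (an assumed scheme; any k independent hash functions would do).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def expected_false_positive_rate(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Standard approximation: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


def measured_false_positives(num_bits: int, num_hashes: int, num_items: int, probes: int) -> int:
    """Insert num_items keys, then count hits among keys that were never inserted."""
    bf = BloomFilter(num_bits, num_hashes)
    for i in range(num_items):
        bf.add(f"key-{i}")
    return sum(1 for i in range(probes) if f"absent-{i}" in bf)


if __name__ == "__main__":
    for m in (1_000, 10_000):
        print(m, expected_false_positive_rate(m, 3, 500),
              measured_false_positives(m, 3, 500, 2_000))
```

With the item count held fixed, the larger bit vector yields far fewer false positives, which is why a memory-constrained filter suffers and why moving the bit vector to a larger device is attractive.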
