Article

Cost-based optimization in DB2 XML.

IBM Almaden Research Center, 650 Harry Road, San Jose, California 95120, USA
IBM Systems Journal (impact factor: 1.29). 01/2006; 45:299-320. DOI:10.1147/sj.452.0299
Source: DBLP

ABSTRACT DB2 XML is a hybrid database system that combines the relational capabilities of DB2 Universal Database™ (UDB) with comprehensive native XML support. DB2 XML augments DB2® UDB with a native XML store, XML indexes, and query processing capabilities for both XQuery and SQL/XML that are integrated with those of SQL. This paper presents the extensions made to the DB2 UDB compiler, and especially its cost-based query optimizer, to support XQuery and SQL/XML queries, using much of the same infrastructure developed for relational data queried by SQL. It describes the challenges to the relational infrastructure that supporting XQuery and SQL/XML poses and provides the rationale for the extensions that were made to the three main parts of the optimizer: the plan operators, the cardinality and cost model, and statistics collection.

  • Source
    Article: Framework-Based Development and Evaluation of Cost-Based Native XML Query Optimization Techniques
    ABSTRACT: Reflecting on the history of database management systems reveals that cost-based query optimization has been the dominating method for effectively answering complex queries on large documents. Native XML database management systems provide an efficient infrastructure for storing, indexing, and querying large XML documents. Even though such systems can choose from a huge set of structural join operators and value-based join operators as well as various index access operators to efficiently query XML data, the development of powerful native XML query optimizers is just emerging. Furthermore, it is not known how the aforementioned operators behave in complex XQuery evaluation scenarios, which occur frequently in real-world applications. The extensible, rule-based, and cost-based XML query optimization framework proposed in this work provides a basic testbed for exploring how and whether established techniques of relational cost-based query optimization (e.g., reordering of join operators) can be reused and which new techniques have to be developed to make a significant contribution for accelerating query execution. Using the best practices and an appropriate cost model that will be developed using this framework, it can be turned into a stable cost-based XML query optimizer in the future.
  • Source
    Article: Parallel Optimization of Queries in XML Dataset Using GPU
    ABSTRACT: As XML is playing a crucial role in web services, databases, and document processing, efficient processing of XML queries has become an important issue. On the other hand, due to the increasing number of users, high throughput of XML queries is also required to execute tens of thousands of queries in a short time. Given the great success of GPGPU (General-Purpose computation on Graphics Processing Units), we propose a parallel XML query model based on GPU, which mainly consists of two efficient task distribution strategies, to improve the efficiency and throughput of XML queries. We have developed a parallel simplified XPath language using Compute Unified Device Architecture (CUDA) on GPU, and evaluate our model on a recent NVIDIA GPU in comparison with its counterpart on an eight-core CPU. The experimental results show that our model achieves both higher throughput and efficiency than CPU-based XML query processing.
  • Source
    Article: Buffered Bloom filters on solid state storage
    ABSTRACT: Bloom Filters are widely used in many applications including database management systems. With a certain allowable error rate, this data structure provides an efficient solution for membership queries. The error rate is inversely proportional to the size of the Bloom filter. Currently, Bloom filters are stored in main memory because the low locality of operations makes them impractical on secondary storage. In multi-user database management systems, where there is a high contention for the shared memory heap, the limited memory available for allocating a Bloom filter may cause a high rate of false positives. In this paper we are proposing a technique to reduce the memory requirement for Bloom filters with the help of solid state storage devices (SSD). By using a limited memory space for buffering the read/write requests, we can afford a larger SSD space for the actual Bloom filter bit vector. In our experiments we show that with significantly less memory requirement and fewer hash functions the proposed technique reduces the false positive rate effectively. In addition, the proposed data structure runs faster than the traditional Bloom filters by grouping the inserted records with respect to their locality on the filter.
