Conference Paper

A New Bitmap Index and a New Data Cube Compression Technology.

DOI: 10.1007/978-3-540-69848-7_97
Conference: Computational Science and Its Applications - ICCSA 2008, International Conference, Perugia, Italy, June 30 - July 3, 2008, Proceedings, Part II
Source: DBLP

ABSTRACT This paper introduces a new kind of bitmap index. Each tuple in the data cube is mapped to a sequential key (seqkey)
determined by its value in each dimension. A quotient bit sequence is then constructed in which each bit records whether the
cell corresponding to a seqkey exists in the cover quotient cube, and the cover quotient cube is indexed by this quotient bit
sequence (the qcbit index). A compression method is also presented that produces the seqkey cover quotient cube, which
compresses the cover quotient cube by omitting the dimension attributes of all cells. Based on this index and compression
method, algorithms are proposed for querying the cover quotient cube and the seqkey cover quotient cube, with the aim of
improving the storage and querying of data cubes. Experimental results on the weather dataset show that the qcbit index file
is only 11% of the size of the value-list index file, and that the seqkey cover quotient cube is only 27.75% of the size of
the original cover quotient cube.

  • ABSTRACT: Typically, online aggregation algorithms on multi-dimensional data need additional auxiliary data for estimation, which degrades the storage and maintenance performance of the data cube. This paper presents the PE (progressively estimate) and HPE (hybrid progressively estimate) algorithms, which progressively estimate the answers to range queries on QC-trees. MPE (multiple progressively estimate) is also proposed to evaluate batches of range-sum queries simultaneously. Unlike other online aggregation algorithms on data cubes, these algorithms do not need any auxiliary information: the estimation method uses only the data stored in the QC-tree itself. As a result, the algorithms do not degrade the storage and maintenance performance of the data cube. Analysis and experimental results show that the algorithms provide an
    Journal of Software. 01/2006; 17(4).
  • ABSTRACT: Recently, a technique called quotient cube was proposed as a summary structure for a data cube that preserves its semantics, with applications for online exploration and visualization. The authors showed that a quotient cube can be constructed very efficiently and it leads to a significant reduction in the cube size. While it is an interesting proposal, that paper leaves many issues unaddressed. Firstly, a direct representation of a quotient cube is not as compact as possible and thus still wastes space. Secondly, while a quotient cube can in principle be used for answering queries, no specific algorithms were given in the paper. Thirdly, maintaining any summary structure incrementally against updates is an important task, a topic not addressed there. In this paper, we propose an efficient data structure called QC-tree and an efficient algorithm for directly constructing it from a base table, solving the first problem. We give efficient algorithms that address the remaining questions. We report results from an extensive performance study that illustrate the space and time savings achieved by our algorithms over previous ones (wherever they exist).
    Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, San Diego, California, USA, June 9-12, 2003; 01/2003
  • ABSTRACT: Partitioning a data cube into sets of cells with "similar behavior" often better exposes the semantics in the cube. E.g., if we find that average boots sales in the West 10th store of Walmart was the same for winter as for the whole year, it signifies something interesting about the trend of boots sales in that location in that year. In this paper, we are interested in finding succinct summaries of the data cube, exploiting regularities present in the cube, with a clear basis. We would like the summary: (i) to be as concise as possible, (ii) to itself form a lattice preserving the rollup/drilldown semantics of the cube, and (iii) to allow the original cube to be fully recovered. We illustrate the utility of solving this problem and discuss the inherent challenges. We develop techniques for partitioning cube cells for obtaining succinct summaries, and introduce the quotient cube. We give efficient algorithms for computing it from a base table. For monotone aggregate functions (e.g., COUNT, MIN, MAX, SUM on non-negative measures, etc.), our solution is optimal (i.e., quotient cube of the least size). For nonmonotone functions (e.g., AVG), we obtain a locally optimal solution. We experimentally demonstrate the efficacy of our ideas and techniques and the scalability of our algorithms.
    VLDB 2002, Proceedings of 28th International Conference on Very Large Data Bases, August 20-23, 2002, Hong Kong, China; 01/2002
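
The partitioning idea in the abstract above can be sketched on a toy base table. The sketch assumes cover equivalence as the notion of "similar behavior": two cells fall into the same class when they match exactly the same base tuples. The abstract itself does not spell out the partition, and this brute-force enumeration is not the efficient algorithm the authors describe. The function names and the data are hypothetical.

```python
from collections import defaultdict
from itertools import product


def cover(cell, base):
    """Indices of the base tuples matched by a cell; '*' matches any value."""
    return frozenset(
        i
        for i, (dims, _measure) in enumerate(base)
        if all(c == "*" or c == d for c, d in zip(cell, dims))
    )


def cover_classes(base):
    """Group every non-empty cube cell by the set of base tuples it covers."""
    n_dims = len(base[0][0])
    domains = [
        sorted({dims[i] for dims, _measure in base}) + ["*"]
        for i in range(n_dims)
    ]
    classes = defaultdict(list)
    for cell in product(*domains):
        covered = cover(cell, base)
        if covered:  # skip cells that match no base tuple
            classes[covered].append(cell)
    return classes


# Hypothetical base table: (store, season) -> sales.
base = [
    (("S1", "winter"), 10),
    (("S2", "winter"), 4),
    (("S2", "summer"), 6),
]

classes = cover_classes(base)
for covered, cells in classes.items():
    print(sorted(covered), cells)
```

Store S1 sold only in winter, so the cells (S1, winter) and (S1, *) cover the same base tuple and land in one class, which mirrors the boots example in the abstract: any aggregate computed for the winter is the same as for the whole year.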