Conference Paper

Relative prefix sums: an efficient approach for querying dynamic OLAP data cubes

Dept. of Comput. Sci., California Univ., Santa Barbara, CA
DOI: 10.1109/ICDE.1999.754948
Conference: Data Engineering, 1999. Proceedings., 15th International Conference on
Source: DBLP

ABSTRACT Range sum queries on data cubes are a powerful tool for analysis.
A range sum query applies an aggregation operation (e.g., SUM) over all
selected cells in a data cube, where the selection is specified by
providing ranges of values for numeric dimensions. Many application
domains require that information provided by analysis tools be current
or “near-current.” Existing techniques for range sum queries
on data cubes, however, can incur update costs on the order of the size
of the data cube. Since the size of a data cube is exponential in the
number of its dimensions, rebuilding the entire data cube can be very
costly. We present an approach that achieves constant time range sum
queries while constraining update costs. Our method reduces the overall
complexity of the range sum problem
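
To make the trade-off in the abstract concrete, the following is a minimal sketch of the plain prefix sum baseline for a two-dimensional cube, not the relative prefix sum method the paper proposes. The function names (`build_prefix_sums`, `range_sum`, `update`) and the sample cube are illustrative assumptions. It shows why a range sum needs only a constant number of lookups, and why a single cell update can touch a number of entries on the order of the cube size.

```python
def build_prefix_sums(cube):
    """Return P with a zero border, where P[i][j] is the sum of
    cube[0..i-1][0..j-1]."""
    rows, cols = len(cube), len(cube[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            P[i + 1][j + 1] = cube[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]
    return P


def range_sum(P, r1, c1, r2, c2):
    """Sum of cells in rows r1..r2 and columns c1..c2 (inclusive),
    using four lookups regardless of the range size."""
    return P[r2 + 1][c2 + 1] - P[r1][c2 + 1] - P[r2 + 1][c1] + P[r1][c1]


def update(P, r, c, delta):
    """Add delta to cell (r, c) and return how many prefix entries changed.
    Every prefix sum that covers (r, c) must be adjusted."""
    touched = 0
    for i in range(r + 1, len(P)):
        for j in range(c + 1, len(P[0])):
            P[i][j] += delta
            touched += 1
    return touched


if __name__ == "__main__":
    cube = [[3, 5, 1],
            [7, 3, 2],
            [2, 4, 2]]
    P = build_prefix_sums(cube)
    print(range_sum(P, 1, 0, 2, 1))   # 7 + 3 + 2 + 4 = 16
    print(update(P, 0, 0, 10))        # 9: every entry of the 3x3 cube is touched
    print(range_sum(P, 0, 0, 2, 2))   # 39: the total now includes the delta
```

In this sketch, updating the cell at the origin rewrites every prefix entry, which is the worst case the abstract refers to when it says update costs can be on the order of the size of the data cube.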

  • ABSTRACT: This paper studies aggregate search in transaction time databases. Specifically, each object in such a database can be modeled as a horizontal segment whose y-projection is its search key and whose x-projection represents the period when the key was valid in history. Given a query timestamp $q_t$ and a key range $\vec{q_k}$, a count-query retrieves the number of objects that are alive at $q_t$ and whose keys fall in $\vec{q_k}$. We provide a method that accurately answers such queries, with error less than $1/\varepsilon + \varepsilon \cdot N_{alive}(q_t)$, where $N_{alive}(q_t)$ is the number of objects alive at time $q_t$, and $\varepsilon$ is any constant in (0,1). Denoting the disk page size as $B$, and $n = N/B$, our technique requires $O(n)$ space, processes any query in $O(\log_B n)$ time, and supports each update in $O(\log_B n)$ amortized I/Os. As demonstrated by extensive experiments, the proposed solutions guarantee query results with extremely high precision (median relative error below 5%), while consuming only a fraction of the space occupied by the existing approaches that promise precise results.
    The VLDB Journal 01/2008; 17:1271-1292. DOI:10.1007/s00778-007-0066-x · 1.70 Impact Factor
  • ABSTRACT: In this research, we propose to use the discrete cosine transform to approximate the cumulative distributions of data cube cells' values. The cosine transform is known to have a good energy compaction property and thus can approximate data distribution functions easily with a small number of coefficients. The derived estimator is accurate and easy to update. We perform experiments to compare its performance with a well-known technique, the (Haar) wavelet. The experimental results show that the cosine transform performs much better than the wavelet in estimation accuracy, speed, space efficiency, and ease of update. Keywords: DCT, Data Cube
    Database and Expert Systems Applications, 19th International Conference, DEXA 2008, Turin, Italy, September 1-5, 2008. Proceedings; 01/2008
  • ABSTRACT: Data warehouses contain data consolidated from several operational databases and provide historical and summarized data. On-Line Analytical Processing (OLAP) is designed to provide aggregate information for analyzing the contents of data warehouses. An increasingly popular data model for OLAP applications is the multidimensional database, also known as the data cube. A range sum query applies a sum aggregation operation over all selected cells of an OLAP data cube, where the selection is specified by providing ranges of values for numeric dimensions. It is very useful for finding trends and discovering relationships between attributes in the database. Today's interactive data analysis applications, which must provide current information, require fast response times and reasonable update times. Since the size of a data cube is exponential in the number of its dimensions, rebuilding the entire data cube is very time-consuming. To solve this update problem, we present the recursive relative prefix sum method, which provides a compromise between query and update cost. Our performance study shows that the update cost of our method is always less than that of the prefix sum method. Our recursive relative prefix sum method has a reasonable response time for ad hoc range queries on the data cube while greatly reducing the update cost.

