Conference Paper

# Relative prefix sums: an efficient approach for querying dynamic OLAP data cubes

Dept. of Comput. Sci., California Univ., Santa Barbara, CA

DOI: 10.1109/ICDE.1999.754948 Conference: Data Engineering, 1999. Proceedings., 15th International Conference on Source: DBLP

**ABSTRACT:** In-memory OLAP systems require a space-efficient representation of sparse data cubes in order to accommodate large data sets. On the other hand, many efficient online aggregation techniques, such as prefix sums, are built on dense array-based representations. These are often not applicable to real-world data due to the size of the arrays which usually cannot be compressed well, as most sparsity is removed during pre-processing. A possible solution is to identify dense regions in a sparse cube and only represent those using arrays, while storing sparse data separately, e.g. in a spatial index structure. Previous dense-region-based approaches have concentrated mainly on the effectiveness of the dense-region detection (i.e. on the space-efficiency of the result). However, especially in higher-dimensional cubes, data is usually more cluttered, resulting in a potentially large number of small dense regions, which negatively affects query performance on such a structure. In this article, our focus is not only on space-efficiency but also on time-efficiency, both for the initial dense-region extraction and for queries carried out in the resulting hybrid data structure. After describing a pre-aggregation method for representing dense sub-cubes which supports efficient online aggregate queries as well as cell updates, our sub-cube extraction approach is outlined in detail. In addition, optimizations in our approach significantly reduce the time to build the initial data structure compared to former systems. Two methods to trade available memory for increased aggregate query performance are provided. Also, we present a straightforward adaptation of our approach to support multi-core or multi-processor architectures, which can further enhance query performance. Experiments with different real-world data sets show how various parameter settings can be used to adjust the efficiency and effectiveness of our algorithms.

09/2010: pages 73-102

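The abstract above refers to prefix sums over dense arrays as a basis for fast aggregate queries. As a rough illustration only, the following sketch shows the plain two-dimensional prefix-sum idea on a small dense array; it is not the pre-aggregation method of the article, and the function names (`build_prefix_sums`, `range_sum`, `update_cell`) and the sample data are assumptions made for this example.

```python
def build_prefix_sums(cube):
    """Return P with one extra row and column of zeros, where
    P[i][j] is the sum of cube[0..i-1][0..j-1]."""
    rows, cols = len(cube), len(cube[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            P[i + 1][j + 1] = (cube[i][j] + P[i][j + 1]
                               + P[i + 1][j] - P[i][j])
    return P


def range_sum(P, r1, c1, r2, c2):
    """Sum of cube[r1..r2][c1..c2] (inclusive) from four lookups,
    using inclusion-exclusion on the prefix array."""
    return (P[r2 + 1][c2 + 1] - P[r1][c2 + 1]
            - P[r2 + 1][c1] + P[r1][c1])


def update_cell(cube, P, r, c, new_value):
    """Set cube[r][c] and repair every prefix entry that covers it.
    In the worst case this touches the whole prefix array."""
    delta = new_value - cube[r][c]
    cube[r][c] = new_value
    for i in range(r + 1, len(P)):
        for j in range(c + 1, len(P[0])):
            P[i][j] += delta


# Hypothetical 3x3 dense region.
cube = [[3, 5, 1],
        [7, 3, 2],
        [2, 4, 2]]
P = build_prefix_sums(cube)
total = range_sum(P, 0, 0, 2, 2)
lower_right = range_sum(P, 1, 1, 2, 2)
```

In this plain form a query needs a constant number of lookups, but a single cell update may have to rewrite a large part of the prefix array, which is the trade-off that update-friendly variants try to soften.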
##### Conference Paper: Approximate Range-Sum Queries over Data Cubes Using Cosine Transform.

**ABSTRACT:** In this research, we propose to use the discrete cosine transform to approximate the cumulative distributions of data cube cells' values. The cosine transform is known to have a good energy compaction property and thus can approximate data distribution functions easily with a small number of coefficients. The derived estimator is accurate and easy to update. We perform experiments to compare its performance with a well-known technique, the (Haar) wavelet. The experimental results show that the cosine transform performs much better than the wavelet in estimation accuracy, speed, space efficiency, and ease of update.

Keywords: DCT, Data Cube

Database and Expert Systems Applications, 19th International Conference, DEXA 2008, Turin, Italy, September 1-5, 2008. Proceedings; 01/2008
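To make the idea in this abstract more concrete, here is a minimal one-dimensional sketch, under assumptions of our own: it takes the running sums of the cell values, keeps only the first few coefficients of an orthonormal DCT-II, and answers a range-sum query from the reconstructed running sums. The paper's actual estimator may differ; the names `compress_cumulative` and `estimate_range_sum` and the sample data are hypothetical.

```python
import math


def _scale(k, n):
    # Orthonormal DCT-II scaling factor for coefficient k.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(x):
    """Orthonormal DCT-II of the sequence x (direct O(n^2) form)."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n))
            for k in range(n)]


def idct(coeffs, n):
    """Inverse transform for a signal of length n; coefficients beyond
    len(coeffs) are treated as zero (the truncation step)."""
    return [sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]


def compress_cumulative(cells, keep):
    """Keep the first `keep` DCT coefficients of the running sums."""
    running, total = [], 0.0
    for v in cells:
        total += v
        running.append(total)
    return dct(running)[:keep]


def estimate_range_sum(coeffs, n, lo, hi):
    """Estimate sum(cells[lo..hi]) from the truncated coefficients."""
    approx = idct(coeffs, n)
    return approx[hi] - (approx[lo - 1] if lo > 0 else 0.0)


# Hypothetical cell values along one dimension.
cells = [4, 1, 0, 3, 5, 2, 2, 6]
full = compress_cumulative(cells, len(cells))   # no truncation: exact
small = compress_cumulative(cells, 3)           # lossy summary
exact_estimate = estimate_range_sum(full, len(cells), 2, 5)
rough_estimate = estimate_range_sum(small, len(cells), 2, 5)
```

With all coefficients kept the estimate matches the true range sum up to floating-point error; with fewer coefficients it becomes an approximation whose quality depends on how smooth the running sums are.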

**ABSTRACT:** As the applications of wireless sensor networks continue to expand, it is important to support fast and simultaneous data aggregation over multiple regions for advanced data analysis. In this paper, we propose a solution by using a novel distributed data structure called distributed data cube (DDC). A DDC maintains a set of special forms of aggregate values (prefix sum, prefix average, prefix max, and prefix min) in distributed sensor nodes. We will first present fast algorithms to build a DDC within a sharp time bound. Then, we will present efficient distributed query-processing algorithms to handle aggregate queries by using a DDC. For a query region with n sensor nodes, our algorithms can return within O(√n) time. Finally, extensive simulation studies confirm that a DDC can be built very quickly, which is consistent with the theoretical time bound. The network traffic injected while constructing a DDC is acceptable and also scalable as the network size grows. Query processing on a DDC is fast and energy efficient in terms of the time units needed and the number of messages incurred.

IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews), 06/2011 · 2.55 Impact Factor
