A Data Mining Approach for Selecting Bitmap Join Indices

JCSE 12/2007; 1(2):177-194. DOI: 10.5626/JCSE.2007.1.2.177
Source: DBLP


Index selection is one of the most important decisions in the physical design of relational data warehouses. Indices significantly reduce the cost of processing complex OLAP queries, but they consume storage and incur maintenance overhead. Two main types of indices are available: mono-attribute indices (e.g., B-tree, bitmap, hash) and multi-attribute indices (join indices, bitmap join indices). Bitmap join indices are well suited to optimizing star join queries, which are characterized by joins between a large fact table and multiple dimension tables and by selections on the dimension tables. They also require less storage because of their binary representation. However, selecting these indices is difficult because the number of candidate attributes to index is exponential. Most approaches to index selection follow two main steps: (1) pruning the search space (i.e., reducing the number of candidate attributes) and (2) selecting indices from the pruned search space. In this paper, we first propose a data-mining-driven approach to prune the search space of the bitmap join index selection problem. Unlike an existing technique that uses only the frequency of attributes in queries as a pruning metric, our technique uses not only frequencies but also other parameters, such as the size of the dimension tables involved in the indexing process, the size of each dimension tuple, and the disk page size. We then define a greedy algorithm to select bitmap join indices that minimize processing cost while satisfying the storage constraint. Finally, to evaluate the efficiency of our approach, we compare it with some existing techniques.
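
To make the second step more concrete, here is a minimal sketch of a greedy selection under a storage budget. It is not the algorithm defined in the paper: the `Candidate` class, the `greedy_select` function, the benefit-per-storage ranking, and all numbers are assumptions chosen for illustration, and the sketch treats each index's benefit as independent of the others, which a real cost model would not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate bitmap join index on one dimension attribute."""
    attribute: str   # dimension attribute to index (illustrative name)
    storage: int     # assumed index size, in arbitrary units (must be > 0)
    benefit: float   # assumed reduction in workload cost if the index is built


def greedy_select(candidates, storage_budget):
    """Pick candidates by benefit per unit of storage while the budget allows.

    Returns the chosen candidates and the total storage they use.
    """
    chosen = []
    used = 0
    # Rank by benefit density; this ranking rule is an assumption of the sketch.
    ranked = sorted(candidates, key=lambda c: c.benefit / c.storage, reverse=True)
    for cand in ranked:
        # Skip any candidate that would violate the storage constraint.
        if used + cand.storage <= storage_budget:
            chosen.append(cand)
            used += cand.storage
    return chosen, used


# Made-up candidates that would remain after a pruning step.
candidates = [
    Candidate("Customer.city", 40, 120.0),
    Candidate("Product.category", 10, 50.0),
    Candidate("Time.month", 30, 60.0),
    Candidate("Customer.gender", 5, 10.0),
]

chosen, used = greedy_select(candidates, storage_budget=50)
for cand in chosen:
    print(cand.attribute, cand.storage, cand.benefit)
print("storage used:", used)
```

Ranking by benefit per unit of storage is only one possible greedy rule. A selection driven by a query cost model would re-estimate the workload cost after each index is added rather than rely on fixed per-index benefits.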


Available from: Habiba Drias, Oct 04, 2015
    • "The comparison of ACS-BJIS with the previous works of [1] and [3] is possible in the case of non restriction of storage where the gain in time reported for DynaClose is around 69.66%. In this work ACS-BJIS achieved near 75% of gain, which outperforms DynaClose. "
    ABSTRACT: Unlike existing studies on the selection of bitmap join indexes for star join query optimization, this paper presents three original features. The first is that it addresses the problem with an ant-based approach, which is more robust than the simple heuristic algorithms usually used in related work. The second novelty lies in the metric used to prune the search space: the fitness function designed for the ant approach is borrowed from information retrieval and is more refined than the frequency measure usually used. Finally, the third is the data structure used to manage storage dynamically in order to select the most promising indexes.
    2010 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2010, Toronto, Canada, August 31 - September 3, 2010, Main Conference Proceedings; 01/2010
    • "Which selection algorithm should I use? Two major types of algorithms have been proposed for selecting bitmap join indices: greedy algorithms (Bellatreche et al., 2007) and algorithms based on data mining techniques (Aouiche et al., 2005; Bellatreche et al., 2008). 4. Which selection algorithm parameters should I configure? "
    ABSTRACT: In this thesis we aim to propose a set of approaches for optimizing data warehouses and for helping the data warehouse administrator (DWA) carry out this optimization. Our optimization approaches rely on three optimization techniques: primary horizontal fragmentation, derived horizontal fragmentation, and bitmap join indexes (BJIs). We begin by proposing a fragmentation approach that takes into account both performance (reducing execution cost) and manageability (controlling the number of fragments generated). We then propose a greedy approach for selecting BJIs. Using horizontal fragmentation (HF) and BJIs separately does not exploit the similarities that exist between the two techniques. We propose an approach that selects HF and BJIs jointly. This approach can be used for tuning the warehouse. We conducted several experiments to validate our approaches. We then propose a tool to help the DWA with physical design and tuning tasks. Keywords: Physical design, Tuning, Optimization techniques, Horizontal fragmentation, Bitmap join indexes.