A Data Mining Approach for selecting Bitmap Join Indices.

JCSE 01/2007; 1:177-194. DOI: 10.5626/JCSE.2007.1.2.177
Source: DBLP

ABSTRACT Index selection is one of the most important decisions in the physical design of relational data warehouses. Indices significantly reduce the cost of processing complex OLAP queries, but they consume storage and incur maintenance overhead. Two main types of indices are available: mono-attribute indices (e.g., B-tree, bitmap, hash) and multi-attribute indices (join indices, bitmap join indices). Bitmap join indices are well suited to optimizing star join queries, which are characterized by joins between a large fact table and multiple dimension tables and by selections on the dimension tables. They require less storage because of their binary representation. However, selecting these indices is a difficult task because of the exponential number of candidate attributes to be indexed. Most approaches to index selection follow two main steps: (1) pruning the search space (i.e., reducing the number of candidate attributes) and (2) selecting indices using the pruned search space. In this paper, we first propose a data mining driven approach to prune the search space of the bitmap join index selection problem. As opposed to an existing technique that uses only the frequency of attributes in queries as a pruning metric, our technique uses not only frequencies but also other parameters, such as the size of the dimension tables involved in the indexing process, the size of each dimension tuple, and the page size on disk. We then define a greedy algorithm to select bitmap join indices that minimize processing cost and satisfy a storage constraint. Finally, to evaluate the efficiency of our approach, we compare it with some existing techniques.
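
The selection step described in the abstract (choosing indices that lower processing cost while staying within a storage bound) can be illustrated with a minimal sketch. It is not the paper's algorithm: the `Candidate` type, the `benefit` field, and the benefit-per-byte ranking are assumptions made here for illustration, and the abstract does not give the paper's actual cost model. The only storage assumption is the usual one for an uncompressed bitmap index, namely one bit per fact-table row for each distinct value of the indexed attribute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate attribute for a bitmap join index."""
    attribute: str     # dimension attribute name
    cardinality: int   # number of distinct values (must be > 0)
    benefit: float     # assumed estimate of query-cost reduction


def bitmap_size_bytes(cardinality: int, fact_rows: int) -> int:
    # Uncompressed bitmap index: one bit per fact row per distinct value.
    return cardinality * ((fact_rows + 7) // 8)


def greedy_select(candidates, fact_rows: int, budget_bytes: int):
    """Pick candidates by benefit per byte until the budget is used up."""
    ranked = sorted(
        candidates,
        key=lambda c: c.benefit / bitmap_size_bytes(c.cardinality, fact_rows),
        reverse=True,
    )
    chosen, remaining = [], budget_bytes
    for cand in ranked:
        size = bitmap_size_bytes(cand.cardinality, fact_rows)
        if size <= remaining:  # skip candidates that no longer fit
            chosen.append(cand.attribute)
            remaining -= size
    return chosen


if __name__ == "__main__":
    pool = [
        Candidate("gender", 2, 100.0),
        Candidate("city", 50, 500.0),
        Candidate("customer_id", 10_000, 600.0),
    ]
    print(greedy_select(pool, fact_rows=8_000, budget_bytes=60_000))
```

In this sketch a high-cardinality attribute such as the hypothetical `customer_id` is rejected because its bitmaps alone exceed the budget, which is the kind of trade-off a storage-constrained selection has to make.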

  • ABSTRACT: This paper deals with the problem of integrated physical database design involving two optimization techniques: horizontal data partitioning (HDP) and bitmap join indexes (BJI). These techniques compete for the same resource, namely selection attributes. This competition gives rise to attribute interchangeability, where the same attribute(s) may be used to select either an HDP or a BJI scheme. Existing studies of the integrated physical database design problem do not consider this competition. We study how it can help reduce the complexity of the problem. Instead of tackling the problem in an integrated way, we propose to first assign to each technique its own attributes and then run each technique's own selection algorithm. This assignment is done using the K-Means method. Our design is compared with state-of-the-art work using the APB1 benchmark. The results show that a database designer that is aware of attribute interchangeability can significantly improve query performance with a smaller space budget.
    Advances in Databases and Information Systems - 15th International Conference, ADBIS 2011, Vienna, Austria, September 20-23, 2011. Proceedings; 01/2011
  • Annals of Information Systems, Special Issue on new trends in data warehousing and data analysis, Springer. 11/2008; 3:179-2001.
  • ABSTRACT: Bitmap join indexes (BJI) have been widely advocated by administrators as a solution to optimize complex queries. Their selection remains hard, since it requires exploring a large search space. Only a few classes of algorithms have been proposed to deal with the problem of BJI selection. These algorithms are static and do not take into account changes in the data warehouse in terms of query arrival. In this paper, we propose a genetic algorithm to select BJI defined on multiple attributes belonging to various dimension tables in a static way. This algorithm is then extended to deal with the incremental aspect, adapting the selected indexes as new queries arrive. An intensive experimental study was conducted to show the efficiency of our proposal and to compare it with the most important existing static and incremental studies.
    Journal of Decision Systems. 01/2012; 21(1):51-70.
