A Model-Driven Heuristic Approach for Detecting Multidimensional Facts in Relational Data Sources

Lucentia Research Group, Dept. of Software and Computing Systems, University of Alicante, Spain
DOI: 10.1007/978-3-642-15105-7_2
Source: DBLP

ABSTRACT Facts are multidimensional concepts of primary interest to knowledge workers because they are related to events occurring
dynamically in an organization. Normally, these concepts are modeled in operational data sources as tables. Thus, one of the
main steps in the conceptual design of a data warehouse is to detect the tables that model facts. However, this task may require
a high level of expertise in the application domain, and is often tedious and time-consuming for designers. To overcome these
problems, a comprehensive model-driven approach is presented in this paper to support designers in: (1) obtaining a CWM model
of business-related relational tables, (2) determining which elements of this model can be considered as facts, and (3) deriving
their counterparts in a multidimensional schema. Several heuristics, based on structural information derived from data sources,
have been defined to this end and included in a set of Query/View/Transformation model transformations.
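
The abstract does not list the heuristics themselves, so the following is only a minimal sketch of what step (2) could look like under an assumed structural rule: a table is treated as a fact candidate when it references at least two other tables through foreign keys and carries at least one numeric non-key column. The `Column` and `Table` classes, the `candidate_facts` function, the `min_foreign_keys` threshold, and the example tables are all hypothetical names introduced for illustration; they are not the paper's CWM metamodel or its QVT transformations.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: these SQL type names count as numeric for the purpose of the sketch.
NUMERIC_TYPES = {"INTEGER", "DECIMAL", "NUMERIC", "FLOAT", "REAL", "DOUBLE"}


@dataclass
class Column:
    name: str
    sql_type: str
    is_primary_key: bool = False
    references: Optional[str] = None  # referenced table name if this is a foreign key


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)


def foreign_key_count(table: Table) -> int:
    """Number of columns that reference another table."""
    return sum(1 for c in table.columns if c.references is not None)


def measure_count(table: Table) -> int:
    """Number of numeric columns that are neither primary nor foreign keys."""
    return sum(
        1
        for c in table.columns
        if c.sql_type.upper() in NUMERIC_TYPES
        and c.references is None
        and not c.is_primary_key
    )


def candidate_facts(tables: List[Table], min_foreign_keys: int = 2) -> List[str]:
    """Return names of fact candidates, highest structural score first.

    Hypothetical rule (not taken from the paper): keep tables with at least
    `min_foreign_keys` foreign keys and at least one numeric non-key column.
    """
    scored = [
        (foreign_key_count(t) + measure_count(t), t.name)
        for t in tables
        if foreign_key_count(t) >= min_foreign_keys and measure_count(t) >= 1
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


# Hypothetical example schema.
customer = Table("customer", [Column("id", "INTEGER", True), Column("name", "VARCHAR")])
product = Table("product", [Column("id", "INTEGER", True), Column("price", "DECIMAL")])
sales = Table(
    "sales",
    [
        Column("id", "INTEGER", True),
        Column("customer_id", "INTEGER", references="customer"),
        Column("product_id", "INTEGER", references="product"),
        Column("amount", "DECIMAL"),
    ],
)

facts = candidate_facts([customer, product, sales])
```

With this example schema only `sales` satisfies the assumed rule; `product` has a numeric column but no foreign keys, so it would more plausibly map to a dimension.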

  • ABSTRACT: In this paper the authors present a semi-automatic method for designing and generating spatial data cubes in order to visualize and analyze the results of simulation models. In the authors' approach, users choose their fact of analysis; the system then automatically derives a set of possible measures and dimensions of analysis, and generates the corresponding spatial data cubes. The analysis and visualization of the spatial data cubes are carried out using appropriate SOLAP tools. The authors also present the SimOLAP system, developed to validate their approach.
    International Journal of Data Warehousing and Mining 01/2013; 9:70-95. DOI:10.4018/jdwm.2013010104 · 0.79 Impact Factor
  • ABSTRACT: During the last few years, several frameworks have dealt with Data Warehousing (DW) design issues. Most of these frameworks provide partial answers that focus either on multidimensional modelling (MD) or on Extraction-Transformation-Loading modelling (ETL). However, little attention has been given to unifying both modelling issues in a single structured framework or to automating the warehousing process. To overcome these limits, this paper provides a generic, unified and automated method that integrates DW and ETL process design. The framework is handled within the Model Driven Architecture (MDA). It (i) first helps the designer in modelling the decision-maker's requirements, then (ii) generates the MD model as well as (iii) the logical and the physical models, and finally (iv) generates the code. In this approach, the transformation rules are formalized using the Query/View/Transformation (QVT) language.
    Journal of Decision Systems 01/2012; 21(1):27-49. DOI:10.1080/12460125.2012.677641
  • ABSTRACT: A data warehouse (DW) is a large data repository system designed for decision-making purposes. Its design relies on a specific model called multidimensional. This multidimensional model supports analyses of huge volumes of data that trace the enterprise's activities over time. Several design methods have been proposed to build multidimensional schemas from either the relational data model or the entity-relationship data model. Almost all proposals that address the object-oriented data model assume the presence of the data source UML class diagram. However, in practice, such a diagram either does not exist or is obsolete due to multiple changes to the information system. Furthermore, these few proposals require intense manual intervention by the designer, which demands considerable expertise both in the DW domain and in the object database domain. To overcome these disadvantages, this work proposes an automatic DW schema design method starting from an object database (schema and its instances). This method applies a set of extraction rules to identify multidimensional concepts and to generate star schemas. It is defined for the standard ODMG model and, thus, can be adapted with slight changes for other object database models. In addition, its extraction rules have the merit of being independent of the domain semantics. Furthermore, they automatically generate schemas classified according to their analytical potential; this classification helps the DW designer in selecting the most relevant schemas among the generated ones. Finally, being automatic, the method is supported by a tool-set that also prepares for the automatic generation of the Extract, Transform and Load procedures used to load the DW.
    International Journal of Information Technology and Decision Making 11/2013; 12(6-06):1223-1259. DOI:10.1142/S0219622013500351 · 1.89 Impact Factor
