Conference Paper

UMLtoNoSQL: Automatic Transformation of Conceptual Schema to NoSQL Databases

... On the other hand, this approach to modeling, which does not require a conceptual data model, may also cause problems [15,16] such as forgetting important domain concepts or their relationships, losing information, or even misunderstanding business rules in ways that no stakeholder [17][18][19][20][21][22][23][24] notices, because there is no representation of the data domain to assess. Researchers and companies that provide products and services for the commercial use of NoSQL databases [25,26] have studied the design of NoSQL databases and recommend considering conceptual data models in addition to queries. ...
... The conceptual model is an important element not only in ensuring data integrity but also in deriving a logical data model [57]. Starting from a conceptual model, existing works automatically transform it into a NoSQL schema [22] that can serve the queries with minimal cost [18], map it to heterogeneous datastores [23], and design MongoDB [17] and HBase [19] databases. In [24], a tool is designed to generate implementations for Cassandra and MongoDB from the same conceptual data model. ...
Article
Current information technologies generate large amounts of data for management or further analysis, storing it in NoSQL databases which provide horizontal scaling and high performance, supporting many read/write operations per second. NoSQL column-oriented databases, such as Cassandra and HBase, are usually modelled following a query-driven approach, resulting in denormalized databases where the same data can be repeated in several tables. Therefore, maintaining data integrity relies on client applications to ensure that, for data changes that occur, the affected tables will be appropriately updated. We devise a method called MDICA that, given a data insertion at a conceptual level, determines the required actions to maintain database integrity in column-oriented databases. This method is implemented for Cassandra database applications. MDICA is based on the definition of (1) rules to determine the tables that will be impacted by the insertion, (2) procedures to generate the statements to ensure data integrity and (3) messages to warn the user about errors or potential problems. This method helps developers in two ways: generating the statements needed to maintain data integrity and producing messages to avoid problems such as loss of information, redundant repeated data or gaps of information in tables.
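The idea of deciding which denormalized tables an insertion touches can be illustrated with a minimal sketch. It is not the MDICA rule set: the table catalogue `TABLES`, the helper names, and the rule "a table is impacted if any of its columns comes from the inserted entity" are all assumptions made for illustration.

```python
# Hypothetical catalogue: table name -> {column: conceptual entity it comes from}.
TABLES = {
    "books_by_author": {"author_id": "Author", "author_name": "Author",
                        "book_id": "Book", "title": "Book"},
    "books_by_id": {"book_id": "Book", "title": "Book"},
    "authors_by_id": {"author_id": "Author", "author_name": "Author"},
}


def impacted_tables(entity, tables):
    """Assumed rule: a table is impacted if it stores any attribute of the entity."""
    return sorted(name for name, cols in tables.items() if entity in cols.values())


def insert_statements(entity, values, tables):
    """Return (statements, warnings) for inserting one conceptual entity instance.

    A statement is a (CQL text, parameter list) pair. A table whose columns
    cannot all be filled from `values` yields a warning instead of a statement.
    """
    statements, warnings = [], []
    for name in impacted_tables(entity, tables):
        cols = list(tables[name])
        missing = [c for c in cols if c not in values]
        if missing:
            warnings.append(f"{name}: no value for {', '.join(missing)}")
            continue
        placeholders = ", ".join("?" for _ in cols)
        cql = f"INSERT INTO {name} ({', '.join(cols)}) VALUES ({placeholders})"
        statements.append((cql, [values[c] for c in cols]))
    return statements, warnings


stmts, warns = insert_statements("Book", {"book_id": 1, "title": "Dune"}, TABLES)
```

Here the insertion of a `Book` produces one statement for `books_by_id` and a warning for `books_by_author`, because the author columns of that table cannot be filled from the supplied values.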
... Imam et al. [39] | document-oriented | - | - | JSON
Akintoye et al. [6] | document-oriented, graph | - | - | JSON
Martins de Sousa and del Val Cura [49] | graph | new based on ER | new based on ER | -
Vágner [76] | graph | EER | - | Neo4j model
Abdelhedi et al. [2] | column | - | - | Cassandra and HBase models
Nogueira et al. [53] | document-oriented | - | - | JSON
Hamouda and Zainol [34] | document-oriented | - | new based on UML | -
Abdelhedi et al. [5] | document-oriented, graph, column | - | generic model | generic model ...
... Benchmark: [53]
Evaluation: [60]
Guidelines: [6], [9], [12], [14], [17], [20], [21], [29], [37], [38], [40], [43], [56], [61], [65], [68], [71], [72], [73], [74], [76], [79], [80]
Migration: [34]
Ontology: [12]
Process Transform: [1], [2], [3], [4], [5], [13], [23], [26], [28], [47], [48], [49], [50], [54], [58], [59], [62], [63], [66], [69], [75], [77], [78]
Query Oriented: [46]
Schema Generation: [39], [41], [52]
A distribution of NoSQL database types along with the contexts where the models were used is shown in Figure 8. ...
Article
Modeling is one of the most important steps in developing a database. In traditional databases, the Entity Relationship (ER) and Unified Modeling Language (UML) models are widely used. But how are NoSQL databases being modeled? We performed a systematic mapping review to answer three research questions, identifying and analyzing the levels of representation, the models used, and the contexts where the modeling process occurred in the main categories of NoSQL databases. We found 54 primary studies, in which the conceptual and logical levels received more attention than the physical level of representation. UML, ER, and new notations based on ER and UML were adapted to model NoSQL databases; likewise, formats such as JSON, XML, and XMI were used to generate schemas across the three levels of representation. New contexts such as benchmarks, evaluations, migration, and schema generation were identified, as well as new features to be considered for modeling NoSQL databases, such as the number of records per entity, CRUD operations, and system requirements (availability, consistency, or scalability). Additionally, a coupling and co-citation analysis was carried out to identify relevant works and researchers.
... UMLtoNoSQL [1] also presents a process for the automatic generation of various physical models starting from a conceptual UML class diagram. This approach describes a forward engineering process that maps across a generic metamodel to different physical models. ...
... Abdelhedi et al. [17] have proposed an automatic transformation of conceptual models expressed in the Unified Modeling Language (UML) into NoSQL physical models. To the best of our knowledge, only a few works currently exist that query data from NoSQL and relational database systems simultaneously. ...
Conference Paper
A relational database is a digital database based on the relational model of data, as proposed by E. F. Codd in 1970. A software system used to maintain relational databases is a relational database management system (RDBMS). Many relational database systems provide an option for querying and maintaining the database. A NoSQL (originally referring to "non-SQL" or "non-relational") database provides a mechanism for the storage and retrieval of modeled data. As NoSQL databases grow in popularity, the integration of different NoSQL systems and the interoperability of NoSQL systems with SQL databases become increasingly important issues. We propose a hybrid approach to extracting OLAP cubes from NoSQL data sources and building a standard data access system for NoSQL and SQL data sources. Our proposed algorithm consists of four stages: shingling, chunk, minhashing, and locality-sensitive hashing MapReduce (LSHMR). Each stage performs an effective procedure on the input NoSQL databases. Our proposed method and algorithm demonstrate over 75% efficiency compared with other algorithms.
... Later, the same author proposed three approaches to implement a big data warehouse within column-oriented NoSQL systems; each approach differs in terms of its conceptual and logical model [10]. In [11], the author proposes a transformation approach to implement a UML class diagram in a column-oriented NoSQL database. ...
Conference Paper
Nowadays, NoSQL technologies are gaining significant ground and are considered the future of data storage, especially for huge amounts of data, as is the case in data warehouse solutions. NoSQL databases provide high scalability and good performance compared with relational ones, which are time-consuming and can't handle large data volumes. The growing popularity of the term NoSQL and of loosely related phrases like big data prompts us to consider using this technology in decision support systems. The purpose of this paper is to investigate the possibility of instantiating a big data mart under one of the most popular and least complicated types of NoSQL databases, namely the key-value store. The main challenge is to establish a good correspondence between the old-school approach of data warehousing, based on traditional databases that favor data integrity, and the interesting opportunities offered by the new generation of database management systems. The paper describes the transformation process from a multidimensional conceptual schema to the logical model following three approaches, and outlines a list of strengths and weaknesses for each one based on practical experience with Oracle NoSQL Database.
... To complete it, we proposed the OCL2Java process, which incorporates more complex OCL constraints. The work in this chapter was presented in the publications [Abdelhedi et al., 2017d] and [Abdelhedi et al., 2018c]. ...
Thesis
It is widely accepted today that relational systems are not appropriate to handle Big Data. This has led to a new category of databases, commonly known as NoSQL databases, that were created in response to the need for better scalability, higher flexibility, and faster data access. These systems have proven their efficiency in storing and querying Big Data. Unfortunately, only a few works have presented approaches to implement conceptual models describing Big Data in NoSQL systems. This paper proposes an automatic MDA-based approach that provides a set of transformations, formalized with the QVT language, to translate UML conceptual models into NoSQL models. In our approach, we build an intermediate logical model compatible with column, document, graph, and key-value systems. The advantage of using a unified logical model is that this model remains stable even as the NoSQL system evolves over time, which simplifies the transformation process and saves developers effort and time.
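The two-step pattern described here (conceptual model to a platform-neutral logical model, then logical model to a platform-specific one) can be sketched as follows. This is an illustration in Python rather than QVT, and it does not reproduce the authors' transformation rules: the classes `UmlClass` and `LogicalTable`, the `TYPE_MAP` table, and the function names are all hypothetical. The generated targets are a MongoDB `$jsonSchema` validator document and a Cassandra CQL `CREATE TABLE` statement.

```python
from dataclasses import dataclass

# Hypothetical type mapping: UML type -> (MongoDB bsonType, Cassandra CQL type).
TYPE_MAP = {
    "String": ("string", "text"),
    "Integer": ("int", "int"),
    "Boolean": ("bool", "boolean"),
}


@dataclass
class UmlClass:
    """Conceptual level: a class with attribute name -> UML type name."""
    name: str
    attributes: dict


@dataclass
class LogicalTable:
    """Assumed generic logical level, independent of any NoSQL system."""
    name: str
    attributes: dict


def uml_to_logical(cls):
    """Step 1: conceptual model -> generic logical model."""
    return LogicalTable(cls.name.lower(), dict(cls.attributes))


def logical_to_mongodb(table):
    """Step 2a: logical model -> MongoDB $jsonSchema validator document."""
    props = {a: {"bsonType": TYPE_MAP[t][0]} for a, t in table.attributes.items()}
    return {"$jsonSchema": {"bsonType": "object",
                            "required": list(table.attributes),
                            "properties": props}}


def logical_to_cassandra(table, key):
    """Step 2b: logical model -> Cassandra CQL CREATE TABLE statement."""
    cols = ", ".join(f"{a} {TYPE_MAP[t][1]}" for a, t in table.attributes.items())
    return f"CREATE TABLE {table.name} ({cols}, PRIMARY KEY ({key}));"


customer = UmlClass("Customer", {"id": "Integer", "name": "String"})
logical = uml_to_logical(customer)
```

Because both generators read only `LogicalTable`, supporting another target system would mean adding one more `logical_to_...` function while `uml_to_logical` stays unchanged, which is the stability argument made for the unified logical model.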
Article
Due to the scalability and availability problems with traditional relational database systems, a variety of NoSQL stores have emerged over the last decade to deal with big data. How data are structured in a NoSQL store has a large impact on the query and update performance and the storage usage. Thus, different from the traditional database design, not only the data structure but also the data access patterns need to be considered in the design of NoSQL database schemas. In this paper, we present a general workload-driven method for designing key–value, wide-column, and document NoSQL database schemas. We first present a generic logical model Query Path Graph (QPG) that can represent the data structures of the UML class diagram. We also define mappings from the SQL-based query patterns to QPG and from QPG to aggregate-oriented NoSQL schemas. We use a cost model to measure the query and update performance and optimize the QPG schemas. We evaluate the proposed method with several typical case studies by simulating workloads on databases with different schema designs. The results demonstrate that our method preserves the generality and the quality of the design.
Book
The ER 2021 Demos and Posters track was part of the 40th International Conference on Conceptual Modeling (ER 2021). The track aims to serve as a platform for presenting and discussing novel research ideas, addressing any ER conference topics, and new emerging topics related to conceptual modeling. We received 14 submissions, each of which was assigned to three program committee members. Based on their reviews, we accepted nine papers. The accepted papers reflect upcoming work and the directions of conceptual modeling research over the next few years. The varied topics in this track demonstrate that the conceptual modeling field is actively exploring new horizons. Every paper deals with an avant-garde topic, be it flexible schemas or augmented reality. Several papers leverage artificial intelligence capabilities. We also recognize the value of past technologies and past conceptual modeling approaches, such as relational databases or entity-relationship diagrams, which are being appropriately applied in new contexts.
Article
Today, the relational database is not suitable for data management due to the large variety and volume of data, which are mostly untrusted. Therefore, NoSQL has attracted the attention of companies. Although it is a proper choice for managing a variety of large-volume data, performing online analytical processing (OLAP) on NoSQL is a big challenge because it is schema-less. This article aims to introduce a model to overcome null values in converting document-oriented NoSQL databases into relational databases using parallel similarity techniques. The proposed model includes four phases: shingling, chunk, minhashing, and locality-sensitive hashing MapReduce (LSHMR). Each phase performs a proper process on the input NoSQL databases. The main idea of LSHMR is based on the nature of both locality-sensitive hashing (LSH) and MapReduce (MR). In this article, the LSH similarity search technique is used on the MR framework to extract OLAP cubes. LSH is used to decrease the number of comparisons. Furthermore, MR enables efficient distributed and parallel computing. The proposed model is an efficient and suitable approach for extracting OLAP cubes from a NoSQL database.
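To show how LSH reduces the number of comparisons, here is a minimal single-machine sketch of shingling, minhashing, and LSH banding. It is not the LSHMR algorithm: the chunk phase and the MapReduce distribution are omitted, and the function names, the word-level shingle size, the SHA-1 based hash family, and the band count are assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Set of k-word shingles (a single shingle if the text has fewer than k words)."""
    tokens = text.split()
    return {" ".join(tokens[i:i + k]) for i in range(max(1, len(tokens) - k + 1))}


def _hash(seed, item):
    """Deterministic 64-bit hash; each seed acts as one hash function of the family."""
    digest = hashlib.sha1(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(shingle_set, num_hashes=32):
    """MinHash signature: the minimum hash value of the set under each hash function."""
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_hashes)]


def lsh_candidates(signatures, bands=8):
    """Return pairs of ids whose signatures agree on every row of at least one band."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for doc_id, sig in signatures.items():
        for b in range(bands):
            buckets[(b, tuple(sig[b * rows:(b + 1) * rows]))].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


docs = {
    "a": "customer name address city country phone",
    "b": "customer name address city country phone",
    "c": "sensor timestamp voltage reading unit status",
}
signatures = {doc_id: minhash(shingles(text)) for doc_id, text in docs.items()}
candidates = lsh_candidates(signatures)
```

Only pairs that share a bucket are compared in full afterwards, so documents with disjoint shingle sets, such as `a` and `c` here, are never paired.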
Article
Relational database management systems (RDBMSs) are today the predominant technology for storing data. In the past few years, the "one size fits all" thinking concerning datastores has been questioned by both science and web-affine companies, which has led to the emergence of a great variety of alternative databases. There has been enormous growth in the distributed databases area in the last few years, especially with the NOSQL movement. With this as motivation, this paper aims to give a systematic overview of DBMSs, discussing the change from traditional file processing to RDS and ending with NOSQL. We also focus on the projects handled by the NOSQL models, with their descriptions, and state when each is most suitable. Lastly, we list the compared features of NoSQL and SQL. Our paper will further help researchers develop new projects by overcoming the drawbacks of existing ones, or work on existing ones and add features.
Conference Paper
Big Data has recently gained popularity and has strongly questioned relational databases as universal storage systems, especially in the presence of analytical workloads. As a result, co-relational alternatives, commonly known as NOSQL (Not Only SQL) databases, are extensively used for Big Data. As the primary focus of NOSQL is on performance, NOSQL databases are designed directly at the physical level, and consequently the resulting schema is tailored to the dataset and access patterns of the problem at hand. However, we believe that NOSQL design can also benefit from traditional design approaches. In this paper we present a method to design databases for analytical workloads. Starting from the conceptual model and adopting the classical 3-phase design used for relational databases, we propose a novel design method that considers the new features brought by NOSQL and encompasses relational and co-relational design altogether.
Conference Paper
The need to store and manipulate large volumes of (unstructured) data has led to the development of several NoSQL databases for better scalability. Graph databases are a particular kind of NoSQL database that have proven their efficiency in storing and querying highly interconnected data, and have become a promising solution for multiple applications. While the mapping of conceptual schemas to relational databases is a well-studied field of research, there are only a few solutions that target conceptual modeling for NoSQL databases, and none of them focuses on graph databases. This is especially true when dealing with the mapping of business rules and constraints in the conceptual schema. In this article we describe a possible mapping from UML/OCL conceptual schemas to Blueprints, an abstraction layer on top of a variety of graph databases, and Gremlin, a graph traversal language, via an intermediate Graph metamodel representing data structure. Tool support is fully available.
Conference Paper
We are currently witnessing an important paradigm shift in information system construction, namely the move from object and component technology to model technology. The object technology revolution has allowed the replacement of the over twenty-year-old step-wise procedural decomposition paradigm with the more fashionable object composition paradigm. Surprisingly, this evolution seems to have triggered another even more radical change, the current trend toward model transformation. A concrete example is the Object Management Group's rapid move from its previous Object Management Architecture vision to the latest Model-Driven Architecture. This paper proposes an interpretation of this evolution through abstract investigation. In order to stay as language-independent as possible, we have employed the neutral formalism of Sowa's conceptual graphs to describe the various situations characterizing this organization. This will allow us to identify potential problems in the proposed modeling framework and suggest some possible solutions.
Article
To reduce the impact of requirement changes on software development and to improve its efficiency and portability, this paper, based on the ideas of Model Driven Architecture (MDA), proposes a method that transforms UML class diagrams into HBase models based on meta-models. The method achieves the transformation from Platform Independent Model (PIM) to Platform Specific Model (PSM) at the meta-model level and comprises three phases. In the first phase, the meta-models of the UML class diagram and the HBase database are built. In the second phase, the mapping rules between the two meta-models are proposed. In the last phase, the UML class diagram is built and the HBase database model is generated by transformation. Finally, the paper uses the Atlas language to implement a breakfast serving system to demonstrate the feasibility of MDA in software development.
Article
In this article, we attempt to address the relative absence of empirical studies of model driven engineering (MDE) in two different but complementary ways. First, we present an analysis of a large online survey of MDE deployment and experience that provides some rough quantitative measures of MDE practices in industry. Second, we supplement these figures with qualitative data obtained from some semi-structured, in-depth interviews with MDE practitioners, and, in particular, through describing the practices of four commercial organizations as they adopted a model driven engineering approach to their software development practices. Using in-depth semi-structured interviewing, we invited practitioners to reflect on their experiences and selected four to use as exemplars or case studies. In documenting some details of their attempts to deploy model driven practices, we identify a number of factors, in particular the importance of complex organizational, managerial and social factors–as opposed to simple technical factors–that appear to influence the relative success, or failure, of the endeavor. Three of the case study companies describe genuine success in their use of model driven development, but explain that as examples of organizational change management, the successful deployment of model driven engineering appears to require: a progressive and iterative approach; transparent organizational commitment and motivation; integration with existing organizational processes and a clear business focus.
Conference Paper
With the proliferation of cloud service providers, the use of non-relational (NoSQL) data stores is increasing. In contrast to standard relational database schema design, which has its strong mathematical background in relational algebra and set theory, development with NoSQL data stores is largely based on empirical best practices. Furthermore, the huge variety of NoSQL variants may require different design considerations. In this paper, an algorithm is introduced to automatically derive cost and performance optimal schema in column-oriented data stores based on predefined queries and an initial relational database schema. Algorithms are given to perform database denormalization, as well as to transform the original queries to meet the newly created schemas.
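Query-driven denormalization can be illustrated with a minimal sketch that builds one wide table per predefined query. It is not the algorithm from this paper and includes no cost or performance model: the input shapes (`schema`, `queries`), the `table.column` naming, and the rule "one denormalized table per query, keyed by the query's lookup column" are assumptions for illustration.

```python
def denormalize(schema, queries):
    """Derive one denormalized table per query.

    schema:  relational table name -> list of column names
    queries: query name -> {"tables": joined tables, "key": "table.column" to look up by}
    Returns: query name -> {"row_key": ..., "columns": [...]} with qualified column names.
    """
    result = {}
    for name, query in queries.items():
        columns = []
        for table in query["tables"]:
            for col in schema[table]:
                qualified = f"{table}.{col}"
                if qualified not in columns:
                    columns.append(qualified)
        key = query["key"]
        if key not in columns:
            raise ValueError(f"{name}: key {key} is not a column of the joined tables")
        # The lookup column becomes the row key; everything else is stored alongside it.
        result[name] = {"row_key": key, "columns": [c for c in columns if c != key]}
    return result


schema = {
    "customer": ["id", "name"],
    "orders": ["id", "customer_id", "total"],
}
queries = {
    "orders_by_customer": {"tables": ["customer", "orders"], "key": "customer.id"},
}
layout = denormalize(schema, queries)
```

Each query then reads a single row-key range instead of performing a join; the price is that customer data is repeated for every order, which is the redundancy a cost model would weigh against read performance.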
Article
With the development of distributed systems and cloud computing, more and more applications might be migrated to the cloud to exploit its computing power and scalability, where the first task is data migration. In this paper, we propose a novel approach that transforms a relational database into HBase, which is an open-source distributed database similar to BigTable. Our method comprises two phases. In the first phase, the relational schema is transformed into an HBase schema based on the data model of HBase. We present three guidelines in this phase, which could be further utilized to develop an HBase application. In the second phase, relationships between the two schemas are expressed as a set of nested schema mappings, which would be employed to create a set of queries or programs that transform the source relational data into the target representation automatically.
Conference Paper
In this paper, we attempt to address the relative absence of empirical studies of model driven engineering through describing the practices of three commercial organizations as they adopted a model driven engineering approach to their software development. Using in-depth semi-structured interviewing we invited practitioners to reflect on their experiences and selected three to use as exemplars or case studies. In documenting some details of attempts to deploy model driven practices, we identify some ‘lessons learned’, in particular the importance of complex organizational, managerial and social factors – as opposed to simple technical factors – in the relative success, or failure, of the endeavour. As an example of organizational change management the successful deployment of model driven engineering appears to require: a progressive and iterative approach; transparent organizational commitment and motivation; integration with existing organizational processes and a clear business focus.