Figure 2 - uploaded by Hamzeh Khazaei
Visualization of CAP theorem. 

Source publication
Article
Full-text available
With the advent of the Internet of Things (IoT) and cloud computing, the need for data stores that are able to store and process big data in an efficient and cost-effective manner has increased dramatically. Traditional data stores seem to have numerous limitations in addressing such requirements. NoSQL data stores have been designed and impl...

Similar publications

Article
Full-text available
State-of-the-art publish/subscribe systems are efficient when the subscriptions are relatively static (for instance, the set of followers on Twitter) or can fit in memory. However, nowadays many big data and IoT-based applications follow a highly dynamic query paradigm, where both continuous queries and data entries number in the millions and can arr...
Presentation
Full-text available
The world is currently producing a large quantity of data. With the Internet of Things, we have a network of devices capable of collecting, transmitting, and processing data, and with this large quantity of data, storing and processing it is the new challenge. To better understand this paradigm shift, a characterization is needed...

Citations

... Therefore, it is significant to have a study in place that discusses all possible architectural aspects of a data management system in depth. Although multiple qualitative comparisons of NoSQL and NewSQL data management systems have been performed in recent times [83], [94], [80], [95], [93], [86], [87], [85], [101], [97], [6], [100], [3], [75], [96], [18], some well-known systems are not considered, including MariaDB [58], VoltDB [55], NuoDB [54], and MemSQL [57]. Justification regarding the selection of the systems for comparative analysis has also not been provided. ...
... The architectural aspects have also not been analyzed in detail. The research works [94], [80], [95], [93], [86], [87], [85], [101], [97], [6], [100], and [83] have also evaluated NoSQL systems but did not extensively explore the architecture of these systems. ...
Article
Full-text available
With the recent trend towards big data, a number of scalable data management systems (NoSQL and NewSQL) have been developed to manage massive data effectively. The algorithms involved in the architectural design of a data management system define the response time of an application. The behavior and performance of different NoSQL and NewSQL systems vary on the basis of these architectural aspects. Hence, the architectural assessment of a data management system is a vital task in understanding their weaknesses and strengths. Therefore, this paper assesses the architecture of some well-known NoSQL and NewSQL systems in detail. To enhance the clarity of discussion and analysis, we identified and grouped together the logically related architectural features, forming a feature vector (FV). Feature vectors related to transactional properties, fault tolerance, data storage, and data handling are designed and used in the architectural assessment. Various significant features are identified and assigned to a feature vector. Some well-known NoSQL and NewSQL systems are analyzed, compared, and discussed in depth with respect to these feature vectors. The discussion describes the algorithms each system uses to implement a particular architectural feature and analyzes their suitability in various scenarios. Important guidelines are presented that help in filtering the potential data management systems on the basis of application requirements.
... In their paper, Khazaei et al. [6] explore some of the popular NoSQL databases, perform a performance evaluation, and describe the existing literature. The authors describe in detail the characteristics of a NoSQL solution and how NoSQL databases have loosened up on the CAP theorem, resulting in BASE (Basically Available, Soft-state, Eventually consistent) systems. The authors compared various benchmarking tools such as YCSB, PigMix, GRIDMix, and CALDA; in the end, YCSB was chosen due to its flexibility for extension and modification. ...
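The BASE behavior mentioned in this snippet can be illustrated with a minimal last-write-wins replication sketch. The `Replica` class and the data below are hypothetical, not taken from any of the cited systems: replicas accept writes independently (soft state) and converge only after exchanging state (eventual consistency).

```python
class Replica:
    """A toy replica: accepts writes locally, converges via anti-entropy."""

    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Last-write-wins: keep the entry with the newest timestamp.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None


def anti_entropy(a, b):
    """Exchange state between two replicas until both hold the newest value."""
    for key in set(a.store) | set(b.store):
        for src, dst in ((a, b), (b, a)):
            if key in src.store:
                dst.write(key, src.store[key][1], src.store[key][0])


r1, r2 = Replica(), Replica()
r1.write("cart", ["book"], timestamp=1)          # accepted at replica 1
r2.write("cart", ["book", "pen"], timestamp=2)   # concurrent newer write at replica 2
# Before synchronization the replicas disagree: available, but not consistent.
anti_entropy(r1, r2)
assert r1.read("cart") == r2.read("cart") == ["book", "pen"]
```

The trade-off visible here is exactly the BASE relaxation: reads during the window before `anti_entropy` runs may return stale data.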
... [3] Comparing MongoDB and MySQL using manual DML operations; Bjeladinovic et al. [4] proposed a new hybrid database model using RDBMS and NoSQL; Yusuf et al. [5] evaluated multiple NoSQL databases using YCSB; Khazaei et al. [6] compared various benchmarking tools (YCSB, PigMix, GRIDMix, CALDA); Matallah et al. [7] compared MongoDB and HBase using YCSB custom workloads; Aboutorabi et al. [8] evaluated the performance of MongoDB and MySQL for a large eCommerce application ...
Technical Report
Full-text available
Databases are the backbone of any business application, and it is of the utmost importance that the database serving the application stands out with respect to performance, availability, scalability, data integrity, and security. Recently we have seen a wave of new cloud data-serving databases which cater to cloud OLTP (online transaction processing) applications, though they do not support ACID (Atomicity, Consistency, Isolation, Durability) transactions to a very great extent. Examples of such systems are MongoDB, HBase, and Cassandra. They are also called NoSQL (schema-less) systems. On the other hand, we have traditional RDBMS systems, which support ACID transactions and are widely used for a host of application types. It is becoming extremely important to measure the performance of databases with respect to certain parameters and decide which DBMS (NoSQL or RDBMS) is best suited for the business needs. In this report we replicate low- and high-volume application operations in MongoDB and MySQL databases using the Yahoo! Cloud Serving Benchmark (YCSB) tool and analyze the performance differences between the two systems using the quantitative output generated by YCSB. The report describes the experimental setup used to perform the tests and the evaluation of the results.
... It allows fast and reliable access to various important data and helps in making decisions within the company. It enables proper functioning through enhanced decision analysis [1]. To respond to market requirements and technological developments that are continuously increasing, the DW must be integrated into this era of massive data storage, otherwise known as Big Data. ...
Article
Full-text available
As Big Data applications grow, many existing systems expect to expand their services to cover the dramatic increase in data. New software development systems no longer work on a single database but on multiple databases. These distributed data sources go under the name of NoSQL (Not only Structured Query Language) databases. Several companies try to take advantage of these technologies without leaving their traditional systems. In particular, Data Warehouses (DW) are conceived based on users' feedback. To allow and support this integration, a mechanism that takes data from NoSQL databases and stores it in relational databases is needed to provide great added value without impacting an organization's existing systems. This paper proposes an integration algorithm to support a hybrid database architecture, including MongoDB and MySQL, by allowing users to query data from NoSQL systems in relational SQL (Structured Query Language) systems.
... As opposed to relational databases, in which redundancy is frowned upon, column families support de-normalization. Most column stores are linked to analytical frameworks such as MapReduce [25], which enables fast analytics. ...
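As a rough in-memory picture of the de-normalized, column-family layout described above (the table, row key, and family names here are invented for the example), one row key groups several column families, each holding its own columns:

```python
# Simplified column-family layout: row key -> column family -> column -> value.
users = {
    "user:42": {
        "profile": {"name": "Ada", "city": "London"},
        # De-normalized: recent orders are duplicated into the user row,
        # so an analytical scan reads one row instead of joining tables.
        "orders": {"order:1": "book", "order:2": "pen"},
    }
}

def read_family(table, row_key, family):
    """Fetch one column family for a row, as a column store would."""
    return table.get(row_key, {}).get(family, {})

assert read_family(users, "user:42", "orders") == {"order:1": "book", "order:2": "pen"}
```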
... A document in a document-oriented database contains data that are de-normalized, semi-structured, and stored hierarchically in the form of key-value pairs (such as JSON and BSON) [33]. Documents (even of the same type) do not require a uniform structure [25]. Document stores usually support secondary indexes, which assist in full-text search and retrieval. ...
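The schema flexibility described here can be sketched with two documents of the same logical type that do not share a uniform structure; the field names are illustrative, not from any particular document store:

```python
import json

# Two "product" documents with different, nested structures: valid in a
# document store, whereas a relational table would require a fixed schema.
doc_a = {"_id": 1, "name": "laptop", "specs": {"ram_gb": 16}}
doc_b = {"_id": 2, "name": "pen", "color": "blue"}  # no "specs" field at all

collection = [doc_a, doc_b]

# A lookup in the spirit of a secondary index (a full scan here): each
# document is self-describing JSON/BSON-like data, so no schema is consulted.
hits = [d for d in collection if d.get("name") == "pen"]
assert hits == [doc_b]
print(json.dumps(hits))
```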
... In Graph databases, the data are represented as a set of vertices, linked together by edges [32]. These are designed to fill the gap between graph data modeling requirements and the tabular abstraction in traditional databases [25] by explicating the links between vertices. Most graph database providers implement "property graphs" [17,27], in which both nodes and edges may possess properties in the form of key-value pairs and a name, and the edges are binary and directed. ...
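A property graph as described above can be sketched as nodes plus binary, directed, named edges, both carrying key-value properties. The data and the traversal helper below are hypothetical, not tied to any particular graph database:

```python
# Minimal property graph: nodes and directed, named edges,
# both holding key-value properties.
nodes = {
    "alice": {"label": "Person", "age": 30},
    "acme": {"label": "Company"},
}
edges = [
    # (source, edge name, destination, edge properties)
    ("alice", "WORKS_AT", "acme", {"since": 2019}),
]

def neighbours(node, edge_name):
    """Follow outgoing edges of a given name: the basic traversal step."""
    return [dst for src, name, dst, _ in edges if src == node and name == edge_name]

assert neighbours("alice", "WORKS_AT") == ["acme"]
```

Making the links explicit in `edges` is what lets graph stores answer traversal queries without the joins a tabular layout would need.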
Article
Full-text available
Over the last decade, a range of new database solutions and technologies have emerged, in line with the new types of applications and requirements that they facilitate. Consequently, various new methods for designing these new databases have evolved, in order to keep pace with progress in the field. In this paper, we systematically review these methods, with a view to better understanding their suitability for designing new database solutions. The study shows that while research in the field has expanded continuously, a range of factors still require further attention. The study identified important criteria in database design and analyzed existing studies accordingly. This analysis will assist in defining and recommending key areas for future research, guiding the evolution of design methods, their usability and adaptability in real-world scenarios. The study found that current database design methods do not address non-functional requirements; tend to refer to a preselected database; and are lacking in their evaluation.
... In document databases, each document can be formatted differently, and new structured data can be added without changing the existing documents [33]. Data can have a nested structure, and document stores often use internal notations which can be processed directly in applications [34]. Given the diversity of SOOC data structures and users' frequent access to the same data when testing algorithms, key-value and document model databases, which are used in the proposed service architecture, are suitable for storing data of various structures and formats. ...
Article
Full-text available
With the advancement of various technologies, the research and application of space object optical characteristic (SOOC), one of the main characteristics of space objects, are faced with new challenges. Current diverse structures of massive SOOC data cannot be stored and retrieved effectively. Moreover, SOOC processing and application platforms are inconvenient to build and deploy, while researchers’ innovative algorithms cannot be applied effectively, thereby limiting the promotion of the research achievements. To provide a scaffolding platform for users with different needs, this paper proposes SOOCP, a SOOC data and analysis service platform based on microservice architecture. Using the hybrid Structured Query Language (SQL)/NoSQL service, the platform provides efficient data storage and retrieval services for users at different levels. For promoting research achievements and reusing existing online services, the proposed heterogeneous function integration service assists researchers and developers in independently integrating algorithmic modules, functional modules, and existing online services to meet high concurrency requests with a unified interface. To evaluate the platform, three research cases with different requirement levels were considered. The results showed that SOOCP performs well by providing various data and function integration services for different levels of demand.
... This type of log is not very convenient to analyze [4,5]. The easiest way is to compute statistics over a small step of less than 10 minutes. ...
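The small-step aggregation mentioned here can be sketched by bucketing raw request records into 10-minute windows before computing statistics. The `(timestamp, duration)` record format is an assumption for illustration:

```python
from collections import defaultdict

BUCKET = 600  # 10 minutes, in seconds

def aggregate(requests):
    """Group (timestamp_s, duration_ms) records into 10-minute buckets
    and return the mean duration per bucket."""
    buckets = defaultdict(list)
    for ts, duration in requests:
        buckets[ts // BUCKET * BUCKET].append(duration)
    return {start: sum(d) / len(d) for start, d in buckets.items()}

# Three requests: two fall in the first window, one in the next.
stats = aggregate([(10, 200), (500, 400), (700, 1000)])
assert stats == {0: 300.0, 600: 1000.0}
```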
Article
Full-text available
In this paper, existing approaches to the assessment of computer network performance are considered. The standard structure of an application-layer network of the OSI model is treated, using the example of the SBIS3 application (a product of Tensor Company). Further, two approaches that allow the analysis of degradations in a network are considered: one based on aggregated data and one based on operational analysis. The first solution rests on a degradation study of more than 60,000 request types between two versions of an application running on the computer network. Each type of request is described by four basic metrics, each metric representing a time series. The input data are aggregated every 10 minutes before the analysis algorithm runs. Then, threshold criteria based on mathematical expectation and dispersion within two adjacent versions of the software are applied. This approach significantly reduces the time needed to analyze potential problems when the computer network is updated. The second solution is based on non-aggregated input data, consisting of detailed information about all requests within a section of the computer network. Here, a threshold criterion based on durations in the selected queue is used. This type of analysis makes it possible to diagnose errors with problem clients.
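The threshold criterion based on mathematical expectation and dispersion described in this abstract can be sketched as follows; the flagging rule and the factor `k=3` are assumptions for illustration, not the paper's exact criterion:

```python
from statistics import mean, stdev

def degraded(old_samples, new_samples, k=3.0):
    """Flag a request type as degraded if its mean duration under the new
    software version exceeds the old mean by more than k standard deviations."""
    mu, sigma = mean(old_samples), stdev(old_samples)
    return mean(new_samples) > mu + k * sigma

old = [100, 102, 98, 101, 99]        # durations (ms) under the previous version
assert not degraded(old, [101, 100, 99])   # within normal variation
assert degraded(old, [150, 160, 155])      # clear regression
```

Applied per request type, a rule of this shape turns 60,000 time series into a short list of candidates for manual inspection.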
... In addition, this paper also discussed how to select an appropriate NoSQL database from existing databases. The decision-making factors include data analysis, hardware scalability (horizontal scalability and BASE [3,4]), schema flexibility, fast deployment of servers (replication and sharding configuration), distributed technology, etc. Khazaei et al. [9] illustrated the basic concepts of four popular NoSQL database models and evaluated some databases for each model. In that paper, the authors discussed several factors to be considered in order to select an appropriate NoSQL database, such as the data model, access patterns, queries, and non-functional requirements (including data access performance, replication, partitioning, horizontal scalability, BASE [3,4], software development and maintenance, etc.). ...
Article
Full-text available
The popularization of big data means that enterprises need to store more and more data. The data in an enterprise's database must be accessed as fast as possible, but a Relational Database (RDB) has speed limitations due to join operations. Many enterprises have switched to NoSQL databases, which can meet the requirement of fast data access. However, there are hundreds of NoSQL databases. It is important to select a suitable NoSQL database for a given enterprise because this decision will affect the performance of the enterprise's operations. In this paper, fifteen categories of NoSQL databases are introduced to identify the characteristics of every category. Some principles and examples are proposed for choosing an appropriate NoSQL database for different industries.
... Like the CAP Theorem in distributed system design [19], there is a well-known Blockchain Trilemma in designing distributed ledger systems. According to Buterin [20], the founder of Ethereum, a blockchain platform can fundamentally achieve only 2 of the following 3 traits at one time: ...
Conference Paper
Full-text available
In recent years, Distributed Ledger Technology (DLT) has been playing an increasingly important role in building trust and security for the Internet of Things (IoT). However, the unacceptable performance of current mainstream DLT systems such as Bitcoin can hardly meet the efficiency and scalability requirements of IoT. In this paper, we propose a scalable transactive smart-home infrastructure by leveraging a Directed Acyclic Graph (DAG) based DLT and following the separation of concerns (SOC) design principle. Based on the proposed solution, an experiment with 40 Home Nodes is conducted as a proof of concept. From the results, we find that our solution provides high transaction speed and scalability, as well as good performance on security and micropayments, which are important in IoT settings. We then conduct an analysis and discuss how the new system breaks out of the well-known Trilemma, which claims that it is hard for a DLT platform to simultaneously reach decentralization, scalability, and security. Finally, we conclude that the proposed DAG-based distributed ledger is an effective solution for building an IoT infrastructure for smart communities.
... Many technologies emerged in the database field. Various surveys, such as [3,4,6,8,9,12,14], discuss characteristics, capabilities and benefits of various database technologies. These characteristics include technical aspects such as supported query languages, index implementation, availability, consistency, etc. ...
... A path-finding query gets a complexity value of 5. Complexity values 2-4 are assigned to queries involving filtering, joins, and advanced search, respectively, representing the complexity of these operations. We assign these complexity values based on different surveys [2,4,6]; still, it might be that another analysis would result in different values. ...
... Based on the literature on database technology [2,6], in the following we present a list of possible NFRs. Each NFR is associated with a weight that determines its importance. ...
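The weighted NFR list described here lends itself to a simple weighted-sum ranking of candidate databases. The criteria, weights, and per-candidate scores below are invented purely for illustration:

```python
# Hypothetical NFR weights (must sum to 1.0) and candidate scores on a 1-5 scale.
weights = {"availability": 0.5, "consistency": 0.3, "query_power": 0.2}

candidates = {
    "store_a": {"availability": 5, "consistency": 2, "query_power": 3},
    "store_b": {"availability": 3, "consistency": 5, "query_power": 4},
}

def score(scores):
    """Weighted sum of a candidate's per-NFR scores."""
    return sum(weights[c] * scores[c] for c in weights)

ranked = sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)
# store_a: 0.5*5 + 0.3*2 + 0.2*3 = 3.7; store_b: 0.5*3 + 0.3*5 + 0.2*4 = 3.8
assert ranked == ["store_b", "store_a"]
```

A scheme of this shape makes the importance assigned to each NFR explicit and auditable, which is the point of attaching weights in the first place.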