Figure 1 - uploaded by João Ricardo Lourenço
CAP theorem with databases that “choose” CA, CP and AP

Source publication
Article
For over forty years, relational databases have been the leading model for data storage, retrieval and management. However, due to increasing needs for scalability and performance, alternative systems have emerged, namely NoSQL technology. The rising interest in NoSQL technology, as well as the growth in the number of use case scenarios, over the l...

Context in source publication

Context 1
... data at the same time [42]. Indeed, of Brewer's CAP theorem, most databases choose to be "AP", meaning they provide Availability and Partition-Tolerance. Since Partition-Tolerance is a property that often cannot be traded off, Availability and Consistency are juggled, with most databases sacrificing more consistency than availability [43]. In Fig. 1, an illustration of CAP is ...
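The trade-off described in the excerpt above can be sketched in a few lines of Python. This is a toy model only, not the behaviour of any real database: the `ToyStore` class, its `mode` values, and the `partitioned` flag are hypothetical names introduced for illustration. Under a partition, the "CP" store refuses writes (giving up availability), while the "AP" store accepts them on one replica (giving up consistency).

```python
class ToyStore:
    """Toy two-replica register. All names here are illustrative assumptions."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = [None, None]   # one value per replica
        self.partitioned = False     # True while replicas cannot communicate

    def write(self, replica_id, value):
        """Return True if the write was accepted, False if it was refused."""
        if not self.partitioned:
            # No partition: replicate to both replicas.
            self.values = [value, value]
            return True
        if self.mode == "CP":
            # Stay consistent by refusing the write (unavailable).
            return False
        # AP: stay available, but the replicas now diverge.
        self.values[replica_id] = value
        return True

    def read(self, replica_id):
        return self.values[replica_id]


cp = ToyStore("CP")
cp.write(0, "a")
cp.partitioned = True
cp_accepted = cp.write(0, "b")   # refused during the partition

ap = ToyStore("AP")
ap.write(0, "a")
ap.partitioned = True
ap_accepted = ap.write(0, "b")   # accepted, replica 1 is now stale
```

After this runs, both replicas of `cp` still agree, whereas the two replicas of `ap` return different values until the partition heals and some reconciliation step (not modelled here) is applied.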

Citations

... NoSQL databases like HBase, MongoDB, CouchDB, and Neo4j offer scalability [7], which enables distributed computing. This feature allows huge volumes of data to be stored in a distributed fashion across commodity machines, and hence these types of databases are mainly used for big data and real-time web applications [8], [9]. ...
Article
Traditional database systems such as relational databases store structured data with a predefined schema, but big data arrives in different formats or is collected from diverse sources. Distributed databases such as not only SQL (NoSQL) repositories are often used for big data analytics, but businesses need continual updates because streaming data arrives in real time from stock trading, the online activities of website visitors, and mobile applications. A business should not have to wait for a report to appear before it can assess and analyse the current situation and move on to its next decision. Apache Spark's Structured Streaming offers capabilities for handling streaming data in a batch processing mode with faster responses than MongoDB, a document-based NoSQL database. This study runs similar queries to evaluate the performance of Spark SQL and a NoSQL database, focusing on the advantages of Spark SQL over NoSQL databases in streaming data exploration. The queries are run on streaming data stored in batch mode.
... Relational databases operate based on a principle called ACID (Atomicity, Consistency, Isolation, Durability). These properties made them the most widely used databases (Lourenço, Cabral, Carreiro, Vieira, & Bernardino, 2015; Khan, et al., 2023; Xia, Yu, Butrovich, Pavlo, & Devadas, 2022; Begum & Chitra, 2020; Yashraj & Yashasvi, 2019; Modhiya, 2021). However, the complexity and volume of the data stored and organized reduced the efficiency of both queries and storage. ...
... However, the complexity and volume of the data stored and organized reduced the efficiency of both queries and storage. Non-relational databases were developed to solve some of the problems of relational databases; in particular, they give up the ACID properties in order to be more available and scalable (Lourenço, Cabral, Carreiro, Vieira, & Bernardino, 2015). This scenario presents a trade-off that necessitates an evaluation benchmark for SQL and NoSQL databases. ...
Article
This research analyzes and designs an evaluation benchmark for SQL and NoSQL database systems, specifically tailored for higher institutions in Zamfara State. By benchmarking various database operations, including insert, select, update, delete, and stored procedures, we identify performance differences between the two systems. MongoDB demonstrates superior performance in handling insert, select, update, and delete operations, making it ideal for applications with high transactional demands. Conversely, MySQL excels in executing stored procedures due to its native support, which is crucial for complex procedural logic. Considering that insert, select, update, and delete operations are more common, MongoDB is generally recommended. However, for applications heavily reliant on stored procedures, MySQL remains the preferred choice. Additionally, the complexity of converting an existing MySQL database to MongoDB should be factored into the decision-making process. This study provides valuable insights and recommendations to help institutions in Zamfara State choose the most suitable database system based on their specific operational needs and performance requirements.
... The designer of an information system must decide which database system to choose. They can choose between RDBMSs based on the ACID properties (Atomicity, Consistency, Isolation, Durability) and NoSQL systems, which offer a different approach to data access and storage (Lourenço et al. (2015)). ...
... The characteristics of databases were described by Lourenço et al. (2015), who sought in their work to answer the question: "Is there currently enough knowledge about quality attributes in NoSQL systems to support the software engineer's decision-making process?" The authors evaluated popular NoSQL database systems and focused on selected quality attributes: ...
... Each metric in this vector represents a different aspect or property of the database, and the whole vector provides a comprehensive picture of the characteristics of a given database. An example characteristic vector used in the method: ref_db_metrics = [25,20,15,10,5,30,25,20,15,10] In this case, ref_db_metrics is an example characteristic vector for a reference database, or a vector of non-functional requirements. Each number in the vector corresponds to the value of a given metric. ...
Preprint
The article focuses on the importance of architectural decisions regarding database selection, emphasizing their impact on achieving business goals and on the quality of the designed information system. It systematizes the terminology related to system architecture, points out the key influence of stakeholders and of architects' experience, and identifies standards related to architecture design and quality. The author concentrates on developing a universal method that uses quality characteristics together with their prioritization and an algorithm based on the kNN (k-nearest neighbors) method, enabling optimal database selection while taking into account the stated constraints and non-functional requirements. The results of the study are discussed, and further directions are indicated for applying the method to other components of information systems. Keywords: system architecture, databases, quality, architectural decisions.
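A nearest-neighbour selection over characteristic vectors, as outlined in this abstract and the excerpt before it, might look like the sketch below. Only `ref_db_metrics` comes from the quoted excerpt. The candidate names and vectors are made up for illustration, and the use of unweighted Euclidean distance is an assumption: the source mentions prioritization of characteristics but does not specify the distance measure.

```python
import math

# Reference vector quoted in the excerpt (non-functional requirements).
ref_db_metrics = [25, 20, 15, 10, 5, 30, 25, 20, 15, 10]

# Hypothetical candidate databases with invented metric vectors.
candidates = {
    "db_a": [24, 21, 15, 9, 6, 29, 25, 19, 15, 11],
    "db_b": [10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
    "db_c": [25, 20, 15, 10, 5, 30, 25, 20, 15, 40],
}


def k_nearest(reference, vectors, k=1):
    """Return the names of the k candidates closest to the reference vector.

    Distance is plain Euclidean distance (an assumption, see lead-in).
    """
    ranked = sorted(vectors, key=lambda name: math.dist(reference, vectors[name]))
    return ranked[:k]


best_two = k_nearest(ref_db_metrics, candidates, k=2)
```

With these invented vectors, `db_a` differs from the reference by at most 1 in each metric, so it ranks first. A weighted distance would be one way to reflect the prioritization the abstract mentions.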
... In their survey, Nayak et al. (Nayak et al., 2013) analyse the different types and characteristics of SQL and NoSQL systems, while in (Mohamed et al., 2014;Sahatqija et al., 2018), SQL/NoSQL data stores are compared in terms of main features, such as scalability, query language, security issues, etc. Along the same lines, (Jatana et al., 2012) provides a general comparison of relational and non-relational data stores, while (Lourenco et al., 2015) reviews NoSQL data stores in terms of the consistency and durability of the data stored, as well as with respect to their performance and scalability; the results indicate that MongoDB can be the successor of SQL databases, since it provides good stability and consistency of data. ...
Conference Paper
The amount of textual data produced nowadays is constantly increasing, as the number and variety of both new and reproduced textual information created by humans and (lately) also by bots is unprecedented. Storing, handling and querying such high volumes of textual data have become more challenging than ever, and both research and industry have been using various alternatives, ranging from typical Relational Database Management Systems to specialised text engines and NoSQL databases, in an effort to cope with the volume. However, all these decisions are largely based on experience or personal preference for one system over another, since there is no performance study that compares the available solutions regarding full-text search and retrieval. In this work, we fill this gap in the literature by systematically comparing four popular databases in full-text search scenarios and reporting their performance across different datasets, full-text search operators and parameters. To the best of our knowledge, our study is the first to go beyond the comparison of characteristics, like expressiveness of the query language or popularity, and actually compare popular relational, NoSQL, and textual data stores in terms of retrieval efficiency for full-text search. Moreover, our findings quantify the differences in full-text search performance between the examined solutions and reveal both anticipated and less anticipated results.
... Traditional databases cannot satisfy the storage and query requirements of these unstructured data. The literature mainly holds that the NoSQL database was proposed at the end of the twentieth century, but at that time this kind of database could not support the traditional standard SQL interface (Lourenço et al. 2015). It was not until the early 2010s that this database began to support a standard SQL interface and was called a distributed database. ...
Article
With the continuous development of science and technology, we have fully entered the information age: people's entertainment options are increasingly abundant, and the Internet has brought a great deal of convenience. The Internet age also means that more and more kinds of data are appearing. Traditional relational databases can no longer store these data, nor query them. With the continuous development of voice technology, database technology based on NoSQL has become a research hotspot. NoSQL databases offer very high availability and scalability, and are very efficient when processing data. Based on the development of 5G networks and big data technology, this research proposes a brand-new business architecture that can use the network to store massive amounts of data. We also describe the business scenario, providing a new idea for more intelligent education services. We use this model for teaching in schools, transmitting spoken language resources to the school over the Internet for students to use in their learning. The application range of AI and intelligent technology is becoming ever wider, and these technologies have also been applied in education. We can apply this new technology in teaching to promote the development of teaching and improve students' enthusiasm and learning outcomes.
... We suggest a scale based on Lourenço et al. [71] to interpret the quality scores. The scale subdivides the quality score into five groups (i.e. ...
Article
Trustworthy data in the Industrial Internet of Things are paramount to ensure correct strategic decision-making and accurate actions on the shop floor. However, the enormous amount of industrial data generated by a variety of sources (e.g. machines and sensors) is often of poor quality (e.g. unreliable sensor readings). Research suggests that certain characteristics of data sources (e.g. battery-powered power supply and wireless communication) contribute to this poor data quality. Nonetheless, to date, much of the research on data trustworthiness has only focused on data values to determine trustworthiness. Consequently, we propose to pay more attention to the characteristics of data sources in the context of data trustworthiness. Thus, this article presents an approach for assessing Industrial Internet of Things data sources to determine their data trustworthiness. The approach is based on a meta-model decomposing data sources into data stores (e.g. databases) and providers (e.g. sensors). Furthermore, the approach provides a quality model comprising quality-related characteristics of data stores to determine their data trustworthiness. Moreover, a catalogue containing properties of data providers is presented to infer the trustworthiness of their provided data. An industrial case study revealed a moderate correlation between the data source assessments of the proposed approach and experts.
... Comparison between SQL and NoSQL databases [12] ...
Article
Data has always been a company's most valuable resource because it can be used for analysis, decision-making, and judgement. Handling difficult data necessitates the use of complicated cache and accessibility concepts. This study examines the effectiveness of SQL and NoSQL database systems for producing scientific data. SQL and NoSQL databases are the most popular types of database solutions. Another name for the SQL database is RDBMS (Relational Database Management System); its data is organized in relations, or tables. A NoSQL database is a non-relational database management system. NoSQL databases, a newer type of database system, were created to address this issue by providing an unstructured platform and scalability for large data applications. The term "NoSQL" stands for "Not Only SQL". Wide-column stores, document stores, graph databases, and key-value stores are a few NoSQL database types that do not impose a fixed standard structure. Additionally, unlike an RDBMS, a NoSQL database can scale horizontally rather than vertically. For the comparison between SQL and NoSQL databases, the data is organized in unstructured tables or relationships. Both of them are open source. The experiment assessed database loading, response, and retrieval times for both SQL and NoSQL databases to determine which database is smoother, more efficient, and more performant.
... NoSQL stands for "Not Only SQL," which indicates that the schema is flexible rather than restricted like an RDBMS's schema. Milestone developments in NoSQL were triggered by Google's Bigtable [69] and Amazon's DynamoDB [70], [71]. NoSQL data stores provide capabilities such as storing large amounts of big data in different formats under a flexible and distributed schema, achieving horizontal scalability on commodity hardware, distributing copies of data across machines to increase availability and performance, and eliminating a single point of failure [72]. ...
Article
Data is one of the most valuable assets in the digital era because it may conceal hidden valuable insights. Diverse organizations in diverse domains overcome the challenges of the big data value chain by employing a wide range of technologies to meet their needs and achieve a variety of goals to support their decision-making. Due to the significance of data-oriented technologies, this paper presents a model of the big data value chain based on technologies used in the acquisition, storage, and analysis of data. The following are the paper’s contributions: First, a model of the big data value chain is developed to illustrate a comprehensive representation of the big data value chain that depicts the relationships between the characteristics of big data and the technologies associated with each category. Second, in contrast to previous research, this paper presents an overview of technologies for each category of the big data value chain. The third contribution of this paper is to assist researchers and developers of data-intensive systems in selecting the appropriate technology for their specific application development use cases by providing examples of applications and use cases from prominent papers in a variety of fields and by describing the capabilities and stages of the technologies being presented so that the right technology is used at the right time in the big data collection, processing, storage, and analytics tasks.
... The benefits of using ONDMs include simplifying porting of an application to other NoSQL data stores and database interoperability as well as polyglot persistence [9]. There are ONDMs called Multi Data Store Mappers supporting multiple NoSQL data stores and ONDMs called Single Data Store Mappers supporting only a particular system [10]. ...
... This RDBMS was file-based and lacked a SQL interface. NoSQL, often known as non-relational databases, was first introduced in 2009 by Eric Evans [4,5]. It is recognized as a promising database to handle massive data. ...