
Big Data with Column Oriented NOSQL Database to Overcome the Drawbacks of Relational Databases


Abstract

In the era of big data, with large numbers of distributed databases on the web and rapid growth in smart systems, database models are evolving quickly. The relational database fails to deal with such large amounts of data and has many limitations, so the need for new technologies has arisen, which is moving DBMS developers towards column-oriented NoSQL databases. The main goal of this paper is to provide a survey of the NoSQL model, especially the column-oriented NoSQL database, showing the user the benefits of using a NoSQL database instead of the relational (row-oriented) database in order to overcome the drawbacks of the relational database model.
Int. J. Advanced Networking and Applications
Volume: 11 Issue: 05 Pages: 4423-4428(2020) ISSN: 0975-0290
Naglaa Saeed Shehata
Faculty of Computers & Artificial Intelligence, Helwan University, Cairo, Egypt
Email: nagla_sd@yahoo.com
Amira Hassan Abed
Egyptian Organization for Standardisation & Quality, Cairo, Egypt
Email: Mirahassan61286@gmail.com
Keywords - Relational Databases, NoSQL, Columnar Database, BASE properties.
Date of Submission: Feb 16, 2020
Date of Acceptance: May 08, 2020
1. INTRODUCTION
The rapid growth of web technologies and cloud applications has changed the nature of stored data, which now includes social media information, transactions, and online purchases. Because of this, and because of the scalability issues of the relational database model, a new and easier approach was needed. To overcome those problems, researchers introduced the NoSQL model [1], which provides new data storage techniques in place of the tabular data store of the relational database.
1.1 Relational Database
The relational database model was presented by Codd in 1970. It is the model that deals with data and organizes it into one or more tables, also called relations, which consist of columns and rows, with a unique key defining each relation [2][3]. This key, the primary key, identifies each row. Rows in this kind of database are also called records or tuples, and columns are called attributes. Each relation represents one entity type, rows represent instances of that entity, and columns represent values attributed to each instance. A link that connects two or more tables is called a relationship. The model has the following characteristics [4]:
- Optional attributes (NULLs)
- Depends on a defined schema
- Uses joins to aggregate related data
- Deals with large data volume and a high rate of reads (scalability)
It also has a number of advantages [5]:
- Based on ACID
- Strong consistency, concurrency, and recovery
- Mathematical background
- Uses the standard query language (SQL)
- Vertical scaling (scaling up)
On the other hand, it has some drawbacks: it cannot easily deal with huge amounts of data or with distributed databases that contain a variety of data types (semi-structured and unstructured).
1.2 Big Data
Big data is a technology that deals with huge amounts of data (terabytes to zettabytes). It can be defined in different ways, but the 3 Vs, namely volume, variety, and velocity, are sufficient to represent its most general characteristics [6]. Fig. 1 shows these three characteristics.
- Volume: refers to the magnitude of the data.
- Variety: refers to data coming from a variety of sources.
- Velocity: refers to streamed data collected in real time [7].
There is great interest in deploying big data technology in the healthcare industry to manage massive, diverse health datasets such as electronic health records and sensor data, which are increasing in magnitude and variety due to the commoditization of electronic devices such as mobile phones and wireless sensors. Modern medical and healthcare systems have to be augmented with new big data computing and analysis capabilities [8].
Fig. 1. The 3 Vs of big data: volume, variety, and velocity
1.3 NOSQL Database Model
NoSQL is a non-relational database management system that neither uses the SQL query language to operate on data nor is based on tabular relations, and it is extremely good at dealing with the large amounts of data involved in big data. The main concept on which NoSQL is based is the distributed storage of data alongside parallel processing [27].
NoSQL is based mainly on horizontal scalability, and there are many different implementations, systems, and techniques for building a NoSQL database system. NoSQL databases differ mainly in the way data is stored and accessed. They can be classified into several types, for example wide-column stores, document stores, and key-value stores, each of which has its own characteristics; these three categories cover most of the techniques involved.
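Horizontal scalability of the kind described above is commonly achieved by spreading records across several nodes. The following is a minimal Python sketch that assumes plain hash partitioning; the function names (`shard_for`, `partition`), the key format, and the node counts are illustrative assumptions and are not taken from any of the surveyed systems, which often use more elaborate schemes.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index using a stable hash.

    hashlib is used instead of the built-in hash(), which is salted
    per process for strings and so is not stable across runs.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def partition(records: dict, num_nodes: int) -> list:
    """Distribute records over num_nodes in-memory 'nodes' (plain dicts)."""
    nodes = [dict() for _ in range(num_nodes)]
    for key, value in records.items():
        nodes[shard_for(key, num_nodes)][key] = value
    return nodes


# Hypothetical data set: 1000 small records keyed by a user id.
records = {f"user:{i}": {"id": i} for i in range(1000)}
three_nodes = partition(records, 3)
four_nodes = partition(records, 4)  # "scaling out" by adding one node
```

Each record lives on exactly one node, so capacity grows by adding nodes rather than by enlarging a single machine. With plain modulo hashing, changing the node count relocates many keys, which is one reason real systems often prefer schemes such as consistent hashing.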
NoSQL is a class of database management systems (DBMS) [1]; the name stands for "Not Only SQL". Its characteristics, according to [4][20], are as follows:
- No fixed schema (schema-less)
- No joins (which are typical in databases operated with SQL)
- Does not use SQL as the query language
- Distributed, partition-tolerant architecture
Characteristics of NoSQL databases [4][22]:
NoSQL has a number of further characteristics: it is typically free and open source, and it does not depend on a schema, which makes it easy to use, and it can be considered the most cluster-friendly option [2].
Researchers also say that it gives users [9] the ability to make frequent changes to the database and offers a good solution to the impedance mismatch. NoSQL also depends on BASE (basically available, soft state, eventual consistency) and CAP (consistency, availability, partition tolerance).
NoSQL database types are classified into four major data models [25], which are shown in Fig. 2:
- Key-value model
- Document model
- Column family model (the scope of this paper)
- Graph model
(Each database model in NoSQL has its own query language.)
Fig. 2. NoSQL models: key-value, document, column family, and graph
1.4 Column-Oriented Database Systems
These systems are called column stores [21] because they store data in columns rather than in rows, as the relational database does; the column is the smallest and lowest instance of data, and it contains a name and a value. A column-oriented NoSQL database stores each table column separately, with the many values that belong to the same column stored contiguously, compressed, and densely packed, unlike traditional database systems, which store whole records (rows) one after another.
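To illustrate why values of the same column stored contiguously compress well, the following minimal Python sketch applies run-length encoding to a single column. Run-length encoding is only one possible scheme and is chosen here as an assumption for illustration; the text above does not name a specific compression method, and the function names and sample values are hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical 'country' column: equal values sit next to each other
# because the whole column is stored contiguously.
country_column = ["EG", "EG", "EG", "US", "US", "DE", "DE", "DE", "DE"]
encoded = rle_encode(country_column)
print(encoded)  # [('EG', 3), ('US', 2), ('DE', 4)]
```

Nine stored values shrink to three pairs here. In a row layout the same country values would be interleaved with product and sales values, so such long runs would rarely form.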
In this category of NoSQL database, columns are defined for each row instead of being predefined by a table structure with uniformly sized columns for every tuple. Such stores introduce a two-level aggregate structure: a key, and a row aggregate defined as a set of columns. This allows any column to be added to any particular row, so rows can have many different columns. In other words, each row holds its own set of stored columns. These stores can also keep data in tables as segments of data columns.
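The two-level structure described above (a row key, then a set of columns that may differ from row to row) can be sketched in a few lines of Python. This is an illustrative in-memory model only: the `ColumnFamily` class, its method names, and the sample data are assumptions made for this example and do not reproduce the API of HBase, Cassandra, or any other product.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy wide-column structure: row key -> {column name -> value}."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        # Any column can be added to any row; there is no fixed schema.
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        return self._rows.get(row_key, {}).get(column, default)

    def columns(self, row_key):
        """Return the column names actually stored for one row."""
        return sorted(self._rows.get(row_key, {}))


users = ColumnFamily()
users.put("user:1", "name", "Mona")
users.put("user:1", "email", "mona@example.com")
users.put("user:2", "name", "Omar")
users.put("user:2", "phone", "555-0100")  # a column that user:1 does not have
```

Here `users.columns("user:1")` and `users.columns("user:2")` return different column sets, and a column that is absent from a row simply is not stored, rather than being held as a NULL.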
2. State of The Art
In the relational (row) database, data is stored as tables containing entities that relate to each other, and it depends on structured data types that fit into the relational tables. However, web-based applications have evolved enormously [27] and now contain different types of data, both semi-structured and unstructured (social media, online purchasing, and any other kind of online activity). Accordingly, there is significant research on solving the scaling problem of relational databases and on connecting with big data through NoSQL databases to overcome the drawbacks of the relational (row) database, with a particular focus on column-oriented database systems. The results of the survey in this field are listed below [11]:
2.1 NOSQL Vs. Relational Database
The researchers in [22] discuss NoSQL database concepts and explain how they differ from the relational database. They also explain why NoSQL databases are needed in the era of big data, present the features of NoSQL databases and the NoSQL model types, and then focus on consistency methods for NoSQL databases. However, the paper does not mention the advantages and disadvantages of NoSQL or relational databases.
[19] presents a comparative study of relational and NoSQL databases that focuses on the processes, features, characteristics, and constraints of each of the two database types. It can be considered a qualitative research paper based on deep and detailed analysis of the research published during the last few years. However, the researchers did not answer an important question, namely which database solution is best, and they did not take into consideration a number of key points such as flexibility, scalability, performance, query language, security, and availability, which are considered strong and important issues.
The researchers in [16] provide a strong comparison between NoSQL and SQL databases that focuses on overhead. Their conclusion is that the overhead is not related to SQL but to other components, which they define as four main components: buffer management, locking, logging, and latching. The researchers examined their work through these four components and concluded that avoiding the overhead associated with one or more of these components can provide a speedup of a factor of two.
[23] presents a survey that focuses on the difference between relational and non-relational databases. It shows the main differences between the two and considers the advantages and disadvantages of each database and the tools used with each type, but the work does not give clear results that could help users decide which database type to choose for their needs.
2.2 NOSQL Models
[29] provides a comparative study of NoSQL databases. The authors explain the main concepts and analyze the architecture of NoSQL databases such as MongoDB, Cassandra, and HBase. They focus on Cassandra as a case study and briefly explain the performance evaluation of the other databases in terms of read and write performance. However, the researchers do not report the results of their analytical study, which should have been presented together as the outcome of the study.
The researchers in [14] compared relational and NoSQL databases, considering the performance of each and some of the technologies used in the research. They found that NoSQL databases perform better than relational databases due to the good facilities offered by non-relational databases. They also compared different types of NoSQL databases by testing the read, write, delete, and instantiate operations, which are considered the main operations.
2.3 Column-Oriented Database Systems
The study in [28] provides extensive solutions to the problem of migrating a relational database to HBase. The researchers use MySQL, a relational database, as the input to the model, and the output is the column-oriented database HBase. They extract the features of objects using semantic enrichment, encompassing inheritance, aggregation, and composition, which "are represented in a New Optimized Data Model (NODM)". The goal of the proposed model is to store data in a column-oriented database through a novel method focusing on the Map structure. The model ignores the details of the limitations of the relational database; the authors focus only on the goals and how to achieve them.
[15] presents a tutorial on column-oriented database systems in which some open research problems are discussed. According to the authors, these include physical database design, indexing techniques, parallel query execution, replication, and load balancing. In conclusion, the authors compare column-store systems available as commercial products on the market.
The authors in [17] provide a fine comparison between column-oriented database systems and XML, explaining the relationships between XML compressors and column stores. They illustrate that a permuting XML compressor called XSAQCT, with a DBMS back end, has essentially the same functionality as a column store (while ignoring things such as SQL joins), and they include a specific kind of compression in their work. They also test the compression ratio achieved with their compressor: experiments performed on an XML corpus showed very good results, which makes their work strong and applicable for use in place of plain XML. They also describe existing XML compressors, showing the inherent similarity between their compression techniques and column stores.
3. Column-Oriented Database Systems
The main purpose of using a columnar database is to reach high performance when reading and writing stored data and to speed up the return of query results. Column-oriented databases have many uses, such as in customer relationship management (CRM) systems, data warehouses, and other query-intensive systems.
3.1 Advantages Of Column-Oriented DBMS [12][21]
- Uses less disk space through self-indexing.
- Highly compressed data.
- Uses less disk space due to compression schemes.
- Does not read unnecessary columns.
- Performs operations faster by avoiding decompression costs.
3.2 NoSQL database (column oriented) vs. relational database (row oriented) [13][24]
Table 1. Column oriented vs. row oriented

| Category | Column oriented | Row oriented |
|---|---|---|
| Description | A database management system (DBMS) that stores data tables by column instead of by row | A relational DBMS that stores data in a two-dimensional table of columns and rows |
| Storage system | A column-oriented database serializes all of the values of a column together, then the values of the next column, and so on | A common method of storing a table is to serialize each row of data |
| Benefits | | Suited for OLTP-like workloads, which are more heavily loaded with interactive transactions |
| Compression | | Not available in row-oriented data, due to the lack of a uniform data type |
Table 2 differentiates between the relational database and the column database [19].
Table 2 shows how data stored in a relational (row) database differs from data stored in a column database. The example assumes that we have three variables, sales, product, and country, whose values we need to store in the database. In the row store, each row contains the values for product, sales, and country, whereas the column store holds all countries serially, then all products, then all sales.
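The difference between the two layouts can be made concrete with a short Python sketch that serializes the same records both ways. The sample values, the field order, and the function names are assumptions made for illustration; they are not the contents of Table 2.

```python
# Hypothetical records with the three variables named in the text.
rows = [
    {"country": "US", "product": "Alpha", "sales": 3000},
    {"country": "US", "product": "Beta", "sales": 1250},
    {"country": "JP", "product": "Alpha", "sales": 700},
    {"country": "UK", "product": "Alpha", "sales": 450},
]
FIELDS = ("country", "product", "sales")


def to_row_store(records):
    """Row layout: all fields of record 1, then all fields of record 2, ..."""
    return [record[field] for record in records for field in FIELDS]


def to_column_store(records):
    """Column layout: every country, then every product, then every sales value."""
    return [record[field] for field in FIELDS for record in records]


row_store = to_row_store(rows)
column_store = to_column_store(rows)
print(row_store[:6])     # ['US', 'Alpha', 3000, 'US', 'Beta', 1250]
print(column_store[:4])  # ['US', 'US', 'JP', 'UK']
```

Both lists hold the same twelve values; only the order differs, and that order determines which values sit next to each other on disk.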
4. Survey Results
After presenting and surveying a number of papers related to our research scope, we can now say that:
- Column-oriented organizations are more efficient when an aggregate needs to be computed over many rows but only for a notably smaller subset of all columns of data, because reading that smaller subset of data can be faster than reading all data.
- Column-oriented organizations are more efficient when new values of a column are supplied for all rows at once, because that column data can be written efficiently and replace the old column data without touching any other columns of the rows.
- Row-oriented organizations are more efficient when many columns of a single row are required at the same time and when the row size is relatively small, because the entire row can be retrieved with a single disk request [18].
- Row-oriented organizations are more efficient when writing a new row if all of the column data is supplied at the same time, because the entire row can be written with a single disk request.
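The first point above can be illustrated with a small Python sketch that counts how many stored values must be touched to sum one column in each layout. This is a deliberately simplified cost model, assumed for illustration: it counts values read from flat lists rather than measuring real disk I/O, and the data and function names are hypothetical.

```python
def sum_from_row_store(flat, num_fields, field_index):
    """Sum one field in a row layout; every whole row has to be read."""
    total, values_read = 0, 0
    for start in range(0, len(flat), num_fields):
        record = flat[start:start + num_fields]  # read the entire row
        values_read += len(record)
        total += record[field_index]
    return total, values_read


def sum_from_column_store(flat, num_rows, field_index):
    """Sum one field in a column layout; only that column's slice is read."""
    start = field_index * num_rows
    column = flat[start:start + num_rows]
    return sum(column), len(column)


# The same three hypothetical records (country, product, sales) in both layouts.
row_flat = ["US", "Alpha", 3000, "US", "Beta", 1250, "JP", "Alpha", 700]
col_flat = ["US", "US", "JP", "Alpha", "Beta", "Alpha", 3000, 1250, 700]

row_total, row_reads = sum_from_row_store(row_flat, num_fields=3, field_index=2)
col_total, col_reads = sum_from_column_store(col_flat, num_rows=3, field_index=2)
print(row_total, row_reads)  # 4950 9
print(col_total, col_reads)  # 4950 3
```

Both layouts give the same total, but the column layout touches only the sales values. Fetching one complete record reverses the picture: it is a single contiguous slice in the row layout and one value from each column in the column layout.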
Table 2. Difference between column data store and row store [10]
As a result of the survey done in the field, we compare the NoSQL database (column oriented) and the relational database (row oriented) from some other perspectives in Table 3:
Table 3. The difference between column oriented and row oriented

| Category | Column oriented | Row oriented |
|---|---|---|
| 1. Transaction reliability | Transaction guarantees range from BASE to ACID | High transaction reliability |
| 2. Data model | Many models | Depends on mathematics |
| 3. Scalability | Horizontal scalability | Vertical scalability |
| 4. Cloud | Suitable for the cloud | Not suitable |
| 5. Big data handling | Used mainly for big data | Has difficulty dealing with big data |
| 6. Data warehouse | Can serve data warehouses | Difficult to manage increased data |
| 7. Complexity | Supports different data types | High complexity; only the table/row form |
| 8. Crash recovery | Recovery achieved through replication | Achieved through log files and the ARIES algorithm |
| 9. Security | Needs many solutions to be secure | Very secure; can also use different tools to be more secure |
Security in the NoSQL database (column oriented) and the relational database (row oriented) is compared in Table 4.
Table 4. Column oriented and row oriented

| Category | Column oriented | Row oriented |
|---|---|---|
| 1. Authentication | Can be achieved using some methods and techniques | Authentication mechanisms are applied in all relational databases |
| 2. Data integrity | Data integrity is not always achieved | Achieved using ACID properties |
| 3. Confidentiality | Difficult to achieve | Achieved using encryption techniques |
| 4. Auditing | Does not provide auditing | Mechanisms to audit using an advanced language are provided |
| 5. Client communication | Client communication security is missing | Provides secure client communication mechanisms using SSL protocols and encryption |
Table 4 gives a brief comparison of security issues between column and row databases as a result of the survey, taking into consideration five main categories: authentication, data integrity, confidentiality, auditing, and client communication.
5. Conclusion
In conclusion, column-oriented systems are used when new data is brought into the data set and that data is unstructured or semi-structured, and when consistency can be relaxed for a while because performance comes first. If we choose a column store, a huge number of user requests can be answered with eventual consistency, unlike the row database, which focuses on strong consistency at the cost of scale and performance. This makes column-oriented systems a good choice in many fields. However, a major problem remains: we still need a distributed DBMS that has four main properties:
- High availability
- Consistency
- Scalability
- Fault tolerance
According to the CAP theorem, there is no way to achieve all of them together, which means the column store is also not the best solution. The future will lie in a combination of SQL and NoSQL, which researchers have named NewSQL, and it can be considered a point for future research.
REFERENCES
[1] Raj, P., & Deka, G. A Deep Dive into NoSQL Databases. San Diego: Elsevier Science & Technology, 2018.
[2] M. State, Relational Database Management Systems Semester - III, Management, no. 9038, pp. 18.
[3] M. State, Relational Database Management Systems Semester - III, Management, no. 9038, pp. 18.
[4] Radoev, M. (2017). A Comparison between Characteristics of NoSQL Databases and Traditional Databases. Computer Science and Information Technology, 5(5), 149-153. doi: 10.13189/csit.2017.050501
[5] M. A. Mohamed and O. G. Altrafi, Relational vs. NoSQL Databases: A Survey, vol. 03, no. 03, pp. 598-601, 2014.
[6] A. Corbellini, C. Mateos, A. Zunino, D. Godoy and S. Schiaffino. Persisting Big Data: The NoSQL landscape. Information Systems. Vol. 63, pp. 1-23. Elsevier Science, 2017.
[7] A. E. Youness Khourdifi, Mohamed Bahaj, A New Approach for Migration of a Relational Database into Column-Oriented NoSQL Database on Hadoop, no. November, 2018.
[8] Amira Hassan, Mona Nasr, Walaa Saber, The Future of Internet of Things for Anomalies Detection using Thermography, International Journal of Advanced Networking and Applications (IJANA), Volume 11 Issue 1, pp. 4142-4149 (2019).
[9] S. Gajendran, A Survey on NoSQL Databases, IBM J. Res. Dev., vol. 5, no. 8, 2014.
[10] HANA's row store vs column store. Internet: http://www.saphanacentral.com/p/rowstore-vs-column-store.html, 2016, accessed on [Jan. 19, 2019].
[11] H. R. Vyawahere, Dr. P. P. Karde & Dr. V. M. Thakare, "NoSQL database". International Journal of Evolutionary Scientific Research and Technology, Conference Paper, October 2017.
[12] N. D. Karande, A Survey Paper on NoSQL Databases: Key-Value Data Stores and Document Stores, vol. 6, no. 2, pp. 153-157, 2018.
[13] K. Dwivedi, A., S. Lamba, C., & Shukla, S. Performance Analysis of Column Oriented Database Vs Row Oriented Database. International Journal of Computer Applications, 50(14), 31-34, 2012.
[14] Y. Li and S. Manoharan, A performance comparison of SQL and NoSQL databases, in IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, Canada, Aug. 2013, pp. 15-19.
[15] D. J. Abadi and S. Harizopoulos, Column-Oriented Database Systems (Tutorial), no. August, 2009.
[16] M. Stonebraker, SQL databases v. NoSQL databases, Commun. ACM Journal, vol. 53, no. 1012, 2010.
[17] T. Corbin, T. Müldner, and J. K. Miziołek, Column-Oriented Database Systems and XML Compression, no. November, 2014.
[18] M. Abdellatif, M. Salah, and N. Saeed, Overcoming business process reengineering obstacles using ontology-based knowledge map methodology, Futur. Comput. Informatics J., vol. 3, no. 1, pp. 728, 2018.
[19] K. Sahatqija, J. Ajdari, X. Zenuni, B. Raufi, and F. Ismaili, Comparison between relational and NOSQL databases, no. May, 2018.
[20] M. Radoev, A Comparison between Characteristics of NoSQL Databases and Traditional Databases, vol. 5, no. 5, pp. 149-153, 2017.
[21] N. C. Ug and J. Steemann, Column-oriented databases, pp. 134, 2012.
[22] H. Vyawahare, NoSql Database, no. June, 2017.
[23] C. Győrödi, R. Győrödi, and R. Sotoc, A Comparative Study of Relational and Non-Relational Database Models in a Web-Based Application, vol. 6, no. 11, pp. 78-83, 2015.
[24] N. Jatana, A Survey and Comparison of Relational and Non-Relational Database, vol. 1, no. 6, pp. 16, 2012.
[25] A. Nayak, Type of NOSQL Databases and its Comparison with Relational Databases, vol. 5, no. 4, pp. 16-19, 2013.
[26] Amira Hassan, Mona Nasr, Diabetes Disease Detection through Data Mining Techniques, International Journal of Advanced Networking and Applications (IJANA), Volume 11 Issue 1, pp. 4142-4149 (2019).
[27] Amira H. A., Recovery and Concurrency Challenging in Big Data and NoSQL Database Systems, International Journal of Advanced Networking and Applications (IJANA), Volume 11 Issue 04, pp. 4321-4329 (2020).
[28] Oliveira, F., Oliveira, A., & Alturas, B. Migration of Relational Databases to NoSQL - Methods of Analysis. Mediterranean Journal of Social Sciences, 9(2), 227-235, 2018.
[29] T. Of and V. Of, Comparative Study of NoSQL Document, Column Store Databases and Evaluation of Cassandra, vol. 6, no. 4, pp. 11-26, 2014.
... AI techniques for automatic oil spill identification can help accomplish objective 14.1 which call for a considerable reduction in maritime pollution of all types [31]. Target 15.3, which asks for halting desertification and repairing degraded land and soil, is another illustration. ...
... Raising awareness of the dangers posed by potential AI system breakdowns is crucial in light of society's growing reliance on this technology. Moreover, even though we were able to locate a large number of papers indicating that AI may be able to facilitate the achievement of several goals of the Sustainable Development and measurements, a large portion of these studies were carried out in controlled laboratory settings using small datasets [28,31]. Therefore, extrapolating this data to assess the effects in reality frequently remains difficult. ...
Article
Full-text available
The increasing prevalence of Artificial Intelligence (AI) across various industries necessitates an assessment of its impact on achieving the Sustainable Development Goals (SDGs). Studies indicate that AI has the potential to support 134 targets across all goals through professional, consensus-based data collection strategies. However, it may also hinder progress toward 59 targets, presenting a complex interplay between benefits and challenges. Key concerns include gaps in safety, transparency, and ethical standards, which arise when regulatory frameworks fail to keep pace with the rapid advancement of AI technologies. These issues highlight the need for robust governance and oversight mechanisms to address potential risks. Additionally, overlooked components in the study, such as social equity, environmental justice, and accessibility, are critical for ensuring AI-based solutions contribute effectively to sustainable growth. This research emphasizes the importance of aligning AI applications with global regulatory and ethical standards to maximize positive outcomes while mitigating adverse effects. By fostering collaboration among policymakers, industry leaders, and researchers, AI can become a transformative tool for achieving SDGs. Future efforts should prioritize addressing regulatory gaps and ensuring that AI-driven innovation remains inclusive, transparent, and aligned with the core principles of sustainability. JCBD is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
... Data science is the process of making meaning of data that might be more predictable, meet consumer needs, differentiate an organization's offering in the market, and increase revenues. [30] The diverse study domains of data science include physiology, computer science, mathematics, statistics, and philosophy. [31].The data science uses mathematics and statistics to enable analysis and prediction using the gathered data. ...
Article
Full-text available
The electronic Human Recourses (eHR) management solutions are frequently employed in large organizations and sectors. For the company, such eHR is extremely competent, congruent, affordable, and committed. These days, eHR is greatly impacted by the Internet of Things (IoT), which provides eHR functions like standards, privacy, and security with a variety of facilities and supports. There are several uses for eHR and IoT together to execute plans, rules, and practices inside the company. The five essential components of an eHR are e-Selection, e-Recruitment, e-Performance, e-Compensation, and e-Learning. The suggested system in this study consists of two components. The first section covered a discussion and detailed explanation of the different eHR tasks using examples. The second section describes and provides examples of analytics of data using IoT for every eHR task. The four parts of the data analytics section are as follows: (a) preparing the data; (b) choosing features; (c) classifying the data; and (d) assessing performance. Four HR analytic datasets gathered at the Kaggle site were used to conduct extensive experimentation for each eHR activity. Ultimately, each dataset was used to perform performance with appropriate reasons for each eHR activity. CART outperformed kNN and SVM classifiers in terms of performance.
... Data science is the process of making meaning of data that might be more predictable, meet consumer needs, differentiate an organization's offering in the market, and increase revenues. [30] The diverse study domains of data science include physiology, computer science, mathematics, statistics, and philosophy. [31].The data science uses mathematics and statistics to enable analysis and prediction using the gathered data. ...
... It is carried after the effective storage of data and each category uses a different mechanism as for the public data the password mechanism is followed and graphical password is used for the confidential data and OTP is used for the sensitive data. [20] The retrieval phase followed some guidelines for data access including [21][22][23][24][25][26]:  Not allowing access to confidential and sensitive data to users on public data.  Allowing access to confidential and public data if the user is granted access on sensitive data. ...
Article
Full-text available
One solution that helps with straightforward, on-demand access to a pool of reconfigurable computing resources is cloud computing. Cloud Computing is an emerging and ubiquitous trend. It allows users to enjoy the on-demand services, without the burden of data storage and maintenance costs. Users of this type of computing platform are very concerned about security, and they need to find reliable providers of cloud services. Authentication is believed to be a main necessity for assuring secure access to cloud. In this paper we discussed the comprehensive and detailed frameworks constructed to assure successful authentication in cloud computing. Also, this survey paper provides a discussion of differences between considered techniques used in different frameworks.
... In recent years, the two of the most active fields in research in science and technology have been DL and analytics for large amounts of data. BD refers to digital data that is hard or impossible to manage and evaluate with conventional tools and technologies [1]. Data analysis and information extraction are essential for the growth of science, national security, healthcare, and workplace decision-making. ...
Article
Full-text available
The vast quantity of data (Big Data) that is being gathered as a result of the latest developments in networks of sensors and Internet of Things technologies is known as big data (BD). Investigating such large volumes of data requires more efficient methods with excellent analytical precision. Conventional techniques significantly limit the ability to process vast volumes of data in real time. With BD's analytics solutions, deep learning (DL) has begun to take center stage during the last few years. In terms of BD analytics, DL provides more scalable, fast, and accurate outcomes. It has shown previously unheard-of success in disciplines such as speech recognition, computer vision, and natural language processing. Due to its ability to extract complex high-level representations and data scenarios—particularly unsupervised data from large volumes of data—DL is an intriguing and practical tool for BD analytics. Despite this interest, there is not any structured review encompassing DL techniques for BD analytics. The purpose of this survey is to review the BD informatics studies conducted using DL techniques. The possible application of big data based DL is examined in a number of studies that provide extremely accurate analytical results.
... Deep learning and big data analytics have emerged as the two most active areas of scientific and technical study in recent years. Digital data that is difficult or impossible to handle and analyze using traditional tools and technologies is referred to as BD [1]. For the purpose of national security, healthcare, scientific advancement, and making informed decisions in the workplace, data analysis and knowledge extraction are critical. ...
Big Data (BD) is the massive amount of data that has been collected as a result of recent developments in sensor networks and IoT technology. More effective techniques with high analytical accuracy are required for the investigation of such vast amounts of data. The ability to analyze large amounts of data in real time is severely limited by standard neural network and artificial intelligence algorithms. In the past several years, DL has started to take center stage in BD analytics solutions. When it comes to BD analytics, DL can produce results that are more accurate, quicker, and more scalable. In domains including natural language processing, speech recognition, and computer vision, it has achieved previously unseen success. DL is an interesting and useful technique for BD analytics because of its capacity to extract complicated high-level representations as well as data scenarios, particularly from large volumes of unsupervised data. To the best of our knowledge, no comprehensive survey covering all DL approaches for BD analytics exists, despite this interest. The current survey's goal is to examine the BD analytics research that has been done with DL methods. Several studies that offer very accurate analytical findings explore the potential use of DL with BD analytics.
... A column-oriented database, commonly called a columnar database, is a way of organizing the data of a relational database in the form of columns rather than the usual form of rows, because columns are the smallest and lowest instance of data [5]. The objective of column orientation is to speed up reading and sorting on storage, because it minimizes random access, reduces disk input/output (I/O), and processes queries using vectorized execution [6,7]. ...
When reports or dashboards are built from operational data, queries often respond slowly, causing the server to become overloaded. This condition often occurs in companies or higher education organizations that manage academic data. It can be improved by optimizing the database server, integrating relational databases with column-oriented databases to speed up query responses and save development costs. In the experiments carried out, the column-oriented database succeeded in optimizing queries, with a significant difference in query execution time, and the server did not crash.
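The row-versus-column distinction described in this entry can be illustrated with a minimal sketch. It is not the cited authors' implementation: the records, the field names (`student_id`, `name`, `credits`), and the helper `to_columns` are hypothetical and exist only to show why an aggregate over one field touches less data in a columnar layout.

```python
# Hypothetical academic records; field names are illustrative assumptions.
rows = [
    {"student_id": 1, "name": "Ana", "credits": 30},
    {"student_id": 2, "name": "Budi", "credits": 45},
    {"student_id": 3, "name": "Citra", "credits": 25},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every full record is visited to read a single field.
total_from_rows = sum(record["credits"] for record in rows)

# Column layout: only the contiguous "credits" column is read.
total_from_columns = sum(columns["credits"])
```

Both totals are identical; the difference is in how much data has to be scanned. On disk, the columnar form lets an engine read one column file and skip the rest, which is the source of the reduced I/O mentioned above.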
... A recent study [18] compares the column-oriented approach and NoSQL databases. First, the authors gave a basic overview of DBMSs and the different data models, reflecting the increasing demand for state-of-the-art tools to meet the growing data volume. ...
One solution that helps with straightforward, anytime, anywhere accessibility to reconfigurable computational capabilities is cloud computing. Users of this computing platform are genuinely concerned about security and need to find dependable providers of cloud services. Authentication is believed to be a main necessity for assuring secure cloud access. In this paper, we discussed the comprehensive and detailed frameworks constructed to assure successful authentication in cloud computing. Also, this survey paper discusses differences between considered techniques used in different frameworks.
Maritime shipping, which plays a significant role in global trade, is subject to various accidents that lead to loss of life and property and to damage to the environment. Shipping 4.0 technologies are scaling up to address this problem by employing real-time data-driven technologies, including cyber-physical systems, advanced tracking and tracing, intelligent systems, and big data analytics. Despite growing attention, there is a general lack of clarity on the level and direction of progress in this field. Accordingly, this study aims to identify critical shipping accident risks, analyze the role of relevant shipping 4.0 technologies in controlling these risks, and consolidate the findings into a conceptual guiding framework directing future developments. To this end, a systematic review is performed that reveals how shipping 4.0 approaches address critical accident risks and the gaps that still exist. Overall, we found that collision is the most frequently mentioned accident, while the technology most frequently used to control accidents is the Automatic Identification System. In contrast, we see an evident lack of cloud computing, Internet of Things, and big data analytics, which play crucial roles in current Industry 4.0 developments.
Abnormal body temperature is a natural and widespread indicator of illness. Infrared thermography (IRT) is a fast, non-invasive, non-contact, and passive alternative to ordinary medical thermometers for monitoring human body temperature. In addition, IRT can map body surface heat remotely. The last five decades have seen steady development in the use of thermal imaging cameras to relate thermal physiology to surface temperature. IRT has been used effectively in the diagnosis and detection of breast cancer, diabetic neuropathy, and peripheral vascular disorders. It has been employed to detect issues related to gynecology, dermatology, the heart, neonatal physiology, and brain imaging. With the advent of modern infrared cameras and data acquisition and processing techniques, it is now possible to obtain real-time, high-resolution thermographic images, which is likely to spur further research in this field. The emerging technology known as the Internet of Things (IoT) has led practitioners, physicians, and researchers to design innovative solutions in different environments, particularly in medicine and healthcare, using smart sensors, computer networks, and a remote server. This paper proposes an IoT-enabled medical system that diagnoses and detects several medical anomalies remotely, simultaneously, and in real time by combining IoT and thermal infrared imaging techniques. It detects and diagnoses any abnormality and alerts the user through IoT remotely and in real time.
Diabetes is a chronic disorder resulting from a breakdown in carbohydrate metabolism, and it has become a serious global health problem. In general, detecting diabetes in its early stages can have a significant impact on the treatment of diabetic patients and help avert its related side effects. Machine learning is an emerging technology that provides valuable prognoses and a deeper understanding of different clusters of diseases such as diabetes. Because there is a lack of effective analysis tools for discovering hidden relationships and trends in data, health information technology has emerged in a short period as a new technology in the health care sector, using Business Intelligence (BI), a data-driven decision support system. In this study, we propose a high-precision diagnostic analysis using the k-means clustering technique. In the first stage, noisy, uncertain, and inconsistent data were detected and removed from the data set during preprocessing, to prepare the data for the clustering model. Then, we apply the k-means technique to a data set of community health diabetes-related indicators to separate diabetic patients from healthy ones with highly accurate and reliable results.
This article presents a new approach, based on the "Object" concept, to migrating a relational MySQL database to a column-oriented HBase NoSQL database. Its purpose is to provide a new model of the migration process, divided into three phases. The first phase obtains a copy of the metadata using the principle of semantic enrichment, in order to extract the different object principles, including aggregation, inheritance, and composition. The second phase concerns the automatic generation of a New Optimised Data Model (NODM) containing all relational database information in a flattened way. The last phase migrates the existing relational database into a column-oriented database in the Hadoop ecosystem. The whole approach proposes a migration solution from a relational database to a NoSQL column-oriented database that exploits the fast extraction of data columns for several types of applications, thus improving analytic query performance, minimizing the disk input/output load, and reducing the amount of data read from disk.
The amount of data to store, organize, and manage in any organization is very high and increases every day, a fact well known to companies such as Facebook, Google, or SAS. With the current growth rate, technologies must adapt to the amount of available data, and a new approach to information processing is required. Big Data technologies are increasingly in focus, which is one reason for the wider spread of NoSQL database models. The purpose of this article is to validate and adapt the existing (and already used) migration methods, in order to understand the most efficient method for migrating a relational database to a NoSQL database. We present the methodology used, the steps followed for the implementation, and the configuration of the environment used during the tests. Results show that in this migration process, the most efficient method is what is referred to as automatic offline migration. However, it requires a longer window of unavailability than online migration, which in turn requires more operating system resources to migrate. Therefore, the most efficient method for migrating a database depends on the application's availability requirements and the computational resources available for it. We hope to make an important contribution in helping to choose a migration method and the metrics that can be collected to better evaluate the performance of a migration.
http://www.hrpub.org/download/20171130/CSIT1-13510351.pdf
Context: Business process reengineering (BPR) is identified as one of the most important solutions for organizational improvement across all performance measures of business processes. However, a high failure rate of 70% is reported for its use. The most important cause of failure is the focus on the process itself, regardless of the surrounding environment and the knowledge of the organization. Other causes stem from the lack of tools to determine the causes of inconsistencies and inefficiencies. Objective: This paper proposes the Process Reengineering Ontology-based knowledge Map Methodology (PROM) to reduce the failure rate, solve BPR problems, and overcome their difficulties. Method: The methodology uses an organizational ontology to show the structure and environment surrounding the organization's processes, uses knowledge maps as an inference mechanism to identify the causes that lead to contradictions and inefficiencies, and uses the Analytical Hierarchy Processing technique to identify and prioritize the business processes to be redesigned. Results: Through the proposed methodology, all organizational processes are completely analyzed. Moreover, the Analytical Hierarchy Processing technique is used to show the highest-priority processes to be reengineered first. It is then easy to discover, through the knowledge map, any errors that occur during reengineering, so that BPR is carried out successfully. Conclusion: Applying the proposed methodology to inventory management shows how process reengineering can be carried out successfully and help the organization achieve its objectives.
The growing popularity of massively accessed Web applications that store and analyze large amounts of data, of which Facebook, Twitter, and Google Search are some prominent examples, has posed new requirements that greatly challenge traditional RDBMSs. In response to this reality, a new way of creating and manipulating data stores, known as NoSQL databases, has arisen. This paper reviews implementations of NoSQL databases in order to provide an understanding of current tools and their uses. First, NoSQL databases are compared with traditional RDBMSs and important concepts are explained. Only databases that can persist data and distribute it across different computing nodes are within the scope of this review. Moreover, NoSQL databases are divided into different types: Key-Value, Wide-Column, Document-oriented, and Graph-oriented. In each case, a comparison of available databases is carried out based on their most important features.
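The four NoSQL types named in this abstract can be sketched with plain Python structures. This is an illustration only and does not model any particular product: the user record, the key format `user:42`, the column family names, and the `FOLLOWS` edge label are all assumptions.

```python
import json

# One hypothetical user record in the four NoSQL shapes.
# All keys and values are illustrative assumptions.

# Key-Value: the store sees only an opaque value behind each key.
key_value = {"user:42": json.dumps({"name": "Ana", "city": "Cairo"})}

# Wide-Column: row key -> column family -> column -> value.
wide_column = {
    "user:42": {
        "profile": {"name": "Ana", "city": "Cairo"},
        "activity": {"last_login": "2020-01-01"},
    }
}

# Document-oriented: a nested, self-describing document.
document = {"_id": 42, "name": "Ana", "address": {"city": "Cairo"}}

# Graph-oriented: nodes plus labeled edges between them.
graph = {
    "nodes": {42: {"name": "Ana"}, 7: {"name": "Budi"}},
    "edges": [(42, "FOLLOWS", 7)],
}

# Reading the same fact from each shape.
city_kv = json.loads(key_value["user:42"])["city"]   # decode the whole value first
city_wc = wide_column["user:42"]["profile"]["city"]  # address by family and column
city_doc = document["address"]["city"]               # address by path in the document
followed = [dst for src, label, dst in graph["edges"]
            if src == 42 and label == "FOLLOWS"]     # traverse outgoing edges
```

The access patterns differ more than the stored data: a key-value store must fetch and decode the full value, a wide-column store can address a single column within a family, a document store navigates a path, and a graph store follows relationships.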