
APPLICATIONS OF BIG DATA IN HEALTHCARE

... Big data applications have become extremely popular in the last two decades, not because of their ability to manage large datasets but because of their capability to extract insights from intricate, heterogeneous, longitudinal, noisy, and voluminous datasets (Khanna et al., 2021). Based on the nature of the work, big data technologies can be divided into two categories: operational big data technology and analytical big data technology (Osadchuk, 2023). ...
Article
A well-known issue for social media sites is the hazy boundary between malicious false news and satire, which is protected speech. In response to the protective measures that lessen the exposure of false material on social media, providers of fake news have started to pose as satire sites in order to escape being delisted. This may confuse readers, as satire can sometimes be mistaken for real news, especially when its context or intent is not clearly understood and it is written in a journalistic format imitating real articles. In this research, we tackle the issue of classifying Arabic satirical articles written in a journalistic format to detect satirical cues that aid in satire classification. To accomplish this, we compiled the first dataset of Arabic satirical articles extracted from real-world satirical news platforms. We then investigated a number of classification models that integrate a variety of feature extraction techniques with machine learning, deep learning, and transformers to detect the provenance of linguistic and semantic cues, including the first use of the ArabGPt model. Our results indicate that BERT is the best-performing model, with an F1-score reaching 95%. We also provide an in-depth lexical analysis of the formation of Arabic satirical articles, which gives insight into their satirical nature in terms of word usage. Finally, we developed a free open-source platform that uses BERT, the best-performing model in our study, to automatically sort satirical and non-satirical articles into their correct classes. In summary, the results show that pretrained models are promising for classifying Arabic satirical articles.
Chapter
Big data in healthcare is a fast-advancing area. With new diseases continually emerging, as with the COVID-19 pandemic, data generation has surged and a heavy burden falls on medical personnel, one that automation and emerging technologies can significantly ease. Combining big data with emerging technologies in healthcare is the need of the hour. In this chapter, we first focus on the collection of big data in healthcare using emerging technologies such as Radio Frequency Identification (RFID), Wireless Sensor Networks (WSN), and the Internet of Things (IoT), along with their applications in the medical field. We then explore the issues and challenges faced during data collection. Next, we present the different data analysis approaches and explore the challenges and issues that arise during data analysis. Finally, we summarize the current research trends in the field.
Article
Big Data is a vast volume of data that is not easy to store or process with conventional approaches within a limited period. Therefore, new architectures, methods, and analyses are needed to manage it and extract value from it. Big Data poses many challenges and problems, and it has distinct properties such as volume, velocity, variety, and veracity. The goal of Big Data is not only to collect, save, and organize huge volumes of data, but also to evaluate, extract, and visualize useful information for further processing. Big Data is a modern technology that has the potential to provide great benefits to businesses and organizations in different fields around the world, and it will become more desirable in the next few years. This work describes the importance of Big Data, the challenges it faces in adapting to the modern era, the characteristics and architecture of Big Data, the technologies used in Big Data, and applications created using Big Data. The paper also explains MapReduce and the Hadoop Distributed File System as two important models of Big Data.
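The abstract above names MapReduce without describing it. As a rough illustration only, the following single-process Python sketch imitates the map, shuffle, and reduce phases on a word-count task. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample records are hypothetical and are not taken from the paper or from the Hadoop API; a real deployment would distribute these phases across a cluster.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run the three phases in sequence, in one process (illustrative only)."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Made-up sample records, used only to exercise the sketch.
    print(word_count(["big data in healthcare", "big data analytics"]))
```

Because each record is mapped independently and each key is reduced independently, both phases can in principle run in parallel, which is the property that lets the model scale to large datasets.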
Article
Breast cancer is one of the most dreaded diseases and a potential cause of death in women. Every year, the death rate due to breast cancer increases drastically. Classification, a data mining technique, is an effective way to categorize data. It is especially useful in the medical field, where diagnosis and analysis are carried out with such techniques. The Wisconsin Breast Cancer dataset is used to compare SVM, Logistic Regression, Naïve Bayes, and Random Forest. The main objective is to determine the efficiency of the algorithms by evaluating how correctly they classify the data, in terms of accuracy and time consumption. Based on the results of the experiments, the Random Forest algorithm shows the highest accuracy (99.76%) with the lowest error rate. The Anaconda Data Science Platform is used to execute all the experiments in a simulated environment.
Article
Breast cancer is the most frequently identified cancer among women and a major reason for the increasing mortality rate among women. Because manual diagnosis of this disease takes long hours and diagnostic systems are scarce, there is a need to develop an automatic diagnosis system for early detection of cancer. Data mining techniques contribute a great deal to the development of such a system. To classify benign and malignant tumors, we used machine learning classification techniques, in which the machine learns from past data and can predict the category of a new input. This paper is a comparative study of models implemented using Logistic Regression, Support Vector Machine (SVM), and K Nearest Neighbor (KNN) on a dataset taken from the UCI repository. The efficiency of each algorithm is measured and compared in terms of accuracy, precision, sensitivity, specificity, and False Positive Rate. These techniques are coded in Python and executed in Spyder, the Scientific Python Development Environment. Our experiments have shown that SVM is the best for predictive analysis, with an accuracy of 92.7%. We infer from our study that SVM is the best-suited algorithm for prediction and that, overall, KNN performed well, second to SVM.
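The five evaluation measures listed in the abstract above all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions in plain Python; it is not the authors' code, the function name `binary_metrics` is hypothetical, and the labels are invented toy data (1 standing for malignant, 0 for benign), not results from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard confusion-matrix metrics for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative

    def ratio(numerator, denominator):
        # Return NaN rather than raising when a class is absent.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),   # also called recall
        "specificity": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),
    }


if __name__ == "__main__":
    # Invented toy labels: 1 = malignant, 0 = benign.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    for name, value in binary_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.2f}")
```

Note that specificity and the False Positive Rate always sum to 1, so reporting both is redundant but makes the cost of false alarms explicit, which matters in a diagnostic setting.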
Article
Like other fields, the healthcare sector has been greatly impacted by big data. A huge volume of healthcare data and other related data is continually being generated from diverse sources. Suitably tapping and analysing these data would open up new avenues and opportunities for healthcare services. In view of that, this paper aims to present a systematic overview of big data and big data analytics applicable to modern-day healthcare. Acknowledging the massive upsurge in healthcare data generation, various 'V's specific to healthcare big data are identified. Different types of data analytics applicable to healthcare are discussed. Along with presenting the technological backbone of healthcare big data and analytics, the advantages and challenges of healthcare big data are explained in detail. A brief report on the present and future market of healthcare big data and analytics is also presented. In addition, several applications and use cases are discussed in sufficient detail.
Article
Recommender systems (RS) have emerged as a major research interest; they aim to help users find items online by providing suggestions that closely match their interests. This paper provides a comprehensive study of RS covering the different recommendation approaches, associated issues, and techniques used for information retrieval. Thanks to their widespread applications, RS have attracted the interest of a significant number of researchers around the globe. The main purpose of this paper is to identify the research trends in RS. More than 1000 research papers, published by ACM, IEEE, Springer, and Elsevier from 2011 to the first quarter of 2017, have been considered. Several interesting findings have come out of this study, which will help current and future RS researchers to assess and set their research roadmap. Furthermore, this paper envisions the future of RS, which may open up new research directions in this domain.
Conference Paper
Nowadays we are facing a flood of data in many areas. Big data come from numerous sources such as human activities, measuring instruments, and the many appliances connected to computers or smartphones. One of the most challenging topics in the next decade will be how the combination of genome and exposome data will help reveal the risks of particular diseases. According to medical scientists, the exposome includes all environmental exposure factors, from chemical and nonchemical agents to socio-behavioral and psychological factors such as stress and diet, as well as endogenous and exogenous factors across the whole lifespan. The growth of mobile and ubiquitous computing technologies is increasing the number of records on the personal health and habits of patients. The Internet of Things (IoT) includes the development of wearable measurement sensors connected via Bluetooth, which can capture and store health-related data intended for patient health records. The exposome is a healthcare and medicine concept that implies an interdisciplinary and integrated approach across many scientific domains, including epidemiology, computing, environmental sciences, toxicology, and social science. We aim to integrate the data collected from various sensors and detectors into the patient health record to provide clinicians with more elements for better disease prognosis, diagnosis, and treatment.
Article
Data mining, the extraction of hidden predictive information from huge databases, is the process of sorting through enormous data sets to recognize patterns and establish relationships in order to solve problems through data analysis. Cancer is one of the primary causes of death worldwide. Timely detection and prevention of cancer play a vital role in decreasing deaths caused by cancer. Identifying genetic and environmental factors is very significant in developing novel methods to identify and prevent cancer. Many researchers use data mining techniques such as clustering, classification, and prediction to find potential cancer patients. This paper focuses on a breast cancer prediction system built on data mining techniques. With the help of this system, people can estimate the likelihood of breast cancer at an early stage.
Conference Paper
Cardiovascular disease is the leading cause of death in developing countries. This study applies a data mining technique, the decision tree algorithm, to coronary heart disease prediction in medical research. Coronary Heart Disease is also known as Coronary Artery Disease (CAD). The decision system analyses heart disease for the patient. In this paper, coronary heart disease is studied using a larger number of input attributes and database records based on patients' clinical data. This paper focuses on the accuracy of heart disease prediction using the decision tree algorithm.
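The abstract above does not say which decision tree variant or splitting criterion was used. As an illustration of the core step shared by such algorithms, the sketch below picks the best threshold on a single numeric attribute by minimizing weighted Gini impurity, the criterion used by CART-style trees. This is an assumption chosen for illustration, not the paper's method; the function names and the toy ages and labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Find the split `value <= threshold` with the lowest weighted Gini.

    Returns (threshold, weighted_gini); threshold is None when no split
    improves on the impurity of the unsplit node.
    """
    n = len(values)
    best = (None, gini(labels))
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


if __name__ == "__main__":
    # Invented toy data: patient age and a 0/1 heart-disease label.
    ages = [35, 42, 50, 58, 63, 70]
    disease = [0, 0, 0, 1, 1, 1]
    print(best_threshold(ages, disease))
```

A full decision tree repeats this search over every attribute, splits on the best one, and recurses on each side until the nodes are pure or a stopping rule applies.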