Article

Big Data in Health and the Importance of Data Visualization Tools

Authors:
  • International Vision University, Gostivar, North Macedonia
  • International Vision University

Abstract

Big data is growing at a remarkable pace, ranging from personal information to extremely large-volume datasets. Because the human brain perceives visual information faster, data must be processed and presented appropriately. As in every area of life, the volume of data generated in the health sector has increased rapidly, and data storage and security have gained importance with this growth. Big data, data mining, and visualization tools have therefore become increasingly important for processing data and extracting value from it. The visualization of data and the use of analysis tools thus play a significant role in data processing and decision-making in the development of the health sector, and data visualization tools will become increasingly indispensable there. Many software tools have been developed for these purposes. The literature review in this study explains the basic concepts of big data and data visualization and summarizes research carried out in the health sector around the world. In addition to the literature review, analyses using comparison and deduction research methods were also carried out. As a result, predictions and suggestions for future studies in the health sector are presented.
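
Purely as an editorial illustration of the kind of visualization tool the abstract refers to, the following minimal Python sketch (assuming pandas and matplotlib are installed; the admission figures are invented) draws a simple chart of monthly hospital admissions:

```python
# Minimal sketch of a health-data visualization (hypothetical figures,
# assuming pandas and matplotlib are available).
import pandas as pd
import matplotlib.pyplot as plt

admissions = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "patients": [310, 285, 342, 398, 376, 421],  # hypothetical counts
})

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(admissions["month"], admissions["patients"], color="steelblue")
ax.set_xlabel("Month")
ax.set_ylabel("Admitted patients")
ax.set_title("Monthly hospital admissions (example data)")
plt.tight_layout()
plt.show()
```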

Article
Full-text available
The introduction of Big Data Analytics (BDA) in healthcare will allow the use of new technologies both in the treatment of patients and in health management. The paper aims to analyze the possibilities of using Big Data Analytics in healthcare. The research is based on a critical analysis of the literature as well as the presentation of selected results of direct research on the use of Big Data Analytics in medical facilities. The direct research was carried out with a questionnaire administered to a sample of 217 medical facilities in Poland. The literature studies showed that the use of Big Data Analytics can bring many benefits to medical facilities, while the direct research showed that medical facilities in Poland are moving towards data-based healthcare: they use both structured and unstructured data and reach for analytics in the administrative, business and clinical areas. The research confirmed that medical facilities work with both structured and unstructured data; the kinds and sources of data that can be distinguished include data from databases, transaction data, unstructured content of emails and documents, and data from devices and sensors, although the use of data from social media is lower. In their activity, facilities reach for analytics not only in the administrative and business areas but also in the clinical area, which clearly shows that the decisions made in medical facilities are highly data-driven. The results of the study confirm what has been analyzed in the literature: medical facilities are moving towards data-based healthcare, together with its benefits.
Article
Full-text available
Most of the data on the web is unstructured, and it must be transformed into a machine-operable structure. It is therefore appropriate to convert unstructured data into a structured form according to the requirements and to store it in different data models chosen with the use cases in mind. As requirements and their types increase, a single approach cannot serve them all, so it is not suitable to use a single storage technology to meet every storage requirement. Managing stores with various types of schemas in a joint, integrated manner is referred to as 'multistore' and 'polystore' in the database literature. In this paper, the Entity Linking task is leveraged to transform texts into well-formed data, and this data is managed by an integrated environment of different data models. Finally, this integrated big data environment is queried and examined using the presented method.
Article
Full-text available
The orientation of students is a complex task that takes a great deal of time for students and their families, because it depends on several factors such as students' marks and absences. To obtain good results, a large amount of data must be collected, since this data has a great impact on the students' path, and a very powerful tool is needed to analyze it. In this article we therefore classify students into four classes: scientific, literary, technical and original. There are different types of classification; in this paper we focus on classifying students with classification algorithms and big data tools, which allow numerous datasets to be managed and analyzed with the Hadoop Distributed File System (HDFS) and the MapReduce model for parallel processing. We then evaluate the performance of the Neural Network, Naive Bayes and k-nearest-neighbors classification algorithms in terms of classification accuracy and speed-up, which shows the strength of the Naive Bayes algorithm.
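
The study above runs on Hadoop/MapReduce; purely as a hedged, single-machine sketch of the classifier comparison it describes (Naive Bayes, k-nearest-neighbors and a neural network compared by accuracy), a scikit-learn version on synthetic student-like features might look like this:

```python
# Single-machine sketch of the classifier comparison described above
# (scikit-learn on synthetic data; the original work runs on Hadoop/MapReduce).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for student features (marks, absences, ...) and
# four orientation classes: scientific, literary, technical, original.
X, y = make_classification(n_samples=2000, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Naive Bayes": GaussianNB(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "Neural network": MLPClassifier(max_iter=500, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {acc:.3f}")
```
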
Article
Full-text available
This paper surveys big data, highlighting big data analytics in medicine and healthcare. The big data characteristics of value, volume, velocity, variety, veracity and variability are described. Big data analytics in medicine and healthcare covers the integration and analysis of large amounts of complex heterogeneous data such as various omics data (genomics, epigenomics, transcriptomics, proteomics, metabolomics, interactomics, pharmacogenomics, diseasomics), biomedical data and electronic health records data. We underline the challenging issues of big data privacy and security. Regarding big data characteristics, some directions for using suitable and promising open-source distributed data processing software platforms are given.
Article
Full-text available
Managing, processing and understanding big healthcare data is challenging, costly and demanding. Without a robust fundamental theory for representation, analysis and inference, a roadmap for uniformly handling and analyzing such complex data remains elusive. In this article, we outline various big data challenges, opportunities, modeling methods and software techniques for blending complex healthcare data, advanced analytic tools, and distributed scientific computing. Using imaging, genetic and healthcare data, we provide examples of processing heterogeneous datasets using distributed cloud services, automated and semi-automated classification techniques, and open-science protocols. Despite substantial advances, new innovative technologies need to be developed that enhance, scale and optimize the management and processing of large, complex and heterogeneous data. Stakeholder investments in data acquisition, research and development, computational infrastructure and education will be critical to realize the huge potential of big data, to reap the expected information benefits and to build lasting knowledge assets. Multi-faceted proprietary, open-source, and community developments will be essential to enable broad, reliable, sustainable and efficient data-driven discovery and analytics. Big data will affect every sector of the economy, and its hallmark will be 'team science'.
Article
Full-text available
This paper provides an overview of recent developments in big data in the context of biomedical and health informatics. It outlines the key characteristics of big data and how medical and health informatics, translational bioinformatics, sensor informatics and imaging informatics will benefit from an integrated approach of piecing together different aspects of personalized information from a diverse range of data sources, both structured and unstructured, covering genomics, proteomics, metabolomics, as well as imaging, clinical diagnosis, and long-term continuous physiological sensing of an individual. It is expected that recent advances in big data will expand our knowledge for testing new hypotheses about disease management, from diagnosis to prevention to personalized treatment. The rise of big data, however, also raises challenges in terms of privacy, security, data ownership, data stewardship and governance. This paper discusses some of the existing activities and future opportunities related to big data for health, outlining some of the key underlying issues that need to be tackled.
Article
Full-text available
In this article, I review An Introduction to Stata for Health Researchers, Fourth Edition, by Svend Juul and Morten Frydenberg (2014 [Stata Press]).
Article
Full-text available
The main purpose of this study was to explore whether the use of big data can effectively reduce healthcare concerns, such as the selection of appropriate treatment paths, improvement of healthcare systems, and so on. By providing an overview of the current state of big data applications in the healthcare environment, this study has explored the current challenges that governments and healthcare stakeholders are facing as well as the opportunities presented by big data. Insightful consideration of the current state of big data applications could help follower countries or healthcare stakeholders in their plans for deploying big data to resolve healthcare issues. The advantage for such follower countries and healthcare stakeholders is that they can possibly leapfrog the leaders' big data applications by conducting a careful analysis of the leaders' successes and failures and exploiting the expected future opportunities in mobile services. Three main observations emerge. First, all big data projects undertaken by leading countries' governments and healthcare industries have similar overall goals. Second, for medical data that cuts across departmental boundaries, a top-down approach is needed to effectively manage and integrate big data. Third, real-time analysis of in-motion big data should be carried out, while protecting privacy and security.
Conference Paper
Full-text available
Internet of Things (IoT) will comprise billions of devices that can sense, communicate, compute and potentially actuate. Data streams coming from these devices will challenge the traditional approaches to data management and contribute to the emerging paradigm of big data. This paper discusses emerging Internet of Things (IoT) architecture, large scale sensor network applications, federating sensor networks, sensor data and related context capturing techniques, challenges in cloud-based management, storing, archiving and processing of sensor data.
Article
Full-text available
The metafor package provides functions for conducting meta-analyses in R. The package includes functions for fitting the meta-analytic fixed- and random-effects models and allows for the inclusion of moderator variables (study-level covariates) in these models. Meta-regression analyses with continuous and categorical moderators can be conducted in this way. Functions for the Mantel-Haenszel method and Peto's one-step method for meta-analyses of 2 x 2 table data are also available. Finally, the package provides various plot functions (for example, for forest, funnel, and radial plots) and functions for assessing the model fit, for obtaining case diagnostics, and for tests of publication bias.
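
metafor itself is an R package; as a language-neutral illustration of the fixed-effect model it fits, a minimal inverse-variance pooling can be sketched in Python (the effect sizes and variances below are hypothetical):

```python
# Minimal inverse-variance fixed-effect meta-analysis (the model that
# packages such as metafor fit); effect sizes and variances are hypothetical.
import numpy as np

yi = np.array([0.30, 0.12, 0.45, 0.22])   # study effect sizes
vi = np.array([0.02, 0.05, 0.03, 0.04])   # sampling variances

wi = 1.0 / vi                              # inverse-variance weights
pooled = np.sum(wi * yi) / np.sum(wi)      # fixed-effect estimate
se = np.sqrt(1.0 / np.sum(wi))             # standard error of the estimate

print(f"pooled effect = {pooled:.3f}, 95% CI = "
      f"[{pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f}]")
```
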
Article
Full-text available
Abstract not available for book reviews.
Article
The main purpose of Industry 4.0 applications is to provide maximum uptime throughout the production chain, to reduce production costs and to increase productivity. Thanks to Big Data, the Internet of Things (IoT) and Machine Learning (ML), which are among the Industry 4.0 technologies, Predictive Maintenance (PdM) studies have gained momentum. Implementing Predictive Maintenance in industry reduces the number of breakdowns with long maintenance and repair times and minimizes production losses and costs. With machine learning, equipment malfunctions of unknown cause and equipment maintenance needs can be predicted. A large amount of data is needed to train the machine learning algorithm, together with the selection of an analytical method suitable for the problem; the important thing is to extract the valuable signal by cleaning noise from the data during preprocessing. In order to create prediction models with machine learning, it is necessary to collect accurate information and to use data from many different systems. The existence of large amounts of data related to predictive maintenance and the need to monitor this data in real time, together with delays in data collection and network and server problems, are major difficulties in this process. Another important issue concerns the use of artificial intelligence: obtaining training data, dealing with variable environmental conditions, choosing the ML algorithm best suited to a specific scenario, and the need for information sensitive to operational conditions and the production environment are all of great importance for the analysis. This study examines predictive maintenance for a transfer press machine used in the automotive industry, aiming to predict when maintenance will be needed and to warn the relevant people as abnormal situations approach. First, various sensors were placed in the machine to detect malfunctions and the data to be collected from these sensors was determined. Then, machine learning algorithms were built to detect anomalies in the collected data and to model past failures, and an application was carried out in a factory that produces automotive parts.
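
As a hedged sketch of the anomaly-detection step described above (the actual study uses real transfer-press sensor data and its own models), scikit-learn's IsolationForest on simulated sensor readings could look like this:

```python
# Sketch of sensor-based anomaly detection for predictive maintenance
# (IsolationForest on simulated vibration/temperature readings;
# the study above uses real transfer-press sensor data).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=[0.5, 60.0], scale=[0.05, 2.0], size=(500, 2))
faulty = rng.normal(loc=[0.9, 75.0], scale=[0.10, 4.0], size=(10, 2))
readings = np.vstack([normal, faulty])     # columns: vibration, temperature

detector = IsolationForest(contamination=0.02, random_state=0).fit(normal)
flags = detector.predict(readings)         # -1 marks anomalous readings

print(f"flagged {np.sum(flags == -1)} of {len(readings)} readings as anomalies")
```
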
Article
This research develops a MapReduce framework for automatic pattern recognition and fault diagnosis that addresses the data imbalance problem in cloud-based manufacturing (CBM). Fault diagnosis in a CBM system significantly reduces product testing cost and enhances manufacturing quality. One of the major challenges facing big data analytics in CBM is handling data-sets that are highly imbalanced in nature, which leads to poor classification results when machine learning techniques are applied to them. The framework proposed in this research uses a hybrid approach to deal with big data-sets for smarter decisions. Furthermore, we compare the performance of a radial basis function-based Support Vector Machine classifier with standard techniques. Our findings suggest that the most important task in CBM is to predict the effect of data errors on quality due to highly imbalanced unstructured data-sets. The proposed framework is an original contribution to the body of literature: the proposed MapReduce framework has been used for fault detection by managing the data imbalance problem appropriately and relating it to the firm's profit function. The experimental results are validated using a case study of steel plate manufacturing fault diagnosis, with crucial performance metrics such as accuracy, specificity and sensitivity. A comparative study shows that the methods used in the proposed framework outperform the traditional ones.
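
The framework above embeds its classifier in MapReduce; purely as an illustration of the RBF-kernel SVM and the accuracy/sensitivity/specificity metrics it reports, a small scikit-learn sketch on a synthetic imbalanced fault data set might be:

```python
# Sketch of the RBF-SVM classifier and the accuracy/sensitivity/specificity
# metrics mentioned above, on a synthetic imbalanced fault data set
# (scikit-learn; the paper's framework runs inside MapReduce).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

X, y = make_classification(n_samples=3000, n_features=20, weights=[0.95, 0.05],
                           random_state=0)  # ~5% faulty samples
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = SVC(kernel="rbf", class_weight="balanced").fit(X_tr, y_tr)
y_pred = clf.predict(X_te)

tn, fp, fn, tp = confusion_matrix(y_te, y_pred).ravel()
print(f"accuracy    = {accuracy_score(y_te, y_pred):.3f}")
print(f"sensitivity = {tp / (tp + fn):.3f}")   # true-positive rate on faults
print(f"specificity = {tn / (tn + fp):.3f}")   # true-negative rate
```
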
Article
We review literature that uses spatial analytic tools in contexts where Geographic Information Systems (GIS) is the organizing system for health data or where the methods discussed will likely be incorporated in GIS-based analyses in the future. We conclude the review with the point of view that this literature is moving toward the development and use of systems of analysis that integrate the information geo-coding and data base functions of GISystems with the geo-information processing functions of GIScience. The rapidity of this projected development will depend on the perceived needs of the public health community for spatial analysis methods to provide decision support. Recent advances in the analysis of disease maps have been influenced by and benefited from the adoption of new practices for georeferencing health data and new ways of linking such data geographically to potential sources of environmental exposures, the locations of health resources and the geodemographic characteristics of populations. This review focuses on these advances.
Nourani CF. Eco-morphic business digitization analytics. ResearchGate, 2020. Available at https://www.researchgate.net/publication/342106614_Eco-Morphic_Business_Digitization_Analytics/link/5ee24b09a6fdcc73be705823/download.

GraphPad. Advice: When to plot SD versus SEM.

Radi B, El Hani A. Introduction to Matlab. Book chapter in Advanced Numerical Methods with Matlab 2. John Wiley & Sons, 2018.

Oracle. Oracle: Big-data for enterprise. An Oracle White Paper, 2011. Available at https://www.oracle.com/technetwork/database/bi-datawarehousing/wp-big-data-with-oracle-521209.pdf.

Wikipedia. JASP (Jeffreys's Amazing Statistics Program). 2022. Available at https://en.wikipedia.org/wiki/JASP.

Celik S, Akdamar E. Big data and data visualization. Akademik Bakis Uluslararasi Hakemli Sosyal Bilimler Dergisi 2018; 65: 253-264.

Kumar A, Shankar R, Choudray A, Thakur LS. A big data MapReduce framework for fault diagnosis in cloud-based manufacturing. Loughborough University Institutional Repository, 2016.