MapReduce - Science topic

Explore the latest publications in MapReduce, and find MapReduce experts.
Publications related to MapReduce (10,000)
Sorted by most recent
Article
Full-text available
Hadoop is an open-source framework that enables the parallel processing of large data sets across a cluster of machines. It faces several challenges that can lead to poor performance, such as I/O operations, network data transmission, and high data access time. In recent years, researchers have explored prefetching tech...
Article
Full-text available
Big data classification involves the systematic sorting and analysis of extensive datasets that are aggregated from a variety of sources. These datasets may include but are not limited to, electronic records, digital imaging, genetic information sequences, transactional data, research outputs, and data streams from wearable technologies and connect...
Article
Full-text available
Distributed computing frameworks play a crucial role in supporting compute-intensive applications in the era of big data. The growing demand for computing resources has spurred the interconnection of data centers, leading to the formation of supercomputing Internet. MapReduce is a popular distributed computing framework designed for large independe...
Article
Full-text available
Abstract: The data growth rate is rapidly increasing with the emergence of new concepts and techniques such as Cloud Computing, Big Data, IoT and Mobile Cloud Computing. This generates the problem of Information Overload, and obtaining information becomes difficult for users. Simultaneously, the problems of parallel computing and big data storage have...
Article
Full-text available
With the rapid increase in the amount of big data, traditional software tools are facing complexity in tackling big data, which is a huge concern in the research industry. In addition, the management and processing of big data have become more difficult, thus increasing security threats. Various fields encountered issues in fully making use of thes...
Article
Full-text available
The current era has witnessed a remarkable transformation in scientific frontiers, largely driven by advancements in the digital domain. This has resulted in an unprecedented explosion of data known as big data. Among the platforms capable of effectively handling massive data volumes cost-effectively, MapReduce stands out. While previous research h...
Article
Full-text available
This paper systematically analyzes Apache Hadoop's technological evolution, tracing its transformation from a web crawling subsystem to a comprehensive enterprise computing platform. Beginning with its origins in Google's foundational papers on the Google File System (GFS) and MapReduce, we examine the critical architectural decisions and technical...
Article
Full-text available
The construction of a perfect employment guidance system is an inevitable choice to adapt to the current employment situation in colleges and universities, and the Internet and big data technology provide important support and opportunity for this. In this paper, the C4.5 algorithm is used to calculate the information entropy and information gain r...
Article
Full-text available
Coded distributed computing (CDC) is a powerful approach to reduce the communication overhead in distributed computing frameworks by utilizing coding techniques. In this paper, we focus on the CDC problem in (H,L)-combination networks, where H APs act as intermediate pivots and K=HL workers are connected to different subsets of L APs. Each worker p...
Thesis
Full-text available
A proposed secure cloud computing system has been built using Linux OS and the Hadoop package. The Hadoop package was utilized to build the area for saving and managing users’ data and to enhance its security. Hadoop consists of one master (NameNode) and a number of slaves (DataNodes). The master nodes oversee the two key functional pieces that make...
Article
Full-text available
Combining neural network technologies and computational techniques, this research establishes a career development promotion system based on a multi-modal neural network. It reveals that computer simulation technology and multimedia have positive intervention effects on college students’ career decision-making behaviors, similar to how biomolecular...
Poster
Full-text available
The Evolution of Apache Hadoop: A Technical Journey from Web Crawling to Enterprise Computing This paper systematically analyzes Apache Hadoop's technological evolution, tracing its transformation from a web crawling subsystem to a comprehensive enterprise computing platform. Beginning with its origins in Google's foundational papers on the Google...
Preprint
Full-text available
There are many established sorting algorithms such as insertion sort, bubble sort, merge sort, and quick sort. In this paper, we explore new sorting algorithms by exploiting the geometry of sort operations and related visualizations. We propose a class of Laghu-Guru Sort algorithms with many variants. We explore newer approaches to analysis of sort...
Preprint
Full-text available
The exponential rise in data generation has led to vast, heterogeneous datasets crucial for predictive analytics and decision-making. Ensuring data quality and semantic integrity remains a challenge. This paper presents a brain-inspired distributed cognitive framework that integrates deep learning with Hopfield networks to identify and link semanti...
Method
Full-text available
An educational project utilizing the Hadoop ecosystem (MapReduce, Hive, Pig, Spark & Scala) to analyze correlations between meteorological factors and fire incidents.
Thesis
Full-text available
Hadoop is an open-source version of the MapReduce Framework for distributed processing. A Hadoop cluster possesses the capacity to manage substantial volumes of data. Hadoop utilizes the Hadoop Distributed File System, also known as HDFS, to manage large amounts of data. The client will transfer data to the DataNodes by retrieving block information...
Article
Full-text available
The rapid expansion of digital communication has resulted in an overwhelming volume of email data that organizations must store, manage, and analyze. Email archives contain valuable information that can be leveraged for compliance, security monitoring, business intelligence, and legal discovery. However, traditional relational database management s...
Preprint
Full-text available
Large scale genome sequencing projects have produced huge datasets that pose challenges of high processing times especially for variant calling, a significant downstream analysis step. Efficient utilization of computational resources for accurate variant prediction in a timely manner is possible using Hadoop MapReduce framework. We have developed V...
Article
Full-text available
This research introduces a novel recommender system for adapting single-machine problems to distributed systems within the MapReduce (MR) framework, integrating knowledge and text-based approaches. Categorizing common problems by five MR categories, the study develops and tests a tutorial with promising results. Expanding the dataset, machine learn...
Article
Full-text available
Timely graduation is an important indicator of the quality of higher education. Yet, many students struggle to complete their studies on time due to challenges in finding relevant research topics and suitable supervisors. This study developed a two-way supervisor recommendation system that considers the preferences and expertise of both students an...
Article
Full-text available
The Hadoop/MapReduce framework has been widely utilized for processing big data. To overcome the limitations of existing work and meet the growing requirements of querying big data, this paper introduces novel join operations, called family joins, for HBase tables using their column families as join keys. Family joins possess the closure property t...
Article
Full-text available
Big data in healthcare defines a massive quantity of healthcare data accumulated from massive sources like electronic health records (EHR), medical imaging, genomic sequence, pharmacological research, wearable, and medical gadgets, etc. One of the data mining approaches commonly employed to classify big data is the MapReduce model. Data clustering,...
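Data clustering under the MapReduce model, as mentioned in the abstract above, is commonly structured as a map step (each point independently assigned to its nearest centroid) and a reduce step (per-cluster sums combined into new centroids). A minimal single-machine sketch of that split — the function names are ours and not taken from the cited work:

```python
def assign(points, centroids):
    # Map step: each point is assigned to its nearest centroid independently,
    # so this loop could be distributed across mappers with no coordination.
    def dist2(p, c):
        return sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    return [min(range(len(centroids)), key=lambda j: dist2(p, centroids[j]))
            for p in points]

def update(points, labels, k):
    # Reduce step: per-cluster sums and counts are additive, so partial
    # results from many workers can be merged before the final division.
    dims = len(points[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for p, j in zip(points, labels):
        counts[j] += 1
        for d in range(dims):
            sums[j][d] += p[d]
    return [[s / counts[j] for s in sums[j]] for j in range(k)]

points = [[0.0], [1.0], [10.0], [11.0]]
labels = assign(points, centroids=[[0.0], [10.0]])
new_centroids = update(points, labels, k=2)
```

One full k-means pass is simply repeated rounds of this map/reduce pair until the centroids stop moving.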
Preprint
Full-text available
The rapid evolution of computing paradigms has transformed technological innovation dramatically. This paper dwells on the changing tide from centralized mainframe computing to decentralized versions such as grid, cluster, and edge computing, focusing particularly on the elaboration of the cloud computing paradigm. Cloud computing-repr...
Preprint
Full-text available
This paper presents a conceptual framework, SocialMapReduce, that reimagines distributed computing through the lens of human social dynamics. We propose that by integrating cooperative and competitive social behaviors into multi-agent AI systems, we can create more adaptive and efficient distributed architectures. The framework challenges the tradi...
Thesis
Full-text available
The Multi-Label Classification problem is widely encountered in numerous real-world applications, attracting considerable attention from the Machine Learning and Data Mining communities over the past decades. This classification paradigm allows multiple labels to be assigned to an instance simultaneously. Although extensive research and experimenta...
Preprint
Full-text available
This paper introduces Nighttime Mapping and Daytime Reducing (NMDR), a novel and scalable MapReduce-inspired framework for multi-agent AI systems, drawing parallels to human social structures. NMDR organizes agents into family units that operate in two distinct phases: nighttime mapping, where agents within a family share resource, consolidate know...
Article
Full-text available
MapReduce has emerged as a cornerstone technology in the big data ecosystem, fundamentally transforming how organizations process and analyze massive datasets. This article provides a detailed examination of MapReduce's architecture, exploring its evolution from Google's original implementation to its current role in modern distributed computing sy...
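As a minimal illustration of the programming model examined in the abstract above, here is an in-process word-count sketch in Python; the three phases mirror a MapReduce job, but the function names are ours, not from any cited implementation:

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every input split.
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group intermediate pairs by key, as the framework would
    # between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped.items()

def reduce_phase(grouped):
    # Reduce: sum the counts emitted for each word.
    return {key: sum(values) for key, values in grouped}

docs = ["the map emits pairs", "the reduce sums pairs"]
counts = reduce_phase(shuffle(map_phase(docs)))
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; the data flow is otherwise the same.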
Preprint
Full-text available
Democratic Autoregression (DA) proposes a biologically inspired framework for stabilizing autoregressive models by decoupling the training of "map" and "reduce" recurrences. In this approach, each local module (analogous to a cortical column) refines its predictions ("map") independently, while a global mechanism ("reduce") integrates these local f...
Preprint
Full-text available
Histograms provide a powerful means of summarizing large data sets by representing their distribution in a compact, binned form. The HistogramTools R package enhances R's built-in histogram functionality, offering advanced methods for manipulating and analyzing histograms, especially in large-scale data environments. Key features include the abilit...
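A key property behind histogram tooling in large-scale environments like the one described above is that histograms over identical bins merge by elementwise addition of counts, which makes them easy to reduce across workers. The cited package is in R; this is a small illustrative Python analogue with hypothetical function names:

```python
def make_histogram(values, breaks):
    # Bin values into len(breaks) - 1 bins defined by sorted break points.
    counts = [0] * (len(breaks) - 1)
    for v in values:
        for i in range(len(breaks) - 1):
            if breaks[i] <= v < breaks[i + 1]:
                counts[i] += 1
                break
    return counts

def merge_histograms(a, b):
    # Histograms over identical breaks merge by elementwise addition, so
    # partial histograms from many workers can be combined in a reduce step.
    return [x + y for x, y in zip(a, b)]

breaks = [0, 10, 20, 30]
h1 = make_histogram([1, 5, 12], breaks)
h2 = make_histogram([15, 25], breaks)
merged = merge_histograms(h1, h2)
```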
Preprint
Full-text available
The neocortex processes information through two key MapReduce recurrences: 1. The "reduce" recurrence, which aggregates information across cortical columns and subcortical circuits, and 2. The "map" recurrence, which integrates information within the layers of a cortical column. These dual recurrences form the foundation of a novel framework for de...
Preprint
Full-text available
MapReduce principles are integral to both Transformer-based architectures and neocortical processing. In Transformers, the map operation occurs during the self-attention phase, where each token independently computes attention scores with all other tokens by producing key (K), query (Q), and value (V) vectors. This operation is inherently parallel...
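The "map" character of self-attention described above — each token scoring every key independently of the other tokens — can be sketched in a few lines of plain Python. This is a simplified illustration (single head, no value projection, names ours):

```python
import math

def attention_scores(q, keys):
    # One token's attention: dot its query against every key, then softmax.
    # The scores depend only on this token's own query, so computing them
    # for all tokens is an embarrassingly parallel "map" over the sequence.
    logits = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

keys = [[1.0, 0.0], [0.0, 1.0]]
queries = [[1.0, 0.0], [0.0, 1.0]]
# The map step: score every token independently of the others.
scores = [attention_scores(q, keys) for q in queries]
```

The subsequent weighted sum over value vectors then plays the role of the "reduce" that aggregates across tokens.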
Article
Full-text available
Numerous algorithms have been proposed to infer the underlying structure of the social networks via observed information propagation. The previously proposed algorithms concentrate on inferring accurate links and neglect preserving the essential topological properties of the underlying social networks. In this paper, we propose a novel method calle...
Article
Full-text available
In an ever-changing financial market, big data is set to revolutionize user interest management by sparking innovation and reshaping recommendations for the future. Conventional financial services face significant challenges like accessibility, personalization, limited reachability, and incomplete information about user interest patterns. Thus, it...
Article
Full-text available
The aim of this paper is to provide an in-depth analysis of parallel analysis strategies for MapReduce models, and to explore how to improve the overall performance by optimising task allocation and scheduling, improving data locality and increasing node utilisation. The research methodology includes an analysis and overview of existing MapReduce f...
Conference Paper
Full-text available
We propose a novel architecture to explore the mutual benefits of optical rackless data center (ORDC) and in-network computing for accelerating collective communications. It reduces job completion time of MapReduce clusters by 27.4% to 43.3% over traditional ORDC in experiments.
Preprint
Full-text available
While holding great promise for improving and facilitating healthcare, large language models (LLMs) struggle to produce up-to-date responses on evolving topics due to outdated knowledge or hallucination. Retrieval-augmented generation (RAG) is a pivotal innovation that improves the accuracy and relevance of LLM responses by integrating LLMs with a...
Article
Full-text available
Traditional approaches to data mining are generally designed for small, centralized, and static datasets. However, when a dataset grows at an enormous rate, the algorithms become infeasible in terms of huge consumption of computational and I/O resources. Frequent itemset mining (FIM) is one of the key algorithms in data mining and finds application...
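The counting core of frequent itemset mining distributes naturally over MapReduce because per-transaction itemset emission is independent and support counts are additive. A toy sketch of one counting pass for 2-itemsets — our own illustration, not the algorithm from the cited paper:

```python
from collections import Counter
from itertools import combinations

def map_transactions(transactions):
    # Map: emit each 2-itemset occurring in a transaction exactly once.
    for t in transactions:
        for pair in combinations(sorted(set(t)), 2):
            yield pair

def reduce_counts(pairs, min_support):
    # Reduce: sum support per itemset and keep only the frequent ones.
    counts = Counter(pairs)
    return {item: c for item, c in counts.items() if c >= min_support}

transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"]]
frequent = reduce_counts(map_transactions(transactions), min_support=2)
```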
Article
Full-text available
Distributed computing frameworks are essential for enabling efficient processing of large-scale data across multiple interconnected systems. These frameworks leverage the power of parallelism, allowing tasks to be distributed among various nodes, which can operate simultaneously to perform computations more rapidly than traditional single-node syst...
Conference Paper
Full-text available
The increase in the amount of data and the general change in social and market processes lead to the transformation of basic management principles. Under nondeterministic conditions, when developing solutions related to socially orientated systems, forecasting possible problems requires the use of modern tools of intelligent data analysis. Existi...
Article
Full-text available
With the continuous improvement in the efficiency of the heavy-haul railway freight transportation, the pressure on on-site maintenance is increasing. In-depth research on fault characteristics carries significant importance for fault scientific judgment and fault prevention. This study proposes an efficient association rule mining (ARM) algorithm,...
Article
Full-text available
Researchers have shown a lot of potential in optimizing cloud-based workload-scheduling over the past few years. However, executing scientific workloads inside the cloud is time-consuming and costly, making it inefficient from both a financial and productivity standpoint. As a result, there are many investigations conducted, with the general trend...
Article
Full-text available
This article addresses the importance of HaaS (Hadoop-as-a-Service) in cloud technologies, with specific reference to its usefulness in big data mining for environmental computing applications. The term environmental computing refers to computational analysis within environmental science and management, encompassing a myriad of techniques, especia...
Article
Full-text available
Knowledge extraction from large, uncertain datasets has become a new challenge due to the rapid growth of data. In the present world, where petabytes of data are being generated within a fraction of a second, certain mechanisms are needed to analyse it and extract useful features from it. In this paper, we propose an efficient distributed framework...
Article
Full-text available
It is impossible to separate the design and development of teaching resources from the teaching of ideological education in colleges and universities. However, the current digital teaching resources for ideological education exist in large quantities and vary in quality in many colleges and universities. In order to tackle these issues, the study e...
Article
Full-text available
In the era of big data, the volume, variety, and velocity of data generated pose significant challenges for data cleaning and mining processes. Traditional approaches to data cleaning and mining often struggle to handle large datasets efficiently, leading to increased processing time and reduced accuracy. Leveraging distributed processing technique...
Preprint
Full-text available
Supercomputers getting ever larger and more energy-efficient is at odds with the reliability of the hardware used. Thus, the time intervals between component failures are decreasing. Conversely, the latencies for individual operations of coarse-grained big-data tools grow with the number of processors. To overcome the resulting scalability limit, we nee...
Article
Full-text available
Big data streaming involves managing the vast volumes of data generated continuously by wearable medical devices with sensors, healthcare cloud platforms, and mobile applications. Traditional methods for processing this data are often time- and resource-intensive. To address this challenge, there is a need for efficient and scalable real-time big d...
Article
Full-text available
The amount of data, which is growing rapidly today, also requires fast storage. The need for data is very important, and accessing that data must also be fast. Therefore, it is necessary to know which tools support processing large amounts of data in a short time. The presence of Impala and Hive-Hadoop helps in making...
Article
Full-text available
With the exponential growth of data, traditional database management systems (DBMSs) face unprecedented challenges in scalability, performance, and flexibility. This paper surveys key developments in scalable database systems, focusing on their evolution and integration with modern distributed architectures such as MapReduce, HadoopDB, alongside sy...
Article
Full-text available
The significance of developing Big Data applications has increased in recent years, with numerous organizations across various industries relying more on insights derived from vast amounts of data. However, conventional data techniques and platforms struggle to cope with Big Data, exhibiting sluggish responsiveness and deficiencies in scalability, p...
Article
Full-text available
Batch processing has become a critical approach in handling large-scale data operations in the era of big data. Apache Hadoop, with its MapReduce framework, has revolutionized how organizations store, process, and analyze massive datasets, enabling scalable, fault-tolerant batch processing. This paper introduces the fundamental concepts of batch pr...
Article
Full-text available
This paper provides a comprehensive analysis of the evolution and impact of data mining in the field of artificial intelligence (AI), with a particular focus on its application within social and information networks. It traces the origins of AI back to the 1956 Dartmouth Conference, highlighting the subsequent advancements in technologies such as m...
Article
Full-text available
In this work, we investigate the online MapReduce processing problem on m uniform parallel machines, aiming at minimizing the makespan. Each job consists of two sets of tasks, namely, the map tasks and the reduce tasks. A job’s map tasks can be arbitrarily split and processed on different machines simultaneously, while its reduce tasks can only b...
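The task structure described above implies two simple makespan lower bounds: splittable map work is limited by the aggregate machine speed, while an unsplittable reduce task is limited by the single fastest machine. The sketch below is our own simplified formulation for intuition, not the paper's analysis:

```python
def makespan_lower_bound(map_loads, reduce_loads, speeds):
    # Map work can be split arbitrarily, so total load over aggregate speed
    # bounds the makespan from below. A reduce task must run on one machine,
    # so its load over the fastest machine's speed is also a lower bound.
    total_work = sum(map_loads) + sum(reduce_loads)
    bound_total = total_work / sum(speeds)
    bound_reduce = max(reduce_loads) / max(speeds) if reduce_loads else 0.0
    return max(bound_total, bound_reduce)

# Two jobs with map loads 4 and 2, one reduce load 6, machines of speed 1 and 2.
lb = makespan_lower_bound(map_loads=[4, 2], reduce_loads=[6], speeds=[1, 2])
```

Online algorithms for this problem are typically evaluated by their competitive ratio against bounds of this kind.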
Article
Full-text available
The amount of memory consumed by Internet data has reached terabytes, petabytes, or zettabytes, which makes it difficult to process, analyse, and retrieve. At the same time, many techniques have been developed to process big data, and dealing with it in statistical programs has become very hard. There are several algorithms that are used in big data...
Article
Full-text available
Recently, the research community focuses on processing various types of location-based queries (or LBQs for short) (e.g., the range and nearest neighbor queries) on spatial objects of the same type in road networks, in which the road distance from objects to the query object is an important metric for determining the query result and needs to be ca...
Article
Full-text available
The traditional clustering algorithms are not appropriate for large real‐world datasets or big data, which is attributable to computational expensiveness and scalability issues. As a solution, the last decade's research headed towards distributed clustering using the MapReduce framework. This study conducts a bibliometric review to assess, establis...
Preprint
Full-text available
Enlarging the context window of large language models (LLMs) has become a crucial research area, particularly for applications involving extremely long texts. In this work, we propose a novel training-free framework for processing long texts, utilizing a divide-and-conquer strategy to achieve comprehensive document understanding. The proposed LLM$\...
Article
Full-text available
The degree of nutritional science in current Chinese cooking is not high, and Chinese cooking technology has found it difficult to meet people’s health needs. This paper explores the path of nutritional scientificization in digital cooking. The FCM algorithm based on MapReduce has been found to have a good clustering effect and execution efficiency. Ba...
Article
Full-text available
Recent advancements in information and communication technologies have led to a proliferation of online systems and services. To ensure these systems’ trustworthiness and prevent cybersecurity threats, Intrusion Detection Systems (IDS) are essential. Therefore, developing advanced and intelligent IDS models has become crucial. However, most existin...
Article
Full-text available
The Naive Bayes classifier is a well known machine learning algorithm which has shown virtues in many fields. In this work, big data analysis platforms like Hadoop distributed computing and MapReduce programming are used with Naive Bayes and Gaussian Naive Bayes for classification. Naive Bayes is mainly popular for classification of discrete data sets whi...
Article
Full-text available
Cloud computing has recently made it easier to distribute varied, unstructured digital data within social networks of differing opinions. Processing large volumes of text data requires precise computational methods, which increases the system’s workload. Integrating big data with Natural Language Processing (NLP) has enhanced this. Frameworks li...
Article
Full-text available
Grid data is compressed and stored without processing, which leads to problems of large compression errors and long running times that affect the compression and storage effect. Therefore, this paper proposes a method for storing grid data using parallel computing frameworks. While drawing on the task scheduling strategy of MapReduce distributed comput...
Article
Full-text available
In order to achieve user recommendations that best match their current contextual needs, the author proposes a mobile service QoS (Quality of Service) hybrid recommendation model based on sports user situational awareness. Cluster the users and service items covered by mobile users based on their location contextual information according to the cla...
Article
Full-text available
In the process of parallel density clustering, the boundary points of clusters with different densities are blurred and there is data noise, which affects the clustering performance and makes the clustering results subject to the influence of local optimality. A parallel density clustering algorithm based on MapReduce and optimized cuckoo algorithm...
Article
Full-text available
Machine learning has been widely used and applied in a variety of fields in our lives. However, as the big data era approaches, certain traditional machine learning algorithms will be unable to meet the demands of real-time data processing for massive datasets. As a result, machine learning must reinvent itself for the age of big data. In this pape...
Article
Full-text available
Due to the rapid growth of network technology, the huge volume of distinct data sent via networks is expanding constantly. This situation shows how complex and dense cyber attacks and hazards are becoming. Due to the rapid advancement in network density, cyber security specialists find it difficult to monitor all network activity. Due to frequent and...
Article
Full-text available
In the era of big data, it is necessary to provide novel and efficient platforms for training machine learning models over large volumes of data. The MapReduce approach and its Apache Spark implementation are among the most popular methods that provide high-performance computing for classification algorithms. However, they require dedicated impleme...
Preprint
Full-text available
The Monte Carlo (MC) method is the most common technique used for uncertainty quantification, due to its simplicity and good statistical results. However, its computational cost is extremely high, and, in many cases, prohibitive. Fortunately, the MC algorithm is easily parallelizable, which allows its use in simulations where the computation of a s...
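The parallel structure the abstract describes — independent samples mapped to workers, results reduced by summation — can be illustrated with the classic Monte Carlo estimate of pi. Here the "workers" run sequentially with independent seeds, but each call is isolated and could be dispatched as a map task; names and parameters are our own illustration:

```python
import random

def mc_pi_worker(seed, samples):
    # Each worker samples independently: no shared state, no communication,
    # which is why the MC method parallelizes so naturally.
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

def mc_pi(num_workers, samples_per_worker):
    # "Map" over workers, then "reduce" by summing their hit counts.
    total_hits = sum(mc_pi_worker(seed, samples_per_worker)
                     for seed in range(num_workers))
    return 4.0 * total_hits / (num_workers * samples_per_worker)

estimate = mc_pi(num_workers=4, samples_per_worker=25_000)
```

The standard error shrinks as the inverse square root of the total sample count, regardless of how the samples are split across workers.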
Article
Full-text available
In the recent era, the practical implementation of Autonomous Vehicular Networks (AVNs) with the vulnerable Vehicle-to-Vehicle (V2V) communication of autonomous vehicles and inadequate intelligent decision-making systems has become a primary concern in smart city mobility. This has led to a worsening traffic congestion nightmare with its associates...
Article
Full-text available
The Industrial Internet of Things (IIoT) has recently drawn the interest of numerous academics due to advancements in connection, efficiency, scalability, and cost reductions. In this research, consumers are supplied with secure access to the owners' data. The major goal of the suggested technique is to lower the price and workload of the services bein...
Article
Full-text available
With the exponential growth of data, the demand for efficient and scalable data processing solutions has become paramount. Hadoop and Spark, pivotal components of the open-source Big Data landscape, have been put to the test in this study. We conducted a comprehensive performance analysis of Hadoop and Spark in virtualized environments, evaluating...
Article
Full-text available
In the data science era, the scale of weather data is enormous and rising rapidly. Apache Hadoop is a fast and efficient framework which has been used in many applications in the big data field. However, for large-scale weather datasets, traditional algorithms are not capable enough to satisfy genuine application requirements efficiently. Hadoop...
Article
Full-text available
Currently, high-performance computing environments are facing challenges such as limited resources and an increasing number of users. In order to improve the utilization of environmental resources, this paper proposes a high-performance hybrid computing architecture based on big data processing technology, which is constructed on the basis of an HD...
Article
Full-text available
Model-driven software development has become a hot research topic and discovery trend in the field of software engineering. Its core idea is to treat analysis and design models as equivalent to code. Better integration of models and code can greatly increase the chances of effective improvement and achieve automated software development through abs...
Article
Full-text available
To improve the accuracy of news communication between readers and audiences, it is necessary to avoid the errors generated in the process of news translation so as to make the English translation of news content expression more in line with the actual situation, thus ensuring the authenticity of news dissemination. In this paper, the MapReduce mult...
Article
Full-text available
Hadoop MapReduce (HMR) provides the most common MapReduce (MR) framework, and it is available as open source. MR is a famous computational framework for evaluating unstructured, and semi-structured big data and executing applications in the past ten years. Memory and input/output (I/O) overhead are just two of the many problems affecting the curren...
Article
Full-text available
Video saliency detection is a rapidly growing subject that has seen very few contributions. The most common technique used nowadays is to perform frame-by-frame saliency detection. The modified Spatio-temporal fusion method presented in this paper offers a novel approach to saliency detection and mapping. It uses frame-wise overall motion color s...
Article
Full-text available
Sentiment analysis (SA), or opinion mining, is a general text processing task that aims to discover the sentiments behind opinions in texts on varying subjects. Recently, researchers in the area of SA have focused on assessing opinions on diverse themes like commercial products, everyday social problems and so on. Twitter is a reg...
Article
Full-text available
In the context of the current era of big data, traditional Hadoop and cluster-based MapReduce frameworks are unable to meet the demands of modern research. This paper presents a MapReduce framework based on the AliCloud Serverless platform, which has been developed with the objective of optimizing word frequency counting in large-scale English text...
Preprint
Full-text available
Modern applications can generate a large amount of data from different sources with high velocity, a combination that is difficult to store and process via traditional tools. Hadoop is one framework that is used for the parallel processing of a large amount of data in a distributed environment, however, various challenges can lead to poor performan...
Research
Full-text available
• Focused on clustering and similarity-based retrieval techniques with a case study on finding similar documents. Key skills and topics include: • Implementing document retrieval systems using k-nearest neighbors and reducing computations with KD-trees and locality-sensitive hashing. • Clustering documents by topics using k-means and parallelizing...
Article
Full-text available
In an era where AI-driven decision-making is becoming increasingly important, the surge in data generation across different sectors poses significant scalability challenges for big data processing. This study delves into these challenges, aiming to enhance our understanding and management of large data volumes. It begins by stressing the critical i...
Article
Full-text available
The arrival of the big data era has made the amount of data grow explosively, which puts forward new challenges and demands for computer network technology; the integration of big data and network technology has become an important trend. This paper uses the optimization strategy and the elimination mechanism of the genetic algorithm to optimize t...
Article
Full-text available
In recent years, cloud computing applications have facilitated the distribution of heterogeneous, unstructured digitized data among users’ social networks of varying opinions. Text processing at a large scale requires high-precision computational techniques, which increases the computational burden. The advent of big data analytics along with Natur...
Article
Full-text available
This paper investigates the application of serverless computing in conjunction with the MapReduce framework, particularly in machine learning (ML) tasks. The MapReduce programming model has been widely used to process large-scale datasets by simplifying parallel and distributed data processing. This study explores how the combination of these two t...
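A common way MapReduce supports ML tasks like those studied here is data-parallel gradient computation: each mapper computes a partial gradient on its shard, and the reducer sums them. The sketch below is a generic illustration under that assumption (a 1-D least-squares model; the paper's actual tasks and platform are not specified in the snippet):

```python
def map_gradient(batch, w):
    # Map: each "invocation" computes the partial gradient for its shard.
    g = 0.0
    for x, y in batch:
        g += 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
    return g

def reduce_gradients(partials):
    # Reduce: sum the partial gradients from all invocations.
    return sum(partials)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
shards = [data[:2], data[2:]]   # two "mappers"

w = 0.0
for _ in range(200):
    grad = reduce_gradients(map_gradient(s, w) for s in shards)
    w -= 0.01 * grad / len(data)
# w converges to 2.0, the true slope
```

In a serverless setting each `map_gradient` call would be a separate function invocation; because the reduce step is a plain sum, the result is identical to single-machine gradient descent.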
Article
Full-text available
Big data is a large collection of data that is stored and processed for future use. Internet of Things (IoT) technology is used in smart homes and smart healthcare. IoT devices have limited resources, such as processing capability and energy supply. Many researchers have carried out research on resource-optimized data clustering in big data environments. However, th...
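The data clustering this abstract refers to is typically k-means or a variant. A minimal 1-D sketch shows the two alternating steps, assignment and recentering, that any resource-optimized version must still perform; the function name and toy data are illustrative only:

```python
def kmeans_1d(points, k=2, iters=10):
    # Minimal 1-D k-means: assign each point to its nearest centroid,
    # then move each centroid to the mean of its cluster.
    centroids = points[:k]   # naive init from the first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[i].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two obvious groups around 1.0 and 10.0 (e.g. two sensor regimes):
data = [1.0, 1.2, 0.8, 10.0, 10.4, 9.6]
centroids = kmeans_1d(data, k=2)
```

On resource-constrained IoT nodes, the usual optimizations reduce the assignment cost (sampling, early stopping, or pushing the reduce step to a gateway), but the core loop stays as above.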
Article
Full-text available
Within the Hadoop ecosystem, MapReduce stands as a cornerstone for managing, processing, and mining large-scale datasets. Yet, the absence of efficient solutions for precise estimation of job execution times poses a persistent challenge, impacting task allocation and distribution within Hadoop clusters. In this study, we present a comprehensive mac...
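The simplest baseline for the job execution-time estimation this study addresses is a regression on historical job records, e.g. runtime as a linear function of input size. The numbers below are hypothetical illustrative data, not results from the paper:

```python
def fit_linear(xs, ys):
    # Ordinary least squares for: runtime ≈ a * input_size + b.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Hypothetical historical jobs: (input size in GB, runtime in seconds),
# chosen to follow 10 s/GB plus 2 s of fixed startup overhead.
sizes    = [1.0, 2.0, 4.0, 8.0]
runtimes = [12.0, 22.0, 42.0, 82.0]

a, b = fit_linear(sizes, runtimes)
predicted = a * 6.0 + b   # estimated runtime for a 6 GB job
```

Such per-job predictions can then feed task allocation: a scheduler that knows a job's expected runtime can place it on a node with enough free capacity instead of guessing.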
Preprint
Full-text available
Background: Compressive genomics consists of a set of techniques that, for important bioinformatics tasks such as sequence comparison and search, leverage a compressed representation of the data in order to improve time performance. That is, good compression is not the final goal, since the new and smaller representation of the data sh...
Article
Full-text available
Recently, opinion mining has been introduced to extract useful information from large amounts of SNS (Social Network Service) data and to identify users' true opinions. Opinion mining requires an efficient technique that collects and analyzes large amounts of data within a short period of time to extract information suitable for the...
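A lexicon-based opinion-mining pass maps naturally onto MapReduce: mappers score the opinion words in each post, and the reducer aggregates the scores into an overall polarity. This is a generic sketch, not the paper's method, and the tiny lexicon is a stand-in for the much larger ones used in practice:

```python
# Hypothetical tiny sentiment lexicon; real systems use thousands of entries.
LEXICON = {"great": 1, "good": 1, "bad": -1, "terrible": -1}

def map_sentiment(post):
    # Map: emit the lexicon score of each opinion word found in the post.
    return [LEXICON[w] for w in post.lower().split() if w in LEXICON]

def reduce_sentiment(scores):
    # Reduce: aggregate per-word scores into an overall polarity.
    total = sum(scores)
    return "positive" if total > 0 else "negative" if total < 0 else "neutral"

posts = ["great service", "good good price", "bad wait"]
scores = [s for p in posts for s in map_sentiment(p)]
polarity = reduce_sentiment(scores)
# scores sum to +2, so the stream is classified "positive"
```

Because the map step is embarrassingly parallel per post, throughput scales with the number of mappers, which is what makes short-turnaround analysis of large SNS streams feasible.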
Article
Full-text available
Recently, as the construction of a large-scale sensor network increases, a system for efficiently managing large-scale sensor data is required. In this paper, we propose a cloud-based sensor data management system with low cost, high scalability, and high efficiency. In the proposed system, sensor data is transmitted to the cloud through the cloud...
Article
Full-text available
Recently, research and utilization of distributed storage and processing systems for
Article
Full-text available
In a world of data deluge, considerable computational power is necessary to derive knowledge from the mountains of raw data which surround us. This trend mandates the use of various parallelization techniques and runtimes to perform such analyses in a meaningful period of time. The information retrieval community has introduced a programming model...
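The programming model this abstract refers to is MapReduce, and the information-retrieval task it was famously applied to is inverted-index construction: mappers emit (term, doc_id) pairs, reducers collect the posting list for each term. A minimal local sketch (names and corpus are illustrative):

```python
from collections import defaultdict

def map_index(doc_id, text):
    # Map: emit a (term, doc_id) pair for every term in the document.
    return [(term.lower(), doc_id) for term in text.split()]

def reduce_index(pairs):
    # Reduce: collect the sorted set of documents containing each term.
    index = defaultdict(set)
    for term, doc_id in pairs:
        index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

corpus = {1: "map reduce at scale", 2: "reduce data movement"}
pairs = [p for doc_id, text in corpus.items() for p in map_index(doc_id, text)]
index = reduce_index(pairs)
# index["reduce"] == [1, 2]: the term's posting list
```

The runtime's job is everything this sketch elides: partitioning the corpus across mappers, shuffling pairs by term to reducers, and retrying failed tasks.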
Article
Full-text available
Due to the high complexity of distributed computing environments, there is a critical need for advanced scheduling frameworks capable of optimizing MapReduce systems. Current approaches rely on static policies that limit their ability to adapt to changing system dynamics and workload variations across different cloud scenarios. To overcome t...
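The static policies this abstract criticizes can be made concrete with the classic greedy baseline: sort tasks by cost and always hand the next one to the currently least-loaded node. The sketch below shows that baseline (not the paper's adaptive framework); task names and costs are illustrative:

```python
import heapq

def schedule(tasks, n_nodes):
    # Greedy longest-processing-time scheduling: each task goes to the
    # least-loaded node. A static policy: it ignores runtime dynamics.
    heap = [(0.0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    assignment = {node: [] for node in range(n_nodes)}
    for task, cost in sorted(tasks, key=lambda t: -t[1]):
        load, node = heapq.heappop(heap)
        assignment[node].append(task)
        heapq.heappush(heap, (load + cost, node))
    makespan = max(load for load, _ in heap)
    return assignment, makespan

tasks = [("t1", 4.0), ("t2", 3.0), ("t3", 3.0), ("t4", 2.0)]
assignment, makespan = schedule(tasks, 2)
# both nodes end at load 6.0, so the makespan is 6.0
```

An adaptive scheduler of the kind the paper targets would replace the fixed `cost` estimates with measurements updated as tasks run, and rebalance when a node slows down.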
Article
Full-text available
When using modern big data processing tools, a key problem is improving performance through effective tuning of the frameworks' many configuration parameters. The object of the research is the computational processes of big data processing using high-performance frameworks. The subject is me...
Article
Full-text available
A cost-effective and efficient agriculture management system is created by utilizing data analytics (DA), the Internet of Things (IoT), and cloud computing (CC). Geographic information system (GIS) technology and remote sensing predictions give users and stakeholders access to a variety of sensory data, including rainfall patterns and weather-related i...
Article
Full-text available
The subject of the research is the problem of detecting falsified data, particularly in audio format, in socially oriented systems. The goal of the work is to develop an effective model based on recurrent and convolutional neural networks for detecting forged audio data, using MapReduce technology for parallelization. The article...
Article
Full-text available
The Internet of Things (IoT) has been deployed in a vast range of applications, with exponential increases in data size and complexity. Existing forensic techniques are not effective in terms of accuracy and detection rate for security issues in IoT forensics. Cyber forensics faces severe volume constraints, as it must process huge volumes of data in th...
Article
Full-text available
Ensuring data confidentiality is a critical requirement for modern security systems globally. Despite the implementation of various access-control policies to enhance system security, significant threats persist due to insecure and inadequate access management. To address this, Multi-Party Authorization (MPA) systems employ multiple authorities for...