Article

Abstract

The cloud computing paradigm provides a shared pool of resources and services with different models delivered to the customers through the Internet via an on-demand dynamically-scalable form charged using a pay-per-use model. The main problem we tackle in this paper is to optimize the resource provisioning task by shortening the completion time for the customers’ tasks while minimizing the associated cost. This study presents the dynamic resources provisioning and monitoring (DRPM) system, a multi-agent system to manage the cloud provider’s resources while taking into account the customers’ quality of service requirements as determined by the service-level agreement (SLA). Moreover, DRPM includes a new virtual machine selection algorithm called the host fault detection algorithm. The proposed DRPM system is evaluated using the CloudSim tool. The results show that using the DRPM system increases resource utilization and decreases power consumption while avoiding SLA violations.
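The abstract describes a host fault detection step that picks VMs to move off troubled hosts. The paper's exact algorithm is not given here, so the following is only a minimal sketch of a threshold-based host check with a greedy VM selection; the field names, the 80% SLA utilization limit, and the "largest VM first" choice are all illustrative assumptions, not details from DRPM.

```python
def find_overutilized_hosts(hosts, upper=0.8):
    """Return hosts whose CPU utilization exceeds an assumed SLA upper limit."""
    return [h for h in hosts if h["used_mips"] / h["total_mips"] > upper]

def select_vm_to_migrate(host):
    """Greedy choice: pick the largest VM on the host (one plausible policy)."""
    return max(host["vms"], key=lambda vm: vm["mips"])

hosts = [
    {"name": "h1", "total_mips": 1000, "used_mips": 900,
     "vms": [{"id": "vm1", "mips": 500}, {"id": "vm2", "mips": 400}]},
    {"name": "h2", "total_mips": 1000, "used_mips": 300,
     "vms": [{"id": "vm3", "mips": 300}]},
]

for h in find_overutilized_hosts(hosts):
    vm = select_vm_to_migrate(h)
    print(f"{h['name']}: migrate {vm['id']}")  # h1: migrate vm1
```

In a CloudSim-style evaluation, such a selection policy would run each monitoring interval, feeding the chosen VMs to a placement step.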


... Such models potentially allow cloud platforms to be efficient, scalable, and adaptable. In this direction, extensible and modular multi-agent architectures for optimizing cloud resource management remain a research gap attracting growing interest [9,11,12,15,16]. ...
... is the user's estimated workload time obtained through the MLR presented in Eq. (15). represents the resource utilization upper limit predefined by SLA. ...
... represents the user-defined upper limit determined using the pricing model for each VM based on the estimated execution time presented in Eq. (15). ...
Article
Nowadays, scientific and commercial applications are often deployed to cloud environments requiring multiple resource types. This scenario increases the need for efficient resource management. However, efficient resource management remains challenging due to the complex nature of modern cloud-distributed systems, since resources involve different characteristics, technologies, and financial costs. Thus, optimized cloud resource management that supports the heterogeneous nature of applications while balancing cost, time, and waste remains a challenge. Multi-agent technologies can offer noticeable improvements for resource management, with intelligent agents deciding on Virtual Machine (VM) resources. This article proposes MAS-Cloud+, a novel agent-based architecture for predicting, provisioning, and monitoring optimized cloud computing resources. MAS-Cloud+ implements agents with three reasoning models: heuristic, formal optimization, and metaheuristic. MAS-Cloud+ instantiates VMs considering Service Level Agreements (SLAs) on cloud platforms, prioritizing user needs in terms of time, cost, and waste of resources, and providing an appropriate selection for the evaluated workloads. To validate MAS-Cloud+, we use a DNA sequence comparison application subjected to different workload sizes and a comparative study with state-of-the-art work using Apache Spark benchmark applications executed on AWS EC2. Our results show that, for the sequence comparison application, the best performance was obtained by the optimization model, whereas the heuristic model presented the best cost. By providing a choice among multiple reasoning models, MAS-Cloud+ delivered a more cost-effective selection of instances, reducing the average execution cost of the WordCount, Sort, and PageRank BigDataBench benchmarking workloads by approximately 58%. As for execution time, WordCount and PageRank also show reductions, the latter by approximately 58%.
The results indicate a promising solution for efficient cloud resource management.
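The heuristic reasoning model mentioned above has to trade off time, cost, and waste when selecting an instance type. A hedged sketch of one way such a scoring step could look; the instance catalog, prices, and weights below are invented for illustration and are not figures from the article.

```python
def score(vm, need_cpu, need_mem, w=(0.4, 0.4, 0.2)):
    """Lower is better; weights trade off time, cost, and wasted memory."""
    time_est = need_cpu / vm["mips"]            # relative execution time
    cost_est = time_est * vm["price"]           # time x hourly price
    waste = (vm["mem"] - need_mem) / vm["mem"]  # unused memory fraction
    return w[0] * time_est + w[1] * cost_est + w[2] * waste

catalog = [
    {"name": "small", "mips": 1000, "mem": 4,  "price": 0.05},
    {"name": "large", "mips": 4000, "mem": 16, "price": 0.20},
]
best = min(catalog, key=lambda vm: score(vm, need_cpu=2000, need_mem=4))
print(best["name"])  # large
```

Swapping the weight vector changes which criterion dominates, which mirrors how choosing among heuristic, optimization, and metaheuristic agents changes the cost/performance outcome reported above.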
... The dynamic resources provisioning and monitoring (DRPM) system proposed by Al-Ayyoub et al. [12] is a multi-agent system designed to manage the cloud provider's resources whilst taking into consideration the quality of service required by the clients. These requirements are governed by the SLA. ...
... Efficient utilization and monitoring of resources are the concern of the CSB. Using smart methods, the broker should obtain the finest quality-of-service assurances without causing any breach of the SLA [12]. A cloud SLA is an agreement between a cloud provider and a cloud customer that ensures a minimum level of service is preserved. ...
... The DCM system is a multi-agent model that takes various elements into consideration when forming the decision, for example, the number and shortcomings of cloud resources, customer fulfilment, and the customers' QoS requirements. This system is created by adapting and combining both the DRPM system introduced in [12] and the cloud broker architecture introduced in [11]. Figure 5 shows the architecture of the DCM system. ...
Article
Full-text available
The cloud computing model offers a shared pool of resources and services with diverse models presented to clients through the Internet via an on-demand, dynamically scalable, pay-per-use model. Developers have identified the need for an automated system (a cloud service broker (CSB)) that can help exploit cloud capability, enhance its functionality, and improve its performance. This research presents a dynamic congestion management (DCM) system that can manage the massive volume of cloud requests while considering the quality required by clients as regulated by the service-level policy. In addition, this research introduces a forwarding policy that can be used to choose high-priority calls coming from cloud service requesters and pass them through the broker to suitable cloud resources. The policy makes use of a mechanism used by Cisco to assist in managing the congestion that might take place at the broker side. Furthermore, the DCM system helps in provisioning and monitoring the work of the cloud providers throughout job operation. The proposed DCM system was implemented and evaluated using the CloudSim tool.
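A forwarding policy of the kind described, where high-priority calls are chosen first, can be sketched with a simple priority queue at the broker. The class, field names, and priority values below are illustrative assumptions, not the DCM implementation.

```python
import heapq

class Broker:
    """Toy broker that always forwards the highest-priority pending request."""
    def __init__(self):
        self._queue, self._seq = [], 0

    def submit(self, request_id, priority):
        # heapq is a min-heap, so negate priority; seq keeps FIFO order on ties
        heapq.heappush(self._queue, (-priority, self._seq, request_id))
        self._seq += 1

    def forward_next(self):
        return heapq.heappop(self._queue)[2]

b = Broker()
b.submit("r1", priority=1)
b.submit("r2", priority=5)   # a high-priority call
b.submit("r3", priority=1)
print(b.forward_next())  # r2
print(b.forward_next())  # r1
```

The tie-breaking sequence number matters: without it, two equal-priority requests would be compared by their payloads, and FIFO fairness among same-priority callers would be lost.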
... 2. How to accurately predict when over-utilisation is going to occur on a host. 3. Deciding optimal times to schedule live migration. ...
... This section provides an overview of each chapter contained within this thesis. The initial chapters (1,2,3) describe the domains of cloud computing, live migration, Reinforcement Learning and Neural Networks, as well as describing the current and ongoing work relevant to our research. The ensuing results chapters (4,5,6) present our contributions and findings within these domains. ...
... In dynamic VM consolidation (DVMC), VMs are configured based on their current resource requirement, which is known as "re-sizing". This can lead to a more efficient algorithm that selects a VM to migrate from the over-utilised host based on the maximum impact on the cause of the overload [3]. If the host's over-utilisation is caused by RAM, then the algorithm selects the VM with the maximum allocated RAM. ...
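The "maximum impact" rule just described, identifying which resource caused the overload and migrating the VM holding the most of it, can be sketched as follows. The thresholds, dictionary layout, and example numbers are illustrative assumptions.

```python
def overload_cause(host, limits={"cpu": 0.9, "ram": 0.9}):
    """Return the first resource whose aggregate use exceeds its limit."""
    for res, limit in limits.items():
        used = sum(vm[res] for vm in host["vms"])
        if used / host[res] > limit:
            return res
    return None

def select_vm(host):
    """Migrate the VM holding the most of the overloaded resource."""
    cause = overload_cause(host)
    if cause is None:
        return None
    return max(host["vms"], key=lambda vm: vm[cause])

host = {"cpu": 100, "ram": 64,
        "vms": [{"id": "a", "cpu": 30, "ram": 50},
                {"id": "b", "cpu": 40, "ram": 10}]}
print(select_vm(host)["id"])  # "a": the overload is RAM-driven (60/64)
```

Here CPU sits at 70% (under the limit) while RAM sits at about 94%, so the rule correctly evicts the RAM-heavy VM rather than the CPU-heavy one.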
Thesis
Full-text available
Cloud computing providers utilise large-scale data centres to provide computing resources to users worldwide. However, an ongoing challenge facing cloud providers is to ensure that the required resources are available to users at all times. This problem is compounded further by the fact that resource consumption is in a constant state of flux. One approach leveraged to improve resource availability for users is 'Live Migration'. This thesis presents a number of novel intelligent live migration solutions, applying machine learning to improve the performance of a data centre. The first contribution utilises a control algorithm known as 'Reinforcement Learning' to decide which virtual machines to migrate, based on RAM size, when peak traffic conditions occur and bandwidth availability is fluctuating. A Reinforcement Learning agent is implemented to decide which virtual machines to migrate from over-utilised hosts depending on the currently available bandwidth and the host machine's CPU utilisation. The second contribution utilises the power of Neural Networks to predict when a host will become over-utilised. A 'Recurrent Neural Network' is implemented, trained with both traditional training algorithms (Back-Propagation and Back-Propagation-Through-Time) and evolutionary algorithms (Particle Swarm Optimisation, Covariance Matrix Adaptation and Differential Evolution), to best predict the CPU utilisation of a host for single and multiple time steps into the future. The final contribution implements a 'Recurrent Neural Network' to predict both the CPU utilisation of a host and the bandwidth availability between hosts, to decide optimal times to schedule live migration within a data centre. The Recurrent Neural Network first predicts whether a host will become over-utilised and then predicts the available bandwidth before that host becomes over-utilised, to decide the best times to migrate virtual machines.
The work presented in this thesis demonstrates how Artificial Intelligence and Machine Learning algorithms can have a positive impact on live migration in cloud data centres.
... π indicates that the plan made this time is the sequence of actions to be performed by the agent [9]. Deliberative agents have strong reasoning and decision-making skills and are able to plan their actions according to changes in the environment to achieve predetermined goals. However, deliberative agents consume more resources than reactive agents. ...
... Multi-agent technology is applied to micro-printer systems to solve the core problems of micro-cluster printing systems in terms of reliable printing and state regulation of continuous high-volume orders. Communication is the basis for cooperation in multi-agent systems [9]. Agents embedded in the device have the ability to resolve all issues [10][11]. ...
... The methodology named DRPM [29] demonstrates a multi-agent-based dynamic resource provisioning technique for effectively sharing data in a cloud environment. The time required to convert plain text into cipher text and vice versa depends upon the complexity of the retrieval mechanism [30]. ...
... The configuration is analyzed, and the workload for the virtual machine is used for analyzing the performance. The proposed work, ISRU, is compared with the related methods CLBVM [36], CASID [34], and DRPM [29]. The performance analysis comparing the proposed methodology with the existing methods is executed on a real system running Windows 10 with an Intel Core i7 CPU (2 cores, 8 logical processors) at 2 GHz and 16 GB of RAM. ...
Article
Full-text available
Cloud-based environments utilize different kinds of security services over the Internet in a cost-effective manner. Cloud-based service providers can reduce operational costs by automatically controlling resource utilization according to user demand. Moreover, time and expenditure may grow, and the active utilization of computational resources is one of the main constraints on scalability. The automatic control and utilization of resources is the biggest challenge in a cloud computing environment. This paper proposes a solution for providing automatic scalability of limited resources for multi-layered cloud applications. The Google penalty payment methodology was utilized to synchronize the expenditure for penalty-related issues and to correctly compute the actual profit. A hybrid resource utilization algorithm is used to find the valid resources in the cloud layer, and a security-aware algorithm is utilized to distribute the resources to the active users based on their requests. The experiments are performed using CloudSim, and the results indicate the advantages of the proposed methodology in terms of resource utilization, security provision, and profit identification.
... Considering that the research focus of this paper is the use of a multi-agent approach to effectively manage transient instance resources, the systematic literature review presented some works using an agent approach with dedicated (on-demand) machines [2,6,12,31]. The core agent behavior in these works uses elastic resource allocation associated with prediction methods for effective resource provisioning in cloud computing. ...
... The adjustment indicator function, A_I, assumes the value 1 if its argument is true and zero otherwise. The kernel functions, K_θ, for each of the attributes in θ_D all assume the value 1 when γ_θ^(1) = γ_θ^(2), and decrease polynomially to zero as the values become more distant. The rate of decay varies according to the respective attribute. ...
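The excerpt above describes an indicator function and per-attribute kernels that equal 1 at identical values and decay polynomially with distance. A minimal sketch under an assumed functional form (the scale and decay exponent per attribute are illustrative parameters, not the paper's):

```python
def indicator(condition):
    """A_I-style indicator: 1 if the argument is true, 0 otherwise."""
    return 1 if condition else 0

def poly_kernel(a, b, scale=1.0, power=2):
    """Equals 1.0 when a == b, decaying polynomially as |a - b| grows."""
    return 1.0 / (1.0 + (abs(a - b) / scale) ** power)

print(poly_kernel(3.0, 3.0))             # 1.0
print(round(poly_kernel(3.0, 5.0), 3))   # 1/(1+4) = 0.2
```

Varying `power` per attribute reproduces the "rate of decay varies according to the respective attribute" behaviour mentioned above.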
... To do this, an independent-samples t-test is used to determine which features can be excluded without affecting the results negatively. Such tests are used in many studies for this purpose [21]. In our work, we select the best classifier according to the mentioned measures, which was SMO with bagging applied to it. ...
Conference Paper
Full-text available
Quranic Recitation Rules (Ahkam Al-Tajweed) are the articulation rules that should be applied properly when reciting the Holy Quran. Most of the current automatic Quran recitation systems focus on the basic aspects of recitation, which are concerned with the correct pronunciation of words, and neglect the advanced Ahkam Al-Tajweed related to the rhythmic and melodious way of recitation, such as where to stop and how to "stretch" or "merge" certain letters. The only existing works on the latter aspects are limited in terms of the rules they consider or the parts of the Quran they cover. This paper comes to fill these gaps. It addresses the problem of identifying the correct usage of Ahkam Al-Tajweed in the entire Quran. Specifically, we focus on eight Ahkam Al-Tajweed faced by early learners of recitation. Popular audio processing techniques for feature extraction (such as LPC, MFCC and WPD) and classification (KNN, SVM, RF, etc.) are tested on an in-house dataset. Moreover, we study the significance of the features by performing several t-tests. Our results show that the highest accuracy achieved is 94.4%, obtained when bagging is applied to SVM with all features except the LPC features.
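The independent-samples t-test used above for feature screening can be sketched in a few lines: a feature whose values barely differ between the two classes (small |t|) is a candidate for exclusion. This is a generic Welch's t-statistic on made-up numbers, not the paper's data or tooling.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's two-sample t-statistic (unequal variances)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se

# hypothetical feature values for class A vs class B
informative = welch_t([5.1, 5.3, 4.9, 5.2], [2.0, 2.2, 1.9, 2.1])
uninformative = welch_t([5.1, 5.3, 4.9, 5.2], [5.0, 5.2, 5.1, 4.8])
print(abs(informative) > abs(uninformative))  # True
```

In practice the t-statistic is converted to a p-value against a threshold (e.g. 0.05), and features failing to reach significance are dropped before retraining the classifier.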
... In [8], a multi-agent model was developed that is partially centralized through separate clusters of centralized resource provisioning mechanisms. Multiple agents make cloud computing more flexible and more autonomous. ...
Article
Resource management is a fundamental concept in cloud computing and virtualization, encompassing the allocation, release, coordination, and monitoring of cloud resources to optimize efficiency. The complexity arises from the virtualized, heterogeneous, and multi-user nature of these resources. Effective governance is challenging due to uncertainty, large-scale infrastructures, and unpredictable user states. This paper presents a comprehensive taxonomy of resource management technologies, offering a detailed analysis of design architecture, virtualization, and cloud deployment models, along with capabilities, objectives, methods, and mechanisms. In a cloud computing environment, deploying application-based resource management techniques necessitates understanding the system architecture and deployment model. This paper explores centralized and distributed resource management system architectures, providing a review of effective resource management techniques for both, accompanied by a comparative analysis. The evolution of cloud computing from a centralized to a distributed paradigm is examined, emphasizing the shift towards distributed cloud architectures to harness the computing power of smart connected devices at the network edge. These architectures address challenges like latency, energy consumption, and security, crucial for IoT-based applications. The literature proposes various methods for distributed resource management, aligning with the distributed nature of these architectures. Resource management in cloud computing involves discovery, provisioning, allocation, and monitoring functions, with sub-functions like mapping and scheduling. Integrated approaches to consolidation and resource management have been explored in numerous studies. This paper summarizes and analyzes existing research on resource management functions, focusing on identification, provisioning, allocation planning, and monitoring, based on their objectives and methods.
... Al-Ayyoub et al. [29] provided an overview of the Dynamic Resources Provisioning and Monitoring (DRPM) system for organizing cloud resources through the use of multi-agent systems. DRPM also takes into consideration the QoS for cloud customers. ...
Article
Full-text available
Large amounts of data are created by sensors in Internet of Things (IoT) services and applications. These data create a challenge in directing them to the cloud, which requires extreme network bandwidth. Fog computing appears as a modern solution to overcome these challenges, since it can extend the cloud computing model to the edge of the network, consequently adding a new class of services and applications with high-speed responses compared to the cloud. Cloud and fog computing offer huge amounts of resources to their clients and devices, especially in IoT environments. However, inactive resources and the large number of applications and servers in cloud and fog computing data centers waste a huge amount of electricity. This paper proposes a Dynamic Power Provisioning (DPP) system for fog data centers, which consists of a multi-agent system that manages the power consumption of the fog resources in local data centers. The proposed DPP system is evaluated using the CloudSim and iFogSim tools. The results show that employing the DPP system in local fog data centers reduces the power consumption for fog resource providers.
... Historically, the field of cloud computing is primarily characterized by hardware-centric models, in which physical resources determine computational capacities [4]. The introduction of SDS transforms this paradigm, focusing on software-driven governance, adaptability, and mechanization. ...
Conference Paper
Businesses increasingly lean on cloud-based solutions in the contemporary digital landscape, drawn by their scalability and adaptability. Navigating the financial intricacies of cloud subscription models, particularly when intertwined with software-defined systems, remains a formidable challenge. This challenge is accentuated by dynamic pricing structures, making accurate cost forecasting a critical but complex endeavor. This research unveils an innovative methodology designed to meticulously forecast the financial costs of cloud subscription tasks based on resource allocation characteristics. Our approach centers around a robust model carefully created through particular preparation and modeling stages that incorporate a pricing model for various virtual machines and utilize different advanced regression algorithms to navigate the complex world of cloud subscription services. The results of our study, which analyzed real cloud workloads, indicate that equipping businesses with a powerful predictive tool can lead to improved financial planning and strategic decision-making in their cloud operations. This study aims to guide businesses in the ever-changing field of cloud technology, focusing on promoting financial efficiency and improving decision-making strategies in their cloud initiatives.
... It may lead to imbalanced server loads, resulting in numerous hot and cold spots throughout the data center [3]. Cloud Resource Management Systems (CRMS) need to be equipped with appropriate procedures for identifying and collecting information regarding active workloads to circumvent these potential issues [4]. Anomaly detection, clustering, classification, and regression are some of the Machine Learning (ML) models that can be utilized to improve the Selection and Decision-Making (SDM) modules. ...
... The DRPM system was presented by Al-Ayyoub et al. [38]. This multi-agent system was designed to control the cloud provider's resources while considering the customers' quality-of-service requirements, which are governed by the SLA. ...
Article
Full-text available
Cloud computing is a massive pool of dynamic and distributed resources that are delivered on request to clients over the Internet. Typical centralized cloud computing models may have difficulty dealing with challenges caused by IoT applications, such as network failure, latency, and capacity constraints. One of the methods introduced to solve these challenges is fog computing, which brings the cloud closer to IoT devices. A system for dynamic congestion management brokerage is presented in this paper. With this proposed system, the IoT quality of service (QoS) requirements as defined by the service-level agreement (SLA) can be met as the massive number of cloud requests arrives from the fog broker layer. In addition, a forwarding policy is introduced which helps the cloud service broker select and forward the high-priority requests from fog brokers and cloud users to the appropriate cloud resources. This idea is influenced by the weighted fair queuing (WFQ) Cisco queuing mechanism to simplify the management and control of congestion that may take place at the cloud service broker side. The system proposed in this paper is evaluated using the iFogSim and CloudSim tools, and the results demonstrate that it improves IoT QoS compliance while also avoiding cloud SLA violations.
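The WFQ idea the broker borrows can be sketched as follows: each flow gets a weight, and requests are served in order of virtual finish time (previous finish time of the flow plus size divided by weight). This is a deliberately simplified sketch that ignores the global virtual-clock advance of real WFQ; flow names and weights are illustrative, not Cisco configuration.

```python
def wfq_order(requests):
    """requests: list of (flow, size, weight); returns the service order."""
    finish, scheduled = {}, []
    for flow, size, weight in requests:
        start = finish.get(flow, 0.0)           # this flow's last finish time
        finish[flow] = start + size / weight    # virtual finish time
        scheduled.append((finish[flow], flow))
    return [flow for _, flow in sorted(scheduled)]

reqs = [("gold", 100, 4), ("bronze", 100, 1), ("gold", 100, 4)]
print(wfq_order(reqs))  # ['gold', 'gold', 'bronze']
```

The higher-weight "gold" flow accumulates virtual finish time four times more slowly, so its requests are forwarded ahead of the "bronze" request, which is exactly the high-priority preference described in the abstract.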
... In order to achieve data availability, cloud data is replicated and stored. Replicas increase data availability; however, they introduce several challenges such as data modification and deletion [1][2][3]. When the original data undergoes modifications, the same modifications have to be reflected in the replicas immediately, so as to escape data inconsistency. ...
Article
Full-text available
Cloud computing technology has gained substantial research interest due to its remarkable range of services. The major concerns of cloud computing are availability and security. Several security algorithms are presented in the literature for achieving better security, and data availability is increased by utilizing data replicas. However, creating replicas for all the data is unnecessary and consumes more storage space. Considering this issue, this article presents a Secure Data Replication Management Scheme (SDRMS), which creates replicas by considering the access frequency of the data; the replicas are loaded onto the cloud server by considering its current load. This idea balances the load of the cloud server. All the replicas are organized in a tree-like structure, and the replicas with the maximum hit ratio are placed on the first level of the tree to ensure better data accessibility. The performance of the work is satisfactory in terms of data accessibility, storage exploitation, replica allocation, and retrieval time.
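The placement idea above, hottest replicas nearest the root of the tree, can be sketched by sorting replicas by hit ratio and filling tree levels top-down. The branching factor and hit ratios are assumed for illustration; this is not the SDRMS algorithm itself.

```python
def place_replicas(replicas, branching=2):
    """replicas: dict name -> hit ratio; returns tree levels, root level first."""
    ordered = sorted(replicas, key=replicas.get, reverse=True)
    levels, width, i = [], 1, 0
    while i < len(ordered):
        levels.append(ordered[i:i + width])  # fill one level at a time
        i += width
        width *= branching                   # each level holds branching x more
    return levels

hits = {"r1": 0.10, "r2": 0.45, "r3": 0.30, "r4": 0.15}
print(place_replicas(hits))  # [['r2'], ['r3', 'r4'], ['r1']]
```

Since lookups start at the root, placing high-hit-ratio replicas in the shallow levels minimizes the expected traversal depth per access.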
... Nowadays, resource allocation and task prioritization are challenging in real-world distributed systems. In agent-based environments, algorithms of practical value operate very well where there is a large amount of uncertainty or dynamism (Al-Ayyoub et al. 2015). The efficiency of centralized techniques is limited by scalability challenges in many problems, and some level of autonomous behavior is required for the reliability of communications. ...
Article
Full-text available
Nowadays, multi-agent systems receive tremendous attention in different fields for solving complex problems by subdividing them into smaller tasks. An agent utilizes multiple inputs, e.g., its history of actions and its interactions with neighboring agents. Existing techniques for task planning of the control structure exhibit low efficiency. Using a sole numerical analysis method for a complicated distributed resource planning problem, a satisfactory optimal solution is impossible to obtain. In this paper, a control structure model based on multi-agents is presented, in which the superiority of multi-agents is exploited for complex task achievement. The collaboration of the multi-agent framework is redefined, and a local conflict coordination mechanism is developed. Moreover, the presented technique exhibits high adaptability and superior cooperation. The function value and its time–space complexity are analyzed, showing that the algorithm achieves a lower objective function value and exhibits better convergence and adaptability. The presented technique is 37–43% better than the Hierarchical Task Network (HTN) Planning technique for different time slots, and 29–34% better than the Time Preference HTN technique in terms of function value. Overall, the proposed technique outperforms the existing techniques in terms of the obtained function values.
... Data centers need efficient resource management approaches to ensure users' workloads have the required resources while not violating service-level agreements (SLA). Optimizing resource allocation for optimal performance requires applying prediction algorithms to forecast the upcoming workload and the resources needed to handle any load processed by the data center [11]. The dynamic nature of server load hamstrings linear regression models due to their inherent limitations. ...
... Many centralized resource management techniques adopting a centralized architecture to support cloud-based applications have been proposed in the literature; we describe some of the recent works. Gutierrez-Garcia et al. [100] developed an agent-based elastic cloud bag-of-tasks model, which is centralized. Similarly, Al-Ayyoub et al. [9] developed a multi-agent-based model which is partially centralized, due to individual clusters with a centralized mechanism for resource provisioning. Multiple agents allow cloud computing to be more flexible and more autonomous. ...
Article
Full-text available
Resource management (RM) is a challenging task in a cloud computing environment, where a large number of virtualized, heterogeneous, and distributed resources are hosted in datacentres. The uncertainty, heterogeneity, and dynamic nature of such resources affect the efficiency of the provisioning, allocation, scheduling, and monitoring tasks of RM. Most existing RM techniques and strategies are insufficient for handling the dynamic behaviour of such cloud resources. To resolve these limitations, there is a need for the design and development of intelligent and efficient autonomic RM techniques that ensure the Quality-of-Service (QoS) of cloud-based applications, satisfy cloud user requirements, and avoid Service-Level Agreement (SLA) violations. This paper presents a comprehensive review along with a taxonomy of the most recent autonomic and elastic RM techniques in a cloud environment. The taxonomy classifies the existing autonomic and elastic RM techniques into different categories based on their design, objective, function, and applications. Moreover, a comparison and qualitative analysis are provided to illustrate their strengths and weaknesses. Finally, the open issues and challenges are highlighted to help researchers find significant future research options.
... There are several prediction models categorized depending on the properties of the linear filter, including [71][72][73][74][75][76][77]: Auto Regressive (AR), Moving Average (MA), Auto Regressive Moving Average (ARMA), Auto Regressive Integrated Moving Average (ARIMA) and Linear Regression (LR). ...
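The simplest member of that family, a first-order autoregressive model AR(1), can be fit by least squares and used to forecast the next utilization sample from the current one. This is pure illustration on made-up CPU readings; real workloads usually need the richer ARMA/ARIMA variants listed above.

```python
def fit_ar1(series):
    """Least-squares fit of x[t+1] = c + phi * x[t]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    phi = (sum((a - mx) * (b - my) for a, b in zip(x, y))
           / sum((a - mx) ** 2 for a in x))
    c = my - phi * mx
    return c, phi

cpu = [0.50, 0.55, 0.58, 0.61, 0.63]   # hypothetical utilization trace
c, phi = fit_ar1(cpu)
next_val = c + phi * cpu[-1]           # one-step-ahead forecast
print(round(next_val, 3))              # ~0.644
```

For multi-step forecasts the prediction is fed back in as the next input, which is where AR models degrade quickly and moving-average or integrated terms start to pay off.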
Article
In recent years, Internet of Things (IoT) services have been increasingly applied to promote the quality of human life, and this trend is predicted to continue into the future. With the recent advancements in IoT technology, fog computing is emerging as a distributed computing model to support IoT functionality. Since IoT services experience workload fluctuations over time, it is important to automatically provision the proper number of fog resources to address the workload changes of IoT services, avoiding the over- or under-provisioning problems while meeting the QoS requirements. In this paper, an efficient resource provisioning approach is presented. This approach is inspired by the autonomic computing model and uses a Bayesian learning technique to make decisions about increasing and decreasing the dynamically scaled fog resources to accommodate the workload from IoT services in the fog computing environment. We also design an autonomous resource provisioning framework based on the generic three-tier fog environment architecture. Finally, we validate the effectiveness of our solution under three workload traces. The simulation results indicate that the proposed solution reduces the total cost and delay violations, and increases fog node utilization compared with the other methods.
... Presenting service brokering algorithms [96][97], load balancing techniques [98][99]
EMUSim: Designing particular networks [100], emulation of cloud networks [101], evaluating the performance of SSLB [102]
CDOSim: Cloud deployment options [103], modeling provider migration [104]
TeachCloud: Presenting resource allocation strategies [105,65], MapReduce modeling simulation [106]
DartCSim: Configuration of tasks and network topology [52]
DartCSim+: Power-aware scheduling algorithms, management of transmission in network links [53]
ElasticSim: Evaluating the effectiveness of the EDA-NMS strategy [107] and the DDS strategy [108]
FederatedCloudSim: Evaluation of scheduling algorithms at the federation level [109], presenting SLA approaches for federated clouds [55]
FTCloudSim: Fault tolerance [110], implementing the task rescheduling method (TRM) [111], evaluating the effectiveness of a fault-tolerant virtual machine placement strategy [112]
WorkflowSim: Simulation of workflow management systems [113], verifying the performance of SSLB [102], implementation of workflow scheduling [114], evaluation of the PGA strategy [115] and the DBCSO algorithm [116]
CloudReports: Implementation of an energy-aware task scheduling algorithm [117], modeling the VM allocation algorithm [118], running several simulations at the same time [119]
CEPSim: Modelling complex event processing systems [120]
DynamicCloudSim: Fault tolerance [56], simulating service failure during execution [113]
CloudExp: Modelling the mobile cloud computing framework [121], evaluation of the DRPM system [122], providing a MapReduce system [123], simulating the cloud-based WBANs model [124] ...
Article
Nowadays, cloud computing is an emerging technology due to virtualization and the provision of low-price services on a pay-per-use basis. One of the main challenges in this regard is how to evaluate different models of cloud resource usage and the capability of cloud systems under QoS constraints. Experimentation in a real environment is very difficult and expensive. Therefore, many works pay much attention to designing cloud simulation frameworks that only cover a subset of the different components. Consequently, choosing the right tools requires a deep comparative analysis of the available simulators based on different features. In this paper, we present a comparative analysis of 33 tools for cloud environments. This review will enable readers to compare the prominent simulators in terms of the supported model, architecture, and high-level features. Subsequently, it provides recommendations regarding the choice of the most suitable tool for researchers, providers, and managers of cloud environments. Eight common simulators are appraised in a practical way based on various scenarios in order to evaluate their performance. Finally, we address the open issues and future challenges for further research.
... The authors define three possible states based on the CPU utilization of the infrastructure and three possible actions: expand, reduce, or maintain the infrastructure. The performance was evaluated with real workloads, and the approach was compared with three state-of-the-art strategies: Cost-aware-LRM [52] and Cost-aware-ARMA [53], both based on workload predictions, and DRPM [54], a multi-agent system to monitor and provision cloud resources. The results showed that this approach was able to reduce the total cost by 50% and increase resource utilization by 12%. ...
Preprint
Full-text available
Reinforcement Learning (RL) has demonstrated a great potential for automatically solving decision making problems in complex uncertain environments. Basically, RL proposes a computational approach that allows learning through interaction in an environment of stochastic behavior, with agents taking actions to maximize some cumulative short-term and long-term rewards. Some of the most impressive results have been shown in Game Theory where agents exhibited super-human performance in games like Go or Starcraft 2, which led to its adoption in many other domains including Cloud Computing. Particularly, workflow autoscaling exploits the Cloud elasticity to optimize the execution of workflows according to a given optimization criteria. This is a decision-making problem in which it is necessary to establish when and how to scale-up/down computational resources; and how to assign them to the upcoming processing workload. Such actions have to be taken considering some optimization criteria in the Cloud, a dynamic and uncertain environment. Motivated by this, many works apply RL to the autoscaling problem in Cloud. In this work we survey exhaustively those proposals from major venues, and uniformly compare them based on a set of proposed taxonomies. We also discuss open problems and provide a prospective of future research in the area.
... Comprehensive simulation is conducive to the creation of contextual content. The use of multimedia makes it easier and more convenient to obtain teaching resources for the tourism profession [1]. ...
Article
Full-text available
In recent years, with the rapid development of cloud computing, its massive storage capacity and computing power have brought new development opportunities to the security field. Traditional multimedia teaching platforms for the tourism profession also struggle to meet the current demand for massive video data storage; although some work has attempted deployments in the cloud environment, a versatile platform is still an industry challenge. This paper designs and implements a cloud-based video surveillance platform addressing the real-time, security, bandwidth-dependence, and high transmission-cost requirements of the multimedia teaching field. Cloud storage technology is used to handle the heterogeneity of video data, and the cloud is used to provide platform scalability. The H.264 video coding standard and the RTSP real-time video transmission protocol are then used to address bandwidth dependence and real-time delivery, and an embedded sensor network is proposed to carry out identity identification and centralized control separately. Dynamically binding the network to an IP via a fixed video transmission method indirectly overcomes the instability of dynamic IP addresses and makes full use of FTTH resources, reducing user cost.
... Therefore, it is imperative to monitor the QoS provided by the cloud provider to check whether the SLA is satisfied or not. Monitoring is required for different purposes such as resource provisioning [12] , scheduling [13][14][15] , security [16] , and re-encryption [17,18] . To detect any performance anomaly or to ensure that SLA requirements are achieved, continuous monitoring is essential [19] . ...
... The second approach is referred to as ''Cost-aware (ARMA)'' and is based on a second-order ARMA filter (Roy et al. 2011). The third approach is referred to as ''DRPM'' (Al-Ayyoub et al. 2015), a multi-agent-based strategy to monitor and provision dynamic resources. The fourth approach is referred to as ''MAPE-Q learning''. These approaches were chosen for comparison with our method because they follow the MAPE control loop and, in addition, are proactive, i.e., they predict the suitable amount of resources at any specific time to anticipate undesirable states. ...
Article
Full-text available
In cloud computing, resources can be provisioned dynamically, on demand, for cloud services. Cloud providers seek effective SLA execution mechanisms that avoid SLA violations by provisioning resources or applications and reacting in a timely manner to environmental changes and failures. Sufficient resource provisioning for a cloud's services relies on the requirements of the workloads in order to achieve high quality of service. Therefore, deciding the suitable amount of cloud resources for these services is one of the main tasks in cloud computing. During the runtime of services, the amount of resources can be specified and provisioned based on actual workload changes. Determining the correct amount of resources needed to run services on clouds is not an easy task, and it depends on the existing workloads of the services. Consequently, it is necessary to predict future workloads for dynamic resource provisioning in order to meet the changes in workloads and service demands in cloud computing environments. In this paper, we study the possibility of using a cognitive/intelligent approach for cloud resource provisioning that combines the autonomic computing concept, deep learning, and fuzzy logic control. Deep learning is the state of the art in machine learning and has achieved promising results in many other fields, such as image classification and speech recognition. For these reasons, deep learning is proposed in this work to tackle workload prediction in cloud computing. Additionally, we propose a fuzzy logic-based method to make decisions in the case of uncertain workload predictions. We study various existing works on autonomic cloud resource provisioning, show that there is still an opportunity to improve the current methods, and present the challenges that may exist in this domain.
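The fuzzy decision step described in the abstract above can be sketched roughly as follows; the triangular membership functions, utilization ranges, and action names are illustrative assumptions, not the paper's actual rules:

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def scaling_decision(predicted_util):
    """Map a predicted CPU utilization (%) to a scaling action via
    memberships in 'low', 'medium', 'high'; the strongest membership wins.
    All breakpoints are hypothetical."""
    memberships = {
        "scale_down": triangular(predicted_util, -1, 0, 40),
        "hold":       triangular(predicted_util, 30, 50, 70),
        "scale_up":   triangular(predicted_util, 60, 100, 101),
    }
    return max(memberships, key=memberships.get)

print(scaling_decision(82.0))
```

With an uncertain prediction of 82% utilization, the "high" membership dominates and the sketch decides to scale up.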
Chapter
Applications running on virtual servers may be moved to another virtual or physical server, or even back to the same server, to save power consumption in cloud data centres and maximize the use of computing resources. Optimizing resources depends directly on knowing when to move an application to another virtual machine, and efficiently tracking the use of computing resources is the surest way to optimize performance. Assessing the efficiency of virtual machines therefore requires an all-inclusive intelligent monitoring agent. This chapter provides an agent-based system for tracking resource use, namely CPU and memory, in virtual machines. The monitoring agent collects virtual machine resource utilization data and shows it on a dashboard that displays key performance indicators such as memory and CPU use. The cloud administrator can then use the dashboard's statistics report to optimize resources.
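A minimal sketch of such a monitoring agent and its dashboard report might look like the following; the class, method names, and sample values are hypothetical, and a real agent would pull utilization figures from the hypervisor API rather than receive them as arguments:

```python
import statistics
from dataclasses import dataclass, field

@dataclass
class MonitoringAgent:
    """Hypothetical agent that accumulates CPU/memory utilization samples
    of one VM and summarizes them for a dashboard."""
    samples: list = field(default_factory=list)

    def collect(self, cpu_pct: float, mem_pct: float) -> None:
        # In a real deployment these values would come from the hypervisor.
        self.samples.append((cpu_pct, mem_pct))

    def dashboard_report(self) -> dict:
        cpu = [c for c, _ in self.samples]
        mem = [m for _, m in self.samples]
        return {
            "cpu_avg": statistics.mean(cpu),
            "cpu_peak": max(cpu),
            "mem_avg": statistics.mean(mem),
            "mem_peak": max(mem),
        }

agent = MonitoringAgent()
for cpu, mem in [(20.0, 35.0), (55.0, 40.0), (80.0, 60.0)]:
    agent.collect(cpu, mem)
report = agent.dashboard_report()
print(report)
```

An administrator could compare `cpu_peak` against a migration threshold to decide when a VM is worth moving.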
Article
Today, in many areas of science and social life, machines, or so-called robots, are entrusted with tasks that previously could only be performed by humans; this is what led to the creation of artificial intelligence and continues to stimulate its development and improvement. Automated machines endowed with artificial intelligence are thus able to relieve people of routine activities in particular. Systems based on artificial intelligence are increasingly used in technology, for example, cars endowed with artificial intelligence or robots involved in production. That is, the purpose of creating artificial intelligence is primarily to improve human life. However, any system has its shortcomings and problems that need to be explored for further improvement and effective development. Scientists identify many problems in the field of artificial intelligence, and this list is not exhaustive; with the development of society, other debatable issues will arise. In my opinion, however, the central problem is the lack of an unambiguous consensus in scientific discourse on basic concepts such as "thinking", "consciousness", and "intelligence". In view of the above, there is an urgent need for a common understanding of these concepts, so that the legal and moral problems already mentioned in the field of artificial intelligence can be solved qualitatively in the future. A large number of domestic researchers are studying issues related to artificial intelligence and looking for ways to overcome, or at least reduce, the problems in this area. These include Karchevsky M. V., Nikolskny Yu. V., Pasichnyk V. V., Shcherbyna Yu. M., Stefanchuk R. O., Pozova D. D., Radutny O. E., and others.
Article
Deciding the correct extent of resources needed to run the various cloud services is always a challenge. In such dynamic environments, there is a tremendous need for accurate predictions and timely decision-making methodologies to estimate future demands at minimal cost. This motivates research on optimal dynamic resource provisioning that predicts the future resource requirements based on the application's type. This paper proposes a framework to provision resources in an optimal way by combining the concepts of autonomic computing, linear regression, and Bayesian learning. The use of Bayesian learning in the proposed model enables a proactive decision-making process and provides a solid theoretical framework for estimating future predictions using the available prior information. The autonomic resource provisioning framework is developed using the CloudSim toolkit, inspired by a cloud layer model. The efficacy of the proposed technique is evaluated using real-world workload traces from Google, followed by traces from ClarkNet. The model is evaluated for various parameters, namely response time, SLA violations, virtual machine usage hours, and cost. It is found that the proposed model lowers the overall cost by 31% while increasing resource usage by 12% compared with other existing approaches.
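The combination of linear regression and Bayesian learning described above could be sketched as follows: a least-squares trend gives a prior forecast, which is then blended with the latest observation via a conjugate Gaussian update. The variances, workload values, and this particular blend are illustrative assumptions, not the paper's exact model:

```python
def linear_trend(series):
    """Ordinary least-squares slope/intercept over t = 0..n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return slope, y_mean - slope * t_mean

def bayesian_blend(prior_mean, prior_var, obs, obs_var):
    """Conjugate Gaussian update: blend the trend-based prior with the
    latest observed demand, weighting each by its precision."""
    w = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + w * (obs - prior_mean)
    post_var = prior_var * obs_var / (prior_var + obs_var)
    return post_mean, post_var

# Hypothetical workload history (requests/s); forecast the next step.
history = [100, 110, 120, 130, 140]
slope, intercept = linear_trend(history)
prior = slope * len(history) + intercept       # the trend predicts 150
posterior, var = bayesian_blend(prior, 25.0, 144, 25.0)
print(round(posterior, 1))
```

With equal prior and observation variances, the posterior lands halfway between the trend forecast (150) and the noisy reading (144), and its variance shrinks, which is what makes the decision proactive rather than purely reactive.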
Article
The allocation of resources is a most demanding task in cloud computing, and scholars still find it difficult to allocate appropriate resources to sets of user tasks. Our objective is to provide a platform that optimizes a dynamic resource allocation scheme. The multi-agent deep reinforcement learning-based greedy adaptive firefly algorithm (MAD-GAF) proposed herein includes both resource management and allocation techniques. It efficiently chooses the best Quality of Service (QoS)-measured host for a group of tasks and subsequently minimizes the task execution time. The proposed cloud brokering architecture comprises a multi-agent system, the cloud provider, and the user. Initially, deep reinforcement learning is used to recreate the requests of cloud customers by forecasting the value of unused resources. The recreated customer request is then forwarded to the global broker agent, which maps the virtual machine (VM) to the most appropriate cluster of physical machines (PMs). The virtual machine monitor (VMM) selects VMs by managing and accessing the physical resources. The global utility agent allocates VMs using the GAF optimization algorithm, which identifies the best QoS-measured host to decrease the average response time of all tasks, thus optimizing resource allocation compared with current approaches.
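A bare-bones firefly optimizer, shown here as a stand-in for the paper's greedy adaptive (GAF) variant, moves dimmer fireflies toward brighter (lower-cost) ones. The swarm size, attraction parameters, and the sphere test function are all assumptions for illustration:

```python
import math
import random

def firefly_minimize(f, dim=2, n=20, iters=100, beta0=1.0, gamma=0.01, alpha=0.2):
    """Minimal firefly algorithm: each firefly is attracted to every
    brighter one, with attraction decaying with squared distance, plus a
    slowly cooled random walk. Parameters are illustrative."""
    random.seed(1)
    swarm = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    for _ in range(iters):
        cost = [f(x) for x in swarm]
        for i in range(n):
            for j in range(n):
                if cost[j] < cost[i]:          # firefly j is brighter
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    swarm[i] = [a + beta * (b - a) + alpha * (random.random() - 0.5)
                                for a, b in zip(swarm[i], swarm[j])]
                    cost[i] = f(swarm[i])
        alpha *= 0.97                           # cool the random walk
    best = min(swarm, key=f)
    return best, f(best)

# Minimize the sphere function as a toy cost (e.g., a QoS penalty).
best, val = firefly_minimize(lambda x: sum(v * v for v in x))
print(round(val, 3))
```

In a broker, the cost function would instead score a candidate host assignment by its expected response time and QoS penalty.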
Article
Statistics of recent years on attacks against information systems show both the growth of known attacks and the emergence of new attack models and directions. In this regard, the task of collecting information about events occurring in the information system and related to its main objects, and analyzing them effectively, is relevant. The main requirements for analysis tools are speed and the ability to adapt to new circumstances, i.e., adaptability. Tools that meet these requirements are artificial intelligence systems; in particular, a number of studies use neural networks as a means of analysis. There are different types of neural networks, which differ depending on the tasks to be solved and the input data they suit. The proposed multi-agent attack detection system collects and analyzes information about the events of the information system using two types of neural networks. A multilayer perceptron is used to analyze various logs of information system objects, while a Jordan network is used to analyze directly collected information about the events of information system objects. The use of a multi-agent attack detection system can increase the security of the information system. Features of modern attacks are considered, and the urgency of the task of detecting attacks is substantiated. The peculiarities of the attack process are considered, and the actions of attackers of different types at different stages of an attack are analyzed. It is shown which detection methods should be used at each stage of an attack. A model of a multi-agent attack detection system is proposed, together with an interpretation of the results of analyzing information system events and an algorithm for joint decision-making by agents based on several sources of information about their status.
A model of an attack detection system that takes these features into account is proposed. This system collects information at several levels of the information system and uses an artificial intelligence system to analyze it.
Article
Among the various deployment models, the hybrid cloud is the preferred model, allowing customers to maximize cost savings and performance by leveraging the quick provisioning capabilities of the public cloud. However, due to the vast diversity of cloud services, today's struggle for customers is to evaluate and discover the best-fit Cloud Service Provider (CSP) for efficient Virtual Machine (VM) provisioning in the various hybrid cloud contracts of a multi-cloud environment. Currently there is no framework that allows customers to evaluate CSPs based on application workload and application environment and to employ budget-optimized VM provisioning under dynamically varying factors while adhering to the Service Level Agreement (SLA). Hence, the proposed VM provisioning framework is a Multi-Criteria Decision-Making (MCDM) model whose objective is to provision VMs from the best-suited CSP in a hybrid cloud. The framework initially derives weights for each decision-making input criterion, taken as Quality of Service (QoS) Service Measurement Index (SMI) attributes, using the extent analysis method of fuzzy analytical processing (FHP). It ranks the CSPs by aggregating the decision makers' weights among the alternatives. The framework then proposes a budget-optimized algorithm called BOPVM to provision VMs in the hybrid cloud based on the ranks assigned to the CSPs. The framework is evaluated and implemented on real-time workloads among the CSPs Amazon, Azure, and Openstack. The evaluation results show significant provisioning cost savings of 50% and a 65% reduction in provisioning time compared with existing algorithms.
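The ranking step of such an MCDM framework can be illustrated with a plain weighted-sum aggregation, a deliberate simplification of the paper's fuzzy extent analysis; the criteria, weights, and normalized scores below are invented for the example:

```python
def rank_csps(scores, weights):
    """Weighted-sum MCDM ranking: scores[csp][criterion] in [0, 1],
    weights[criterion] sum to 1. Higher total means better rank."""
    totals = {csp: sum(weights[c] * v for c, v in crit.items())
              for csp, crit in scores.items()}
    return sorted(totals, key=totals.get, reverse=True)

# Hypothetical normalized QoS/SMI scores per provider.
weights = {"cost": 0.5, "performance": 0.3, "availability": 0.2}
scores = {
    "Amazon":    {"cost": 0.5, "performance": 0.9, "availability": 0.9},
    "Azure":     {"cost": 0.7, "performance": 0.8, "availability": 0.8},
    "Openstack": {"cost": 0.9, "performance": 0.6, "availability": 0.7},
}
print(rank_csps(scores, weights))
```

Because cost carries half the weight here, the cheapest provider tops the ranking even though its performance score is lowest; shifting the weights would reorder the list.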
Article
Full-text available
Reinforcement Learning (RL) has demonstrated a great potential for automatically solving decision-making problems in complex, uncertain environments. RL proposes a computational approach that allows learning through interaction in an environment with stochastic behavior, where agents take actions to maximize some cumulative short-term and long-term rewards. Some of the most impressive results have been shown in Game Theory where agents exhibited superhuman performance in games like Go or Starcraft 2, which led to its gradual adoption in many other domains, including Cloud Computing. Therefore, RL appears as a promising approach for Autoscaling in Cloud since it is possible to learn transparent (with no human intervention), dynamic (no static plans), and adaptable (constantly updated) resource management policies to execute applications. These are three important distinctive aspects to consider in comparison with other widely used autoscaling policies that are defined in an ad-hoc way or statically computed as in solutions based on meta-heuristics. Autoscaling exploits the Cloud elasticity to optimize the execution of applications according to given optimization criteria, which demands deciding when and how to scale up/down computational resources and how to assign them to the upcoming processing workload. Such actions have to be taken considering that the Cloud is a dynamic and uncertain environment. Motivated by this, many works apply RL to the autoscaling problem in the Cloud. In this work, we exhaustively survey those proposals from major venues, and uniformly compare them based on a set of proposed taxonomies. We also discuss open problems and prospective research in the area.
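A toy tabular Q-learning autoscaler illustrates the RL formulation these surveys discuss. The MDP (state = load level and VM count), the reward shape, and all hyperparameters are assumptions made for the sketch, not taken from any surveyed system:

```python
import random

def train_autoscaler(episodes=3000, alpha=0.2, gamma=0.9, eps=0.2):
    """Tabular Q-learning on a toy autoscaling MDP: state = (load, vms);
    actions remove/hold/add a VM; reward penalizes SLA violations
    (load > vms) heavily and idle capacity mildly. Illustrative only."""
    random.seed(0)
    actions = (-1, 0, 1)
    Q = {}
    q = lambda s, a: Q.get((s, a), 0.0)
    for _ in range(episodes):
        load = random.randint(1, 5)            # load is fixed per episode
        vms = random.randint(1, 6)
        for _ in range(10):
            s = (load, vms)
            if random.random() < eps:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda x: q(s, x))
            vms = min(6, max(1, vms + a))
            reward = -5.0 if load > vms else -0.5 * (vms - load)
            s2 = (load, vms)
            target = reward + gamma * max(q(s2, x) for x in actions)
            Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))
    return Q

Q = train_autoscaler()
policy = lambda s: max((-1, 0, 1), key=lambda a: Q.get((s, a), 0.0))
print(policy((5, 2)), policy((1, 6)))  # under- vs over-provisioned states
```

After training, the greedy policy should add VMs when under-provisioned and release them when idle, which is the transparent, adaptable behavior the survey attributes to RL-based autoscalers.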
Article
The study considers the development of methods for detecting anomalous network connections based on the hybridization of computational intelligence methods. An analysis of approaches to detecting anomalies and abuses in computer networks is presented, within which a classification of methods for detecting network attacks is proposed. The main results amount to the construction of multi-class models that increase the efficiency of the attack detection system and can be used to build systems for classifying network parameters during an attack. A model of an artificial immune system based on an evolutionary approach, an algorithm for genetic-competitive learning of the Kohonen network, and a method of hierarchical hybridization of binary classifiers were developed and applied to the detection of anomalous network connections. The architecture of a distributed network attack detection system was developed. The architecture is two-tier: the first level performs primary analysis of individual packets and network connections using signature analysis, while the second level processes aggregate network data streams using adaptive classifiers. Signature analysis of network traffic was performed based on the Aho-Corasick and Boyer-Moore algorithms, and improved analogues were implemented using OpenMP and CUDA technologies. The architecture is presented and the main points of operation of a network attack generator are shown. A system for generating network attacks was developed, consisting of two components: an asynchronous transparent proxy server for TCP sessions and a frontend interface for the attack generator. The results of the experiments confirmed that the functional and non-functional requirements, as well as the requirements for computational intelligence systems, are met by the developed attack detection system.
Article
Full-text available
Cloud computing provides a set of resources and services for customers on the Internet, on demand and based on a pay-as-you-go model. Cloud providers look to decrease costs and increase profits, so resource management and provisioning are very important for them. Automated scaling can be used to provide resources for user requests; auto-scaling can decrease the total operational costs for providers, although it has its own cost and time overheads. In this paper, a new solution is presented for resource provisioning in multi-layered cloud applications based on the MAPE-K loop. A weighted ensemble prediction model is proposed to estimate the resource utilization in each cloud layer, and the accuracy of each model together with a regularization technique is used to regulate the weights of the models in the proposed hybrid prediction model. Furthermore, a decision-tree-based algorithm is presented to analyze the status of the resources and make the scaling decision. In addition, we propose a resource allocation algorithm based on virtual machine priority and request deadline in order to allocate requests to suitable resources. The experimental results indicate that the proposed algorithm has the best performance among its counterparts.
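The error-weighted ensemble idea can be sketched as follows; the inverse-error weighting with a small regularization term is an assumption standing in for the paper's accuracy-and-regularization scheme, and the model predictions and errors are invented:

```python
def ensemble_predict(predictions, errors, reg=1e-3):
    """Combine forecasts by weighting each model inversely to its recent
    error; the regularization term keeps every weight strictly positive."""
    inv = [1.0 / (e + reg) for e in errors]
    total = sum(inv)
    weights = [w / total for w in inv]
    forecast = sum(w * p for w, p in zip(weights, predictions))
    return forecast, weights

preds = [70.0, 80.0, 90.0]   # e.g. outputs of three base predictors
errs = [2.0, 1.0, 4.0]       # their recent mean absolute errors
forecast, weights = ensemble_predict(preds, errs)
print(round(forecast, 2))
```

The most accurate base model (error 1.0) receives the largest weight, so the ensemble forecast leans toward its prediction of 80 while remaining a convex combination of all three.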
Thesis
Because of the digital revolution, also known as Industry 3.0, the boundaries between the physical and digital worlds are shrinking, giving life to more interconnected and smart factories. These factories allow employees, machines, processes, and products to interact in a way that provides a better organization of all the productive means, empowering the entire company to achieve higher levels of efficiency and productivity. These technologies are profoundly transforming our society, allowing everything to be customized in detail, reducing the costs of goods and services, and transforming workers' and jobs' conditions in terms of safety and security, among others. In that sense, Industry 3.0 acted as a catalyst that promoted new production mechanisms, which originated a new industrial revolution known as Industry 4.0. The concept of Industry 4.0 is used to designate the new generation of connected, robotic, and intelligent factories. Fundamentally, the vision of Industry 4.0 is to give smart capabilities to production and physical operations, creating a more holistic and better-connected ecosystem. One crucial aspect to consider regarding Industry 4.0 is the integrability and interoperability of the actors involved in manufacturing processes. It means that people, things, processes, and data have to be able not only to make decisions for themselves and carry out their work more autonomously (independence), but also to support the self-management of the whole factory (which requires promoting integrability and interoperability). This implies that the actors in the production processes should be able to negotiate autonomously in order to reach agreements linked to both individual and collective production goals.
In that sense, Industry 4.0 represents not only a new way to produce goods and services but also a crucial integration challenge for the actors involved in the manufacturing processes, who need connection, communication, coordination, cooperation, and collaboration (denoted as 5C) capabilities that allow them to comply with the vision of Industry 4.0. Principally, this thesis aims at empowering process management for Industry 4.0 by proposing a stack of five levels, denoted as 5C. The 5C stack levels represent a way to deal with integration and interoperability challenges so that they can be solved incrementally at each level. From this perspective, connection and communication issues must be solved first, as a step toward more elaborate organizational processes like coordination, cooperation, and collaboration. Mainly, the 5C denote the elements needed to allow autonomous integration and interoperability of actors in Industry 4.0. From this point of view, this thesis presents a first contribution oriented to the integration challenges of the Industry 4.0 context at the level of connection and communication. This solution is based on a multi-agent system in which the physical elements of the system are characterized virtually as agents. Notably, the use of multi-agent systems allows creating an intelligent environment endowed with autonomy, decentralization, self-organization, self-direction, standardized protocols, and other properties of multi-agent systems. Moreover, the proposed solution allows actors to extend their limited capabilities with services deployed through the Internet, in an attempt to automate, optimize, and, in more mature stages, transform any environment into a fully integrated, automated, and intelligent one.
Consequently, the proposed architecture is evaluated and compared to previous research in this field. Second, we address some integration challenges of Industry 4.0 at the level of coordination, cooperation, and collaboration. In this case, we design a framework for the autonomous integration of actors in Industry 4.0, allowing them to coordinate, cooperate, and collaborate autonomously. This framework uses technologies like the Internet of Everything, Everything mining, and autonomic computing. Next, we design autonomic cycles of data analytics tasks oriented to enable autonomous coordination in manufacturing processes. Fundamentally, these data analytics tasks create the knowledge bases needed in a production environment to support self-planning, self-management, self-supervision, self-healing, etc., in the manufacturing process. Finally, we implement an autonomous cycle of data analytics tasks for self-supervision, using several Everything-mining techniques over data sources corresponding to a real manufacturing process. It defines a self-value-driven supervisory system, according to the classification made by Xu et al. (2017), that can process and verify the functionalities and applicability of our framework in manufacturing processes. Moreover, the self-supervising system developed in this thesis project is compared to other research works.
Article
This study aims to (1) define the critical risk factors that influence the governance of enterprise internal control in an IoT environment, and (2) classify those risk factors and study their importance in such an environment. The study uses Gowin's Vee knowledge map as a research strategy to mitigate the limitations of qualitative research through a set of strict research procedures. In addition, the Delphi method is used to test and provide feedback to justify and revise the critical risk factors. Finally, 83 items were obtained and categorized into eight types of critical risk factors. To emphasize how the risk factors of enterprise internal control involve diverse stakeholders, the critical risk factors are further classified based on the three-layer DCM architecture for mapping with various perceptions. The results of this research can be used as a reference in managing risk factors in the IoT environment. In the new generation of IoT governance practice, the related factors can also be regarded as essential measurement items for enterprises conducting effective internal control and auditing.
Article
Full-text available
The recent years have witnessed significant interest in migrating different applications onto cloud platforms. In this context, one of the main challenges for cloud application providers is how to ensure high availability of the delivered applications while meeting users' QoS. Replication techniques are commonly applied to handle this issue efficiently. In the literature, depending on the granularity used, there are two major approaches to replication: replicating the service or replicating the underlying data. The latter is known as Data-oriented Replication (DoR), while the former is referred to as Service-oriented Replication (SoR). DoR is discussed extensively in the available literature and several surveys have already been published. However, SoR is still in its infancy and research studies are lacking. Hence, in this paper we present a comprehensive survey of SoR strategies in cloud computing. We propose a classification of existing works based on the research methods they use, and then carry out an in-depth study and analysis of these works. In addition, a tabular representation of all relevant features is presented to facilitate the comparison of SoR techniques and the proposal of new enhanced strategies.
Article
Full-text available
Providing a pool of various resources and services to customers on the Internet in exchange for money has made cloud computing one of the most popular technologies. Managing the provided resources and services at the lowest cost and maximum profit is a crucial issue for cloud providers. Thus, cloud providers auto-scale the computing resources according to users' requests in order to minimize operational costs. The time and cost required to scale computing resources up and down are therefore among the major limits of scaling, which makes this issue an important challenge in cloud computing. In this paper, a new approach is proposed based on the MAPE-K loop to auto-scale the resources for multilayered cloud applications. The K-nearest neighbor (K-NN) algorithm is used to analyze and label virtual machines, and statistical methods are used to make the scaling decision. In addition, a resource allocation algorithm is proposed to allocate requests to the resources. Simulation results revealed that the proposed approach reduces operational costs while improving resource utilization, response time, and profit.
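The K-NN labeling step could look roughly like this: each VM is described by its CPU and memory utilization and labeled by majority vote among its k nearest labeled neighbors. The features, labels, and value of k are illustrative, not the paper's configuration:

```python
import math
from collections import Counter

def knn_label(history, point, k=3):
    """history: [((cpu, mem), label), ...]; label a VM by majority vote
    among its k nearest labeled neighbors (Euclidean distance)."""
    dists = sorted(
        (math.dist(features, point), label) for features, label in history
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled observations: (cpu %, mem %) -> status.
history = [
    ((90, 85), "overloaded"), ((88, 80), "overloaded"),
    ((50, 55), "normal"),     ((45, 50), "normal"),
    ((10, 15), "underloaded"), ((12, 10), "underloaded"),
]
print(knn_label(history, (86, 82)))
```

A scaling controller could then map "overloaded" to adding capacity and "underloaded" to releasing it, with the statistical decision layer described in the abstract sitting on top.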
Article
With the increasing number of Internet of Things (IoT) devices, the volume and variety of data generated by these devices are growing rapidly. Cloud computing alone cannot process this data due to high latency and scalability constraints. In order to process this data in less time, fog computing has evolved as an extension of cloud computing. In a fog computing environment, a resource monitoring service plays a vital role in providing advanced services such as scheduling, scaling, and migration. Most research in fog computing has assumed that a resource monitoring service is already available, and conventional methods proposed for other distributed systems may not be suitable due to the unique features of a fog environment. To improve the overall performance of fog computing and to optimise resource usage, effective resource monitoring techniques are required. Hence, we propose a Support and Confidence Based (SCB) technique which optimises resource usage in the resource monitoring service. The performance of the proposed system is evaluated on a real-time traffic use case in a fog emulator with synthetic data. The experimental results obtained from the fog emulator show that the proposed technique consumes 19% fewer resources compared with the existing technique.
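Support and confidence, the two measures the SCB technique is named after, can be computed as in classic association-rule mining. The monitoring-interval transactions and the inference below (skip polling a metric that a high-confidence rule can predict) are illustrative assumptions about how such a technique might apply them:

```python
def support(transactions, items):
    """Fraction of transactions containing all of `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Conditional frequency of the consequent given the antecedent."""
    return support(transactions, antecedent + consequent) / support(transactions, antecedent)

# Each transaction: the metrics that changed together in one interval.
logs = [
    {"cpu", "mem"}, {"cpu", "mem"}, {"cpu"}, {"net"},
    {"cpu", "mem"}, {"mem"}, {"cpu", "mem"},
]
s = support(logs, ["cpu", "mem"])
c = confidence(logs, ["cpu"], ["mem"])
# If confidence is high, a change in 'mem' can often be inferred from a
# change in 'cpu', so 'mem' need not be polled every interval.
print(round(s, 2), round(c, 2))
```

Here the rule cpu → mem holds with confidence 0.8, so a monitor could sample memory less frequently and save resources, in the spirit of the reported 19% reduction.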
Article
The trend of the Internet of Everything is deepening, and the amount of data that needs to be processed in the network is growing. Edge cloud technology can process data at the edge of the network, lowering the burden on the data center. When the load of the edge cloud is large, more resources must be requested from the cloud service provider, and the resource billing granularity affects the cost; when the load is small, releasing idle node resources back to the cloud service provider can lower the service expenditure. To this end, an on-demand resource provisioning model based on service expenditure is proposed. The demand for resources needs to be estimated in advance, so a load estimation model based on the ARIMA model and a BP neural network is proposed; the model estimates the load from historical data and reduces the estimation error. Before releasing node resources, the user data on the node must be migrated to other working nodes to ensure that no user data are lost. When selecting the migration target, this paper considers three metrics: cluster load balancing, migration time, and migration cost, and proposes a data migration model based on load balancing. Experimental comparisons show that the proposed methods can effectively reduce service expenditure and keep the cluster load-balanced.
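A rough sketch of such a hybrid estimator follows, with a least-squares AR(2) model standing in for ARIMA and a single gradient-descent-trained bias standing in for the BP residual network; both simplifications, the load trace, and all constants are assumptions:

```python
def ar2_forecast(series):
    """Fit y_t = a*y_(t-1) + b*y_(t-2) by least squares (ARIMA stand-in),
    then learn a residual correction by gradient descent (BP stand-in)."""
    X = [(series[t - 1], series[t - 2]) for t in range(2, len(series))]
    y = series[2:]
    # Solve the 2x2 normal equations for a, b.
    s11 = sum(x1 * x1 for x1, _ in X)
    s12 = sum(x1 * x2 for x1, x2 in X)
    s22 = sum(x2 * x2 for _, x2 in X)
    r1 = sum(x1 * yt for (x1, _), yt in zip(X, y))
    r2 = sum(x2 * yt for (_, x2), yt in zip(X, y))
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    preds = [a * x1 + b * x2 for x1, x2 in X]
    # Residual corrector: one bias trained by gradient descent on the
    # squared residuals, a drastic simplification of the BP network.
    bias = 0.0
    for _ in range(200):
        grad = sum(2 * (bias - (yt - p)) for yt, p in zip(y, preds)) / len(y)
        bias -= 0.1 * grad
    return a * series[-1] + b * series[-2] + bias

load = [50, 52, 55, 59, 64, 70, 77, 85]  # hypothetical load history
forecast = ar2_forecast(load)
print(round(forecast, 1))
```

The linear part captures the trend while the learned bias absorbs the systematic residual, mirroring the paper's split between the ARIMA component and the neural corrector.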
Chapter
The rapid development of cloud computing technology has led to high levels of energy consumption. The central processing units (CPUs) and other resources of data centers often run at less than half of their capacity; if virtual machine workloads are consolidated onto a subset of servers and idle servers are switched to a low-power mode, the power consumption of the data center can be greatly reduced. Traditional research on virtual machine consolidation mainly sets a high-load threshold on the current host load or migrates periodically, while existing time-series prediction approaches suffer from low prediction accuracy. To address these problems, we consider the influence of a multi-order Markov model and of the CPU state at different times, and propose a new hybrid sequence K Markov model to forecast the host CPU load for the next period of time. Through large-scale experiments on the CloudSim simulation platform, the host load forecasting method proposed in this paper is compared with traditional load detection methods, verifying that the proposed model greatly reduces the number of virtual machine migrations and the data center's energy consumption, while keeping service level agreement (SLA) violations at an acceptable level.
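The hybrid "sequence K Markov model" itself is not reproduced here, but the underlying idea of k-order Markov load prediction can be sketched minimally: discretize CPU load into state bins, count transitions from each k-length history to the next state, and predict the most frequent successor. Bin count and fallback rule are illustrative assumptions.

```python
# Minimal sketch of k-order Markov CPU load prediction (not the paper's
# hybrid model): utilization in [0, 1] is binned into discrete states.
from collections import Counter, defaultdict

def discretize(load, bins=10):
    return min(int(load * bins), bins - 1)

def train(history, k=2, bins=10):
    states = [discretize(x, bins) for x in history]
    table = defaultdict(Counter)
    for i in range(len(states) - k):
        table[tuple(states[i:i + k])][states[i + k]] += 1
    return table

def predict(table, recent, k=2, bins=10):
    key = tuple(discretize(x, bins) for x in recent[-k:])
    nxt = table.get(key)
    if not nxt:
        return key[-1]                     # unseen history: persistence fallback
    return nxt.most_common(1)[0][0]        # most frequent successor state
```

A consolidation controller would compare the predicted state against its overload threshold before deciding whether to migrate VMs, rather than reacting only to the current sample.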
Conference Paper
Full-text available
Multi-agent systems have been applied to integrate heterogeneous information, to work on behalf of users and to provide decision support in challenging environments. As this technology continues to develop, both the number of agents that comprise a system and their usage of computing resources are likely to grow. Distributed agent platforms such as JADE (Java Agent Development Framework) provide a partial solution to such scalability issues. The emergence of infrastructure-as-a-service (IaaS) cloud computing environments, which allow the dynamic scaling of resources, provides new ways of harnessing the distributed nature of multi-agent systems and their development environments. In this paper we introduce Elastic-JADE, whose aim is to allow a local JADE platform to automatically scale up and down using Amazon EC2 resources when the local platform is heavily loaded. The paper describes the system architecture and a prototype, and comments on future directions this work may take.
Conference Paper
Full-text available
Significant achievements have been made in the automated allocation of cloud resources. However, the performance of applications may be poor during peak load periods unless their cloud resources are dynamically adjusted. Moreover, although cloud resources dedicated to different applications are virtually isolated, performance fluctuations do occur because of resource sharing and software or hardware failures (e.g. unstable virtual machines or power outages). In this paper, we propose a decentralized economic approach for dynamically adapting the cloud resources of various applications, so as to statistically meet their SLA performance and availability goals in the presence of varying loads or failures. In our approach, the dynamic economic fitness of a Web service determines whether it is replicated, migrated to another server, or deleted. The economic fitness of a Web service depends on its individual performance constraints, its load, and the utilization of the resources where it resides. Cascading performance objectives are dynamically calculated for individual tasks in the application workflow according to the user requirements. By fully implementing our framework, we experimentally show that our adaptive approach statistically meets the performance objectives under peak load periods or failures, as opposed to static resource settings.
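The paper's exact fitness function is not given in the abstract; the sketch below is a hypothetical illustration of the decision structure only — a fitness score combining SLA slack and host utilization, with thresholds triggering replicate, migrate, or delete. Every name and constant here is an assumption.

```python
# Hypothetical illustration of the decentralized economic idea: each service
# periodically evaluates its own fitness and acts locally.
def fitness(response_time, target_time, host_util):
    perf = target_time / response_time   # > 1 means meeting the SLA with slack
    return perf * (1.0 - host_util)      # penalize services on crowded hosts

def decide(response_time, target_time, host_util, load, hi=0.8, lo=0.1):
    f = fitness(response_time, target_time, host_util)
    if load > hi and f < 1.0:
        return "replicate"               # overloaded and unfit: add a copy
    if f < 0.5:
        return "migrate"                 # unfit here: try a less utilized server
    if load < lo:
        return "delete"                  # idle replica: release its resources
    return "keep"
```

Because each service decides from local measurements, no central controller is needed, which matches the decentralized character the abstract describes.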
Article
Full-text available
CLOUD COMPUTING, the long-held dream of computing as a utility, has the potential to transform a large part of the IT industry, making software even more attractive as a service and shaping the way IT hardware is designed and purchased. Developers with innovative ideas for new Internet services no longer require the large capital outlays in hardware to deploy their service or the human expense to operate it. They need not be concerned about overprovisioning for a service whose popularity does not meet their predictions, thus wasting costly resources, or underprovisioning for one that becomes wildly popular, thus missing potential customers and revenue. Moreover, companies with large batch-oriented tasks can get results as quickly as their programs can scale, since using 1,000 servers for one hour costs no more than using one server for 1,000.
Article
Full-text available
Aneka is a platform for deploying Clouds and developing applications on top of them. It provides a runtime environment and a set of APIs that allow developers to build .NET applications that leverage their computation on either public or private clouds. One of the key features of Aneka is its ability to support multiple programming models, which are ways of expressing the execution logic of applications by using specific abstractions. This is accomplished by creating a customizable and extensible service-oriented runtime environment represented by a collection of software containers connected together. By leveraging this architecture, advanced services including resource reservation, persistence, storage management, security, and performance monitoring have been implemented. On top of this infrastructure, different programming models can be plugged in to provide support for different scenarios, as demonstrated by engineering, life science, and industry applications.
Article
Cloud computing is an emerging and fast-growing computing paradigm that has gained great interest from both industry and academia. Consequently, many researchers are actively involved in cloud computing research projects. One major challenge facing cloud computing researchers is the lack of a comprehensive cloud computing experimental tool to use in their studies. This paper introduces CloudExp, a modeling and simulation environment for cloud computing. CloudExp can be used to evaluate a wide spectrum of cloud components such as processing elements, data centers, storage, networking, Service Level Agreement (SLA) constraints, web-based applications, Service Oriented Architecture (SOA), virtualization, management and automation, and Business Process Management (BPM). Moreover, CloudExp introduces the Rain workload generator which emulates real workloads in cloud environments. Also, the MapReduce processing model is integrated in CloudExp in order to handle the processing of big data problems.
Conference Paper
Originating from the fields of physics and economics, the term elasticity is nowadays heavily used in the context of cloud computing. In this context, elasticity is commonly understood as the ability of a system to automatically provision and de-provision computing resources on demand as workloads change. However, elasticity still lacks a precise definition, as well as representative metrics coupled with a benchmarking methodology, to enable comparability of systems. Existing definitions of elasticity are largely inconsistent and unspecific, leading to confusion in the use of the term and its differentiation from related terms such as scalability and efficiency; the proposed measurement methodologies do not provide means to quantify elasticity without mixing it with efficiency or scalability aspects. In this short paper, we propose a precise definition of elasticity and analyze its core properties and requirements, explicitly distinguishing it from related terms such as scalability, efficiency, and agility. Furthermore, we present a set of appropriate elasticity metrics and sketch a new elasticity-tailored benchmarking methodology addressing the special requirements on workload design and calibration.
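In the spirit of separating elasticity from efficiency, one common accounting compares demanded versus supplied resource units per time step and accumulates under- and over-provisioned unit-seconds separately. The sketch below illustrates that bookkeeping; the metric names and normalization are assumptions, not the paper's definitions.

```python
# Sketch of provisioning-accuracy accounting: under-provisioning (demand
# exceeds supply) and over-provisioning (supply exceeds demand) are summed
# separately so neither masks the other.
def provisioning_accuracy(demand, supply, dt=1.0):
    under = sum(max(d - s, 0) for d, s in zip(demand, supply)) * dt
    over = sum(max(s - d, 0) for d, s in zip(demand, supply)) * dt
    total = sum(demand) * dt or 1.0
    return {"under_units": under, "over_units": over,
            "under_ratio": under / total, "over_ratio": over / total}
```

A fully elastic system would drive both ratios toward zero; a merely efficient one could score well on over-provisioning while badly violating demand, which is exactly the conflation the paper argues against.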
Conference Paper
Cloud computing represents an opportunity for IT users to reduce costs and increase efficiency, providing an alternative way of using IT services. In this scenario, value-added services for dynamic and elastic provisioning play an important role by making it possible to obtain the resource configuration that best satisfies the application requirements. Agent technology provides asynchronous mechanisms that may represent the best choice for effective programming of the Cloud, given the unpredictable behaviour of the network. Cloud Agency is a collection of agent-based services for provisioning, monitoring and autonomic reconfiguration of Cloud resources at the infrastructure level, which go beyond the common offering of commercial providers and open Cloud technologies. In this work we describe how Cloud Agency extends the Open Cloud Computing Interface (OCCI [1]) standard proposal for the Cloud infrastructure level. We present the Cloud Agency design and the implementation of a RESTful to/from ACL gateway that allows communication between the Cloud world and the agents' one, remaining compliant with OCCI while extending its model and services.
Article
The rapid growth in demand for computational power driven by modern service applications, combined with the shift to the Cloud computing model, has led to the establishment of large-scale virtualized data centers. Such data centers consume enormous amounts of electrical energy, resulting in high operating costs and carbon dioxide emissions. Dynamic consolidation of virtual machines (VMs) using live migration and switching idle nodes to the sleep mode allows Cloud providers to optimize resource usage and reduce energy consumption. However, the obligation of providing high quality of service to customers leads to the necessity of dealing with the energy-performance trade-off, as aggressive consolidation may lead to performance degradation. Because of the variability of workloads experienced by modern applications, the VM placement should be optimized continuously in an online manner. To understand the implications of the online nature of the problem, we conduct a competitive analysis and prove competitive ratios of optimal online deterministic algorithms for the single VM migration and dynamic VM consolidation problems. Furthermore, we propose novel adaptive heuristics for dynamic consolidation of VMs based on an analysis of historical data from the resource usage by VMs. The proposed algorithms significantly reduce energy consumption, while ensuring a high level of adherence to the service level agreement. We validate the high efficiency of the proposed algorithms by extensive simulations using real-world workload traces from more than a thousand PlanetLab VMs. Copyright © 2011 John Wiley & Sons, Ltd.
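One adaptive overload-detection heuristic in this line of work derives the host utilization threshold from the Median Absolute Deviation (MAD) of historical CPU usage, so that more volatile hosts get a lower, safer threshold. A minimal sketch follows; the safety parameter `s` and the exact form are illustrative rather than a definitive reproduction of the paper's algorithms.

```python
# Sketch of a MAD-based adaptive overload threshold: volatile utilization
# history shrinks the threshold below the static 100% cap.
import statistics

def mad(xs):
    med = statistics.median(xs)
    return statistics.median(abs(x - med) for x in xs)

def overloaded(cpu_history, current_util, s=2.5):
    threshold = 1.0 - s * mad(cpu_history)
    return current_util >= threshold
```

A consolidation manager would call `overloaded` per host each control cycle and select VMs to migrate away from hosts that trip the threshold.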
Conference Paper
The Cloud computing paradigm contains many shared resources, such as infrastructures, data storage, various platforms and software. Resource monitoring involves collecting information about system resources to facilitate decision making by other components in a Cloud environment. It is the foundation of many major Cloud computing operations. In this paper, we extend the prevailing monitoring methods in Grid computing, namely the Pull model and the Push model, to the paradigm of Cloud computing. In Grid computing, we find that under certain conditions the Push model has high consistency but low efficiency, while the Pull model has low consistency but high efficiency. Based on the complementary properties of the two models, we propose a user-oriented resource monitoring model named Push&Pull (P&P) for Cloud computing, which employs both models and switches between them intelligently according to users' requirements and the status of the monitored resources. The experimental results show that the P&P model decreases updating costs and satisfies various users' requirements for consistency between monitoring components and monitored resources, compared to the original models.
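The switching rule in P&P is described only at a high level here; the sketch below is a hypothetical illustration of the idea — push when the monitored value is volatile and the user demands tight consistency, pull otherwise. The class name, thresholds, and volatility measure are all assumptions.

```python
# Illustrative sketch of a push/pull mode switch driven by observed
# volatility and the user's staleness tolerance (not the paper's code).
class PushPullMonitor:
    def __init__(self, user_tolerance, volatility_threshold=0.2):
        self.user_tolerance = user_tolerance          # staleness the user accepts
        self.volatility_threshold = volatility_threshold
        self.mode = "pull"
        self.history = []

    def observe(self, value):
        self.history.append(value)
        if len(self.history) >= 2:
            changes = [abs(b - a)
                       for a, b in zip(self.history[:-1], self.history[1:])]
            volatility = sum(changes) / len(changes)
            # Volatile resource + strict consistency requirement -> push.
            if volatility > self.volatility_threshold and self.user_tolerance < 0.5:
                self.mode = "push"
            else:
                self.mode = "pull"
        return self.mode
```

In push mode the producer sends updates on change; in pull mode the consumer polls on its own schedule, saving updates for stable resources, which mirrors the consistency/efficiency trade-off the abstract describes.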
Conference Paper
Infrastructure-as-a-Service (IaaS) cloud computing provides the ability to dynamically acquire additional computing resources, or release existing ones, on demand to adapt to dynamic application workloads. In this paper, we propose an extensible framework for on-demand cloud resource provisioning and adaptation. The core of the framework is a set of resource adaptation algorithms that are capable of making informed provisioning decisions to adapt to workload fluctuations. The framework is designed to manage multiple sets of resources acquired from different cloud providers, and to interact with different local resource managers. We have developed a fully functional web-service based prototype of this framework, and used it for performance evaluation of various resource adaptation algorithms under different realistic settings, e.g. when input data such as jobs' wall times are inaccurate. Extensive experiments have been conducted with both synthetic and real workload traces obtained from the Grid Workload Archives, more specifically the traces from the Large Hadron Collider Computing Grid. The results demonstrate the effectiveness and robustness of our proposed algorithms.
Conference Paper
Infrastructure-as-a-Service (IaaS) cloud computing offers new possibilities to scientific communities. One of the most significant is the ability to elastically provision and relinquish new resources in response to changes in demand. In our work, we develop a model of an “elastic site” that efficiently adapts services provided within a site, such as batch schedulers, storage archives, or Web services, to take advantage of elastically provisioned resources. We describe the system architecture along with the issues involved in elastic provisioning, such as security, privacy, and various logistical considerations. To avoid over- or under-provisioning the resources, we propose three different policies to efficiently schedule resource deployment based on demand. We have implemented a resource manager, built on the Nimbus toolkit, to dynamically and securely extend existing physical clusters into the cloud. Our elastic site manager interfaces directly with local resource managers, such as Torque. We have developed and evaluated policies for resource provisioning on a Nimbus-based cloud at the University of Chicago, another at Indiana University, and Amazon EC2. We demonstrate a dynamic and responsive elastic cluster, capable of responding effectively to a variety of job submission patterns. We also demonstrate that we can process jobs 10 times faster by expanding our cluster up to 150 EC2 nodes.
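The three scheduling policies are not enumerated in the abstract; as one hedged example of the general shape of a demand-driven policy, the sketch below launches one cloud node per batch of queued jobs up to a cap and terminates idle nodes when the queue drains. All parameter names and values are illustrative.

```python
# Sketch of a demand-driven elastic-site provisioning policy (hypothetical,
# not one of the paper's three policies verbatim).
def plan(queued_jobs, running_nodes, idle_nodes,
         jobs_per_node=4, max_nodes=150):
    # Desired node count: one node per jobs_per_node queued jobs, capped.
    wanted = min((queued_jobs + jobs_per_node - 1) // jobs_per_node, max_nodes)
    if wanted > running_nodes:
        return ("launch", wanted - running_nodes)
    if queued_jobs == 0 and idle_nodes > 0:
        return ("terminate", idle_nodes)     # queue drained: release idle nodes
    return ("hold", 0)
```

A site manager would run this each scheduling cycle against the local resource manager's queue length, then issue launch or terminate requests to the cloud provider.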
Conference Paper
In this paper we present the Coaster System. It is an automatically-deployed node provisioning (Pilot Job) system for grids, clouds, and ad-hoc desktop-computer networks supporting file staging, on-demand opportunistic multi-node allocation, remote logging, and remote monitoring. The Coaster System has previously been shown to work at scales of thousands of cores. It has been used since 2009 for applications in fields that include biochemistry, earth systems science, energy modeling, and neuroscience. The system has been used successfully on the Open Science Grid, TeraGrid, supercomputers (IBM Blue Gene/P, Cray XT and XE systems, and Sun Constellation), a number of smaller clusters, and three cloud infrastructures (BioNimbus, FutureGrid and Amazon EC2).
Article
Cloud computing is a recent advancement wherein IT infrastructure and applications are provided as ‘services’ to end-users under a usage-based payment model. It can leverage virtualized services even on the fly based on requirements (workload patterns and QoS) varying with time. The application services hosted under Cloud computing model have complex provisioning, composition, configuration, and deployment requirements. Evaluating the performance of Cloud provisioning policies, application workload models, and resources performance models in a repeatable manner under varying system and user configurations and requirements is difficult to achieve. To overcome this challenge, we propose CloudSim: an extensible simulation toolkit that enables modeling and simulation of Cloud computing systems and application provisioning environments. The CloudSim toolkit supports both system and behavior modeling of Cloud system components such as data centers, virtual machines (VMs) and resource provisioning policies. It implements generic application provisioning techniques that can be extended with ease and limited effort. Currently, it supports modeling and simulation of Cloud computing environments consisting of both single and inter-networked clouds (federation of clouds). Moreover, it exposes custom interfaces for implementing policies and provisioning techniques for allocation of VMs under inter-networked Cloud computing scenarios. Several researchers from organizations, such as HP Labs in U.S.A., are using CloudSim in their investigation on Cloud resource provisioning and energy-efficient management of data center resources. The usefulness of CloudSim is demonstrated by a case study involving dynamic provisioning of application services in the hybrid federated clouds environment. The result of this case study proves that the federated Cloud computing model significantly improves the application QoS requirements under fluctuating resource and service demand patterns.
Article
Scalability is said to be one of the major advantages brought by the cloud paradigm and, more specifically, the one that makes it different from an "advanced outsourcing" solution. However, some important issues remain pending before the dream of automated scaling for applications comes true. In this paper, the most notable initiatives towards whole-application scalability in cloud environments are presented. We review relevant efforts at the edge of the state of the art, providing an encompassing overview of the trends they each follow. We also highlight pending challenges that will likely be addressed in new research efforts and present an ideal scalable cloud system.