Conference Paper · PDF available

Cost-Aware Resource Management for Federated Clouds Using Resource Sharing Contracts

Authors:
  • Jinlai Xu
  • Balaji Palanisamy

Abstract

Cloud computing and its pay-as-you-go model continue to provide significant cost benefits and a seamless service delivery model for cloud consumers. The evolution of small-scale and large-scale geo-distributed datacenters operated and managed by individual cloud service providers raises new challenges in terms of effective global resource sharing and management of autonomously controlled individual datacenter resources. Earlier solutions for geo-distributed clouds focus primarily on achieving global efficiency in resource sharing, which results in significant inefficiencies in local resource allocation for individual datacenters and leads to unfairness in the revenue and profit earned. In this paper, we propose a new contracts-based resource sharing model for federated geo-distributed clouds that allows cloud service providers to establish resource sharing contracts with individual datacenters a priori for defined time intervals during a 24-hour period. Based on the established contracts, individual cloud service providers employ a cost-aware job scheduling and provisioning algorithm that enables tasks to complete while meeting their response time requirements. The proposed techniques are evaluated through extensive experiments using realistic workloads, and the results demonstrate the effectiveness, scalability, and resource sharing efficiency of the proposed model.
Cost-aware Resource Management
for Federated Clouds
Using Resource Sharing Contracts
Jinlai Xu, Balaji Palanisamy
School of Information Sciences
University of Pittsburgh
Cloud Computing
[Figure: cloud computing overview - applications (data analytics, apps, media, health care, IoT) running on cloud-managed CPU, storage, database, and network resources.]
Problems of previous “standalone” clouds
Resources available in a single datacenter are limited
One datacenter can cover most, but not all, peak workloads
Under-provisioning causes heavy penalties:
Lost revenue
Lost users
[Figure: three plots of demand versus fixed capacity over time (days), illustrating under-provisioning and the resulting lost revenue and lost users.]
Slide Credits: Berkeley RAD Lab
Problems of previous “standalone” clouds
Resources available in a single datacenter are limited
Dynamic electricity prices have both pros and cons:
the price fluctuates significantly across days (from the 10s to the 100s)
and within a single day (up to 6x)
High electricity prices cost datacenters a lot
So datacenters need to share resources with each other, both to increase the potential capacity and to decrease the cost (a rough cost model is sketched below).
Data from National Grid
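As a rough sketch of why these price fluctuations matter, the daily electricity cost of one datacenter can be written as the power drawn in each time step weighted by the time-varying price. This formulation is my own illustration, not a formula from the deck; the PUE and the load-to-power curve P(·) appear later in the simulation setup.

```latex
% Sketch (assumed formulation): daily electricity cost of one datacenter.
% u_t = server load in step t, P(.) = measured load-to-power curve,
% \pi_t = dynamic electricity price in step t, \Delta t = step length.
\[
  C_{\mathrm{elec}} \;=\; \mathrm{PUE} \times \sum_{t=1}^{T} P(u_t)\,\pi_t\,\Delta t
\]
```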
Resource sharing mechanisms
Virtual Geo-distributed Cluster
One Cloud Service Provider (CSP) manages several geo-distributed datacenters and makes them work together
Federated Cloud
Several CSPs come together to build a federation and share resources in the geo-distributed scenario
Previous solutions
Virtual geo-distributed cluster / Federated cloud
[Figure: datacenters in Seattle, Berkeley, Beijing, and London connected over a WAN; in the federated-cloud case, a broker mediates between the CSPs.]
The weaknesses of previous solutions
Global control
Either the information of all datacenters is aggregated at a centralized controller that allocates the resources,
or all user requests are submitted to a centralized broker that responds to them.
Global optimization
A globally optimal allocation is not necessarily optimal for each individual participant.
The profit of each datacenter is not guaranteed, which sacrifices fairness.
Contracts-based federated cloud
Resource sharing contract
Stipulates the rights and duties between the buyer and the seller (a data sketch follows the figure below):
Effective time
Price
Resource amount
Problem 1: How to build the resource sharing contracts?
Problem 2: How to appropriately schedule the jobs based on the contracts?
[Figure: four CSPs (CSP 1-4) pairwise linked by resource sharing contracts.]
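To make the contract fields concrete, here is a minimal data sketch of such a contract; the class and field names are illustrative assumptions, not taken from the paper's implementation.

```java
/** Minimal sketch of a resource sharing contract (illustrative field names). */
public class Contract {
    public final String buyerCsp;       // CSP that buys the resources
    public final String sellerCsp;      // CSP that sells the resources
    public final long   effectiveFrom;  // start of the effective time slot (seconds)
    public final long   effectiveTo;    // end of the effective time slot (seconds)
    public final String resourceType;   // resource type covered by the contract
    public final int    amount;         // resource amount (e.g. number of servers)
    public final double price;          // agreed price for the slot

    public Contract(String buyerCsp, String sellerCsp, long effectiveFrom,
                    long effectiveTo, String resourceType, int amount, double price) {
        this.buyerCsp = buyerCsp;
        this.sellerCsp = sellerCsp;
        this.effectiveFrom = effectiveFrom;
        this.effectiveTo = effectiveTo;
        this.resourceType = resourceType;
        this.amount = amount;
        this.price = price;
    }
}
```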
Problem 1: Contracts establishment
An auction mechanism fits the properties of the problem well:
both competition and cooperation need to be considered, and
the essence of an auction is to match demands and supplies in order to allocate resources.
Properties we desire in the auction mechanism:
Double auction: both buyers and sellers bid in the auction
Truthfulness: bidders tend to bid their true valuations
Budget balance: the auctioneer does not subsidize the auction
The McAfee mechanism satisfies all of the above criteria
Existing double auction designs | Truthfulness | Ex-post Budget Balance | Individual Rationality
Average                         | ✗            | ✓                      | ✓
VCG                             | ✓            | ✗                      | ✓
McAfee                          | ✓            | ✓                      | ✓
Proposed bidding strategies
The utility function of each CSP basically contains two parts:
the charges collected from the customers for running their tasks, and
the operating cost of the infrastructure.
Mixed strategy for a potential buyer:
short of resources: bid the charge value
otherwise: bid the operating cost
One strategy for a potential seller:
idle resources: bid the operating cost
Other strategies can be incorporated by extending the utility function (a sketch of these strategies follows).
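A minimal sketch of the buyer and seller strategies above, assuming flat per-unit charge and operating-cost rates; the class name, fields, and the shortage test are illustrative assumptions.

```java
/** Sketch of the proposed bidding strategies for one CSP (illustrative, linear model). */
public class BiddingStrategy {
    private final double chargePerUnit;        // charge collected per resource unit (assumed flat)
    private final double operatingCostPerUnit; // operating cost per resource unit (assumed flat)

    public BiddingStrategy(double chargePerUnit, double operatingCostPerUnit) {
        this.chargePerUnit = chargePerUnit;
        this.operatingCostPerUnit = operatingCostPerUnit;
    }

    /** Buyer side: bid the charge value when short of resources, otherwise the operating cost. */
    public double buyBid(int predictedDemand, int localCapacity) {
        boolean lackOfResources = predictedDemand > localCapacity;
        return lackOfResources ? chargePerUnit : operatingCostPerUnit;
    }

    /** Seller side: offer idle resources at the operating cost. */
    public double sellBid() {
        return operatingCostPerUnit;
    }
}
```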
Winning bids decision
CSPs bid according to the strategies above.
Order the sell bids in ascending order (s_1 <= s_2 <= ...) and the buy bids in descending order (b_1 >= b_2 >= ...).
Find the break-even index k = max{ i : b_i >= s_i }.
Calculate the intermediate price p_0 = (b_{k+1} + s_{k+1}) / 2.
If p_0 < s_k or p_0 > b_k:
the first k - 1 bids on each side win;
the clearing prices are b_k for buyers and s_k for sellers.
If s_k <= p_0 <= b_k:
the first k bids on each side win;
both clearing prices are equal to p_0.
(A code sketch follows the example.)
Worked example:
Sell bids (ascending): $10, $20, $40; buy bids (descending): $40, $20, $10 (from CSPs 1-3).
Break-even index: k = 2, since b_2 = $20 >= s_2 = $20 but b_3 = $10 < s_3 = $40.
Intermediate price: p_0 = (b_3 + s_3) / 2 = ($10 + $40) / 2 = $25.
Since p_0 = $25 > b_2 = $20, only the first k - 1 = 1 pair of bids wins.
Clearing prices: sell price = s_2 = $20, buy price = b_2 = $20.
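A compact sketch of this winner-determination rule for single-unit bids; the class name, the Outcome record, and the handling of the array ends are my assumptions, but the k, p_0, and clearing-price logic follows the slide.

```java
import java.util.Arrays;

/** Sketch of McAfee winner determination for single-unit bids (illustrative). */
public class McAfeeAuction {

    /** Number of winning bid pairs plus the buyer/seller clearing prices. */
    public record Outcome(int winners, double buyPrice, double sellPrice) {}

    public static Outcome run(double[] buyBids, double[] sellBids) {
        double[] b = buyBids.clone();
        double[] s = sellBids.clone();
        Arrays.sort(s);                                    // sell bids ascending: s_1 <= s_2 <= ...
        Arrays.sort(b);                                    // sort ascending, then reverse ...
        for (int i = 0, j = b.length - 1; i < j; i++, j--) {
            double tmp = b[i]; b[i] = b[j]; b[j] = tmp;    // ... so buy bids are descending
        }
        int n = Math.min(b.length, s.length);
        int k = 0;                                         // break-even index: last i with b_i >= s_i
        while (k < n && b[k] >= s[k]) k++;
        if (k == 0) return new Outcome(0, 0.0, 0.0);       // no profitable trade at all
        if (k < n) {
            double p0 = (b[k] + s[k]) / 2.0;               // intermediate price from the (k+1)-th bids
            if (p0 >= s[k - 1] && p0 <= b[k - 1]) {
                return new Outcome(k, p0, p0);             // first k pairs trade at p_0
            }
        }
        // p_0 out of range (or undefined): first k-1 pairs trade; buyers pay b_k, sellers get s_k.
        return new Outcome(k - 1, b[k - 1], s[k - 1]);
    }
}
```

Calling McAfeeAuction.run(new double[]{40, 20, 10}, new double[]{10, 20, 40}) reproduces the worked example: one winning pair with buy price $20 and sell price $20.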
Contracts Establishment Process
Two layers of loop:
for each time slot:
for each resource type:
CSPs bid;
winning-bids decision;
continue to the next resource type;
continue to the next time slot.
So for each time slot and each resource type there is one auction and one entry in the array of auction results.
The contract is built from the array of auction results, one by one (see the sketch after the figure below).
[Figure: the winning bids of the example auction (sell bids $10/$20/$40, buy bids $40/$20/$10 from CSPs 1-3) become a contract whose effective time is the auctioned time slot, whose resource type is the auctioned type, and whose sell and buy prices are both $20.]
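A structural sketch of the two-layer loop, reusing the McAfeeAuction sketch above. For simplicity every CSP submits both a buy and a sell bid here; under the deck's strategies a CSP would bid on one side depending on its resource state. All names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the contracts establishment loop (illustrative structure). */
public class ContractEstablishment {

    /** Minimal bidder view of a CSP (assumed interface). */
    public interface Bidder {
        double buyBid(int slot, String resourceType);
        double sellBid(int slot, String resourceType);
    }

    /** One auction result per (time slot, resource type) pair. */
    public record SlotResult(int slot, String type, int pairs,
                             double buyPrice, double sellPrice) {}

    public static List<SlotResult> establish(List<Bidder> csps, int numSlots,
                                             List<String> resourceTypes) {
        List<SlotResult> results = new ArrayList<>();
        for (int slot = 0; slot < numSlots; slot++) {        // outer loop: time slots
            final int s = slot;                              // capturable copy for the lambdas
            for (String type : resourceTypes) {              // inner loop: resource types
                double[] buy  = csps.stream().mapToDouble(c -> c.buyBid(s, type)).toArray();
                double[] sell = csps.stream().mapToDouble(c -> c.sellBid(s, type)).toArray();
                McAfeeAuction.Outcome out = McAfeeAuction.run(buy, sell);
                if (out.winners() > 0) {
                    results.add(new SlotResult(slot, type, out.winners(),
                                               out.buyPrice(), out.sellPrice()));
                }
            }
        }
        return results;                                      // contracts are built from these
    }
}
```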
Problem 2: Scheduling
Cost-aware mechanism (a sketch follows):
Sort the contracts by their unit buy price (price per unit of resource).
Separate the contracts into two sets:
lower-cost contracts (price below the local operating cost)
higher-cost contracts (price above the local operating cost)
Schedule the jobs:
fill the lower-cost contracts first,
then the local resources,
and use the higher-cost contracts only when both of the above are exhausted.
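A sketch of the resulting placement order; the Pool abstraction (treating the local datacenter and each contract as a capacity pool with a unit cost) is an illustrative assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sketch of the cost-aware placement order (illustrative names and structure). */
public class CostAwareScheduler {

    /** A pool of capacity with a unit cost: a contract or the local datacenter. */
    record Pool(String name, double unitCost, int capacity) {}

    /** Builds the ordered list of pools a job should fill, cheapest first. */
    static List<Pool> placementOrder(List<Pool> contracts, Pool local) {
        List<Pool> lower = new ArrayList<>();
        List<Pool> higher = new ArrayList<>();
        for (Pool c : contracts) {
            (c.unitCost() < local.unitCost() ? lower : higher).add(c);
        }
        // Sort each set by unit price so the cheapest capacity is consumed first.
        lower.sort(Comparator.comparingDouble(Pool::unitCost));
        higher.sort(Comparator.comparingDouble(Pool::unitCost));

        List<Pool> order = new ArrayList<>(lower);  // 1) lower-cost contracts
        order.add(local);                           // 2) local resources
        order.addAll(higher);                       // 3) higher-cost contracts, last resort
        return order;
    }
}
```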
Scheduling Illustration
[Figure: a job queue (head to tail) feeding a scheduler that places jobs onto servers over time, comparing First-In-First-Out (FIFO) against the cost-aware policy for the 11:00-12:00 slot. A contract from CSP1 to CSP2 (time 11:00-12:00, resource type 2, 2 servers, price $20, i.e. $10 per server) is shown next to local servers costing $5/server; because the contract's unit price exceeds the local operating cost, the cost-aware scheduler places jobs on local servers first.]
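Plugging the figure's numbers into the CostAwareScheduler sketch above reproduces the illustrated decision; the values come from the slide, the code structure is mine.

```java
import java.util.List;

public class IllustrationExample {
    public static void main(String[] args) {
        // Contract from the figure: 2 servers at $20 total, i.e. $10 per server.
        var contract = new CostAwareScheduler.Pool("CSP1->CSP2 11:00-12:00", 10.0, 2);
        // Local servers cost $5/server, so the contract counts as "higher-cost".
        var local = new CostAwareScheduler.Pool("local", 5.0, 600);
        List<CostAwareScheduler.Pool> order =
                CostAwareScheduler.placementOrder(List.of(contract), local);
        // Prints local first, then the contract (used only when local is exhausted).
        order.forEach(p -> System.out.println(p.name() + " @ $" + p.unitCost() + "/server"));
    }
}
```

The printed order is the local pool ($5.0/server) before the contract ($10.0/server), matching the slide.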
Simulation setup - CSPs and datacenters
Locations from AWS EC2's data
Default settings:
# CSPs (each with one datacenter) | 25
# servers per DC                  | 600
PUE                               | 1.2
Max response time (s)             | 600
Simulation setup - server
IBM server x3550
CPU: 2 x Xeon X5675 (3067 MHz, 6 cores); Memory: 16 GB

Load (%)     | 0    | 10 | 20  | 30  | 40  | 50  | 60  | 70  | 80  | 90  | 100
Power (Watt) | 58.4 | 98 | 109 | 118 | 128 | 140 | 153 | 170 | 189 | 205 | 222

Picture and spec from IBM
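A simulator can turn this table into a continuous power model; the sketch below assumes linear interpolation between the measured points, which the deck does not state explicitly.

```java
/** Sketch: server power draw from the measured load/power table (linear interpolation assumed). */
public class ServerPowerModel {
    // Measured power (Watt) at 0%, 10%, ..., 100% load for the IBM x3550.
    private static final double[] POWER =
            {58.4, 98, 109, 118, 128, 140, 153, 170, 189, 205, 222};

    /** @param load CPU load in [0, 1]; returns the estimated power in Watt. */
    public static double powerAt(double load) {
        double scaled = Math.min(Math.max(load, 0.0), 1.0) * 10.0;
        int lo = (int) Math.floor(scaled);           // lower measured point
        int hi = Math.min(lo + 1, 10);               // upper measured point
        double frac = scaled - lo;                   // position between the two points
        return POWER[lo] + frac * (POWER[hi] - POWER[lo]);
    }
}
```

For example, powerAt(0.05) returns 78.2 W, halfway between the 0% and 10% measurements.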
Simulation setup - dataset
Real workload trace from Google
Replay the trace for every task (a record sketch follows):
start time
length
resource requirements (CPU, memory)
Dynamic electricity price dataset from National Grid
Randomly choose one day's electricity price array for each datacenter
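Each replayed task can be captured by a small record holding exactly the fields listed above; the names are illustrative, not the trace's schema.

```java
/** Sketch: one task replayed from the Google trace (illustrative field names). */
public record Task(long startTimeSec, long lengthSec, double cpu, double memoryGb) {

    /** True if the task is active at the given simulation time. */
    public boolean activeAt(long timeSec) {
        return timeSec >= startTimeSec && timeSec < startTimeSec + lengthSec;
    }
}
```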
Simulation setup - compared mechanisms
NF: each datacenter runs alone, without federation
ConBLF: Contracts-Based, Local resource First
ConBCA: Contracts-Based, Cost-Aware
RT: real-time complete cooperation through a broker
Experiment Result - impact of # of servers
The electricity cost is reduced by the contracts-based mechanisms.
The success rate of the contracts-based mechanisms lies between RT and NF.
[Figure: normalized electricity cost per successful task and success rate versus # servers per datacenter (200-1000), comparing NF, ConBLF, ConBCA, and RT.]
Experiment Result - impact of # of CSPs
The results are similar to those for the number of servers per datacenter.
[Figure: normalized electricity cost per successful task and average utilization versus # of CSPs (10-50), comparing NF, ConBLF, ConBCA, and RT.]
Experiment Result - impact of prediction error
Prediction errors do not significantly influence the electricity cost or the success rate;
the performance of the contracts-based mechanisms is robust to prediction error.
[Figure: normalized electricity cost per successful task and success rate versus prediction error (1-5%), comparing NF, LFConB, CAConB, and RT.]
Experiment result - fairness
RT shows a large variance in per-CSP profit,
while all contracts-based mechanisms show a low variance.
7 of the 25 CSPs lose profit after joining the federated cloud under the RT mechanism;
the contracts-based mechanisms perform better.
[Figure: normalized profit of each of the 25 CSPs under NF, LFConB, CAConB, ConAConB, ConACAConB, and RT.]
Conclusion
We propose a contracts-based mechanism for resource sharing between CSPs (cloud service providers) in a federated cloud.
We develop an auction-based mechanism for contract establishment and a suite of contracts-aware, cost-aware scheduling techniques that maximize the local profits of the CSPs while servicing the individual job requirements.
We evaluate the performance of the proposed approach through a trace-driven simulation study with realistic workload traces and electricity pricing.
The contracts-based solution achieves good performance and performs significantly better than the traditional model, especially in terms of fairness.
Thanks
Backup slides
Jinlai Xu, Balaji Palanisamy
School of Information Sciences
University of Pittsburgh
System model
Cloud Service Provider
Resource management: provisioning and scheduling.
Contract management: manages the contracts and observes their statuses.
Workload estimation: predicts the workload and helps establish the contracts.
(A component sketch follows the figure below.)
[Figure: CSP architecture. A user layer submits tasks and provision requests to a service-delivery layer; a virtual layer hosts the Cloud OS with the provisioner & scheduler, contracts manager, contractor, resource manager, and workload estimator; a physical layer contains the servers, routers & switches, and subnets of the infrastructure. Each contract carries an effective time, a resource amount, and a price.]
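A sketch of the three CSP-side components as interfaces, reusing the Task and Contract sketches above; all signatures are illustrative assumptions, not the paper's API.

```java
/** Sketch of the CSP-side components named in the system model (illustrative signatures). */
public class CspComponents {

    /** Resource management: provisioning and scheduling of tasks. */
    interface ResourceManager {
        void provisionAndSchedule(Task task);
    }

    /** Contract management: keep the established contracts and observe their statuses. */
    interface ContractsManager {
        void add(Contract contract);
        java.util.List<Contract> activeAt(long timeSec);
    }

    /** Workload estimation: predict demand to drive the bidding for contracts. */
    interface WorkloadEstimator {
        double predictedDemand(int slot, String resourceType);
    }
}
```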
System model
Federation Coordinator
Consulting: matches the demands and supplies of the CSPs.
Contract building: establishes the resource sharing contracts based on the consulting result.
[Figure: the coordinator's consulting and contract-building components mediate demand & supply and the resulting contracts between CSP 1 and CSP 2.]
Resource type: dedicated cloud
An emerging IaaS service:
IBM Bluemix Dedicated
AWS EC2 Dedicated Instances
Physically isolated hardware:
firm performance guarantees
more configurability
more security
Proof of truthfulness for the McAfee mechanism
If p_0 < s_k or p_0 > b_k:
The first k - 1 buyers and sellers have no incentive to change their declarations, since this would have no effect on their prices.
The k-th buyer and seller have no incentive to change, since they do not trade anyway, and if they did enter the trade (e.g. the k-th buyer raised his declaration above b_{k-1}), their profit from trading would be negative.
If s_k <= p_0 <= b_k:
The first k buyers and sellers have no incentive to change their declarations, since this would have no effect on their prices.
The (k + 1)-th buyer and seller have no incentive to change, since they do not trade anyway, and if they did enter the trade (e.g. the (k + 1)-th buyer raised his declaration above b_k), their profit from trading would be negative.
Algorithms pseudocode
[Slide: pseudocode listings of the proposed algorithms.]
Simulator
Implemented in Java:
5000+ lines of code
80+ classes
Experiment Result - impact of # of servers
The average server utilization of the contracts-based mechanisms is better than that of RT and NF.
[Figure: average server utilization versus # servers per datacenter (200-1000), comparing NF, ConBLF, ConBCA, and RT.]
Experiment Result - impact of # of CSPs
The results are similar to those for the number of servers per datacenter.
[Figure: success rate versus # of CSPs (10-50), comparing NF, LFConB, CAConB, and RT.]
Experiment Result - impact of prediction error
Prediction errors do not significantly influence the server utilization.
[Figure: average server utilization versus prediction error (1-5%), comparing NF, LFConB, CAConB, and RT.]
Experiment result - impact of contract interval
The electricity cost increases slightly (by 2% to 5%) as the contract interval (the length of one time slot) increases.
[Figure: normalized electricity cost per successful task and success rate versus contract interval (1200-7200 s), comparing NF, LFConB, CAConB, and RT.]
Experiment result - impact of contract interval
The contract interval length does not significantly influence the server utilization.
[Figure: average server utilization versus contract interval (1200-7200 s), comparing NF, LFConB, CAConB, and RT.]

Supplementary resource (1)

... In Habibi et al. [17], similar to [16], cloud providers send their requests to the federated broker, which then distributes the received requests among federation members based on different objective functions. Xu and Palanisamy [18] proposed a contracts-based resource sharing model for geo-distributed federated clouds. The proposed model permits service providers to settle resource sharing contracts and apply cost-aware job scheduling over pre-defined time intervals during a 24-hour day. ...
... Nevertheless, Bayesian Ridge Regression can be used when a long history of workload is available to train a robust model. The proposed model is compared with Samaan et al. [13], Xu et al. [18] and Rebai et al. [33] ...
Article
Cloud federation helps cloud providers to scale up by renting resources from other providers when the workload increases. Moreover, cloud providers with idle or underutilized resources can sell their resources to others and earn revenue inside a cloud federation. One of the most critical problems in the federated cloud environment is the management of resources and cloud providers. Game theory seems to be an excellent way to model cloud federation. This paper introduces a new model for resource management between cloud providers in a centralized federated cloud environment, based on the well-known Cournot and Bertrand games. To resolve the problem of heterogeneous resources, a physical resource unit which has specific computational features and can be shared between cloud providers is introduced in this paper. Besides, by introducing a new revenue-sharing approach between providers, this paper increases the collaboration of different providers. This model is implemented with the federated CloudSim tool, and experiments show that the Cournot model outperforms others in terms of overall benefit and responsiveness. In other words, the Cournot model can respond to an acceptable number of requests while making more profit. The reviews also show that the proposed model works better than other methods in terms of time.
... In order to solve this problem, many experts have conducted research. In the virtual machine lease stage, to deal with the problem that early distributed clouds pay too much attention to the efficiency of global resource sharing, which leads to low efficiency of local resource allocation in individual data centers, Xu et al. [2] proposed a new contract-based distributed federated resource sharing model. Based on a cost-aware resource allocation and task scheduling algorithm to serve user tasks, their method improved the economic benefits of cloud service providers. ...
Article
Full-text available
Cloud computing is a computing service provided on demand through the Internet, which can provide sufficient computing resources for applications such as big data and the Internet of Things. Before using cloud computing resources, users need to lease the corresponding virtual machines in the data centers, and then submit the tasks to the leased virtual machines for execution. In these two stages, how to choose the optimal data center for resource deployment and task submission is particularly important for cloud users. To tackle the problem that it is difficult to select the optimal data center to lease and use virtual machines in distributed data centers, we proposed data center selection algorithms based on deep reinforcement learning. In the stage of virtual machine lease, aiming to minimize user costs, we considered both the cost of computing resources and the cost of network communication. Then, we used a deep reinforcement learning algorithm to obtain the shortest communication path between the users and the data centers, and to solve the problem of network communication costs, which are difficult to calculate. We achieved an optimal selection of data centers and effectively reduced the overall user cost. In the stage of virtual machine use, to improve quality of service for the users, we use a deep reinforcement learning algorithm to obtain an optimal task scheduling strategy. This effectively solved the problem that user tasks cannot be effectively scheduled due to dynamic changes in the type and size of user tasks and in the state of the virtual machines. Moreover, the proposed scheme reduces the overall task completion time.
... However, none of them consider the ability of keeping edge resources running across multi-drone deployments in a single pool, such that these resources can be holistically managed and controlled from a single federated plane, applications can be deployed dynamically across the resources, and vendor lock-in situations can be eliminated. Existing resource federation models focus on resource sharing approaches [45]-[47], where resources are outsourced through contracts. These models are not suitable for edge workloads that require agile forms of edge deployments, which should support diversity, data locality, and the ability to keep resources in multi-deployments in synchronization or in a single pool. ...
Preprint
Existing research on edge computing has proposed several edge deployment types, such as unmanned aerial vehicles (UAV)-enabled edge computing, telecommunication base stations endowed with edge clusters, fog nodes, cloudlets, edge storage devices, etc. However, none of them consider the ability of keeping edge resources running across various edge deployments in a single pool, such that these resources can be holistically managed and controlled from a single federated plane, also to eliminate vendor lock-in situations. Moreover, as modern applications are becoming more complex, relatively few research considers the different characteristics of such applications, like dependencies, class, constraints, etc. This research extends the state-of-the-art by providing an intelligent multi-task orchestration solution in a federated edge computing system, considering both task dependencies and heterogeneous resource demands at the same time. To achieve this, we propose a Multi-Task Dispatching policy called Closest, to select the closest edge cluster or deployment having congruent resource availability to execute ready tasks at a given time. At the cluster, we propose a variant Bin-Packing optimization approach through Gang-Scheduling of multi-dependent tasks that co-schedules and co-locates multi-task tightly on nodes to fully utilize available resources. Extensive simulations on real-world data-trace from the recent Alibaba cluster trace, with information on task dependencies (about 12, 207, 703 dependencies) and resource demands, show the effectiveness, faster executions, and resource efficiency of our approach compared to the state-of-the-art approaches.
... A contract-based resource sharing model among federated geo-distributed clouds is designed in [4]. Here, cloud providers can be both buyers and sellers, based on their resource availability. ...
Article
Full-text available
Federated Cloud is a next-generation cloud computing model which works towards a new dynamic scenario in smart ecosystems. It consolidates individual clouds into a single large cloud to satisfy all on-demand requirements of the users. The federated cloud currently faces many challenges, including monitoring of individual cloud services, maintaining service level agreements, fair load distribution, and power usage. Hence, to address these challenges, this paper presents a brokerage model, "Optimized Bit Matrix based Power Aware Load Distribution Policy for Federated Cloud (OBMPLP)". The presented model includes two novel concepts: the Bit Matrix and the Load Distribution Factor. The Bit Matrix is constructed by individual cloud members and represents the resource availability status and environment for executing the user's request. The Load Distribution Factor represents the load distribution level at each individual cloud. The OBMPLP policy dispenses the incoming requests among multiple clouds by analyzing the bit patterns and the load distribution factor, which not only performs fair load distribution among multiple clouds but also improves response time, reduces power consumption at each cloud, and achieves better quality of service. The performance of the proposed policy is evaluated, and the results demonstrate a reduction in response time, optimized power consumption, and fair distribution of load compared to existing approaches.
... For our experiments, we consider that 16 cloud providers are participating, each with a capacity of 8192 VM instances. Similar sizes for the cloud federation were considered in [36], [37]. We consider six different application program sizes ranging from 256 to 8192 jobs. ...
Article
Full-text available
Cloud providers can form cloud federations by pooling their resources together to balance their loads, reduce their costs, and manage demand spikes. However, forming cloud federations is a challenging problem, especially when considering the incentives of the cloud providers making their own decisions to participate in cloud federations. We model the formation of cloud federations necessary to provide resources to execute Map-heavy/Reduce-heavy programs while considering the trust and reputation among the participating cloud providers. The objective is to form cloud federations with highly reputable cloud providers that achieve maximum profit for their participation. We introduce a coalitional graph game to model the cooperation among cloud providers. We design a mechanism for cloud federation formation that enables the cloud providers with high reputation to organize into federations reducing their costs. Our proposed mechanism guarantees the highest profits for the participating cloud providers in the federations, and ensures high reliability of the formed federations in executing the applications. We perform extensive experiments to characterize the properties of the proposed mechanism. The results show that our proposed mechanism produces Pareto optimal and stable cloud federations that not only guarantee that the participating cloud providers have high reputation, but also high individual profits.
... Localized fog computing, which is widely distributed and at the edge of the network, provides communication with lower latency and more context awareness compared with centralized cloud computing [14] [16]. Meanwhile, although cloud computing technology, applications, and service modes are relatively mature, research on cloud computing resource allocation and management continues [17] [18]. Although the effectiveness, scalability, and resource sharing efficiency of the proposed resource allocation strategies for clouds and fog have been proven, such models are still not flexible enough [19] [20]. ...
Article
Full-text available
The diverse applications and high-quality services in smart cities have led to geographical unbalance of computation requirements. Traditional centralized cloud computing services and massive migration of computing tasks result in the increase of network delay and the aggravation of network congestion. Deploying fog nodes at the network edge has become an effective way to improve the quality of service (QoS). However, the dynamic requirements and application in various scenarios still challenge the network, resulting in geographical unbalance of computing resource demands. Nowadays, computing resources of on-board computers and devices in the Internet of Vehicles (IoV) are abundant enough to mitigate the geographical unbalance in computing power demand. Efficient usage of the natural mobility of constantly moving vehicles to solve the problems above remains an urgent need. In this paper, vehicle mobility-based geographical migration model of the vehicular computing resource is established for the fog computing-enabled smart cities. The Vehicle as a Service framework takes full advantage of the unbalance and randomness of vehicular computing resource and improves the flexibility of traditional cloud computing architecture. An incentive scheme that affects the vehicle path selection through resource pricing is proposed to balance the resource requirements and to geographically allocate computing resources. Simulation results indicate that the advantages and efficiency of the proposed scheme are significant.
... The scheduler aims to reduce the workflow makespan and works in different cloud environments [35]. Jinlai Xu et al. implemented a framework for resource sharing that maintains resource sharing contracts with the data centers of geo-distributed clouds for defined time intervals [36]. Viviane T. Nascimento et al. proposed a model to show customers' contract data and least energy costs, allocating the services based on search items for better usage of energy [37]. ...
Article
Full-text available
Resource scheduling is a tricky task in a cloud environment. QoS is the main parameter for resource scheduling from the user's perspective, while profit is a very important parameter from the point of view of the cloud provider. The cloud service platform controls the revenue under particular market needs. Consumers get puzzled when choosing among many cloud suppliers for storing their data because of the suppliers' varying pricing schemes. In particular, many recent studies have paid attention to shaping the bond between server-side system facts and performance experience in order to reduce resource wastage. The main aim of the cloud supplier is to provide utmost resource usage and profit, while also decreasing energy and cost. The user wants higher throughput and lower response time. Allocating proper resources with the least overhead and full resource utilization is the objective of the cloud. Service requests are generated by various users in the cloud. Hence, proper scheduling of resources is required for better system performance and lower operating cost.
Chapter
Artificial Intelligence for IT operations (AIOps) is an emerging research area for public cloud systems. The research topics of AIOps have been expanding from robust and reliable systems to cloud resource allocation in general. In this paper we propose a resource sharing scheme between cloud users to minimize resource utilization while guaranteeing the Quality of Experience (QoE) of the users. We utilise the recently emerged concept of Artificial Swarm Intelligence (ASI) for resource sharing between users, by using Artificial-Intelligence-based agents to mimic human user behaviours. In addition, as real-time resource utilisation varies, the swarm of agents share their spare resources with each other according to their needs and their Personality Traits (PT). In this paper, we first propose and implement an Evolutionary Multi-robots Personality (EMP) model, which considers the constraints from the environment (resource usage states of the agents) and the evolution of two agents' PT at each sharing step. We then implement a Single Evolution Multi-robots Personality (SEMP) model, which only evolves an agent's PT and neglects the resource usage states. For benchmarking we also implement a Nash Bargaining Solution Sharing (NBSS) model, which uses game theory but does not involve PT or risks of usage states. The objective of our proposed models is to make all the agents get sufficient resources while reducing the total amount of excessive resources. The results show that our EMP model performs best, with the fewest iteration steps to convergence and the best resource savings.
Article
Due to the problems that exist in non-geographic federated clouds, the geographic ones are considered. Nevertheless, the approaches that have already been proposed to allocate resources across geographically federated clouds have two basic problems that we address in this article: (1) improper distribution of user requests, which increases the file transfer volume and cost as well as the response time to user requests; (2) lack of appropriate resource sharing among requests due to (a) the use of a centralized DC and (b) the satisfaction of a single objective, where case (a) suffers the problem of a single point of failure and case (b) raises an obstacle for situations that need to consider multiple conflicting objectives. Concerning problem (1), since federal DCs are distributed globally in geographic clouds, the cost of file transfer between DCs is more significant in these clouds than in concentrated ones. Since there has been no work in this field in geo-distributed federated clouds, we present a new scheduling mechanism based on hypervolume for the distribution of applications that increases service quality and reduces file transfer cost. Concerning problem (2), the previous solutions in geographic federated clouds have focused on centralized resource sharing with a single objective (increasing the cloud service provider (CSP) profit). These solutions not only consider just the CSP's profitability but, because of the possibility of failure of the central resource-sharing broker, also suffer from a single point of failure. In this paper, we propose a new, autonomic, peer-to-peer, multi-objective resource sharing approach that considers four objectives: (1) enhancing the CSP's profit, (2) decreasing the network latency, (3) decreasing file transfer traffic, and (4) increasing fairness in CSPs' profit. The techniques presented in this paper are evaluated by extensive experiments using real workloads. To validate the proposed method, we have extended the CloudSim tool. The results of our experiments show an increase of performance in the scheduling and resource-sharing objectives, among which the main objectives of average rate of success, profit, and execution time were enhanced by 8.5%, 15.47%, and 25.84%, respectively, compared with previous studies.
Article
Fog computing as an extension of the cloud based infrastructure, provides a better computing platform than cloud computing for mobile computing, Internet of Things, etc. One of the problems is how to make full use of the resources of the fog so that more requests of applications can be executed on the edge, reducing the pressure on the network and ensuring the time requirement of tasks. The high mobility of fog nodes also has a great impact on the task completion time and user satisfaction. Thus, a general IoT-Fog-Cloud computing architecture with a contract-based resource sharing mechanism is proposed in this paper. The contract establishment problem of resource sharing mechanism among fog clusters is modeled as a sealed-bid bilateral auction in order to take full advantage of the fog resources and ensure that more tasks could be executed on the fog. Then, we propose a scheduling method based on functional domain construction to mitigate the influence of mobility of fog nodes. It includes the selection of critical fog nodes and the construction of fog function domains based on spectral clustering. The selection of critical fog nodes is used to find the best fog nodes in each fog cluster with respect to the betweenness centrality, computing performance and communication delay to the IoT nodes. The critical nodes are responsible for building the functional domains of the remaining fog nodes in each fog cluster. Functional domain construction is used to determine the set of fog nodes contained in the corresponding functional domain. Finally, through extensive simulation experiments, the performance difference between the proposed method and the other four methods in terms of average service time, average utilization of fog nodes, success rate of tasks, average WLAN delay and the average cost of successful tasks are evaluated. Results show that our method generally outperforms the other four methods in these metrics.
Article
As companies shift from desktop applications to Cloud-based Software as a Service (SaaS) applications deployed on public Clouds, the competition for end-users by Cloud providers offering similar services grows. In order to survive in such a competitive market, Cloud-based companies must achieve good Quality of Service (QoS) for their users, or risk losing their customers to competitors. However, meeting the QoS with a cost-effective amount of resources is challenging because workloads experience variation over time. This problem can be solved with proactive dynamic provisioning of resources, which can estimate the future need of applications in terms of resources and allocate them in advance, releasing them once they are not required. In this paper, we present the realization of a Cloud workload prediction module for SaaS providers based on the Autoregressive Integrated Moving Average (ARIMA) model. We introduce the prediction based on the ARIMA model and evaluate its accuracy of future workload prediction using real traces of requests to web servers. We also evaluate the impact of the achieved accuracy in terms of efficiency in resource utilization and QoS. Simulation results show that our model is able to achieve an average accuracy of up to 91%, which leads to efficiency in resource utilization with minimal impact on the QoS.
Article
With the increasingly growing amount of service requests from worldwide customers, cloud systems are capable of providing services while meeting the customers' satisfaction. Recently, to achieve better reliability and performance, cloud systems have largely depended on geographically distributed data centers. Nevertheless, the dollar cost of service placement by service providers (SP) differs across regions. Accordingly, it is crucial to design a request dispatching and resource allocation algorithm to maximize net profit. The existing algorithms are either built upon energy-efficient schemes alone, or are oblivious to multi-type requests and customer satisfaction. They cannot be applied to multi-type request and customer satisfaction-aware algorithm design with the objective of maximizing net profit. This paper proposes an ant-colony optimization-based algorithm for maximizing the SP's net profit (AMP) on geographically distributed data centers with consideration of customer satisfaction. First, using a model of customer satisfaction, we formulate the utility (or net profit) maximization issue as an optimization problem under the constraints of customer satisfaction and data centers. Second, we analyze the complexity of the optimal request dispatching problem and rigorously prove that it is an NP-complete problem. Third, to evaluate the proposed algorithm, we have conducted comprehensive simulations and compared it with other state-of-the-art algorithms. We also extend our work to consider the data center's power usage effectiveness. It has been shown that AMP maximizes SP net profit by dispatching service requests to the proper data centers and generating the appropriate number of virtual machines to meet customer satisfaction. Moreover, we also demonstrate the effectiveness of our approach when it accommodates the impacts of dynamically arriving heavy workloads, various evaporation rates, and consideration of power usage effectiveness. Copyright © 2014 John Wiley & Sons, Ltd.
Conference Paper
Recently, data center carbon emission has become an emerging concern for cloud service providers. Previous works are limited to cutting down the power consumption of the data centers to address this concern. In this paper, we show how the spatial and temporal variability of the electricity carbon footprint can be fully exploited to further green the cloud running on top of geographically distributed data centers. We jointly consider the electricity cost, service level agreement (SLA) requirements, and the emission reduction budget. To navigate this three-way tradeoff, we take advantage of Lyapunov optimization techniques to design and analyze a carbon-aware control framework, which makes online decisions on geographical load balancing, capacity right-sizing, and server speed scaling. Results from rigorous mathematical analyses and real-world trace-driven empirical evaluation demonstrate its effectiveness in both minimizing electricity cost and reducing carbon emission.
Conference Paper
Many cloud services are running on geographically distributed datacenters for better reliability and performance. We consider the emerging problem of joint request mapping and response routing with distributed datacenters in this paper. We formulate the problem as a general workload management optimization. A utility function is used to capture various performance goals, and the location diversity of electricity and bandwidth costs are realistically modeled. To solve the large-scale optimization, we develop a distributed algorithm based on the alternating direction method of multipliers (ADMM). Following a decomposition-coordination approach, our algorithm allows for a parallel implementation in a datacenter where each server solves a small sub-problem. The solutions are coordinated to find an optimal solution to the global problem. Our algorithm converges to near optimum within tens of iterations, and is insensitive to step sizes. We empirically evaluate our algorithm based on real-world workload traces and latency measurements, and demonstrate its effectiveness compared to conventional methods.
Conference Paper
An important challenge of running large-scale cloud services in a geo-distributed cloud system is to minimize the overall operating cost. The operating cost of such a system includes two major components: electricity cost and wide-area-network (WAN) communication cost. While the WAN communication cost is minimized when all virtual machines (VMs) are placed in one datacenter, the high workload at one location requires extra power for cooling facility and results in worse power usage effectiveness (PUE). In this paper, we develop a model to capture the intrinsic trade-off between electricity and WAN communication costs, and formulate the optimal VM placement problem, which is NP-hard due to its binary and quadratic nature. While exhaustive search is not feasible for large-scale scenarios, heuristics which only minimize one of the two cost terms yield less optimized results. We propose a cost-aware two-phase metaheuristic algorithm, Cut-and-Search, that approximates the best trade-off point between the two cost terms. We evaluate Cut-and-Search by simulating it over multiple cloud service patterns. The results show that the operating cost has great potential of improvement via optimal VM placement. Cut-and-Search achieves a highly optimized trade-off point within reasonable computation time, and outperforms random placement by 50%, and the partial-optimizing heuristics by 10-20%.