Green Clouds through Servers, Virtual Machines
and Network Infrastructure Management
Carlos Becker Westphall, Carla Merkle Westphall, Sergio Roberto
Villarreal, Guilherme Arthur Geronimo and Jorge Werner
Networks and Management Laboratory
Post-Graduate Program in Computer Science
Department of Informatics and Statistics
Federal University of Santa Catarina
Caixa Postal 476, 88040-970 Florianópolis, SC, Brazil
The aim of green cloud computing is to achieve a balance between resource
consumption and quality of service. This work introduces the distributed system
management model, analyses the system’s behavior, describes the operation principles,
and presents case study scenarios and some results. We extended CloudSim to simulate
the organization model approach and implemented the migration and reallocation
policies using this improved version to validate our management solution. In this
context, we proposed strategies to optimize the use of the cloud computing resources,
introducing two hybrid strategies, describing the base strategies, validating and
analyzing them, and presenting the results. The basic principles pointed out in recent
works for power management in legacy network equipment are presented, and a model
that uses them to optimize the green cloud approach is proposed.
The goal of green computing is to seamlessly integrate the management of computing
devices and environmental control mechanisms to provide quality of service,
robustness, and energy efficiency. The challenge in green cloud computing is to
minimize resource usage and still satisfy quality of service requirements and robustness
[Werner et al. 2012].
This introductory section presents the fundamental principles and state-of-the-art
technologies that enable green cloud computing, commenting on some of the latest
proposals from several authors and offering an overview of the subject.
1.1.1. Principles and Technologies Applied to Hardware
The power management techniques applied to hardware may be classified as SPM
(Static Power Management), which apply permanent improvements based primarily on
the development and utilization of more efficient components, and as DPM (Dynamic
Power Management), which apply temporary actions based on real-time knowledge of
the use of resources and workload [Beloglazov et al. 2011].
The techniques used in DPM are DCD (Dynamic Component Deactivation), which
consists of turning off components during idle periods (also known as sleep state), and
DPS (Dynamic Performance Scaling), which consists of gradually reducing
performance when demand decreases [Beloglazov et al. 2011]. These technologies
make it possible to create "energy-aware" devices that implement the strategy known as
proportional computing, i.e., devices whose energy consumption is proportional to their
workload. They are based mainly on the DVFS (Dynamic Voltage and Frequency Scaling)
technology and the ACPI (Advanced Configuration and Power Interface) standard.
DVFS (Dynamic Voltage and Frequency Scaling): since the power
consumption of an electronic circuit is proportional to the operating frequency and to the
square of the voltage, this technique consists of intentionally decreasing the performance
of the processor, when it is not fully utilized, by reducing the frequency and the
voltage. This technique is used in most modern equipment [Beloglazov et al.
ACPI (Advanced Configuration and Power Interface) is an open standard
proposed in 1996 by Intel, Microsoft, Toshiba, HP and Phoenix to define a unified
interface for power configuration and management. It is centered on the operating system
and describes platform-independent interfaces for hardware discovery and energy
configuration, management and monitoring [ACPI 2010]. The main contribution of this
model, besides standardization, is to have shifted the implementation of dynamic
management techniques from hardware to software, bringing flexibility to policy
configuration and automation [Beloglazov et al. 2011].
The ACPI standard defines different power states that can be applied to
systems during their operation. The most relevant are the c-states and p-states. The
c-states are CPU power states: C0 (operation state), C1 (halt), C2 (stop clock)
and C3 (sleep mode). The p-states describe the processor performance state, each
representing a different combination of DVFS settings. The number of p-states varies between
implementations, and P0 is always the highest-performance state [ACPI 2010].
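The relationship between DVFS settings and p-states can be illustrated with a minimal Python sketch. It assumes only the proportionality stated above (dynamic power proportional to frequency times voltage squared). The `PState` class, the `P_STATES` table and its frequency and voltage values, and the function names are hypothetical and chosen for illustration; real p-state tables are processor specific.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PState:
    """One hypothetical performance state: a (frequency, voltage) pair."""
    name: str
    freq_ghz: float
    volts: float


# Illustrative values only (assumption); P0 is the highest-performance state.
P_STATES = [
    PState("P0", 3.0, 1.20),
    PState("P1", 2.4, 1.10),
    PState("P2", 1.8, 1.00),
    PState("P3", 1.2, 0.90),
]


def relative_dynamic_power(state: PState, reference: PState) -> float:
    """Dynamic power of `state` relative to `reference`, using P ~ f * V^2."""
    return (state.freq_ghz * state.volts ** 2) / (
        reference.freq_ghz * reference.volts ** 2
    )


def pick_p_state(utilization: float) -> PState:
    """Pick the slowest state whose frequency still covers the demand.

    `utilization` is the load expressed as a fraction of P0 capacity (0.0 to 1.0).
    """
    needed_ghz = utilization * P_STATES[0].freq_ghz
    for state in reversed(P_STATES):  # try the slowest state first
        if state.freq_ghz >= needed_ghz:
            return state
    return P_STATES[0]  # demand exceeds every slower state


if __name__ == "__main__":
    chosen = pick_p_state(0.5)
    ratio = relative_dynamic_power(chosen, P_STATES[0])
    print(f"{chosen.name}: {ratio:.2f} of P0 dynamic power")
```

Under these assumed values, a processor at half load can run in P2 and draw well under half of the P0 dynamic power, because the voltage reduction enters quadratically.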
As described by [Beloglazov et al. 2011], DPM (Dynamic Power Management)
strategies would be simple to implement were it not for the cost of moving from one state
to another, which represents an overhead because of the delay that affects system
performance and the additional energy consumption. Therefore, a change of state is
justified only if the period of use is long enough to cover the cost of the transition, which
is not easy to predict. A large number of studies propose efficient
methods to solve this problem, such as [Beloglazov et al. 2011], which identify the need
to reduce the transition cost through static improvements and also to improve the prediction algorithms.
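The trade-off described above can be made concrete with a break-even calculation. The following Python sketch is one common way to formalize it, not a method taken from the cited works: it compares the energy of staying idle with the energy of transitioning to a low-power state and back. All parameter names and the numeric values in the usage example are assumptions.

```python
def break_even_time(p_idle_w: float, p_sleep_w: float,
                    e_transition_j: float, t_transition_s: float) -> float:
    """Shortest idle period (seconds) for which sleeping saves energy.

    Staying idle for T seconds costs p_idle_w * T.
    Sleeping costs e_transition_j (entering and leaving the state, which takes
    t_transition_s in total) plus p_sleep_w * (T - t_transition_s).
    """
    if p_idle_w <= p_sleep_w:
        raise ValueError("the sleep state must draw less power than idle")
    t_energy = (e_transition_j - p_sleep_w * t_transition_s) / (p_idle_w - p_sleep_w)
    # The idle period can never be shorter than the transition itself.
    return max(t_transition_s, t_energy)


def should_sleep(predicted_idle_s: float, p_idle_w: float, p_sleep_w: float,
                 e_transition_j: float, t_transition_s: float) -> bool:
    """True if the predicted idle period exceeds the break-even time."""
    return predicted_idle_s > break_even_time(
        p_idle_w, p_sleep_w, e_transition_j, t_transition_s
    )


if __name__ == "__main__":
    # Hypothetical device: 100 W idle, 10 W asleep, 900 J and 5 s per round trip.
    print(break_even_time(100.0, 10.0, 900.0, 5.0))
    print(should_sleep(20.0, 100.0, 10.0, 900.0, 5.0))
```

The difficulty noted in the text remains: `predicted_idle_s` has to come from a prediction algorithm, and a wrong prediction turns the expected saving into a loss.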
According to [Minas and Ellison 2009], the main sources of power consumption
in computers are the CPU, the RAM and losses in the power supply, and all
components are more efficient when operating at a high utilization rate. CPUs have
received constant improvements. Multicore processors are statically more efficient than
traditional ones and, by implementing dynamic techniques, their consumption can be
reduced by up to 70% while preserving their ability to run programs. Other components do
not support active low-energy states and must be partially or fully turned off, which
leads to large losses in performance due to the activation time [Minas and Ellison 2009].
The same authors also state that heavy use of virtualization has led to servers
with a large amount of RAM in which the power consumption of the memory is greater
than the CPU consumption. Therefore, they emphasize the need to develop new
techniques and approaches to reduce memory consumption. They also suggest
improvements in power supply units as a priority research area.
When virtualization is used, the virtualization software lies between the hardware
and the operating system, so it should assume the power management, monitoring the
overall performance of the system and applying the appropriate DPS or DCD
techniques to the hardware components or, preferably, attending to the operating
system calls of each VM (Virtual Machine) and mapping them into hardware changes
[Beloglazov et al. 2011].
Finally, static management techniques are very important
and should be taken into account during system design and hardware purchase.
Once the hardware is acquired, their benefits are guaranteed, unlike the dynamic management
techniques, which require configuration and management policies that can maximize their
efficiency or even decrease it. This fact indicates the need to provide more effective
resources for the measurement and monitoring of the energy consumption of devices.
1.1.2. Principles and Technologies Applied to Networks
The network infrastructure is responsible for a significant percentage of the
energy consumption of IT and has distinctive characteristics, which is why efforts to
make it more efficient and environmentally friendly are a particular study field that has
been identified under the name of Green Networking [Bianzino et al. 2012].
According to [Bianzino et al. 2012], the network systems design has
traditionally followed two principles diametrically opposed to the goals of green
networking: oversizing to support demand peaks with a margin for unexpected events,
and redundancy so that another device can take over the task when one fails. This makes
Green Networking technically challenging: its main objective is to introduce the
energy-aware concept in network design, without compromising the performance or
The main strategies used in green networking are "proportional computing",
which adapts the processing speed of devices and the speed of links to the workload at a
given time, and "workload consolidation", which considers daily and
weekly traffic patterns and turns off components that are not needed. Virtualization is used to
consolidate physical resources, mainly switches and routers [Bianzino et al. 2012].
Proportional computing is implemented using DVFS to regulate the packet
processing speed, ALR (Adaptive Link Rate) to regulate the link speed according to
the current traffic, and DCD techniques (sleep mode) to put devices in low-energy
mode. The application of DCD should be complemented with special proxying
techniques to maintain the network presence of inactive devices [Bolla et al. 2011].
The ALR technique is based on the observation that the energy consumption of a
local network or access network link depends mainly on its speed and is relatively
independent of its utilization rate. It proposes adapting the capacity of the link by putting
it in sleep state during periods of inactivity (which may be long or very short) or by reducing
its speed during periods of low utilization [IEEE 2010].
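The rate-adaptation idea can be sketched in a few lines of Python. This is a simplified illustration of the principle only, not the mechanism of any particular standard or product. The set of supported rates, the `max_utilization` threshold, and the function name are assumptions.

```python
# Hypothetical set of link rates supported by a port, in Mbit/s (assumption).
LINK_RATES_MBPS = (10, 100, 1000)


def choose_link_rate(traffic_mbps: float, max_utilization: float = 0.5) -> int:
    """Return the link rate to use for the observed traffic.

    0 means the link can be put in sleep state (no traffic at all).
    Otherwise the lowest rate that keeps utilization at or below
    `max_utilization` is chosen; if none qualifies, the highest rate is used.
    """
    if traffic_mbps < 0:
        raise ValueError("traffic cannot be negative")
    if traffic_mbps == 0:
        return 0
    for rate in LINK_RATES_MBPS:  # rates are listed in ascending order
        if traffic_mbps <= rate * max_utilization:
            return rate
    return LINK_RATES_MBPS[-1]


if __name__ == "__main__":
    for traffic in (0, 3, 40, 900):
        print(traffic, "->", choose_link_rate(traffic))
```

A practical policy would also need hysteresis so that the link does not oscillate between rates when the traffic hovers around a threshold, since each rate change has its own transition cost.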
According to [Bolla et al. 2011], in networking equipment
the sleep mode introduces the special challenge that an inactive device loses its
presence on the network, which is maintained by different types of messages that
generate permanent traffic. This constant traffic between network equipment also
affects the effectiveness of the DPM techniques applied on the end devices, where the
CPU must be "awakened" by the NIC (Network Interface Controller) to respond to
As described by [Bianzino et al. 2012], the strategy proposed to solve these problems
is "interface proxying", which consists of delegating the handling of this
traffic, which can often be discarded or requires only simple answers, to another entity that is
more economical than the CPU. This entity may be implemented as an additional
feature of the network card, or as an external device that can serve multiple clients,
either using a dedicated server or adding this feature to the switch. [Bianzino et al. 2012] and
[Bolla et al. 2011] explain in detail various proposals for implementing these features.
The ALR (Adaptive Link Rate) technique is the basis of the IEEE 802.3az standard
(Energy Efficient Ethernet), ratified in September 2010 [IEEE 2010], and devices
that meet this standard are already on the market. Some switches have extra features
known as Green Ethernet, such as D-Link switches that reduce the power in ports where
the end device is unused and reduce the transmission power based on the length of the
link [D-LINK 2011].
At the physical layer, the main proposals involve replacing copper
networks with optical networks, because the latter are more efficient and provide
higher bandwidth. However, since buffering is not possible in the optical domain, optical
networks do not have the flexibility of copper networks [Bianzino et al. 2012].
[Bianzino et al. 2012] state that, at the network layer, several techniques
have been proposed to implement energy-aware routing strategies that consolidate
traffic and prioritize routes through energy-aware devices. They also point out that at this
layer it is necessary to adapt the routing protocols to avoid routing table instability
caused by the instantaneous changes introduced by DPM techniques.
These authors also state that at the transport layer there are proposals to make
TCP (Transmission Control Protocol) energy-aware, with modifications such as adding a tcp_sleep
option to the header to signal that the transmitter will enter sleep state, making the
other party delay the transmission by putting the data received from the application in a
buffer. Furthermore, they say that there are proposals to modify some application layer
protocols to include this signaling, even though some authors consider it more appropriate
to implement it at the transport layer, providing green sockets for developers.
[Bianzino et al. 2012] suggest that, although conceptually the principle of
independence of the layers should be respected, it is important to consider the
exchange of information between layers to reach practical solutions that enable the
coordination of all measures to optimize results.
[Blanquicet and Christensen 2008] propose extensions to SNMP (Simple
Network Management Protocol), for agents to expose the power state of the devices to
the network, including its power management capabilities, the current settings and
statistics, and claim that with this information the network administrator can remotely
monitor energy consumption of IT equipment and make changes to the settings.
Finally, [Bianzino et al. 2012] and [Bolla et al. 2011] emphasize the urgent need
to standardize metrics (green metrics) to rate equipment efficiency, and benchmark
sets to evaluate and compare different solutions effectively.
1.1.3. Principles and Technologies Applied to Data Centers
Power consumption is the main operating cost of data centers. This energy is
consumed mainly by the IT equipment (servers, storage and LAN), the cooling system
and the power distribution system itself. In many cases, the energy consumption of
the last two items, considered an overhead, is greater than that of the IT equipment
itself [Beloglazov et al. 2011].
To quantify the size of this overhead, there is a parameter that is becoming
standard, the PUE (Power Usage Effectiveness), which represents the ratio between the
total energy consumed by the data center and the energy actually used by the IT equipment.
Earlier this decade, the typical data center PUE values ranged between 1.6 and 3.0
[Garg and Buyya 2012], but great advances are taking place in this area through
infrastructure and facilities location improvements. In 2011, Google announced a data
center with PUE of 1.14 [GOOGLE 2011], and Facebook claims to have a data center in
the Arctic Circle with PUE of 1.07 [FACEBOOK 2013].
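The PUE ratio defined above, and the share of energy it implies as overhead, can be computed as in the following Python sketch. The function names and the energy figures in the usage example are hypothetical; only the ratio itself comes from the definition in the text.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total energy cannot be lower than IT energy")
    return total_facility_kwh / it_equipment_kwh


def overhead_fraction(pue_value: float) -> float:
    """Fraction of the total energy that does not reach the IT equipment."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1.0 - 1.0 / pue_value


if __name__ == "__main__":
    # Hypothetical monthly readings, in kWh.
    value = pue(total_facility_kwh=1140.0, it_equipment_kwh=1000.0)
    print(f"PUE = {value:.2f}, overhead = {overhead_fraction(value):.1%}")
```

A PUE of 2.0 means that half of the energy entering the facility is overhead (cooling and power distribution), while a PUE of 1.14 corresponds to an overhead of roughly 12% of the total.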
Considering only the IT equipment, the main cause of inefficiency in the data
center is the low average utilization rate of resources, generally less than 50%, mainly
caused by workload variability, which requires building the infrastructure to handle
workload peaks that seldom happen but that would degrade the quality of service if the
application were running on a fully occupied server [Beloglazov et al. 2011].
The strategy used to deal with this situation is workload consolidation, which
consists of allocating the entire workload to the minimum amount of physical resources
to keep them at the highest possible occupancy, and putting the unused physical
resources in a low-energy state. The challenge is how to handle unanticipated load peaks
and the activation cost of inactive resources [Garg and Buyya 2012]. Virtualization and the
ability to migrate virtual machines, along with the concentration of files on
centralized storage systems, have helped to implement this strategy with greater
efficiency [Beloglazov et al. 2011].
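Workload consolidation is often modeled as a bin-packing problem. The Python sketch below uses the first-fit decreasing heuristic as one simple illustration; it is a swapped-in textbook technique, not the allocation policy of the cited works. The function name, the single CPU dimension, and the load values (percent of one host) are assumptions.

```python
from typing import List


def consolidate(vm_loads: List[int], host_capacity: int = 100) -> List[List[int]]:
    """Pack VM loads onto as few hosts as the heuristic finds.

    Loads and capacity are integers (e.g. percent of one host's CPU).
    Uses first-fit decreasing: place the largest VMs first, each on the
    first host that still has room, opening a new host only when needed.
    Returns one list of VM loads per active host.
    """
    hosts: List[List[int]] = []
    for load in sorted(vm_loads, reverse=True):
        if load > host_capacity:
            raise ValueError(f"VM load {load} exceeds host capacity")
        for host in hosts:
            if sum(host) + load <= host_capacity:
                host.append(load)
                break
        else:  # no existing host had room
            hosts.append([load])
    return hosts


if __name__ == "__main__":
    placement = consolidate([50, 70, 20, 30, 10, 40])
    print(len(placement), "active hosts:", placement)
```

Every host that does not appear in the result is a candidate for a low-energy state. The sketch ignores the two challenges named above: it leaves no headroom for unanticipated peaks and does not account for the cost of reactivating a host or migrating a VM.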
1.1.4. Chapter Organization
This chapter is organized as follows:
Section 1 addresses the fundamental principles and state-of-the-art technologies
that enable green cloud computing [Westphall and Villarreal 2013].
Section 2 presents the motivations for proposing an integrated management model
[Werner et al. 2012], strategies for allocation and provisioning of physical machines and
virtual machines [Geronimo et al. 2014], and power management in legacy network
equipment [Villarreal et al. 2014].
Section 3 discusses the related works on which our proposals are based.
Section 4 describes our proposals and case studies (tests and results).
Section 5 concludes this chapter with some analysis, emphasizing the main
contributions and proposing future work.
1.2.1. Motivation for Integrated Management Model
The load prediction models in traditional architectures and cloud computing
environments are based on the analysis of historical data and demand increments from
business models. This information makes it possible to pre-allocate resources. However,
load prediction models are challenged (and frequently broken) when unexpected peaks
of demand occur.
Approaches to dealing with the problems of load prediction models include the
following: allowing for a margin of on-line resources, i.e., over-provisioning resources; turning
on idle resources; and temporarily using external resources on demand (e.g., federated
clouds), among others. Each of these approaches has its advantages and disadvantages. The
challenge in green computing, as described by [Valancius et al. 2009], is to exploit the
balance between these approaches in order to address the pressing issue of data center
over-provisioning related to the need to match the peak demand.
We propose a solution based on integrated environment, services and network
management that promotes: equitable load distribution through techniques such as virtual
machines; predictive resource allocation models through historical load analysis and
pro-active allocation methods; aggregate energy management of network devices; and
integrated control over the environmental support units, which represent the larger share
of energy consumption.
The objectives are the following: to provide flexibility in the system
configuration, allowing for the easy introduction of new elements in the managed
environment and the distribution of configuration processing among services; to provide a
level of availability that sustains high SLA (Service Level Agreement)
compliance rates and contributes to the system’s stability and security; to reduce
both capital and operational costs (CAPEX and OPEX) [Gruber et al. 2009], in order to
support the business predicates and thus promote the acceptability of the proposed
method; and to provide sustainability by using methods to reduce energy utilization and
carbon emission footprints.
To achieve our objectives we propose an OTM (Organization Theory Model) for
integrated management of a green cloud computing environment. It works based on
organization models that regulate the behavior of autonomous components (agents) that
view the environmental elements, network devices (e.g. switches, cards and ports) and
service providers (e.g. processing servers, load distribution services, task processors and
temperature reduction services). For example, the management system is able to turn off
unused network devices and servers, turning off the environmental (cooling) support
units. This is reactive to characteristics of the predicted system load. The controlling
elements are able to coordinate between themselves aiming at a higher-level system’s
objective, e.g. to keep overall energy utilization and SLA compliance metrics.
Our research advances the state of the art as follows: it introduces an
organization theory model for integrated management of the green clouds based on the
concepts of organization models, network management, and distributed computing; it
analyses the network and system’s behavior and operational principles; it validates the
proposal demonstrating the system’s added-value in a case study scenario; and it
improves a simulator (the CloudSim framework) to validate the green cloud computing
Our research was motivated by a practical scenario at our university’s data
center. In the (not so distant) past, we applied the “traditional architecture” which was
composed of diverse processing clusters configured to process different services. We
faced the usual issues encountered in large data centers at that time: lack of rack space,
which impacted flexibility and scalability; an excessive number of (usually outdated)
servers, which impacted operation costs; the need of an expensive refrigeration system;
and an ineffective UPS (Uninterruptible Power Supply) system, which was problematic
to scale due to the number of servers involved.
With the use of cloud computing, we managed to consolidate the number of
servers using virtualization techniques. Using this technology, we concentrated the
predicted load on a few machines and kept the other servers on standby to take care of
peak loads. The immediate results were very positive: reduction of rack space
utilization; lower heat emission due to the reduction in server utilization, with
consequent optimization of the cooling infrastructure; and a quick fix for the
problematic UPS system, because we had fewer active servers.
As part of an institutional initiative towards sustainability and eco-friendliness,
our next step was to optimize energy utilization [Lefevre and Orgerie 2010] and reduce
carbon emission. For this, we looked at solutions from the fields of green computing
and, more specifically, green cloud computing. We noticed that there was room for
improvement as we consolidated resources using cloud computing. For instance, there
were periods in time when the VMs (Virtual Machines) were idle and the servers were
underutilized. Based on the principles established by [Buyya et al. 2010], our goal was
to promote energy-efficient management and search for methods to safely turn off
unused servers on an on-demand basis. The intuitive approach was to concentrate the
running applications (configured per VM) in a few servers and recycle server capacity.
Although appealing, this approach led to a major issue: service unavailability! A
quick analysis concluded that it was related to the time required to bring up the servers
during unpredictable peak loads. We concluded the following: the dimensioning is
based on historic intra-day analysis of services demand. More specifically, it is based on
the analysis of previous day’s demand plus a margin of the business growth that can be
estimated as the amount of resources required for one service in a period of time;
however, when dealing with services with highly variable workloads, that prediction
becomes complex and often inaccurate. Moreover, external factors can lead to
unexpected peaks of demand. For that reason, we left a safety margin of resources available
(e.g. 20% extra resources on standby). Besides the excessive energy utilization, this
approach fails when the demand surpasses that threshold; as a solution, we needed to
bring up turned-off resources. The lapse of time between the detection of the situation
and the moment that processing resources become available caused the service
We analyzed several alternatives to overcome this issue and arrived at one that implements an
OTM (Organization Theory Model) for integrated management of the green clouds
focusing on: optimizing resource allocation through predictive models; coordinating
control over the multiple elements, reducing the infrastructure utilization; promoting the
balance between local and remote resources; and aggregating energy management of
Cloud computing is based on server virtualization functionalities, where there is
a layer that abstracts the physical resources of the servers and presents them as a set of
resources to be shared by VMs. These, in turn, process the hosted services and (may)
share the common resources. The green cloud is not very different from cloud
computing, but it implies a concern over the structure and the social responsibility of
energy consumption [Liu et al. 2009], hence aiming to ensure the sustainability of the
infrastructure [Buyya, Ranjan and Calheiros 2010] without breaking contracts.
1.2.2. Motivation for Provisioning and Allocation Strategies
We also propose two strategies for allocation and provisioning of PMs (Physical
Machines) and VMs (Virtual Machines) using DVFS (Dynamic Voltage and Frequency
Scaling) to improve the sustainability of private clouds, transforming the Cloud into a
Green Cloud [Werner et al. 2012]. Green Clouds demand efficiency from their components,
so we adopted positive characteristics of multiple existing strategies, developing hybrid
strategies that, in our scope, aim to address:
- A sustainable solution to mitigate peaks in unpredictable workload
environments with rapid changes;
- An optimization of the data center infrastructure without compromising the
availability of services during the workload peaks;
- Balance between the sustainability of the infrastructure and the services
availability defined on SLAs (Service Level Agreements).
This work was based on actual data collected at the university data center, which
hosts multiple services that often suffer unexpected workload peaks, whether from
attacks on servers or from overuse of services in short periods of time. First, we propose an
allocation model for private Clouds that aims to reduce costs (energy and SLA fines)
while improving resource optimization. Second, we propose a provisioning model
for private Clouds, turning them into Green Clouds, that reduces energy
consumption and optimizes resources while maintaining the SLAs through the integration
of public Cloud resources. Third, having validated our hybrid provisioning strategy, we
have the opportunity to apply it in a Cloud environment
that uses DVFS (Dynamic Voltage and Frequency Scaling) in its physical machines.
In this way, we achieve an improvement in energy consumption and resource optimization
with no impact on the Cloud SLAs.
The motivation for this work can be summarized in the following points:
- Energy saving: [Murugesan 2008] says “Energy saving is just one of the
motivational topics within green IT environments.” We highlight the following points:
the reduction of monthly data center OPEX (Operating Expenses); the reduction of
carbon emissions into the atmosphere (depending on the country); and the extension of
the lifespan of the UPS (Uninterruptible Power Supply) [Buyya et al. 2010].
- Availability of Services: Given the wave of products, components, and
computing elements being delivered as services by the Cloud (*aaS), a series of
pre-defined agreements governing the behavior of the service that will be
provided is needed [Leandro et al. 2012]. According to Cloud administrators,
agreements that stipulate availability rates, usually 99.9% of the time (or more), are a
matter of concern. Thus, the question is how to provide this availability rate while
consuming little power.
- Workload Variation: In environments with multiple services, workload
prediction is complex work. Historical data is mostly used to predict future needs and
behaviors. However, abrupt changes are unpredictable, causing temporary unavailability
of the provided services. The need to find new ways to deal with these sudden changes in
the workload is evident.
- Delayed Activation: Activation and deactivation of resources is a common
technique for reducing power consumption, but the time required to complete this
process can cause some unavailability of the provided services, generating contractual fines.
- Public Clouds: Given the growing number of public Clouds and the
development of communication methods among Clouds, like the Open Cloud Consortium
[OpenCC 2012] and the Open Cloud Computing Interface [OCCI 2012], it has become
possible, for small or big companies, to easily use multiple public Clouds as extensions
of a single private Cloud. We considered this as an alternative resource to implement
new Green Cloud strategies. This is beneficial to those who need to expand their Cloud,
and to the new clients of Cloud providers.
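To give a sense of scale for the availability and delayed-activation points in the list above, the following Python sketch converts an availability rate into a downtime budget. It is a back-of-the-envelope illustration only: the function names, the 30-day period, and the assumption that each delayed activation makes the service unavailable for the full boot time are all hypothetical.

```python
import math


def downtime_budget_minutes(availability: float, period_days: int = 30) -> float:
    """Minutes of unavailability allowed in the period by an availability rate."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_days * 24 * 60


def max_delayed_activations(availability: float, boot_minutes: float,
                            period_days: int = 30) -> int:
    """How many delayed activations fit in the budget.

    Assumes (pessimistically) that the service is unavailable for the
    whole boot time on every delayed activation.
    """
    if boot_minutes <= 0:
        raise ValueError("boot time must be positive")
    return math.floor(downtime_budget_minutes(availability, period_days) / boot_minutes)


if __name__ == "__main__":
    budget = downtime_budget_minutes(0.999)
    print(f"99.9% over 30 days allows about {budget:.1f} minutes of downtime")
    print(max_delayed_activations(0.999, boot_minutes=5.0), "five-minute activations")
```

Under these assumptions, a 99.9% agreement leaves well under an hour of downtime per month, which is why the activation delay of turned-off resources weighs so heavily in the strategies discussed here.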
In a broad sense, this proposed model is for the Cloud provider that seeks the
balance between energy saving and service providing (defined by the SLA).
We aim to propose an allocation strategy for private Clouds and a provisioning
strategy for Green Clouds, which suits the oscillatory workload and unexpected peaks.
We will focus on finding a solution that consumes low power and generates acceptable
request losses, in comparison to other base strategies.
1.2.3. Motivation for Management in Legacy Network Equipment
Traditionally, computer systems have been developed focusing on performance and
cost, without much concern for their energy efficiency. However, with the advent of
mobile devices, this feature has become a priority because of the need to increase the
autonomy of the batteries.
Recently, the large concentration of equipment in data centers brought to light
the costs of inefficient energy management in IT infrastructure, both in economic and
environmental terms, which led to the adaptation and application of technologies and
concepts developed for mobile computing in all IT equipment.
The term Green IT was coined to refer to this concern about the sustainability of
IT and includes efforts to reduce its environmental impact during manufacturing, use
and final disposal.
Cloud computing appears as an alternative to improve the efficiency of business
processes since, from the point of view of the user, it decreases energy costs through
resource sharing and the efficient and flexible sizing of systems. Nevertheless, from the
standpoint of the service provider, the current cloud approach needs to be seen from the
perspective of Green IT, in order to reduce the energy consumption of the data center
without affecting the system’s performance. This approach is known as Green Cloud
Computing [Westphall and Villarreal 2013].
Considering only IT equipment, the main cause of inefficiency in the data center
is the low average utilization rate of the resources, usually less than 50%, mainly caused
by the variability of the workload, which makes it necessary to build the infrastructure to handle
work peaks that rarely happen, but that would decrease the quality of service if the
application were running on a fully occupied server [Beloglazov et al. 2011].
The strategy used to deal with this situation is the workload consolidation that
consists of allocating the entire workload in the minimum possible amount of physical
resources to keep them with the highest possible occupancy, and put the unused
physical resources in a state of low energy consumption. The challenge is how to handle
unanticipated load peaks and the cost of activation of inactive resources. Virtualization,
widely used in the Cloud approach, and the ability to migrate virtual machines have
helped to implement this strategy with greater efficiency.
Strategies to improve efficiency in data centers have been based mainly on the
servers, cooling systems and power supply systems, while the interconnection network,
which represents an important proportion of consumption, has not received much
attention, and the proposed algorithms for load consolidation of servers usually
disregard the consolidation of network traffic.
The concepts of Green IT, albeit late, have also reached the design and
configuration of network equipment, leading to Green Networking, which has to deal
with a central problem: the energy consumption of traditional network equipment is
virtually independent of the traffic workload. The main strategies of Green Networking are
proportional computing, which adjusts both the equipment processing
speed and the link speed to the workload, and traffic consolidation, which is
implemented by considering traffic patterns and turning off components that are not needed.
According to [Bianzino et al. 2012], networking system design has traditionally
followed two principles diametrically opposed to the aims of Green Networking:
oversizing to support demand peaks, and redundancy for the single purpose of taking over the
task when other equipment fails. This makes Green Networking technically
challenging, with the primary objective of introducing the concept of energy-aware
design in networks without compromising performance or reliability.
While the techniques of Green Networking are beginning to be standardized and
implemented in new network equipment, a large amount of legacy equipment forms
the infrastructure of current data centers. The works presented here show that it
is possible to manage these devices properly to make the network consumption roughly
proportional to the workload.
Thereby, there is both the need and the possibility to add, to Green Cloud
management systems, means of interaction with the data center network management
system, in order to synchronize workload consolidation and server shutdown with the
needs of network traffic consolidation.
The more efficient the management of virtual machines and physical servers becomes,
the greater the network's share of the total consumption of the data center, which
reinforces the need to include network equipment in the green cloud model.
The principles suggested in recent papers by several authors for power
management in legacy network equipment are presented, and their application to
optimize our green cloud approach is proposed.
1.3. Related Works
1.3.1. Related Work for Integrated Management Model
[Pinheiro et al. 2001] have proposed a technique for managing a cluster of physical
machines that minimizes power consumption while maintaining the QoS level. The
main technique to minimize power consumption is to adjust the load balancing system
to consolidate the workload onto some resources of the cluster so that the idle
resources can be shut down. This concept tries to predict the performance degradation due to
throughput and workload migration based on historical traces. However, the estimated demand is
static: the forecast does not consider possible fluctuations in the demand over time. In
the end, although it achieves savings of 20% compared with clusters that are online full time, it
saves less than 6% of the whole consumption of the data center.
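The consolidation idea can be illustrated with a minimal sketch. It is not the algorithm of [Pinheiro et al. 2001]; it uses a generic first-fit-decreasing packing as a stand-in, and the function name `consolidate`, the loads (in percent of one node) and the cluster size are assumptions made only for the illustration.

```python
def consolidate(loads, capacity):
    """Pack loads onto as few nodes as possible (first-fit decreasing).

    Returns the total load placed on each node that must stay on.
    """
    nodes = []  # one entry per active node: the load already placed on it
    for load in sorted(loads, reverse=True):
        if load > capacity:
            raise ValueError("a single load exceeds the node capacity")
        for i, used in enumerate(nodes):
            if used + load <= capacity:
                nodes[i] = used + load
                break
        else:
            # No active node has room, so one more node has to stay on.
            nodes.append(load)
    return nodes


# Hypothetical cluster of 8 nodes; loads are percentages of one node.
cluster_size = 8
active = consolidate([30, 20, 45, 10, 25, 15], capacity=100)
idle_nodes = cluster_size - len(active)  # candidates for shutdown
```

With these assumed loads, the work fits on two nodes and the remaining six could be shut down, which is the effect the technique relies on.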
[Calheiros et al. 2011] have developed a framework for cloud computing
simulation. It has four main features: it allows for modeling and instantiation of major
cloud computing infrastructures; it offers a platform providing flexibility of service
brokers, scheduling and allocation policies; its virtualization engine can be customized,
thus providing the capability to simulate heterogeneous clouds; and it is capable of
choosing the scheduling strategies for the resources.
There is some research on cloud computing models. For example, [Buyya,
Ranjan and Calheiros 2010] suggested creating federated clouds, called Interclouds,
which form a cloud computing environment to support dynamic expansion or
contraction. The simulation results revealed that the availability of these federated
clouds reduces the average turnaround time by more than 50%. It is shown that a
significant benefit for the application’s performance is obtained by using simple load
There is some preliminary research. For example, [Buyya et al. 2010] aimed
to create a green cloud architecture. In their proposal, some simulations are executed
comparing the outcomes of the proposed policies with simulations of DVFS (Dynamic
Voltage and Frequency Scaling). Their results are interesting, and they leave other
possible research directions open, such as optimization problems due to the virtual
network topology, increasing response time for the migration of VMs because of the
delay between servers or virtual machines when they are not located in the same data
[Liu et al. 2009] presented the GreenCloud architecture to reduce data center
power consumption while guaranteeing the performance from user perspective.
GreenCloud architecture enables comprehensive online monitoring, live virtual machine
migration, and VM placement optimization. To evaluate the efficiency and effectiveness
of the proposed architecture, they used an online real-time game, Tremulous, as a VM
application. Evaluation results showed that they can save up to 27% of the energy by
applying GreenCloud architecture. However, the adoption of a set for validation of the
approach is weak. In addition, managing the centralized structure is not shown.
[Mahadevan et al. 2011] describe the challenges relating to life cycle energy
management of network devices, present a sustainability analysis of these devices, and
develop techniques to significantly reduce network operation power. The key insight
from their network energy management experience so far is that an integrated approach
which aims to minimize the total power consumed by a data center by including
network power, server power, and cooling costs as inputs to a global data center power
optimization strategy can potentially result in much greater energy savings.
1.3.2. Related Work for Provisioning and Allocation Strategies
According to [Laszewski et al. 2009], energy consumption is a major challenge. They
use a DVFS strategy to decrease the energy consumption of PMs (physical machines) used as
virtualization hosts. It adapts the clock frequency of the CPUs to the real usage of the PMs,
decreasing the frequency in idle nodes and increasing it when needed. However, most of the energy
consumption is not in the CPU but in other parts of the PM, so to really decrease the
energy consumption they need to be turned off.
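A minimal sketch of the frequency-selection step behind a DVFS strategy follows. It is an illustration only, not the policy of [Laszewski et al. 2009]: the function `pick_frequency` and the list of available frequency steps are assumptions.

```python
def pick_frequency(utilization, current_mhz, available_mhz):
    """Return the lowest available frequency that still covers the demand.

    utilization is the fraction of the current frequency actually in use.
    """
    demand_mhz = utilization * current_mhz
    for freq in sorted(available_mhz):
        if freq >= demand_mhz:
            return freq
    # Demand exceeds every step: run at the highest frequency.
    return max(available_mhz)


# Hypothetical CPU with three frequency steps.
STEPS_MHZ = [800, 1600, 2400]
idle_choice = pick_frequency(0.2, 2400, STEPS_MHZ)
busy_choice = pick_frequency(0.9, 2400, STEPS_MHZ)
```

Even at the lowest step the host remains powered, which is why the text argues that frequency scaling alone cannot remove the consumption of the rest of the machine.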
[Gunaratne et al. 2008] stated that in the USA the NICs (Network Interface
Controllers) alone consume hundreds of millions of US dollars in electricity every year. The
amount of energy used by NICs is growing rapidly as the default 100 Mbps
controllers are being replaced by new 1 Gbps controllers, which consume about 4
W more than 100 Mbps controllers. They also found that idle and fully loaded
Ethernet links consume about the same amount of power, while the amount of power
used by an Ethernet link does depend on the link speed. Since
measurements showed that the usage rate of Ethernet links is about 1% to 5% of
capacity, this drew attention to the "Network Layer" as a new field in which to lower the
energy consumption of the data center.
[Gunaratne et al. 2008] proposed a system design for ALR (Adaptive Link Rate)
[Gunaratne et al. 2005] to be applied not just on edge links but rather on the whole
network. With this approach it was possible to operate Ethernet links at a lower
data rate 80% of the time, lowering the power consumption of the data center without affecting
services and users. Given that network management is outside the scope of this paper,
it was decided not to include the network infrastructure consumption in the final
The workload balance strategy for clusters in [Pinheiro et al. 2001] tries to
achieve lower energy consumption by unbalancing the cluster workload, generating idle
nodes and turning them off. In Cloud Computing, this strategy will not work in the case
of Denial-of-Service attacks, because in that scenario all nodes will be on and there
will be no node to turn off. We therefore foresee the need for VM migration between
Clouds as a mandatory function, to avoid cases where the unbalance of the load cannot be
[Urgaonkar et al. 2009] proposed an overbooking strategy to consolidate virtual
machines on physical machines and thereby achieve lower resource consumption.
However, this work paid little attention to the service degradation caused by the
resource contention that occurs when workloads are consolidated.
[Do et al. 2011] proposed a model that, instead of taking just historical data on
workload resource consumption and behavior for resource allocation and provisioning,
also takes into account the interference generated by applications that compete for resources. It
applies a canonical correlation analysis technique to find the resources that most influence
the application behavior; in this way they could consolidate workloads with less
impact on the provided services. Their model presents a better result than [Urgaonkar et al.
2009], but it has high computational costs and still does not perform well
when predicting workload needs.
[Gong et al. 2010] proposed PRESS (PRedictive Elastic ReSource Scaling for
Cloud systems), a lightweight model to predict workload resource needs, based mainly
on historical data. It uses an FFT (Fast Fourier Transform) to spot dominant frequencies
and identify workload behaviors of resource usage. When no dominant
frequencies are found, it applies a Markov Chain technique to predict the workload resource
need for a short period of time. It is an early work that does not have the overhead
problems found in [Do et al. 2011], but it still does not perform very well, since
there are many more variables to take into account, such as background workload, VM migration
needs, application design, etc.
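The step of spotting a dominant frequency in a usage trace can be sketched as follows. This is not the PRESS implementation: it uses a naive discrete Fourier transform instead of an FFT to stay self-contained, it omits the Markov Chain fallback, and the function `dominant_period` and the `min_share` threshold are assumptions made for the illustration.

```python
import cmath


def dominant_period(samples, min_share=0.5):
    """Return the dominant period of a usage trace in samples, or None.

    A period is reported only when one frequency holds at least
    min_share of the total spectral power.
    """
    n = len(samples)
    if n < 2:
        return None
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    if all(abs(c) < 1e-9 for c in centered):
        return None  # flat trace: nothing repeats
    # Naive DFT power for the positive frequencies 1 .. n // 2.
    power = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centered))
        power[k] = abs(coeff) ** 2
    k_best = max(power, key=power.get)
    if power[k_best] / sum(power.values()) < min_share:
        return None  # no dominant frequency: a fallback predictor is needed
    return n / k_best


# Hypothetical CPU usage trace (percent) that repeats every four samples.
period = dominant_period([50, 80, 50, 20] * 4)
```

When the function returns None, a predictor of the kind described in the text would switch to its short-term fallback instead of relying on a repeating pattern.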
[Shen et al. 2011] improved the work done in [Gong et al. 2010]. They propose
a smart model for resource provisioning that aims at reducing SLA breaches. The
PRESS prediction was extended to round data values. This way, it would achieve a
better result, since PRESS had not been very accurate in predicting workload needs. In order to
improve the management of resources by the PRESS predictive model, they added
the measurement of SLA breaches as a new variable in the prediction model. A predictive
migration model was also proposed, since migration is one of the most expensive
processes when dealing with virtualized resources. It is best to start a migration before
the resource contention happens, thus avoiding a long period of service degradation.
They keep track of the needs of all VMs on a physical machine of the Cloud. This way, when
the predicted resource needs of the VMs on that physical machine use up the amount of
resources the machine has, the model triggers a migration process before the resource
[Dawoud et al. 2012] highlight that the use of “historical resource usage traces”
by itself is not enough for a predictive model. It could lead to wrong actions
at the management level, especially when dealing with Web applications, which are usually
deployed in a multi-tier way (front end, application layer and database). They therefore proposed
a model to manage Web applications (in public Cloud environments) that correlates
three factors: (1) historical traces of resource usage, (2) workload and (3) request types.
Given that Web applications are, in most cases, developed in a multi-tier
way, the work draws attention to the fact that the load of each tier does not affect
the other tiers equally. That means that, if we give more resources to a tier, we should
proportionally increase the resources of every single tier of the application, since they all
are tied together.
[Hulkury and Doomun 2012] proposed an integrated Green Cloud Computing
Architecture that addresses the workload placement problem, determining the best
place to deploy the users' jobs based on their theoretical energy consumption. It requires
a manager (on the cloud client side) to provide the job SLAs, job descriptions, and network and
server specifications, in order to calculate the energy consumption of the job in each cloud
scenario (local, private or public Cloud). Just like [Werner et al. 2012], it touches on the
idea of using public clouds as an extension, routing jobs between the clouds when
it is profitable. Unfortunately, it depends on some information that, in most cases, the
Cloud client does not have access to, such as the energy consumption of the public Cloud
elements. It also mentions the idea of using XML to store SLA and QoS constraints in
the Cloud Manager; however, it does not define any standard for that.
1.3.3. Related Work for Management in Legacy Network Equipment
[Mahadevan et al. 2009] present the results of extensive research conducted to
determine the consumption of a wide variety of network equipment in different
conditions. The study was performed by measuring the consumption of equipment in
production networks, which made it possible to characterize the energy expenditure
depending on the configuration and use of the equipment, and determine a mathematical
expression that allows calculating it with an accuracy of 2%. This expression
determines that total consumption has a fixed component, which is the consumption
with all ports off, and a variable component which depends on the number of active
ports and the speed of each port.
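The structure of this expression, a fixed component plus a per-port component that depends on speed, can be sketched as below. The wattages are hypothetical placeholders and not the measured coefficients of [Mahadevan et al. 2009]; the function `switch_power` and its parameters are likewise assumptions made for the illustration.

```python
def switch_power(chassis_watts, port_watts, active_ports):
    """Fixed consumption plus a per-port term that depends on port speed.

    chassis_watts: consumption with all ports off (fixed component)
    port_watts:    watts per active port, keyed by port speed in Mbps
    active_ports:  number of active ports, keyed by port speed in Mbps
    """
    variable = sum(port_watts[speed] * count
                   for speed, count in active_ports.items())
    return chassis_watts + variable


# Hypothetical figures, chosen only to illustrate the shape of the model.
PORT_WATTS = {10: 0.1, 100: 0.2, 1000: 0.9}
all_gigabit = switch_power(60.0, PORT_WATTS, {1000: 24})
mostly_slow = switch_power(60.0, PORT_WATTS, {1000: 4, 100: 20})
```

In this form the traffic load does not appear at all: only the number of active ports and their configured speed change the result.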
This research determined that the power consumed by the equipment is relatively
independent of the traffic workload and the size of the packets transmitted, and dependent
on the number of active ports and their speed. The energy saved is greater when the port
speed is reduced from 1 Gbps to 100 Mbps than from 100 Mbps to 10 Mbps.
This research also presents a table with the average time needed to reach the
operational state after the boot of each equipment category, and also demonstrates that
the behavior of current equipment is not proportional, as would be expected according to the
proposals of Green Networking, and therefore the application of traffic
consolidation techniques has the potential to produce significant energy savings.
[Mahadevan et al. 2011], continuing the work presented in the preceding
paragraphs, put forward the idea that switch consumption should ideally be proportional to
the traffic load. Since the reality in legacy devices is quite different, they propose
techniques to bring network consumption closer to proportional behavior through the
application of configurations available in all devices.
The results are illustrated in Figure 1.1, which shows the ideal behavior,
identified as "Energy Proportional", which corresponds to a network with fully "Energy
Aware" equipment; the actual curve of most of today's networks, where the
consumption is virtually independent of load, labeled "Current"; and finally the
consumption curve obtained by applying the techniques they proposed, labeled
Figure 1.1. Consumption in computer networks [Mahadevan et al. 2011].
The recommended configurations are: slow down the ports with low use, turn
off unused ports, turn off line cards that have all their ports off, and turn off unused
switches. The authors, through field measurements, have shown that it is possible to
obtain savings of 35% in the consumption of a data center network with the application
of these settings. Also, with the use of simulations, they have demonstrated that in ideal
conditions savings of 74% are possible by combining server load consolidation and
network traffic consolidation.
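The four recommended configurations can be read as a small decision procedure. The sketch below is an illustration under stated assumptions, not the tooling used by [Mahadevan et al. 2011]: the function `plan_switch`, the action names and the 5% low-use threshold are hypothetical.

```python
def plan_switch(line_cards, low_use=0.05):
    """Return the power-saving actions for one switch.

    line_cards maps a card name to a list of port utilizations in [0, 1];
    a utilization of 0 marks a port that carries no traffic.
    """
    actions = []
    cards_on = 0
    for card, ports in line_cards.items():
        for index, utilization in enumerate(ports):
            if utilization == 0:
                actions.append(("port_off", card, index))
            elif utilization < low_use:
                actions.append(("port_slow", card, index))
        if any(u > 0 for u in ports):
            cards_on += 1
        else:
            # Every port on this card is off, so the card can go too.
            actions.append(("card_off", card))
    if cards_on == 0:
        actions.append(("switch_off",))
    return actions


# Hypothetical switch with two line cards.
plan = plan_switch({"A": [0.5, 0.01, 0], "B": [0, 0]})
```

In this example card B is switched off entirely, while card A keeps one port at full speed, slows one down and disables the unused one.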
Figure 1.2. Green cloud management system based on OTM [Werner et al. 2012]
[Werner et al. 2012] propose a solution for the integrated control of servers and
support systems for the green cloud model based on organization theory, the OTM
(Organization Theory Model). This approach defines a model of allocation and
distribution of virtual machines that was validated through simulations and was shown to
achieve up to 40% energy savings compared with the traditional cloud model.
The proposed model determines when to turn off, resize or migrate virtual
machines, and when to turn on or off physical machines based on the workload and the
SLA (Service Level Agreement) requirements. The solution also envisages the
shutdown of support systems. Figure 1.2 shows the architecture of the management
system proposed, which is based on norms, roles, rules and beliefs.
We made extensions to the CloudSim simulator by [Calheiros et al. 2011],
developed at the University of Melbourne, creating the necessary classes to support the
Organization Theory Model, presented in the previous paragraphs, which made it possible to
calculate the energy savings and SLA violations in various scenarios.
The management of legacy network devices in Organization Theory Model and
the rules and beliefs for the proper functioning of the model based on the findings of the
works described above are presented. The rules and equations required to include this
extension in CloudSim simulations are also presented and validated through a study
1.4. Proposals and Case Studies
1.4.1. Proposal for Integrated Management Model
To understand the problem scenario, we introduce the elements, interactions, and
operation principles in green clouds. Green clouds emerged as a solution to save power
by utilizing server consolidation and virtualization technologies. Fine tuning resource
utilization can reduce power consumption, since active resources (servers, network
elements, and A/C units) that are idle lead to energy waste. The central question in green
clouds is how to keep resources turned off for as long as possible.
The interactions and operation principles of the scenario are described below:
• There are multiple applications generating different load requirements over the
• A load balance system distributes the load to active servers in the processing
• The resources are grouped in clusters that include servers and local
environmental control units (A/C, UPS, etc.). Each server can run multiple VMs that
process the requests for one specific application. Resources can be fully active (servers
and VMs on), partially active (servers on and VMs off), or inactive (servers and resources
off). The number of servers and their status configuration is defined based on historical
analysis of the load demand.
• The management system can turn machines on and off over time, but the question is
when to activate resources on demand. Taking too long to
activate resources in response to a surge of demand (being too reactive) may result in a
shortage of processing power for a while. This reflects directly on the quality of service,
as it could deteriorate the service availability level (even if only for a short time). On the
other hand, activating more resources than necessary leaves resources idle and
wastes energy.
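The trade-off in the last item can be made concrete with a small sketch: if a server needs some time to boot, it must be powered before the demand arrives, but powering it earlier than that only wastes energy. The function `servers_to_power`, the forecast values, the per-server capacity and the boot delay are all assumptions made for the illustration.

```python
import math


def servers_to_power(forecast, capacity, boot_delay):
    """Servers that must be on or booting at each time step.

    forecast:   expected load per time step (requests per second)
    capacity:   load one server can handle
    boot_delay: number of time steps a server needs to become operational
    """
    needed = [math.ceil(load / capacity) for load in forecast]
    plan = []
    for t in range(len(needed)):
        # Look ahead just far enough for a server started now to be ready.
        window = needed[t:t + boot_delay + 1]
        plan.append(max(window))
    return plan


# Hypothetical forecast with a surge in the third time step.
reactive = servers_to_power([100, 100, 350, 120], capacity=100, boot_delay=0)
proactive = servers_to_power([100, 100, 350, 120], capacity=100, boot_delay=1)
```

The proactive plan starts the extra servers one step before the surge, which is the minimum anticipation that avoids a shortage for the assumed boot delay.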
Green cloud with integrated management is a structure that we see as a trend
in this area and pursue as a goal. The aspects described below are the
reference for what our model aims to fulfill. In comparison to green cloud, we assume the
responsibility of consuming less energy in addition to ensuring the agreements
predefined in the SLA.
• Flexibility: it is aware of the state of all equipment under its control, acting in
anticipation of a need rather than after it has arisen, and plans its actions based on the
information of the whole cloud. It is able to predict and execute necessary changes in hardware
according to the demand of the cloud, such as slowing down an overheated CPU,
turning on machines based on the foreseen incoming load, or triggering a remote backup in
case of fire. It is able to interact automatically with public clouds [Buyya et al. 2010],
migrating or raising up new nodes on demand in remote clouds. It provides better
support for occasional workload peaks or DoS (Denial of Service) attacks.
• Availability: encompasses a new level by extending itself to the public clouds,
allowing the creation of mirror clouds. It deals with context grouping automatically,
being able to migrate these groups, or elements to public clouds.
• Cost reduction: by having an automated management based on previous
experience and results, it can manage itself with minimal human intervention. It uses a
24/7 management system aiming to provide a better utilization of the resources. It will
extend the equipment lifetime, decrease the downtime caused by human errors and
reduce expenses by adopting smart strategies for resource utilization. With inter-cloud
communications it can adopt a minimalist configuration, ensuring local
processing for most of its workload, and leaving the workload peaks to an external
• Sustainability: its structure has the ability to adopt goals for SLA, goals for
energy consumption (average X kWh per day) or goals for heat emission (average Y
BTU per day). The structure reacts to environmental events, such as a UPS going down,
temperature sensors reporting high temperatures or fire alarms going off, in order to fulfill
the predefined goals. In parallel, it adapts the environment dynamically in order to
fulfill the internal goals, for example by reducing cooling to reach consumption goals.
We propose that breaking the centralized management service into several small
management services gives us the necessary elements to increase the “degree of
freedom” of the cloud, creating the possibility to achieve a balanced situation between
risk and consumption.
However, with several management services in the cloud we introduce a new
problem: the management of these services becomes a complex job. For this, we use the
principles of organization theory to organize and classify such services, making them
easier to control. Cloud management through the principles of organization theory makes
it possible to auto-configure the management system, since the addition of a new
element (such as a network device, VM, PM or UPS) is just a matter of including a new
service in the management group.
Hence, we propose a proactive model for cloud management based on the
distribution of responsibilities for roles, shown in Figure 1.2. In this approach, the
responsibility for managing the elements of the cloud is distributed among several
agents, each one responsible for one area. These agents individually monitor the
cloud elements under their responsibility. They act in an orchestrated way aiming for the
fulfillment of the standards (norms).
Such orchestration is based on three things: knowledge about the state of the
cloud (as a whole) that is shared by all agents; the existence of planning rules to guide the
actions of the agents; and the development of beliefs about the inner workflow of the
cloud, which are constantly revised.
Since the data center structure is scaled and used to provide services, it
remains only a tool to provide such services. Generally, service level agreements are
established in order to clarify the responsibilities of each party - client and provider. We
emphasize that these agreements should be kept at their level (i.e. service), making them
purely behavioral rules (e.g. delay, fault tolerance) for the service, excluding structural
and physical requirements. Without the details of the environment configuration in the
agreement, the cloud becomes flexible. With the independence and flexibility to change
the configuration of the structure, it can become dynamic and extensible.
This allows factors that are external to the agreement, but still critical to the data
center infrastructure (e.g., energy consumption and hardware wear), to be covered.
Just as we live under the laws of physics, the cloud should also
exist under well-defined laws, which we call norms. These norms express the rules of the
service behavior established in the SLA and the internal interests of the cloud, which
need to be considered.
For the various elements of the cloud to work efficiently, seeking the
enforcement of these standards, they should be coordinated by agents external to the
services they audit, managing, for example: enabling and disabling VMs; enabling
and disabling PMs; configuration changes in VMs; and enabling and disabling network
Since there is a wide range of elements to manage, the complexity would grow
proportionally with the size of the cloud. To avoid such complexity we impose a hierarchy
on the existing agents. We can make an analogy to a large company where there is a
hierarchy to be followed and responsibilities are delegated. Just as in a company,
there must be a system manager (the boss) that controls the entire environment.
Following the hierarchy, we have the coordinators, who split the operations between their
teams [Dignum et al. 2009] in order to facilitate the division of tasks and
responsibilities.
Depending on the situation, decisions will generate system operations or service
operations, or both. System operations can be divided into VM management, servers
management, network management and environment management. The service
operations can be divided into monitor element, service scheduler and service analyzer.
The action of each role is directly reflected in the configuration of the structure
as a whole. The system operations will act over the structure and environment in which
the services are being processed. The service operations will act over the service layer
and the environment, acquiring information from both.
System operations may be classified into four roles:
• VM management: responsible for the actions involving the virtual machines. It
provides an interface between the model and the virtual machines. Examples include creating
or destroying a VM, changing its settings and even moving it from one host to another
(either in the local or in a remote data center).
• Servers management: responsible for the actions involving the physical
machines. It provides an interface between the physical machines and the model. Examples
include turning a physical machine off and on, changing the settings of the host
operating system (e.g. BIOS - Basic Input/Output System, SMART - Self-Monitoring,
Analysis, and Reporting Technology), hardware configurations (e.g. cooler
and accelerometer), and backend equipment (e.g. storage devices, switches and
• Network management: responsible for actions involving the network devices. It
uses SNMP tools to gather traffic data and compute the utilization of each port on all
the switches, minimizing the active network components by turning off unused
switches and disabling unused ports to save energy.
• Environment management: responsible for actions outside the structure. It has
an interface between the environment and the model. As an example, temperature
control of the data center, control of power backup systems (e.g. UPS and generator),
control over the accessibility of the data center (e.g. physical security).
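The port utilization computed by the network management role can be derived from two readings of an interface octet counter, such as the standard IF-MIB object ifInOctets, together with the port speed (ifSpeed). The sketch below covers only the arithmetic; how the counters are polled is left out, and the function names, the polling interval and the sample readings are assumptions made for the illustration.

```python
COUNTER32_MODULUS = 2 ** 32


def port_utilization(octets_before, octets_after, interval_s, speed_bps):
    """Fraction of the link capacity used between two counter readings."""
    # The modulo handles a single wrap of a 32-bit counter.
    delta = (octets_after - octets_before) % COUNTER32_MODULUS
    return (delta * 8) / (interval_s * speed_bps)


def ports_to_disable(readings, interval_s):
    """Names of the ports that carried no traffic during the interval.

    readings maps a port name to (octets_before, octets_after, speed_bps).
    """
    return [name for name, (before, after, speed) in readings.items()
            if port_utilization(before, after, interval_s, speed) == 0]


# Hypothetical readings taken 60 seconds apart on a 100 Mbps port.
usage = port_utilization(0, 3_750_000, 60, 100_000_000)
unused = ports_to_disable({"p1": (10, 10, 10 ** 9), "p2": (10, 500, 10 ** 9)}, 60)
```

A port whose counter did not move over the interval is a candidate for being disabled; in practice several consecutive intervals would probably be checked before acting.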
Service operations may be classified into three roles:
• Monitor element: responsible for collecting information about the structure in
general and interpreting it. It has the responsibility to keep the model aware of the
state of the cloud by monitoring the servers, VMs, network traffic and so on. It is based
on specific parameters previously configured by the System Manager, such as the use of
a resource and its notification threshold, the availability of network links (binary data)
or the idleness of some element of the structure.
• Service scheduler: responsible for the cloud agenda. It has a proactive role in
the model, planning the actions to be taken before the scheduled events. In an exchange
of physical machines, for example, it will generate the following list of steps to be
followed: testing the secondary UPS; enabling the secondary server; and migrating the VMs.
• Service analyzer: responsible for testing services and behavioral analysis. It
has the role of auditing the service provided by the framework and understanding it. It
makes sure that the service provided is in accordance with the norms to be followed, by
checking pre-established thresholds and alerting the system manager. It monitors the
quality of service that is provided, and tries to relate it to the variations in the
structure, finding patterns between the performance obtained and the elements that vary.
Planning Rules and Beliefs:
• Planning rules: the basis of theoretical knowledge, which relates contexts and
objectives. They are used at times when decisions must be made, during the planning of
actions. They are pieces of primitive knowledge gleaned from the experience of
managers. Examples of planning rules are the following notions: if a VM
increases its use of page swapping, to decrease it, we will increase its RAM
(Random Access Memory); if the physical machine presents a high load, to decrease the
load, we will move the VM with the most processing to another physical machine; if the
data center presents a high load, to decrease the general load, we will turn on more
• Beliefs: empirical knowledge used to improve the decisions to be taken. Here
we have the empirical understanding of the functioning of the cloud. The beliefs
express the junction of practical knowledge (the premises), coming from the norms, and
empirical knowledge, originating from historical data and past experiences. The
beliefs must be reviewed frequently by all elements of the model, and these reviews must
be shared. Examples of beliefs are the following notions: the
activation of a server of type X represents an increase of Y degrees in Z minutes; the
activation of a VM of type A increases the consumption by B kWh; a VM of type A
supports C requests per second.
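One possible way to represent planning rules and beliefs in code is sketched below. It is an illustration under stated assumptions and not the representation used in the model: the class `Beliefs`, the function `plan`, the action names, the 0.8 load threshold and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Beliefs:
    """Empirical figures, revised as new measurements arrive (values hypothetical)."""
    requests_per_vm: float = 100.0  # "a VM of type A supports C requests per second"
    kwh_per_vm: float = 0.2         # "a VM of type A increases the consumption by B kWh"

    def revise(self, observed_requests_per_vm, weight=0.5):
        # Blend the previous belief with the new observation.
        self.requests_per_vm = ((1 - weight) * self.requests_per_vm
                                + weight * observed_requests_per_vm)


def plan(context, high_load=0.8):
    """Map an observed context to actions, mirroring the three example rules."""
    actions = []
    if context.get("vm_swapping"):
        actions.append("increase_vm_ram")
    if context.get("pm_load", 0) > high_load:
        actions.append("migrate_busiest_vm")
    if context.get("dc_load", 0) > high_load:
        # The source sentence is cut off after "turn on more", so the
        # object of this action is left unspecified here as well.
        actions.append("turn_on_more")
    return actions


beliefs = Beliefs()
beliefs.revise(observed_requests_per_vm=80)
actions = plan({"vm_swapping": True, "pm_load": 0.9, "dc_load": 0.5})
```

The rules stay fixed while the beliefs change with experience, which matches the distinction the text draws between theoretical and empirical knowledge.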
1.4.1.1. Case Study for Integrated Management Model
We modeled the system using NM (Norms), BL (Beliefs) and PR (Plan Rules). We
established that we would need NM to reduce energy consumption, reduce the costs of the cloud
and maintain a minimalist structure; PR aimed at a minimum of SLA violations and at
reducing changes in the environment; and BL parameter settings for the provisioning
time of virtual machines.
Based on these definitions and responsibilities, the agents’ sensors respond more
appropriately to balance the environment. Let’s consider three services (i.e. web service,
backup, remote boot) running concurrently and whose load distribution appears to be
complementary. Their peaks (i.e., variations of workload) happen at different times.
Based on inferences from NM, BL and PR, agents would monitor the system and
determine actions dynamically.
The agents have two options for adapting servers and virtual machines:
shortly before the peak, migrate the virtual machine to a more robust server, or turn it
off. Thus the system would act more dynamically and autonomically, according to the
predefined requirements. Our environment takes all the variations of workload as
input, and allocates and distributes services (moving/relocating) so as to reduce the use of
resources (system output), in pursuit of environmental sustainability.
Due to the difficulty of replicating experiments in real environments and with
the goal of performing controlled and repeatable experiments, we opted to validate the
proposed scenarios using simulation. For this task we used the CloudSim framework
[Calheiros et al. 2011], a tool developed in Java at the University of Melbourne,
Australia, to simulate and model cloud-computing environments.
We extended CloudSim to simulate the organization theory model approach and
implemented the migration and reallocation policies using this improved version (see
Figure 1.3). In this way, one can evaluate the proposed scenario and reuse the
implemented models and controls in further simulations.
We set the basic characteristics of the simulated environment, physical machines
and virtual machines using data extracted from production equipment located at our
university. The data was used to represent the reality of a data center, and is based on a
data center in production at the university. It consists of different physical machines
and applications that require heterogeneous virtual machine configurations. The
dynamic workload was modeled according to information on peak load periods
extracted from a web server. The peak load periods are random and do not present any
Figure 1.3. Classes implemented in the CloudSim framework [Werner et al. 2011]
The main components implemented in the improved version of CloudSim are as follows:
• HostMonitor: controls the input and output of physical machines.
• VmMonitor: controls the input and output of virtual machines.
• NewBroker: controls the size of requests.
• SensorGlobal: controls the sensors.
• CloudletSchedulerSpaceShareByTimeout: controls the size and simulation
• VmAllocationPolicyExtended: allocation policy.
• VmSchedulerExtended: allocates the virtual machines.
• UtilizationModelFunction: checks the format of requests.
• CloudletWaiting: controls the time of the request.
• DatacentreExtended: controls the data center.
Figure 1.4. Simulation using both policies [Werner et al. 2012]
Some experiments were simulated in order to compare the use of
different VM management agents. Two kinds of agents were selected: one responsible
for migrating VMs between PMs, and another in charge of changing the VM
configuration, such as memory size or CPU frequency. Four simulations were performed:
(1) one without any resource agents, (2) one applying only VM reallocating agents, (3)
one applying only migrating agents and (4) one applying both agents (reallocation and
migration), as presented in Figure 1.4.
Figure 1.4 shows two clusters, each containing 100 physical machines on which
virtual machines, created on demand, are allocated to applications. Some tasks are
analyzed, verifying the need to migrate and relocate virtual machines.
Table 1.1. Proposed scenario characteristics [Werner et al. 2012].
VM - Image Size
VM – RAM
PM – Engine
PM – RAM
PM - Frequency
PM – Cores
The simulator CloudSim was adapted to behave as the proposed model. Some
parameters of the simulated scenario are presented in Table 1.1.
This experiment aims to verify the advantages of using the "relocation of virtual
machines" strategy in conjunction with the "migration of virtual machines" strategy
as a real-time (online) resource. It checks, analyses and addresses the
changes in workload on virtual and physical machines, providing substantial resource
savings in the data center and leading to further savings in power and air conditioning.
The availability rate increases to 99.9% and the number of SLA violations decreases.
Figure 1.5. Intra-day energy consumption (kWh) [Werner et al. 2012]
We intend to save energy by implementing policies for migrating virtual
machines, allowing us to minimize the number of physical machines running. Figure 1.5
presents the comparison of energy consumption of the above experiments. A significant
reduction in energy consumption of 87.18% (in kWh) can be noticed in the
comparison between the experiment with “both agents” and the experiment “without
Considering the SLA, without implementing policies for virtual machine
migration we have 1171 lost requests. Migrating and relocating virtual machines leads to
1077 lost requests, reducing the SLA violations by 8.03%. Figure 1.6 shows the SLA
violations over a day.
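The percentage quoted above follows directly from the two request counts. A minimal check is sketched below; the function name is ours and is used only for illustration.

```python
def relative_reduction(baseline: int, improved: int) -> float:
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline

lost_without_policies = 1171   # lost requests without migration policies
lost_with_both_agents = 1077   # lost requests with migration and reallocation
reduction = relative_reduction(lost_without_policies, lost_with_both_agents)
print(f"SLA violations reduced by {reduction:.2%}")
```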
There is a reduction in migration (45% on average over a day) and in the number
of SLA violations, a result of reducing the number of lost requests. Moreover, the
approach simplifies the management model, in which it is possible to manage resources
(connecting / disconnecting machines) of each element and reducing energy
In a second set of experiments we simulated three allocation and activation
strategies of virtual machines for green clouds. The goal was to obtain 90% of
maximum workload. The three strategies are: an on-demand strategy that enables
physical and virtual machines when the threshold is reached; an idle resources strategy
that keeps virtual and physical machines idle; and a hybrid strategy that works on
demand but, when no physical machine is available to host a new virtual machine,
allocates the new virtual machine in a public cloud and activates a physical machine.
Figure 1.6. SLA violations in a day [Werner et al. 2012]
Figure 1.7 shows the hybrid approach that uses a public and private cloud. The
private cloud is composed of eight physical machines. The public cloud is composed of
100 physical machines. Each physical machine supports up to five virtual machines.
The size of the requests was fixed at 5500 MIPS, and the maximum response time was
Figure 1.7 Hybrid strategy for VMs allocation [Werner et al. 2012]
This experiment aims to verify the advantage of outsourcing the processing,
using public clouds during periods of unexpected peak workload. Table 1.2 shows the
reduction of costs and power consumption of the hybrid strategy compared to the on-
demand and idle resources strategies.
Table 1.2. Reduction of cost and power consumption [Werner et al. 2012].
Our integrated management system also has a database that stores the power
constants associated with network devices (routers, switches, and line cards). Our
management model predicts the power consumed by network devices during operations
based on power measurements and management information, using entity MIBs
(Management Information Bases) over SNMP (Simple Network Management Protocol).
We have analyzed the ways in which the network can be made more efficient in order to
save energy, performing actions such as turning off unused switches, and disabling
unused line cards and ports (see section 1.4.3).
Our PCMONS (Private Cloud Monitoring System), an open-source solution for
cloud monitoring and management, also helps to manage green clouds by automating
the instantiation of new or more powerful VMs, depending on the resource usage
[Chaves et al. 2011].
1.4.2. Proposal for Provisioning and Allocation Strategies
The concept of combining organization theory and complex distributed computing
environments is not new. [Foster et al. 2008] already proposed the idea of virtual
organizations (VOs) as a set of individuals and / or institutions defined by such sharing
rules in grid computing environments. This work concludes that VOs have the potential
to radically change the way we use computers to solve problems the same way as the
Web has changed the way we consume and create information.
Following this analogy, we have a similar view: Management Systems based on
the Organization Theory would provide means to describe why / how elements of the
Cloud should behave to achieve global system objectives, which are (among others):
optimum performance, reduced operating costs, appointment of dependence, service
level agreements, and energy efficiency.
These organizational structures, proposed in [Werner et al. 2012], allow network
managers to understand the interactions between the Cloud elements, how their
behavior is influenced in the organization, and the impact of actions on macro and
micro structures, as well as how macro-level processes allow and restrict activities at
the micro level. In this way, they provide computational models to classify, predict,
and understand the interactions between elements and their influence on the whole
environment.
Managing Cloud through the principles of the Organization Theory provides the
possibility for an automatic configuration management system, since adding a new
element (e.g., Virtual Machines, Physical Machines, Uninterrupted Power Supply, Air
Conditioning) is just a matter of adding a new service on the Management Group.
The proposed strategies are based on proactive management of Clouds, which
in turn is based on the distribution of responsibilities in roles. The management
responsibility of the Cloud elements is distributed among several agents; each agent
individually controls the Cloud element assigned to it, as seen in Figure 1.2.
[Werner et al. 2012] proposed a model based on the Organization Theory to
manage a Cloud environment using decentralized management services. They proposed
agents to manage the Cloud elements, each agent managing the elements that are in its
area. These agents would individually monitor and manage the elements they are
responsible for, orchestrating them to fulfill the norms that are imposed to the system.
Norms are the rules or agreements used as input into the system such as SLAs,
energy consumption, resource optimization, air conditioning (data center temperature),
etc. They are primitive knowledge collected from experienced administrators and are
used at times when decisions need to be made. To complement the norms, [Werner et al.
2012] defined beliefs, which are empirical knowledge used to improve management
decisions. Beliefs join the practical knowledge from the norms with empirical
knowledge derived by the system by analyzing historical data traces
and correlating them with the norms that have or have not been fulfilled.
[Werner et al. 2012] also defined roles that the agents would assume while
monitoring/managing the Cloud environments or services. The roles defined for agents
that act at Cloud environment level are: VM management, server management, network
management and environment management. The roles defined for agents that act at
service level are: monitor element, service scheduler and service analyzer.
Based on [Werner et al. 2012], we conclude that the Organization Theory model
would be applicable for managing the entities of a Cloud computing environment in a
decentralized way. So far, our models apply the Organization Theory ideas as described
by [Werner et al. 2012], using decentralized agents to monitor and manage the Cloud
The DVFS (Dynamic Voltage and Frequency Scaling) was presented by
[Magklis et al. 2003]. It provides an alternative solution to decrease power consumption
by giving the possibility to the PMs to independently decrease their power dissipation,
by lowering the processor clock speed and supply voltage during the idle periods of
time of the processor as seen on the left side of Figure 1.8.
Figure 1.8. DVFS - Main idea [Kim 2013]
• Adaptive Consumption: lower energy consumption by adapting the processor
frequency to the workload.
• Out-of-the-box: There is no need to adapt applications or services to use it.
• Management: The user (or application) is allowed to determine when to use (or
not) the solution, giving the possibility to control the CPU temperature.
• Low Performance: decreasing the CPU frequency will reduce the system
performance, which is expected [Wang et al. 2013].
• Inertia of Changes: The frequency takes some time to adapt to the system’s
needs. So, in scenarios with high load variations, DVFS could become a problem.
• Over Changes: The rapid and constant act of “overvolting” and “undervolting”
the processor, trying to fulfill immediately the system needs, could decrease the
equipment lifetime [Basoglu et al. 2010].
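The trade-off listed above can be illustrated with the standard first-order relation for CMOS dynamic power, P = C x V^2 x f. This relation is not stated in the text, and the operating points below are hypothetical values chosen only to show the effect of lowering voltage and frequency together.

```python
def dynamic_power(capacitance_f: float, voltage_v: float, frequency_hz: float) -> float:
    """First-order CMOS dynamic power estimate: P = C * V^2 * f (watts)."""
    return capacitance_f * voltage_v ** 2 * frequency_hz

# Hypothetical operating points for one processor (assumed, not measured).
full_speed = dynamic_power(1e-9, 1.2, 2.0e9)
scaled_down = dynamic_power(1e-9, 0.9, 1.0e9)
saving = 1 - scaled_down / full_speed
print(f"full: {full_speed:.2f} W, scaled: {scaled_down:.2f} W, saving: {saving:.0%}")
```

Because voltage enters quadratically, halving the frequency while also lowering the voltage saves more than half of the dynamic power, at the cost of the lower performance noted in the list.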
DVFS enhancements, as seen on the right side of Figure 1.8, also show a
deeper level of DVFS. The idea is to apply it at the core level, not at the processor level
as a global unit. Another line of work tries to decrease the gap between voltage and
frequency changes. The idea is to optimize the processor and build a fast DVFS that
adapts quickly to system needs, as shown in Figure 1.9. [Kim et al. 2008] use both
strategies at the same time, achieving energy savings of 21%.
Figure 1.9. Fast DVFS - Main idea [Kim et al. 2013]
For conscious resource provisioning in Green Cloud environments, we
propose a hybrid strategy that uses the public Cloud as an external resource to mitigate
SLA breaches due to unexpected workload peaks. In parallel, for the optimal use of
local resources, we propose a strategy of dynamic reconfiguration of the attributes of the
VMs allocated in the data center. Given the distributed model presented in the previous
section, we used the Cloud simulation tool CloudSim [Buyya 2009] to simulate the
university data center environment and workload.
1.4.2.1. Case Study for Provisioning and Allocation Strategies
In order to simulate a distribution faithful to reality and also stressful to the
infrastructure, we chose as a workload pattern the distribution of requests from the
university’s main websites (as shown in Figure 1.10), and then multiplied the request load
by factors between 2 and 20, in order to apply stress to the system. We defined this
strategy with the goal of obtaining results that reflect reality while, at the same time,
pushing the request rate, striving to correlate the workload behavior trends with the
Figure 1.10. Week workload distribution (Reqs/s) [Geronimo et al. 2014]
The resource allocation strategy is a proposal that introduces a composition of
two other approaches: (1) the migration of VMs, which aims to consolidate VMs and
optimize resource utilization, and (2) the Dynamic Reconfiguration of VMs, which aims
to reconfigure dynamically the resources used by the VMs, increasing the consolidation
1) VMs Migration Strategy: This strategy aims to reduce power consumption by
disabling the idle PMs of the Cloud. To induce idleness in the PMs, the VMs are
migrated and concentrated in few PMs. This way, the Cloud manager can disable the
idle PMs, reducing the consumption of the data center. However, for optimal results,
this strategy must be used with a reconfiguration strategy that enables hosting more
VMs on fewer PMs, increasing the number of idle PMs.
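The consolidation step can be sketched as a bin-packing problem. The text does not name the packing algorithm used in the simulations, so the first-fit decreasing heuristic below is an assumption made for illustration, as are the function name and the normalized load values.

```python
from typing import Dict, List

def consolidate(vm_loads: Dict[str, float], pm_capacity: float) -> List[List[str]]:
    """Pack VMs onto as few PMs as first-fit decreasing manages.

    Returns one list of VM names per PM that stays active.
    """
    active_pms: List[List[str]] = []
    free: List[float] = []          # remaining capacity of each active PM
    # Place the largest VMs first; ties are broken by name for determinism.
    for name, load in sorted(vm_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        if load > pm_capacity:
            raise ValueError(f"{name} does not fit on any PM")
        for i, room in enumerate(free):
            if load <= room:
                active_pms[i].append(name)
                free[i] -= load
                break
        else:
            # No active PM has room: keep one more PM switched on.
            active_pms.append([name])
            free.append(pm_capacity - load)
    return active_pms

# Hypothetical loads, expressed as a fraction of one PM's capacity.
vms = {"vm1": 0.5, "vm2": 0.25, "vm3": 0.25, "vm4": 0.5, "vm5": 0.25}
placement = consolidate(vms, pm_capacity=1.0)
idle_pms = len(vms) - len(placement)   # assuming one PM per VM before migration
```

In this example the five VMs end up on two PMs, so three PMs become idle and can be disabled by the Cloud manager.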
2) VMs Dynamic Reconfiguration Strategy: Seeking to improve on the
previous strategy, this strategy is an alternative optimization that dynamically shrinks
the VM. It adjusts the parameters of the VM [Wood et al. 2009], without migrating it or
turning it off. For example, we can increase or decrease the CPU and memory
parameters allocated. Thus, the VMs would adapt their configurations according to the
3) Tests and Results: To simulate the strategies we used CloudSim [Buyya 2009],
a Cloud simulator developed in Melbourne. To achieve the simulations needed, we
made some changes to the code [Werner et al. 2011], allowing it
to simulate the distribution patterns and scenarios defined before. Four scenarios were
simulated for a comparative analysis between an ordinary Cloud (Scenario
1), the existing methods (Scenarios 2 and 3), and the proposed approach (Scenario 4).
1) No strategies;
2) Migrating VMs Strategy;
3) Reconfiguring the VMs Strategy;
4) Reconfiguring and migrating VMs Strategy.
In the simulations, we gathered behavior, sustainability, and availability metrics,
such as the number of idle PMs, total energy consumption, and number of SLA
breaches. Table 1.3 presents the percentage of energy consumption in each scenario
with 100 PMs. Analyzing it, we see that without any strategy implemented the power
consumption is stable during the whole period, since all the VMs and PMs were
activated; the same happens if we implement just the reconfiguration strategy by itself.
The migration scenario shows a significant reduction in power consumption, since it
consolidates VMs onto PMs and later disables the idle ones. Lastly, the combination of
the migration and reconfiguration approaches shows a steady, noticeable reduction in
power consumption, since it consolidates more VMs onto fewer PMs.
Table 1.3 shows the results of the simulations. It shows what strategies were
used in each scenario and what percentage (approximate) reduction was obtained,
compared to the scenario without strategies. Note that the Reconfiguration strategy
(scenario 3) did not produce much effect, but the scenarios that used the
Migration strategy (2 and 4) obtained a greater reduction of energy consumption.
Table 1.3. Results of allocation’s scenarios [Geronimo et al. 2014].
The Green Cloud provisioning hybrid strategy is based on two other strategies,
which are the OD (On Demand) strategy and the SR (Spare Resources) strategy. It tries
to be the middle ground between the two, enjoying the strengths of both sides, and
aiming to keep power consumption as low as the OD strategy while maintaining the
availability of the SR strategy.
1) On Demand Strategy: The principle of OD strategy is
to activate the resources when they are needed. In our case, when a service reaches a
saturation threshold, new VMs would be instantiated. And, when there is no more space
to instantiate new VMs, new PMs would be activated to host the new VMs. The
opposite also applies; when a threshold of idleness is reached, the idle VMs and PMs
are disabled. Figure 1.11 shows an OD Strategy scenario, where only the needed VMs
(green circles) and PMs (white slim rectangles) are turned on. The other units (red
crosses and lined rectangles) are off.
• Pros: energetically efficient since it maintains just a minimum amount of active
• Cons: ineffective in scenarios that have sudden spikes in demand because the
process to activate resource takes time, and some requests end up being lost.
Figure 1.11. On Demand Strategy [Geronimo et al. 2014]
2) Spare Resource Strategy: To mitigate the problem of request timeouts
caused by a long resource activation time, we adopted the SR strategy, whose
principle is to reserve idle resources ready to be used. In our case, there was always one
idle VM ready to process the incoming requests and one idle PM ready to instantiate
new VMs. If these resources were used, they were no longer considered idle, and new
idle resources were activated. As soon as the resources were no longer being used they
were disabled. Figure 1.12 shows an SR Strategy scenario where the Cloud keeps an idle
VM (golden circle) and an idle PM (vertical lined rectangle) ready to fulfill any
• Pros: The strategy has been shown effective to deal with unexpected peak
• Cons: It showed the same behavior as the OD strategy in cases where demand
rose very rapidly; in other words, the idle resource was not enough to process the
demand. Another negative point was the energy consumption; since it always kept an
idle resource, the consumption was greater than with the OD strategy.
Figure 1.12. Spare Resource strategy [Geronimo et al. 2014]
3) Hybrid Strategy: Seeking to merge the strengths of the previous
strategies and mitigate their shortcomings, we propose a hybrid strategy. This strategy
aims to reduce the energy consumption on private Cloud and reduce the violation of
As shown in Figure 1.13, the Cloud enables the VMs when the service in
question reaches its saturation threshold, just as in the OD strategy. When a PM is unable
to allocate more VMs, the Cloud uses the public Cloud to host the new VMs while a
disabled PM goes through the activation process. This way we fulfill requests that would
otherwise be lost during the activation process.
Figure 1.13. Hybrid Strategy [Geronimo et al. 2014]
The deactivation process occurs just as in the other strategies. However, since
the public Cloud is paid for by time (usually per hour of processing), the VM hosted
in the public Cloud is disabled only when:
• it is idle and;
• it is almost time to complete a full hour of hosting.
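The placement and release decisions of the hybrid strategy can be sketched as follows. This is a simplified illustration and not the CloudSim implementation: the class, the function names and the five-minute margin used to interpret "almost time" are assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PrivateCloud:
    max_vms_per_pm: int
    vms_on_active_pms: List[int] = field(default_factory=list)  # VM count per active PM
    pms_booting: int = 0

def place_new_vm(cloud: PrivateCloud) -> str:
    """Decide where a VM requested at the saturation threshold should run."""
    for i, count in enumerate(cloud.vms_on_active_pms):
        if count < cloud.max_vms_per_pm:
            cloud.vms_on_active_pms[i] += 1
            return "private"
    # No room locally: start booting a PM and bridge the gap in the public Cloud.
    cloud.pms_booting += 1
    return "public"

def should_release_public_vm(is_idle: bool, minutes_hosted: float,
                             margin_minutes: float = 5.0) -> bool:
    """Release only an idle VM that is about to complete a paid hour."""
    minutes_into_hour = minutes_hosted % 60
    return is_idle and minutes_into_hour >= 60 - margin_minutes

cloud = PrivateCloud(max_vms_per_pm=2, vms_on_active_pms=[2, 1])
first = place_new_vm(cloud)    # fits on the second active PM
second = place_new_vm(cloud)   # private Cloud is full, so the public Cloud is used
```

Keeping an idle public VM until the paid hour is nearly over costs nothing extra and leaves it available if the load rises again within that hour.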
4) Tests: As previously mentioned, we performed some modifications to the
CloudSim code in order to enable the simulation of scenarios using our proposed model.
Before we started the simulations, we defined some variables, such as the saturation
threshold and idleness. The variables considered in our experiments are shown in Table 1.4.
Table 1.4. Results of allocation’s scenarios [Geronimo et al. 2014].
Saturation Threshold (Load 1 minute)
Idleness Threshold (Load 1 minute)
Activation VM time (seconds)
Activation PM time (seconds)
Size of Request (MI)
1000 to 2000
On or Off
Number of PMs
Maximum number of VMs per PMs
SLA timeout threshold (seconds)
The amount of requests per second was calculated based on the previously
presented workload pattern (Figure 1.10), using the formula Rt*Mx where Rt is the
number of requests per second in time t, and Mx is the stress multiplier of the
To get an overview of how each strategy would behave in different scenarios,
we ran a series of tests which varied the:
• Amount of Requests: To maintain the defined request distribution (explained at
the beginning of Section 1.4.2.1), we used multipliers to increase the number of
requests. Those multipliers ranged from 2 to 20 in steps of 2 (2, 4, 6, etc.).
• Size of Requests: The size of requests ranged from 1000 to 2000 MI (Million
Instructions) to be executed, in steps of 100 (1000, 1100, 1200, etc.).
• Utilization of DVFS: Based on the previous tests, we compared the proposed
hybrid strategy with and without DVFS.
In total, 440 simulations were performed: 330 simulations
without DVFS and 110 with DVFS (just the hybrid strategy). These tests evaluated the
power consumption of the private Cloud and the total number of timeouts (SLAs not
accomplished) for the period.
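The simulation counts can be reproduced by enumerating the parameter grid. The sketch below only builds the list of runs and the stressed request rate Rt*Mx; the variable names are ours, and the strategy labels follow the abbreviations used in the text.

```python
from itertools import product

multipliers = range(2, 21, 2)            # 2, 4, ..., 20
request_sizes = range(1000, 2001, 100)   # 1000, 1100, ..., 2000 MI
grid = list(product(multipliers, request_sizes))

# (strategy, DVFS enabled) pairs; DVFS is only evaluated with the hybrid strategy.
configurations = [("OD", False), ("SR", False), ("hybrid", False), ("hybrid", True)]
runs = [(s, dvfs, m, size) for (s, dvfs) in configurations for (m, size) in grid]

def request_rate(base_rate: float, multiplier: int) -> float:
    """Stressed request rate Rt * Mx for one time step."""
    return base_rate * multiplier

without_dvfs = sum(1 for run in runs if not run[1])
with_dvfs = sum(1 for run in runs if run[1])
```

Ten multipliers and eleven request sizes give 110 combinations per configuration, which yields the 330 runs without DVFS and the 110 runs with DVFS reported above.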
5) Results: Figures 1.14, 1.15 and 1.16 show the results obtained while running
the experiments described before. Each figure shows the timeout and the energy
consumption variation of each experiment for every combination of the
multiplier and request size settings (110 simulations per experiment).
Figure 1.14. Number of timeouts (top) and energy consumption (bottom) using OD
strategy [Geronimo et al. 2014]
Figure 1.15. Number of timeouts (top) and energy consumption (bottom) using SR
strategy [Geronimo et al. 2014]
Figure 1.16. Number of timeouts (top) and energy consumption (bottom) using hybrid
strategy [Geronimo et al. 2014]
Table 1.5 shows the results obtained in the “worst case scenario”, by definition
the one with the multiplier equal to 20 and the request size equal to 2000 MI. In
Table 1.5, the Hybrid Strategy was taken as the basis of comparison, so
the values listed are those of the hybrid strategy relative to the others. For example,
the hybrid strategy presented 3% fewer request timeouts than the OD strategy.
Table 1.5. Hybrid Strategy Compared to the other Strategies [Geronimo et al.
Comparing the same Hybrid Strategy with and without DVFS, we
obtained 13% less energy consumption when DVFS was used. To get a better view of the
differences between the two simulations, the scale of the graph in Figure 1.17 was
zoomed. There was no significant difference in the timeout rate in this scenario.
Figure 1.17. Consumption with DVFS off (top) and with DVFS on (bottom) using hybrid
strategy [Geronimo et al. 2014]
1.4.3. Proposal for Management in Legacy Network Equipment
The proposal considers the network topology of a typical data center shown in Figure
1.18, where the switches are arranged in a hierarchy of three layers: core layer,
aggregation layer and access or edge layer. In this configuration, there is redundancy in
the connections between layers so that the failure of a device does not affect the
Consequently, we consider, in our model, that each rack accommodates forty
servers and two access layer switches. Each of these switches has 48 Gigabit Ethernet
ports and two 10 Gigabit Ethernet uplink ports, and each server has two Gigabit
Ethernet NICs (Network Interface Controllers) each one connected to a different access
Figure 1.18. Typical network topology of a data center [Mahadevan et al. 2011]
We also consider that if there is only one rack, aggregation layer switches are
not required, and that up to 12 racks can be served by 2 aggregation layer switches with
twenty-four 10 Gigabit Ethernet ports and two 10 Gigabit Ethernet or 40 Gigabit Ethernet
uplinks, with no need for core switches.
Finally, the model assumes that, with more than 12 racks, two core switches with
one 24-port module for every 144 racks will be required. The module’s port speed may be
10 Gigabit Ethernet or 40 Gigabit Ethernet, according to the aggregation switches
In traditional facilities, the implementation and management of this redundancy
is done by the Spanning Tree Protocol, and in more recent configurations by
Multi-Chassis Link Aggregation (MC-LAG), which allows redundant links to be used
simultaneously, expanding their capacity, as described in [Sher Decusatis et al.
Extensions to the Organization Theory Model:
To include the management of legacy network equipment in the model proposed
by [Werner et al. 2011], such that the network consumption becomes relatively
proportional to the traffic workload and the energy savings contribute to the overall
efficiency of the system, it is proposed to add the following elements to its architecture:
1) Management Roles
Add to the "System Operations" components the "Network Equipment
Management" role, which acts as an interface between the model and the network
equipment and is responsible for actions taken on these devices, such as enabling and
disabling ports or equipment, or changing MC-LAG protocol settings.
The "Monitoring Management" role, responsible for collecting structure
information and its understanding, should be augmented with elements for interaction
with the network management system to provide data, from which decisions can be
made about the port speed configuration, or turning on or off components and ports.
These decisions will be guided by the rules and beliefs.
2) Planning Rules
These rules are used when decisions must be taken, and therefore, rules to
configure the network equipment in accordance with the activation, deactivation and
utilization of physical machines should be added.
To implement the settings pointed out in [Mahadevan et al. 2011], already
presented, the following rules are proposed:
· If a PM (Physical Machine) is switched off, the corresponding ports of access
layer switches must be turned off.
· If the occupation of a PM is smaller than a preset value, network interfaces and
corresponding access switches ports must be slowed down.
· If the aggregate bandwidth of the downlink ports of an access layer switch is
smaller than a preset value, their uplink ports must have their speed reduced.
· If an access layer switch has all its ports off, it must be turned off.
· If an access layer switch is turned off, the corresponding ports of the
aggregation layer switch must be turned off.
· If the aggregate bandwidth of the downlink ports of an aggregation layer switch
is smaller than a preset value, their uplink ports must have their speed reduced.
· If an aggregation layer switch has all its ports off, it must be turned off.
· If an aggregation layer switch is turned off, the corresponding port of the core
layer switch must be turned off.
· If a module of a core layer switch has all its ports off, it must be turned off.
· If a core layer switch has all its ports off, it must be turned off.
· All reversed rules must also be included.
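The on/off rules above form a cascade that climbs the switch hierarchy. A minimal sketch of that cascade is given below; it covers only the rules that turn ports and switches on or off (including the reversed rules), not the speed-reduction rules, and the `Switch` class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Switch:
    name: str
    ports_on: Dict[str, bool] = field(default_factory=dict)  # peer name -> port state
    powered: bool = True
    uplinks: List["Switch"] = field(default_factory=list)    # switches one layer up

def set_port(switch: Switch, peer: str, on: bool) -> None:
    """Turn one port on or off and propagate the change up the hierarchy."""
    switch.ports_on[peer] = on
    should_be_powered = any(switch.ports_on.values())
    if should_be_powered != switch.powered:
        switch.powered = should_be_powered
        # A switch that changes state drives its port on every upper-layer switch.
        for upper in switch.uplinks:
            set_port(upper, switch.name, should_be_powered)

def on_pm_state_change(pm: str, access_switches: List[Switch], on: bool) -> None:
    """Apply the rules when a physical machine is switched on or off."""
    for sw in access_switches:
        set_port(sw, pm, on)

# One rack with two PMs, two access switches and one aggregation switch.
agg = Switch("agg1", {"acc1": True, "acc2": True})
acc1 = Switch("acc1", {"pm1": True, "pm2": True}, uplinks=[agg])
acc2 = Switch("acc2", {"pm1": True, "pm2": True}, uplinks=[agg])
on_pm_state_change("pm1", [acc1, acc2], on=False)
on_pm_state_change("pm2", [acc1, acc2], on=False)
all_off = (acc1.powered, acc2.powered, agg.powered)
on_pm_state_change("pm1", [acc1, acc2], on=True)   # reversed rules
```

Handling both directions in a single function keeps the reversed rules consistent with the forward ones, since the same condition (any port on) decides the state of a switch in both cases.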
The application of these rules does not affect the reliability of the network, since
ports and devices are only turned off when servers are turned off. The system
performance will only be affected if the network equipment activation cost is greater
than the server activation cost.
For more efficient traffic consolidation, the model should consider the racks
in virtual machine allocation and migration strategies, and rules that consolidate active
physical machines in as few racks as possible are necessary.
Beliefs are a set of empirical knowledge used to improve decisions, and are linked
to the used resources characteristics and to the type of services implemented in each
For each of the rules listed in the previous paragraph, a belief related to energy
consumption should be stated. If we consider [Christensen et al. 2010], examples
would be:
· Disconnecting a port on an access layer switch generates a saving of 500 mWh.
· Decreasing the speed of a port from 10 Gbps to 1 Gbps generates a saving of
It will also be necessary to include beliefs about the time required for a
deactivated port or device to become operational after the boot. These beliefs will be
used to make decisions that must consider performance requirements.
The typical data center network topology, rules and beliefs proposed form the
basis for building a simulation model to validate different strategies and rules in specific
settings and with different workloads. As already done in previous works by [Werner et
al. 2012], it is possible to extend CloudSim [Calheiros et al. 2011] or to work on one
of its extensions, such as TeachCloud [Jararweh et al. 2013].
The simulator must create the network topology and calculate its initial
consumption based on the number of physical servers using the following rules:
· If the number of servers is smaller than 40, the topology will have only two
access layer switches interconnected by their uplink ports. Turn off unused ports.
· If the number of servers is greater than 40 and smaller than 480 (12 Racks), put
two access layer switches for every 40 servers or fraction and two aggregation layer
switches interconnected by their uplink ports. Turn off unused ports of both layers
· If the number of servers is greater than 480, apply the previous rule for each
group of 480 servers or fraction, add two core layer switches and put on each switch a
24-port module for each 5,760 servers (144 racks) or fraction. Turn off unused ports.
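The sizing rules above can be sketched as a small function that counts the equipment for a given number of servers. The function and key names are ours; the text leaves the case of exactly 40 servers open, and the sketch assumes it is treated as a single rack.

```python
import math
from typing import Dict

SERVERS_PER_RACK = 40
RACKS_PER_AGG_PAIR = 12          # 480 servers
RACKS_PER_CORE_MODULE = 144      # 5,760 servers

def build_topology(servers: int) -> Dict[str, int]:
    """Count switches and core modules needed for `servers` physical servers."""
    if servers <= 0:
        raise ValueError("servers must be positive")
    racks = math.ceil(servers / SERVERS_PER_RACK)
    topology = {"racks": racks, "access": 2 * racks,
                "aggregation": 0, "core": 0, "modules_per_core": 0}
    if racks > 1:
        # Two aggregation switches for each group of 12 racks or fraction.
        topology["aggregation"] = 2 * math.ceil(racks / RACKS_PER_AGG_PAIR)
    if racks > RACKS_PER_AGG_PAIR:
        topology["core"] = 2
        topology["modules_per_core"] = math.ceil(racks / RACKS_PER_CORE_MODULE)
    return topology

case_study = build_topology(200)
```

For 200 servers this gives 5 racks, 10 access layer switches and 2 aggregation layer switches, with no core layer.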
The equation to calculate the consumption of the switches and modules is:
Power (W) = BP + (no. of 10 Gigabit ports x 5) + (no. of Gigabit ports x 0.5) + (no. of Fast Ethernet ports x 0.3) (1)
In this expression, the power in Watts is calculated by summing the BP (Base
Power), which is a fixed value specific to each device, and the consumption of every
active port at each speed, which is the variable component. The consumption of each
type of port is specific to each device, but the proposed values are the average values
according to the works already cited.
In equation (1), if the switch is modular, the base power of the chassis must be
During the simulation, when servers are connected or disconnected, the
simulator must apply the network management rules by turning on or off the
corresponding ports or configuring its speed, and update the calculation of the total
consumption of the network.
In order to analyze the system performance and SLA violations, the model must
know the time needed to put each type of equipment into operation and, at the moment
of the server’s activation, compare the activation time of the server with the
activation time of the network equipment and use the greater of the two.
1.4.3.1. Case Study for Management in Legacy Network Equipment
To validate the model and the potential of the proposal, it was applied to a hypothetical
case of a cloud with 200 physical servers, creating the topology, calculating its initial
consumption without network equipment management and illustrating two possible
situations in the operation of the system. For this scenario, the base power was
taken as 60 W for access layer switches and 140 W for aggregation layer switches.
Applying the rules to calculate the topology, it is determined that it comprises 5
racks housing a cluster of 40 servers each and, therefore, there will be 10 access layer
switches, each with forty Gigabit Ethernet ports and two 10 Gigabit Ethernet ports in
use, and two aggregation layer switches with 12 connected ports each: 10 ports for
access layer switches and two ports for the uplink interconnection between them.
Scenario 1: All network equipment with all its ports connected.
The consumption of the network will be:
Access layer switches = 10 x (60 + 2x5 + 48x0.5) = 940 W (2)
Aggregation layer switches = 2 x (140 + 24x5) = 520 W (3)
Total network consumption = 1,460 W (4)
Scenario 2: Initial configuration with unused ports off.
The consumption of the network will be:
Access layer switches = 10 x (60 + 2x5 + 40x0.5) = 900 W (5)
Aggregation layer switches = 2 x (140 + 12x5) = 400 W (6)
Total network consumption = 1,300 W (7)
In this scenario, it is observed that the proper initial configuration of the
network alone yields a power saving of approximately 11%.
Scenario 3: 90 active servers, workload consolidated in the first three racks and
network configuration rules applied.
In this situation, according to the rules, there are 4 access layer switches working
in initial conditions (8), two access layer switches working with twelve 1 Gbps ports, 10
for servers and 2 uplink ports with their speed reduced (9), and 2 aggregation layer
switches with four 10 Gbps ports and two 1 Gbps ports (10), and the network
consumption will be:
Access layer switches 1 = 4 x (60 + 2x5 + 40x0.5) = 360 W (8)
Access layer switches 2 = 2 x (60 + 12x0.5) = 132 W (9)
Aggregation switches = 2 x (140 + 4x5 + 2x0.5) = 322 W (10)
Total network consumption = 814 W (11)
In this scenario, there is a power saving of approximately 45% in network
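The three scenario totals can be recomputed from equation (1). The sketch below uses the per-port values and base powers given in the text; the data layout and function names are ours.

```python
from typing import Dict, List, Tuple

PORT_POWER_W = {"10G": 5.0, "1G": 0.5, "fast": 0.3}   # per active port, equation (1)

def switch_power(base_power_w: float, active_ports: Dict[str, int]) -> float:
    """Equation (1): base power plus the consumption of every active port."""
    return base_power_w + sum(PORT_POWER_W[speed] * n for speed, n in active_ports.items())

def network_power(groups: List[Tuple[int, float, Dict[str, int]]]) -> float:
    """Sum the power of (count, base_power_w, active_ports) switch groups."""
    return sum(count * switch_power(base, ports) for count, base, ports in groups)

ACCESS_BP, AGG_BP = 60, 140   # base power in watts, as assumed in the case study
scenario_1 = [(10, ACCESS_BP, {"10G": 2, "1G": 48}), (2, AGG_BP, {"10G": 24})]
scenario_2 = [(10, ACCESS_BP, {"10G": 2, "1G": 40}), (2, AGG_BP, {"10G": 12})]
scenario_3 = [(4, ACCESS_BP, {"10G": 2, "1G": 40}),
              (2, ACCESS_BP, {"1G": 12}),
              (2, AGG_BP, {"10G": 4, "1G": 2})]
totals = [network_power(s) for s in (scenario_1, scenario_2, scenario_3)]
```

The resulting totals are 1,460 W, 1,300 W and 814 W, matching equations (4), (7) and (11).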
1.5.1. Conclusion for Integrated Management Model
We proposed an integrated model of environment, services and network management
for green clouds based on an organization model of autonomous agent components.
Concepts related to cloud computing and green cloud computing were presented. We
demonstrated that the proposed solution delivers both reliability and sustainability,
contributing to our goal to optimize energy utilization.
Tests were realized to prove the validity of the system by utilizing the CloudSim
simulator from the University of Melbourne in Australia. We have implemented
improvements related to service-based interaction. We implemented migration policies
and relocation of virtual machines by monitoring and controlling the system. We
achieved the following results in the test environment:
• Dynamic physical orchestration and service orchestration led to 87.18% energy
savings when compared to static approaches.
• Improvements in load balancing and high availability schemas provided up to
an 8.03% decrease in SLA errors.
We are building a unified power management strategy for green cloud
computing that minimizes the total power consumed by accounting for network device
power, server power, and cooling power.
1.5.2. Conclusion for Provisioning and Allocation Strategies
Based on what was presented in previous sections, and considering the objectives
defined, we consider that the intended goal was achieved. Two strategies for allocation
and provisioning were proposed; both aimed at optimizing the energy consumption and
resource utilization without sacrificing service availability.
The allocation strategy in private Clouds, compared to a normal Cloud,
demonstrated an 87% reduction in energy consumption. However, it was observed that
this strategy is not effective in scenarios with large oscillations in workload, because
it ends up generating too many reconfigurations and migrations, which have a
significant computational cost. Despite this, it still shows a significant improvement in
energy savings when compared to a Cloud without any resource management strategy
deployed. It should be mentioned that part of the 87% reduction is derived from the
fact that the energy consumption of the public Cloud is not considered. This part
represents approximately 3% of the final value.
Figure 1.19 shows a comparison between the green Cloud provisioning
strategies. The strategies are compared with the SR (Spare Resource) strategy,
which is the most expensive since it always keeps spare resources to maintain SLAs under
unpredicted increases in workload. While the OD (On Demand) strategy achieves up to
47.05% energy savings compared to the SR strategy, the proposed hybrid strategy
shows up to 3.13% improvement, achieving 50.13% energy savings with fewer
timeouts than the OD strategy. The energy savings were even greater when we
simulated an environment where the deployed servers had DVFS enabled. This
improved the energy savings to 59.87% while maintaining the timeout rate for extreme
situations, such as when the request load was multiplied by a factor of 20 and each
request required 2000 million instructions to be processed.
We should mention that we found (and fixed) some "bugs" in the CloudSim DVFS
module. The simulator bases the energy consumption directly on the use of the CPU,
regardless of other components in the physical machine such as GPUs, NICs, memory
and HDs, which leads to energy consumption estimates lower than they should be.
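To illustrate why a CPU-only model underestimates consumption, the following sketch compares a linear CPU-utilization power model with the same model plus a constant term for the remaining components. This is not CloudSim code and not the fix applied by the authors: the linear form, the function names, and all wattages are hypothetical values chosen only for illustration.

```python
def cpu_only_power(utilization, idle_w, max_w):
    """Linear model: power grows from idle_w to max_w with CPU utilization in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_w + (max_w - idle_w) * utilization


def host_power(utilization, idle_w, max_w, other_components_w):
    """Same CPU model plus a fixed draw for memory, disks, NICs, GPUs, etc."""
    return cpu_only_power(utilization, idle_w, max_w) + other_components_w


# Hypothetical figures, not measurements from the paper.
IDLE_W, MAX_W, OTHER_W = 100.0, 200.0, 50.0

for u in (0.0, 0.5, 1.0):
    cpu = cpu_only_power(u, IDLE_W, MAX_W)
    full = host_power(u, IDLE_W, MAX_W, OTHER_W)
    print(f"utilization={u:.1f}  cpu_only={cpu:.1f} W  with_other={full:.1f} W")
```

Under these assumptions the CPU-only estimate is lower than the whole-host estimate at every utilization level, by exactly the draw of the components it ignores.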
Figure 1.19. Average energy consumption gain over the strategies [Geronimo et al. 2014]
The strategy that achieved the lowest timeout rate was the SR strategy, followed
directly by the hybrid strategy with a difference lower than 3%. The SR strategy was
expected to achieve better timeout rates since it always has a spare VM and PM to
absorb sudden spikes in workload, though this comes at a high cost in energy
consumption and resource optimization. So, if we consider the energy savings and
resource optimization generated by the hybrid strategy and compare them with the
expenses of the SR strategy, the 3% extra timeouts generated by the hybrid strategy is
Thus, we conclude that the use of the hybrid strategy is recommended in
situations where the activation time of resources directly affects the SLA (in other
words, generates fines). This strategy is the most balanced strategy for resource
provisioning in green Cloud environments. However, this approach is not
recommended when access to public Cloud resources is poor or the Cloud provider
is lacking in resource quality, security or other factors that can directly affect the SLAs.
As future work, we aim to add the strategy of Dynamic Reconfiguration of
VMs in public Clouds. This way the public Cloud provider would be able to better
manage its resources. This procedure was not adopted because, during the development
of this work, this feature was not a market reality. We also intend to invest in new
simulations of the Cloud, extending the variables (e.g., adding a UPS variable), exploring
artificial intelligence techniques [Koch and Westphall 2001] such as Bayesian
networks, adding the recalculation of beliefs, and repeating the simulation with different
Cloud simulators such as GreenCloud [Kliazovich et al. 2010], iCanCloud [Núñez et al.
2012] or MDCSim [Lim et al. 2009]. This way we could compare the results and check
whether our proposed models show the same benefits in different simulation engines.
We also want to implement our proposed solutions in a real data center, in order
to derive an error factor between the results obtained with simulation tools
and the results of a real Cloud. This way we could measure how accurate the Cloud
simulation tools are when compared with a real environment. Our PCMONS (Private
Cloud Monitoring System), an open-source solution for Cloud monitoring and
management, will also help to manage Green Clouds by automating the instantiation of
new resources [Chaves et al. 2011].
We foresee a way of handling scenarios with unexpected workload peaks. Prior
knowledge of the behavior of hosted services could allow the management services to
improve consolidation and energy consumption while maintaining the services’
expected behaviors. We believe it is necessary to develop a description language that
represents the structure and behavior of a service, enabling and easing the exchange of
information between application developers and Cloud providers for planning,
provisioning, and managing the Cloud.
1.5.3. Conclusion for Management in Legacy Network Equipment
Basic concepts related to Green IT, i.e., Green Cloud and Green Networking, were
first presented, demonstrating the need to consider the network equipment in strategies
designed to make data centers more efficient, since the network represents a significant
percentage of total consumption, and this share will become more significant as
the other components become more efficient.
A green cloud management model called OTM (Organization Theory Model)
was presented, as well as network equipment management principles that, when
properly applied, make the behavior of the total consumption of the network
approximately proportional to the traffic load, even when legacy energy-agnostic
equipment is used. The proposal was to extend the OTM to manage the network
traffic consolidation according to these management principles.
Then, the elements that must be added to the architecture of the OTM were
described, including the rules and beliefs required for the correct network configuration
according to the load consolidation on servers.
A model was also proposed to determine the data center network topology
based on the number of physical servers, the rules to manage and set the network
devices according to the servers’ state changes, and equations to calculate the switches’
consumption and the total network consumption. This model is the basis for creating a
simulator and performing simulations to test the viability and the impact of applying the
proposal in different configurations, with different performance requirements and
with different rules and beliefs.
The model was validated by applying it to a case study, which allowed us to
verify that the equations and rules are correct and sufficient to create the topology and to
calculate the consumption of the network in each step of the simulation, as well as to
highlight the possible effects of applying the proposal.
It was also demonstrated that, in the described scenario, it is possible to get a
power saving of approximately 11% solely through proper initial configuration of the
network, without any compromise in performance. In a hypothetical situation of
low utilization, a power saving of approximately 45% through proper workload
consolidation is possible. The possibility and desirability of
extending the green cloud management model as proposed were thus demonstrated.
It is important to consider that the impact of applying the model is greatest on
legacy energy-agnostic equipment and will be smaller as the equipment becomes more
energy-aware by applying Green Networking resources as described in [D-LINK 2011],
but its application will still be worthwhile.
As future research, it is proposed to continue this work by developing the
necessary extensions to CloudSim to implement the model, and to perform experiments to
determine the most effective rules and virtual machine allocation policies, and the actual
contribution of the model in scenarios with different configurations and real workloads,
taking into account possible violations of the SLA.
To evaluate the applicability of the model, it is also proposed to determine,
through simulation, how many times a day a port or a device is turned on and off in real
scenarios, and the possible impact on the equipment failure rate.
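Counting such state changes from a simulation trace is straightforward. The sketch below assumes a hypothetical trace format, a time-ordered list of (timestamp in seconds, is_on) samples for one port or device, and counts power-on and power-off transitions per day. Neither the trace format nor the function name comes from the original model.

```python
from collections import Counter

SECONDS_PER_DAY = 86_400


def transitions_per_day(trace):
    """Count on/off state changes per day.

    trace: time-ordered iterable of (timestamp_seconds, is_on) samples.
    Returns a dict mapping day index -> {"on": n, "off": m}, listing only
    the days and transition kinds that actually occurred.
    """
    counts = {}
    previous = None
    for timestamp, is_on in trace:
        # A transition is any sample whose state differs from the previous one.
        if previous is not None and is_on != previous:
            day = int(timestamp // SECONDS_PER_DAY)
            day_counts = counts.setdefault(day, Counter())
            day_counts["on" if is_on else "off"] += 1
        previous = is_on
    return {day: dict(c) for day, c in counts.items()}


# Hypothetical trace: the port is switched off and on again during the
# first day, then switched off once during the second day.
trace = [(0, True), (3_600, False), (7_200, True), (90_000, False)]
print(transitions_per_day(trace))
```

The per-day counts obtained this way could then be compared against the switching limits recommended by equipment vendors, where such limits are available.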
Finally, since system performance may be affected if the activation cost of the network
devices is higher than the server activation cost, it is also suggested to study the
proper network configuration and technologies to avoid this situation, with special
consideration to protocols that manage link redundancy and aggregation, such as the
Spanning Tree Protocol, MC-LAG, and other new networking standards for data
centers.
References
ACPI. (2010) “Advanced configuration and power interface specification”. Hewlett-
Packard, Intel, Microsoft, Phoenix, Toshiba. Accessed on September 2013. Available
Basoglu, M., Orshansky, M. and Erez, M. (2010) “NBTI-aware DVFS: A new approach to
saving energy and increasing processor lifetime,” in Low-Power Electronics and
Design (ISLPED), 2010 ACM/IEEE International Symposium on, 2010, pp. 253–
Beloglazov, A., Buyya, R., Lee, Y.C. and Zomaya, A. (2011) “A taxonomy and
Survey of Energy-efficient Datacenters and Cloud Computing”. Advances in
Computers, vol 82, pp. 47-111, Elsevier, November.
Bianzino, A., Chaudet, C., Rossi, D. and Rougier, J. (2012) “A survey of Green
Networking research”. IEEE Communications Surveys and Tutorials, vol 14, pp. 3-
Blanquicet, F., Christensen, K. (2008) “Managing energy use in a network with a new
SNMP Power State MIB”. In: IEEE Conference on Local Computer Networks LCN,
ed. 33, October.
Bolla, R., Bruschi, R., Davoli, F., Cucchietti, F. (2011) “Energy efficiency in the future
internet: a survey of existing approaches and trends in energy-aware fixed network
infrastructures”. IEEE Communications Surveys & Tutorials, May.
Buyya, R. (2009) “Modeling and simulation of scalable cloud computing
environments and the cloudsim toolkit: Challenges and opportunities,” in HPCS
2009. International Conference on. IEEE, pp. 1–11.
Buyya, R., Beloglazov, A. and Abawajy, J. (2010) “Energy-Efficient management of
data center resources for cloud computing: A vision, architectural elements, and open
challenges,” in Proceedings of the 2010 International Conference on Parallel and
Distributed Processing Techniques and Applications (PDPTA 2010), Las Vegas,
USA, July 12, vol. 15.
Buyya, R., Ranjan, R. and Calheiros, R. N. (2010) “Intercloud: Utility-oriented
federation of cloud computing environments for scaling of application services,” in
Proceedings of the 10th International Conference on Algorithms and Architectures
for Parallel Processing. LNCS, Springer.
Calheiros, R. N., Ranjan, R., Beloglazov, A., De Rose, C. A. F. and Buyya, R. (2011)
“Cloudsim: A toolkit for modeling and simulation of cloud computing environments
and evaluation of resource provisioning algorithms,” Software: Practice and
Experience, vol. 41, pp. 25–50.
Chaves, S. A. de, Uriarte, R. B. and Westphall, C. B. (2011) “Toward an architecture
for monitoring private clouds,” Communications Magazine, IEEE, vol. 49, no. 12,
pp. 130 –137, December 2011.
Christensen, K., Reviriego, P., Nordman, B., Mostowfi, M. and Maestro, J. (2010)
“IEEE 802.3az: The road to Energy Efficient Ethernet”, IEEE Communications
Magazine, vol 48, pp. 50-56, November.
Dawoud, W., Takouna, I. and Meinel, C. (2012) “Dynamic scalability and contention
prediction in public infrastructure using internet application profiling,” in Cloud
Computing Technology and Science (CloudCom), 4th International Conference on.
IEEE, 2012, pp. 208–216.
Dignum, F., Dignum, V., Padget, J. and Vazquez-Salceda, J. (2009) “Organizing web
services to develop dynamic, flexible, distributed systems,” in iiWAS’09:
Proceedings of the 11th International Conference on Information Integration and
Web-based Applications & Services. New York, NY, USA: ACM, pp. 225–234.
D-LINK. Green Technologies. Taipei: D-LINK, 2011, available at:
http://www.dlinkgreen.com/energyefficiency.asp. Accessed on 13 June 2013.
Do, A. V., Chen, J., Wang, C., Lee, Y. C., Zomaya, A. Y. and Zhou, B. B. (2011)
“Profiling Applications for Virtual Machine Placement in Clouds,” 2011 IEEE 4th
International Conference on Cloud Computing, pp. 660–667, Jul. 2011.
FACEBOOK. Lulea goes live. Lulea: Facebook Inc., 2013. Accessed on January 2014.
Available at: https://www.facebook.com/notes/luleå-data-center/luleå-goes-
Foster, I., Zhao, Y., Raicu, I. and Lu, S. (2008) “Cloud computing and grid computing
360-degree compared,” in Grid Computing Environments Workshop. GCE 08, nov.
2008, pp. 1–10.
Geronimo, G. A., Werner, J., Weingartner, R., Westphall, C. B. and Westphall, C. M.
(2014) “Provisioning, Resource Allocation, and DVFS in Green Clouds”.
International Journal On Advances in Networks and Services.
Gong, Z., Gu, X. and Wilkes, J. (2010) “Press: Predictive elastic resource scaling for
cloud systems,” in Network and Service Management (CNSM), 2010 International
Conference on. IEEE, pp. 9–16.
GOOGLE. Data center efficiency. Mountain View: Google Inc., 2011. Available at:
Accessed on September 2013.
Garg, S. K. and Buyya, R. (2012) “Green cloud computing and environmental
sustainability”. In: Murugesan, S. and Gangadharan, G. Harnessing Green IT:
Principles and Practices. Oxford: Wiley Press.
Gruber, C. G. (2009) “Capex and opex in aggregation and core networks,” in Optical
Fiber Communication Conference. Optical Society of America, pp. 1–3.
Gunaratne, C., Christensen, K. and Nordman, B. (2005) “Managing energy
consumption costs in desktop pcs and lan switches with proxying, split tcp
connections, and scaling of link speed,” International Journal of Network
Management, vol. 15, no. 5, pp. 297–310.
Gunaratne, C., Christensen, K., Nordman, B. and Suen, S. (2008) “Reducing the energy
consumption of ethernet with adaptive link rate (alr),” Computers, IEEE
Transactions on, vol. 57, no. 4, pp. 448–461.
Hulkury, M. N. and Doomun, M. R. (2012) “Integrated green cloud computing
architecture,” in Proceedings of the 2012 International Conference on Advanced
Computer Science Applications and Technologies, ser. ACSAT ’12. Washington,
DC, USA: IEEE Computer Society, pp. 269–274.
IEEE p802.3az Energy Efficient Ethernet Task Force. New York: IEEE, 2010. available
at: http://www.ieee802.org/3/az/. Accessed on September 2013.
Jararweh, Y., Kharbutli, M. and Alsaleh, M. (2013) “TeachCloud: A Cloud Computing
Educational Toolkit”. International Journal of Cloud Computing (IJCC), Vol. 2, No.
2/3, February 2013, pp. 237-257.
Kim, W. (2013) “Fast, per-core dvfs using fully integrated voltage regulators,” [Online;
Last access: 2013-01-15]. [Online]. Available: http://www.eecs.harvard.edu/
Kim, W., Gupta, M., Wei, G.-Y. and Brooks, D. (2008) “System level analysis of fast,
per-core dvfs using on-chip switching regulators,” in High Performance Computer
Architecture, 2008. HPCA 2008. IEEE 14th International Symposium on, 2008, pp.
Kliazovich, D., Bouvry, P., Audzevich, Y. and Khan, S. U. (2010) “GreenCloud: A
Packet-Level Simulator of Energy-Aware Cloud Computing Data Centers,” 2010
IEEE Global Telecommunications Conference GLOBECOM 2010, pp. 1–5.
Koch, F. L. and Westphall, C. B. (2001) “Decentralized network management using
distributed artificial intelligence,” Journal of Network and Systems Management,
vol. 9, pp. 375–388.
Laszewski, G. von, Wang, L., Younge, A. and He, X. (2009) “Power-aware scheduling
of virtual machines in dvfs-enabled clusters,” in Cluster Computing and Workshops.
CLUSTER ’09. IEEE International Conference on, Aug. 31 - Sept. 4, 2009, pp. 1–10.
Leandro, M. A. P., Nascimento, T. J., dos Santos, D. R., Westphall, C. M. and
Westphall, C. B. (2012) “Multi-tenancy authorization system with federated identity
for cloud-based environments using shibboleth,” in ICN 2012, The Eleventh
International Conference on Networks, 2012, pp. 88–93.
Lefevre, L. and Orgerie, A.-C. (2010) “Designing and evaluating an energy efficient
cloud,” The Journal of Supercomputing, vol. 51, pp. 352–373.
Lim, S.-H., Sharma, B., Nam, G., Kim, E. K. and Das, C. R. (2009) “Mdcsim: A multi-tier
data center simulation platform,” in Cluster Computing and Workshops, 2009.
CLUSTER’09. IEEE International Conference on. IEEE, 2009, pp. 1–9.
Liu, L., Wang, H., Liu, X., Jin, X., He, W. B., Wang, Q. B. and Chen, Y. (2009)
“Greencloud: a new architecture for green data center,” in ICAC-INDST ’09:
Proceedings of the 6th international conference industry session on Autonomic
computing and communications industry session. New York, NY, USA: ACM, 2009,
Magklis, G., Semeraro, G., Albonesi, D., Dropsho, S., Dwarkadas, S. and Scott, M.
(2003) “Dynamic frequency and voltage scaling for a multiple-clock-domain
microprocessor,” Micro, IEEE, vol. 23, no. 6, pp. 62–68.
Mahadevan, P., Sharma, P., Banerjee, S. and Ranganathan, P. (2009) “Power
Benchmarking Framework for Network Devices”. Proc. 8th International IFIP-TC 6
Networking Conference, Springer Berlin Heidelberg, November, pp. 795-808.
Mahadevan, P., Banerjee, S., Sharma, P., Shah, A., Ranganathan, P. (2011) “On Energy
Efficiency for Enterprise and Data Center Networks,” in IEEE Communications
Minas, L. and Ellison, B. (2009) “Energy efficiency for information technology: how to
reduce power consumption in server and datacenter”. Intel Press.
Murugesan, S. (2008) “Harnessing green it: Principles and practices,” IT Professional,
vol. 10, no. 1, pp. 24–33.
Núñez, A., Vázquez-Poletti, J. L., Caminero, A. C., Castañé, G. G., Carretero, J. and
Llorente, I. M. (2012) “icancloud: A flexible and scalable cloud infrastructure
simulator,” Journal of Grid Computing, vol. 10, no. 1, pp. 185–209.
OpenCC, “Open cloud consortium,” 2012. [Online; Last access: 2013-01-15].
Available: http://opencloudconsortium.org/
OCCI, “Open cloud computing interface,” 2012. [Online; Last access: 2013-01-15].
Available: http://www.occi-wg.org
Pinheiro, E., Bianchini, R., Carrera, E. V., and Heath, T. (2001) “Load balancing and
unbalancing for power and performance in cluster-based systems,” in Proceedings of
the Workshop on Compilers and Operating Systems for Low Power (COLP’01), Sep
2001, pp. 182–195.
Shen, Z., Subbiah, S., Gu, X. and Wilkes, J. (2011) “Cloudscale: elastic resource scaling
for multi-tenant cloud systems,” in Proceedings of the 2nd ACM Symposium on
Cloud Computing. ACM, 2011, p. 5.
Sher Decusatis, C. J., Carranza, A. and Decusatis, C. M. (2012) “Communication within
clouds: open standards and proprietary protocols for data center networking”,
IEEE Communications Magazine. Vol. 50, pp. 26-33, September.
Urgaonkar, B., Shenoy, P. and Roscoe, T. (2009) “Resource overbooking and
application profiling in a shared Internet hosting platform,” ACM Transactions on
Internet Technology, vol. 9, no. 1, pp. 1–45, Feb.
Valancius, V., Laoutaris, N., Massoulie, L., Diot, C., and Rodriguez, P. (2009)
“Greening the internet with nano data centers,” in CoNEXT ’09: Proceedings of the
5th international conference on Emerging networking experiments and technologies.
New York, NY, USA: ACM, pp. 37–48.
Villarreal, S. R., Westphall, C. B. and Westphall, C. M. (2014) “Optimizing Green
Clouds through Legacy Network Infrastructure Management”. Thirteenth
International Conference on Networks - ICN 2014, pp. 142-147.
Wang, Q., Kanemasa, Y., Li, J., Lai, C. A., Matsubara, M. and Pu, C. (2013) “Impact of
dvfs on n-tier application performance,” in Conference on Timely Results in
Operating Systems (TRIOS). ACM.
Werner, J., Geronimo, G. A., Westphall, C. B., Koch, F. L. and Freitas, R. R. (2011)
“Simulator improvements to validate the green cloud computing approach,”
LANOMS Latin American Network Operations and Management Symposium, vol.
1, pp. 1–8, 2011.
Werner, J., Geronimo, G. A., Westphall, C. B., Koch, F. L., Freitas, R. R. and
Westphall, C. M. (2012) “Environment, services and network management for green
clouds,” CLEI Electronic Journal, vol. 15, no. 2, p. 2.
Westphall, C. B. and Villarreal, S. R. (2013) “Princípios e Tendências em Green Cloud
Computing”, Revista Eletrônica de Sistemas de Informação, v. 12, n. 1, pp. 1-19,
Wood, T., Shenoy, P., Venkataramani, A. and Yousif, M. (2009) “Sandpiper: Black-box
and gray-box resource management for virtual machines,” Computer Networks, vol.
53, no. 17, pp. 2923–2938, Dec.