Carbon-Aware Computing in a Network of Data Centers: A Hierarchical Game-Theoretic Approach

Enno Breukelman¹, Sophie Hall², Giuseppe Belgioioso², and Florian Dörfler²

Abstract— Over the past decade, the continuous surge in cloud computing demand has intensified data center workloads, leading to significant carbon emissions and driving the need for improving their efficiency and sustainability. This paper focuses on the optimal allocation problem of batch compute loads with temporal and spatial flexibility across a global network of data centers. We propose a bilevel game-theoretic solution approach that captures the inherent hierarchical relationship between supervisory control objectives, such as carbon reduction and peak shaving, and operational objectives, such as priority-aware scheduling. Numerical simulations with real carbon intensity data demonstrate that the proposed approach successfully reduces carbon emissions while simultaneously ensuring operational reliability and priority-aware scheduling.
I. INTRODUCTION
Between 2010 and 2020, the global compute load on data centers (DCs) increased more than ninefold [1]. As a consequence, the number of hyper-scale DCs doubled between 2015 and 2021, reaching 700 installed facilities worldwide [2]. For their operation, DCs require a significant supply of electricity from the grid, about 1-1.5% of the global electricity demand in 2022, corresponding to 230-340 TWh [3]. Unfortunately, the electricity they run on still comes predominantly from carbon-intensive sources. Incorporating more renewable energy into the electricity mix is challenging due to supply variability, which depends on both the time of day and the geographical location. Remarkably, a large share of the global compute load is neither time-sensitive, allowing delayed execution, nor bound to a specific DC, allowing execution at a different DC. Therefore, various compute-demand management and real-time routing mechanisms exploit this flexibility to mitigate the carbon impact and provide ancillary services to the power grid.
Most of the existing work focuses on real-time routing of temporally inflexible compute jobs [4], such as on-demand services and search engine requests. Furthermore, loads with either spatial or temporal flexibility [5], [6] are considered, but rarely the combination of both. Compute jobs are also mostly considered part of an aggregate compute load, which does not differentiate between individual jobs [7]. Typical objectives considered by these allocation mechanisms include reducing monetary expenses, carbon taxes [8], [4], carbon emissions [9], and inducing peak shaving [5], [10]. Additional modeling features investigated in existing research include: (i) the interaction with electricity utilities and generators [11], [5], (ii) more detailed power modeling of the DC facilities [12], and (iii) the geographical location and load migration among a network of DCs [13].

¹Enno Breukelman is with the KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science, Division of Decision and Control Systems, Malvinas väg 10, SE-100 44 Stockholm, Sweden. cebre@kth.se
²Sophie Hall, Giuseppe Belgioioso, and Florian Dörfler are with the Automatic Control Laboratory, Department of Electrical Engineering and Information Technology, ETH Zurich, Physikstrasse 3, 8092 Zürich, Switzerland. {gbelgioioso, shall, dorfler}@ethz.ch
This work is supported by the SNSF via NCCR Automation (Grant Number 180545).
Within this large body of literature, the carbon-aware computing platform proposed in [14] stands out, as it is currently implemented by Google to operate its inter-continental fleet of DCs. Google uses carbon intensity forecasts and predicts the future compute demand for its DCs. This information is used to generate so-called Virtual Capacity Curves (VCCs), which limit the hourly resources of the DCs in the fleet such that temporally flexible compute load is pushed to less carbon-intense hours. Thus, by harnessing the compute load predictions, carbon emissions can be reduced.
In this paper, we design a novel day-ahead scheduling mechanism to allocate batch compute jobs over an interconnected fleet of DCs, inspired by Google's carbon-intelligent platform in [14]. The approach in [14] is scheduler-agnostic, meaning the VCCs are computed independently of the actual compute job schedule. In contrast, we co-design the VCCs and the compute job allocation using a hierarchical game-theoretic approach that distinguishes between individual compute jobs. Additionally, our mechanism considers not only temporal shifting but also spatial migration.
The contributions of this paper are threefold. Firstly, we formalize the optimal allocation problem of compute jobs with spatial and temporal flexibility over a network of DCs as a single-leader multiple-follower Stackelberg game. At the lower level, the owners of compute jobs compete for the computational resources in the DC network to process their jobs as soon as possible. At the upper level, the DC operator generates the virtual capacities to induce a competitive allocation that reduces carbon emissions and induces peak shaving. Secondly, we derive an efficient ad-hoc algorithm to solve the resulting large-scale Stackelberg game by adapting the method proposed in [15], which bears similarities to the approach for solving a Stackelberg game in [16]. Finally, we numerically validate the proposed allocation mechanism using real carbon intensity data for a selection of Google's DC locations.
Our numerical findings show that temporally shifting and spatially migrating flexible compute jobs significantly reduces carbon emissions. In contrast to scheduler-agnostic approaches, co-designing the VCCs can improve the homogeneity of waiting times in the job schedule.
Notation: $\mathbb{R}$, $\mathbb{R}_{\geq 0}$, $\mathbb{R}_{>0}$ denote the set of real, nonnegative real, and positive real numbers, respectively. Given $N$ scalars $a_1,\dots,a_N$, $\mathrm{diag}(a_1,\dots,a_N)$ denotes the diagonal matrix with $a_1,\dots,a_N$ on the main diagonal. Given $N$ column vectors $x_1,\dots,x_N \in \mathbb{R}^n$, $x = \mathrm{col}(x_1,\dots,x_N) = [x_1^\top,\dots,x_N^\top]^\top$ denotes their vertical concatenation. The Euclidean projection onto the set $X$ is denoted by $P_X[\cdot]$. The partial derivative of a function $f$ with respect to its $i$-th argument is denoted by $\nabla_i f(x_1,\dots,x_n) = \frac{\partial f(x_1,\dots,x_n)}{\partial x_i}$.
II. PROBLEM STATEMENT
We consider a set $\mathcal{I} := \{1,\dots,I\}$ of batch compute jobs to allocate across a fleet of DCs $\mathcal{D} := \{1,\dots,D\}$, over a planning horizon $\mathcal{T} := \{1,\dots,T\}$. Each batch job $i \in \mathcal{I}$ is uploaded by a customer, or team, at an initial location $d_i \in \mathcal{D}$, at which the data required for the computation is assumed to be physically stored. Each batch job is characterized by a predicted compute volume $v^i$ and a priority parameter $\tau^i$, describing its time urgency. The DCs are physically interconnected via a fiber network that allows compute jobs to be transferred across the fleet. This network is modeled via a weighted graph $\mathcal{G} := \mathcal{G}(\mathcal{D},\mathcal{E})$, where the vertices are the DCs and the edges are the fiber connections $e_{k,\ell} \in \mathcal{E}$ directly connecting neighboring DCs $k,\ell \in \mathcal{D}$. The weights of the edges encode the price of migrating the data, established by the internet service providers for using their routers.
A. Team’s allocation game
Each team shall choose the allocation of their batch job $i \in \mathcal{I}$, represented by the vector $y^i = \mathrm{col}(y^i_1,\dots,y^i_D)$, whose entries $y^i_{d,t} \in \mathbb{R}_{\geq 0}$ describe the share of the compute volume $v^i$ allocated to DC $d \in \mathcal{D}$ at time $t \in \mathcal{T}$. In other words, $y^i$ determines where and when to reserve DC capacities to execute compute job $i$. The allocation chosen by each team must satisfy various operational constraints. Intuitively, it must cover the total compute volume $v^i$, yielding
$$\sum_{t\in\mathcal{T}} \sum_{d\in\mathcal{D}} y^i_{d,t} = v^i, \quad \forall i \in \mathcal{I}. \qquad (1)$$
A job $i \in \mathcal{I}$ may be transferred from its initial DC $d_i$ to any alternative DC $j \in \mathcal{D}$ via a predetermined shortest path $\omega_{i,j}$ over the fiber-optic network $\mathcal{G}$. This path is uniquely identified and computed in advance, to ensure minimal transit time and resource usage, and consists of a sequence of fiber connections $e \in \mathcal{E}$. In Fig. 1, we illustrate an example of the considered setup, where the path $\omega_{1,3} = (e_{1,2}, e_{2,3})$ uses the fiber connections $e_{1,2}$ and $e_{2,3}$.
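Since the migration paths $\omega_{i,j}$ are fixed in advance, they can be precomputed once from the weighted fiber-network graph. A minimal sketch of such a precomputation using networkx is shown below; the 4-DC topology, edge weights, and data layout are illustrative assumptions, not data from the paper.

```python
import networkx as nx

# Hypothetical 4-DC fiber network; edge weights are illustrative migration prices sigma_e.
G = nx.Graph()
G.add_weighted_edges_from([(1, 2, 1.0), (2, 3, 0.5), (3, 4, 2.0), (4, 1, 1.5)])

# Precompute the cheapest path omega[d_i][j] between every pair of DCs,
# together with its total migration price sum_{e in omega} sigma_e.
omega = dict(nx.all_pairs_dijkstra_path(G, weight="weight"))
omega_cost = dict(nx.all_pairs_dijkstra_path_length(G, weight="weight"))

print(omega[1][3])       # e.g. [1, 2, 3], i.e., the path (e_{1,2}, e_{2,3})
print(omega_cost[1][3])  # 1.5 = sigma_{1,2} + sigma_{2,3}
```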
Any migrated share of $v^i$ appears in a migration variable $z^i = \mathrm{col}(z^i_1,\dots,z^i_D)$, whose entries $z^i_{j,t} \in \mathbb{R}_{\geq 0}$ describe how much predicted load is migrated from $d_i$ to DC $j \in \mathcal{D}$ at time step $t \in \mathcal{T}$, over the path $\omega_{i,j}$. The entry $z^i_{i,t}$ describes the queue of unprocessed compute volume at this DC. Since there is no migration before the first time step, there are $T-1$ migration steps.
Fig. 1. DC network featuring 4 DC locations interconnected by physical connection lines (solid lines). A job migration between DC 1 and DC 3 is realized via a path through the fiber network, here $\omega_{1,3} = (e_{1,2}, e_{2,3})$. Right: Teams are associated with a DC location, where they initially submit their compute jobs.
The share of a predicted compute volume $v^i$ executed at a DC other than the initial $d_i$, i.e., $d \in \mathcal{D}\setminus\{d_i\}$, must be migrated in the time step before:
$$y^i_{d,t+1} = z^i_{d,t}, \quad \forall i \in \mathcal{I},\ \forall d \in \mathcal{D}\setminus\{d_i\},\ \forall t \in \mathcal{T}\setminus\{T\}. \qquad (2)$$
Any share of a compute job $v^i$ that has not yet been computed at the initial DC $d_i$ or migrated to another DC ($z^i_{j\neq i,\ell}$) is stored in the queue for the next time step $z^i_{i,t}$:
$$v^i - \sum_{\ell=1}^{t} \Big( y^i_{d_i,\ell} + \sum_{j=1,\, j\neq i}^{D} z^i_{j,\ell} \Big) = z^i_{i,t}, \quad \forall i \in \mathcal{I},\ \forall t \in \mathcal{T}\setminus\{T\}. \qquad (3)$$
Finally, the total amount of compute job volume allocable at each DC $d \in \mathcal{D}$ is limited by
$$\sum_{i\in\mathcal{I}} y^i_{d,t} \leq x_{d,t}, \quad \forall d \in \mathcal{D},\ \forall t \in \mathcal{T}, \qquad (4)$$
where $x_{d,t}$ is a virtual capacity set by the DC operator. These virtual capacities, considered over the complete planning horizon and all DCs in the network, constitute the VCCs.
Let $\mathbf{y}^i = \mathrm{col}(y^i, z^i)$ denote the stacked vector of the local decision variables (namely, allocation and migration) of team $i$. All local operational constraints (1)–(4) can be compactly represented via the set-valued mapping
$$\mathcal{Y}^i(x, \mathbf{y}^{-i}) := \big\{ (y^i, z^i) \ \big|\ \text{(1)–(4) hold} \big\}, \qquad (5)$$
which depends on the VCCs $x = \mathrm{col}(x_1,\dots,x_D)$, viewed as parameters, as well as the allocations of the other compute jobs $\mathbf{y}^{-i} = \mathrm{col}(\mathbf{y}^1,\dots,\mathbf{y}^{i-1},\mathbf{y}^{i+1},\dots,\mathbf{y}^{I})$.
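To make the structure of (1)–(4) concrete, the sketch below assembles one team's feasible set as cvxpy constraints. The fleet size, horizon, volumes, VCC values, and the aggregate load of the other teams are illustrative placeholders rather than data from the paper, and row $d_i$ of the migration variable is used as the local queue.

```python
import cvxpy as cp
import numpy as np

D, T = 3, 4                      # toy fleet size and horizon (illustrative)
v_i, d_i = 10.0, 0               # compute volume and initial DC of team i (illustrative)
x = np.full((D, T), 6.0)         # VCCs set by the operator (placeholder values)
others = np.zeros((D, T))        # aggregate allocation of all other teams (placeholder)

y = cp.Variable((D, T), nonneg=True)   # y^i_{d,t}: share allocated to DC d at time t
z = cp.Variable((D, T), nonneg=True)   # z^i_{j,t}: share migrated to DC j (row d_i = queue)

constraints = [cp.sum(y) == v_i]                        # (1) cover the total compute volume
for d in range(D):
    if d != d_i:
        for t in range(T - 1):
            constraints.append(y[d, t + 1] == z[d, t])  # (2) migrate in the step before executing
for t in range(T - 1):                                  # (3) queue of unprocessed volume at d_i
    executed = cp.sum(y[d_i, : t + 1])
    migrated = sum(cp.sum(z[j, : t + 1]) for j in range(D) if j != d_i)
    constraints.append(v_i - executed - migrated == z[d_i, t])
constraints.append(y + others <= x)                     # (4) respect the virtual capacities
```

The team's objective (6), introduced next, would then be minimized over this feasible set.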
The objective of each team is to choose an allocation that minimizes execution time, migration cost, and deviation from a pre-determined allocation profile $\hat{\mathbf{y}}^i$, formulated as
$$J^i(\mathbf{y}^i) = \sum_{t\in\mathcal{T}} \sum_{d\in\mathcal{D}} \Big( \tau^i_t\, y^i_{d,t} + z^i_{d,t} \sum_{e\in\omega_{i,d}} \sigma^i_e \Big) + \tfrac{1}{2}\epsilon\, \| \mathbf{y}^i - \hat{\mathbf{y}}^i \|^2. \qquad (6)$$
The first term in (6) penalizes allocating compute jobs incrementally with the time steps and depends on the priority parameter $\tau^i_t = \tau^i l_t$, where $l_t = t/T$ is a time-dependent weight that penalizes delayed allocations. The second term in (6) penalizes the migration of compute jobs, as transferring data over the network comes with a time penalty for each team. Therein, the total price of migration is the sum of the prices at each fiber connection, $\sigma^i_e = \tau^i \sigma_e$, multiplied by the priority parameter to further penalize migrations of urgent compute jobs. The quadratic term penalizes the deviation from a predefined allocation and migration profile $\hat{\mathbf{y}}^i = \mathrm{col}(\hat{y}^i, \hat{z}^i)$.
Overall, each team solves the following optimization problem to find their optimal allocation:
$$(\forall i \in \mathcal{I}): \quad \min_{\mathbf{y}^i}\ J^i(\mathbf{y}^i) \quad \text{s.t.}\ \ \mathbf{y}^i \in \mathcal{Y}^i(x, \mathbf{y}^{-i}). \qquad (7)$$
The collection of these inter-dependent optimization problems constitutes a generalized game [17], parametric in the VCCs $x$. Note that the other teams' allocations $\mathbf{y}^{-i}$ enter the local constraints in (7) due to the resource constraint (4), rendering the game a generalized game. Furthermore, with the teams' objectives in (6) being decoupled, this game structure corresponds to an exact potential game [18].
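As a concrete reading of (6), a direct numpy transcription of one team's cost is sketched below; the priority parameter, per-DC path prices, and reference profiles are illustrative placeholders introduced only for this example.

```python
import numpy as np

def team_cost(y, z, tau_i, path_price, y_hat, z_hat, eps):
    """Evaluate J^i in (6) for one team.

    y, z         : (D, T) arrays of allocation and migration shares
    tau_i        : scalar priority parameter tau^i of the team
    path_price   : (D,) array with sum_{e in omega_{i,d}} sigma_e for each target DC d
    y_hat, z_hat : (D, T) reference allocation and migration profiles
    eps          : weight of the quadratic deviation penalty
    """
    D, T = y.shape
    l = np.arange(1, T + 1) / T                              # time weights l_t = t/T
    time_cost = tau_i * np.sum(l * y)                        # first term: delayed-allocation penalty
    migr_cost = np.sum((tau_i * path_price)[:, None] * z)    # second term: migration penalty
    dev = np.sum((y - y_hat) ** 2) + np.sum((z - z_hat) ** 2)
    return time_cost + migr_cost + 0.5 * eps * dev
```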
A meaningful solution concept for (7) is the Generalized Nash Equilibrium (GNE), which is a set of allocations $\bar{\mathbf{y}} = (\bar{\mathbf{y}}^1,\dots,\bar{\mathbf{y}}^I)$ that simultaneously solve the optimization problems in (7). Here, we focus on the special subclass of variational GNEs (v-GNEs) due to their computational tractability and economic fairness [17]. This subclass of equilibria corresponds to the solution set of the parametrized variational inequality $\mathrm{VI}(F, \mathcal{Y}(x))$, namely, the problem of finding a vector $\bar{\mathbf{y}}$ such that
$$\langle F(\bar{\mathbf{y}}),\ (\mathbf{y} - \bar{\mathbf{y}}) \rangle \geq 0, \quad \forall \mathbf{y} \in \mathcal{Y}(x), \qquad (8)$$
where $F(\mathbf{y}) := \mathrm{col}\big(\{\nabla J^i(\mathbf{y}^i)\}_{i\in\mathcal{I}}\big)$ is the pseudo-gradient mapping of the game (7), and $\mathcal{Y}(x)$ collects the operational constraints (5) and depends explicitly on the VCCs $x$. We denote by $\mathbf{y}^\star(\cdot)$ the parameter-to-solution mapping that, given the VCCs $x$, returns the set of solutions $\mathbf{y}^\star(x)$ to the VI in (8), namely, a set of strategically stable allocations. It can be shown that $\mathbf{y}^\star(x)$ is single-valued for any $x$ that makes (8) feasible, meaning that the allocation game has a unique solution for given VCCs, as the pseudo-gradient mapping $F$ is strongly monotone.
B. Supervisory Objectives
The DC operator can influence the outcome of the allocation game (7) by manipulating the virtual capacities $x_{d,t} \in \mathbb{R}_{\geq 0}$, limiting the total load at each DC and in each time step. As for the allocations, the VCCs must also satisfy a series of operational constraints. Firstly, the total available capacity defined by the VCCs must accommodate the total compute demand, yielding
$$\sum_{d\in\mathcal{D}} \sum_{t\in\mathcal{T}} x_{d,t} \geq \sum_{i\in\mathcal{I}} v^i. \qquad (9)$$
Additionally, the VCCs cannot exceed the physical computational capacity of the DCs, yielding
$$0 \leq x_{d,t} \leq x^{\max}_{d,t}, \quad \forall d \in \mathcal{D},\ t \in \mathcal{T}, \qquad (10)$$
Fig. 2. An example of sequential allocation of two compute jobs $y^1$ and $y^2$ on a data center $d \in \mathcal{D}$, where job 1 is of higher priority than job 2. The virtual capacity curve $x_{d,t}$ (red dashed line) limits the allocable load at each time slot. The maximum capacity of the DC $x^{\max}_{d,t}$ (solid black line) is obtained by subtracting the inflexible load from the maximum capacity of that DC $x^{\max}_d$ (dashed black line).
where the maximum capacity is denoted by $x^{\max}_{d,t}$. This value is obtained by subtracting the inflexible demand, which we cannot shift or migrate but assume to be forecast perfectly, from the physical capacity $x^{\max}_d$. Fig. 2 displays an exemplary job schedule for two jobs $y^1$ and $y^2$, with the VCC $x_{d,t}$ as their upper bound and the physical constraint $x^{\max}_d$. The operational constraints on the DC operator's decision are compactly represented via
$$\mathcal{X} := \big\{ x \ \big|\ \text{(9), (10) hold} \big\}. \qquad (11)$$
The DC operator utilizes its influence on the teams' allocations to reduce the carbon impact and peak usage of the DCs. This goal is modeled by the multi-objective function
$$\phi(x, \mathbf{y}) = \phi_{\mathrm{carb}}(\mathbf{y}) + \phi_{\mathrm{peak}}(\mathbf{y}) + \xi\, \phi_{\mathrm{migr}}(\mathbf{y}). \qquad (12)$$
The term $\phi_{\mathrm{carb}}$ penalizes carbon emissions and is defined as
$$\phi_{\mathrm{carb}}(\mathbf{y}) = \sum_{d\in\mathcal{D}} \sum_{t\in\mathcal{T}} \rho^{\mathrm{carb}}_{d,t} \Big( \sum_{i\in\mathcal{I}} y^i_{d,t} \Big), \qquad (13)$$
where $\rho^{\mathrm{carb}}_{d,t}$ is the carbon intensity of the supplied power at DC $d$ and time step $t$, modeling the carbon impact of utilizing computing power in a linear relationship, as is done in [7], [19]. The second term $\phi_{\mathrm{peak}}$ penalizes peak usage of the DCs' computational resources per time step and is defined as
$$\phi_{\mathrm{peak}}(\mathbf{y}) = \Big( \sum_{d\in\mathcal{D}} \sum_{t\in\mathcal{T}} \Big( \sum_{i\in\mathcal{I}} y^i_{d,t} \Big)^{p} \Big)^{1/p}, \qquad (14)$$
for some $p \in \mathbb{N}$ large enough to approximate the infinity norm. The third term penalizes the aggregate migration of compute jobs and is defined as
$$\phi_{\mathrm{migr}}(\mathbf{y}) = \sum_{i\in\mathcal{I}} \sum_{t\in\mathcal{T}} \sum_{d\in\mathcal{D}} z^i_{d,t} \sum_{e\in\omega_{i,d}} \sigma^i_e. \qquad (15)$$
This last term is necessary to reduce network traffic from the DC operator's perspective, as the DC operator is responsible for paying network routing fees to internet service providers. Finally, the parameter $\xi \geq 0$ in (12) regulates the impact of migration on the final allocation. In contrast to an exclusively monetary objective, the multi-objective function in (12) addresses sustainability aspects directly.
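For illustration, the supervisory objective (12)–(15) can be evaluated directly on stacked allocation arrays as sketched below; the array layout and parameter values are assumptions made for this example, and the peak term follows the single p-norm reading of (14).

```python
import numpy as np

def operator_cost(Y, Z, rho_carb, path_price, tau, xi, p=6):
    """Evaluate the supervisory objective (12)-(15).

    Y, Z       : (I, D, T) arrays of allocations y^i_{d,t} and migrations z^i_{d,t}
    rho_carb   : (D, T) carbon intensities rho^carb_{d,t}
    path_price : (I, D) array with sum_{e in omega_{i,d}} sigma_e
    tau        : (I,) priority parameters tau^i
    xi         : weight of the migration term
    p          : exponent approximating the infinity norm
    """
    load = Y.sum(axis=0)                                   # aggregate load sum_i y^i_{d,t}
    phi_carb = np.sum(rho_carb * load)                     # (13) carbon cost
    phi_peak = np.sum(load ** p) ** (1.0 / p)              # (14) p-norm peak penalty
    phi_migr = np.sum(tau[:, None, None] * path_price[:, :, None] * Z)   # (15) migration cost
    return phi_carb + phi_peak + xi * phi_migr
```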
Overall, the DC operator aims to solve the following single-leader multiple-follower Stackelberg game:
$$\min_{x,\mathbf{y}}\ \ \phi(x, \mathbf{y}) \qquad (16a)$$
$$\text{s.t.}\ \ x \in \mathcal{X} \qquad (16b)$$
$$\qquad\ \langle F(\mathbf{y}),\ (\mathbf{y}' - \mathbf{y}) \rangle \geq 0, \quad \forall \mathbf{y}' \in \mathcal{Y}(x). \qquad (16c)$$
III. BILEVEL GAME SOLUTION APPROACH
We solve (16) using a customized version of a first-order
algorithm named BIG Hype, recently proposed in [15].
A. A Hyper-gradient based algorithm
The first-order algorithm in [15] is based on the idea of substituting the equilibrium constraints (16c) into the objective function (16a), by exploiting the solution mapping $\mathbf{y}^\star(\cdot)$ and the fact that solutions to (8) are unique. The resulting non-convex, non-smooth optimization problem reads as
$$\min_{x}\ \phi(x, \mathbf{y}^\star(x)) =: \phi_e(x) \quad \text{s.t.}\ \ x \in \mathcal{X}. \qquad (17)$$
Then, a local solution to (17) is obtained by relying on projected "gradient" descent. Whenever $\mathbf{y}^\star(x)$ is differentiable at $x$, one can obtain a gradient by applying the chain rule¹
$$\nabla \phi_e(x) = \nabla_1 \phi(x, \mathbf{y}^\star(x)) + J\mathbf{y}^\star(x)^\top \nabla_2 \phi(x, \mathbf{y}^\star(x)), \qquad (18)$$
where $J\mathbf{y}^\star(x)$ is the Jacobian of the solution mapping $\mathbf{y}^\star(\cdot)$ at $x$, commonly known as the sensitivity.
The proposed algorithm is summarized in Algorithm 1 and consists of three main steps. In step 1, the DC operator uses the current estimates of the equilibrium and sensitivity to compute a hypergradient (18), and runs a projected hypergradient descent step. Based on the new VCCs $x^{k+1}$, in step 2, the teams compute the resulting optimal allocation $\mathbf{y}^\star(x^{k+1})$ and, subsequently, in step 3, its sensitivity $J\mathbf{y}^\star(x^{k+1})$.
Algorithm 1: Customized BIG Hype
Initialize: $x^0 \in \mathcal{X}$, $\mathbf{y}^0 \in \mathbb{R}^{n_y}_{\geq 0}$, $s^0 = 0 \in \mathbb{R}^{n_y \times n_x}$, $k = 1$, and $\{\alpha^k\}_{k\in\mathbb{N}}$.
Repeat until convergence:
  1. DC operator's projected hypergradient step:
     $\nabla\phi^k_e \leftarrow \nabla_1\phi(x^k,\mathbf{y}^k) + (s^k)^\top \nabla_2\phi(x^k,\mathbf{y}^k)$
     $x^{k+1} \leftarrow P_{\mathcal{X}}\big[x^k - \alpha^k \nabla\phi^k_e\big]$
  2. Equilibrium seeking step:
     $\mathbf{y}^{k+1} \leftarrow \mathbf{y}^\star(x^{k+1})$
  3. Sensitivity computation step:
     $s^{k+1} \leftarrow J\mathbf{y}^\star(x^{k+1})$
Under the considered problem setup, the convergence of Algorithm 1 to critical points of (17) follows by Theorem 2 in [15]. We omit the details here for the sake of brevity. In the following subsections, we describe how to perform steps 2 and 3 of Algorithm 1 for the specific game setup in (7) in a computationally efficient manner.

¹If not differentiable, the standard Jacobians are replaced by elements of the conservative Jacobians. For detailed explanations and proofs, see [15].
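A compact Python rendering of the three steps of Algorithm 1 is sketched below; `solve_equilibrium` and `solve_sensitivity` stand in for the routines of the following two subsections, and the simplified projection, step size, and stopping rule are illustrative choices rather than the paper's exact implementation.

```python
import numpy as np

def big_hype(x0, x_max, solve_equilibrium, solve_sensitivity,
             grad1_phi, grad2_phi, alpha=1e-2, max_iter=200, tol=1e-6):
    """Projected hypergradient descent following the structure of Algorithm 1.

    solve_equilibrium(x)    -> y*(x), the lower-level v-GNE (Sec. III-B)
    solve_sensitivity(x, y) -> Jy*(x), the equilibrium sensitivity (Sec. III-C)
    grad1_phi / grad2_phi   -> partial gradients of phi w.r.t. x and y
    """
    x = np.asarray(x0, dtype=float)
    y = solve_equilibrium(x)
    s = solve_sensitivity(x, y)
    for _ in range(max_iter):
        hypergrad = grad1_phi(x, y) + s.T @ grad2_phi(x, y)   # chain rule (18)
        # Simplified projection: only the box constraints (10); a full implementation
        # would project onto the whole set X, including the budget constraint (9).
        x_new = np.clip(x - alpha * hypergrad, 0.0, x_max)    # step 1
        y = solve_equilibrium(x_new)                          # step 2
        s = solve_sensitivity(x_new, y)                       # step 3
        if np.linalg.norm(x_new - x) < tol:
            return x_new, y
        x = x_new
    return x, y
```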
B. Equilibrium Seeking Step
To efficiently find a solution of the allocation game (7), we consider the surrogate optimization problem
$$\min_{\mathbf{y}}\ \sum_{i\in\mathcal{I}} J^i(\mathbf{y}^i) \qquad (19a)$$
$$\text{s.t.}\ \ \mathbf{y} \in \mathcal{Y}(x), \qquad (19b)$$
whose minimizer corresponds to the v-GNE of (7). The individual objective functions (6) are decoupled, such that the sum of all teams' objectives constitutes the potential function [18, p. 243], allowing us to rewrite (7) as (19). A formal proof can also be obtained by comparing the Karush-Kuhn-Tucker (KKT) conditions [20, Th. 4.8], but is omitted here due to space limitations. The sum in (19a) can be written as
$$J(\mathbf{y}) = \sum_{i\in\mathcal{I}} J^i(\mathbf{y}^i) = q^\top \mathbf{y} + \tfrac{1}{2}\epsilon\, (\mathbf{y} - \hat{\mathbf{y}})^\top (\mathbf{y} - \hat{\mathbf{y}}), \qquad (20)$$
while the collection of constraints (19b) can be expressed as
$$\mathcal{Y}(x) = \{ \mathbf{y} \in \mathbb{R}^{n_y} \mid A\mathbf{y} = b,\ G\mathbf{y} \leq h + Hx \}. \qquad (21)$$
To improve the computational efficiency, we reduce the problem size by eliminating variables, as outlined in [21, Sec. 10.1.2]. We substitute $\mathbf{y} = F_T \tilde{\mathbf{y}} + \mathbf{y}'$ into $J(\mathbf{y})$, where the transformation matrix $F_T$ consists of basis vectors of the nullspace of $A$ and $\mathbf{y}'$ is any solution to $A\mathbf{y} = b$. By choosing $\hat{\mathbf{y}} = \mathbf{y}'$, the objective reduces to $J(\tilde{\mathbf{y}}) = (F_T^\top q)^\top \tilde{\mathbf{y}} + \tfrac{1}{2}\epsilon\, \tilde{\mathbf{y}}^\top F_T^\top F_T \tilde{\mathbf{y}}$. The new optimization variable $\tilde{\mathbf{y}}$ is reduced in dimension by the number of rows of $A$, i.e., the number of equality constraints. Finally, the optimization problem reads as follows:
$$\min_{\tilde{\mathbf{y}}}\ \ \| F_T \tilde{\mathbf{y}} + q' \|_2^2 \quad \text{s.t.}\ \ \tilde{G}\tilde{\mathbf{y}} \leq \tilde{h} + Hx, \qquad (22)$$
where $q'$, $\tilde{G} = G F_T$, $\tilde{h} = h - G\mathbf{y}'$, and $H$ depend on the problem data. Problem (22) is an inequality-constrained quadratic program (QP) with a convex objective that can be efficiently solved using off-the-shelf solvers.
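The variable elimination and the reduced QP (22) can be prototyped in a few lines: scipy's null_space provides the basis matrix $F_T$, a least-squares solve provides a particular solution $\mathbf{y}'$, and cvxpy solves the reduced problem. All dimensions and problem data below are random placeholders, and the reduced linear term is obtained by completing the square under the stated objective.

```python
import numpy as np
import cvxpy as cp
from scipy.linalg import null_space

rng = np.random.default_rng(0)
n_y, n_eq, n_in, n_x = 20, 5, 12, 6           # illustrative dimensions
eps = 1e-2                                    # quadratic penalty weight from (6)
A, b = rng.standard_normal((n_eq, n_y)), rng.standard_normal(n_eq)
G, h = rng.standard_normal((n_in, n_y)), rng.standard_normal(n_in) + 5.0
H = rng.standard_normal((n_in, n_x))
q = rng.standard_normal(n_y)                  # linear cost coefficients from (20)
x = rng.random(n_x)                           # current VCCs (placeholder)

F_T = null_space(A)                           # orthonormal basis of the nullspace of A
y_part = np.linalg.lstsq(A, b, rcond=None)[0] # particular solution y' of Ay = b
G_t, h_t = G @ F_T, h - G @ y_part            # reduced inequality data of (22)

y_tilde = cp.Variable(F_T.shape[1])
q_prime = q / eps                             # completing the square gives q' = q/eps (up to a constant)
prob = cp.Problem(cp.Minimize(cp.sum_squares(F_T @ y_tilde + q_prime)),
                  [G_t @ y_tilde <= h_t + H @ x])
prob.solve()
y_star = F_T @ y_tilde.value + y_part         # recover the full stacked allocation
```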
C. Sensitivity Computation
The equilibrium sensitivity $J\mathbf{y}^\star(x)$ corresponds to the sensitivity of the solutions of the QP (22) with respect to changes in $x$, which we compute using the approach in [22]. First, we find the total differentials of the KKT conditions of (22), more specifically of the stationarity condition and the complementarity slackness at the optimal point:
$$0 = \tilde{G}^\top \mathrm{d}\lambda + \mathrm{d}\tilde{\mathbf{y}}, \qquad (23)$$
$$0 = \mathrm{diag}(\mathrm{d}\lambda)\big(\tilde{G}\tilde{\mathbf{y}}^\star - \tilde{h} - Hx\big) + \mathrm{diag}(\lambda^\star)\big(\tilde{G}\,\mathrm{d}\tilde{\mathbf{y}} - H\,\mathrm{d}x\big), \qquad (24)$$
with the Lagrange multipliers $\lambda$ for the inequality constraints and the differentials $\mathrm{d}\lambda$, $\mathrm{d}\tilde{\mathbf{y}}$, and $\mathrm{d}x$. The columns of the sensitivity matrix $J\mathbf{y}^\star(x)$ are the solutions $\mathrm{d}\tilde{\mathbf{y}}$ obtained when replacing $\mathrm{d}x$ by the columns of the identity matrix. Mathematically, for each single constraint, the optimal solution $\tilde{\mathbf{y}}^\star$ and the differential $\mathrm{d}\lambda$ do not influence the relation between $\mathrm{d}\tilde{\mathbf{y}}$ and $\mathrm{d}x$, because either $\tilde{G}\tilde{\mathbf{y}}^\star - \tilde{h} - Hx = 0$ (active constraint), or $\lambda^\star = 0$ (inactive constraint), or both are zero. Intuitively, since the DC operator's decisions only influence the lower level through the constraints, the sensitivity does not depend on the teams' objectives. Therefore, we only consider the rows of $\tilde{G}$ and $H$ which correspond to the active inequality constraints and solve the following system of linear equations:
$$\tilde{G}_{k,\cdot}\, \mathrm{d}\tilde{\mathbf{y}} - H_{k,\cdot}\, \mathrm{d}x = 0, \quad \forall k\ \text{with}\ \lambda^{\star,k} > 0. \qquad (25)$$
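The active-set system (25) amounts to one small linear solve for all columns of the identity at once. The sketch below shows one hedged way to do this with a least-squares solve, reusing the reduced data $\tilde{G}$, $H$, the multipliers $\lambda^\star$, and the nullspace basis $F_T$ from the example above; the activity tolerance and the mapping back to the full variable are assumptions of this illustration.

```python
import numpy as np

def equilibrium_sensitivity(G_t, H, lam, F_T, tol=1e-8):
    """Sensitivity of the QP solution w.r.t. x from the active constraints, following (25).

    G_t : (m, n) reduced inequality matrix G tilde
    H   : (m, n_x) matrix coupling the VCCs to the constraints
    lam : (m,) optimal Lagrange multipliers of the QP (22)
    F_T : (n_y, n) nullspace basis used for the variable elimination
    """
    active = lam > tol                                    # constraints with lambda*,k > 0
    G_a, H_a = G_t[active], H[active]
    # Solve G_a d_y_tilde = H_a dx for dx equal to each column of the identity,
    # i.e., for all columns simultaneously in the least-squares sense.
    d_y_tilde = np.linalg.lstsq(G_a, H_a, rcond=None)[0]  # (n, n_x)
    return F_T @ d_y_tilde                                # map back to the full variable y
```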
IV. SIMULATION RESULTS
A. Simulation Setup
We consider a network of 12 DCs distributed over 4 continents, whose locations we source from Google's fleet of DCs [23]. We use real carbon intensity data for those locations, obtained from Electricity Maps [24], covering 24 hours with a sampling interval of 5 hours, starting on February 22nd, 2023, at 10 o'clock. The maximum computational capacities $x^{\max}_{d,t}$ of the DCs vary due to inflexible loads that fluctuate in a sinusoidal shape, similar to [25]. We set the parameter for the quadratic penalty in the teams' objective to a small value $\epsilon = 2\times10^{-8}$, which means that teams virtually have no preferred allocation but only care about the processing time and migration cost. The p-norm parameter for the peak-shaving cost in (14) is set to $p = 6$, a value approximating the infinity norm while retaining numerical stability. To improve the conditioning of the DC operator's cost function (12), we include the uniform term $\tfrac{1}{2}\tfrac{1}{n_x}\mathbf{1}^\top x$, which ensures that $\nabla_1 \phi(x, \mathbf{y}^\star(x))$ is nonzero. Moreover, to ensure the feasibility of the parametrized allocation game (7) at each step of Algorithm 1, we impose the following: in the first time step, the VCCs at each DC shall not exceed the cumulative predicted load of all teams that have their data uploaded at this DC, i.e., $x_{d,1} \leq \sum_{\{i\in\mathcal{I} \,\mid\, d=d_i\}} v^i$, for all $d \in \mathcal{D}$.
B. The Impact of Temporal Shifting & Spatial Migration
To analyze the impact of temporal shifting, we compare our allocation mechanism with a naïve algorithm that allocates the compute jobs according to their priority, always using the full available capacity $x^{\max}_{d,t}$. We consider three load scenarios, each characterized by different compute job volumes: (a) large compute jobs that typically require more than one time step at a DC location; (b) multiple small compute jobs that can be processed during a single time step; and (c) a mixture of large and small compute jobs. As shown in Fig. 3, temporally shifting compute loads significantly reduces carbon emissions, depending on the load scenario.

To investigate the impact of spatial migration on carbon emissions, we modify the migration price $\xi$ in the DC operator's objective function. A very large $\xi$ corresponds to disabling the migration of compute jobs. As shown in Fig. 4, increasing $\xi$ increases carbon emissions. This demonstrates the potential carbon emission savings that come with enabling spatial migration.
Fig. 3. Carbon emission savings normalized by compute volume due to time-shifting: bilevel game vs. naïve approach (full capacity utilization) in three scenarios (large, small, and mixed $v^i$).

Fig. 4. Carbon emissions, normalized by compute volume, vs. migration price $\xi$, for three scenarios. The growth demonstrates a positive impact of spatial migration.
C. Co-design vs. Sequential Optimization
In this case study, we compare our bilevel approach for co-designing VCCs and allocations with a sequential optimization approach, similar to the scheduler-agnostic scheme used by Google [14]. The latter consists of two sequential steps: in the first step, optimal VCCs are computed solely from a forecast of the compute load and the carbon intensity; in the second step, the teams find the optimal allocation given the previously computed VCCs. In this sequential approach, the DC operator must optimize over the worst-case allocation scenarios, namely the VCCs, rather than the actual allocations. The sequential approach does not take the migration of compute loads into account. Thus, the resulting cost function of the DC operator reads as
$$\hat{\phi}(x) = \sum_{d\in\mathcal{D}} \sum_{t\in\mathcal{T}} \rho^{\mathrm{carb}}_{d,t}\, x_{d,t} + \Big( \sum_{d\in\mathcal{D}} \sum_{t\in\mathcal{T}} (x_{d,t})^{p} \Big)^{1/p}, \qquad (26)$$
where the actual loads $\sum_{i\in\mathcal{I}} y^i_{d,t}$ have been replaced by the VCCs $x_{d,t}$. Overall, this sequential optimization scheme reads as
$$\text{Step 1:}\quad \bar{x} = \arg\min_{x}\ \hat{\phi}(x), \qquad (27)$$
$$\text{Step 2:}\quad \mathbf{y} = \mathbf{y}^\star(\bar{x}). \qquad (28)$$
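The first step of this sequential baseline is itself a small convex program when $p$ is even; a cvxpy sketch with placeholder carbon intensities, capacities, and demand is shown below (the second step then reuses the equilibrium solver of Section III-B).

```python
import numpy as np
import cvxpy as cp

D, T, p = 3, 4, 6
rho = np.random.default_rng(1).random((D, T))    # carbon intensities (placeholder)
x_max = np.full((D, T), 8.0)                     # physical capacities (placeholder)
total_demand = 20.0                              # sum_i v^i (placeholder)

x = cp.Variable((D, T), nonneg=True)
phi_hat = cp.sum(cp.multiply(rho, x)) + cp.pnorm(x, p)   # objective (26)
prob = cp.Problem(cp.Minimize(phi_hat),
                  [x <= x_max, cp.sum(x) >= total_demand])   # VCC constraints (10), (9)
prob.solve()
x_bar = x.value                                  # VCCs handed to the teams in step 2
```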
Further, we introduce the following two metrics to evaluate the performance of the allocations:
1) allocation fairness: $\psi\big(\{\tau_{\mathrm{time},i}(\mathbf{y}^i)/v^i\}_{i\in\mathcal{I}}\big)$,
2) total waiting time: $\sum_{i\in\mathcal{I}} \tau_{\mathrm{time},i}(\mathbf{y}^i)$,
where $\psi(\cdot)$ is the empirical standard deviation and $\tau_{\mathrm{time},i}(\mathbf{y}^i)$ denotes the time cost, i.e., the first linear term in (6). The fairness criterion represents the heterogeneity of the waiting times among the teams. In contrast, the second criterion is the total waiting time for all jobs, weighted by their time priority parameter $\tau^i$.
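For completeness, both metrics follow directly from the per-team time costs; the sketch below assumes these costs and the compute volumes are available as arrays.

```python
import numpy as np

def allocation_metrics(time_costs, volumes):
    """Fairness and total waiting time metrics.

    time_costs : (I,) array with tau_time,i(y^i), the first linear term of (6)
    volumes    : (I,) array with the compute volumes v^i
    """
    fairness = np.std(time_costs / volumes)   # empirical std of normalized waiting times
    total_wait = np.sum(time_costs)           # priority-weighted total waiting time
    return fairness, total_wait
```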
In Fig. 5, we show the outcome of simulations with varying migration prices to showcase the differences between the bilevel game (16) and the sequential approach (27)–(28). Generally, fairness and waiting time are improved when co-designing VCCs and the compute job schedule. The difference becomes even more apparent with higher values of $\xi$, as this directly addresses the shortcoming of not featuring migration in the sequential approach.
Fig. 5. Allocation fairness and total waiting time in a direct comparison between our approach (BL) and sequential optimization (SQ), shown as the relative difference (SQ−BL)/BL in % over the migration price $\xi$. Sequential optimization scores higher (worse) values in both metrics, and the difference grows with increasing migration price.

Fig. 6. Carbon emissions and peak price in a direct comparison between our approach (BL) and sequential optimization (SQ), shown as the relative difference (SQ−BL)/BL in % over the migration price $\xi$. SQ scores lower (better) values in both for high migration prices in all three scenarios.
However, as demonstrated by the results in Fig. 6, the sequential optimization approach scores lower carbon emissions and better peak-shaving performance for large migration prices. In summary, if we enable spatial migration, the co-design of VCCs and allocations allows us to perform similarly to the sequential approach regarding decarbonization and peak shaving. Yet, co-design can lead to reduced and fairer waiting times for the teams. This may incentivize users to participate in the proposed coordination mechanism.
V. CONCLUSIONS
We modeled the problem of co-designing the allocation of flexible batch compute jobs and the virtual capacity curves of DCs as a bilevel game. This formulation promotes the supervisory control objectives of the DC operator and models each team as a player competing for the computational resources in the DCs. A local solution of the resulting game is found by deploying the recently developed algorithm [15], with some ad-hoc modifications that improve efficiency. Simulation results show that allowing for spatial migration and temporal shifting of compute jobs has a high potential to reduce carbon emissions in the operation of DCs. Moreover, compared with standard sequential optimization, our hierarchical approach can reduce the total waiting time of compute jobs and improve the fairness of their allocation.
REFERENCES
[1] International Energy Agency, "Global trends in internet traffic, data centres workloads and data centre energy use, 2010-2020," IEA, Paris, 2020. [Online]. Available: https://t.ly/R5Gy
[2] Synergy Research Group, "Number of hyperscale data centers worldwide from 2015 to 2021," 2021.
[3] International Energy Agency, "Data centres and data transmission networks," 2023. [Online]. Available: https://t.ly/bmyWg
[4] A. Khosravi, L. L. H. Andrew, and R. Buyya, "Dynamic VM placement method for minimizing energy and carbon cost in geographically distributed cloud data centers," IEEE Transactions on Sustainable Computing, vol. 2, no. 2, pp. 183–196, 2017.
[5] Z. Liu, A. Wierman, Y. Chen, B. Razon, and N. Chen, "Data center demand response: Avoiding the coincident peak via workload shifting and local generation," Performance Evaluation, vol. 70, no. 10, pp. 770–791, 2013.
[6] R. Rahmani, I. Moser, and A. L. Cricenti, "Inter-continental data centre power load balancing for renewable energy maximisation," Electronics, vol. 11, no. 10, p. 1564, 2022.
[7] M. Xu and R. Buyya, "Managing renewable energy and carbon footprint in multi-cloud computing environments," Journal of Parallel and Distributed Computing, vol. 135, pp. 191–202, 2020.
[8] N. Buchbinder, N. Jain, and I. Menache, "Online job-migration for reducing the electricity bill in the cloud," in Networking, 2011, Conference Proceedings.
[9] J. Shuja, A. Gani, S. Shamshirband, R. W. Ahmad, and K. Bilal, "Sustainable cloud data centers: A survey of enabling techniques and technologies," Renewable and Sustainable Energy Reviews, vol. 62, pp. 195–214, 2016.
[10] M. Dabbagh, B. Hamdaoui, A. Rayes, and M. Guizani, "Shaving data center power demand peaks through energy storage and workload shifting control," IEEE Transactions on Cloud Computing, vol. 7, no. 4, pp. 1095–1108, 2019.
[11] M. T. Takcı, T. Gözel, and M. H. Hocaoğlu, "Quantitative evaluation of data centers' participation in demand side management," IEEE Access, vol. 9, pp. 14883–14896, 2021.
[12] A. Radovanovic, B. Chen, S. Talukdar, B. Roy, A. Duarte, and M. Shahbazi, "Power modeling for effective datacenter planning and compute management," IEEE Transactions on Smart Grid, vol. 13, no. 2, pp. 1611–1621, 2022.
[13] J. Doyle, R. Shorten, and D. O'Mahony, "Stratus: Load balancing the cloud for carbon emissions control," IEEE Transactions on Cloud Computing, vol. 1, no. 1, pp. 1–1, 2013.
[14] A. Radovanovic, R. Koningstein, I. Schneider, B. Chen, A. Duarte, B. Roy, D. Xiao, M. Haridasan, P. Hung, N. Care, S. Talukdar, E. Mullen, K. Smith, M. Cottman, and W. Cirne, "Carbon-aware computing for datacenters," IEEE Transactions on Power Systems, vol. 38, no. 2, pp. 1270–1280, 2023. [Online]. Available: https://ieeexplore.ieee.org/document/9770383/
[15] P. D. Grontas, G. Belgioioso, C. Cenedese, M. Fochesato, J. Lygeros, and F. Dörfler, "BIG Hype: Best intervention in games via distributed hypergradient descent," arXiv preprint arXiv:2303.01101, 2023.
[16] M. Maljkovic, G. Nilsson, and N. Geroliminis, "On finding the leader's strategy in quadratic aggregative Stackelberg pricing games," in 2023 European Control Conference (ECC), 2023, pp. 1–6.
[17] G. Belgioioso, P. Yi, S. Grammatico, and L. Pavel, "Distributed generalized Nash equilibrium seeking: An operator-theoretic perspective," IEEE Control Systems, vol. 42, no. 4, pp. 87–102, 2022.
[18] F. Facchinei, V. Piccialli, and M. Sciandrone, "Decomposition algorithms for generalized potential games," Computational Optimization and Applications, vol. 50, no. 2, pp. 237–262, 2011.
[19] N. H. Tran, D. H. Tran, S. Ren, Z. Han, E. N. Huh, and C. S. Hong, "How geo-distributed data centers do demand response: A game-theoretic approach," IEEE Transactions on Smart Grid, vol. 7, no. 2, pp. 937–947, 2016.
[20] F. Facchinei and C. Kanzow, "Generalized Nash equilibrium problems," Annals of Operations Research, vol. 175, no. 1, pp. 177–211, 2010.
[21] S. P. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[22] B. Amos and J. Z. Kolter, "OptNet: Differentiable optimization as a layer in neural networks," in International Conference on Machine Learning. PMLR, 2017, Conference Proceedings, pp. 136–145.
[23] Google, "Data center locations," 2023. [Online]. Available: https://www.google.com/about/datacenters/locations/
[24] Electricity Maps, "Organising the world's electricity data," 2023. [Online]. Available: https://www.electricitymaps.com
[25] N. Hogade, S. Pasricha, and H. J. Siegel, "Energy and network aware workload management for geographically distributed data centers," IEEE Transactions on Sustainable Computing, vol. 7, no. 2, pp. 400–413, 2022.