Comparison of Scheduling Heuristics for Grid Resource Broker
Sergey Zhuk
MIPT, Moscow
zh sergey@mail.ru
Andrey Chernykh
CICESE Research Center,
Ensenada, Mexico
chernykh@cicese.mx
Arutyun Avetisyan, Sergey Gaissaryan,
Dmitry Grushin
ISP RAS, Moscow
{arut, ssg, grushin}@ispras.ru
Nikolai Kuzjurin, Alexey Pospelov,
Alexander Shokurov
ISP RAS, Moscow
{nnkuz, pospelov, shok}@ispras.ru
Abstract
In this paper, we consider parallel task scheduling problems for hierarchical decentralized systems that consist of geographically dispersed homogeneous computational resources such as clusters, PCs, and supercomputers. We concentrate on two-level hierarchical scheduling: at the first level, a broker allocates computational tasks to the resources; at the second level, each resource schedules the tasks assigned to it using heuristics based, for instance, on strip-packing algorithms. The allocation strategies and the efficiency of the proposed hierarchical scheduling algorithms are discussed.
1. Introduction
Computational Grids [9] are emerging as a new paradigm for solving large-scale problems in science, engineering, and commerce. They enable the sharing and aggregation of millions of resources, geographically distributed across organizations and administrative domains [10]. They comprise heterogeneous resources (PCs, workstations, clusters, and supercomputers), policies, and applications (scientific, engineering, and commercial) with varied requirements. The resources are owned by different organizations with their own management policies, usage and cost models. Moreover, the availability of resources and their load vary dynamically in time. The service producers (resource owners) and service consumers (resource users) have different goals, objectives, strategies, and supply-and-demand patterns. Therefore, a very important problem is to have an efficient resource management system intended to optimize relations between producers and consumers according to some adequate resource management strategy.
Most systems for Grid resource management and scheduling (such as Legion, Condor, AppLeS PST, NetSolve, PUNCH, XtremWeb, etc.) adopt a conventional strategy, where a scheduler selects the jobs to be executed on a particular resource based on objective cost functions driven by system-centric parameters. These schedulers aim to enhance system throughput and utilization and to complete execution at the earliest possible time, rather than to improve application processing. The two key players, resource providers and resource consumers, have their own expectations and strategies. Resource providers use strategies of good investment and try to maximize resource utilization, while resource consumers adopt strategies for solving their problems within a required time frame and budget. It should also be taken into account that a user is in competition with other users, and a resource owner is in competition with other resource owners.
The resource consumer interacts with brokers to express requirements such as the budget for solving a given problem, its deadline, and a possible trade-off between these two requirements. The consumer can choose the provider that best meets these requirements. The resource providers also need tools for expressing their pricing policies, and mechanisms that help them to maximize profit and resource utilization.
Rajkumar Buyya [6] proposed an economy-based approach in which scheduling decisions are made dynamically at runtime and are driven by the end users' requirements. This approach is implemented in the GRACE system. Based on a specific price metric, management systems dynamically bargain for available resources and schedule computations on these resources to meet user requirements.
Computations, as objects with their own goals, resources, and actions, can be viewed as economic objects. With the proliferation of networks, as high-end computing systems have moved from centralized models of control and action toward decentralized ones, applying economically driven market mechanisms to resource management is a natural step.
The resource broker consists of the following components:
• Job Control Agent: a persistent control engine responsible for shepherding a job through the system. It coordinates with the Schedule Adviser for schedule generation, and handles the actual creation of jobs, maintenance of job status, and interaction with clients/users, the Schedule Adviser, and the dispatcher.
• Schedule Adviser: responsible for resource discovery (using the Grid Explorer), resource selection, and job assignment (schedule generation) to ensure that the user requirements are met.
• Grid Explorer: responsible for resource discovery by interacting with the Grid information server, identifying the list of authorized machines, and keeping track of resource status information.
• Deployment Agent: responsible for activating task execution on the selected resource as per the scheduler's instruction; it periodically reports the status of task execution to the Job Control Agent.
The Grid resource broker acts as a mediator between the user and Grid resources using middleware services. It is responsible for resource discovery, resource selection, binding of software, data, and hardware resources, initiating computations, adapting to the changes in Grid resources, and presenting the Grid to the user as a single, unified resource.
In this paper, we address the problem of scheduling parallel jobs in such systems. We consider a two-level hierarchy: at the first level, the broker allocates computational tasks to clusters according to some criteria. At the second level, each cluster schedules the parallel tasks assigned to it using its own local scheduler. Such local schedulers can use heuristics, in particular ones based on strip-packing algorithms. Here we do not consider implementation issues or issues related to resource discovery and selection. The main objective is to compare different scheduling strategies and estimate their efficiency.
We restrict our analysis to scheduling systems where all the tasks are given at time 0 and are processed in the same batch. That is, the set of available ready tasks is executed until the last one completes. All tasks that arrive in the system during this time will be processed in the next batch. The relation between this scheme and the scheme where tasks arrive over time (at their release times, according to precedence constraints, or released by different users) is known and has been studied for different scheduling strategies in general and restricted cases [21]. By the basic results of [21], a strategy that allows release times has a performance guarantee within a factor of 2 of that of the batch-style algorithms discussed in this paper.
2. Scheduling strategies
We consider the following simple model. Suppose we have $n$ tasks and $m$ clusters with identical processors. We assume that all tasks are independent, i.e., there is no communication between tasks. Each task is described by a pair: its size $p_j$ (the number of requested processors) and its execution time $t_j$ on $p_j$ processors. Let $w_i$ be the number of processors of the $i$th cluster. Each task can be run on some cluster, i.e., the maximum task size does not exceed the maximum number of processors in a cluster ($\max_j p_j \le \max_i w_i$). Our objective is to minimize the total completion time.
Scheduling multiprocessor tasks with unit processing times is a strongly NP-hard problem [1], and can be considered as a strip-packing problem where a set of rectangles (size-time task rectangles) must be packed into a strip of limited width and unbounded height. One of the known heuristics is Bottom-Left (BL): put a rectangle as low as possible, and then move it to the left. It is known that for some problems BL cannot find an optimal packing [4, 19], nor does it perform well in practice when applied to a random ordering (which is the case for Heuristic 1 described below). However, a very successful approach is to apply BL to the tasks ordered by decreasing width (the case of Heuristics 2-4 described below); this is referred to as Bottom-Left Decreasing (BLD). BLD has been shown to be a 3-approximation [4]. Results on the asymptotic performance ratio of different heuristics for this problem are presented in [3, 4, 8, 14, 15, 16, 18]. In particular, in [8] it was shown that algorithms where the rectangles are placed on "shelves" using one-dimensional bin-packing heuristics have an asymptotic performance ratio of 2.7 when the rectangles are sorted by decreasing height (First-Fit shelf algorithms). The asymptotic performance ratio of the best heuristic was further reduced to 2.5 in [18], then to 4/3 (see [15]), and finally to 5/4 [3]. Recent improvements are presented in [14, 15].
Scheduling tasks on a set of clusters can be viewed as a problem of packing tasks into several strips of different widths. We consider the following scenario. In the first stage, the broker analyzes a task request and broadcasts the task parameters to all clusters. Each cluster returns an estimate of the corresponding completion time. The broker assigns the task to the cluster with the minimum estimated completion time. In the second stage, clusters allocate tasks to processors using the BL heuristic. In this paper, we consider different algorithms for these two stages and estimate the quality of such hierarchical scheduling.
We now present different scheduling strategies in which the allocation of tasks to processors (by a cluster's local scheduler) is performed by the BL or BLD heuristic.
Heuristic 1: MCT+BL
In the first stage, the broker sends the task description to all clusters. Each cluster returns an estimate of the completion time. The broker allocates the task to the cluster that provides the minimum completion time (MCT). In the second stage, the task is executed on the cluster using the BL heuristic. Allocation and execution are performed in the order in which the tasks arrive.
Heuristic 2: MCT-SORT+BL
In the first stage, the broker sorts all available tasks in decreasing order of size (width), and then proceeds as in MCT+BL.
Heuristic 3: MCT+BLD
In the first stage, the broker proceeds as in MCT+BL; the clusters then use the BLD strategy to execute the tasks.
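The broker-side logic of MCT+BL and MCT-SORT+BL can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the names `Cluster`, `best_slot`, `estimate`, `place`, and `mct_schedule` are hypothetical, tasks are (width, height) pairs placed on contiguous processors, and the local scheduler is a simplified skyline variant of BL that never places a task below one that is already scheduled. The estimate returned to the broker is assumed to be the cluster's completion time after the task is added.

```python
class Cluster:
    """One strip of `width` processors; skyline[p] is when processor p becomes free."""

    def __init__(self, width):
        self.width = width
        self.skyline = [0] * width

    def best_slot(self, w):
        """Lowest, then leftmost, start for a task on w contiguous processors."""
        best = None
        for left in range(self.width - w + 1):
            start = max(self.skyline[left:left + w])
            if best is None or start < best[0]:
                best = (start, left)
        return best  # None if the task is wider than the cluster

    def estimate(self, w, h):
        """Completion time of this cluster if the task were added (assumed estimate)."""
        start, _ = self.best_slot(w)
        return max(max(self.skyline), start + h)

    def place(self, w, h):
        """Commit the task to its lowest, leftmost slot."""
        start, left = self.best_slot(w)
        for p in range(left, left + w):
            self.skyline[p] = start + h


def mct_schedule(tasks, cluster_widths, sort_first=False):
    """MCT+BL (sort_first=False) or MCT-SORT+BL (sort_first=True); returns the makespan.

    Assumes every task fits on at least one cluster.
    """
    clusters = [Cluster(w) for w in cluster_widths]
    order = sorted(tasks, key=lambda t: -t[0]) if sort_first else list(tasks)
    for w, h in order:
        fits = [c for c in clusters if w <= c.width]
        target = min(fits, key=lambda c: c.estimate(w, h))  # minimum completion time
        target.place(w, h)
    return max(max(c.skyline) for c in clusters)


# One cluster of width 4; wide short tasks alternate with narrow long ones.
tasks = [(4, 1), (1, 10)] * 4
print(mct_schedule(tasks, [4]))                   # arrival order: 44
print(mct_schedule(tasks, [4], sort_first=True))  # widest first: 14
```

In this sketch, MCT+BLD would differ only in the local step: each cluster would re-pack its assigned tasks in decreasing order of width instead of appending them in arrival order.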
3. Comparison of scheduling strategies
We use the following notation.
$T = \{T_1, \dots, T_n\}$ is the set of tasks,
$h(T_j)$ is the execution time of task $T_j$,
$w(T_j)$ is the size of task $T_j$ (the number of processors it requires),
$C = \{C_1, \dots, C_m\}$ is the set of clusters,
$w_i$ is the size (width) of the $i$th cluster.
We will assume that the clusters are sorted by their widths, i.e.
$$w_1 \le w_2 \le \dots \le w_m.$$
Let $H_O$ be the minimum completion time of a given list of tasks (the optimum over all possible schedules). Let $H_i$ be the completion time obtained by the $i$th heuristic ($i = 1, 2, 3$) described above. It can be shown that the ratio $H_1/H_O$ (the approximation ratio) cannot be bounded by a constant. To show this, it is sufficient to consider a single cluster of width $w$ and the list of tasks $\{T_1, T_2, T_1, T_2, \dots\}$, where $w(T_1) = w$, $h(T_1) = \varepsilon$, $w(T_2) = 1$, $h(T_2) = H$, and $H$ is sufficiently large.
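A small calculation makes this instance concrete. The sketch below assumes that tasks executed in arrival order cannot be placed below an already scheduled task, so each wide task waits for the preceding narrow one; the function names and the parameter `k` (the number of $T_1, T_2$ pairs) are hypothetical. The second function is only an upper bound on $H_O$, obtained by running the wide tasks back to back and the narrow tasks $w$ at a time.

```python
import math


def arrival_order_height(k, H, eps):
    """Completion time of k (wide, narrow) pairs when each wide task waits for the narrow one."""
    return k * (H + eps)


def batched_height(k, w, H, eps):
    """Upper bound on the optimum: k wide tasks first, then narrow tasks w per layer."""
    return k * eps + math.ceil(k / w) * H


def ratio_lower_bound(k, w, H, eps):
    """Lower bound on H_1 / H_O for this instance."""
    return arrival_order_height(k, H, eps) / batched_height(k, w, H, eps)


# With k = w pairs the bound approaches w as H grows relative to eps.
for w in (4, 16, 64):
    print(w, ratio_lower_bound(k=w, w=w, H=1e6, eps=1.0))
```

Since the bound grows with the cluster width $w$, no constant independent of the instance can bound the ratio.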
For Heuristic 2, a constant worst-case approximation ratio has been proved.
Theorem 1 [19]. For any list of tasks and any set of clusters the following inequality holds:
$$H_2 / H_O \le 3.$$
A constant worst-case approximation ratio, as guaranteed by Heuristic 2, is very desirable. A drawback of Heuristic 2 is that the broker has to perform a preprocessing step (sorting the tasks), which looks slightly unrealistic since a broker can manage many clusters. It is more reasonable to move this work to the clusters' side and use Heuristic 3, so that the broker handles tasks on-line. An interesting question is whether an analogue of Theorem 1 can be proved for the ratio $H_3/H_O$. In some cases (when all task widths are small enough) it can be shown that the ratio is bounded by a constant. For example, the following theorem can be proved.
Theorem 2. For any list of tasks and any set of clusters such that $w(T_j) \le w_1$ for all $j$, the following inequality holds:
$$H_3 / H_O \le 4.$$
In the general case, an analogue of Theorem 1 cannot be proved, as the following example shows.
Example. Let the clusters and the tasks be divided into groups according to their widths. Let there be $n + 1$ groups of width $W_i$, $i = 0, 1, \dots, n$. We use the following notation:
• $n + 1$ is the number of groups of machines (tasks),
• $M_i$, $i = 0, \dots, n$: the number of machines in group $i$,
• $N_i$, $i = 0, \dots, n$: the number of tasks in group $i$,
• $W_i$, $i = 0, \dots, n$: the width of machines (tasks) in group $i$,
• $h_i$, $i = 0, \dots, n$: the height of tasks in group $i$,
• $H_O$: the height of an optimal packing,
• $H_3$: the height of the packing obtained by Heuristic 3.
The instance below shows that the ratio $H_3/H_O$ may be arbitrarily large. Let
$$M_i = 2^i, \quad N_i = (i + 1) \cdot 2^i, \quad W_i = 2^{n-i}, \quad h_i = \frac{1}{i + 1}, \quad i = 0, \dots, n.$$
Let the tasks arrive in order of increasing width (that is, from group $n$ down to group $0$). Since
$$\frac{h_i N_i W_i}{M_i W_i} = \frac{\frac{1}{i+1} (i + 1) 2^i}{2^i} = 1, \quad i = 0, \dots, n,$$
obviously
$$H_O = 1.$$
But
$$N_i W_i = 2^{n-i} (i + 1) \cdot 2^i = (i + 1) \cdot 2^n = \sum_{k=0}^{i} M_k W_k = \sum_{k=0}^{i} 2^k 2^{n-k}.$$
Hence, each group of tasks may completely fill one layer on all clusters that are wide enough for it, and
$$H_3 = \sum_{k=0}^{n} \frac{1}{k + 1} \sim \ln n,$$
which means that the ratio $H_3/H_O$ may be arbitrarily large.
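The arithmetic of this instance can be checked mechanically. The sketch below only verifies the identities used in the example with exact fractions and evaluates the resulting sum; it does not simulate Heuristic 3. The function names `instance` and `check` are hypothetical.

```python
from fractions import Fraction


def instance(n):
    """Groups i = 0..n with M_i machines, N_i tasks, width W_i, and task height h_i."""
    return [
        {"M": 2 ** i, "N": (i + 1) * 2 ** i, "W": 2 ** (n - i), "h": Fraction(1, i + 1)}
        for i in range(n + 1)
    ]


def check(n):
    """Verify the identities of the example and return the sum of the layer heights."""
    groups = instance(n)
    for i, g in enumerate(groups):
        # Task area of group i equals the area of group i's machines at height 1.
        assert g["h"] * g["N"] * g["W"] == g["M"] * g["W"]
        # One layer of group-i tasks spans all clusters of groups 0..i.
        assert g["N"] * g["W"] == sum(k["M"] * k["W"] for k in groups[: i + 1])
    return sum(g["h"] for g in groups)


for n in (1, 3, 10, 100):
    print(n, float(check(n)))
```

The returned value is the harmonic sum $\sum_{k=0}^{n} 1/(k+1)$, which grows without bound as $n$ increases while $H_O$ stays at 1.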
Heuristic 3 can be modified to achieve a constant approximation ratio. Let us consider the following modification. We use the following notation.
$first(T_j)$ is the minimum $i$ such that $w_i \ge w(T_j)$,
$last(T_j)$ is the minimum $r$ such that
$$\sum_{i=first(T_j)}^{r} w_i \ge \frac{1}{2} \sum_{i=first(T_j)}^{m} w_i,$$
$H_4$ is the completion time obtained by Heuristic 4,
$BL_i(W)$ is the packing of the set of tasks $W$ on the $i$th cluster obtained by the BLD algorithm (in inequalities, $BL_i(W)$ denotes the height of this packing),
$S(W)$ is the total area of all tasks from the set $W$.
The clusters $first(T_j), first(T_j) + 1, \dots, last(T_j)$ are called admissible for the task $T_j$.
Heuristic 4. In the first stage, the broker functions as in Heuristic 3 with only one difference: it sends a task description only to the clusters admissible for this task. Each of these clusters uses the BLD algorithm, and the broker chooses the cluster that gives the minimum completion time.
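The admissible range can be computed directly from the definitions of $first(T_j)$ and $last(T_j)$. The following is a minimal sketch, assuming the cluster widths are given as a list sorted in nondecreasing order; the function name `admissible_range` is hypothetical, and the returned indices are 1-based to match the notation of the text.

```python
from bisect import bisect_left


def admissible_range(cluster_widths, task_width):
    """Return (first, last), 1-based and inclusive, for a task of the given width.

    cluster_widths must be sorted in nondecreasing order.
    """
    m = len(cluster_widths)
    first = bisect_left(cluster_widths, task_width)  # 0-based: first w_i >= w(T_j)
    if first == m:
        raise ValueError("task is wider than every cluster")
    total = sum(cluster_widths[first:])
    running = 0
    for r in range(first, m):
        running += cluster_widths[r]
        if 2 * running >= total:  # at least half of the total width from `first` on
            return first + 1, r + 1


print(admissible_range([2, 2, 2, 2, 4], 2))  # (1, 3)
print(admissible_range([2, 2, 2, 2, 4], 3))  # (5, 5)
```

The loop always returns, because at `r = m - 1` the running sum equals the total. In the first call the task is offered only to clusters 1 to 3, which together hold half of the total width 12.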
Theorem 3. For any list of tasks and any set of clusters
$$\frac{H_4}{H_O} \le 10.$$
Proof. Let the maximum completion time be attained on the $k$th cluster, let $T_a$ be the task that this cluster received last from the broker, and let
$$f = first(T_a), \quad l = last(T_a).$$
Let $W_f, \dots, W_l$ be the sets of tasks that were allocated on the clusters $f, \dots, l$ (admissible for $T_a$) just before the task $T_a$ arrived. Because the task was sent to the $k$th cluster,
$$BL_i(W_i \cup \{T_a\}) \ge H_4 \quad \forall i = f, \dots, l.$$
In $BL_i(W_i \cup \{T_a\})$, let the task with maximum completion time be $T_{c_i}$. Let $h_i = h(T_{c_i})$, let $t_i$ be the time when this task starts execution, and let $r_i = t_i - h(T_a)$. We have
$$t_i + h_i \ge H_4 \quad \forall i = f, \dots, l,$$
$$r_i + h(T_a) + h_i \ge H_4 \quad \forall i = f, \dots, l.$$
Multiplying the last inequality by $w_i$ and summing over $i = f, \dots, l$ gives
$$\sum_{i=f}^{l} w_i r_i + h(T_a) \cdot \sum_{i=f}^{l} w_i + \sum_{i=f}^{l} w_i h_i \ge H_4 \cdot \sum_{i=f}^{l} w_i. \quad (1)$$
Before the time $t_i$ the $i$th cluster is at least half full (a property of the BL algorithm). Hence,
$$S(W_i) \ge \frac{1}{2} \cdot w_i \cdot r_i.$$
Let $T_b$ be the task that requires the minimal number of processors among the tasks allocated on clusters $f, \dots, l$, and let $f_0 = first(T_b)$. Then none of the tasks allocated on clusters $f, \dots, l$ can be allocated on clusters $C_i$ with $i < f_0$. Since $T_b$ is allocated on one of the clusters $f, \dots, l$,
$$last(T_b) \ge f \implies \sum_{i=f_0}^{f-1} w_i \le \frac{1}{2} \sum_{i=f_0}^{m} w_i \implies \sum_{i=f}^{m} w_i \ge \frac{1}{2} \sum_{i=f_0}^{m} w_i.$$
Since $l = last(T_a)$,
$$\sum_{i=f}^{l} w_i \ge \frac{1}{2} \sum_{i=f}^{m} w_i,$$
and
$$\sum_{i=f_0}^{m} w_i \le 2 \sum_{i=f}^{m} w_i \le 4 \sum_{i=f}^{l} w_i. \quad (2)$$
Thus,
$$H_O \cdot \sum_{i=f_0}^{m} w_i \ge S\Big(\bigcup_{i=f}^{l} W_i\Big) \ge \frac{1}{2} \sum_{i=f}^{l} w_i r_i,$$
and by (2)
$$\sum_{i=f}^{l} w_i r_i \le 2 \cdot H_O \cdot \sum_{i=f_0}^{m} w_i \le 8 \cdot H_O \cdot \sum_{i=f}^{l} w_i. \quad (3)$$
The inequalities $H_O \ge h(T_j)$ for all $j$, (1), and (3) imply
$$8 \cdot H_O \cdot \sum_{i=f}^{l} w_i + h(T_a) \cdot \sum_{i=f}^{l} w_i + \sum_{i=f}^{l} w_i h_i \ge H_4 \cdot \sum_{i=f}^{l} w_i,$$
$$8 \cdot H_O \cdot \sum_{i=f}^{l} w_i + h(T_a) \cdot \sum_{i=f}^{l} w_i + H_O \cdot \sum_{i=f}^{l} w_i \ge H_4 \cdot \sum_{i=f}^{l} w_i,$$
$$8 \cdot H_O + h(T_a) + H_O \ge H_4,$$
$$8 \cdot H_O + H_O + H_O \ge H_4,$$
and, finally,
$$\frac{H_4}{H_O} \le 10.$$
4. Concluding remarks
In this paper, we discuss approaches to multiprocessor task scheduling on a computational Grid. We present solutions for hierarchical systems that include a broker and clusters. The worst-case analysis shows that our strategies can provide efficient task management with a constant guarantee, even though they are based on relatively simple (in fact, on-line) heuristic algorithms.
There are many problems left for future research. From the practical point of view, it is interesting to study the scheduling algorithms on different sets of real instances. These results motivate the search for average-case approximation bounds that can better predict the algorithms' behavior.
It is also interesting to study algorithms for local cluster scheduling other than BL and BLD, say, shelf or split algorithms, that can improve the overall approximation quality. Another interesting question is how fuzzy execution times affect the efficiency. It also seems important to study hierarchical scheduling of malleable tasks, where the number of processors for a task is not given explicitly by the user but can be chosen by the broker.
Acknowledgment
The authors would like to thank the Russian Foundation
for Basic Research for partial support of this work (grant
02-01-00713).
References
[1] A. Abraham, R. Buyya, B. Nath, Nature's heuristics for scheduling jobs on Computational Grids, International Conference on Advanced Computing and Communications (2000).
[2] P. Brucker, Scheduling Algorithms, Springer-Verlag (1998), 217-218.
[3] B.S. Baker, D.J. Brown and H.P. Katseff, A 5/4 algorithm for
two-dimensional packing, J. of Algorithms, 1981, v. 2, pp.
348-368.
[4] B.S. Baker, E.J. Coffman and R.L. Rivest, Orthogonal pack-
ings in two dimensions, SIAM J. Computing, 1980, v. 9, pp.
846-855.
[5] D.J. Brown, An improved BL lower bound, Information Processing Letters, 11, pp. 37-38, 1980.
[6] R. Buyya, D. Abramson, J. Giddy, An Economy Driven Re-
source Management Architecture For Global Computational
Power Grids, International Conference on Parallel and Dis-
tributed Processing Techniques and Applications, 2000.
[7] L. Hluchy, V.D. Tran, D. Froehlich, and W. Castaings, Meth-
ods and Experiences of Parallelizing Flood Models, The 10th
EuroPVM/MPI conference. LNCS 2840. Sept. 2003, Venice.
pp. 677-681.
[8] E.J. Coffman, M.R. Garey, D.S. Johnson and R.E. Tar-
jan, Performance bounds for level-oriented two-dimensional
packing algorithms, SIAM J.Computing, 1980, v. 9, pp. 808-
826.
[9] The Global Grid Forum. http://www.gridforum.org
[10] I. Foster, C. Kesselman, editors. The Grid: Blueprint for a future computing infrastructure, Morgan Kaufmann, San Francisco, 1999.
[11] K. Czajkowski, S. Fitzgerald, I. Foster, C. Kesselman, Grid
Information Services for Distributed Resource Sharing, 10th
IEEE International Symposium on High-Performance Dis-
tributed Computing, 2001.
[12] M. Drozdowski, Scheduling multiprocessor tasks - an
overview, European J. of Oper. Research, 1996, v. 94, pp.
215-230.
[13] S. Orlando, P. Palmerini, R. Perego, F. Silvestri, Scheduling high performance data mining tasks on a data Grid environment, Euro-Par 2002, LNCS 2400, Springer-Verlag Berlin Heidelberg 2002, pp. 375-384.
[14] K. Jansen, Scheduling malleable parallel tasks: an asymptotic fully polynomial-time approximation scheme, Proc. European Symposium on Algorithms, ESA, 2002.
[15] C. Kenyon and E. Remila, A near optimal solution to a two-
dimensional cutting stock problem, Mathematics of Opera-
tions Research, 25 (2000), 645-656.
[16] W. Ludwig and P. Tiwari, Scheduling malleable and nonmal-
leable parallel tasks, Proc. 5th ACM-SIAM Symposium on
Discrete Algorithms, SODA (1994), 167-176.
[17] S. Orlando, P. Palmerini, R. Perego, F. Silvestri, Scheduling high performance data mining tasks on a data Grid environment, Euro-Par 2002, LNCS 2400, Springer-Verlag Berlin Heidelberg 2002, pp. 375-384.
[18] D.D.Sleator, A 2.5-times optimal algorithm for bin packing
in two dimensions, Inf. Processing Letters, 1980, v. 10, pp.
37-40.
[19] A.I. Pospelov, On a problem of packing rectangles into a set
of strips, Proc. of the Institute for System Programming, v.
6, 2004 (to appear).
[20] N. Sample, P. Keyani, G. Wiederhold, Scheduling under uncertainty: Planning for the Ubiquitous Grid, International Conference on Coordination Models and Languages (2002).
[21] D. Shmoys, J. Wein, D. Williamson. Scheduling parallel ma-
chines on-line. SIAM J. Comput., 24:1313-1331, 1995.