Comparison of Scheduling Heuristics for Grid Resource Broker
Sergey Zhuk
MIPT, Moscow
zh sergey@mail.ru
Andrey Chernykh
CICESE Research Center,
Ensenada, Mexico
chernykh@cicese.mx
Arutyun Avetisyan, Sergey Gaissaryan,
Dmitry Grushin
ISP RAS, Moscow
{arut, ssg, grushin}@ispras.ru
Nikolai Kuzjurin, Alexey Pospelov,
Alexander Shokurov
ISP RAS, Moscow
{nnkuz, pospelov, shok}@ispras.ru
Abstract
In this paper, we consider parallel task scheduling problems for hierarchical decentralized systems that consist of homogeneous computational resources, such as clusters, PCs, and supercomputers, and are geographically dispersed. We concentrate on two-level hierarchical scheduling: at the first level, the broker allocates computational tasks to the resources. At the second level, each resource schedules the tasks assigned to it using heuristics based, for instance, on strip-packing algorithms. The allocation strategies and efficiency of the proposed hierarchical scheduling algorithms are discussed.
1. Introduction
Computational Grids [9] are emerging as a new
paradigm for solving large-scale problems in science, en-
gineering, and commerce. They enable the sharing and
aggregation of millions of resources, geographically dis-
tributed across organizations and administrative do-
mains [10]. They comprise heterogeneous resources
(PCs, work-stations, clusters, and supercomputers), poli-
cies, and applications (scientific, engineering, and commer-
cial) with varied requirements. The resources are owned by
different organizations with their own management poli-
cies, usage and cost models. Moreover, the availability of
resources and their load dynamically vary in time. The ser-
vice producers (resource owners) and service consumers
(resource users) have different goals, objectives, strate-
gies, and supply-and-demand patterns. Therefore, a very
important problem is to have an efficient resource man-
agement system intended to optimize relations between
producers and consumers according to some adequate re-
source management strategy.
Most of the systems for Grid resource management and scheduling (such as Legion, Condor, AppLeS PST, NetSolve, PUNCH, XtremWeb, etc.) adopt a conventional strategy, where a scheduler selects the jobs to be executed on a particular resource based on objective cost functions driven by system-centric parameters. These schedulers aim to enhance system throughput and utilization, and to complete execution at the earliest possible time, rather than to improve application processing. The two key players, resource providers and resource consumers, have their own expectations and strategies. Resource providers use strategies of good investment and try to maximize resource utilization, while resource consumers adopt strategies of solving their problems within a required time frame and budget. It should also be taken into account that a user is in competition with other users, and a resource owner is in competition with other resource owners.
The resource consumer interacts with brokers to express requirements such as the budget to solve a given problem, its deadline, and a possible trade-off between these two requirements. The consumer can then choose the provider that best meets these requirements. The resource providers also need tools for expressing their pricing policies, and mechanisms that help them maximize profit and resource utilization.
Rajkumar Buyya [6] proposed an economics-based approach, in which scheduling decisions are made dynamically at runtime and are driven by end-user requirements. Such an approach is implemented in the GRACE system. In this approach, based on a specific price metric, management systems dynamically bargain for available resources, and schedule computations on these resources to meet user requirements.
Computations, as objects with their own goals, resources, and actions, can be viewed as economic objects. With the proliferation of networks, as high-end computing systems have moved from centralized models of control and action toward decentralized ones, projecting economy-driven market mechanisms onto resource management is natural.
The resource broker consists of the following compo-
nents:
Job Control Agent: a persistent control engine responsible for shepherding a job through the system. It coordinates with the Schedule Adviser for schedule generation, handles the actual creation of jobs, maintains job status, and interacts with clients/users, the Schedule Adviser, and the dispatcher.
Schedule Adviser: responsible for resource discovery (using the Grid Explorer), resource selection, and job assignment (schedule generation) to ensure that the user requirements are met.
Grid Explorer: responsible for resource discovery by interacting with the Grid information server, identifying the list of authorized machines, and keeping track of resource status information.
Deployment Agent: responsible for activating task execution on the selected resource as per the scheduler's instruction; it periodically reports the status of task execution to the Job Control Agent.
The Grid resource broker acts as a mediator between the
user and Grid resources using middleware services. It is re-
sponsible for resource discovery, resource selection, bind-
ing of software, data, and hardware resources, initiating
computations, adapting to the changes in Grid resources,
and presenting the Grid to the user as a single, unified re-
source.
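The component responsibilities above can be sketched as a minimal control loop. The class and method names below are our own illustration, not an API specified in the paper, and the resource model is deliberately simplified:

```python
# Hypothetical sketch of the broker components described above.
# All names and data shapes are illustrative assumptions.

class GridExplorer:
    """Discovers authorized resources and tracks their status."""
    def __init__(self, resources):
        self.resources = resources          # e.g. {"clusterA": {"cpus": 64, "busy": 10}}

    def discover(self):
        return list(self.resources)

class ScheduleAdviser:
    """Selects a resource for a job using information from the explorer."""
    def __init__(self, explorer):
        self.explorer = explorer

    def select(self, job):
        # keep only resources large enough for the job, then pick the
        # one with the most free processors
        info = self.explorer.resources
        fits = [r for r in self.explorer.discover()
                if info[r]["cpus"] >= job["cpus"]]
        if not fits:
            return None
        return max(fits, key=lambda r: info[r]["cpus"] - info[r]["busy"])

class DeploymentAgent:
    """Activates task execution on the selected resource."""
    def deploy(self, job, resource):
        return {"job": job["name"], "resource": resource, "status": "running"}

class JobControlAgent:
    """Shepherds a job through the system: selection, then deployment."""
    def __init__(self, adviser, deployer):
        self.adviser, self.deployer = adviser, deployer

    def submit(self, job):
        resource = self.adviser.select(job)
        if resource is None:
            return {"job": job["name"], "status": "rejected"}
        return self.deployer.deploy(job, resource)
```

A single `submit` call then walks a job through discovery, selection, and deployment, mirroring the division of labor in the list above.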
In this paper, we address the problem of scheduling parallel jobs in such systems. We consider a two-level hierarchy: at the first level, the broker allocates computational tasks to clusters according to some criteria. At the second level, each cluster schedules the parallel tasks assigned to it by its own local scheduler. Such local schedulers can use heuristics, in particular ones based on strip-packing algorithms. Here we do not consider implementation issues or issues related to resource discovery and selection. The main objective is to compare different scheduling strategies and estimate their efficiency.
We restrict our analysis to scheduling systems where all tasks are given at time 0 and are processed in the same batch. That is, the set of available ready tasks is executed up to the completion of the last one; all tasks that arrive in the system during this time are processed in the next batch. The relation between this scheme and schemes where tasks arrive over time, either at their release times, according to precedence constraints, or released by different users, is known and has been studied for different scheduling strategies in general and restricted cases [21]. Using the most basic of these results, variants of our strategies that allow release times are 2-competitive with respect to the batch-style algorithms discussed in this paper.
2. Scheduling strategies
We consider the following simple model. Suppose we have n tasks and m clusters with identical processors. We assume that all tasks are independent, i.e., there is no communication between tasks. Each task is described by a pair: its size p_j (the number of requested processors) and its execution time t_j on p_j processors. Let w_i be the number of processors of the i-th cluster. Each task can be run on some cluster, so the maximum task size does not exceed the maximum number of processors in a cluster (max_j p_j ≤ max_i w_i). Our objective is to minimize the total completion time.
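In this model an instance is just a list of (p_j, t_j) pairs and a list of cluster widths. The following sketch (function names are ours, not from the paper) encodes the feasibility condition above together with the standard area/length lower bound on the optimal makespan:

```python
# Sketch of the model: n independent rigid tasks (p_j, t_j) and
# m clusters of identical processors with widths w_i.
# Names are illustrative; the bound is the standard area/length bound.

def feasible(tasks, widths):
    # every task must fit on the widest cluster: max_j p_j <= max_i w_i
    return max(p for p, t in tasks) <= max(widths)

def makespan_lower_bound(tasks, widths):
    # no schedule can finish before the longest task, and no schedule
    # can process more work than (total width) x (makespan):
    #   sum_j p_j * t_j <= C_max * sum_i w_i
    longest = max(t for p, t in tasks)
    area = sum(p * t for p, t in tasks)
    return max(longest, area / sum(widths))
```

Such a bound is useful below as a cheap stand-in for the (NP-hard to compute) optimum when evaluating heuristics empirically.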
Scheduling multiprocessor tasks with unit processing times is a strongly NP-hard problem [1], and can be considered as a strip-packing problem in which a set of rectangles (size-time task rectangles) must be packed into a strip of limited width and unbounded height. One of the known heuristics is Bottom-Left (BL): put a rectangle as low as possible, and then move it as far left as possible. It is known that for some problems BL cannot find an optimal packing [4, 19], nor does it perform well in practice when applied to a random ordering (which is the case for heuristic H_1 described below). However, a very successful approach is to apply BL to the tasks ordered by decreasing width (the case of heuristics H_2-H_4 described below), which is referred to as Bottom-Left Decreasing (BLD). BLD has been shown to be a 3-approximation [4]. Results on the asymptotic performance ratio of different heuristics for this problem, and recent improvements, are presented in [3, 4, 8, 14, 15, 16, 18].
In particular, in [8] it was shown that algorithms that place the rectangles on "shelves" using one-dimensional bin-packing heuristics have an asymptotic performance ratio of 2.7 when the rectangles are sorted by decreasing height (First-Fit shelf algorithms). The asymptotic performance ratio of the best heuristic was further reduced to 2.5 in [18], then to 4/3 (see [15]), and finally to 5/4 [3]. Recent improvements are presented in [14, 15]. Scheduling tasks on a set of clusters can be viewed as a problem of packing tasks into several strips of different widths. We consider the following scenario. In the first stage, the broker analyzes a task request and broadcasts the task parameters to all clusters. Each cluster returns an estimate of the corresponding completion time. The broker assigns the task to the cluster with the minimum estimated completion time. In the second stage, the clusters allocate tasks to processors using the BL heuristic. In this paper, we consider different algorithms for these two stages, and estimate the quality of the resulting hierarchical scheduling.
Now we present the different scheduling strategies, in which allocation of tasks to processors (by a cluster's local scheduler) is performed by the BL or BLD heuristic.
Heuristic 1: MCT+BL
In the first stage, the broker sends the task description to all clusters. Each cluster returns an estimate of the completion time. The broker allocates the task to the cluster that provides the minimum completion time (MCT). In the second stage, the task is executed on the cluster using the BL heuristic. Allocation and execution are performed in the order in which the tasks arrived.
Heuristic 2: MCT-SORT+BL
In the first stage, the broker sorts all available tasks by decreasing size (width), and then functions as in MCT+BL.
Heuristic 3: MCT+BLD
In the first stage, the broker functions as in MCT+BL, and the tasks are then executed using the BLD strategy.
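The first two strategies can be illustrated with a compact simulation. The cluster-level placement below is a simplified discrete stand-in for BL (a task takes the contiguous window of processors with the lowest possible start time), not a full bottom-left packing, and all names are our own assumptions:

```python
# Simplified simulation of MCT+BL (Heuristic 1) and MCT-SORT+BL
# (Heuristic 2). Per-cluster placement approximates BL: a task of
# width p takes the contiguous window of p processors whose start
# time (the max free time in the window) is lowest.

def place(free, p, t):
    """Place a (p, t) task on a cluster given per-processor free times.
    Mutates `free`; returns the task's completion time."""
    best_s, best_start = None, None
    for s in range(len(free) - p + 1):
        start = max(free[s:s + p])
        if best_start is None or start < best_start:
            best_s, best_start = s, start
    for q in range(best_s, best_s + p):
        free[q] = best_start + t
    return best_start + t

def mct_schedule(tasks, widths, sort_by_width=False):
    """Broker loop: send each task to the cluster with the minimum
    estimated completion time (MCT); return the overall makespan."""
    if sort_by_width:                       # Heuristic 2 preprocessing
        tasks = sorted(tasks, key=lambda pt: -pt[0])
    clusters = [[0] * w for w in widths]
    makespan = 0
    for p, t in tasks:
        # estimate each feasible cluster's completion time on a copy
        estimates = [(place(list(f), p, t), i)
                     for i, f in enumerate(clusters) if len(f) >= p]
        _, i = min(estimates)
        makespan = max(makespan, place(clusters[i], p, t))   # commit
    return makespan
```

On the adversarial alternating list of full-width short tasks and narrow long tasks (the construction used for H_1 in the next section), the unsorted variant degrades while the sorted one does not: for example, `[(3, 1), (1, 100)] * 2` on a single width-3 cluster yields makespans 202 and 102 respectively under this simulation.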
3. Comparison of scheduling strategies
We use the following notation:
T = {T_1, ..., T_n} is the set of tasks,
h(T_j) is the execution time of task T_j,
w(T_j) is the size of task T_j (the number of processors it requires),
C = {C_1, ..., C_m} is the set of clusters,
w_i is the size (width) of the i-th cluster.
We will assume that the clusters are sorted by their widths, i.e., w_1 ≤ w_2 ≤ ... ≤ w_m.
Let H_O be the minimum completion time of a given list of tasks (the optimum over all possible schedules). Let H_i be the completion time obtained by the i-th heuristic (i = 1, 2, 3) described above. It can be shown that the ratio H_1/H_O (the approximation ratio) cannot be bounded by a constant. To show this, it is sufficient to consider only one cluster of width w and the following list of tasks {T_1, T_2, T_1, T_2, ...}, where w(T_1) = w, h(T_1) = ε, w(T_2) = 1, h(T_2) = H, and H is sufficiently large.
For Heuristic 2, a constant worst-case approximation ratio has been proved.
Theorem 1 [19]. For any list of tasks and any set of clusters the following inequality holds:
H_2/H_O ≤ 3.
The constant worst-case approximation ratio guaranteed by Heuristic 2 is a very desirable property. A drawback of Heuristic 2 is that it requires the broker to perform a preprocessing step (sorting the tasks), which looks somewhat unrealistic (a broker can manage a large number of clusters). It is more reasonable to put this work on the clusters' side and to use Heuristic 3, so that the broker manages tasks on-line.
It is interesting to ask whether an analogue of Theorem 1 can be proved for the ratio H_3/H_O. In some cases (when all job widths are small enough) it can be shown that the ratio is bounded by a constant. For example, the following theorem can be proved.
Theorem 2. For any list of tasks and any set of clusters such that w(T_j) ≤ w_1 for every j, the following inequality holds:
H_3/H_O ≤ 4.
For the general case, an analogue of Theorem 1 cannot be proved, as the following example shows.
Example. Let the clusters and the tasks be divided into groups according to their widths. Let there be n + 1 groups of width W_i, i = 0, 1, ..., n. We use the following notation:
n + 1 is the number of groups of machines (tasks),
M_i (i = 0, ..., n) is the number of machines in group i,
N_i (i = 0, ..., n) is the number of tasks in group i,
W_i (i = 0, ..., n) is the width of machines (tasks) in group i,
h_i (i = 0, ..., n) is the height of tasks in group i,
H_O is the height of an optimal packing,
H_3 is the height of the packing obtained by Heuristic 3.
The instance below shows that the ratio H_3/H_O may be arbitrarily large. Let
M_i = 2^i, i = 0, ..., n,
N_i = (i + 1) · 2^i, i = 0, ..., n,
W_i = 2^(n-i), i = 0, ..., n,
h_i = 1/(i + 1), i = 0, ..., n.
Let the tasks arrive in order of increasing group index. Since
h_i · N_i · W_i / (M_i · W_i) = ((1/(i+1)) · (i + 1) · 2^i) / 2^i = 1, i = 0, ..., n,
obviously
H_O = 1.
But
N_i · W_i = 2^(n-i) · (i + 1) · 2^i = (i + 1) · 2^n = Σ_{k=0}^{i} M_k W_k = Σ_{k=0}^{i} 2^k · 2^(n-k).
Hence, each group of tasks can completely fill one layer on all the clusters, and
H_3 = Σ_{k=0}^{n} 1/(k + 1) ≥ ln n,
which means that the ratio H_3/H_O may be arbitrarily large.
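The identities used in this example are easy to verify numerically. The following check (our own, not from the paper) confirms that each group's task area is exactly 2^n, that the total task area matches the total machine width (so H_O = 1), that group i exactly fills one layer across clusters 0..i, and that H_3 is the harmonic number H_{n+1}, which grows like ln n:

```python
# Numerical check of the lower-bound example: n+1 groups with
# M_i = 2^i machines, N_i = (i+1)*2^i tasks, width W_i = 2^(n-i),
# and task height h_i = 1/(i+1).

def check_example(n):
    M = [2 ** i for i in range(n + 1)]
    N = [(i + 1) * 2 ** i for i in range(n + 1)]
    W = [2 ** (n - i) for i in range(n + 1)]
    # group i's task area: h_i * N_i * W_i = 2^n, checked in integers
    # as N_i * W_i == (i+1) * 2^n
    assert all(N[i] * W[i] == (i + 1) * 2 ** n for i in range(n + 1))
    # total task area (n+1)*2^n equals the total machine width, so H_O = 1
    assert sum(M[i] * W[i] for i in range(n + 1)) == (n + 1) * 2 ** n
    # group i's tasks exactly fill one layer on clusters 0..i
    assert all(N[i] * W[i] == sum(M[k] * W[k] for k in range(i + 1))
               for i in range(n + 1))
    # Heuristic 3 stacks one layer of height 1/(i+1) per group:
    # the harmonic number H_{n+1}, unbounded in n
    return sum(1.0 / (i + 1) for i in range(n + 1))
```

For n = 10 the returned height is already about 3.02 while the optimum stays at 1, and it keeps growing logarithmically with n.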
Heuristic 3 can be modified to achieve a constant approximation ratio. Consider the following modification. Let
first(T_j) be the minimum i such that w_i ≥ w(T_j),
last(T_j) be the minimum r such that
Σ_{i=first(T_j)}^{r} w_i ≥ (1/2) Σ_{i=first(T_j)}^{m} w_i,
H_4 be the completion time obtained by Heuristic 4,
BL_i(W) be the packing of the set of tasks W on the i-th cluster obtained by the BLD algorithm,
S(W) be the total area of all tasks from the set W.
The clusters first(T_j), first(T_j) + 1, ..., last(T_j) will be called admissible for the task T_j.
Heuristic 4. In the first stage, the broker functions as in Heuristic 3 with only one difference: it sends the task description only to the clusters admissible for this task. Each of these clusters uses the BLD algorithm, and the broker chooses the cluster that gives the minimum completion time.
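The admissible window can be computed directly from the definitions of first and last. A small sketch (function name is ours), assuming the cluster widths are given sorted in non-decreasing order:

```python
def admissible(widths, task_width):
    """Return (first, last) cluster indices admissible for a task.
    `widths` must be sorted in non-decreasing order.
    first = min i with w_i >= w(T_j); last = min r with
    sum_{i=first..r} w_i >= (1/2) * sum_{i=first..m} w_i."""
    first = next(i for i, w in enumerate(widths) if w >= task_width)
    total = sum(widths[first:])     # compare 2*prefix against this total
    prefix = 0
    for r in range(first, len(widths)):
        prefix += widths[r]
        if 2 * prefix >= total:
            return first, r
```

For example, on four equal-width clusters the window covers roughly the first half of the clusters that fit the task, which is what caps how much capacity the broker may ignore in the proof below.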
Theorem 3. For any list of tasks and any set of clusters,
H_4/H_O ≤ 10.
Proof. Suppose the maximum completion time is attained on the k-th cluster, let T_a be the last task received by this cluster from the broker, and let
f = first(T_a), l = last(T_a).
Let W_f, ..., W_l be the sets of tasks that were allocated on the clusters f, ..., l (admissible for T_a) just before the arrival of the task T_a. Because the task was sent to the k-th cluster,
BL_i(W_i ∪ {T_a}) ≥ H_4, i = f, ..., l.
Let T_{c_i} be the task with the maximum completion time in BL_i(W_i ∪ {T_a}). Let h_i = h(T_{c_i}), let t_i be the time at which this task starts execution, and let r_i = t_i − h(T_a). We have
t_i + h_i ≥ H_4, i = f, ..., l,
r_i + h(T_a) + h_i ≥ H_4, i = f, ..., l,
Σ_{i=f}^{l} w_i r_i + h(T_a) · Σ_{i=f}^{l} w_i + Σ_{i=f}^{l} w_i h_i ≥ H_4 · Σ_{i=f}^{l} w_i.  (1)
Before the time t_i, the i-th cluster is filled at least half-full (a property of the BL algorithm). Hence,
S(W_i) ≥ (1/2) · w_i · r_i.
Let T_b be the task requiring the minimal number of processors among the tasks allocated on clusters f, ..., l, and let f_0 = first(T_b). Then none of the tasks allocated on clusters f, ..., l can be allocated on a cluster C_i with i < f_0. Since T_b is allocated on one of the clusters f, ..., l, we have last(T_b) ≥ f, and therefore, by the minimality of last(T_b),
Σ_{i=f_0}^{f-1} w_i < (1/2) Σ_{i=f_0}^{m} w_i,  i.e.,  Σ_{i=f}^{m} w_i ≥ (1/2) Σ_{i=f_0}^{m} w_i.
Since l = last(T_a), we have
Σ_{i=f}^{l} w_i ≥ (1/2) Σ_{i=f}^{m} w_i,
and hence
Σ_{i=f_0}^{m} w_i ≤ 2 Σ_{i=f}^{m} w_i ≤ 4 Σ_{i=f}^{l} w_i.  (2)
Thus,
H_O · Σ_{i=f_0}^{m} w_i ≥ S(∪_{i=f}^{l} W_i) ≥ (1/2) Σ_{i=f}^{l} w_i r_i,
and, by (2),
Σ_{i=f}^{l} w_i r_i ≤ 2 · H_O · Σ_{i=f_0}^{m} w_i ≤ 8 · H_O · Σ_{i=f}^{l} w_i.  (3)
The inequalities H_O ≥ h(T_j) for all j, together with (1) and (3), imply
8 · H_O · Σ_{i=f}^{l} w_i + h(T_a) · Σ_{i=f}^{l} w_i + Σ_{i=f}^{l} w_i h_i ≥ H_4 · Σ_{i=f}^{l} w_i,
8 · H_O · Σ_{i=f}^{l} w_i + h(T_a) · Σ_{i=f}^{l} w_i + H_O · Σ_{i=f}^{l} w_i ≥ H_4 · Σ_{i=f}^{l} w_i,
8 · H_O + h(T_a) + H_O ≥ H_4,
8 · H_O + H_O + H_O ≥ H_4,
and, finally,
H_4/H_O ≤ 10.
4. Concluding remarks
In this paper, we discuss approaches to multiprocessor task scheduling on a computational Grid. We present solutions for hierarchical systems that include a broker and clusters. The worst-case analysis shows that our strategies can provide efficient task management with a constant guarantee even though they are based on relatively simple (in fact, on-line) heuristic algorithms.
There are many problems left for future research. From the practical point of view, it is interesting to study the scheduling algorithms on different sets of real instances. Such results motivate finding average-case approximation bounds that can better predict the algorithms' behavior.
It is also interesting to study algorithms for local cluster scheduling other than BL and BLD, say, shelf or split algorithms, that can improve the total approximation quality. Another interesting question is how fuzzy execution times affect efficiency. It also seems important to study hierarchical scheduling of malleable tasks, where the number of processors for a task is not given explicitly by the user but can be chosen by the broker.
Acknowledgment
The authors would like to thank the Russian Foundation
for Basic Research for partial support of this work (grant
02-01-00713).
References
[1] A. Abraham, R. Buyya, B. Nath, Nature’s heuristics for
scheduling jobs on Computational Grids, International Con-
ference on Advanced Computing and Communications
(2000).
[2] P. Brucker, Scheduling Algorithms, Springer-Verlag (1998),
217-218.
[3] B.S. Baker, D.J. Brown and H.P. Katseff, A 5/4 algorithm for
two-dimensional packing, J. of Algorithms, 1981, v. 2, pp.
348-368.
[4] B.S. Baker, E.G. Coffman Jr. and R.L. Rivest, Orthogonal pack-
ings in two dimensions, SIAM J. Computing, 1980, v. 9, pp.
846-855.
[5] D.J. Brown, An improved BL lower bound, Information Pro-
cessing Letters, 11, pp. 37-38, 1980.
[6] R. Buyya, D. Abramson, J. Giddy, An Economy Driven Re-
source Management Architecture For Global Computational
Power Grids, International Conference on Parallel and Dis-
tributed Processing Techniques and Applications, 2000.
[7] L. Hluchy, V.D. Tran, D. Froehlich, and W. Castaings, Meth-
ods and Experiences of Parallelizing Flood Models, The 10th
EuroPVM/MPI conference. LNCS 2840. Sept. 2003, Venice.
pp. 677-681.
[8] E.G. Coffman Jr., M.R. Garey, D.S. Johnson and R.E. Tar-
jan, Performance bounds for level-oriented two-dimensional
packing algorithms, SIAM J.Computing, 1980, v. 9, pp. 808-
826.
[9] The Global Grid Forum. http://www.gridforum.org
[10] I. Foster, C. Kesselman, editors. The Grid: Blueprint for a
New Computing Infrastructure, Morgan Kaufmann, San Fran-
cisco, 1999.
[11] K. Czajkowski, S. Fitzgerald, I. Foster, C. Kesselman, Grid
Information Services for Distributed Resource Sharing, 10th
IEEE International Symposium on High-Performance Dis-
tributed Computing, 2001.
[12] M. Drozdowski, Scheduling multiprocessor tasks - an
overview, European J. of Oper. Research, 1996, v. 94, pp.
215-230.
[13] S. Orlando, P. Palmerini, R. Perego, F. Silvestri, Scheduling
high performance data mining tasks on a data Grid environ-
ment, Euro-Par 2002, LNCS 2400, Springer-Verlag Berlin
Heidelberg 2002, pp. 375-384.
[14] K. Jansen, Scheduling malleable parallel tasks: an asymp-
totic fully polynomial-time approximation scheme, Proc. Eu-
ropean Symposium on Algorithms, ESA, 2002.
[15] C. Kenyon and E. Remila, A near optimal solution to a two-
dimensional cutting stock problem, Mathematics of Opera-
tions Research, 25 (2000), 645-656.
[16] W. Ludwig and P. Tiwari, Scheduling malleable and nonmal-
leable parallel tasks, Proc. 5th ACM-SIAM Symposium on
Discrete Algorithms, SODA (1994), 167-176.
[17] S. Orlando, P. Palmerini, R. Perego, F. Silvestri, Scheduling
high performance data mining tasks on a data Grid environ-
ment, Euro-Par 2002, LNCS 2400, Springer-Ferlag Berlin
Heidelberg 2002, pp. 375-384.
[18] D.D. Sleator, A 2.5-times optimal algorithm for bin packing
in two dimensions, Inf. Processing Letters, 1980, v. 10, pp.
37-40.
[19] A.I. Pospelov, On a problem of packing rectangles into a set
of strips, Proc. of the Institute for System Programming, v.
6, 2004 (to appear).
[20] N. Sample, P. Keyani, G. Wiederhold, Scheduling under
uncertainty: Planning for the Ubiquitous Grid, International
Conference on Coordination Models and Languages (2002).
[21] D. Shmoys, J. Wein, D. Williamson. Scheduling parallel ma-
chines on-line. SIAM J. Comput., 24:1313-1331, 1995.
... The feature of such schemes is that each local domain can use its own resource broker policy. On the other hand, the job cannot be moved or reallocated to another resource at different domains, even when a better resource is found [37, 146]. ...
... In [8] the scheduling algorithm proposed tries to minimize the total average task delay and maximize resource utilization. The scheduling algorithms proposed in [2], [13] and [14] try to minimize the total completion time by dropping over-demanding tasks (e.g., tasks of high workload and short deadlines). Other performance metrics used are the average task slowdown [7], defined as the ratio of the task's total delay to its actual run time, the deadline miss ratio [6], and several other metrics. ...
Article
Full-text available
Grids offer a transparent interface to geographically scattered computation, communication, storage and other resources. In this chapter we propose and evaluate QoS-aware and fair scheduling algorithms for Grid Networks, which are capable of optimally or near-optimally assigning tasks to resources, while taking into consideration the task characteristics and QoS requirements. We categorize Grid tasks according to whether or not they demand hard performance guarantees. Tasks with one or more hard requirements are referred to as Guaranteed Service (GS) tasks, while tasks with no hard requirements are referred to as Best Effort (BE) tasks. For GS tasks, we propose scheduling algorithms that provide deadline or computational power guarantees, or offer fair degradation in the QoS such tasks receive in case of congestion. Regarding BE tasks our objective is to allocate resources in a fair way, where fairness is interpreted in the max-min fair share sense. Though, we mainly address scheduling problems on computation resources, we also look at the joint scheduling of communication and computation resources and propose routing and scheduling algorithms aiming at co-allocating both resource type so as to satisfy their respective QoS requirements.
... A number of scheduling algorithms have been proposed so far, both for single-and for multi-processor systems, some of which have also been adapted for use in Grids. Furthermore lately, a number of scheduling schemes that are specific to Grids have also been proposed [4], [5] and [3]. QoS in Data Networks has been extensively studied. ...
Article
Full-text available
Future Grid Networks should be able to provide Quality of Service (QoS) guar-antees to their users. In this work we examine the way Grid resources should be configured so as to provide deterministic delay guarantees to Guaranteed Service (GS) users and fairness to Best Effort (BE) users. The resources are partitioned in groups that serve GS users only, or BE users only, or both types of users with different priorities. Furthermore, the GS users are registered to the resources either statically or dynamically, while both single and multi-Cpu resources are examined. Finally the proposed resource configurations for providing QoS are implemented in the GridSim environment and a number simulations are executed. Our results indicate that the allocation of resources to both types of users, with different priorities, results in fewer deadlines missed and better resources utiliza-tion. Finally benefits can be derived from the dynamic registration of GS users to the resources.
Article
Job scheduling is an old topic in High-Performance Computing (HPC), and it is more and more studied in data centers. Large data centers are often split into separate partitions for cloud computing and HPC; each partition normally has its specific scheduler. The possibility of migrating jobs from the HPC partition to the cloud one is a topic widely discussed in the literature. However, job migration from cloud to HPC is a much less explored topic. Nevertheless, such migration may be useful in many situations, in particular when the HPC platform has a low resource usage level, and the cloud usage level is high. A large number of jobs that could migrate from the cloud to the HPC partition may be observed in Google data center workloads. Job scheduling using overbooking strategy is seen as the main reason for the high resource usage level in clouds. However, overbooking can lead to a high rate of rescheduling and job dumping, which potentially causes response time violations. This work shows that HPC platforms can host and execute some cloud jobs with low interference in HPC jobs and a low number of response time violations. We introduce the definition of a cloud-HPC convergence area and propose a job scheduling strategy for it, aiming at reducing the number of response time violations of cloud jobs without interfering with HPC jobs execution. Our proposal is formally defined and then evaluated in different execution scenarios, using the SimGrid simulation framework, with workload data from production HPC grid. The experimental results show that often, there is a large number of empty areas in the scheduling plan of HPC platforms, which makes it possible to allocate cloud jobs by backfilling. This is due to the sparse HPC job submission pattern and the low resource usage level in some HPC platforms. One performed simulation scenario considered a set of 11K parallel HPC jobs running on a 2560-processor platform having an average resource usage level of 38.0%. 
The proposed convergence scheduler succeeded to inject around 267K cloud jobs in the HPC platform, with a response time violation rate under 0.00094% for such jobs, considering 80 processors in the convergence area and no effects on the HPC workload. This experiment considered cloud jobs based on job features of Google public cloud workloads, with a processing time slack factor of 1.25 (which is considered as high priority in the Google cloud SLA—Service Level Agreement). Usually, most cloud jobs show a slack factor higher than 1.25 (most cloud jobs are medium or low priority). The same simulation, repeated with a higher slack factor (4), showed no response time violations.
Article
Full-text available
In this paper, we address the problem of dynamic resource allocation in presence of job runtime uncertainty. We develop an execution delay model for runtime prediction, and design an adaptive stochastic allocation strategy, named Pareto Fractal Flow Predictor (PFFP). We conduct a comprehensive performance evaluation study of the PFFP strategy on real production traces, and compare it with other well-known non-clairvoyant strategies over two metrics. In order to choose the best strategy, we perform bi-objective analysis according to a degradation methodology. To analyze possible biasing results and negative effects of allowing a small portion of the problem instances with large deviation to dominate the conclusions, we present performance profiles of the strategies. We show that PFFP performs well in different scenarios with a variety of workloads and distributed resources.
Article
Full-text available
In this paper the problem of scheduling parallel tasks on a group of clusters and formalization of this process as an optimization of packing rectangles in a set of strips of different widths is considered (Multiple Strip Packing). Some modern results concerning this problem and some open problems are presented. Practical aspects of optimization of scheduling parallel tasks process with different criteria of its quality are considered. The description of modeling system developed in the ISP RAS for experimental investigation of scheduling algorithms is presented and its properties are described.
Article
The model and approach to the scheduling of computing resources in a two-level Grid-system are proposed. The dynamic procedure of scheduling resources in a heterogeneous environment on the basis of solving a minimum covering problem, as well as the software that implements a discrete-event scheduling model, are developed. Numerical experiments on the basis of software implementation of the model are given, proving the effectiveness of the proposed model of resource scheduling in heterogeneous systems in the chosen metrics of the system performance. It is shown that the proposed procedure of scheduling enables one to maximize the loading of heterogeneous system resources, to reduce the run-time of all queue of tasks in a Grid-system compared with the common FCFS method. Realization of the proposed method in the MAUI scheduler is considered.
Article
Full-text available
This paper describes the experimental analysis of scheduling strategies for a computational Grid using two level hierarchy model. At the first level, a broker allocates computational jobs to the resource from a set of available ones. At the second level, each resource generates a schedule of the parallel jobs assigned to it by its own local scheduler. Based on the fact that the actual job execution time is different from the estimation provided by users, we made a performance evaluation of Grid scheduling strategies, by considering such a variation called time fluctuation. Scheduling strategies are evaluated considering workloads of real supercomputers.
Article
Grid resource management, Dynamic adaptation, resource provisioning, SLA management
A grid computing environment enables sharing of the loosely coupled resources and services required by various applications on a large scale. In such an environment, one of the key challenges is to develop a flexible, scalable, and self-adaptive resource management system that would allow users to carry out their jobs by transparently accessing autonomous, distributed, and heterogeneous resources. In this paper, we discuss the research issues and conceptual architectural design of such a dynamic resource management framework, which leverages the open-source Globus Toolkit and the commercially available HP OpenView Configuration Management Solutions software (Radia). Our approach provides adaptive and scalable middleware for static and dynamic resource provisioning, resource monitoring, virtual organization-wide authorization, and business policy management. The framework is based on automated, policy-driven change and configuration management functionality that can dynamically adjust the size, configuration, and allocation of the various resources consumed in the environment.
Conference Paper
Full-text available
Increasingly, the datasets used for data mining are becoming huge and physically distributed. Since the distributed knowledge discovery process is both data and computationally intensive, the Grid is a natural platform for deploying a high-performance data mining service. The focus of this paper is on the core services of such a Grid infrastructure. In particular, we concentrate on the design and implementation of a specialized broker that is aware of data source locations and the resource needs of data mining tasks. Allocation and scheduling decisions are made on the basis of performance cost metrics and models that exploit knowledge of previous executions and use sampling to obtain estimates of execution behavior.
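The idea of estimating a task's cost from previous executions can be illustrated with a toy sketch. Everything here is hypothetical (the function name, the per-node history format, and the linear time-per-unit-size model are assumptions, not the paper's actual cost model):

```python
def pick_node(task_size, history):
    """Pick the node with the lowest estimated cost for a task.

    history maps node -> list of (input_size, observed_time) from past runs;
    we fit a simple rate (time per unit size) and project it onto the new task.
    """
    def estimate(node):
        runs = history[node]
        rate = sum(t / s for s, t in runs) / len(runs)  # mean time per unit size
        return rate * task_size
    return min(history, key=estimate)
```

A real broker would also weigh data-transfer cost from the task's data sources, which this sketch omits.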
Conference Paper
Full-text available
This paper focuses on parallelization process of DaveF, a new two-dimensional depth-averaged flow and sediment transport model that allows breach development and the resulting flood wave to be simulated simultaneously. Problems encountered during parallelization and techniques used to solve them are described. The experimental results with different input data on different machines are also included.
Article
We analyze several “level-oriented” algorithms for packing rectangles into a unit-width, infinite-height bin so as to minimize the total height of the packing. For the three algorithms we discuss, we show that the ratio of the height obtained by the algorithm to the optimal height is asymptotically bounded, respectively, by 2, 1.7, and 1.5. The latter two improve substantially over the performance bounds for previously proposed algorithms. In addition, we give more refined bounds for special cases in which the widths of the given rectangles are restricted and in which only squares are to be packed.
Conference Paper
A malleable parallel task is one whose execution time is a function of the number of (identical) processors allotted to it. We study the problem of scheduling a set of n independent malleable tasks on an arbitrary number m of parallel processors and propose an asymptotic fully polynomial time approximation scheme. For any fixed ε > 0, the algorithm computes a non-preemptive schedule of length at most (1 + ε) times the optimum (plus an additive term) and has running time polynomial in n, m, and 1/ε.
Article
Multiprocessor tasks require more than one processor at the same moment of time. This relatively new concept in scheduling theory emerged with the advent of parallel computing systems. In this work we present the state of the art in multiprocessor task scheduling. We show the rationale behind the concept of multiprocessor tasks. The standard three-field notation is extended to accommodate multiprocessor tasks. The main part of the work is the presentation of results in multiprocessor task scheduling, for both parallel and dedicated processors.
Conference Paper
Computational Grid projects are ushering in an environment where clients make use of resources and services that are far too expensive for single clients to manage or maintain. Clients compose a megaprogram with services offered by outside organizations. However, the benefits of this paradigm come with a loss of control over job execution and added uncertainty about job completion. Current techniques for scheduling distributed services do not simultaneously account for autonomous service providers whose performance, reliability, and cost are not controlled by the service user. We propose a scheduling approach that compensates for this uncertainty. Our approach builds initial schedules from cost estimates supplied by service providers and monitors job progress during execution to determine whether future deadlines will be met. This enables early hazard detection and facilitates schedule repairs that compensate for delays.
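The hazard-detection idea above can be reduced to a small check. This is a hedged sketch under an assumed linear progress model (the function name, inputs, and the linear projection are illustrative, not the paper's method): project a job's total runtime from its progress so far and flag it if the projection misses the deadline.

```python
def at_risk(elapsed, fraction_done, deadline):
    """Return True if a job's projected completion time exceeds its deadline.

    elapsed: time spent so far; fraction_done: progress in (0, 1];
    deadline: total time budget. Assumes progress continues at the same rate.
    """
    if fraction_done <= 0:
        return elapsed >= deadline  # no progress yet: at risk only if already late
    projected_total = elapsed / fraction_done
    return projected_total > deadline
```

A monitor would run such a check periodically and trigger schedule repair (e.g. re-dispatching the job) when it returns True.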