MISTA 2013
An Iterative Local Search Algorithm for Scheduling
Precedence-Constrained Applications on Heterogeneous
Machines
Aurelio A. Santiago Pineda · Johnatan E. Pecero · Héctor J. Fraire Huacuja ·
Juan J. Gonzalez Barbosa · Pascal Bouvry
Abstract The paper deals with the problem of scheduling precedence-constrained ap-
plications on a distributed heterogeneous computing system with the aim of minimizing
the response time or total execution time. The main contribution is a scheduling al-
gorithm that promotes an iterative local search process. Due to a lack of generally
accepted standard benchmarks for the evaluation of scheduling algorithms in the het-
erogeneous computing systems we also generate a benchmark of synthetic instances.
The benchmark is composed of small size synthetic deterministic non-preemptive pro-
gram graphs known in the literature. We compute the optimal solution and the global
optimal value with an exact enumerative search method that explores all possible so-
lutions. We compare the performance of the proposed local search algorithm with the
optimal values. We have simulated the proposed algorithm using graphs obtained from
real-world applications emphasizing the interest of the approach.
1 Introduction
Heterogeneous computing systems are a commonplace infrastructure that provides re-
sources in a distributed way interconnected via networks for executing parallel appli-
cations with a large amount of data and computing power. In such a system, a parallel
application can be partitioned into a number of cooperative tasks that are distributed
to the resources for parallel execution. However, the performance of a parallel appli-
cation executed on a parallel and distributed computing system heavily depends on
the scheduling of the tasks from the application onto the available processors in the
system, which if not properly solved, can nullify the benefits of parallelization and the
power of the distributed computing resources. Moreover, not only the performance of
the parallel application is deteriorated, but also issues related to energy consumption
are affected if the problem of scheduling is not properly handled [17,20,28].
Johnatan E. Pecero ·Pascal Bouvry
University of Luxembourg
E-mail: {johnatan.pecero, pascal.bouvry}@uni.lu
Aurelio A. Santiago Pineda · Héctor J. Fraire Huacuja · Juan J. Gonzalez Barbosa
Instituto Tecnológico de Ciudad Madero
E-mail: alx.santiago@gmail.com, automatas2002@yahoo.com.mx
E-mail: jjgonzalezbarbosa@hotmail.com
6th Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2013)
27-29 August 2010, Gent, Belgium
In its general form, the precedence-constrained scheduling problem is NP-complete [7].
Therefore, many heuristics based scheduling algorithms have been proposed that find
a sub-optimal solution and attempt to balance running time, complexity and sched-
ule quality [5,14–16, 19, 25]. However, the performance of these heuristics is still an
open research problem [4,10,11,27,31]. Consequently, there is an increasing interest
in investigating and designing heuristics for scheduling precedence-constrained parallel
programs on heterogeneous computing systems. The main motivations are not only the
availability of heterogeneous computing platforms, such as Grid and P2P systems, but
also the growing interest in industry and science in executing parallel applications
that can be modelled by precedence task graphs or workflows [11].
In this paper we propose a local search algorithm based on an iterative search
process to solve the precedence-constrained scheduling problem. The iterative local
search (ILS) algorithm is a straightforward, yet powerful technique for extending simple
local search algorithms. We generate a benchmark composed of small-size synthetic
deterministic parallel program graphs proposed in the literature. We compute the optimal
solution by an enumerative search process that exhaustively explores the search space.
We compare the performance of the proposed ILS algorithm with the optimal values
using an approximation factor. To evaluate and investigate scalability issues we
also simulated the ILS algorithm using parallel graphs that model real-world applications.
The results of the experimental study show that the algorithm is efficient in solving
the problem, providing results close to the optimal value.
The paper is organized as follows. In Section 2 we describe the precedence-constrained
scheduling problem. Section 3 discusses related work. The proposed iterated local search
algorithm is described in Section 4. Next, in Section 5 we present the benchmark and
experimental results. Section 6 concludes the paper.
2 Problem description
The target system used in this work is represented by an undirected unweighted graph
$G_s = (V_s, E_s)$, called a system graph (see, e.g., [31]). $V_s$ is the set of $N_s$ nodes of
the system graph representing the $m$ processors. $E_s$ is the set of edges representing
bidirectional channels between processors and defines the topology of the distributed
system. The processors have different processing speeds, or provide different processing
performance in terms of MIPS (million instructions per second), and communication via
links does not consume any processor time.
As usual, a parallel program is represented by a weighted directed acyclic graph
(DAG). The DAG, called a precedence task graph or a program graph, is defined as
$G = (T, E)$, where $T$ is a finite set of nodes (vertices) and $E$ is a finite set of edges.
The node $t_i \in T$ is associated with one task $t_i$ of the modeled parallel program. To
every task $t_i$, there is an associated value $p_{ij}$ representing the computation cost of the
task $t_i$ on a processor $m_j$, and its average computation cost is denoted as $\bar{p}_i$. Each
edge $(t_{i_1}, t_{i_2}) \in E$ (with $t_{i_1}, t_{i_2} \in T$) is a precedence constraint between tasks and
represents inter-task data communication if the output produced by task $t_{i_1}$ has to be
communicated to the task $t_{i_2}$. We consider the same communication model as in [4].
That is, the data parameter is a $t \times t$ matrix of communication data, where $data(t_i, t_j)$
is the amount of data required to be transmitted from task $t_i$ to task $t_j$. The rate
parameter is an $m \times m$ matrix and represents the data transfer rate between processors.
The communication cost of edge $(t_i, t_j) \in E$, which is for data transfer from task
$t_i$ (scheduled on processor $m_k$) to task $t_j$ (scheduled on processor $m_l$), is defined by
$c_{t_i,t_j} = data(t_i, t_j)/rate(m_k, m_l)$. When both $t_i$ and $t_j$ are scheduled on the same
processor ($m_k = m_l$), then $c_{t_i,t_j}$ becomes zero. The average communication cost of an edge
is defined by $\bar{c}_{t_i,t_j} = data(t_i, t_j)/\overline{rate}$, where $\overline{rate}$ is the average transfer rate
between the processors in the domain. For a given DAG, the communication to computation
ratio (CCR) is a measure that indicates whether a task graph is communication intensive,
computation intensive or moderate. It is computed as the average communication cost
divided by the average computation cost on a target system.
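The communication model and the CCR can be illustrated with a minimal Python sketch. The function names, the dictionary-based edge representation and the toy instance are assumptions made for illustration only, and the average rate is taken here over distinct processor pairs, which is one possible reading of the definition above.

```python
from statistics import mean

def comm_cost(data, rate, ti, tj, mk, ml):
    """Cost of edge (ti, tj) when ti runs on processor mk and tj on ml."""
    if mk == ml:
        return 0.0  # same processor: no communication cost
    return data[(ti, tj)] / rate[mk][ml]

def ccr(p, data, rate):
    """Average communication cost divided by average computation cost."""
    m = len(rate)
    # Assumption: the average rate is taken over distinct processor pairs.
    avg_rate = mean(rate[k][l] for k in range(m) for l in range(m) if k != l)
    avg_comm = mean(d / avg_rate for d in data.values())
    avg_comp = mean(mean(row) for row in p)
    return avg_comm / avg_comp

# Hypothetical toy instance: 3 tasks, 2 processors.
p = [[4, 6], [3, 5], [8, 2]]      # p[i][j]: cost of task i on processor j
data = {(0, 1): 10, (0, 2): 20}   # data(ti, tj) for each edge
rate = [[0, 2], [2, 0]]           # rate[k][l]; the diagonal is never used

print(comm_cost(data, rate, 0, 1, 0, 1), ccr(p, data, rate))
```

A CCR well above 1 would indicate a communication-intensive graph, and a value well below 1 a computation-intensive one.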
A simple task graph and its details are shown in Figure 1. The values presented in
the last column of the table are computed with a frequently used task prioritization
method, the bottom level (blevel). The blevel of a node is the length of the longest path
from the node to an exit node. Note that both the computation and communication
costs are averaged over all nodes and links. The $blevel(t_i)$ is computed recursively by
traversing the DAG upward starting from the exit task $t_{exit}$ as follows (Eq. 1):
$$blevel(t_i) = \bar{p}_i + \max_{t_j \in succ(t_i)} \{ blevel(t_j) + \bar{c}_{t_i,t_j} \}, \qquad (1)$$
where $succ(t_i)$ is the set of immediate successors of $t_i$ and $blevel(t_{exit}) = \bar{p}_{t_{exit}}$.
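The recursion of Eq. 1 can be sketched in a few lines of Python. The function name, the dictionary layout and the four-task example graph are hypothetical and are not the graph of Figure 1.

```python
from functools import lru_cache

def compute_blevel(avg_p, avg_c, succ):
    """Return {task: blevel} using average computation and communication costs."""
    @lru_cache(maxsize=None)
    def blevel(i):
        if not succ[i]:                      # exit task: blevel is its average cost
            return avg_p[i]
        return avg_p[i] + max(blevel(j) + avg_c[(i, j)] for j in succ[i])
    return {i: blevel(i) for i in succ}

# Hypothetical diamond-shaped DAG: 0 -> {1, 2} -> 3.
succ = {0: (1, 2), 1: (3,), 2: (3,), 3: ()}
avg_p = {0: 2, 1: 3, 2: 4, 3: 1}
avg_c = {(0, 1): 1, (0, 2): 2, (1, 3): 5, (2, 3): 1}

levels = compute_blevel(avg_p, avg_c, succ)
# Sorting by decreasing blevel yields a precedence-feasible priority list.
priority_list = sorted(levels, key=levels.get, reverse=True)
```

In this toy graph the longest path from task 0 goes through task 1, so its blevel is 2 + 1 + 3 + 5 + 1 = 12.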
task   r0   r1   r2   $\bar{p}_i$   blevel
0      11   13    9   11     101.3
1      10   15   11   12      66.7
2       9   12   14   12      63.3
3      12   16   10   12      73.0
4      15   11   19   15      79.3
5      13    9    5    9      41.7
6      11   15   13   12      37.3
7      11   15   10   12      12.0
Fig. 1 On the left, a sample DAG with the task indexes $i$ inside the nodes and the values of
the $c_{t_{i_1},t_{i_2}}$ function next to the corresponding edges. On the right, computation costs
($\bar{p}_i$ at level L0) and task priorities (blevel).
The aim of scheduling is to distribute the tasks among the processors in such a
way that the precedence constraints are preserved, and the response time $C_{max}$ (the
total execution time or makespan) is minimized. The response time $C_{max}$ for a given
precedence task graph depends on the allocation of tasks in the distributed computing
topology and the scheduling policy applied in individual processors [31]:
$$C_{max} = f(\text{allocation}, \text{scheduling policy}) \qquad (2)$$
The scheduling policy defines an order of processing tasks and assigns a starting
time to each task that is ready to run on a given processor. We will assume that it is the
same for any run of the scheduling algorithm and we will focus on finding an
allocation of tasks of a parallel application in a distributed computing system to
minimize the makespan. Although the minimization of makespan is crucial, tasks of a
DAG in this work are not associated with deadlines as in real-time systems.
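To make the dependence of the makespan on the allocation concrete, the following Python sketch evaluates $C_{max}$ for a given allocation under one fixed scheduling policy. The policy used here (tasks processed in a given precedence-feasible order, each starting as soon as its data and its processor are available, without inserting tasks into idle slots) is an assumption chosen for simplicity, as are all names and the toy instance.

```python
def makespan(order, alloc, p, data, rate, preds):
    """C_max of the schedule induced by a task order and an allocation.

    order: tasks in a precedence-feasible order
    alloc: dict mapping each task to its processor
    """
    proc_free = {}   # processor -> time at which it becomes idle
    finish = {}      # task -> finish time
    for t in order:
        m = alloc[t]
        ready = 0.0
        for u in preds[t]:
            # Communication cost is zero when both tasks share a processor.
            c = 0.0 if alloc[u] == m else data[(u, t)] / rate[alloc[u]][m]
            ready = max(ready, finish[u] + c)
        start = max(ready, proc_free.get(m, 0.0))
        finish[t] = start + p[t][m]
        proc_free[m] = finish[t]
    return max(finish.values())

# Hypothetical toy instance: task 0 precedes tasks 1 and 2, two processors.
p = [[4, 6], [3, 5], [8, 2]]
data = {(0, 1): 10, (0, 2): 20}
rate = [[0, 2], [2, 0]]
preds = {0: [], 1: [0], 2: [0]}

same_proc = makespan([0, 1, 2], {0: 0, 1: 0, 2: 0}, p, data, rate, preds)
split = makespan([0, 1, 2], {0: 0, 1: 0, 2: 1}, p, data, rate, preds)
```

In this example, moving task 2 to the faster processor does not pay off, because the transfer of its input data delays its start.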
An approximation factor is used to evaluate the proposed algorithms. The factor
is defined as $\rho = C_{max} / C^{*}_{max}$, where $C^{*}_{max}$ is the optimal response time or makespan [11].
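As a worked illustration, consider the instance Ahmad 3 9 28 of Section 5, for which HEFT obtains a response time of 24 and the enumerative search gives an optimal value of 22:

$$\rho = \frac{C_{max}}{C^{*}_{max}} = \frac{24}{22} \approx 1.09$$

That is, the HEFT schedule for this instance is about 9% longer than the optimal one, and a factor of 1 means that the optimal value was reached.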
3 Related Work
The scheduling problem is NP-hard in its simplest version (homogeneous case without
considering communications). Therefore, many heuristics have been proposed to
schedule DAG applications on heterogeneous distributed computing systems. A well-known
scheduling algorithm is the Heterogeneous Earliest Finish Time (HEFT) algorithm
[33]. The HEFT algorithm maintains a list of all tasks of a given graph ordered
by their priorities, usually based on the blevel method. It consists of two phases. In the
first phase, a ready task is selected from the priority list: the task with the highest
priority for which all the tasks it depends on have finished is chosen. This process corresponds
to the task prioritizing or task selection phase. Thereafter, a suitable processor that
minimizes a predefined cost function is selected (i.e., the processor selection phase), in
this case the processor that results in the earliest finish time of that task. The
HEFT algorithm is one of the algorithms most widely used as a basis for comparison when
evaluating the performance of newly proposed scheduling algorithms [2–5, 8–10, 12, 13, 15, 16,
25–27, 29, 30, 35]. Therefore, we also use HEFT in Section 5 to validate the proposed
approach.
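The processor selection phase can be sketched as follows in Python. This is a simplified illustration and not the HEFT implementation of [33]: it assumes that the priority list is already given and it appends each task at the end of the chosen processor's schedule instead of searching for idle slots. All names and the toy instance are hypothetical.

```python
def eft_schedule(order, n_procs, p, data, rate, preds):
    """Assign each task, in priority order, to the processor with the earliest finish time."""
    alloc, finish = {}, {}
    proc_free = [0.0] * n_procs
    for t in order:
        best_eft, best_m = None, None
        for m in range(n_procs):
            # Time at which all input data of t is available on processor m.
            ready = max(
                (finish[u] + (0.0 if alloc[u] == m else data[(u, t)] / rate[alloc[u]][m])
                 for u in preds[t]),
                default=0.0,
            )
            eft = max(ready, proc_free[m]) + p[t][m]
            if best_eft is None or eft < best_eft:
                best_eft, best_m = eft, m
        alloc[t], finish[t] = best_m, best_eft
        proc_free[best_m] = best_eft
    return alloc, max(finish.values())

# Hypothetical toy instance: task 0 precedes tasks 1 and 2, two processors.
p = [[4, 6], [3, 5], [8, 2]]
data = {(0, 1): 10, (0, 2): 20}
rate = [[0, 2], [2, 0]]
preds = {0: [], 1: [0], 2: [0]}

heft_alloc, heft_cmax = eft_schedule([0, 1, 2], 2, p, data, rate, preds)
```

On this toy instance the greedy choice places every task on processor 0, which gives a makespan of 15.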
A number of local search algorithms for scheduling have been investigated in the
literature. Kwok et al. [24] present a first-improvement random local search algorithm,
named FAST. In this algorithm, a task is randomly picked and then moved to a randomly
selected processor. If the schedule length is reduced, the move is accepted.
Otherwise, the task is moved back to its original processor. Kwok and Ahmad [23]
modified the FAST algorithm. The major improvement is that it uses a nested loop
for a probabilistic jump. A parallel version of FAST is named FASTEST. Wu et al.
[34] proposed a local search algorithm based on topological ordering. The algorithm
is a deterministic guided search that uses the level of a task, defined as the sum of the
top-level and blevel, to schedule the tasks. The algorithm first selects a task with the
largest level and then assigns it to the processor that generates the smallest level for
that task. The level for each task is dynamically calculated and is used to determine
the search direction. However, the considered computing system is based on homoge-
neous processors. Kim et al. [18] report a deterministic local search-based scheduling
algorithm. The algorithm starts with a schedule found by a deterministic scheduling
algorithm and then iteratively attempts to improve the current best solution using a
deterministic guided search method based on prior knowledge about the task schedul-
ing problem and the target computing environment. The main idea is to move tasks to
fill the idle periods of processors. One of the major limitations of this approach is that
it assumes complete knowledge of the problem and that the information about tasks
and communications is always accurate. However, many external events can modify
the parameters of the scheduling problem. Kang et al. [14] propose an iterated greedy
algorithm. The main idea of this algorithm is to improve the quality of the assign-
ment in an iterative manner using results from previous iterations. The algorithm first
uses a constructive heuristic to find an initial assignment and iteratively improves it in
a greedy way. The authors consider additional resource constraints in the scheduling
problem. However, in this work we do not consider such constraints.
4 Iterated Local Search Based Scheduling
The solution we propose in this work is a scheduling algorithm based on local search.
Local search was one of the early techniques for combinatorial optimization. The principle
is to refine a given initial solution point in the solution space by searching through a
neighborhood of the solution point. If an improvement can be achieved in this manner,
then a new solution is obtained. This process is continued until no further improvement
can be obtained. However, in many cases local minima are common, and local search
algorithms can converge quickly to one of them and get stuck in a local optimum
far away from the global optimum. We propose a local search algorithm based
on iterated local search (ILS) to alleviate this problem.
Algorithm 1 shows the pseudocode of the ILS we have used in this paper. The ILS
algorithm is a trajectory-based metaheuristic that can be seen as a straightforward, yet
powerful technique for extending simple local search algorithms. The algorithm starts
off by generating and evaluating an initial solution. The search process
can be initialized in various ways, for example, by starting from a randomly
generated initial solution or from a heuristically constructed solution to the given
problem. Then, following an iteration-based approach, it seeks to improve the solutions
from one iteration to the next. At each iteration, a perturbation of the obtained local
optimum is carried out. The perturbation mechanism introduces a modification to a
given candidate solution to allow the search process to escape from local optima. A
local search is applied to the perturbed solution. The new solution is then evaluated
and accepted as the new current solution under some conditions. The algorithm finishes
when the termination condition is met. In the following subsections we detail the
core components of the ILS customized to the task scheduling problem.
Algorithm 1 Algorithm outline of ILS based scheduling algorithm
sol = GenerateInitialSolution();
EvaluateSolution(sol);
bestSol = sol;
repeat
    Perturbation(sol);
    LocalSearch(sol);
    EvaluateSolution(sol);
    if sol < bestSol then
        bestSol = sol;
    end if
until termination condition met
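The outline of Algorithm 1 can be expressed as a generic Python skeleton. This is a sketch under assumptions: the components are passed in as functions, each perturbation restarts from the best solution found so far (one possible reading of the acceptance criterion of Section 4.4), and the toy problem used to exercise it is hypothetical and unrelated to scheduling.

```python
import random

def iterated_local_search(initial, evaluate, perturb, local_search, max_iters):
    """Generic ILS loop: perturb, improve locally, keep strict improvements."""
    best = initial
    best_cost = evaluate(best)
    for _ in range(max_iters):            # termination: iteration budget
        candidate = local_search(perturb(best))
        cost = evaluate(candidate)
        if cost < best_cost:              # accept only if the cost improves
            best, best_cost = candidate, cost
    return best, best_cost

# Hypothetical toy problem: minimise the sum of absolute values of a tuple.
rng = random.Random(1)

def evaluate(x):
    return sum(abs(v) for v in x)

def perturb(x):
    y = list(x)
    y[rng.randrange(len(y))] = rng.randint(-5, 5)   # change one component
    return tuple(y)

def local_search(x):
    return x   # identity placeholder; a real local search would go here

best, cost = iterated_local_search((5, -3, 4), evaluate, perturb, local_search, 50)
```

For the scheduling problem, a solution would be a task-to-processor allocation and `evaluate` would return its makespan.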
4.1 Initial solution
The ILS based scheduling algorithm starts off by generating an initial feasible solution
as the starting point of the search by the local search procedure. We have constructed the
initial solutions as follows. First, we sort tasks by priority, then we select the task with
the highest priority and schedule it on the processor that optimizes a predetermined
objective function. To assign priorities, two different methods have been evaluated: the
first one is based on the blevel method and the second one is based on the list scheduling
principle using random feasible orders. The main idea of random feasible orders is
to randomly select a task among the ready tasks and place that task at the top of
a priority list. A task is ready when it is an entry task or when all its predecessors
have already been selected and are in the priority list. The tasks are selected from the
priority list and scheduled on the basis of the HEFT heuristic, that is, by assigning the
task to the processor that minimizes the earliest finish time. In the case of blevel +
HEFT, the quality of the initial solution is equal to that of the HEFT algorithm. In the
case of random feasible orders, the quality of the solution depends on the priority of the
tasks. In this case, we generate 50 random feasible initial solutions, evaluate each
of them, and keep the best one. Hence, we have evaluated
two different methods of constructing the initial solution.
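The construction of a random feasible order can be sketched in Python as follows. The function name and the data layout are hypothetical; the sketch appends each selected task to the list, so that the resulting list is read from first to last when scheduling.

```python
import random

def random_feasible_order(preds, rng=random):
    """Build a priority list by repeatedly picking a random ready task."""
    remaining = {t: set(ps) for t, ps in preds.items()}
    order = []
    while remaining:
        # A task is ready when all its predecessors are already in the list.
        ready = sorted(t for t, ps in remaining.items() if not ps)
        t = rng.choice(ready)
        order.append(t)
        del remaining[t]
        for ps in remaining.values():
            ps.discard(t)
    return order

# Hypothetical diamond-shaped DAG: 0 -> {1, 2} -> 3.
preds = {0: [], 1: [0], 2: [0], 3: [1, 2]}
order = random_feasible_order(preds, random.Random(0))
```

Repeating this construction (50 times in the configuration described above), scheduling each order and keeping the schedule with the smallest makespan gives the initial solution of the random feasible orders method.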
4.2 Perturbation process
The perturbation process, like the local search process, is an essential component that
determines the good behavior of an ILS algorithm.
We have tested two different configurations of the perturbation method, so that
the search space is explored differently. In the first perturbation process, called probability
based movement, every task $t_i$ of the current solution has a given probability of being
moved from its current processor. If the move is triggered, the task $t_i$ is removed
from its current location and assigned to a new random processor. The second perturbation
process, called random movement, moves only one task at a time from its
current location to a random processor. Although the new assignment in the perturbation
methods could be selected by a more sophisticated criterion, we have decided to randomly
select the new processor to keep the time complexity of the algorithm as low as possible.
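Both perturbation processes can be sketched in Python as follows. The function names and the dictionary representation of an allocation are assumptions; in this sketch the random processor is drawn uniformly among all processors, so it may coincide with the current one.

```python
import random

def probability_based_movement(alloc, n_procs, prob, rng=random):
    """Move every task independently with probability `prob`."""
    new_alloc = dict(alloc)
    for t in new_alloc:
        if rng.random() < prob:
            new_alloc[t] = rng.randrange(n_procs)
    return new_alloc

def random_movement(alloc, n_procs, rng=random):
    """Move exactly one randomly chosen task to a random processor."""
    new_alloc = dict(alloc)
    t = rng.choice(list(new_alloc))
    new_alloc[t] = rng.randrange(n_procs)
    return new_alloc

# Hypothetical allocation of three tasks on three processors.
rng = random.Random(0)
alloc = {0: 0, 1: 1, 2: 0}
unchanged = probability_based_movement(alloc, 3, 0.0, rng)
shuffled = probability_based_movement(alloc, 3, 1.0, rng)
one_move = random_movement(alloc, 3, rng)
```

Both functions return a new allocation and leave the input untouched, which makes it easy to discard a perturbed solution that is not accepted.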
4.3 Local search
We propose a local search based on the first-improvement pivoting rule. The first-improvement
strategy tries to avoid the cost of evaluating the whole neighborhood
by performing the first improving step encountered during the inspection of the
neighborhood. The proposed algorithm evaluates the neighboring candidate solutions
in a particular fixed order. We use the order of the priority list computed in the initial
solution construction (i.e., blevel or random feasible order), as in classical list scheduling
algorithms. For each task in the list, the neighborhood is defined to be the set of
assignments that can be obtained by removing the task from its current processor and
reallocating it to another. If a makespan improvement is found, then the search process
is restarted from the new solution.
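The first-improvement rule can be sketched in Python as follows. The cost function is passed in as a parameter; the toy cost used in the example (the maximum processor load, ignoring precedence and communication) is a hypothetical stand-in for the makespan, and all names are illustrative.

```python
def first_improvement(order, alloc, n_procs, evaluate):
    """Scan tasks in list order; apply the first improving move and restart."""
    alloc = dict(alloc)
    best = evaluate(alloc)
    improved = True
    while improved:
        improved = False
        for t in order:
            current = alloc[t]
            for m in range(n_procs):
                if m == current:
                    continue
                alloc[t] = m                 # tentative reallocation
                cost = evaluate(alloc)
                if cost < best:
                    best, improved = cost, True
                    break                    # keep the move
                alloc[t] = current           # undo the move
            if improved:
                break                        # restart from the new solution
    return alloc, best

# Hypothetical stand-in cost: maximum total duration on any processor.
dur = {0: 3, 1: 2, 2: 2}

def max_load(alloc):
    loads = {}
    for t, m in alloc.items():
        loads[m] = loads.get(m, 0) + dur[t]
    return max(loads.values())

ls_alloc, ls_cost = first_improvement([0, 1, 2], {0: 0, 1: 0, 2: 0}, 2, max_load)
```

The loop terminates because every accepted move strictly decreases the cost and the number of allocations is finite.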
4.4 Acceptance criterion and termination condition
An important step in the ILS algorithm is to consider whether the new schedule is
accepted or not as the incumbent solution for the next iteration. The proposed ILS
accepts solutions if and only if the makespan is improved, and rejects them otherwise.
Different conditions can be used to stop the algorithm. The ILS based scheduler
stops the process when the algorithm reaches a maximum number of iterations.
5 Synthetic Benchmark and Performance Comparison
Many heuristics have been developed to solve the heterogeneous DAG scheduling problem.
Most of them use HEFT as a basis for comparison because of the lack of generally
accepted standard benchmarks for the evaluation of scheduling heuristics in
heterogeneous distributed computing systems. In this work, we construct a benchmark as
follows. We have collected a set of small-size synthetic deterministic non-preemptive
program graphs from the literature. We compute the optimal solution and the optimal
value by an enumerative algorithm that performs an exhaustive search. The algorithm
explores all the possible solutions in the search space, keeping the best solution found.
The problem reduces to generating the possible permutations with and without
repetition and evaluating the resulting solutions.
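One way to read this enumeration is sketched below in Python: task orders are permutations without repetition, filtered by the precedence constraints, and processor assignments are permutations with repetition. This reading, the simple non-insertion evaluator and the toy instance are assumptions made for illustration; the enumerative algorithm used in the experiments may differ.

```python
from itertools import permutations, product

def makespan(order, alloc, p, data, rate, preds):
    """C_max for a task order and allocation (no idle-slot insertion)."""
    proc_free, finish = {}, {}
    for t in order:
        m = alloc[t]
        ready = max(
            (finish[u] + (0.0 if alloc[u] == m else data[(u, t)] / rate[alloc[u]][m])
             for u in preds[t]),
            default=0.0,
        )
        finish[t] = max(ready, proc_free.get(m, 0.0)) + p[t][m]
        proc_free[m] = finish[t]
    return max(finish.values())

def exhaustive_search(tasks, n_procs, p, data, rate, preds):
    """Try every feasible order and every allocation; keep the best."""
    best_cost, best = float("inf"), None
    for order in permutations(tasks):                 # without repetition
        pos = {t: i for i, t in enumerate(order)}
        if any(pos[u] > pos[t] for t in tasks for u in preds[t]):
            continue                                  # violates precedence
        for procs in product(range(n_procs), repeat=len(tasks)):  # with repetition
            alloc = dict(zip(tasks, procs))
            cost = makespan(order, alloc, p, data, rate, preds)
            if cost < best_cost:
                best_cost, best = cost, (order, alloc)
    return best_cost, best

# Hypothetical toy instance: task 0 precedes tasks 1 and 2, two processors.
p = [[4, 6], [3, 5], [8, 2]]
data = {(0, 1): 10, (0, 2): 20}
rate = [[0, 2], [2, 0]]
preds = {0: [], 1: [0], 2: [0]}

opt_cost, (opt_order, opt_alloc) = exhaustive_search([0, 1, 2], 2, p, data, rate, preds)
```

The number of candidates grows as $n! \cdot m^n$ in the worst case, which is why such a search is only practical for small instances.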
Since generating optimal solutions for arbitrarily structured task graphs takes exponential
time, it is not feasible to obtain optimal solutions for larger graphs [22].
However, to investigate the scalability of the proposed ILS algorithm we have also
used a set of structured real-world parallel applications. The applications used in
our experiments are the robot control application and the sparse matrix solver from the
Standard Task Graph set (STG) [32], and a subroutine of the Laser Interferometer
Gravitational-wave Observatory (LIGO) application [6]. Table 1 summarizes the main
characteristics of these applications: instance size (number of tasks), number of edges,
and the ratio of edges to tasks (ETR). The ETR gives information about the degree of parallelism.
Table 1 Instance types: number of tasks and edges, and edge-to-task ratio (ETR).
Type     Tasks   Edges   ETR
LIGO     76      132     1.73
Robot    88      131     1.48
Sparse   96      67      0.69
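The ETR column can be reproduced from the task and edge counts of Table 1 with a short Python computation. The values in the table appear to be truncated, rather than rounded, to two decimals; this is an observation from the numbers and not something stated in the text.

```python
import math

# (tasks, edges) taken from Table 1.
instances = {"LIGO": (76, 132), "Robot": (88, 131), "Sparse": (96, 67)}

# ETR = number of edges divided by number of tasks.
etr = {name: edges / tasks for name, (tasks, edges) in instances.items()}

# Truncate to two decimals for comparison with the table.
etr_truncated = {name: math.floor(v * 100) / 100 for name, v in etr.items()}
```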
The STG set is composed of homogeneous instances with constant execution time
among machines. Since we are interested in heterogeneous instances, we have only considered
the structure of these applications and we have implemented the procedure
described in [33] to introduce heterogeneity. We fixed the parameter $\beta$ to 1. Parameter
$\beta$ is basically the heterogeneity factor for processor speeds. A high percentage value
(i.e., a percentage of 1) causes a significant difference in a task’s computation cost
among the processors. For each graph we have varied the CCR. A randomization
procedure that changes the weights of the edges was executed to obtain the needed CCR. We
have generated five CCRs (0.1, 0.5, 1, 5, 10) for each graph. The tested system sizes were
8, 16 and 32 processors. We have generated 15 different instances for each application
type.
5.1 ILS Configurations
We have four different ILS configurations depending on the method used to generate
the initial solution and the perturbation process. The four ILS algorithms studied are
listed below.
– ILS-Alg1: It uses the blevel method to generate the order of tasks’ execution
and generate the initial solution. The perturbation process is the probability based
movement.
– ILS-Alg2: The initial solution is based on the blevel method and the used pertur-
bation process is the random movement.
– ILS-Alg3: The algorithm generates the initial solutions based on the random fea-
sible orders. The algorithm generates 50 solutions and the best one is used as the
initial solution. The perturbation process is the probability based movement.
– ILS-Alg4: To construct the initial solution the algorithm generates 50 solutions
using the random feasible orders method and keeps the best among all the generated
solutions. The algorithm applies the random movement to perturb the solution
during the iteration process.
We study in Figure 2 the effects of different probabilities of the perturbation process
based on the probability movement in order to identify the probability to use in the
experiments. For this experiment we used a LIGO instance on an eight-processor
system, with CCR equal to 5. Figure 2 shows a typical run of the algorithm for different
probabilities from 0.05 to 0.2 in increments of 0.05. We only show the results obtained
by ILS-Alg1; however, the results for the rest of the algorithms show the same behavior.
The main result that can be drawn from this study is that probabilities equal to 0.05
and 0.15 perform better than the other two probabilities (0.1 and 0.2). When
comparing these two probabilities we noticed that using the lower probability (0.05) leads
to a faster convergence. Therefore the results we report for the algorithms ILS-Alg1
and ILS-Alg3 were computed using that movement probability. The maximum number
of iterations in the local search process without improving the current best solution
was set to 50. The maximum number of iterations per algorithm (i.e., the termination
condition) was set to 15. The algorithms are executed 15 times on each instance and
each run is independent.
5.2 Results for Synthetic Deterministic Program Graphs
We report in this section some preliminary results obtained with the ILS algorithms. We
compare the results against the global optimal value computed with the enumerative
search method. We also compare them with the results obtained by HEFT. Table 2 shows the
best results reported in the literature and the optimal values. The first column
gives the name of the instance. The name of the instance is composed as
follows: id m n bestres, where id is the surname of the main author of the
reference where the instance is proposed, m is the number of processors used in the
reference to schedule the instance, n is the size of the instance, and bestres corresponds
to the best value reported in the literature. For example, instance Ahmad 3 9 28 is read as:
Ahmad is the id of the instance, simulated on a three-processor system, with 9 tasks
and a best reported result equal to 28 units of time. The second column in Table 2
provides the reference where the instance is proposed. The third column presents the
Fig. 2 Results for various probability values used by the perturbation process based on
probability movement (x-axis: Number of Iterations; y-axis: Response Time; one curve each
for Prob 0.05, Prob 0.1, Prob 0.15 and Prob 0.2)
Table 2 Results for synthetic deterministic program graphs known in the literature and the
optimal value
Instance             Reference   Best Result   Optimal
Ahmad 3 9 28         [1]         28            22
Hsu 3 10 84          [12]        84            80
Eswari 2 11 61       [10]        61            56
Hamid 3 10           [2]         ?             100
Heteropar 4 12 124   [3]         124           124
Ilavarasan 3 10 77   [13]        77            73
Kang 3 10 76         [16]        76            73
Kang 3 10 84         [15]        84            79
Kuan 3 10 28         [25]        28            26
Liang 3 10 80        [26]        80            73
Daoud 2 11 64        [8]         64            56
YCLee 3 8 80         [27]        80            66
Sample 3 8 100       -           100           81
SampleFig 1 3 8 89   -           89            66
best result reported in literature. The fourth column shows the optimal value computed
with the enumerative search algorithm. Table 3 depicts the results computed by HEFT
in the third column. The last four columns present the best response time computed by
each one of the four ILS configurations and the number of times over 15 independent
runs that the best value is equal to the optimal value. Rows in gray color highlight
the algorithm that is able to find the optimal value for a given instance in all the
executions.
As we can notice in Table 2, the optimal solutions cannot always be found by the
algorithms, despite the small size of the instances (at most four processors
and 12 tasks). Moreover, there are some instances for which the ILS configurations are
not able to find the optimal solution in every run. Recall that the scheduling problem
is NP-hard even for two-processor systems.
From Table 3 it can be observed that the ILS configurations with initial solutions
generated using random feasible orders (i.e., ILS Alg3 and ILS Alg4) find the optimal
Table 3 Comparison results for synthetic deterministic program graphs known in the literature
Instance             Optimal   HEFT   ILS Alg1   ILS Alg2   ILS Alg3   ILS Alg4
Ahmad 3 9 28         22        24     23 (0)     23 (0)     22 (15)    22 (15)
Hsu 3 10 84          80        92     80 (12)    80 (13)    80 (14)    80 (15)
Eswari 2 11 61       56        76     58 (0)     58 (0)     56 (4)     56 (2)
Hamid 3 10           100       110    100 (10)   100 (10)   100 (15)   100 (15)
Heteropar 4 12 124   124       150    124 (12)   124 (15)   124 (15)   124 (15)
Ilavarasan 3 10 77   73        80     73 (13)    73 (13)    73 (10)    73 (10)
Kang 3 10 76         73        80     75 (0)     75 (0)     73 (5)     73 (4)
Kang 3 10 84         79        109    82 (0)     82 (0)     83 (0)     83 (0)
Kuan 3 10 28         26        30     26 (15)    26 (15)    26 (8)     26 (11)
Liang 3 10 80        73        80     73 (13)    73 (13)    73 (15)    73 (10)
Daoud 2 11 64        56        76     58 (0)     58 (0)     56 (3)     56 (1)
YCLee 3 8 80         66        88     68 (0)     68 (0)     66 (15)    66 (15)
Sample 3 8 100       81        84     81 (1)     81 (7)     81 (4)     81 (4)
SampleFig 1 3 8 89   66        89     69 (0)     69 (0)     66 (15)    66 (15)
value for a larger number of instances than the ILS algorithms using blevel as initial
solution (i.e., ILS Alg1 and ILS Alg2). The main reason is that by generating random
initial solutions the algorithm is able to explore different regions of the search space,
whereas with the deterministic blevel method the algorithms are restricted to a specific
region of the search space. In fact, the ILS Alg3 and ILS Alg4 algorithms find
the optimal value in around 93% of the instances and the algorithms ILS Alg1 and
ILS Alg2 find the optimal value for only 56% of the instances. The ILS configurations
with random feasible orders not only find the optimal value for a larger number of
instances than the ILS configurations with blevel, but are also able to find optimal values
in all the executions for a larger number of instances than ILS Alg1 and ILS Alg2. This
can be verified in Table 3 with the rows highlighted in gray color.
Table 4 Approximation Factor
Instance             HEFT    ILS Alg1   ILS Alg2   ILS Alg3   ILS Alg4
Ahmad 3 9 28         1.090   1.045      1.045      1          1
Hsu 3 10 84          1.150   1.012      1.011      1.001      1
Eswari 2 11 61       1.366   1.061      1.038      1.054      1.064
Hamid 3 10           1.100   1.019      1.019      1          1
Heteropar 4 12 124   1.209   1.036      1          1          1
Ilavarasan 3 10 77   1.095   1.005      1.005      1.012      1.013
Kang 3 10 76         1.095   1.030      1.030      1.025      1.026
Kang 3 10 84         1.379   1.043      1.046      1.058      1.049
Kuan 3 10 28         1.153   1          1          1.028      1.015
Liang 3 10 80        1.095   1.005      1.005      1          1.010
Daoud 2 11 64        1.366   1.057      1.042      1.058      1.056
YCLee 3 8 80         1.333   1.030      1.030      1          1
Sample 3 8 100       1.037   1.018      1.008      1.012      1.009
SampleFig 1 3 8 89   1.348   1.045      1.045      1          1
Average              1.191   1.029      1.022      1.017      1.018
Table 4 provides more details about the performance comparison by presenting
the approximation factor (ρ) for each algorithm. The first column gives the name of
each instance. The second column shows the approximation factor of the HEFT algorithm.
Recall that HEFT is a well-known scheduling algorithm that is often used as a basis
for comparison and validation of newly proposed heuristics. However, to the best of our
knowledge, no comparison has been provided between the results computed by HEFT
and the optimal values. The remaining columns present the approximation factor of the
ILS configurations. The last row reports the average approximation factor.
From Table 4 we observe that all the ILS configurations outperform HEFT. The ILS
algorithms are within around 3% of the optimal values on average (2.9% for ILS Alg1),
whereas HEFT is around 19% from the optimal values on average.
The ILS configurations based on random feasible orders compute better results on
average than the ILS configurations that use blevel to generate the initial solution.
However, the results are comparable among the four configurations.
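As a reading aid for Table 4, the following minimal sketch shows how an approximation factor and its average over a set of instances could be computed. It assumes that ρ is the ratio between the makespan found by an algorithm and the optimal makespan, which is the usual definition; the function names and the numeric values are hypothetical and are not taken from the benchmark.

```python
def approximation_factor(makespan, optimal_makespan):
    """Ratio of an algorithm's makespan to the optimal one (1.0 means optimal).

    Assumption: rho = makespan / optimal makespan.
    """
    if optimal_makespan <= 0:
        raise ValueError("optimal makespan must be positive")
    return makespan / optimal_makespan


def average_factor(makespans, optima):
    """Average approximation factor over a set of instances."""
    factors = [approximation_factor(m, o) for m, o in zip(makespans, optima)]
    return sum(factors) / len(factors)


# Hypothetical makespans for three instances (not values from the paper).
found = [30.0, 44.0, 21.0]
optimal = [28.0, 40.0, 21.0]
print(round(average_factor(found, optimal), 3))
```

A value of 1 in Table 4 therefore means that the algorithm reached the optimal makespan for that instance.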
5.3 Results for Real-world Parallel Applications
In this section we present results for the set of real-world parallel applications. The
aim is to evaluate the scalability of the ILS configurations. We present statistical results.
In our tests, each algorithm was run 15 independent times. We have evaluated the
ILS algorithms using a Friedman test [21]. The Friedman test is a non-parametric
statistical tool that compares a set of samples without assuming normality, in order to
verify whether statistically significant differences exist among them. Moreover, the
resulting ranking shows which algorithm has the highest performance.
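The ranking step of the Friedman test can be sketched as follows. This is an illustrative pure-Python version, not the procedure used to produce the reported tables: it assumes that, within each run, the largest makespan receives rank 1 and the smallest receives rank k, so that a higher average rank means a better algorithm, and it uses the standard Friedman statistic without tie correction. The function names and the makespan values are hypothetical.

```python
def average_ranks(results):
    """Average rank of each algorithm over all rows.

    results: list of rows, one per run/instance, holding one makespan per
    algorithm. Assumption: in each row the worst (largest) makespan gets
    rank 1 and the best gets rank k; tied values share their mean rank.
    """
    k = len(results[0])
    totals = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda j: -row[j])  # worst first
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
            for t in range(i, j + 1):
                totals[order[t]] += mean_rank
            i = j + 1
    return [total / len(results) for total in totals]


def friedman_statistic(avg_ranks, n):
    """Friedman chi-square statistic from average ranks over n rows."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12.0 * n / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)


# Hypothetical makespans: rows are runs, columns are four algorithms.
runs = [[10, 8, 12, 7], [11, 9, 12, 8], [10, 10, 13, 9]]
ranks = average_ranks(runs)
print(ranks, friedman_statistic(ranks, len(runs)))
```

With four algorithms the average ranks always sum to 10, which is a quick consistency check for a ranking table of this kind.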
Table 5 provides the results obtained after applying the Friedman test. The ranking
shows the performance of the algorithms. As we are dealing with a minimization
problem, the higher the value of the ranking, the better the algorithm performs in
finding a solution for the set of instances. We can observe that the behavior of the
algorithms changes when the problem scales up with respect to the small size instances.
The ranking shows that ILS-Alg4 still computes better results than the other three
configurations. However, ILS-Alg2 shows better performance than ILS-Alg1 and ILS-Alg3.
The common characteristic of ILS-Alg4 and ILS-Alg2 is that they use the same
perturbation process, which slightly changes the current solution by moving only one
task at a time and then explores the neighbors of the perturbed solution. The main reason
is that the changes made by these algorithms (moving only one task) do not disrupt the
structure of the current solution to a large extent, so the algorithm can continuously
search for better and better solutions [31]. However, it can be interesting to investigate a
more sophisticated heuristic to decide which task to move based on some knowledge of
the current structure of the scheduling problem under consideration without increasing
complexity. We can assume that the ILS algorithm provides results that are better than or
comparable to those of HEFT. This can be verified by considering that ILS-Alg1 and ILS-Alg2
use the output of HEFT as initial solution; hence, if ILS does not improve the initial
solution, we obtain the same results as HEFT.
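The single-task perturbation discussed above can be illustrated with a small sketch. It is only an assumption about how such a move could look, not the implementation evaluated in this paper: a solution is represented here as a list that maps each task index to a machine index, and the name `perturb_one_task` is hypothetical. The task is chosen uniformly at random; a more informed choice would replace that line.

```python
import random


def perturb_one_task(assignment, num_machines, rng):
    """Return a copy of `assignment` with exactly one task moved.

    assignment[i] is the machine of task i (hypothetical representation).
    The moved task is placed on a machine different from its current one.
    """
    if num_machines < 2 or not assignment:
        return list(assignment)  # nothing can be moved
    perturbed = list(assignment)
    task = rng.randrange(len(perturbed))  # task chosen uniformly at random
    other_machines = [m for m in range(num_machines) if m != perturbed[task]]
    perturbed[task] = rng.choice(other_machines)
    return perturbed


rng = random.Random(0)  # fixed seed only to make the example repeatable
current = [0, 1, 2, 0, 1]
candidate = perturb_one_task(current, 3, rng)
changed = [i for i in range(len(current)) if current[i] != candidate[i]]
print(candidate, changed)
```

Because only one entry differs from the current solution, a local search started from the perturbed solution remains close to the region already explored.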
We performed statistical hypothesis testing to verify whether the differences between
the algorithm that provides the best results and the other algorithms are significant
(the null hypothesis is rejected) or whether, due to the random nature of the solutions,
the obtained results may be considered statistically equivalent.
Table 5 Average Rankings of the algorithms on all the set of instances
Algorithm LIGO Robot Sparse
ILS-Alg1 2.2888 2.3733 2.1155
ILS-Alg2 3.0466 2.9333 3.2355
ILS-Alg3 1.3444 1.3000 1.6155
ILS-Alg4 3.3200 3.3933 3.0333
Table 6 presents the statistical tests based on the p-values obtained when comparing
each algorithm with the best performing one (ILS-Alg4), with a level of
significance of α = 0.05. If the observed value is larger than the 1 − α quantile of this
distribution, the null hypothesis is rejected. As mentioned, the results indicate that
the candidate algorithm gives better performance than at least one of the
compared algorithms.
In Table 6 we can observe that ILS-Alg4 outperforms the ILS-Alg1 and ILS-Alg3
configurations, but is statistically equivalent to ILS-Alg2 on the LIGO and Sparse sets.
Meanwhile, ILS-Alg4 finds better solutions than ILS-Alg2 when considering the Robot
application.
Table 6 Statistical hypothesis testing for the best algorithm by Makespan with α = 0.05/4
(Bonferroni correction)
Set      ILS-Alg4 vs. Algorithm   p-value      H0 Rejected
LIGO     ILS-Alg1                 2.4237E-17   Yes
LIGO     ILS-Alg2                 0.0247       No
LIGO     ILS-Alg3                 3.0522E-59   Yes
Robot    ILS-Alg1                 5.2857E-17   Yes
Robot    ILS-Alg2                 1.5727E-4    Yes
Robot    ILS-Alg3                 2.7254E-66   Yes
Sparse   ILS-Alg1                 4.6884E-14   Yes
Sparse   ILS-Alg2                 0.0966       No
Sparse   ILS-Alg3                 2.3426E-31   Yes
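The decision column of Table 6 can be reproduced from the p-values with a few lines of code. The sketch below takes the p-values and the corrected level α = 0.05/4 from the table and its caption; the function name and the dictionary layout are hypothetical, and the code does not recompute the p-values themselves.

```python
ALPHA = 0.05
NUM_COMPARISONS = 4  # divisor stated in the caption of Table 6

# p-values of ILS-Alg4 against each algorithm, copied from Table 6.
p_values = {
    ("LIGO", "ILS-Alg1"): 2.4237e-17,
    ("LIGO", "ILS-Alg2"): 0.0247,
    ("LIGO", "ILS-Alg3"): 3.0522e-59,
    ("Robot", "ILS-Alg1"): 5.2857e-17,
    ("Robot", "ILS-Alg2"): 1.5727e-4,
    ("Robot", "ILS-Alg3"): 2.7254e-66,
    ("Sparse", "ILS-Alg1"): 4.6884e-14,
    ("Sparse", "ILS-Alg2"): 0.0966,
    ("Sparse", "ILS-Alg3"): 2.3426e-31,
}


def bonferroni_decisions(p_values, alpha, num_comparisons):
    """Reject H0 for every comparison whose p-value is below alpha / m."""
    threshold = alpha / num_comparisons
    return {key: p < threshold for key, p in p_values.items()}


decisions = bonferroni_decisions(p_values, ALPHA, NUM_COMPARISONS)
for (app, algorithm), rejected in decisions.items():
    print(app, algorithm, "Yes" if rejected else "No")
```

With the corrected level 0.0125, the LIGO comparison against ILS-Alg2 (p = 0.0247) is not rejected, although it would be at the uncorrected level of 0.05.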
6 Conclusions and Future Work
In this paper we have proposed an iterative local search based scheduling algorithm
to solve the precedence-constrained scheduling problem in heterogeneous computing
systems. We have generated a synthetic benchmark composed of small size determin-
istic parallel applications. We computed the optimal solutions by using an exhaustive
enumerative search algorithm. The generated benchmark can be used as a basis for
comparison during the design of new scheduling heuristics.
We have investigated four configurations of the proposed ILS algorithm and compared
the results against the optimal solutions. From the set of experimental results we
observed that ILS benefited not only from generating several feasible orders
(diversifying solutions) and keeping the best one instead of generating only one
solution, but also from small perturbations of the current solution that intensify the
search in its neighborhood. These are two basic considerations when designing local
search methods. One of the main advantages of the proposed ILS is that it can be coupled
with a more sophisticated heuristic without increasing complexity.
We plan to extend the proposed work in several directions. First, we intend to extend
the benchmark. For that, we plan to parallelize the enumerative search method and design
a branch and bound algorithm. Second, we intend to design more sophisticated heuristics
to decide which task to move during the perturbation process. One possibility to explore
is to move the task with the highest CCR, such that the communication delay can be reduced.
Finally, an important aspect to consider nowadays is the size of the workflows, which
are usually composed of thousands of tasks. We plan to evaluate the scalability of the
proposed local search on bigger workflows. For that, we intend to apply a partitioning
technique that first decomposes the DAG into subgraphs and then applies our local search
in a cooperative way to locally optimize these subgraphs while considering the global
optimization.
Acknowledgements This work is partially supported by the National Research Fund,
Luxembourg in the framework of the AFR Green Energy-Efficient Computing project
(PDR-09-067). The Mexican researchers were supported by the Consejo Nacional de Ciencia
y Tecnología (CONACyT), Mexico.
References
1. Ahmad, I., Dhodhi, M., Ul-Mustafa, R.: DPS: dynamic priority scheduling heuristic for
heterogeneous computing systems. IEE Proceedings: Computers and Digital Techniques
145(6), 411–418 (1998)
2. Arabnejad, H.: List based task scheduling algorithms on heterogeneous systems - an
overview. http://paginas.fe.up.pt/~prodei/dsie12/papers/paper 30.pdf (2011). Available
Online. Consulted January, 2013
3. Arabnejad, H., Barbosa, J.: Performance evaluation of list based scheduling on heteroge-
neous systems. http://icl.eecs.utk.edu/heteropar2011/slides/heteropar
JorgeBarbosa.pdf. HeteroPar’11. Consulted online 2012
4. Arabnejad, H., Barbosa, J.G.: Performance evaluation of list based scheduling on heteroge-
neous systems. In: Proceedings of the 2011 international conference on Parallel Processing,
Euro-Par’11, pp. 440–449. Springer-Verlag, Berlin, Heidelberg (2012)
5. Bittencourt, L.F., Sakellariou, R., Madeira, E.R.M.: Dag scheduling using a lookahead
variant of the heterogeneous earliest finish time algorithm. In: Proceedings of the 2010
18th Euromicro Conference on Parallel, Distributed and Network-based Processing, PDP
’10, pp. 27–34. IEEE Computer Society (2010)
6. Brown, D., Brady, P., Dietz, A., Cao, J., Johnson, B., McNabb, J.: A case study on the
use of workflow technologies for scientific analysis: Gravitational wave data analysis. In:
I. Taylor, E. Deelman, D. Gannon, M. Shields (eds.) Workflows for e-Science, pp. 39–59.
Springer London (2007)
7. Coffman, E.G.: Computer and jobshop scheduling theory. John Wiley & Sons Inc (1976)
8. Daoud, M.I., Kharma, N.: A hybrid heuristic-genetic algorithm for task scheduling in
heterogeneous processor networks. J. Parallel Distrib. Comput. 71(11), 1518–1531 (2011)
9. Demiröz, B., Topcuoglu, H.R.: Static task scheduling with a unified objective on time and
resource domains. Comput. J. 49(6), 731–743 (2006)
10. Eswari, R., Nickolas, S.: Path-based heuristic task scheduling algorithm for heterogeneous
distributed computing systems. In: Proceedings of the 2010 International Conference on
Advances in Recent Technologies in Communication and Computing, ARTCOM ’10, pp.
30–34. IEEE Computer Society, Washington, DC, USA (2010)
11. Hirales-Carbajal, A., Tchernykh, A., Yahyapour, R., González-García, J., Röblitz, T.,
Ramírez-Alcaraz, J.: Multiple workflow scheduling strategies with user run time estimates
on a grid. J. Grid Comput. 10(2), 325–346 (2012)
12. Hsu, C.H., Hsieh, C.W., Yang, C.T.: A generalized critical task anticipation technique for
dag scheduling. In: Proceedings of the 7th international conference on Algorithms and
architectures for parallel processing, ICA3PP’07, pp. 493–505. Springer-Verlag, Berlin,
Heidelberg (2007)
13. Ilavarasan, E., Thambidurai, P., Mahilmannan, R.: Performance effective task scheduling
algorithm for heterogeneous computing system. In: Proceedings of the The 4th Interna-
tional Symposium on Parallel and Distributed Computing, ISPDC ’05, pp. 28–38. IEEE
Computer Society, Washington, DC, USA (2005)
14. Kang, Q., He, H., Song, H.: Task assignment in heterogeneous computing systems using
an effective iterated greedy algorithm. J. Syst. Softw. 84(6), 985–992 (2011)
15. Kang, Y., Lin, Y.: A recursive algorithm for scheduling of tasks in a heterogeneous dis-
tributed environment. In: Y. Ding, Y. Peng, R. Shi, K. Hao, L. Wang (eds.) BMEI, pp.
2099–2103. IEEE (2011)
16. Kang, Y., Zhang, Z., Chen, P.: An activity-based genetic algorithm approach to multipro-
cessor scheduling. In: Y. Ding, H. Wang, N. Xiong, K. Hao, L. Wang (eds.) ICNC, pp.
1048–1052. IEEE (2011)
17. Khan, S.U., Ahmad, I.: A cooperative game theoretical technique for joint optimization
of energy consumption and response time in computational grids. IEEE Trans. Parallel
Distrib. Syst. 20(3), 346–360 (2009)
18. Kim, S.C., Lee, S., Hahm, J.: Push-pull: Deterministic search-based dag scheduling for
heterogeneous cluster systems. IEEE Trans. Parallel Distrib. Syst. 18(11), 1489–1502
(2007)
19. Kolodziej, J., Khan, S.U.: Multi-level hierarchic genetic-based scheduling of independent
jobs in dynamic heterogeneous grid environment. Inf. Sci. 214, 1–19 (2012)
20. Kolodziej, J., Khan, S.U., Xhafa, F.: Genetic algorithms for energy-aware scheduling in
computational grids. In: F. Xhafa, L. Barolli, J. Kolodziej, S.U. Khan (eds.) 3PGCIC, pp.
17–24. IEEE (2011)
21. Kvam, P.H., Vidakovic, B.: Nonparametric Statistics with Applications to Science and
Engineering (Wiley Series in Probability and Statistics). Wiley-Interscience (2007)
22. Kwok, Y.K., Ahmad, I.: Efficient scheduling of arbitrary task graphs to multiprocessors
using a parallel genetic algorithm. J. Parallel Distrib. Comput. 47(1), 58–77 (1997). DOI
10.1006/jpdc.1997.1395. URL http://dx.doi.org/10.1006/jpdc.1997.1395
23. Kwok, Y.K., Ahmad, I.: Fastest: A practical low-complexity algorithm for compile-time
assignment of parallel programs to multiprocessors. IEEE Trans. Parallel Distrib. Syst.
10(2), 147–159 (1999)
24. Kwok, Y.K., Ahmad, I., Gu, J.: Fast: A low-complexity algorithm for efficient scheduling
of dags on parallel processors. In: ICPP, Vol. 2, pp. 150–157 (1996)
25. Lai, K.C., Yang, C.T.: A dominant predecessor duplication scheduling algorithm for het-
erogeneous systems. J. Supercomput. 44(2), 126–145 (2008)
26. Lee, L.T., Chen, C.W., Chang, H.Y., Tang, C.C., Pan, K.C.: A non-critical path earliest-
finish algorithm for inter-dependent tasks in heterogeneous computing environments. In:
HPCC, pp. 603–608. IEEE (2009)
27. Lee, Y.C., Zomaya, A.: A novel state transition method for metaheuristic-based scheduling
in heterogeneous computing systems. IEEE Trans. Parallel Distrib. Syst. 19(9), 1215–1223
(2008)
28. Lee, Y.C., Zomaya, A.Y.: Energy conscious scheduling for distributed computing systems
under different operating conditions. IEEE Trans. Parallel Distrib. Syst. 22(8), 1374–1381
(2011)
29. Liu, G.Q., Poh, K.L., Xie, M.: Iterative list scheduling for heterogeneous computing. J.
Parallel Distrib. Comput. 65(5), 654–665 (2005)
30. Shen, L., Choe, T.Y.: Posterior task scheduling algorithms for heterogeneous computing
systems. In: Proceedings of the 7th international conference on High performance com-
puting for computational science, VECPAR’06, pp. 172–183. Springer-Verlag (2007)
31. Switalski, P., Seredynski, F.: Multiprocessor scheduling by generalized extremal optimiza-
tion. J. of Scheduling 13(5), 531–543 (2010)
32. Tobita, T., Kasahara, H.: A standard task graph set for fair evaluation of multiprocessor
scheduling algorithms. Journal of Scheduling 5(5), 379–394 (2002)
33. Topcuoglu, H., Hariri, S., Wu, M.Y.: Performance-effective and low-complexity task
scheduling for heterogeneous computing. IEEE Trans. Parallel Distrib. Syst. 13(3), 260–
274 (2002)
34. Wu, M.Y., Shu, W., Gu, J.: Efficient local search for dag scheduling. IEEE Trans. Parallel
Distrib. Syst. 12(6), 617–627 (2001)
35. Zhao, H., Sakellariou, R.: An experimental investigation into the rank function of the
heterogeneous earliest finish time scheduling algorithm. In: H. Kosch, L. Böszörményi,
H. Hellwagner (eds.) Euro-Par 2003 Parallel Processing, Lecture Notes in Computer
Science, vol. 2790, pp. 189–194. Springer Berlin / Heidelberg (2003)
6th Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2013)
27-29 August 2010, Gent, Belgium
- 485 -
... proposed, the best ILS nearly reaches all the optimal values, obtaining an approximation factor of 1.018 (desirable values are near to 1.0). Thus, we assess our proposal with the best-proposed algorithm in [11]. ...
... Scheduling is a well-known NP-hard optimization problem [2], as well as the task scheduling for parallel systems [3]. Therefore, different scheduling approaches were developed for this problem: Heuristics [4][5][6][7], Local Searches [8][9][10], and Metaheuristics [11][12][13][14][15]. Unfortunately, the wide variety of works in the state-of-the-art differs in their objective definitions and uses different sets of instances. ...
... Therefore, different scheduling approaches were developed for this problem: Heuristics [4][5][6][7], Local Searches [8][9][10], and Metaheuristics [11][12][13][14][15]. Unfortunately, the wide variety of works in the state-of-the-art differs in their objective definitions and uses different sets of instances. From the relevant works in the state-of-the-art, we highlight [11]. To our knowledge, it is the only work that compares the obtained results against the optimal values for their synthetic instances, which are fourteen scheduling problem instances ...
Article
Full-text available
High-Performance Computing systems rely on the software's capability to be highly parallelized in individual computing tasks. However, even with a high parallelization level, poor scheduling can lead to long runtimes; this scheduling is in itself an NP-hard problem. Therefore, it is our interest to use a heuristic approach, particularly Cellular Processing Algorithms (CPA), which is a novel metaheuristic framework for optimization. This framework has its foundation in exploring the search space by multiple Processing Cells that communicate to exploit the search and in the individual stagnation detection mechanism in the Processing Cells. In this paper, we proposed using a Greedy Randomized Adaptive Search Procedure (GRASP) to look for promising task execution orders; later, a CPA formed with Iterated Local Search (ILS) Processing Cells is used for the optimization. We assess our approach with a high-performance ILS state-of-the-art approach. Experimental results show that the CPA outperforms the previous ILS in real applications and synthetic instances.
... Also, multi-objective problems are rarely tackled by exact methods [12]. A benchmark with unknown optimal Pareto front (PF) was chosen to evaluate the proposed algorithm [13,14]. The main contributions in this work are: a multi-objective B&B algorithm based on Pareto dominance and the the first available benchmark for HCSMOP that include the optimal front of the instances. ...
... The experiment was done sequentially using one core from the node. The instances solved are a subset of benchmark datasets reported in [13,14] combined with instances from the literature, given a total of 40 instances (See Table 5), for those instances we ignored the precedencies section. The instances have between 2 and 5 machines. ...
Article
Full-text available
In this paper the Pareto optimization of the Heterogeneous Computing Scheduling Multi-Objective Problem (HCSMOP) is approached. The goal is to minimize two objectives which are in conflict: the overall completion time (makespan) and the energy consumed. In the revised literature, there are no reported exact algorithms which solve the HCSMOP. In this work, we propose a Branch and Bound algorithm to solve the problem and it is used to find the optimal Pareto front of a set of instances of the literature. This set is the first available benchmark to assess the performance of multiobjective algorithms with quality metrics that requires known the optimal front of the instances.
... To the best of our knowledge there is not a set of instances with optimal values for the multi-objective problem treated. The compilation of instances used for scheduling with precedence constraints in [10], is used as a benchmark. The optimal values of makespan are obtained with the linear programming model in [11], based on it we developed a linear programming model to compute energy. ...
... The proposed MILP model of two phases performs an exhaustive search, because of that a set of small instances is required, we use the benchmark set from [10] removing the precedence constraint. The voltage configurations are taken from [13] and are configured using the round robin principle; the first machine gets the first configuration, when no more configurations left the next machine gets the first configuration and so on. ...
Chapter
Full-text available
This chapter is focused on the problem of scheduling independent tasks on heterogeneous machines. The main contributions of our work are the following: a linear programming model to compute energy consumption for the execution of independent tasks on heterogeneous clusters, a constructive heuristic based on local search, and a new benchmark set. To assess our approach we compare the performance of two solution methods: a memetic algorithm, based on population search and local search, and a seeded genetic algorithm, based on NSGA-II. A Wilcoxon rank-sum test shows significant differences in the diversity of solutions found but not in hypervolume. The memetic algorithm gets the best diversity for a bigger instance set from the state of the art.
... But in order to achieve the benefits of exploitation, any of the local search algorithms should be used. The proposed work applies the first improvement type pivoting rule (Pineda et al. 2013) based on flight length, in order to fully explore a prominent region. This method is chosen due to its simplicity and efficiency. ...
Article
Full-text available
Fog computing is an emerging computing paradigm that extends traditional cloud computing by leveraging the resources at the user premises for providing better services. It is preferred for many real-time applications because of its advantages such as reduced network latency, improved security, and reduced operational costs. Due to the inherent heterogeneity among the fog devices, resource allocation and scheduling is a challenging task. This paper utilizes a multi-objective population-based metaheuristic optimizer called the crow search algorithm for resource allocation and scheduling in the fog computing environment. The two different objectives considered by the proposed work are namely: success ratio and the security hit ratio. Both of these objectives need to be maximized. To enhance the performance of the crow search algorithm, a local search method is utilized. The proposed work applies the metaheuristic technique for solving resource allocation and scheduling in the fog environment. The performance of the proposed algorithm is compared with the other existing algorithms, and the comparison results demonstrate the efficiency of the proposed algorithms in achieving the stated objectives.
... Inside the machine tasks scheduling literature, local search is generally a part of a metaheuristic method; for example, we found: iterated local search [39][40][41][42][43], particle swarm optimization [44,45], ant colony optimization [46][47][48], memetic algorithm [49][50][51][52][53], GRASP [18], and variable neighborhood search [54], among others. In the previous examples, local search plays a crucial role in their performances, so we can infer a straightforward improvement is through new local search designs and studies to enlighten ways of improving the final performance of these methods. ...
Article
Full-text available
The use of parallel applications in High-Performance Computing (HPC) demands high computing times and energy resources. Inadequate scheduling produces longer computing times which, in turn, increases energy consumption and monetary cost. Task scheduling is an NP-Hard problem; thus, several heuristics methods appear in the literature. The main approaches can be grouped into the following categories: fast heuristics, metaheuristics, and local search. Fast heuristics and metaheuristics are used when pre-scheduling times are short and long, respectively. The third is commonly used when pre-scheduling time is limited by CPU seconds or by objective function evaluations. This paper focuses on optimizing the scheduling of parallel applications, considering the energy consumption during the idle time while no tasks are executing. Additionally, we detail a comparative literature study of the performance of lexicographic variants with local searches adapted to be stochastic and aware of idle energy consumption.
... The experiments consists of 30 independent runs with an instance set from [20]. AMOSA uses the following parameters: Initial population = 100, T i = 4000, T f = 10, α = 0.95, L = 30, H L = 100 and SL = 150. ...
Chapter
In this chapter, an analytical parameter tuning for the Archive Multi-Objective Simulated Annealing (AMOSA) with a fuzzy logic controller is proposed. The analytical tuning is used to compute the initial and final temperature, as well as the maximum metropolis length. The fuzzy logic controller is used to adjust the metropolis length for each temperature. These algorithms are used to solve the Heterogeneous Computing Scheduling Problem. The tuned AMOSA with a fuzzy logic controller is compared against an AMOSA without tuning. Three quality indicators are used to compare the performance of the algorithms, these quality indicators are hypervolume, generational distance, and generalized spread. The experimental results show that the tuned AMOSA with fuzzy logic controller achieves the best performance.
... Meta-heuristics can be • Trajectory-based methods are methods where a solution space is searched and where the probability of choosing a better solution above a worse one is dependent on the moment of the trajectory in the timespan of the algorithm. Examples are Simulated Annealing [39,36,40], Iterative Local Search [41], Variable Neighborhood Search [42], and Tabu Search [36]. ...
Article
Automated guided vehicles (AGVs) form a large and important part of the logistic transport systems in today's industry. They are used on a large scale, especially in Europe, for over a decade. Current employed AGV systems and current systems offered by global manufacturers almost all operate under a form of centralized control: one central controller controls the whole fleet of AGVs. The authors do see a trend towards decentralized systems where AGVs make individual decisions favoring flexibility, robustness, and scalability of transportation. Promoted by the paradigm shift of Industry 4.0 and future requirements, more research is conducted towards the decentralization of AGV-systems in academia while global leading manufacturers start to take an active interest. That said, this implementation seems still in infancy. Currently, literature is dominated by central as well as by decentral control techniques and algorithms. For researchers in the field and for AGV developers, it is hard to find structure in the growing amount of algorithms for various types of applications. This paper is, to this purpose, meant to provide a good overview of all AGV-related control algorithms and techniques. Not only those that were used in the early stages of AGVs, but also the algorithms and techniques used in the most recent AGV-systems, as well as the algorithms and techniques with high potential.
Conference Paper
Full-text available
This paper addresses the problem of evaluating the schedules produced by list based scheduling algorithms, with metaheuristic algorithms. Task scheduling in heterogeneous systems is a NP-problem, therefore several heuristic approaches were proposed to solve it. These heuristics are categorized into several classes, such as list based, clustering and task duplication scheduling. Here we consider the list scheduling approach. The objective of this study is to assess the solutions obtained by list based algorithms to verify the space of improvement that new heuristics can have considering the solutions obtained with metaheuritcs that are higher time complexity approaches. We concluded that for a low Communication to Computation Ratio (CCR) of 0.1, the schedules given by the list scheduling approach is in average close to metaheuristic solutions. And for CCRs up to 1 the solutions are below 11% worse than the metaheuristic solutions, showing that it may not be worth to use higher complexity approaches and that the space to improve is narrow.
Article
Full-text available
In this paper, we present an experimental study of deterministic non-preemptive multiple workflow scheduling strategies on a Grid. We distinguish twenty five strategies depending on the type and amount of information they require. We analyze scheduling strategies that consist of two and four stages: labeling, adaptive allocation, prioritization, and parallel machine scheduling. We apply these strategies in the context of executing the Cybershake, Epigenomics, Genome, Inspiral, LIGO, Montage, and SIPHT workflows applications. In order to provide performance comparison, we performed a joint analysis considering three metrics. A case study is given and corresponding results indicate that well known DAG scheduling algorithms designed for single DAG and single machine settings are not well suited for Grid scheduling scenarios, where user run time estimates are available. We show that the proposed new strategies outperform other strategies in terms of approximation factor, mean critical path waiting time, and critical path slowdown. The robustness of these strategies is also discussed.
Chapter
Full-text available
Modern scientific experiments acquire large amounts of data that must be analyzed in subtle and complicated ways to extract the best results. The Laser Interferometer Gravitational Wave Observatory (LIGO) is an ambitious effort to detect gravitational waves produced by violent events in the universe, such as the collision of two black holes or the explosion of supernovae [37,258]. The experiment records approximately 1 TB of data per day, which is analyzed by scientists in a collaboration that spans four continents. LIGO and distributed computing have grown up side by side over the past decade, and the analysis strategies adopted by LIGO scientists have been strongly influenced by the increasing power of tools to manage distributed computing resources and the workflows to run on them. In this chapter, we use LIGO as an application case study in workflow design and implementation. The software architecture outlined here has been used with great efficacy to analyze LIGO data [2–5] using dedicated computing facilities operated by the LIGO Scientific Collaboration, the LIGO Data Grid. It is just the first step, however. Workflow design and implementation lies at the interface between computing and traditional scientific activities. In the conclusion, we outline a few directions for future development and provide some long-term vision for applications related to gravitational wave data analysis.
Article
A thorough and definitive book that fully addresses traditional and modern-day topics of nonparametric statistics. This book presents a practical approach to nonparametric statistical analysis and provides comprehensive coverage of both established and newly developed methods. With the use of MATLAB, the authors present information on theorems and rank tests in an applied fashion, with an emphasis on modern methods in regression and curve fitting, bootstrap confidence intervals, splines, wavelets, empirical likelihood, and goodness-of-fit testing. Nonparametric Statistics with Applications to Science and Engineering begins with succinct coverage of basic results for order statistics, methods of categorical data analysis, nonparametric regression, and curve fitting methods. The authors then focus on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. The important fundamental materials needed to effectively learn and apply the discussed methods are also provided throughout the book. Complete with exercise sets, chapter reviews, and a related Web site that features downloadable MATLAB applications, this book is an essential textbook for graduate courses in engineering and the physical sciences and also serves as a valuable reference for researchers who seek a more comprehensive understanding of modern nonparametric statistical methods.
Article
Given a parallel program represented by a task graph, the objective of a scheduling algorithm is to minimize the overall execution time of the program by properly assigning the nodes of the graph to the processors. In this paper we propose a novel GA-based algorithm with an objective to simultaneously meet the goals of high performance, scalability, and fast running time. The proposed PGS (Parallel Genetic Scheduling) algorithm itself is a parallel algorithm and generates high quality solutions in a short time. By encoding a scheduling list as a chromosome, the PGS algorithm can potentially generate an optimal scheduling list which in turn leads to an optimal schedule. The major strength of the PGS algorithm lies in its two efficient genetic operators: the order crossover and mutation which effectively combine the building-blocks of good scheduling lists to construct better lists. When evaluated with random task graphs for which optimal solutions are known, the algorithm generates optimal solutions for more than half of the test cases and close-to-optimal for the remaining cases.
Article
A ‘standard task graph set’ is proposed for fair evaluation of multiprocessor scheduling algorithms. Developers of multiprocessor scheduling algorithms usually evaluate them using randomly generated task graphs. This makes it difficult to compare the performance of algorithms developed in different research groups. To make it possible to evaluate algorithms under the same conditions so that their performances can be compared fairly, this paper proposes a standard task graph set covering many of the proposed task graph generation methods. This paper also evaluates as examples two heuristic algorithms (CP and CP/MISF), a practical sequential optimization algorithm (DF/IHS), and a practical parallel optimization algorithm (PDF/IHS) using the proposed standard task graph set. This set is available at http://www.kasahara.elec.waseda.ac.jp/schedule/. Copyright © 2002 John Wiley & Sons, Ltd.
Article
We propose a solution of the multiprocessor scheduling problem based on applying a relatively new metaheuristic technique, called Generalized Extremal Optimization (GEO). GEO is inspired by a simple coevolutionary model known as the Bak–Sneppen model. The model describes an ecosystem consisting of N species. Evolution in this model is driven by a process in which the weakest species in the ecosystem, together with its nearest neighbors, is always forced to mutate. This process shows the characteristics of a phenomenon called punctuated equilibrium, which is observed in evolutionary biology. We interpret the multiprocessor scheduling problem in terms of the Bak–Sneppen model and apply the GEO algorithm to solve the problem. We show that the proposed optimization technique is simple and yet outperforms genetic algorithm-based and swarm algorithm-based approaches to the multiprocessor scheduling problem.
Keywords: Multiprocessor scheduling problem; Generalized extremal optimization; GEO; Genetic algorithm; GA; PSO; Particle swarm optimization
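The mutation rule of the Bak–Sneppen model described above can be sketched in a few lines. This is a minimal simulation of the model only, not of GEO or of its application to scheduling; the ring topology, the uniform fitness values in [0, 1), and the names `bak_sneppen`, `n_species`, `steps`, and `seed` are assumptions made for illustration.

```python
import random


def bak_sneppen(n_species, steps, seed=0):
    """Simulate the Bak-Sneppen model on a ring (illustrative sketch).

    Assumes n_species >= 3 so that the weakest species has two distinct
    nearest neighbors.
    """
    rng = random.Random(seed)
    # Each species starts with a random fitness in [0, 1).
    fitness = [rng.random() for _ in range(n_species)]
    for _ in range(steps):
        # Locate the weakest species (lowest fitness).
        weakest = min(range(n_species), key=fitness.__getitem__)
        # Force it and its two ring neighbors to mutate.
        for offset in (-1, 0, 1):
            fitness[(weakest + offset) % n_species] = rng.random()
    return fitness


final = bak_sneppen(n_species=20, steps=1000)
print(min(final), max(final))
```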
Article
With the explosive growth in computers and the growing scarcity in electric supply, reduction of energy consumption in large-scale computing systems has become a research issue of paramount importance. In this paper, we study the problem of allocation of tasks onto a computational grid, with the aim to simultaneously minimize the energy consumption and the makespan subject to the constraints of deadlines and tasks' architectural requirements. We propose a solution from cooperative game theory based on the concept of Nash bargaining solution. In this cooperative game, machines collectively arrive at a decision that describes the task allocation that is collectively best for the system, ensuring that the allocations are both energy and makespan optimized. Through rigorous mathematical proofs we show that the proposed cooperative game, in a mere O(nm log(m)) time (where n is the number of tasks and m is the number of machines in the system), produces a Nash bargaining solution that guarantees Pareto-optimality. The simulation results show that the proposed technique achieves superior performance compared to the greedy and linear relaxation (LR) heuristics, and competitive performance relative to the optimal solution implemented in LINDO for small-scale problems.