Preference–based Matchmaking of Grid Resources
With CP–Nets
Massimo Cafaro · Maria Mirto · Giovanni Aloisio
Received: date / Accepted: date
Abstract We deal with the problem of preference-based matchmaking of computational resources belonging to a grid. We introduce CP–Nets, a recent development in the field of Artificial Intelligence, as a means to deal with users’ preferences in the context of grid scheduling. We discuss CP–Nets from a theoretical perspective and then analyze, qualitatively and quantitatively, their impact on the matchmaking process, with the help of a grid simulator we developed for this purpose. Many different experiments have been set up and carried out, and we report here our main findings and the lessons learnt.
Keywords Grids · Matchmaking · CP–Nets
1 INTRODUCTION
Grid computing [16] emerged as a new paradigm distinguished from traditional
distributed computing because of its focus on large-scale resource sharing and
innovative high-performance applications. The grid infrastructure ties together
a number of Virtual Organizations (VOs) [17], that reflect dynamic collections
of individuals, institutions and computational resources.
A Grid Information Service (GIS) [12] aims at providing an information rich environment to support service/resource discovery and decision making processes. The main goal of grid environments is indeed the provision of flexible, secure and coordinated resource sharing among VOs to tackle large-scale scientific problems, which in turn require addressing, besides other challenging issues like authentication/authorization, access to remote data etc., service/resource discovery and management for scheduling and/or co-scheduling of resources.
M. Cafaro
University of Salento, Lecce, Italy
CMCC - Euro-Mediterranean Centre for Climate Change, Lecce, Italy
E-mail: massimo.cafaro@unisalento.it
M. Mirto
CMCC - Euro-Mediterranean Centre for Climate Change, Lecce, Italy
E-mail: maria.mirto@cmcc.it
G. Aloisio
University of Salento, Lecce, Italy
CMCC - Euro-Mediterranean Centre for Climate Change, Lecce, Italy
E-mail: giovanni.aloisio@unisalento.it
Information thus plays a key role allowing, if exploited, high performance
execution in grid environments: the use of manual or default/static configura-
tions hinders application performance, whereas the availability of information
regarding the execution environment fosters design and implementation of so-
called grid-aware applications.
Obviously, applications can react to changes in their execution environment
only if these changes are somehow advertised. Therefore, self-adjusting, adap-
tive applications are natural consumers of information produced in grid envi-
ronments where distributed computational resources and services are sources
and/or potential sinks of information, and the data produced can be static,
semi-dynamic or fully dynamic [33]. Resource brokering services and grid
schedulers also need to access this information for matchmaking available grid
resources against a user’s request [15] [47].
The problem of matchmaking available resources in a grid environment against a user’s request entails finding one or more (a pooled set of) resources that best match the user’s request. A matchmaking service is in charge of finding the best match given the current status of the grid environment: indeed, the same request may result in different matchings under different resource load etc.
The input to the matchmaking service is a job description expressed in
a specified formalism (e.g., the Job Submission Description Language [1], the
Job Description Language [34], the Condor’s classified advertisements [39] etc.)
containing constraints to be satisfied by resources to execute a batch, param-
eter sweep or workflow job.
Our contribution is two-fold. First, we propose to extend the matchmak-
ing process to take into account the user’s preferences (besides the usual con-
straints) and, in order to deal with preferences, we suggest the use of Condi-
tional Preference Networks (CP–Nets) [7], a powerful concept borrowed from
the field of Artificial Intelligence that can be used to describe, structure and
reason about user’s preferences. Second, we thoroughly analyze, both qualita-
tively and quantitatively, the impact of CP–Nets on the matchmaking process.
Our analysis takes into account both the resource broker (or grid scheduler)
and the users’ perspectives, in order to assess the validity of our approach.
It is worth remarking here that our focus is not on the scheduling process: in this analysis we limit ourselves to matchmaking only, i.e., to the problem of finding a matching set of resources taking into account the user’s constraints and preferences. Therefore, in what follows, we will not deal with the problem of scheduling grid resources using algorithms such as FCFS, SJF or backfilling to achieve, for instance, minimization of the makespan metric, and we do not discuss grid scheduling systems such as GlideinWMS [37]. Instead, we will utilize the simplest possible strategy: since our matchmaking algorithm returns a set of resources ranked according to their matching degrees, we will simply schedule the corresponding job on the first available resource. If this resource is not available, we will try to schedule the job on the second resource if available and so on, round-robin.
The rest of the paper is organized as follows. Section 2 introduces CP–Nets.
Our matchmaking approach based on CP–Nets is presented in Section 3. This
approach is implemented in the grid simulator we used for our tests, which is
described in Section 4. We discuss the impact of CP–Nets on the scheduling
process in Section 5, analyzing the results of several computer simulations. We
discuss related work in Section 6, and draw our conclusions in Section 7.
2 CP–Nets
Conditional Preference Networks [6] address the problem of representing and
reasoning with preferences over a multivariate domain; their broad applicabil-
ity to many fields such as, for instance, design, planning and decision making,
is related to the ability to succinctly specify and represent preference order-
ings graphically. This is an extremely important feature, owing to the fact that
correspondingly explicit representations of preference orderings of multivariate
domains are exponential in the number of variables and thus unfeasible. We
begin by formally defining preference relations.
DEFINITION 1. Preference relation.
Given a set of variables $V = \{v_1, \ldots, v_n\}$ and the outcome space $O = Dom(v_1) \times \cdots \times Dom(v_n)$, a preference relation or ranking is a total preorder $\succeq$ over $O$; if $o_1 \succeq o_2$ then the outcome $o_1$ is equally or more preferred than $o_2$.
Each variable $v_i$ (also known as attribute or feature) may assume a value belonging to $Dom(v_i) = \{v^i_1, \ldots, v^i_{n_i}\}$. The size of the set of outcomes $O$ is thus exponential.
CP–Nets capture ceteris paribus (all else being equal) conditional prefer-
ence statements, whose semantics is based on the notion of preferential inde-
pendence.
DEFINITION 2. Preferential independence.
Let $x$ denote an assignment of values to a set $X \subseteq V$ and $xy$ the concatenation of two assignments to $X$ and $Y$ with $X \cap Y = \emptyset$. A set of variables $X$ is preferentially independent of its complement $Y = V \setminus X$ iff
$$x_1 y_1 \succeq x_2 y_1 \iff x_1 y_2 \succeq x_2 y_2 \quad \forall x_1, x_2, y_1, y_2.$$
When the preferential independence relation holds, $x_1$ is preferred to $x_2$ ceteris paribus: fixing the values of all of the other variables, the preference relation (over assignments to the set $X$) holds independently of the values taken by the other variables.
We are now ready to define Conditional preferential independence.
DEFINITION 3. Conditional preferential independence.
Given a partition of $V$ such that $V = X \cup Y \cup Z$, $X$ and $Y$ are conditionally preferentially independent given $z$ iff
$$x_1 y_1 z \succeq x_2 y_1 z \iff x_1 y_2 z \succeq x_2 y_2 z \quad \forall x_1, x_2, y_1, y_2.$$
In practice, $X$ and $Y$ are preferentially independent when $Z$ is assigned $z$. If the relation holds for all possible assignments $z$ then $X$ and $Y$ are conditionally preferentially independent given $Z$.
The preference elicitation process requires that users specify for each variable $x \in V$ the parent variable $Parent(x)$ that can affect their preferences over the values of $x$. The CP–Net graph is then constructed so that for each node $x$, $Parent(x)$ is the immediate predecessor. A more general approach is to allow $Parent(x)$ to represent a set of vertices instead of a single vertex. On the basis of the particular assignment to the vertex $Parent(x)$ the user is able to determine a specific preference order over $Dom(x)$, the domain of the variable $x$, all other things being equal. Thus, a CP–Net associates to each assignment to $Parent(x)$ a Conditional Preference Table (CPT).
DEFINITION 4. CP–Net.
A CP–Net is a directed graph $G = (V, E)$. The set of vertices $V = \{v_1, \ldots, v_n\}$ represents the CP–Net variables and $E = \{(v_i, v_j) : v_i, v_j \in V\}$ is the set of edges between variables. For each $v \in V$, the function $Parent(v)$ returns the vertex $\bar{v} \in V$ such that $(v, \bar{v}) \in E$. The CPT specifies a strict partial order $\succ^i_u$ over $Dom(x_i)$ representing the conditional preference of the instantiations of $x_i$ for a given instantiation $u$ of $Parent(x_i)$.
Given the ceteris paribus preference statement “I prefer wine to beer with
my meal”, its interpretation is: given two identical meals, one with wine and
one with beer, I prefer the former. The statement “I prefer red wine to white
wine with my meal, ceteris paribus, given that meat is served” is interpreted as:
given two identical meals in which meat is served, I prefer red wine to white
wine. This tells us nothing about two identical meals in which meat is not
served. Ceteris paribus preference statements induce independence relations: for instance, if my preference for wine depends on (and only on) the main course, then wine choice is conditionally preferentially independent of all other variables given the main course value.
We now give an example describing how a CP–Net can be used to represent the following preference statements:
- I strictly prefer aix to linux as operating system;
- I prefer power processors if the operating system is aix, and xeon processors if the operating system is linux;
- I prefer the EESL math library when using power processors and the NAG library when using xeon processors.
Let the variables $a$, $l$, $p$, $x$, $e$ and $n$ represent respectively a preference for aix, linux, power processor, xeon processor, EESL and NAG math library. The first preference statement is unconditional; the other ones are conditional. Fig. 1a shows the CP–Net related to the previous example. Each node represents a domain variable, and the immediate parents $Parent(v)$ of a variable $v$ in the network are those variables that affect the user’s preference over the values of $v$. Associated to each node, there is a Conditional Preference Table (CPT) which provides an ordering over the values of the node for each possible parent’s context. Fig. 1b shows the corresponding Conditional Preference Graph, in which $lpn$ represents the worst outcome and $ape$ the best one.
As discussed in [7], any acyclic CP–Net defines a consistent partial order over the outcome space; given a CP–Net $N$ and two possible outcomes $x$ and $y$, a dominance query asks whether or not $x \succ y$ is a consequence of the preferences of $N$. When the Conditional Preference Graph is a DAG (Directed Acyclic Graph), it can be shown that, in order to answer the query, a simple polynomial time sweep algorithm only needs to search for a flipping sequence (path) from the less preferred outcome $y$ through a series of more preferred outcomes, to the more preferred outcome $x$, where each value flip in the sequence is sanctioned by the network $N$. The time complexity of the flipping-sequence search over binary-valued, DAG-structured CP–Nets is $O(n^2)$, where $n$ is the number of variables in the CP–Net [6]. For instance, the dominance query $axn \succ lpe$ can be shown to be true in the example of Fig. 1b owing to the fact that there exists a flipping sequence (a path of improving flips) from $lpe$ to $axn$, namely $lpe \to lxe \to lxn \to axn$. This is a proof that $axn$ is preferred to $lpe$.
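The dominance query can be illustrated with a minimal sketch for the example CP–Net (aix preferred to linux; power preferred under aix and xeon under linux; EESL preferred with power and NAG with xeon). The encoding of the network as a dictionary and the names `CP_NET`, `improving_flips`, `dominates` and `outcome` are assumptions made for illustration. The search is a plain breadth-first exploration of improving flips, not the polynomial time procedure of [6].

```python
from collections import deque

# Hypothetical encoding of the example CP-Net: each variable maps to
# (parent, CPT). A CPT maps the parent's value (None for a root) to the
# values of the variable, ordered from most to least preferred.
CP_NET = {
    "os": (None, {None: ["a", "l"]}),
    "cpu": ("os", {"a": ["p", "x"], "l": ["x", "p"]}),
    "lib": ("cpu", {"p": ["e", "n"], "x": ["n", "e"]}),
}
ORDER = ["os", "cpu", "lib"]


def outcome(os_, cpu, lib):
    """Build an outcome, i.e. a full assignment to the three variables."""
    return {"os": os_, "cpu": cpu, "lib": lib}


def improving_flips(current):
    """Yield every outcome reachable from `current` by one improving flip."""
    for var in ORDER:
        parent, cpt = CP_NET[var]
        ranking = cpt[current[parent] if parent else None]
        position = ranking.index(current[var])
        for better_value in ranking[:position]:
            yield {**current, var: better_value}


def dominates(better, worse):
    """True if a sequence of improving flips leads from `worse` to `better`."""
    key = lambda o: tuple(o[v] for v in ORDER)
    target = key(better)
    seen = {key(worse)}
    queue = deque([worse])
    while queue:
        current = queue.popleft()
        for nxt in improving_flips(current):
            k = key(nxt)
            if k == target:
                return True
            if k not in seen:
                seen.add(k)
                queue.append(nxt)
    return False
```

Calling `dominates(outcome("a", "x", "n"), outcome("l", "p", "e"))` explores the flips sanctioned by the CPTs starting from the linux/power/EESL outcome; the exhaustive search is only practical for small networks such as this one.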
Given a CP–Net $N$, generating an optimal outcome is even easier: this requires sweeping through the network from top (ancestor vertices) to bottom (descendant vertices), setting each variable to its most preferred value given the instantiation of its parents. Although the network does not, in general, determine a unique ranking, it does determine a unique best outcome (assuming no indifference). Therefore, outcome optimization queries can be answered using the outlined forward sweep algorithm, whose complexity is $O(n)$, i.e., linear in the number $n$ of variables [6].
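The forward sweep can be sketched as follows for the example CP–Net. The list-based encoding, the function name `optimal_outcome` and the optional `evidence` argument (values fixed in advance, for instance by a constraint) are assumptions made for illustration, not part of the original formulation.

```python
# Hypothetical encoding of the example CP-Net, listed in topological order:
# (variable, parent, CPT), where the CPT maps the parent's value (None for
# a root) to the variable's values ordered from most to least preferred.
CP_NET = [
    ("os", None, {None: ["a", "l"]}),
    ("cpu", "os", {"a": ["p", "x"], "l": ["x", "p"]}),
    ("lib", "cpu", {"p": ["e", "n"], "x": ["n", "e"]}),
]


def optimal_outcome(cp_net, evidence=None):
    """Visit each variable once, parents first, picking its best value."""
    result = dict(evidence or {})
    for var, parent, cpt in cp_net:
        if var in result:
            continue  # value fixed by the evidence, nothing to choose
        context = result[parent] if parent is not None else None
        result[var] = cpt[context][0]
    return result
```

Each variable is visited exactly once, which is where the linear bound comes from. With no evidence the sweep selects aix, then power, then EESL; fixing the operating system to linux instead leads to xeon and NAG.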
Finally, CP–Nets also allow expressing relative importance relations. These express the fact that one variable’s value is more important than another’s; moreover, CP–Nets induce implicit importance relations between nodes and their descendants. As an example, one could say that processor type is more important to me than operating system (all else being equal). If it is more important to me that the value of $x$ be high than the value of $y$ be high, then $x$ is more important than $y$. The notation to express this relative importance relation is $x \lhd y$.
A variable may be conditionally more important than another. For instance, one could say the operating system is more important than processor type (all else being equal), if the workstation is used primarily for graphical applications. Given $z \in Dom(Z)$, if it is more important to me that the value of $x$ be high than the value of $y$ be high, then $x$ is conditionally more important than $y$. The corresponding notation is $x \lhd_z y$.
[Figure 1. (a) CP–Net with nodes Operating System, CPU and Math Library and CPTs $a \succ l$; $a: p \succ x$, $l: x \succ p$; $p: e \succ n$, $x: n \succ e$. (b) Corresponding Conditional Preference Graph over the outcomes.]
Fig. 1: CP–Net and corresponding Conditional Preference Graph
3 CONDITIONAL PREFERENCE MATCHMAKING
In this Section we briefly review our approach to matchmaking available resources in a grid environment against a user’s request, which entails finding one or more (a pooled set of) resources that best match the request. Our matchmaking service is in charge of finding the best match given the current status of the grid environment and the user’s constraints and preferences: indeed, the same request may result in different matchings under different resource load etc. The first input to the matchmaking service is a job description expressed in a specified formalism (e.g., the Job Submission Description Language [1], the Job Description Language [34], the Condor’s classified advertisements [39] etc.) containing constraints to be satisfied by resources to execute a batch, parameter sweep or workflow job (e.g. the machine’s memory must be at least 16 GB, the processor must be AMD etc.). The second input is the set of the user’s preferences w.r.t. the resources to be used for the execution of the job. In our simulator, both the constraints and the preferences are expressed using a simple XML dialect, as shown in Section 4.4.
Algorithm 3.1: Conditional Preference Matchmaking algorithm
Input: J, a job to be scheduled; C, set of user’s constraints; P, set of user’s preferences
Output: Job J is scheduled on the resource that best matches the user’s request, if it exists
    /* Query a GIS using the constraints in C */
1   RMC ← Query-GIS(C);
    /* RMC is the set of resources matching the constraints */
2   if card(RMC) == 0 then
        /* no resource matches the constraints in C */
3       return No match;
4   else
5       build the CP–Net graph using the preferences in P;
        /* run the linear time outcome optimization query on the CP–Net */
6       RMPC ← CP-Net(RMC, P);
        /* RMPC is the set of resources matching both the preferences and the constraints */
7       if card(RMPC) == 0 then
            /* no resource matches the preferences in P, so scheduling may use any heuristic */
8           schedule job J on one of the available resources in RMC;
9       else
10          F ← Filter(RMPC);
            /* F is a subset of resources in RMPC matching the preferences in P and the constraints in C, totally ordered from best to worst w.r.t. preferences */
11          schedule job J on the first available resource in F;
Algorithm 3.1 describes the steps to achieve Conditional Preference Matchmaking. We start by querying a Grid Information Service (GIS) using the constraints in C to determine RMC, a set of resources matching the constraints on which the user’s job may run (step 1). If RMC is empty, no machine on the grid actually satisfies the user’s constraints, so the scheduler discards the job and informs the user (steps 2–3). Otherwise, in steps 4–11 we deal with the resources in RMC.
We begin by building the CP–Net graph related to the preferences P in step 5. Then, we run the linear time outcome optimization query on the CP–Net in step 6, determining RMPC ⊆ RMC, a subset of resources matching both the user’s preferences and constraints. If RMPC is empty, the CP–Net algorithm did not find any resource in RMC satisfying the user’s preferences. Hence, we schedule the job on one of the machines belonging to RMC (steps 7–8). This can be done using, for instance, a heuristic.
When RMPC is not empty, steps 9–11 return a set of resources suitable for job execution. Since RMPC may contain many resources, in step 10 we select F ⊆ RMPC, a fraction of these resources. For instance, we select and return the initial 30% of the machines returned in RMPC, ordered from best to worst w.r.t. preferences. The job is scheduled on the first machine available in F. If a machine is not available, the scheduler will try scheduling the job on the next one, round robin, until it succeeds or a timeout elapses.
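The control flow of Algorithm 3.1 can be sketched as below. This is only an illustration under several assumptions: resources are plain dictionaries, constraints are predicates, the CP–Net query is abstracted as a caller-supplied `rank` function returning the matching resources ordered from best to worst, and the list is scanned once instead of retrying until a timeout. The names `match_and_schedule`, `rank` and `is_available` are hypothetical.

```python
import math
from typing import Callable, Iterable, Optional

Resource = dict  # hypothetical: a resource is a plain attribute dictionary


def match_and_schedule(
    resources: Iterable[Resource],
    constraints: Iterable[Callable[[Resource], bool]],
    rank: Callable[[list], list],
    is_available: Callable[[Resource], bool],
    fraction: float = 0.3,
) -> Optional[Resource]:
    """Return the resource selected for the job, or None if there is none."""
    constraints = list(constraints)
    # Step 1: stand-in for Query-GIS(C)
    rmc = [r for r in resources if all(check(r) for check in constraints)]
    # Steps 2-3: no resource matches the constraints
    if not rmc:
        return None
    # Steps 5-6: stand-in for the CP-Net query, best resources first
    rmpc = rank(rmc)
    if not rmpc:
        # Steps 7-8: no resource matches the preferences, fall back to RMC
        candidates = rmc
    else:
        # Step 10: keep the best fraction of RMPC (at least one resource)
        keep = max(1, math.ceil(fraction * len(rmpc)))
        candidates = rmpc[:keep]
    # Step 11: first available resource, scanning the candidates in order
    for resource in candidates:
        if is_available(resource):
            return resource
    return None
```

For example, with a constraint such as `lambda r: r["ram"] >= 16` and a `rank` function that sorts aix machines before linux ones, the function returns the best-ranked available machine among those satisfying the memory constraint.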
We now analyze the computational complexity of Algorithm 3.1. We denote by $T(R, C, P)$ the time required to determine a feasible set of resources on a grid consisting of $R$ resources taking into account the constraints $C$ and the preferences $P$, by $H(RMC)$ the time to execute a heuristic on the set of resources RMC and by $\text{CP–Net}(RMC, P)$ the time to execute the CP–Net outcome optimization query on RMC and $P$. We have $T(R, C, P) = \max(H(RMC), \text{CP–Net}(RMC, P))$.
Indeed, the time to query a GIS (we have an explicit query in step 1, and implicit queries in steps 8 and 11) is usually negligible w.r.t. the time used for scheduling the job. The CP–Net outcome optimization query requires in the worst case time linear in the input size $n$ (number of preferences), i.e. $O(n)$. Therefore, $T(R, C, P) = \Omega(|RMC|)$.
Since analyzing all of the resources in step 8 using a heuristic requires at least time linear in the number of resources in RMC, we conclude that the overall time complexity depends on the actual number of steps executed for each resource. As an example, if no more than $O(1)$ steps are executed on each resource in RMC, then the overall complexity of the algorithm is linear in the number of resources in RMC.
4 A Simulator for Grid Matchmaking and Scheduling
Efficient and effective scheduling is very important in grid computing environments, as shown in [25], [26], [13]. The problem can be addressed through experimental or simulation approaches. Validating the performance of grid scheduling strategies in a real production environment would be the ideal scenario, but cannot be feasibly carried out: the complexity of production systems, the dynamism of grid execution environments and the difficulty of reproducing experiments make production systems a complex research environment. So, given the difficulties tied to the experimental approach, simulation is the most flexible and viable way of evaluating different grid scheduling algorithms as well as other design issues, although some simplifications and assumptions are made.
In order to develop and evaluate new grid scheduling algorithms, the use of simulators is fundamental for carrying out performance evaluation studies that consider possible constraints and preferences given by the users. On the other hand, it is worth noting here that the evaluation of the performance obtained with a simulator is just a first step, which must be followed by the setup of a good testing environment with representative workload traces to produce dependable results. Computer simulation is our approach for evaluating CP–Nets in the matchmaking of the resources involved; therefore, a simulator has been designed and implemented. Even though several simulators have already been developed, e.g. Alea [20], Bricks [44], ChicSim [36], GrenchMark [19], GridNet [24], GridSim [8], MicroGrid [40], NSGrid [46], G3S [32], OptorSim [9], SimGrid [10] and GSSIM [23], we decided to implement our own simulator for the following reasons:
- we did not need the ability to plug in several algorithms;
- the majority of simulators do not provide C bindings (with the exception of SimGrid);
- in order to reduce software engineering costs and to maximize reuse we had to exploit an already implemented code base;
- the time required to install a simulator, read and understand its documentation and implement our algorithm according to the simulator’s architecture was much higher than the time required to integrate our own software modules.
Our simulator takes as parameters constraints and preferences, related to a set of information on resources (CPU, memory and storage usage), network links and applications, provided by the users (our scheduler’s clients) in an XML format. We first describe the workload data, then present the architecture of the simulator and several implementation details.
4.1 Workload Data
The workload plays an important role in experimental performance evaluation
of computer systems. Many studies have been conducted to design better and
more effective resource allocation schemes [13].
Using the simulation approach, a data set representative of the job inter-arrival times can be generated by using various statistical distributions such as the Uniform and Exponential distributions ([29], [31], [14]).
For describing job arrivals, we used the Exponential, Gaussian and Weibull distributions. The Exponential distribution is motivated by the need to simulate an incremental rate of task arrivals. The Gaussian distribution performs well even under the stress of millions of tightly packed data points. Finally, the failure rate of the Weibull distribution is a power function of time, where the instantaneous failure rate at time $t$ is defined as the probability of failures between time $t$ and $t + dt$ given that no failure has occurred in the system until time $t$.
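As an illustration of how such a workload could be generated, the following sketch draws inter-arrival times from the three distributions using Python’s standard `random` module and accumulates them into arrival timestamps. The function name `arrival_times`, the parameter names and their default values are assumptions made for this example; the parameters actually used in the experiments are not stated here. Negative Gaussian samples are clamped to zero, which is also an assumption.

```python
import random


def arrival_times(n_jobs, distribution, seed=0, **params):
    """Return n_jobs non-decreasing arrival timestamps.

    Inter-arrival times are sampled from the named distribution; the
    default parameters below are illustrative only.
    """
    rng = random.Random(seed)
    samplers = {
        "exponential": lambda: rng.expovariate(params.get("rate", 1.0)),
        "gaussian": lambda: rng.gauss(
            params.get("mean", 1.0), params.get("stddev", 0.25)
        ),
        "weibull": lambda: rng.weibullvariate(
            params.get("scale", 1.0), params.get("shape", 1.5)
        ),
    }
    sample = samplers[distribution]
    clock, times = 0.0, []
    for _ in range(n_jobs):
        # Clamp at zero: a Gaussian sample may be negative
        clock += max(0.0, sample())
        times.append(clock)
    return times
```

Using a fixed seed makes a simulated workload reproducible, so that the same sequence of arrivals can be replayed under different matchmaking settings.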
4.2 Simulator architecture
The simulator has a client-server architecture. The server starts by initializing a number of child processes specified as a command line parameter. When a connection request is issued by a client (service consumer), the server spawns a new thread to serve the incoming request.
A submitted job contains constraints and preferences specified in an XML file, described in Section 4.4. As shown in Fig. 2, when the request is submitted by a client, the server queues the request in the Arrivals Queue (managed by the process request thread). The job is then validated against a Document Type Definition (DTD) by the validator thread.
If the validation is successful, then the job is queued in the Job Queue, handled with a FIFO (First In First Out) policy. The job status becomes JOB_SUBMITTED and it is stored in an internal database with the current timestamp. Otherwise, if the XML file is not well formed or the client has resubmitted a previously submitted job, then the validator thread kills the job and the server deals with the following requests. Jobs queued in the Job Queue are not scheduled immediately, owing to the fact that there is another queue, the Pending Queue, with higher priority. The latter queue handles jobs that were submitted previously but that the scheduler was not able to schedule, given their constraints and preferences on the one hand, and the grid status on the other.
When a job is dequeued from the Job Queue, the scheduler queries the internal database (which fulfills the role of a Grid Information Service) to retrieve RMC, the set of computational resources matching the constraints, and then processes the preferences specified in the corresponding XML file through the CP–Net outcome optimization query, returning RMPC, the set of resources matching both the user’s preferences and constraints. If none of the resources satisfies the constraint query (|RMC| == 0), the client request cannot be scheduled; hence the job is suppressed and its status is updated to JOB_SUPPRESSED. If a set of resources matches the constraints but no resource satisfies the preferences (|RMC| > 0 ∧ |RMPC| == 0), then one of the resources matching the constraints is chosen. Finally, when a set of resources matches the constraints and the CP–Net algorithm returns a set RMPC including at least one resource, which is the best outcome with regard to the preferences (|RMC| > 0 ∧ |RMPC| > 0), the scheduler selects a subset of resources in RMPC totally ordered from best to worst w.r.t. preferences, and tries to schedule the job on the first available resource (round robin).
[Figure 2: flow chart of the simulator, from XML data generation and the Arrivals Queue (An) through validation, the Job Queue (Jn), constraint and preference query evaluation and the grid status check (free storage and free CPUs against the requested ones), to job execution or the Pending Queue (Pn).]
Fig. 2: The simulator flow chart
Before executing the job, the scheduler queries the database again because, owing to the dynamic nature of the grid environment, it needs to verify whether the selected resource still provides the required number of CPU cores etc.; indeed, another scheduler thread may have concurrently submitted another job or rescheduled a job dequeued from the Pending Queue.
If the selected resource can no longer run the job, the job is queued in the Pending Queue (also handled with a FIFO policy). Since this queue has a higher priority than the Job Queue, the scheduler will attempt to serve job requests belonging to this queue before the ones in the Job Queue. In turn, this provides a certain degree of fault tolerance. However, a job cannot be queued into the Pending Queue over and over again: a field of the database, maxPendingTimes (MPT), specifies a Time To Live (TTL), so that when this limit is exceeded the job is suppressed and its status updated to JOB_TIMEOUT.
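The interplay between the two queues and the MPT limit can be sketched as follows. This is a deliberately simplified model, not the simulator’s code: one job is examined per tick, the only resource considered is a number of free cores, and MPT is interpreted as a maximum number of re-queueings. The names `Job` and `run_scheduler` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    cores: int
    pending_times: int = 0
    status: str = "JOB_SUBMITTED"


def run_scheduler(jobs, free_cores_per_tick, max_pending_times):
    """Serve one job per tick, always draining the Pending Queue first."""
    jobs = list(jobs)
    job_queue = deque(jobs)   # FIFO queue of validated jobs
    pending_queue = deque()   # FIFO queue with higher priority
    for free_cores in free_cores_per_tick:
        if not job_queue and not pending_queue:
            break
        queue = pending_queue if pending_queue else job_queue
        job = queue.popleft()
        if job.cores <= free_cores:
            # Resources still available at execution time
            job.status = "JOB_DONE"
        elif job.pending_times >= max_pending_times:
            # The job has already been re-queued too many times
            job.status = "JOB_TIMEOUT"
        else:
            job.pending_times += 1
            job.status = "JOB_PENDING"
            pending_queue.append(job)
    return jobs
```

Because the Pending Queue is always served first, a job that repeatedly fails the availability check delays the jobs behind it only until its limit is reached, which is the role the TTL plays in the description above.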
Fig. 3 depicts job states and transitions between states; available states include:
- JOB_PENDING: the job is waiting in the Arrivals or Pending Queue;
- JOB_SUBMITTED: the job is processed by the server;
- JOB_TIMEOUT: the time specified in the maxPendingTimes field has elapsed without the scheduler being able to submit the job;
- JOB_SUPPRESSED: no resource was available for job scheduling;
- JOB_DONE: the job completed successfully;
- JOB_FAILED: job execution failed.
[Figure 3: state diagram over the states listed above, from start to end, with transitions guarded by the conditions MPT < Max and MPT > Max.]
Fig. 3: Job states
4.3 The information database
One of the aspects considered in the design of the simulator has been the definition of the database schema in order to provide a system for storing and accessing the data with the usual CRUD operations. Therefore, we modeled the information to be managed taking into account information about the available CPUs and cores, resources, nodes, applications, jobs and queues. A snapshot of the relational schema is shown in Fig. 4. The “grid” database contains the following information:
CPU entity:
serial number: CPU serial number (primary key of this entity);
node id: node identifier, represents a resource node containing the CPU;
hourly cost: hourly cost of the CPU;
frequency: clock frequency;
brand: CPU manufacturer;
cores: numbers of cores;
cint: integer performance value acquired through the SPEC bench-
mark;
cfloat: floating point performance value acquired through the SPEC
benchmark;
cache L1: cache size of level 1 of CPU;
cache L2: cache size of level 2 of CPU.
Node entity:
id node: node identifier (primary key of this entity);
ram: RAM size;
os name: installed operating system;
os version: operating system version;
os load average: load average;
network interface: available network interface;
hostname resource: node’s hostname.
Resource entity:
hostname: hostname identifying the resource (primary key of this en-
tity);
Preference–based Matchmaking of Grid Resources With CP–Nets 13
CPU
serial_number
hourly_cost
frequency
brand
cores
spec_int
spec_float
cache_L1
cache_L2
node_id
node
id_node
ram
os_name
os_version
os_load_average
network_interface
hostname_resource
PK
FK1
applicatio n
name
type
expected_workload
additional_prefs
application _installed _resource
date_time
resource_hostname
name_application
resource
PK hostname
place
storage_type
storage_name
storage_brand
storage_space
storage_free_space
bandwith_in
bandwith_out
PK
FK1
queue
name
hostname
type
priority
cpus
free_cpus
policy
job
PK
FK1
FK2
FK3
id
type
SLA
name
parent_id
name_queue
name_application
job status
PK
FK1
date_time
value
id_job
PK
PK
FK2
FK1
PK1
PK2
Fig. 4: Relational schema of the “grid” database
place: resource’s location;
storage type: storage type of the resource;
storage name: storage name of the resource;
storage brand: storage brand of the resource;
storage space: size of space currently used;
storage free space: size of free space available;
bandwidth in: input network bandwidth;
bandwidth out: output network bandwidth.
Application entity:
name application: application name (primary key of this entity);
type: application type (sequential, parallel);
expected workload: expected workload of the application;
additional prefs: possible preferences.
Job entity:
id: job identifier (primary key of this entity);
type: job type (sequential, parallel);
SLA: Service Level Agreement associated to the job;
name: job name;
14 Massimo Cafaro et al.
parent id: identifier of parent job (required to support workflow appli-
cations; NULL for a job without parent);
name queue: name of resource queue in which the job has been queued
for execution;
name application: name of application to which the job refers.
Job Status entity:
id job: job identifier (primary key of this entity);
date time: timestamp (date and hour) associated to the job status;
value: job status;
Queue entity:
name: queue name (primary key of this entity, along with hostname resource);
hostname resource: hostname of the resource to which the queue be-
longs;
type: queue type (sequential, parallel);
priority: priority level (high, medium, low);
cpus: total number of CPUs handled;
free cpus: number of current CPUs available for job execution;
policy: management policy of the queue.
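As an illustration of the composite primary key described for the Queue entity, the following minimal sketch creates two of the tables in an in-memory SQLite database. Only the column names come from the schema above; the column types, the use of SQLite, the helper names and the sample rows are assumptions made for the example.

```python
import sqlite3

# Column names follow the schema above; types and sample data are assumptions.
DDL = """
CREATE TABLE queue (
    name              TEXT NOT NULL,
    hostname_resource TEXT NOT NULL,
    type              TEXT CHECK (type IN ('sequential', 'parallel')),
    priority          TEXT CHECK (priority IN ('high', 'medium', 'low')),
    cpus              INTEGER,
    free_cpus         INTEGER,
    policy            TEXT,
    PRIMARY KEY (name, hostname_resource)
);
CREATE TABLE job (
    id               INTEGER PRIMARY KEY,
    type             TEXT,
    SLA              TEXT,
    name             TEXT,
    parent_id        INTEGER REFERENCES job(id),  -- NULL for a job without parent
    name_queue       TEXT,
    name_application TEXT
);
"""


def build_db():
    """Create the two illustrative tables in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def same_queue_twice_rejected(conn):
    """Return True if a duplicate (name, hostname_resource) pair is refused."""
    row = ("batch", "host-a", "parallel", "high", 64, 64, "fifo")
    conn.execute("INSERT INTO queue VALUES (?, ?, ?, ?, ?, ?, ?)", row)
    try:
        conn.execute("INSERT INTO queue VALUES (?, ?, ?, ?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    return False
```

The same queue name may still be reused on a different resource, since only the pair (name, hostname_resource) has to be unique.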
4.4 Job Description
The jobs executed during the simulation are represented by XML files containing
both constraints and preferences. These files are validated against a suitable
schema. In particular, the Job tag, which is the root element, must have three
child tags: Parameters, Requirements and Preferences. The Parameters tag
specifies the job executable, the command line arguments, the type of job
(serial or parallel) and the number of required CPUs. Requirements contains
constraints on the resources, such as the type of CPU, the amount of RAM, the
operating system, etc.; Preferences contains the user’s desiderata. Each
attribute element under Preferences may have a maxT (maxterm) child containing
one or more minT (minterm) nodes, each of which must have at least one child
node named depends. Indeed, preference modeling is based on CP–Nets: the
CP–Table associated with a CP–Net node may be expressed using standard forms
such as sum of products (minterms) or product of sums (maxterms), commonly used
in Boolean algebra and Karnaugh maps. The following is the Document Type
Definition (DTD) for job descriptions.
<!ELEMENT Job (Parameters,Requirements,Preferences)>
<!ELEMENT Parameters (Executable, Arguments, Type)>
<!ELEMENT Executable (#PCDATA)>
<!ELEMENT Arguments (#PCDATA)>
<!ELEMENT Type (#PCDATA)>
<!ELEMENT Requirements (CPU?, Node?, Resource?)>
<!ELEMENT CPU (brand?, cores?, frequency?, cache_L1?, cache_L2?, CINT?, CFP?, hourly_cost?)>
<!ELEMENT Node (ram?, os_name?, os_version?)>
<!ELEMENT Resource (hostname?, place?, bandwidth_in?, bandwidth_out?, storage_free_space?)>
<!ELEMENT Preferences (CPU?, Node?, Resource?)>
<!ELEMENT brand (maxT?)>
<!ELEMENT cores (maxT?)>
<!ELEMENT frequency (maxT?)>
<!ELEMENT cache_L1 (maxT?)>
<!ELEMENT cache_L2 (maxT?)>
<!ELEMENT CINT (maxT?)>
<!ELEMENT CFP (maxT?)>
<!ELEMENT hourly_cost (maxT?)>
<!ELEMENT ram (maxT?)>
<!ELEMENT os_name (maxT?)>
<!ELEMENT os_version (maxT?)>
<!ELEMENT hostname (maxT?)>
<!ELEMENT place (maxT?)>
<!ELEMENT bandwidth_in (maxT?)>
<!ELEMENT bandwidth_out (maxT?)>
<!ELEMENT storage_free_space (maxT?)>
<!ELEMENT maxT (minT+)>
<!ELEMENT minT (depends+)>
<!ELEMENT depends EMPTY>
<!ATTLIST Executable applicationName CDATA #REQUIRED expectedWorkload CDATA #REQUIRED>
<!ATTLIST Type nCpu CDATA #REQUIRED>
<!ATTLIST brand value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST cores value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST frequency value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST cache_L1 value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST cache_L2 value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST CINT value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST CFP value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST hourly_cost value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST ram value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST os_name value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST os_version value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST hostname value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST place value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST bandwidth_in value CDATA #REQUIRED operator (min) "min">
<!ATTLIST bandwidth_out value CDATA #REQUIRED operator (min) "min">
<!ATTLIST storage_free_space value CDATA #REQUIRED operator (min) "min">
<!ATTLIST maxT operation (and|or) "and">
<!ATTLIST minT operation (and|or) "and">
<!ATTLIST depends node (brand|cores|frequency|cache_L1|cache_L2|CINT|CFP|hourly_cost|ram|os_name|os_version|hostname|place|bandwidth_in|bandwidth_out|storage_free_space) "brand" denied (y|n) "n">
Here is an example of a job description file:
<Job>
<Parameters>
<Executable applicationName="app_8" expectedWorkload="53.675">48</Executable>
<Arguments>arg4</Arguments>
<Type nCpu="8">PARALLEL</Type>
</Parameters>
<Requirements>
<CPU>
<frequency value="1.8" operator="min"/>
<hourly_cost value="2" operator="max"/>
</CPU>
<Node>
<ram value="4" operator="min"/>
<os_name value="Aix" operator="equal"/>
</Node>
</Requirements>
<Preferences>
<CPU>
<cores value="16" operator="min">
<maxT operation="or">
<minT operation="and">
<depends node="bandwidth_in" denied="n"/>
</minT>
</maxT>
</cores>
<hourly_cost value="1.5" operator="max"/>
</CPU>
<Resource>
<bandwidth_in value="12" operator="min">
<maxT operation="or">
<minT operation="and">
<depends node="hourly_cost" denied="y"/>
</minT>
</maxT>
</bandwidth_in>
</Resource>
</Preferences>
</Job>
This file describes the following requests of a user:
Execution of a parallel job named ’48’, related to the ’app_8’ application,
with at least 8 CPUs (constraint);
RAM size must be at least 4 GB on each node (constraint);
The operating system must be Aix (constraint);
The CPU frequency must be greater than or equal to 1.8 GHz (constraint);
The hourly cost of the CPUs must be less than or equal to 2 dollars
(constraint);
If possible, the hourly cost of the CPUs should be less than or equal to 1.5
dollars (preference);
If it is possible to submit the job on a CPU with an hourly cost of at most
1.5 dollars, the resource should preferably have an input bandwidth of at
least 12 Mb/s (preference);
If the previous requests can be satisfied, the job should be run on a CPU
with at least 16 cores (preference).
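The constraint part of such a description can be checked mechanically. The sketch below is a minimal illustration, not the simulator's code: it assumes that operator "min" means the offered value must be greater than or equal to the requested one, "max" means less than or equal, and "equal" means string equality, which is how the constraints are read in the list above. The function names and the representation of a resource as a flat dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the Requirements element from the example above.
JOB_XML = """
<Job>
  <Requirements>
    <CPU>
      <frequency value="1.8" operator="min"/>
      <hourly_cost value="2" operator="max"/>
    </CPU>
    <Node>
      <os_name value="Aix" operator="equal"/>
    </Node>
  </Requirements>
</Job>
"""


def satisfies(offered, operator, wanted):
    """Assumed semantics: min -> offered >= wanted, max -> offered <= wanted."""
    if operator == "equal":
        return str(offered) == wanted
    if operator == "min":
        return float(offered) >= float(wanted)
    if operator == "max":
        return float(offered) <= float(wanted)
    raise ValueError(f"unknown operator: {operator}")


def meets_requirements(job_xml, resource):
    """Check every attribute listed under Requirements against a resource dict."""
    root = ET.fromstring(job_xml)
    for group in root.find("Requirements"):      # CPU, Node, Resource
        for attr in group:                       # frequency, hourly_cost, ...
            if attr.tag not in resource:
                return False                     # attribute not advertised: no match
            if not satisfies(resource[attr.tag], attr.get("operator"), attr.get("value")):
                return False
    return True
```

A resource is discarded as soon as one constraint fails; preferences, in contrast, only affect the ranking of the resources that survive this filter.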
The following example shows how to express three preferences, each one
depending on the previous one; we omit details related to parameters and
requirements. In particular, the user prefers a CINT value (related to the
CPU performance) of at least 42 and, if this preference holds, the user
prefers an AMD CPU; finally, if the CPU is an AMD one, the user prefers
a level 2 cache size of at least 1 MB.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Job SYSTEM "gridsim.dtd">
<Job>
<Parameters> ...</Parameters>
<Requirements>...</Requirements>
<Preferences>
<CPU>
<brand value="AMD" operator="equal">
<maxT operation="and">
<minT operation="or">
<depends node="CINT" denied="n"/>
</minT>
</maxT>
</brand>
<cache_L2 value="1" operator="min">
<maxT operation="and">
<minT operation="or">
<depends node="brand" denied="n"/>
</minT>
</maxT>
</cache_L2>
<CINT value="42" operator="min"/>
</CPU>
</Preferences>
</Job>
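The chain of dependencies in this example (CINT, then brand, then cache_L2) can be recovered from the depends elements. The following sketch is only an illustration under stated assumptions: it ignores the denied attribute and the and/or operations, assumes the dependencies form an acyclic graph, and uses hypothetical function names. It relies on graphlib, available in the Python standard library from version 3.9.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter

# Preferences element of the example above, reproduced for self-containment.
PREFS_XML = """
<Preferences>
  <CPU>
    <brand value="AMD" operator="equal">
      <maxT operation="and"><minT operation="or">
        <depends node="CINT" denied="n"/>
      </minT></maxT>
    </brand>
    <cache_L2 value="1" operator="min">
      <maxT operation="and"><minT operation="or">
        <depends node="brand" denied="n"/>
      </minT></maxT>
    </cache_L2>
    <CINT value="42" operator="min"/>
  </CPU>
</Preferences>
"""


def dependency_graph(prefs_xml):
    """Map each preference to the set of preferences it depends on."""
    root = ET.fromstring(prefs_xml)
    graph = {}
    for group in root:                 # CPU, Node, Resource
        for pref in group:             # brand, cache_L2, CINT, ...
            graph[pref.tag] = {d.get("node") for d in pref.iter("depends")}
    return graph


def evaluation_order(prefs_xml):
    """List preferences so that each one follows those it depends on."""
    # static_order() raises graphlib.CycleError if the dependencies are cyclic.
    return list(TopologicalSorter(dependency_graph(prefs_xml)).static_order())
```

Evaluating the preferences in this order guarantees that the preference a node depends on has already been examined when the node itself is considered.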
5 IMPACT OF CP–Nets ON SCHEDULING
In this Section we present the experimental results we obtained. We begin by
describing the experiments that have been carried out, which are characterized
by the following parameters:
r, number of computational resources managed by the grid scheduler;
j, number of jobs submitted to the grid scheduler;
w, workload expressed as number of jobs already running on the grid;
e, boolean value indicating if the CP–Net algorithm is enabled or disabled;
a, number of applications to be simulated;
n, total number of nodes;
c, total number of CPUs.
We designed and carried out 38 different experiments, which are characterized
by r ∈ {500, 1000, 2000}, j ∈ {500, 1000, 2000, 100000},
w ∈ {0, 500, 1000, 2000}, e ∈ {TRUE, FALSE} and a = 24. The first 36
experiments have been run submitting up to 2,000 jobs on small, medium and
large grids: for r = 500, n = 13,567 and c = 365,204; for r = 1,000,
n = 27,766 and c = 729,084; for r = 2,000, n = 52,642 and c = 1,330,148. The
last two experiments have been run submitting 100,000 jobs on a large grid:
r = 2,000, n = 52,737 and c = 1,404,214.
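The count of 38 experiments is consistent with Tables 1–7 if, for each grid size r, the initial workload w is either 0 or equal to r. The sketch below enumerates the configurations under that reading; the pairing of w with r is inferred from the table captions rather than stated explicitly, and the function name is hypothetical.

```python
from itertools import product


def experiment_grid():
    """Enumerate (r, j, w, e) tuples, assuming w is either 0 or equal to r."""
    configs = []
    for r, j, e in product((500, 1000, 2000), (500, 1000, 2000), (True, False)):
        for w in (0, r):
            configs.append((r, j, w, e))
    # Two scalability runs: 100000 jobs on the unloaded 2000-resource grid.
    configs += [(2000, 100000, 0, e) for e in (True, False)]
    return configs
```

Three grid sizes, three job counts, two workloads and two settings of e give 36 configurations, to which the two large runs are added.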
The hardware used in the first 36 experiments consists of three SMP (Sym-
metric Multi-Processor) nodes configured with two Intel Itanium 2 single core
processors 1.4 GHz with 1.5 MB level 3 cache and 4 GB of main memory. One
of the nodes was dedicated to the execution of our grid simulator, another one
to the back-end PostgreSQL database and the last one was used to issue the
user’s requests to the grid simulator. In order to stress the simulator, all of the
requests were issued concurrently. For the last two experiments, the hardware
used consists of three SMP nodes configured with two Intel Xeon E5520 dual
core processors 2.27 GHz with 8 MB level 3 cache and 8 GB of main memory.
Tables 1–7 and Figures 5–14 summarize the results obtained. In these tables,
the total time for scheduling the jobs, the average time to schedule one job
and the standard deviation are expressed in seconds. It is worth clarifying
here that our focus is on matchmaking, not on scheduling. We therefore measure
the impact of CP–Nets on scheduling by carrying out pairs of experiments,
respectively enabling and disabling the CP–Net algorithm (parameter e), in
order to quantify the net effect of enabling it and to demonstrate that it is
negligible. We schedule the jobs using the simplest round-robin strategy
applied to the set of resources returned by the CP–Net algorithm, which are
ranked according to their matching degrees; i.e., each job is scheduled on the
first available resource returned by the algorithm. Given that this is the
simplest possible scheduling strategy, our aim is not to compare different
scheduling algorithms but, rather, to determine experimentally the time
required to schedule all of the jobs, the average time and the standard
deviation to schedule a job, and resource utilization when CP–Nets matchmaking
is taken into account.
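The placement rule described above can be sketched as follows. This is a minimal illustration, not the simulator's code: the function name, the representation of resources as hostnames, and the assumption that a resource is available when it has enough free CPUs are all hypothetical.

```python
def schedule(job_cpus, ranked_resources, free_cpus):
    """Place a job on the first ranked resource with enough free CPUs.

    ranked_resources: hostnames ordered by decreasing matching degree.
    free_cpus: dict mapping hostname to free CPUs, updated in place.
    Returns the chosen hostname, or None if no resource is available.
    """
    for host in ranked_resources:
        if free_cpus.get(host, 0) >= job_cpus:
            free_cpus[host] -= job_cpus   # reserve the CPUs for this job
            return host
    return None
```

Because the list is already ranked by the matchmaker, the scheduler itself adds no further selection logic, which keeps the measured difference attributable to matchmaking.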
We begin by discussing the results related to Table 1 and Figures 5–7. The
figures are histograms in which we plot the distribution of data as
frequencies. On the x and y axes we plot, respectively, the time in seconds to
schedule a job (rounded to the nearest integer) and the number of jobs that
required that time to be scheduled.
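The binning used for these histograms amounts to counting jobs per rounded scheduling time. A minimal sketch follows; the function name is hypothetical, times are assumed to be non-negative, and the treatment of exact halves (rounded up here) is an assumption, since the text does not specify it.

```python
from collections import Counter


def histogram(times_in_seconds):
    """Count jobs per scheduling time, rounded to the nearest integer second."""
    # int(t + 0.5) rounds halves up; Python's round() would round halves to even.
    return Counter(int(t + 0.5) for t in times_in_seconds)
```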
For a grid consisting of 500 resources with no initial workload (Table 1
and Figures 5–7), enabling the CP–Net algorithm leads, as expected, to an
increase of the total time required to schedule the jobs. However, the rate of
increase is not directly proportional to the number of submitted jobs. When
submitting 500 jobs, the rate of increase is 90.02%; on average it takes 5.05
seconds to schedule a job (versus 2.66 with CP–Nets disabled) and the standard
deviation is 8.47 seconds (versus 5.29). The rate of increase is only 28.59%
when submitting 1000 jobs. Correspondingly, on average it takes 4.16 seconds
to schedule a job (versus 3.24 with CP–Nets disabled) and the standard
deviation is 5.61 seconds (versus 5.76). Therefore, in this case the scheduling
process appears to be more uniform, with less dispersion around the mean
value w.r.t. the same experiment in which CP–Nets are not enabled. Finally,
when submitting 2000 jobs the rate of increase becomes 75.76%, the average
time to schedule a job is 5.60 seconds (versus 3.19 with CP–Nets disabled)
and the standard deviation is 6.57 seconds (versus 3.28). From these
experimental results we conclude that, for this grid, increasing the number of
submitted jobs reduces the overhead associated with the CP–Net algorithm down
to (probably) a minimum value, after which the overhead increases again; the
overhead thus behaves like a function that first decreases and then increases.
As can be seen in Figures 5–7, the majority of the submitted jobs require a
few seconds to be scheduled, with or without the CP–Net algorithm. The figures
also show the presence of outliers, i.e., a few jobs requiring more time to be
scheduled.
We now discuss resource utilization. Since the applications in our simulations
are not installed on each resource, a complete utilization of the full set of
available resources is not possible. In the experiments, enabling the CP–Net
algorithm reduces the number of resources used from 383 to 293 when submitting
500 jobs, from 484 to 420 for 1000 jobs and from 486 to 482 for 2000 jobs. The
number of resources used therefore increases monotonically with the number of
submitted jobs, and the difference in resource utilization due to the CP–Net
algorithm becomes negligible as the number of submitted jobs increases.
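The rate of increase quoted above is the relative difference between the total scheduling times with the CP–Net algorithm enabled and disabled. As a worked check, the following sketch recomputes it from the total times reported in Table 1; the function and variable names are introduced here for illustration only.

```python
def percent_increase(disabled, enabled):
    """Relative increase of `enabled` over `disabled`, in percent."""
    return (enabled - disabled) / disabled * 100.0


# Total scheduling times in seconds from Table 1, keyed by number of jobs:
# (CP-Net disabled, CP-Net enabled).
TABLE_1 = {
    500: (1329.29, 2525.93),
    1000: (1618.02, 2080.67),
    2000: (1592.54, 2798.98),
}

rates = {jobs: round(percent_increase(d, e), 2) for jobs, (d, e) in TABLE_1.items()}
```

The resulting values, 90.02, 28.59 and 75.76, agree with the % Difference row of Table 1.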
Regarding the rate of increase of the scheduling time associated with the
CP–Net algorithm, the same pattern can be observed in the results related to
Table 2 and to panel (b) of Figures 5–7. These results are related to the same
grid consisting of 500 resources, but in the corresponding experiments there is
an initial workload of 500 jobs already running on the grid before submitting
new jobs. While the rate of increase is higher than for the unloaded grid, we
observe that the average time required to schedule a job is almost always
lower. Resource utilization is also better, with almost no difference when
increasing the number of submitted jobs, and very close to the maximum possible
(given that, as already stated, full resource utilization is not possible).
Jobs                                    500      1000      2000
Time (CP–Net disabled)              1329.29   1618.02   1592.54
Time (CP–Net enabled)               2525.93   2080.67   2798.98
Difference                          1196.64    462.64   1206.44
% Difference                          90.02     28.59     75.76
Average (CP–Net disabled)              2.66      3.24      3.19
Std deviation (CP–Net disabled)        5.29      5.76      3.28
Average (CP–Net enabled)               5.05      4.16      5.60
Std deviation (CP–Net enabled)         8.47      5.61      6.57
Resources (CP–Net disabled)             383       484       486
Resources (CP–Net enabled)              293       420       482
Difference                              -90       -64        -4
% Difference                         -23.49    -13.22     -0.82
Table 1: Unloaded grid consisting of 500 resources
Jobs                                    500      1000      2000
Time (CP–Net disabled)               945.63   1293.75   1399.59
Time (CP–Net enabled)               1946.44   2355.04   2687.59
Difference                          1000.81   1061.29   1287.99
% Difference                         105.83     82.03     92.03
Average (CP–Net disabled)              1.89      2.59      2.80
Std deviation (CP–Net disabled)        4.13      6.19      6.68
Average (CP–Net enabled)               3.89      4.71      5.38
Std deviation (CP–Net enabled)         7.51      9.40     10.83
Resources (CP–Net disabled)             484       484       489
Resources (CP–Net enabled)              418       483       488
Difference                              -66        -1        -1
% Difference                         -13.63      -0.2      -0.2
Table 2: Grid consisting of 500 resources, initial workload of 500 jobs
We now analyze the results obtained for a grid consisting of 1000 resources,
with no initial workload. These results are provided in Table 3 and Figures
8–10. As shown, with this larger number of computational resources the rate of
increase decreases monotonically when the number of submitted jobs grows from
500 to 2000. The average time to schedule a job using the CP–Net algorithm is,
respectively, 13.75, 17.07 and 17.29 seconds (versus 8.96, 11.67 and 11.90
without CP–Net) for 500, 1000 and 2000 submitted jobs. The difference in
resource utilization is small for 500 and 2000 jobs (respectively 31 and 85
fewer resources utilized with CP–Net enabled) and larger for 1000 submitted
jobs (162 fewer resources). The number of resources utilized increases steadily
with the number of submitted jobs, reaching 887 resources (out of 1000) for
2000 jobs.
When considering the same grid consisting of 1000 resources (Table 4 and
Figures 8–10), this time with an initial workload of 1000 jobs already running
on the grid before submitting new jobs, we obtained the following results. As
in the previous case, the rate of increase is monotonically decreasing when
increasing the number of submitted jobs from 500 to 2000. Moreover, the
average time required to schedule a job is lower than the corresponding time
in the previous case, and resource utilization is again steadily increasing with
the number of submitted jobs, reaching 972 resources (out of 1000) for 2000
jobs.
Regarding the results obtained for a grid consisting of 2000 resources, with
no initial workload and with an initial workload of 2000 jobs (Tables 5–6 and
Figures 11–13), we note a marked reduction of the rate of increase of the total
time required to schedule the jobs with respect to the small (500 resources)
and the medium (1000 resources) sized grids in all of the experiments with
500, 1000 and 2000 submitted jobs. The average time to schedule a job and
the standard deviation when using the CP–Net algorithm are only slightly larger
than the corresponding values without it, and the increase is negligible.
Resource utilization is also quite good. The utilization factor increases with
the number of submitted jobs and, for the grid with no initial workload, in the
worst case (2000 jobs submitted) there is only a -15.39% difference between the
corresponding experiments with and without CP–Net. For the grid with an initial
workload of 2000 jobs, the worst case happens when submitting 1000 jobs, with a
similar -15.79% difference. Interestingly, the percentage difference is the
lowest (-4.05%) when submitting 2000 jobs.
[Figure: histograms of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled; panels: (a) grid unloaded, (b) grid workload: 500 jobs]
Fig. 5: 500 jobs on a grid consisting of 500 resources
Jobs                                    500      1000      2000
Time (CP–Net disabled)              4478.43   5832.93   5948.85
Time (CP–Net enabled)               6874.56   8533.51   8644.94
Difference                          2396.12   2700.58   2696.09
% Difference                          53.50     46.30     45.32
Average (CP–Net disabled)              8.96     11.67     11.90
Std deviation (CP–Net disabled)       27.96     36.77     37.33
Average (CP–Net enabled)              13.75     17.07     17.29
Std deviation (CP–Net enabled)        36.73     47.50     46.35
Resources (CP–Net disabled)             416       791       972
Resources (CP–Net enabled)              385       629       887
Difference                              -31      -162       -85
% Difference                          -7.45    -20.48     -8.74
Table 3: Unloaded grid consisting of 1000 resources
Jobs                                    500      1000      2000
Time (CP–Net disabled)              3639.75   5067.47   5525.62
Time (CP–Net enabled)               5719.73   7587.18   7612.77
Difference                          2079.98   2519.71   2087.15
% Difference                          57.15     49.72     37.77
Average (CP–Net disabled)              7.28     10.13     11.05
Std deviation (CP–Net disabled)       23.01     32.11     35.12
Average (CP–Net enabled)              11.44     15.17     15.23
Std deviation (CP–Net enabled)        29.85     40.26     40.90
Resources (CP–Net disabled)             776       971       974
Resources (CP–Net enabled)              620       887       972
Difference                             -156       -84        -2
% Difference                          -20.1     -8.65      -0.2
Table 4: Grid consisting of 1000 resources, initial workload of 1000 jobs
Jobs                                    500      1000      2000
Time (CP–Net disabled)              34256.4   63764.1    132312
Time (CP–Net enabled)               38972.8   76280.9    153356
Difference                          4716.47   12516.8   21044.2
% Difference                          13.76     19.62      15.9
Average (CP–Net disabled)             68.51     63.76     66.15
Std deviation (CP–Net disabled)      179.16    177.42    184.17
Average (CP–Net enabled)              77.94     76.28     76.67
Std deviation (CP–Net enabled)       186.84    192.03    193.25
Resources (CP–Net disabled)             445       852      1559
Resources (CP–Net enabled)              435       795      1319
Difference                              -10       -57      -240
% Difference                          -2.24     -6.69    -15.39
Table 5: Unloaded grid consisting of 2000 resources
[Figure: histograms of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled; panels: (a) grid unloaded, (b) grid workload: 500 jobs]
Fig. 6: 1000 jobs on a grid consisting of 500 resources
Finally, Table 7 and Figure 14 refer to the last two experiments, carried out
on an unloaded grid consisting of 2000 resources. To assess the scalability of
the CP–Net algorithm, we submitted 100000 jobs. As shown, the time required to
schedule all of the jobs using the CP–Net algorithm is only 18.41% more than
the corresponding time without the algorithm. Average time and standard
deviation increase slightly, and resource utilization is in this case even
slightly better, with more resources utilized when running the experiment with
the CP–Net algorithm enabled.
[Figure: histograms of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled; panels: (a) grid unloaded, (b) grid workload: 500 jobs]
Fig. 7: 2000 jobs on a grid consisting of 500 resources
[Figure: histograms of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled; panels: (a) grid unloaded, (b) grid workload: 1000 jobs]
Fig. 8: 500 jobs on a grid consisting of 1000 resources
6 RELATED WORK
In this paper we strictly deal with the matchmaking process in the context
of grid scheduling, not with scheduling algorithms in general; therefore, we
discuss in this Section relevant work in the field of matchmaking only.
[Figure: histograms of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled; panels: (a) grid unloaded, (b) grid workload: 1000 jobs]
Fig. 9: 1000 jobs on a grid consisting of 1000 resources
In the field of Artificial and Computational Intelligence, the earliest results
in matchmaking include [38], [22], [5], [41], [43] and [35]. Agent-Based
Software Interoperability (ABSI) [38] takes advantage of the KQML (Knowledge
Query and Manipulation Language) specification and uses KIF (Knowledge
Interchange Format) as content language. Matchmaking of advertisements and
users’ requests happens through the unification of equality predicates.
[Figure: histograms of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled; panels: (a) grid unloaded, (b) grid workload: 1000 jobs]
Fig. 10: 2000 jobs on a grid consisting of 1000 resources
COIN [22] is a system in which matchmaking is based on a unification process
quite similar to the one carried out by the Prolog programming language.
InfoSleuth [5] uses KIF as the content language; the matchmaking process is
based on solving a constraint satisfaction problem, so that an advertisement
and a request match if the user’s constraints are satisfied.
Jobs                                    500      1000      2000
Time (CP–Net disabled)              32767.8   61576.5    125502
Time (CP–Net enabled)               38217.1   75871.9    145591
Difference                           5449.3   14295.4     20089
% Difference                          16.63     23.21    16.007
Average (CP–Net disabled)             65.53     61.57     62.75
Std deviation (CP–Net disabled)      173.91    171.65     174.7
Average (CP–Net enabled)              76.43     75.87     72.79
Std deviation (CP–Net enabled)       185.39    191.97     184.4
Resources (CP–Net disabled)             855      1564      1948
Resources (CP–Net enabled)              794      1317      1869
Difference                              -61      -247       -79
% Difference                          -7.13    -15.79     -4.05
Table 6: Grid consisting of 2000 resources, initial workload of 2000 jobs
Jobs                                 100000
Time (CP–Net disabled)              9949690
Time (CP–Net enabled)              11782200
Difference                          1832480
% Difference                          18.41
Average (CP–Net disabled)             99.49
Std deviation (CP–Net disabled)      324.07
Average (CP–Net enabled)             117.82
Std deviation (CP–Net enabled)       370.76
Resources (CP–Net disabled)            1950
Resources (CP–Net enabled)             1954
Difference                                4
% Difference                            0.2
Table 7: Unloaded grid consisting of 2000 resources, 100000 submitted jobs
The Service Description Language has been proposed in [41] to describe
available services. Here, matchmaking requires determining k-nearest services
for a request according to the distance between the service names (pairs of
verb and noun terms) and the request. Capability Description Language was
proposed in [48]. It supports reasoning through the notions of subsumption
and instantiation.
Language for Advertisement and Request for Knowledge Sharing (LARKS)
appeared in [43] and is able to describe both service capabilities and service
requests. It is based on the ITL (Information Terminological Language) con-
cept language [42]. LARKS exploits the relations among concepts in order to
compute semantic similarities.
Traditionally, service and resource discovery have been carried out using
methods based on name and keyword matchmaking. A semantic matchmaking
framework based on DAML-S, a DAML (DARPA Agent Markup Language)-based language
for service description, has been proposed in [35]. In this ontology-based
matchmaking framework an advertisement matches a request when the service or
resource provided by the advertisement can provide a certain degree of
usefulness to the requester. When performing matchmaking, the system uses the
outputs and inputs of the advertisement and the request based on the ontologies
available, and, through the subsumption relationship of one concept of the
input/output of the advertisement and one concept of the input/output of the
request, is able to determine four different levels of matching: exact,
plug-in, subsume, and fail.
[Figure: histograms of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled; panels: (a) grid unloaded, (b) grid workload: 2000 jobs]
Fig. 11: 500 jobs on a grid consisting of 2000 resources
[Figure: histograms of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled; panels: (a) grid unloaded, (b) grid workload: 2000 jobs]
Fig. 12: 1000 jobs on a grid consisting of 2000 resources
Another ontology-based matchmaking service is presented in [18]. It uses
separate ontologies to declaratively describe resources and job requests.
Instead of exact syntax matching, their ontology-based matchmaker performs
semantic matching using terms defined in ontologies. The loose coupling between
resource and request descriptions removes the tight coordination requirement
between resource providers and consumers. The authors designed and prototyped
their matchmaking service using TRIPLE to use ontologies encoded in the W3C’s
Resource Description Framework (RDF) and rules (based on Horn logic and
F-logic) for resource matching. Resource descriptions, request descriptions,
and usage policies are all independently modeled and syntactically and
semantically described using RDF Schema. Inference rules are utilized for
reasoning about the characteristics of a request, available resources, and
usage policies to find a resource that satisfies the request requirements.
[Figure: histograms of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled; panels: (a) grid unloaded, (b) grid workload: 2000 jobs]
Fig. 13: 2000 jobs on a grid consisting of 2000 resources
[Figure: histogram of the number of jobs (y axis) versus seconds (x axis), with CP-Net enabled and disabled]
Fig. 14: Unloaded grid consisting of 2000 resources, 100000 submitted jobs
In [3] the authors implemented a matchmaking service in an intelligent
grid environment, the BondGrid [4]. Their matchmaking framework is based
on a resource specification component, a request specification component, and
matchmaking algorithms. The request specification includes a matchmaking
function and possibly two additional constraints, namely a cardinality
threshold and a matching degree threshold. The cardinality threshold specifies
how many resources the requestor expects from the matchmaking service. The
matching degree threshold specifies the minimum matching degree that a returned
resource must have. The input of the matchmaking algorithm is the request and
the grid resource instances stored in a knowledge base; the algorithm evaluates
the request function in the context of each resource instance. The output is a
number of grid resources, which are ranked according to their matching degrees.
The matchmaking service returns to the requester the grid resources that have
the n largest matching degrees, where n is the cardinality threshold specified
by the request.
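The selection step described for this framework can be sketched as follows. This is an illustration under stated assumptions rather than the BondGrid implementation: the function name, the representation of the matchmaking function as a Python callable returning a numeric degree, and the dictionary-based resources are hypothetical.

```python
def top_matches(request_fn, resources, cardinality, min_degree=0.0):
    """Return up to `cardinality` (degree, resource) pairs, best first.

    request_fn: callable evaluating one resource and returning its matching degree.
    min_degree: matching degree threshold; resources below it are discarded.
    """
    scored = [(request_fn(res), res) for res in resources]
    kept = [(deg, res) for deg, res in scored if deg >= min_degree]
    kept.sort(key=lambda pair: pair[0], reverse=True)   # rank by matching degree
    return kept[:cardinality]
```

For example, with a matchmaking function that scores a resource by its number of cores relative to 16, a cardinality threshold of 2 and a matching degree threshold of 0.3, resources with 4, 16 and 8 cores yield the 16-core and 8-core resources, in that order.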
In [30], the authors discuss the problem of matchmaking for mathematical
services, where the semantics play a critical role in determining the applica-
bility or otherwise of a service and for which they use OpenMath descriptions
of pre- and post-conditions. A matchmaking architecture supporting the use
of match plug-ins is described, along with five kinds of plug-in that have been
developed for this purpose: (i) a basic structural match, (ii) a syntax and
ontology match, (iii) a value substitution match, (iv) an algebraic equivalence
match and (v) a decomposition match. Their matchmaker uses the individual
match scores from the plug-ins to compute a ranking by applicability of the
services. The authors consider the effect of pre- and post-conditions of mathe-
matical service descriptions on matching, and how and why to reduce queries
into Disjunctive Normal Form (DNF) before matching. Finally, a case study
demonstrates in detail how the matching process works.
Trust-aware matchmaking is the subject of [2]. The authors present a
peer-to-peer trust brokering system, in which the network of trust brokers
operates by providing peer reviews in the form of recommendations regarding potential
resource targets. One of the distinguishing features of this work is that it sepa-
rately models the accuracy and honesty concepts, so that their model is able to
significantly improve the performance. The trust brokering system is applied
to a resource manager in order to illustrate its utility in a public-resource Grid
environment. The simulations performed to evaluate the trust-aware resource
matchmaking strategies indicate that high levels of robustness can be attained
by considering trust while matchmaking and allocating resources.
The Condor high-throughput resource management system for compute-
intensive jobs [45] requires task submissions to be specified in description files
containing basic information and task requirements. The latter are translated
to classified advertisements (ClassAds) that are sets of named expressions.
ClassAds, which map attribute names to expressions, are also used to express
characteristics of resources, and a matchmaking service matches task and
resource-related ClassAds to determine the proper resources where tasks can be
executed. A Constraint attribute in a ClassAd is evaluated against the ClassAd
it is being matched with, and the two ClassAds match when the Constraint
attributes of both evaluate to true. Another attribute, Rank, measures the
quality of a match: the larger its value, the better the match. Condor requires
a provider and a requester to know each other’s ClassAd structure. The
evaluation result of the Rank attribute is, in general, not normalized and
cannot tell explicitly how well two ClassAds match.
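The symmetric evaluation of Constraint and the use of Rank can be illustrated with a small sketch. It does not use the ClassAd language or any Condor API: advertisements are modeled here as Python dictionaries whose Constraint and Rank entries are callables, and all names and attribute values are hypothetical.

```python
def match(ad_a, ad_b):
    """Two ads match when each ad's Constraint accepts the other ad."""
    return ad_a["Constraint"](ad_b) and ad_b["Constraint"](ad_a)


def best_match(request, offers):
    """Among the matching offers, pick the one with the largest Rank value."""
    candidates = [offer for offer in offers if match(request, offer)]
    return max(candidates, key=request["Rank"], default=None)


# Hypothetical ads: the job wants a Linux machine with enough memory and
# ranks machines by memory; each machine accepts jobs up to its own memory.
job = {
    "ImageSize": 2,
    "Constraint": lambda m: m["Memory"] >= 2 and m["OpSys"] == "LINUX",
    "Rank": lambda m: m["Memory"],
}
machines = [
    {"Name": "m1", "Memory": 4, "OpSys": "LINUX",
     "Constraint": lambda j: j["ImageSize"] <= 4},
    {"Name": "m2", "Memory": 8, "OpSys": "LINUX",
     "Constraint": lambda j: j["ImageSize"] <= 8},
    {"Name": "m3", "Memory": 16, "OpSys": "SOLARIS",
     "Constraint": lambda j: True},
]
```

Since Rank is an arbitrary expression, its value only orders the candidates of one request; it says nothing about how good the best candidate is in absolute terms, which is the normalization issue noted above.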
Matchmaking in Condor supports selecting only one resource. In [28], an
extension, called set-extended ClassAd syntax, was proposed in order to
support multiple resource selection. Matchmaking works by evaluating a
set-extended ClassAd against a set of ClassAds and returning the ClassAd set
with the highest rank. However, when the size of the ClassAd set is large,
evaluating all of the possible combinations is infeasible; in this case, a
simple greedy heuristic is used to search for the ClassAd set providing the
highest rank.
In [27] the authors present a new approach to symmetric matching that
achieves significant advances in expressiveness relative to ClassAds. It allows
multi-way matches, expression and location of resources with negotiable
capability. The key to their approach is reinterpreting matching as a constraint
problem and exploiting constraint-solving technologies to implement matching
operations. A prototype matchmaking mechanism, named Redline, has been
implemented and used to model and solve several challenging matching prob-
lems.
GREEN [11] matches a job demand with a grid resource supply on the
basis of a characterization of resources by means of their performance,
evaluated through benchmarks relevant to the application. The matchmaking
service is based on a two-level benchmarking methodology; a requestor specifies
both syntactic and performance requirements, independently of the underlying
middleware. GREEN fosters Grid interoperability through the use of JSDL to
express job submission requirements, and an internal translation to the job
submission languages used by the target middleware. Middleware independence
is pursued through an extension of JSDL based on the GLUE 2.0 schema.
Moreover, some extensions to JSDL related to concurrency aspects were
borrowed from the JSDL SPMD Application Extension, in order to support
execution of parallel applications.
A resource selection system for exploiting graphics processing units (GPUs)
as general-purpose computational resources in desktop Grid environments is
presented in [21]. The system allows Grid users to share remote GPUs, which
are traditionally dedicated to local users who directly see the display output.
The key contribution of that work is a novel system for non-dedicated
environments. The authors first show criteria for defining idle GPUs from the
Grid users’ perspective. Based on these criteria, the system uses a screen-
saver approach with some sensors that detect idle resources at a low overhead.
The idea for this lower overhead is to avoid GPU intervention during resource
monitoring. Detected idle GPUs are then selected according to a matchmaking
service, making the system adaptive to the rapid advance of GPU architecture.
Though the system itself is not yet interoperable with current desktop Grid
systems, the idea can be applied to screensaver-based systems such as BOINC.
The system has been evaluated using Windows PCs with three generations of
NVIDIA GPUs. The experimental results show that it achieves a low overhead
of at most 267 ms, minimizing interference to local users while maximizing the
performance delivered to Grid users. Some case studies are also performed in
an office environment to demonstrate the effectiveness of the system in terms
of the amount of detected idle time.
7 CONCLUSIONS
In this paper we dealt with the problem of conditional preference matchmaking
of computational resources belonging to a grid. We introduced CP–Nets,
a recent development in the field of Artificial Intelligence, as a means to deal
with users' preferences in the context of grid scheduling. We discussed
CP–Nets from a theoretical perspective and then analyzed, qualitatively and
quantitatively, their impact on the matchmaking process, with the help of a grid
simulator we developed for this purpose. Many different experiments have been
set up and carried out, and we report here our main findings and the lessons
learnt.
1. Introducing CP–Nets in the matchmaking process is feasible. The
overhead associated with CP–Nets is minimal, considering that the average
time to schedule a job in all of our experiments ranges from 3.89 seconds (best
case) to 117.82 seconds (worst case, less than two minutes). We also note
here that, besides requiring a few seconds, the average time to schedule a
job when using CP–Nets is almost always close to the average time to
schedule a job without CP–Nets, and is never more than twice this
value. Moreover, the outcome optimization query is polynomial (linear) in
its input for the particular case related to our experiments, and the average
time to schedule a job is not directly proportional to the number of
submitted jobs.
2. Bigger grids are well suited to the use of CP–Nets in the matchmaking
process. Compared to smaller grids, bigger ones exhibit a reduced
rate of increase of the scheduling time associated with the CP–Net algorithm.
3. Resource utilization does not decrease excessively when using
CP–Nets. Overall, resource utilization is extremely good, ranging from no
difference at all (best case) to a maximum difference of 23.49%. For bigger
grids and workloads, resource utilization is close to the maximum possible.
4. Grids with an initial workload provide better performance than
unloaded ones. Interestingly, grids which are already busy executing a
previous workload react better to conditional preference matchmaking,
leading in almost all of the experiments to a reduced average time to
schedule a job.
Therefore, we conclude that CP–Nets can be a useful tool to ensure that
users' preferences are met in the matchmaking process, so that scheduling may
provide results more appealing to the end users, with minimal overhead.
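The linearity of the outcome optimization query mentioned in the first finding can be illustrated with the standard forward sweep for acyclic CP–Nets [6]: variables are visited in an order compatible with the dependency graph, and each one receives its most preferred value given the values already assigned to its parents. The sketch below is a minimal illustration under that assumption and with no additional hard constraints; it is not the code of our simulator, and the variable names and values (OS, Arch, Site) are hypothetical.

```python
# Illustrative sketch of the forward sweep for an acyclic CP-Net.
# Variable names and preference tables are hypothetical.
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def optimal_outcome(parents, cpt):
    """parents: variable -> tuple of parent variables.
    cpt: variable -> {tuple of parent values: values, most preferred first}.
    Each variable is visited once, after all of its parents."""
    outcome = {}
    for var in TopologicalSorter(parents).static_order():
        context = tuple(outcome[p] for p in parents[var])
        outcome[var] = cpt[var][context][0]  # best value in this context
    return outcome

parents = {"OS": (), "Arch": ("OS",), "Site": ("OS", "Arch")}
cpt = {
    "OS": {(): ["linux", "windows"]},
    "Arch": {
        ("linux",): ["x86_64", "arm"],
        ("windows",): ["arm", "x86_64"],
    },
    "Site": {
        ("linux", "x86_64"): ["siteA", "siteB"],
        ("linux", "arm"): ["siteB", "siteA"],
        ("windows", "x86_64"): ["siteB", "siteA"],
        ("windows", "arm"): ["siteB", "siteA"],
    },
}

best = optimal_outcome(parents, cpt)
print(best)
```

Each variable is assigned exactly once with a single table lookup, so the sweep is linear in the number of variables once a topological order is available. A cyclic dependency graph would make static_order raise an exception, which matches the restriction to acyclic networks assumed here.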
Acknowledgment
The authors would like to thank the anonymous reviewers for their useful,
constructive comments, which greatly helped to improve the quality of this paper.
References
1. A. Anjomshoaa, F. Brisard, M. Drescher, D. Fellows, A. Ly, S. McGough, D. Pulsipher,
and A. Savva. Job submission description language (jsdl), specification, version 1.0.
Global Grid Forum Working Draft, 2005.
2. F. Azzedin, M. Maheswaran, and A. Mitra. Trust brokering and its use for resource
matchmaking in public-resource grids. Journal of Grid Computing, 4:247–263, 2006.
3. X. Bai, H. Yu, Y. Ji, and D. C. Marinescu. Resource matching and a matchmaking ser-
vice for an intelligent grid. In International Conference on Computational Intelligence,
pages 262–265. International Computational Intelligence Society, 2004.
4. X. Bai, H. Yu, G. Wang, Y. Ji, D. Marinescu, and L. Bölöni. Intelligent grids. In Grid
Computing: Software Environments and Tools, pages 45–74. Springer, 2005.
5. R. J. Bayardo, Jr., W. Bohrer, R. Brice, A. Cichocki, J. Fowler, A. Helal, V. Kashyap,
T. Ksiezyk, G. Martin, M. Nodine, M. Rashid, M. Rusinkiewicz, R. Shea, C. Unnikr-
ishnan, A. Unruh, and D. Woelk. Infosleuth: agent-based semantic integration of in-
formation in open and dynamic environments. SIGMOD Rec., 26(2):195–206, June
1997.
6. C. Boutilier, R. I. Brafman, C. Domshlak, H. H. Hoos, and D. Poole. Cp-nets: a tool
for representing and reasoning with conditional ceteris paribus preference statements.
J. Artif. Int. Res., 21:135–191, February 2004.
7. C. Boutilier, R. I. Brafman, H. H. Hoos, and D. Poole. Reasoning with conditional
ceteris paribus preference statements. In K. B. Laskey and H. Prade, editors, UAI,
pages 71–80. Morgan Kaufmann, 1999.
8. R. Buyya and M. Murshed. Gridsim: a toolkit for the modeling and simulation of
distributed resource management and scheduling for grid computing. Concurrency and
Computation: Practice and Experience, 14(13-15):1175–1220, 2002.
9. D. G. Cameron, A. P. Millar, C. Nicholson, R. Carvajal-Schiaffino, K. Stockinger, and
F. Zini. Analysis of scheduling and replica optimisation strategies for data grids using
optorsim. Journal of Grid Computing., 2(1):57–69, 2004.
10. H. Casanova, A. Legrand, and M. Quinson. Simgrid: A generic framework for large-
scale distributed experiments. In Proceedings of the Tenth International Conference
on Computer Modeling and Simulation, UKSIM ’08, pages 126–131. IEEE Computer
Society, 2008.
11. A. Clematis, A. Corana, D. D’Agostino, A. Galizia, and A. Quarati. Job-resource
matchmaking on grid through two-level benchmarking. Future Gener. Comput. Syst.,
26(8):1165–1179, Oct. 2010.
12. K. Czajkowski, S. Fitzgerald, I. Foster, and C. Kesselman. Grid information services
for distributed resource sharing. In High Performance Distributed Computing, 2001.
Proceedings. 10th IEEE International Symposium on, pages 181 –194, 2001.
13. H. Dail, O. Sievert, F. Berman, H. Casanova, A. YarKhan, S. Vadhiyar, J. Dongarra,
C. Liu, L. Yang, D. Angulo, and I. Foster. Scheduling in the grid application devel-
opment software project. In J. Nabrzyski, J. M. Schopf, and J. Weglarz, editors, Grid
resource management, pages 73–98. Kluwer Academic Publishers, Norwell, MA, USA,
2004.
14. C. Dumitrescu and I. Foster. Usage policy-based cpu sharing in virtual organizations. In
Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing, GRID
’04, pages 53–60, Washington, DC, USA, 2004. IEEE Computer Society.
15. E. Elmroth and J. Tordsson. Grid resource brokering algorithms enabling advance
reservations and resource selection based on performance predictions. Future Generation
Computer Systems, 24:585–593, 2008.
16. I. Foster and C. Kesselman. The Grid: Blueprint for a New Computing
Infrastructure (Elsevier Series in Grid Computing).
Morgan Kaufmann, 2nd edition, 2003.
17. I. Foster, C. Kesselman, and S. Tuecke. The Anatomy of the Grid: Enabling Scalable
Virtual Organizations. International Journal of High Performance Computing Appli-
cations, 15(3):200–222, 2001.
18. A. Harth, S. Decker, Y. He, H. Tangmunarunkit, and C. Kesselman. A semantic match-
maker service on the grid. In Proceedings of the 13th international World Wide Web
conference on Alternate track papers & posters, WWW Alt. ’04, pages 326–327. ACM,
2004.
19. A. Iosup and D. Epema. Grenchmark: A framework for analyzing, testing, and com-
paring grids. In Cluster Computing and the Grid, IEEE International Symposium on,
pages 313–320. IEEE Computer Society, 2006.
20. D. Klusáček and H. Rudová. Alea 2: job scheduling simulator. In Proceedings of the
3rd International ICST Conference on Simulation Tools and Techniques, SIMUTools
’10, pages 61:1–61:10, ICST, Brussels, Belgium, 2010. ICST (Institute for
Computer Sciences, Social-Informatics and Telecommunications Engineering).
21. Y. Kotani, F. Ino, and K. Hagihara. A resource selection system for cycle stealing in
gpu grids. Journal of Grid Computing, 6:399–416, 2008.
22. D. Kuokka and L. Harada. Matchmaking for information agents. In Proceedings of
the 14th international joint conference on Artificial intelligence - Volume 1, IJCAI’95,
pages 672–678. Morgan Kaufmann Publishers Inc., 1995.
23. K. Kurowski, J. Nabrzyski, A. Oleksiak, and J. Weglarz. Grid scheduling simulations
with gssim. In Proceedings of the 13th International Conference on Parallel and Dis-
tributed Systems - Volume 02, ICPADS ’07, pages 1–8. IEEE Computer Society, 2007.
24. H. Lamehamedi, Z. Shentu, B. Szymanski, and E. Deelman. Simulation of dynamic data
replication strategies in data grids. In Parallel and Distributed Processing Symposium,
2003. Proceedings. International, page 10 pp., april 2003.
25. H. Li and R. Buyya. Model-driven simulation of grid scheduling strategies. In Proceed-
ings of the Third IEEE International Conference on e-Science and Grid Computing,
pages 287–294, Washington, DC, USA, 2007. IEEE Computer Society.
26. H. Li and R. Buyya. Model-based simulation and performance evaluation of grid
scheduling strategies. Future Gener. Comput. Syst., 25:460–465, April 2009.
27. C. Liu and I. Foster. A constraint language approach to matchmaking. In Proceedings
of the 14th International Workshop on Research Issues on Data Engineering: Web
Services for E-Commerce and E-Government Applications (RIDE’04), RIDE ’04, pages
7–14. IEEE Computer Society, 2004.
28. C. Liu, L. Yang, I. Foster, and D. Angulo. Design and evaluation of a resource selec-
tion framework for grid applications. In Proceedings of the 11th IEEE International
Symposium on High Performance Distributed Computing, HPDC ’02, pages 63–. IEEE
Computer Society, 2002.
29. U. Lublin and D. G. Feitelson. The workload on parallel supercomputers: modeling
the characteristics of rigid jobs. J. Parallel Distrib. Comput., 63:1105–1122, November
2003.
30. S. Ludwig, O. Rana, J. Padget, and W. Naylor. Matchmaking framework for mathe-
matical web services. Journal of Grid Computing, 4:33–48, 2006.
31. E. Medernach. Workload analysis of a cluster in a grid environment. In D. Feitelson,
E. Frachtenberg, L. Rudolph, and U. Schwiegelshohn, editors, Job Scheduling Strategies
for Parallel Processing, volume 3834 of Lecture Notes in Computer Science, pages 36–
61. Springer Berlin / Heidelberg, 2005.
32. S. Naqvi and M. Riguidel. Grid security services simulator (g3s) - a simulation
tool for the design and analysis of grid security solutions. In Proceedings of the First
International Conference on e-Science and Grid Computing, E-SCIENCE ’05, pages
421–428. IEEE Computer Society, 2005.
33. L. N. Nassif, J. M. Nogueira, and F. v. V. de Andrade. Resource selection in grid:
a taxonomy and a new system based on decision theory, case-based reasoning, and
fine-grain policies. Concurr. Comput. : Pract. Exper., 21:337–355, March 2009.
34. F. Pacini. Job submission description language attributes, glite specification (submission
through wmproxy service). egee-jra1-tec-590869-jdlattributes-v0-8. EGEE, 2006.
35. M. Paolucci, N. Srinivasan, K. P. Sycara, and T. Nishimura. Towards a semantic chore-
ography of web services: From wsdl to daml-s. In Proceedings of the International
Conference on Web Services, ICWS ’03, June 23 - 26, 2003, Las Vegas, Nevada, USA,
pages 22–26. CSREA Press, 2003.
36. K. Ranganathan and I. Foster. Computation scheduling and data replication algorithms
for data grids. In J. Nabrzyski, J. M. Schopf, and J. Weglarz, editors, Grid resource
management, pages 359–373. Kluwer Academic Publishers, 2004.
37. I. Sfiligoi, D. C. Bradley, B. Holzman, P. Mhashilkar, S. Padhi, and F. Wurthwein. The
pilot way to grid resources using glideinwms. In Proceedings of the 2009 WRI World
Congress on Computer Science and Information Engineering - Volume 02, CSIE ’09,
pages 428–432. IEEE Computer Society, 2009.
38. N. Singh. A common lisp api and facilitator for absi: version 2.0.3. Technical Report
Logic-93-4, Logic Group, Computer Science Department, Stanford University, 1993.
39. M. Solomon. The classad language reference manual v2.1. Computer Sciences Depart-
ment, University of Wisconsin, Madison, USA, 2003.
40. H. J. Song, X. Liu, D. Jakobsen, R. Bhagwan, X. Zhang, K. Taura, and A. Chien. The
microgrid: A scientific tool for modeling computational grids. Sci. Program., 8(3):127–
141, Aug. 2000.
41. V. S. Subrahmanian, P. Bonatti, U. J. Dix, T. Eiter, S. Kraus, and R. Ross. Heteroge-
neous Agent Systems. MIT Press, 2000.
42. K. Sycara, J. Lu, and M. Klusch. Interoperability among heterogeneous software agents
on the internet. Technical Report CMU-RI-TR-98-22, Robotics Institute, Pittsburgh,
PA, October 1998.
43. K. Sycara, S. Widoff, M. Klusch, and J. Lu. Larks: Dynamic matchmaking among
heterogeneous software agents in cyberspace. Autonomous Agents and Multi-Agent
Systems, 5(2):173–203, June 2002.
44. A. Takefusa, S. Matsuoka, H. Nakada, K. Aida, and U. Nagashima. Overview of a per-
formance evaluation system for global computing scheduling algorithms. In High Perfor-
mance Distributed Computing, 1999. Proceedings. The Eighth International Symposium
on, pages 97 –104, 1999.
45. D. Thain, T. Tannenbaum, and M. Livny. Condor and the grid. In F. Berman, G. Fox,
and T. Hey, editors, Grid Computing: Making the Global Infrastructure a Reality. John
Wiley & Sons Inc., December 2002.
46. P. Thysebaert, B. Volckaert, F. de Turck, B. Dhoedt, and P. Demeester. Evaluation
of grid scheduling strategies through nsgrid: a network-aware grid simulator. Neural,
Parallel Sci. Comput., 12(3):353–378, Sept. 2004.
47. C.-M. Wang, H.-M. Chen, C.-C. Hsu, and J. Lee. Dynamic resource selection heuris-
tics for a non-reserved bidding-based grid environment. Future Gener. Comput. Syst.,
26:183–197, February 2010.
48. G. J. Wickler. Using expressive and flexible action representations to reason about
capabilities for intelligent agent cooperation. PhD thesis, University of Edinburgh,
1999.
... Often, when distributing the tasks, a first suitable resource without any optimization is selected [1][2][3][4]. Other algorithms do not take into account the peculiarities of the environment with inalienable resources, i.e. a monitoring of the computing resource utilization dynamics is not proceeded, the task supplier's competition is not taken into account [5,6]. ...
... Planning methods described in [4,[9][10][11][12][13]] use a number of criteria (the value of the use of resources, the level of load of computing resources, the use of nearby resources for related objectives in the task). In the studies [14][15][16][17] the methods of planning, which focus on the preferences of task suppliers, administrators of virtual organizations or suppliers of computing resources are suggested to be used. ...
Article
Full-text available
An information distribution task technology for GRIDsystems based on the use of simulation modeling GRASS environment was proposed. GRASS reproduces the process of functioning over time of elementary events that occur in the GRID-system with maintaining their interaction logic. This solution enables conducting of computational experiments that implement different methods of distribution, with a following selecting of the most effective solution on the basis of the collection, analysis and interpretation of simulation results. The proposed task of distribution technology using simulation modeling GRASS environment, enables implementing multiple distribution methods and selecting the best distribution environment that increases the efficiency of GRIDsystems by reducing the time of the task performance and reducing the downtime of resources in highly related tasks. GRASS modeling environment has a modular structure, which consists of a core and dynamically loaded modules (plug-ins). Each module performs a highly specialized task, referring if necessary to the other modules of the system. The core provides means of inter-module interaction and provides boot and system configuration.
... Though applications in MARA do make use of the preference models presented in Section 2.5 (Chevaleyre et al., 2008;Cafaro et al., 2013;Bouveret et al., 2016), the characteristics of certain allocated items (also referred to as tasks or resources) in many applications have led to the development of other preference models. For example, kadditive utility functions (Chevaleyre et al., 2004) provide evaluations for bundles of allocated items and are a generalisation of many models addressed in this chapter, while weighted propositional formulas represent preferences using logical operations (Lang, 2004). ...
Preprint
Full-text available
This thesis presents a theory for learning and inference of user preferences with a novel hierarchical representation that captures preferential indifference. Such models of 'Coarse Preferences' represent the space of solutions with a uni-dimensional, discrete latent space of 'categories'. This results in a partitioning of the space of solutions into preferential equivalence classes. This hierarchical model significantly reduces the computational burden of learning and inference, with improvements both in computation time and convergence behaviour with respect to number of samples. We argue that this Coarse Preferences model facilitates the efficient solution of previously computationally prohibitive recommendation procedures. The new problem of 'coordination through set recommendation' is one such procedure where we formulate an optimisation problem by leveraging the factored nature of our representation. Furthermore, we show how an on-line learning algorithm can be used for the efficient solution of this problem. Other benefits of our proposed model include increased quality of recommendations in Recommender Systems applications, in domains where users' behaviour is consistent with such a hierarchical preference structure. We evaluate the usefulness of our proposed model and algorithms through experiments with two recommendation domains - a clothing retailer's online interface, and a popular movie database. Our experimental results demonstrate computational gains over state of the art methods that use an additive decomposition of preferences in on-line active learning for recommendation.
... При работе большинства алгоритмов распределения запуск задания осуществляется на первый подходящий вычислительный ресурс без анализа остальных [1][2][3]. В работе [4] в качестве параметра отбора предложена функция полезности, однако оптимизация выбора ресурса проводится только на основании доступных на данный момент вычислительных ресурсов. Большинство алгоритмов распределения при построении плана не ориентируются на предварительное резервирование, которое может сократить время нахождения задания в системе за счет уменьшения простоя вычислительных ресурсов [1; 2; 5]. ...
Article
Full-text available
Объектом исследования выступает процесс распределения пула входных заданий на вычислительные ресурсы в гибридных кластерных системах. Предмет исследования – информационная технология распределения заданий на вычислительные ресурсы гибридных кластерных систем. Цель – разработка и внедрение этапа имитационного моделирования в модифицированную информационную технологию распределения входящего пула заданий на вычислительные мощности гибридных кластерных систем. Задачи: на основе математических моделей заданий, вычислительных ресурсов и методов распределения модифицировать существующую информационную технологию распределения заданий; разработать информационную систему, которая будет выполнять автоматизированный процесс сбора и обработки полученных данных; сформировать ряд экспериментов по распределению входного пула заданий, на основе реализованных в среде имитационного моделирования методов распределения. Методы исследования базируются на использовании теории множеств, общей теории систем и теории имитационного моделирования. Получены следующие результаты. Предложена модифицированная информационная технология распределения программных заданий большой размерности на вычислительные ресурсы для систем облачных вычислений с использованием имитационной среды моделирования с последующим выбором наилучшего плана распределения по каждому пулу входных заданий. Предложенная информационная технология внедрена в имитационную среду моделирования, которая позволяет воспроизводить процесс функционирования элементарных событий, происходящих в реальных гибридных кластерных системах с сохранением логики их взаимодействия в реальном времени. Выводы: предложенная информационная технология объединяет процессы сбора, хранения, обработки и передачи данных с использованием предложенных в работе методов распределения, средства для дальнейшего анализа результатов моделирования и принятия решения о выполнении определенного действия (выбора наилучшего плана распределения). 
Использование в среде моделирования множества методов распределения позволяет провести серию экспериментов и на основании полученных результатов, осуществить выбор наилучшего плана распределения для конкретного входного пула заданий (на основании выбранной стратегии распределения).
Conference Paper
In this work, a job-flow scheduling approach for grid virtual organizations (VOs) is proposed and studied. Users’ and resource providers’ preferences, VOs internal policies, resources geographical distribution along with local private utilization impose specific requirements for efficient scheduling according to different, usually contradictive, criteria. With increasing level of resources utilization, the set of available resources and corresponding decision space are reduced. This further complicates the problem of efficient scheduling. In order to improve overall scheduling efficiency, we propose an anticipation scheduling approach based on a cyclic scheduling scheme. It generates a near optimal but infeasible scheduling solution and includes a special replication procedure for efficient and feasible resources allocation. Anticipation scheduling is compared with the general cycle scheduling scheme and conservative backfilling using such criteria as average jobs’ start and finish times as well as users’ and VO economic criteria: total execution time and cost.
Conference Paper
A heuristic user job-flow scheduling approach to grid virtual organizations with non-dedicated resources is discussed in this article. Users’ and resource providers’ preferences, virtual organization’s internal policies, resources geographical distribution along with local private utilization impose specific requirements for efficient scheduling according to different, usually contradictive, criteria. The available resources set and the corresponding decision space decrease as resources utilization increases. This introduces further complications into the task of efficient scheduling. We propose a heuristic anticipation scheduling approach to improve the overall scheduling efficiency. Initially, it generates a near optimal but infeasible scheduling solution which is then used as a reference for efficient allocation of resources.
Conference Paper
In this work, a job-flow scheduling approach for Grid virtual organizations (VOs) is proposed and studied. Users’ and resource providers’ preferences, VOs internal policies, resources geographical distribution along with local private utilization impose specific requirements for efficient scheduling according to different, usually contradictive, criteria. With increasing resources utilization level the available resources set and corresponding decision space are reduced. This further complicates the problem of efficient scheduling. In order to improve overall scheduling efficiency, we propose an anticipation scheduling approach based on a cyclic scheduling scheme. It generates a near optimal but infeasible scheduling solution and includes a special replication procedure for efficient and feasible resources allocation. Anticipation scheduling is compared with the general cycle scheduling scheme and conservative backfilling using such criteria as average jobs’ response time (start and finish times) as well as users’ and VO economic criteria (execution time and cost).
Chapter
In this work, a job-flow scheduling approach for Grid virtual organizations is proposed and studied. Users’ and resource providers’ preferences, virtual organization’s internal policies, resources geographical distribution along with local private utilization impose specific requirements for efficient scheduling according to different, usually contradictive, criteria. With increasing resources utilization level the available resources set and corresponding decision space are reduced. This further complicates the task of efficient scheduling. In order to improve overall scheduling efficiency we propose a heuristic anticipation scheduling approach. It generates a near optimal but infeasible scheduling solution and includes special replication procedure for efficient and feasible resources allocation.
Article
Full-text available
This document specifies the semantics and structure of the Job Submission Description Language (JSDL). JSDL is used to describe the requirements of computational jobs for submission to resources, particularly in Grid environments, though not restricted to the latter. The document includes the normative XML Schema for the JSDL, along with examples of JSDL documents based on this schema.
Article
The complexity and dynamic nature of the Internet (and the emerging Computational Grid) demand that middleware and applications adapt to the changes in configuration and availability of resources. However, to the best of our knowledge there are no simulation tools which support systematic exploration of dynamic Grid software (or Grid resource) behavior. We describe our vision and initial efforts to build tools to meet these needs. Our MicroGrid simulation tools enable Globus applications to be run in arbitrary virtual grid resource environments, enabling broad experimentation. We describe the design of these tools, and their validation on micro-benchmarks, the NAS parallel benchmarks, and an entire Grid application. These validation experiments show that the MicroGrid can match actual experiments within a few percent (2% to 4%).
Article
"Grid" computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation. In this article, we define this new field. First, we review the "Grid problem," which we define as flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources-what we refer to as virtual organizations. In such settings, we encounter unique authentication, authorization, resource access, resource discovery, and other challenges. It is this class of problem that is addressed by Grid technologies. Next, we present an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing. We describe requirements that we believe any such mechanisms must satisfy, and we discuss the central role played by the intergrid protocols that enable interoperability among different Grid systems. Finally, we discuss how Grid technologies relate to other contemporary technologies, including enterprise integration, application service provider, storage service provider, and peer-to-peer computing. We maintain that Grid concepts and technologies complement and have much to contribute to these other approaches.
Conference Paper
Grid technologies enable large-scale sharing of resources within formal or informal consortia of individuals and/or institutions: what are sometimes called virtual organizations. In these settings, the discovery, characterization, and monitoring of resources, services, and computations are challenging problems due to the considerable diversity, large numbers, dynamic behavior, and geographical distribution of the entities in which a user might be interested. Consequently, information services are a vital part of any Grid software infrastructure, providing fundamental mechanisms for discovery and monitoring, and hence for planning and adapting application behavior. We present here an information services architecture that addresses performance, security, scalability, and robustness requirements. Our architecture defines simple low-level enquiry and registration protocols that make it easy to incorporate individual entities into various information structures, such as aggregate directories that support a variety of different query languages and discovery strategies. These protocols can also be combined with other Grid protocols to construct additional higher-level services and capabilities such as brokering, monitoring, fault detection, and troubleshooting. Our architecture has been implemented as MDS-2, which forms part of the Globus Grid toolkit and has been widely deployed and applied.
Conference Paper
Data Grids seek to harness geographically distributed resources for large-scale data-intensive problems such as those encountered in high energy physics, bioinformatics, and other disciplines. These problems typically involve numerous, loosely coupled jobs that both access and generate large data sets. Effective scheduling in such environments is challenging, because of a need to address a variety of metrics and constraints (e.g., resource utilization, response time, global and local allocation policies) while dealing with multiple, potentially independent sources of jobs and a large number of storage, compute, and network resources. We describe a scheduling framework that addresses these problems. Within this framework, data movement operations may be either tightly bound to job scheduling decisions or performed by a decoupled, asynchronous process on the basis of observed data access patterns and load. We develop a family of job scheduling and data movement (replication) algorithms and use simulation studies to evaluate various combinations. Our results suggest that while it is necessary to consider the impact of replication on the scheduling strategy, it is not always necessary to couple data movement and computation scheduling. Instead, these two activities can be addressed separately, thus significantly simplifying the design and implementation of the overall Data Grid system.
Article
The aim of this thesis is to address the problem of capability brokering. A capability-brokering agent receives capability advertisements from problem-solving agents and problem descriptions from problem-holding agents. The main task for the broker is to find problem-solving agents that have the capabilities to address problems described to the broker by a problem-holding agent. Capability brokering poses two problems: representing capabilities for advertisements, and matching problems and capabilities to find capable problem-solvers. For the representation part of the problem, there have been a number of representations in AI that address similar issues. We review various logical representations, action representations, and representations for models of problem solving and conclude that, while all of these areas have some positive features for the representation of capabilities, they also all have serious drawbacks. We describe a new capability description language, CDL, which shares the positive features of previous languages while avoiding their drawbacks. CDL is a decoupled action representation into which arbitrary state representations can be plugged, resulting in the expressiveness and flexibility needed for capability brokering. Reasoning over capability descriptions takes place on two levels. The outer level deals with agent communication and we have adopted the Knowledge Query and Manipulation Language (KQML) here. At the inner level the main task is to decide whether a capability description subsumes a problem description. In CDL the subsumption relation for achievable objectives is defined in terms of the logical entailment relation between sentences in the state language used within CDL. The definition of subsumption for performable tasks in turn is based on this definition for achievable objectives.
We describe algorithms in this thesis which have all been implemented and incorporated into the Java Agent Template, where they proved sufficient to operationalise a number of example scenarios. The two most important features of CDL are its expressiveness and its flexibility. By expressiveness we mean the ability to express more than is possible in other representations. By flexibility we mean the possibility to delay decisions regarding the compromises that have to be made to knowledge representation time. The scenarios we have implemented illustrate the importance of the features and we have shown in this thesis that CDL indeed possesses these features. Thus, CDL is an expressive and flexible capability description language that can be used to address the problem of capability brokering.
Article
This work describes the Grid and cluster scheduling simulator Alea 2, designed for the study, testing and evaluation of various job scheduling techniques. This event-based simulator is able to deal with common problems related to job scheduling, such as the heterogeneity of jobs and resources, and dynamic runtime changes such as the arrival of new jobs or resource failures and restarts. Alea 2 is based on the popular GridSim toolkit [31] and represents a major extension of the Alea simulator, developed in 2007 [16]. The extension covers improved design, extended functionality, improved scalability and higher simulation speed. Finally, a new visualization interface was introduced into the simulator. The main part of the simulator is a complex scheduler which incorporates several common scheduling algorithms working on either the queue-based or the schedule (plan) based principle. Additional data structures are used to maintain information about the resource status and the objective functions, and to collect and visualize the simulation results. Many typical objectives such as the machine usage, the average slowdown or the average response time are included. The paper concludes with an example of an Alea 2 execution using a real-life workload, and also discusses the scalability of the simulator.