Open Access. ©2017 Ilya Chernov et al., published by De Gruyter Open. This work is licensed under the Creative Commons
Attribution-NonCommercial-NoDerivs 4.0 License.
Open Eng. 2017; 7:343–351
Survey article
Ilya Chernov, Natalia Nikitina*, and Evgeny Ivashko
Task Scheduling in Desktop Grids:
Open Problems
https://doi.org/10.1515/eng-2017-0038
Received September 7, 2017; accepted November 8, 2017
Abstract: We survey the areas of Desktop Grid task
scheduling that seem to be insufficiently studied so far
and are promising for efficiency, reliability, and quality
of Desktop Grid computing. These topics include optimal
task grouping, the “needle in a haystack” paradigm,
game-theoretical scheduling, domain-imposed approaches,
special optimization of the final stage of the batch
computation, and Enterprise Desktop Grids.
Keywords: desktop grid, task scheduling, high-
performance computing, high-throughput computing,
enterprise desktop grid
1 Introduction
A Desktop Grid is a distributed computing system that uses
the idle time of non-dedicated, geographically distributed
computing nodes connected over a general-purpose network.
The fast development of the Internet, the quick growth of
the number of personal computers, and the huge increase in
their performance have made Desktop Grids a promising
high-throughput computing tool for solving numerous
scientific problems. Volunteer computing is one way to
organize the computation: the work is done by volunteers
who join the project via the Internet. Another approach is
the Enterprise Desktop Grid, which joins computers within
an organization.
Ilya Chernov: Institute of Applied Mathematical Research of the
Karelian Research Center RAS; Petrozavodsk State University,
Petrozavodsk, Russia, E-mail: IAChernov@yandex.ru
*Corresponding Author: Natalia Nikitina: Institute of Applied
Mathematical Research of the Karelian Research Center RAS,
Petrozavodsk, Russia, E-mail: nikitina@krc.karelia.ru
Evgeny Ivashko: Institute of Applied Mathematical Research of
the Karelian Research Center RAS; Petrozavodsk State University,
Petrozavodsk, Russia, E-mail: ivashko@krc.karelia.ru
The rst large volunteer computing project
SETI@home was launched in 1999, providing the basis for
development of the BOINC (Berkeley Open Infrastructure
for Network Computing) platform. Today, there are more
than 60 active BOINC projects utilizing more than 15
million computers worldwide [2]. Desktop Grid diers
from other high-performance systems, such as Compu-
tational Grids, computing clusters, and supercomputers;
scheduling policies should be aware of huge hardware
and software heterogeneity, lack of trust, uncertainty on
availability and reliability of computing nodes, etc.
Note that, besides the conventional Desktop Grids,
special types of them are considered in the literature,
including hybrid, hierarchical, and peer-to-peer Desktop Grids.
Here, we refer to all variations as Desktop Grids. However,
we focus on the classical BOINC architecture: a client-server
network with independent nodes that are free to join or leave.
Hierarchical, peer-to-peer and decentralized Desktop
Grids open new problems and challenges from the point of
view of scheduling. The authors of [9], for example, pro-
pose a decentralized distributed scheduling method with
each node serving as a computing node, scheduler, and
router; in [20, 21, 27] (to mention a few) “Desktop Grids of
Desktop Grids” are considered, where the nodes of a Desk-
top Grid are Desktop Grids with their own schedulers; in
[43] scheduling in peer-to-peer Desktop Grids is studied.
Also, new technologies and approaches are used to cre-
ate new types of Desktop Grids. For example, the quickly
developing blockchain technology is used to construct a
market of Desktop Grid resources based on XtremWeb-HEP
software platform (http://iex.ec/). These technologies and
new approaches, together with scheduling policies and
open problems that appear in this area, deserve a whole
separate paper.
The scheduling policy used in a Desktop Grid significantly
affects its characteristics, such as performance (measured
as the makespan, overall throughput, throughput of
validated results, turnaround, or otherwise) and/or reliability
(in the sense of producing correct results). So, much
attention has been paid to the development of effective (from
different points of view) scheduling policies. However, there
Unauthenticated
Download Date | 12/4/17 1:14 AM
344 |I. Chernov, N. Nikitina, E. Ivashko
remain open problems that deserve more attention of re-
searchers and are still promising.
Besides performance or reliability, one can consider
the problem of reducing the cost of an answer. In
particular, this can be restricted to energy efficiency (e.g.,
[34, 38, 40, 51]).
The aim of our work is to highlight several open
problems in the domain of scheduling in Desktop Grids.
Implementing the solutions of these problems promises a
significant performance improvement both in the common
computational process and in specific domain-related
problems. Our analysis is based on a study of more than a
hundred research papers on Desktop Grid scheduling and
related problems, dated from 1999 to 2016. A comprehensive
review of these papers is presented in [11]: there we
summarize methods of solving various challenges of task
scheduling, but do not discuss open problems.
There are a number of articles devoted to the overview and
study of scheduling in Desktop Grids. The BOINC website
has a list of open problems [3] (mostly technical ones,
however). The paper [32] studies BOINC from the user and
developer perspectives to develop high-level suggestions for
the future development of BOINC.
There are also a number of review papers devoted to
Desktop Grid scheduling and related problems. The report
[14] presents classifications of Desktop Grids from several
points of view. Although relatively old, this work is still a
valuable attempt to classify Desktop Grids and a collection
of real systems.
The authors of [28] consider and compare the most
popular contemporary middleware systems for Desktop
Grids: BOINC, XtremWeb, OurGrid and HTCondor. They
review scientific papers devoted to research of the factors
that influence the performance of a Desktop Grid (from the
point of view of both the server and the client).
The review [57] considers Computational Grids
but offers much knowledge about scheduling types
and methods. Computational Grids, e-science Grids, and
Service Grids are all quite different from Desktop
Grids.
The book chapter [19] is a review of scheduling
algorithms for Desktop Grids and the challenges they face.
Naive algorithms that ignore the work history,
knowledge-based ones that use accumulated and a priori
information, and adaptive methods are overviewed.
The paper [17] addresses availability of resources, a
problem of great importance for volunteer computing.
Measuring resource availability, revealing availability
patterns, and using them for better control of the
computing process are the main questions considered.
Several naive and knowledge-based algorithms are reviewed
together with node grouping methods, security threats in
volunteer computing, and safety measures.
In this paper, we focus on the open problems on
task scheduling in BOINC. Using the BOINC terminol-
ogy [4], by task we mean an instance of a computational
workunit. The open problems we discuss in this paper in-
clude: task grouping and scheduling parcels of tasks; task
scheduling when searching for rare answers; evaluating
schedules from the point of view of the mean cost; non-
common, domain-imposed scheduling objectives; game-
theoretic methods of task scheduling; Enterprise Desktop
Grids; “tail” phase scheduling; and generating policies by
a complex “black box” algorithm, the inner logic of which
is unavailable (including genetic search and neural net-
works).
2 Task grouping
By design, BOINC supports applications that have a low
data transmission/compute ratio [1]. Thus, an important
concern is providing computational tasks of reasonable
runtime. On the one hand, tasks that are too long lead to
a waste of resources due to computational errors and
deadline violations. They are also unfavourable from the
point of view of volunteer computing [7]. On the other hand,
tasks that are too short lead to a high server load because
of the large number of requests from the clients.
Some scientic problems, like search for extrema in
high-dimensional spaces, allow to vary task runtimes
quite exibly. But in some cases, selection of the proper
task size is more complicated. One of such cases is vir-
tual drug screening, where a single estimation of ligand-
protein interaction energy typically takes from a couple of
minutes to about an hour using molecular docking soft-
ware such as AutoDock Vina [23, 42]. Task grouping tech-
nique could be used to solve this problem.
We will refer to a group of tasks wrapped together
as a parcel. A parcel is created and processed as a sin-
gle BOINC workunit. This distinguishes the parcel concept
from a batch of tasks (which is a series of workunits) and
from a group of tasks which is an arbitrarily selected por-
tion of workunits.
Task parceling can serve at least three purposes.
Firstly, parcels of small tasks take more time to solve,
and communications between nodes and the server are
fewer; so the server load is less. We witnessed a drastic
slow-down of Desktop Grid computation when many quick
tasks (less than 1 minute each) were scheduled; grouping
tasks into parcels of 1000 tasks each improved the perfor-
mance. In [37] the server load is minimized by choosing
optimal parcel size by game-theoretic methods.
Next, parceling can improve scheduling efficiency by
varying the parcel size depending on a node’s performance,
reliability, trust level, network connection, etc. To our
knowledge, this question has received relatively little
attention. In [35] different heuristic scheduling algorithms,
including a few “batch” ones, are compared. Besides
selecting the best ones by simulation, the authors also
show that the batch mode allows more effective
scheduling.
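A minimal sketch of variable-size parceling is given below. It is an illustration under our own assumptions, not a method from [35] and not part of the BOINC API: the names Node, parcel_size, and make_parcel are hypothetical, the server is assumed to keep a speed estimate (tasks per hour) and a reliability estimate for each node, and the parcel is sized so that it takes roughly a target runtime, shortened for less reliable nodes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    tasks_per_hour: float  # assumed: server-side estimate of node speed
    reliability: float     # assumed: share of parcels returned on time, in (0, 1]


def parcel_size(node: Node, target_hours: float, max_size: int = 1000) -> int:
    """Number of tasks to wrap into one parcel for this node."""
    # Shrink the parcel for unreliable nodes so that less work is lost
    # when a parcel misses its deadline.
    size = int(node.tasks_per_hour * target_hours * node.reliability)
    return max(1, min(size, max_size))


def make_parcel(queue: List[int], node: Node, target_hours: float) -> List[int]:
    """Remove and return the next parcel of task ids from the queue."""
    n = min(parcel_size(node, target_hours), len(queue))
    parcel = queue[:n]
    del queue[:n]
    return parcel
```

With such a rule a fast, dependable node receives a large parcel and contacts the server rarely, while a slow or unreliable node receives a small one; the linear dependence on reliability is only one possible choice.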
Finally, parcels can contain special tasks for revealing
cheaters, checking purposes, etc. Adding some tasks
of a parcel to another parcel for checking purposes can be
called “fractional replication”: only part of the complex
compound task is re-solved. This can be used to fight
saboteurs, including intelligent and organized ones, to
improve reliability, or for better efficiency. For example,
[50] studies defense against intelligent cheaters
(adversaries) in volunteer computing networks by putting check
tasks into task parcels. In [58] test tasks are also added
to parcels, not only to reveal the cheaters, but also to hide
valuable answers. The Earth, added to a parcel of planet
descriptions in a search for life, drastically increases the
number of positive results, so that stealing a positive result
would be harder. In [16] the same aim is considered from
another point of view: randomly chosen tasks of a parcel
are re-solved to reveal lying nodes. Different retrieval
methods for better performance are compared in [52]: this is not
exactly task parceling, but a related approach.
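The idea of embedding check tasks into a parcel can be sketched as follows. This is a simplified illustration under our own assumptions, not the scheme of [50], [58], or [16]: the function names are hypothetical, the server is assumed to hold a pool of tasks with already known answers, and a parcel is accepted only if every embedded check task is answered correctly.

```python
import random
from typing import Dict, List


def build_parcel(tasks: List[int], checks: Dict[int, int], n_checks: int,
                 rng: random.Random) -> List[int]:
    """Mix n_checks task ids with known answers into a parcel and shuffle it."""
    chosen = rng.sample(sorted(checks), n_checks)
    parcel = list(tasks) + chosen
    rng.shuffle(parcel)  # the node should not be able to tell check tasks apart
    return parcel


def validate(parcel_results: Dict[int, int], checks: Dict[int, int]) -> bool:
    """Accept the parcel only if all embedded check tasks were answered correctly."""
    return all(parcel_results[t] == answer
               for t, answer in checks.items() if t in parcel_results)
```

A node that returns fabricated answers for the whole parcel fails validation as soon as one check task is present; a node that cheats on only a few tasks is caught with a probability that depends on the share of check tasks in the parcel.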
3 Needle in a haystack
Many problems suitable for Desktop Grids consist of tasks
that give interesting results only seldom. For example,
fitting experimental data of hydride decomposition [24] by
model curves by a search in a space of parameters rejects
most of the tested sets. Then errors of the first and
second kinds, i.e., accepting a wrong answer or rejecting the
correct one, have significantly different costs. Indeed, a
false interesting result would be quickly revealed; also,
additional checks, even expensive ones, are too rare to
significantly affect the metrics. However, rejecting the unique
interesting result makes the whole project completely useless.
This can be taken into account when optimizing replication,
scheduling, task validation, etc.
Besides, the existing “one task, one answer” paradigm
obviously causes needless communication between
the server and the clients. It seems effective to
schedule huge parcels of tasks and consider the first good
answer as the result of the whole parcel. In other words,
a node continues to look for a good answer until either
the parcel is complete or the deadline is missed. To our
knowledge, nobody has studied this possibility, which can
be called “looking for the needle in a haystack”. In
volunteer computing, some reliability-improving measures
must be taken. Heartbeat and replication techniques [49]
can be used for this purpose. For Enterprise Desktop Grids,
employing the PUSH model of interaction with computing
nodes may improve efficiency and significantly
reduce the server load.
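The client-side loop described above can be sketched as follows. This is only an illustration under our own assumptions: search_parcel and its parameters are hypothetical names, the test for a good answer is passed in as a function, and the deadline is expressed in the units of the supplied clock. Besides the needle, the sketch returns the number of examined tasks, so that the server can tell an exhausted parcel from a missed deadline.

```python
import time
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def search_parcel(parcel: Iterable[T], is_needle: Callable[[T], bool],
                  deadline: float,
                  clock: Callable[[], float] = time.monotonic
                  ) -> Tuple[Optional[T], int]:
    """Return (first task giving a good answer or None, number of tasks examined)."""
    checked = 0
    for task in parcel:
        if clock() >= deadline:
            break                 # deadline reached: stop and report progress
        checked += 1
        if is_needle(task):
            return task, checked  # one report instead of one report per task
    return None, checked
```

In volunteer computing the returned count could be combined with heartbeat messages or replication, since a node that silently skips tasks could otherwise hide the only good answer.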
4 Cost of an answer
In [10, 12] we consider a special optimal replication
problem for Desktop Grids: the minimized quantity is the
average total cost of solving a task. Each task can produce
an answer from some given set of values (a two-element
set in the case of recognition problems). Wrong answers can
be obtained with some known probabilities. To reduce the
risk of error, each task is solved independently on
different computing nodes. An answer is accepted when it
has been produced a given number of times, called the
quorum (it can be answer-dependent). If a wrong answer is
accepted, some penalty is added to the computational cost.
Penalties can also depend on the accepted answer and
the correct answer and are usually large compared to the
cost of a single computation. So, the cost of a task
consists of the cost of solving it several times (replication)
and a possible penalty. The problem is to choose quorums
for the possible answers so as to minimize the expected cost.
Too high quorums almost eliminate the risk of paying a
penalty but increase the computational cost linearly.
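The trade-off can be illustrated by a minimal sketch under simplifying assumptions that are ours and not those of [10, 12]: there are only two possible answers, the same quorum is used for both, every replica is wrong independently with probability p, each replica costs one unit, and replicas are added one by one until some answer reaches the quorum. The function names are hypothetical.

```python
from math import comb


def expected_cost(p: float, quorum: int, penalty: float) -> float:
    """Expected number of replicas plus expected penalty for a given quorum."""
    q = 1.0 - p
    cost = 0.0
    # The computation stops after n replicas, quorum <= n <= 2 * quorum - 1,
    # when the n-th result completes the quorum for one of the two answers.
    for n in range(quorum, 2 * quorum):
        ways = comb(n - 1, quorum - 1)
        p_right = ways * q**quorum * p**(n - quorum)  # correct answer accepted
        p_wrong = ways * p**quorum * q**(n - quorum)  # wrong answer accepted
        cost += n * (p_right + p_wrong) + penalty * p_wrong
    return cost


def best_quorum(p: float, penalty: float, max_quorum: int = 20) -> int:
    """Quorum with the lowest expected cost among 1..max_quorum."""
    return min(range(1, max_quorum + 1),
               key=lambda v: expected_cost(p, v, penalty))
```

For quorum 1 the expression reduces to 1 + penalty * p, i.e., one computation plus the expected penalty. Scanning best_quorum over penalties that differ by orders of magnitude is a simple way to observe how weakly the optimal quorum depends on the penalty in this simplified setting.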
Obviously, the optimal quorums need not be the
same for different possible answers: some answers need
to be checked more carefully than others, depending on
their penalty and error probability. For example, if
penalties are significantly different (so some answers are more
valuable than others), less valuable answers need to be
checked more in order to reduce the risk of missing a
valuable answer. Another result is the stability of the optimal
quorum with respect to the penalty values: critical
penalties in the case of reliable (i.e., with low error risk $p$)
computing nodes grow as $p^{-\nu}$ with respect to the quorum $\nu$.
Therefore, penalties need not be known precisely: rough
estimates up to a few orders of magnitude
are sufficient. Also, the quorum for rare results can often be
chosen higher than the optimal value at a very low cost.
Indeed, a higher quorum further reduces the risk, but
increases computational costs linearly; this increase occurs
only seldom, namely when the true answer is a rare one. So the
total increase of the average total cost is low.
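The growth rate of critical penalties can be checked in a simplified setting that is our assumption rather than the general model of [10, 12]: two possible answers, a single quorum $\nu$ for both, independent replicas that are wrong with probability $p$, unit cost per replica, and a penalty $F$ for accepting the wrong answer. To leading order in small $p$, with $C(\nu)$ denoting the expected cost:

```latex
\begin{align*}
P_{\mathrm{wrong}}(\nu)
  &= \sum_{n=\nu}^{2\nu-1} \binom{n-1}{\nu-1}\, p^{\nu} (1-p)^{n-\nu}
   \approx \binom{2\nu-1}{\nu}\, p^{\nu}, \\
C(\nu) &\approx \nu + F \binom{2\nu-1}{\nu}\, p^{\nu}, \\
C(\nu+1) < C(\nu)
  &\quad\text{roughly when}\quad
  F > F^{*}_{\nu} \approx \frac{p^{-\nu}}{\binom{2\nu-1}{\nu}} .
\end{align*}
```

In this sketch the penalty at which it pays to raise the quorum from $\nu$ to $\nu+1$ grows by a factor of order $1/p$ with each step, which is why order-of-magnitude estimates of the penalties are enough to choose the quorum.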
5 Domain-imposed criteria for task scheduling
Desktop Grids allow solving problems from various
scientific domains. Besides the common optimization criteria,
such as the minimal makespan, the maximal throughput,
etc., the scientific domain of the problem being solved on a
Desktop Grid can impose additional requirements and
optimization objectives on the solution process,
which might partly be addressed by proper task
scheduling.
One example of such objectives arises from
the bioinformatics field. During virtual drug screening,
the challenge is not only to process large libraries of
molecules, but also to select a topologically diverse set of
hit compounds. Moreover, it is highly desirable to obtain
such a set at early stages of the research, because the virtual
screening process might take months before the results
proceed to a laboratory [45].
The work [39] proposes a task scheduling algorithm
to address this challenge. The space of molecules is
divided into blocks according to the chemical properties of
molecules. The client nodes of a Desktop Grid act as
independent entities and select a block where they prefer
to perform molecular docking. The goal of each node
is to obtain as many hit molecules as possible, but
topologically different from the ones found by other nodes.
The prototype implementation and computational
experiments show that the proposed algorithm makes it
possible to quickly obtain topologically diverse hits.
The example above shows that domain-imposed criteria
for task scheduling can significantly increase the
effectiveness of a Desktop Grid.
6 Games of scheduling
Mathematical game theory is an effective approach to
constructing optimal rules of interaction between independent
agents. In Desktop Grids, computing nodes can be
considered as independent agents with different characteristics
(e.g., computing power, network speed, availability and
reliability metrics, and so on).
Game theory considers optimization problems where
utility functions of two or more agents called players de-
pend on decisions made by all players [54]. One of the aims
of the theory is to design the game rules to provide the de-
sired behaviour as the optimal one.
Applied to scheduling in Desktop Grids, game-theoretic
methods can serve several purposes. The
literature considers the following two major examples.
First, well-designed rules are able to provide optimal
behaviour in a decentralized way; this can reduce the server
load, eliminate the need to gather information about the
nodes, and improve the robustness of the schedule
[15, 29, 36, 37].
Second, game-theoretic methods efficiently serve to
counteract sabotage in volunteer computing Desktop
Grids [6, 22, 56]. Besides the possibility of forcing saboteurs
to stop their malicious activity, it is vital to understand
the interaction of intelligent and cooperating adversaries.
However, the potential of game-theoretic methods seems
to be much higher.
7 Enterprise Desktop Grids
Desktop Grids can be classied to volunteer computing
projects, where volunteers grant their resources and are
connected via the Internet, and Enterprise Desktop Grid
systems that join resources of an institution and exploit
local area networks. Much attention has been paid to task
scheduling for volunteer computing: it is enough to men-
tion the thesis [48] and a review [17]. Enterprise Desk-
top Grids are studied much less [26, 30, 31, 47]. However,
these two types of Desktop Grids are quite dierent, also
from the point of view of scheduling. Firstly, trust is much
higher in Enterprise Desktop Grids: at least there are no
saboteurs. Reliability is also higher. Secondly, nodes’ be-
haviour is much more predictable [33, 46] and all informa-
tion about them (including their performance, amount of
memory, type of OS, installed software, etc) is available.
In particular, availability patterns (distribution of ON and
OFF periods) is more predictable. Thirdly, the structure of
the Enterprise Desktop Grid is usually known. So, quite ef-
cient task scheduling is possible with lower level of repli-
cation.
Next, Enterprise Desktop Grids are often relatively
small in size, so that methods inapplicable in the general
case can be used (e.g., in [5] a linear optimization problem
is solved to maximize the throughput of a heterogeneous
computing system).
Also, an Enterprise Desktop Grid can have a different
architecture and need not adhere to the classical
client-server model. Peer-to-peer, torrent-like, and multi-level
computing networks can be used for solving various types
of scientific problems. For example, in [13] a peer-to-peer
torrent-like Desktop Grid is considered where each node
needs to get all the results. One of the nodes serves as a
tracker.
Finally, an Enterprise Desktop Grid can exploit the PUSH
model of interaction with nodes instead of the regular
PULL model. Using the PUSH model, the server assigns a
task to a node and is able to check the status of the task
(completion progress) and the status of the node (ON/OFF,
available resources, availability periods, etc.), and to cancel
the task when the needed number of results is achieved.
All this makes Enterprise Desktop Grids much closer to
Computational Grids than to volunteer computing. But
Enterprise Desktop Grids also keep many of the intrinsic
characteristics of regular Desktop Grids.
Enterprise Desktop Grids make it possible to solve local
scientific problems (e.g., data analysis) within an institution.
However, their efficient usage needs new mathematical
models and algorithms of task scheduling. The open
problems of task scheduling in Enterprise Desktop Grids
include both optimal replication, availability/reliability
ratings, etc. (more important for volunteer computing) and
special problems like “tail” computation or task grouping.
8 “Tail” computation
Another interesting problem is optimization of the “tail”
computation. A distributed computational experiment
involving a batch of tasks inherently consists of two stages
([8], see Fig. 1). At the first one, the throughput stage, the
number of tasks is greater than the number of computing
nodes (at the very beginning it can be much greater). At
this stage, the computational power limits the
performance, so it is reasonable to supply each node with a task.
With time, the number of unprocessed tasks decreases
until it is equal to the number of computing nodes: then the
second stage, called the tail, starts. At this stage, there is
an excess of computational power. In particular, during the
first stage, replication can be useless (from the point of view
of the makespan), as was shown in [25]. The same is true
for parallel multicore implementations of the algorithms
used. Tail computation is optimized in [8, 31].
A number of research problems related to the specifics of
Desktop Grids are connected to the two-stage batch
completion.
The rst problem is time estimation of the rst stage
completion. It is important for planning of the computa-
Figure 1: Two stages of batch completion in a Desktop Grid.
tional experiment, Desktop Grid maintenance, and keep-
ing interest to the project among volunteers. This prob-
lem is complicated due to the dynamic nature of Desktop
Grids: computational nodes can leave or join the compu-
tational network; high heterogeneity of computing nodes:
computing performance of new or leaving nodes can sig-
nicantly dier from the mean node performance; a com-
puting network performance can have a trend (because of
growth or lose an interest to the project among volunteers)
or even ashes (as a result of “competitions” among volun-
teer teams).
Another problem is the fastest completion of the experiment
or batch. In practice, the “tail” computation takes a long
time (usually about two or three times longer than the
deadline) because of missed deadlines. The computational
network has no information on the current status of
task computation. So, the “tail” accumulates a lot of
nodes that have abandoned the computing network. When a
task assigned to such a node violates the deadline,
it is assigned again to a different node, which is possibly
also going to leave the network soon. This prolongs the “tail”.
A solution to the problem is redundant computing:
tasks that are currently being processed are also assigned
to vacant computing nodes. This strategy significantly
increases the chances that at least one copy is solved in time.
When employing this strategy, one should take into account
the characteristics of the computational nodes processing
the same task (availability, reliability, computing power),
the accumulated task processing time, the expected task
completion time, and so on.
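A greedy variant of such redundant computing can be sketched as follows. It is an illustration under our own assumptions, not the strategy of [8], [31], or [53]: the names are hypothetical, the server is assumed to have, for every running task, an estimate of the probability that a current copy finishes on time, and for every vacant node an estimate of the probability that it returns a result before the deadline; failures of different copies are assumed independent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Running:
    task_id: int
    p_on_time: float  # assumed estimate: some current copy finishes in time


def replicate_tail(running: List[Running], vacant: Dict[str, float]) -> Dict[str, int]:
    """Give each vacant node a copy of the task that is currently riskiest.

    vacant maps a node name to the assumed probability that the node returns
    a result before the deadline. Returns a mapping node name -> task id.
    """
    risk = {r.task_id: 1.0 - r.p_on_time for r in running}  # P(no copy on time)
    plan: Dict[str, int] = {}
    # The most dependable vacant nodes are served first.
    for node, p_node in sorted(vacant.items(), key=lambda kv: -kv[1]):
        if not risk:
            break
        task = max(risk, key=risk.get)
        plan[node] = task
        risk[task] *= 1.0 - p_node  # an extra independent copy lowers the risk
    return plan
```

The estimates themselves would have to come from the characteristics listed above (availability, reliability, computing power, accumulated processing time); the sketch only shows how they could be combined once available.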
An approach for reducing the total batch completion
time by about 14% and the “tail” phase, in particular, by
about 40% is proposed in [53]: the nodes and the tasks are
grouped by their dynamically evaluated performance and
complexity.
Three strategies of task replication with performance
of the nodes taken into account are experimentally tested
in [31]: idle resources need to be used to decrease the
makespan.
The fastest batch completion problem is even more
complicated if a new batch of tasks should be started after the
current batch completes. In this case redundant
computing reduces the accessible computing power. An even more
complicated case arises when tasks of the new batch depend
on the completion of certain tasks of the current batch.
9 Black box schedule generators
Attention to articial intelligence (machine learning, neu-
ral networks) methods has been growing recently. They
can be used for producing enhanced schedules. For exam-
ple, in [44] neural networks forecast the load in order to
schedule jobs to volunteer resources. The authors of [59]
also apply neural networks to evaluate the load. Such sys-
tems have proved to be able to solve complex tasks of esti-
mating fuzzy concepts like reliability, availability patterns,
task complexity, etc.
Another approach is genetic algorithms. Applied
to scheduling (e.g., [18, 19, 41, 55]), they simulate
Darwinian evolution in a population of schedules to produce
the best ones. Some authors use genetic algorithms to
develop optimal scheduling policies.
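A minimal genetic search over schedules can be sketched as follows. It is a toy illustration under our own assumptions and does not reproduce the algorithms of [18, 19, 41, 55]: a schedule is a list assigning each task to a node, the fitness is the makespan computed from assumed task sizes and node speeds, and the operators are truncation selection, one-point crossover, and a single random mutation per child. At least two tasks are assumed.

```python
import random
from typing import List, Sequence


def makespan(assign: Sequence[int], work: Sequence[float],
             speed: Sequence[float]) -> float:
    """Completion time of the busiest node for a task -> node assignment."""
    load = [0.0] * len(speed)
    for task, node in enumerate(assign):
        load[node] += work[task] / speed[node]
    return max(load)


def evolve(work: Sequence[float], speed: Sequence[float], pop_size: int = 30,
           generations: int = 200, seed: int = 0) -> List[int]:
    """Evolve a population of schedules and return the best one found."""
    rng = random.Random(seed)
    n_tasks, n_nodes = len(work), len(speed)
    pop = [[rng.randrange(n_nodes) for _ in range(n_tasks)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: makespan(a, work, speed))
        parents = pop[: pop_size // 2]       # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_tasks)  # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_tasks)] = rng.randrange(n_nodes)  # mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda a: makespan(a, work, speed))
```

Because the better half of the population is always kept, the best makespan never gets worse from one generation to the next; on the other hand, the resulting schedule comes with no explanation of why it is good, which is the “black box” aspect discussed in this section.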
10 Discussion
Scheduling in Desktop Grids is generally a multi-criteria
optimization problem. The demands can be classified into:
- Performance or efficiency, measured as makespan,
  throughput, turnaround time, total time for the batch
  of tasks, overhead time, etc.; early production of
  valuable results also belongs here.
- Reliability or trust, meaning low risk of getting a
  wrong result or high chances to get any result,
  including counter-sabotage measures.
- Justice, important mostly in volunteer computing,
  meaning taking into account both the desires of
  volunteers and the project goals (possibly, multiple
  projects).
Obviously, the criteria often contradict each other. Even
criteria from the same class can be mutually exclusive;
e.g., task replication can be used to get an answer sooner
(performance) but at the cost of redundant use of
resources. Also, the criteria are quite hard to formalize.
Even makespan (the time to complete a set of tasks) and
throughput (tasks per hour) are not exactly reciprocals.
Black box algorithms, including machine learning,
genetic search, and others, are able to solve such “fuzzy”
problems. They are potentially able to improve efficiency,
reliability, and justice, in any combination. The same is
true for multi-agent (game-theoretic) models. First,
decentralization definitely improves scalability, reduces
the server load, and forces the nodes to work better, and
is thus able to optimize efficiency. Also, well-chosen game
rules can stop nodes from lying or solving tasks poorly, so
that reliability is better (including the important question
of stopping sabotage). Finally, game-style interaction of
independent agents helps justice; e.g., nodes can distribute
tasks of a few projects in a fair way.
As for task grouping, it is generally promising for
better efficiency, though, as we have noted, it can also be
used for cross-checking. Creating parcels of variable size can
be useful for fair use of the computing power of the nodes.
Other problems that we discuss in the paper are more
specific. Tail-stage optimization is mostly considered from
the point of view of efficiency. Also, when serving multiple
projects in a queue, resource management at the final stage
becomes non-trivial.
Enterprise Desktop Grids are free of some volunteer
computing problems, as has been discussed already.
Therefore, the main criteria are those of the first class,
i.e., efficiency. However, relatively low computing power
in the case of a few projects can be a challenge from the point
of view of fair scheduling of the available resources. Also,
the interests of computer users, project owners, and other
concerned parties may not coincide. So, fair distribution of
the available resources without violating anybody’s rights
can be a challenging problem.
Domain-specic optimization or optimal choice of the
order of the search is mostly for eciency in the special
sense of getting results sooner. However, it is quite a new
subject, so new applications can appear. Also (at the mo-
ment) optimizing the cost of the computing seems to serve
only reliability. Though, we are sure that this approach can
be useful also for the eciency, including fewer deadline
violations (if one is punished by a virtual penalty increas-
ing the cost). Possibility to join the cost metrics with game-
theoretic approach is (up to our knowledge) completely
unstudied.
As for the “needle in a haystack” search, it is proposed
for less server load and thus better efficiency; however,
rare valuable answers are expensive, so this becomes a
question of reliability.
Table 1: Eect of the discussed subjects.
A problem Performance Reliability Justice
Task grouping 3 3 3
Needle in a haystack 3 3
Task cost ? 3?
Domain-specics 3? ?
Game theory 3 3 3
EDG 3 3
Tail stage 3 3
Black box 3 3 3
So, most of the open problems we have selected serve at
least two optimization purposes (see Table 1), which agrees
well with the nature of heterogeneous computing systems,
where multiple conflicting interests meet and need to be
resolved.
Conclusion
Task scheduling plays a crucial role in Desktop Grids. Much
effort has been made to develop new mathematical models
and algorithms. With some prerequisites they promise a
significant increase in computing performance. But there are
also open problems in this domain.
We have considered several promising directions for
improving task scheduling in Desktop Grids. They
include task grouping (parceling), “needle in a haystack”
search, mean cost evaluation, domain-imposed scheduling
objectives, game-theoretical methods of robust or decentralized
scheduling, Enterprise Desktop Grids, optimizing the final
part of the project, and sophisticated algorithms.
Solving each of the described problems promises
higher efficiency of Desktop Grids. Optimal task grouping
reduces the server load, increasing the number of
simultaneously served computing nodes and thus the Desktop
Grid performance. The same is true for “needle in
a haystack” problems, some game-theoretical solutions,
and the mean-cost approach. Domain-imposed
criteria, mathematical models, and algorithms promise
faster solutions of higher quality. Games of scheduling
potentially give more effective protocols of interaction with
computing nodes and more effective task distribution.
Optimizing the “tail” computation reduces the makespan. AI
schedule generators are promising for producing efficient
schedules that best suit the given environment and take
into account all available information. Finally, Enterprise
Desktop Grids are a special type of computing system that is
not yet widespread. We believe that the described
problems are hot topics of task scheduling in Desktop Grids.
Acknowledgement: The authors are grateful to the anony-
mous reviewers for their useful comments that helped to
improve the paper.
The work is supported by the Russian Foundation for
Basic Research (grant numbers 16-07-00622, 15-07-02354,
and 15-29-07974).
References
[1] BoincIntro. URL: http://boinc.berkeley.edu/trac/wiki/BoincIntro,
accessed Jun 2017.
[2] BOINCstats. URL: https://boincstats.com, accessed Jun 2017.
[3] DevProjects. URL: http://boinc.berkeley.edu/trac/wiki/DevProjects,
accessed Jun 2017.
[4] JobIn. URL: http://boinc.berkeley.edu/trac/wiki/JobIn, ac-
cessed Jun 2017.
[5] I. Al-Azzoni and D.G. Down. Dynamic scheduling for heteroge-
neous desktop grids. Journal of Parallel and Distributed Com-
puting, 70(12):1231–1240, dec 2010.
[6] A.F. Anta, Ch. Georgiou, M.A. Mosteiro, and D. Pareja. Multi-
round master-worker computing: a repeated game approach. In
Reliable Distributed Systems (SRDS), 2016 IEEE 35th Symposium
on, pages 31–40. IEEE, 2016.
[7] A.L. Bazinet and M.P. Cummings. Subdividing long-running,
variable-length analyses into short, xed-length BOINC worku-
nits. Journal of Grid Computing, sep 2015.
[8] Orna Agmon Ben-Yehuda, Assaf Schuster, Artyom Sharov, Mark
Silberstein, and Alexandru Iosup. Expert: Pareto-ecient task
replication on grids and a cloud. In Parallel & Distributed Pro-
cessing Symposium (IPDPS),2012 IEEE 26th International, pages
167–178. IEEE, 2012.
[9] J Celaya and U Arronategui. A task routing approach to
large-scale scheduling. Future Generation Computer Systems,
29(5):1097–1111, 2013.
[10] I. Chernov. Theoretical study of replication in desktop grid com-
puting: Minimizing the mean cost. In Proceedings of the 2nd Ap-
plications in Information Technology (ICAIT-2016), International
Conference on, pages 125–129, Aizu-Wakamatsu, Japan, 2016.
[11] I.A. Chernov, E.E. Ivashko, and N.N. Nikitina. Survey of task
scheduling methods in Desktop Grids. Program Systems: The-
ory and Applications, 8(3):3–29, 2017. In Russian.
[12] I.A. Chernov and N.N. Nikitina. Virtual screening in a desktop
grid: Replication and the optimal quorum. In V. Malyshkin, edi-
tor, Parallel Computing Technologies, International Conference
on, volume 9251, pages 258–267. Springer, 2015.
[13] G. Chmaj, K. Walkowiak, M. Tarnawski, and M. Kucharzak.
Heuristic algorithms for optimization of task allocation and re-
sult distribution in peer-to-peer computing systems. Interna-
tional Journal of Applied Mathematics and Computer Science,
22(3):733–748, 2012.
[14] S.J. Choi, H.S. Kim, E.J. Byun, and C.S. Hwan. A taxonomy of
desktop grid systems focusing on scheduling. Technical re-
port KU-CSE-2006-1120-02, Department of Computer Science
Unauthenticated
Download Date | 12/4/17 1:14 AM
350 |I. Chernov, N. Nikitina, E. Ivashko
and Engeering, Korea University, 2006.
[15] B. Donassolo, A. Legrand, and C. Geyer. Non-cooperative
scheduling considered harmful in collaborative volunteer com-
puting environments. In Cluster, Cloud and Grid Computing,11th
IEEE/ACM International Symposium on, pages 144–153, 2011.
[16] W. Du, J. Jia, M. Mangal, and M. Murugesan. Uncheatable grid
computing. Electrical Engineering and Computer Science, Paper
26:1–8, 2004.
[17] N.M. Durrani and J.A. Shamsi. Volunteer computing: require-
ments, challenges, and solutions. Journal of Network and Com-
puter Applications, 39:369–380, mar 2014.
[18] T. Estrada, O. Fuentes, and M. Taufer. A distributed evolutionary
method to design scheduling policies for volunteer computing.
ACM SIGMETRICS Performance Evaluation Review, 36(3):40–49,
2008.
[19] T. Estrada and M. Taufer. Challenges in designing scheduling
policies in volunteer computing. In C. Cérin and G. Fedak, edi-
tors, Desktop Grid Computing, pages 167–190. CRC Press, 2012.
[20] Z. Farkas and P. Kacsuk. Evaluation of hierarchical desktop grid
scheduling algorithms. Future Generation Computer Systems,
28(6):871–880, jun 2012.
[21] Z. Farkas, A. Marosi, and P. Kacsuk. Job scheduling in hierar-
chical desktop grids. In F. Davoli, N. Meyer, R. Pugliese, and
S. Zappatore, editors, Remote Instrumentation and Virtual Lab-
oratories, pages 79–97. Springer, Boston, MA, 2010.
[22] A.A. Fernández, C. Georgiou, M.A. Mosteiro, and D. Pareja. Al-
gorithmic mechanisms for reliable crowdsourcing computation
under collusion. PLoS ONE, 10(3):1–22, 2015.
[23] Stefano Forli, Ruth Huey, Michael E Pique, Michel F Sanner,
David S Goodsell, and Arthur J Olson. Computational protein-
ligand docking and virtual drug screening with the autodock
suite. Nature protocols, 11(5):905–919, 2016.
[24] I.E. Gabis and I.A. Chernov. The Kinetics of Binary Metal Hy-
dride Decomposition. Chemistry Research and Applications.
Nova Publisher, 2017.
[25] Gaurav D Ghare and Scott T Leutenegger. Improving speedup
and response times by replicating parallelprograms on a SNOW.
In Workshop on Job Scheduling Strategies for Parallel Process-
ing, pages 264–287. Springer, 2004.
[26] D.L. González, G.G. Gil, F.F. de Vega, and B. Segal. Centralized
BOINC resources manager for institutional networks. In Paral-
lel and Distributed Processing, 2008. IPDPS 2008. IEEE Interna-
tional Symposium on, pages 1–8. IEEE, 2008.
[27] P.Kacsuk, J. Kovacs, Z. Farkas, et al. SZTAKI desktop grid (SZDG):
A flexible and scalable desktop grid system. Journal of Grid
Computing, 7(4):439, 2009.
[28] M.Kh. Khan, T. Mahmood, and S.I. Hyder. Scheduling in desk-
top grid systems: Theoretical evaluation of policies and frame-
works. International Journal of Advanced Computer Science and
Applications, 8(1):119–127, 2017.
[29] J. Kołodziej and F. Xhafa. Meeting security and user behavior
requirements in grid scheduling. Simulation Modelling Practice
and Theory, 19(1):213–226, jan 2011.
[30] D. Kondo and H. Casanova. Computing the optimal makespan
for jobs with identical and independent tasks scheduled on
volatile hosts. Technical report CS2004-0796, Dept. of Com-
puter Science and Engineering, University of California, San
Diego, 2004.
[31] D.Kondo, A.A . Chien, and H. Casanova. Scheduling task parallel
applications for rapid turnaround on enterprise desktop grids.
Journal of Grid Computing, 5(4):379–405, oct 2007.
[32] Ilya Kurochkin and Anatoliy Saevskiy. BOINC forks, issues
and directions of development. Procedia Computer Science,
101:369–378, 2016. 5th International Young Scientist Confer-
ence on Computational Science, YSC 2016, 26-28 October 2016,
Krakow, Poland.
[33] J.L. Lerida, F. Solsona, P. Hernandez, F. Gine, M. Hanzich, and
J. Conde. State-based predictions with self-correction on Enter-
prise Desktop Grid environments. Journal of Parallel and Dis-
tributed Computing, 73(6):777–789, 2013.
[34] Chunlin Li and Layuan Li. Utility-based scheduling for grid com-
puting under constraints of energy budget and deadline. Com-
puter Standards & Interfaces, 31:1131–1142, 2009.
[35] M. Maheswaran, Sh. Ali, H.J. Siegel, D. Hensgen, and R.F. Fre-
und. Dynamic mapping of a class of independent tasks onto
heterogeneous computing systems. Journal of Parallel and Dis-
tributed Computing, 59(2):107–131, 1999.
[36] V.V. Mazalov, N.N. Nikitina, and E.E. Ivashko. Hierarchical two-
level game model for tasks scheduling in a desktop grid. In Ul-
tra Modern Telecommunications and Control Systems and Work-
shops (ICUMT), 2014 6th International Congress on, pages 541–
545. IEEE, 2014.
[37] V.V. Mazalov, N.N. Nikitina, and E.E Ivashko. Task scheduling in
a desktop grid to minimize the server load. In V. Malyshkin, ed-
itor, Parallel Computing Technologies, International Conference
on, volume 9251, pages 273–278. Springer, 2015.
[38] S. Nesmachnow, B. Dorronsoro, J.E. Pecero, and P. Bouvry.
Energy-aware scheduling on multicore heterogeneous grid com-
puting systems. Journal of Grid Computing, 11:653–680, 2013.
[39] Natalia Nikitina, Evgeny Ivashko, and Andrei Tchernykh. Con-
gestion game scheduling implementation for high-throughput
virtual drug screening using boinc-based desktop grid. In Inter-
national Conference on Parallel ComputingTechnologies, pages
480–491. Springer, 2017.
[40] A.-C. Orgerie, L. Lefévre, and J.-P. Gelas. Save watts in your grid:
Green strategies for energy-aware framework in large scale dis-
tributed systems. In Parallel and Distributed Systems, 14th IEEE
International Conference on, pages 171–178. IEEE, 2008.
[41] B. Qu, Y. Lei, and Y. Zhao. A new genetic algorithm based
scheduling for volunteer computing. In Computer and Commu-
nication Technologies in Agriculture Engineering (CCTAE), 2010
International Conference On, volume 3, pages 228–231. IEEE,
2010.
[42] Rob E Quick, Samy Meroueh, Soichi Hayashi, Rynge Mats, Scott
Teige, David Xu, and Bo Wang. Building a chemical-protein in-
teractome on the open science grid. In International Symposium
on Grids and Clouds (ISGC) 2015, 15–20 March 2015, Academia
Sinica, Taipei, Taiwan, pages 1–5, 2015.
[43] Josep Rius, Fernando Cores, and Francesc Solsona. Cooper-
ative scheduling mechanism for large-scale peer-to-peer com-
puting systems. Journal of Network and Computer Applications,
36(6):1620 – 1631, 2013.
[44] Saddaf Rubab, Mohd Fadzil Hassan, Ahmad Kamil Mahmood,
and Syed Nasir Mehmood Shah. Proactive job scheduling and
migration using articial neural networks for volunteer grid. In
Computer Science and Engineering, First EAI International Con-
ference on. EAI, 3 2017.
[45] Chetan Rupakheti, Aaron Virshup, Weitao Yang, and David N Be-
ratan. Strategy to discover diverse optimal molecules in the
small molecule universe. Journal of chemical information and
Unauthenticated
Download Date | 12/4/17 1:14 AM
Task Scheduling in Desktop Grids: Open Problems |351
modeling, 55(3):529–537, 2015.
[46] S.A. Salinas. PFS: A productivity forecasting system for desktop
computers to improve grid applications performance in Enter-
prise Desktop Grid. Computing and Informatics, 33:783–809,
2014.
[47] S.A. Salinas, C.G. Garino, and A. Zunino. An architecture
for resource behavior prediction to improve scheduling sys-
tems performance on enterprise desktop grids. In Advances in
New Technologies, Interactive Interfaces and Communicability,
pages 186–196. Springer, 2012.
[48] L.F. Sarmenta. Volunteer computing. PhD thesis, Mas-
sachusetts Institute of Technology, 2001.
[49] S.S. Sathya and K.S. Babu. Survey of fault tolerant techniques
for grid. Computer Science Review, 4:101–120, 2010.
[50] D. Szajda, B. Lawson, and J. Owen. Hardening functions for
large-scale distributed computations. In Security and Privacy,
2003. Proceedings. 2003 Symposium on, page 7946298. IEEE,
2003.
[51] A. Tchernykh, J.E. Pecero, A. Barrondo, and E. Schaeer. Adap-
tive energy ecient scheduling in peer-to-peer desktop grids.
Future Generation Computer Systems, 36:209–220, 2014.
[52] D. Toth and D. Finkel. Improving the productivity of volunteer
computing by using the most eective task retrieval policies.
Journal of Grid Computing, 7(4):519–535, dec 2009.
[53] M. Ujhelyi, P. Lacko, and A. Paulovic. Task scheduling in dis-
tributed volunteer computing systems. In Intelligent Systems
and Informatics (SISY), 2014 IEEE 12th International Symposium
on, pages 111–114. IEEE, 2014.
[54] John Von Neumann and Oskar Morgenstern. Theory of games
and economic behavior. Princeton University Press, 2007.
[55] X. Wang, Ch.Sh. Yeo, R. Buyya, and J. Su. Optimizing the
makespan and reliability for workflow applications with repu-
tation and a look-ahead genetic algorithm. Future Generation
Computer Systems, 27(8):1124–1134, oct 2011.
[56] Y. Wang, J. Wei, Sh. Ren, and Yu. Shen. Toward integrity assur-
ance of outsourced computing — a game theoretic perspective.
Future Generation Computer Systems, 55:87–100, 2016.
[57] F. Xhafa and A. Abraham. Computational models and heuristic
methods for grid scheduling problems. Future Generation Com-
puter Systems, 26(4):608–621, apr 2010.
[58] Jianhua Yu, Yue Luo, and Xueli Wang. Deceptive detection and
security reinforcement in grid computing. In Intelligent Network-
ing and Collaborative Systems (INCoS), 2013 5th International
Conference on, pages 146–152. IEEE, 2013.
[59] K.-M. Yu, Z.-J. Luo, C.-H. Chou, C.-K. Chen, and J. Zhou. A
fuzzy neural network based scheduling algorithm for job as-
signment on computational grids. In T. Enokido, L. Barolli, and
M. Takizawa, editors, Network-Based Information Systems, vol-
ume 4658, pages 533–542, Berlin, Heidelberg, 2007. Springer.
Unauthenticated
Download Date | 12/4/17 1:14 AM