Evaluating Strategies for Running from the Cops.
ABSTRACT Moving target search (MTS), or the game of cops and robbers, has a broad field of application ranging from law enforcement to computer games. In recent years, research has focused on computing move policies for one or multiple pursuers (cops). The present work motivates extending this perspective to both sides, thus developing algorithms for the target (robber). We investigate the game with perfect information for both players and propose two new methods, named TrailMax and Dynamic Abstract TrailMax, to compute move policies for the target. Experiments are conducted by simulating games on 20 maps of the commercial computer game Baldur's Gate and measuring survival time and computational complexity. We test seven algorithms: Cover, Dynamic Abstract Minimax, minimax, hill climbing with distance heuristic, a random beacon algorithm, TrailMax and DATrailMax. Analysis shows that our methods outperform all the other algorithms in quality, achieving up to 98% optimality, while meeting modern computer game computation time constraints.

Evaluating Strategies for Running from the Cops
Carsten Moldenhauer and Nathan R. Sturtevant
Department of Computer Science
University of Alberta
Edmonton, AB, Canada T6G 2E8
moldenha, nathanst@cs.ualberta.ca
1 Introduction
Moving target search (MTS), or the game of cops and robbers, has many applications ranging from law enforcement to video games. The game was introduced into the artificial intelligence literature by [Ishida and Korf, 1991] as a new variant of the classical search problem. Following this study, the question of how to catch a moving prey effectively has been studied extensively [Ishida and Korf, 1995; Koenig et al., 2007; Isaza et al., 2008].
In today's computer games, the players control robbers being chased by computer-generated police agents. The same game turned around, i.e. the player controlling a cop and having to chase down a computer-generated robber, is far from realizable. This is because the focus in MTS research has always been on developing strong pursuit strategies. Very little is known about how to compute strategies for the target. This paper presents a systematic study of move policies for the robber, thus enabling better target modelling and widening the current focus in MTS.
The game of cops and robber has also been studied in the mathematical literature (see [Hahn, 2007] for a survey). Here, cops and robber alternately choose their initial positions at the beginning of the game and then play as in MTS. The search time of a graph, i.e. the time needed by optimally playing cops to catch the robber, is therefore a constant. Besides bounds for the one cop and one robber problem, little is known about this graph property. However, a first polynomial-time algorithm that computes the search time has been developed in [Hahn and MacGillivray, 2006]. Given this algorithm, it is possible to determine optimal policies for both players.
Computer games require tight bounds on resource usage, especially computation time. Therefore, computing optimal policies, even though possible in polynomial time with the above algorithm, is not practical. This raises the question of how to quickly compute approximations that yield near-optimal move policies. In the following, we introduce a new algorithm called TrailMax and its variant Dynamic Abstract TrailMax to address this question.
As optimality has only been studied recently, previous work in MTS has been concerned with approximate solutions and has not, whether for the pursuer or the target, compared methods against optimal policies. Therefore, this paper is the first to conduct a study of various target algorithms with respect to their achieved suboptimality.
A precise definition of the cops and robber game considered in this work will be given in Section 2. We review existing algorithms, including Cover and Dynamic Abstract Minimax, and outline their strengths and weaknesses in Section 3. The new methods, TrailMax and Dynamic Abstract TrailMax, are introduced in Section 4. Evaluations of experiments and extensive comparisons of various target algorithms when playing against an optimal cop can be found in Section 5. We wrap up with conclusions in Section 6.
2 Game Definition
The game of cops and robber is played with n cops and one robber. Cops and robber occupy vertices in a finite undirected connected graph G and are allowed to move to an adjacent vertex or remain on their current location in each turn. Turns alternate: the cops move in order from first to last, followed by the robber. The game is played with perfect information, i.e. the graph and all locations of all agents are known. G is called n-cop-win if n cops have a winning strategy on G.
Figure 1: Map abstraction for DAM.
Since our focus is on the target and the cop is potentially played by a human player, we concentrate on the one cop, one robber problem here. However, all the following methods can easily be extended to multiple cops. Furthermore, we are interested in playing on typical video game maps that include obstacles. Hence, one cop cannot catch an optimally playing robber when both agents move at the same speed. To enable experiments, i.e. many simulations of the game, we have to choose one of three ways to guarantee termination: the target moves suboptimally from time to time, the game is ended after a certain number of steps, or the cop is faster than the target. The first possibility contradicts our wish to compute near-optimal policies for the robber. The second is problematic because timeout conditions have to be chosen. Furthermore, it does not measure the full amount of suboptimality generated by a given strategy because the game is truncated when the timer runs out. Moreover, it is easy to construct an algorithm that achieves optimal results in this game: detect all cycles around obstacles of length greater than or equal to four in the map, run to a cycle that the robber can reach before the cop captures him, and exploit the cycle. Therefore, we allow the cop to be faster than the robber. For simplicity, we allow the cop to make d subsequent moves when the robber only gets one, i.e. to move to any location within a radius of d of his current position.
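The speed rule can be illustrated with a short sketch. It assumes the map is given as an adjacency-list dictionary with unit edge costs; the names `cop_moves`, `graph` and `path` are illustrative and do not come from the paper.

```python
from collections import deque

def cop_moves(graph, start, d):
    """Return every vertex within graph distance d of start (start included)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if dist[v] == d:
            continue  # radius exhausted, do not expand further
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)

# Hypothetical example: a path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(cop_moves(path, 0, 2)))  # [0, 1, 2]
```

With d = 1 this reduces to the closed neighborhood used in the equal-speed game.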
3 Related Work
There are only two advanced methods in the literature that try to compute move policies for the robber quickly. [Bulitko and Sturtevant, 2006] suggest using Dynamic Abstract Minimax (DAM). This algorithm assumes that various resolutions of abstract maps are available, where an abstract map is created by taking sets of states in an original map and merging them together to form a more abstract map. DAM chooses a level of abstraction to begin with, and then computes a minimax solution to a fixed depth. If the robber cannot avoid capture at that level of abstraction, computation proceeds to the next lower level. We illustrate this in Figure 1. In the abstract map two sets of 9 states have been abstracted together to form a 2-node graph. The cop can catch the robber in one move in the abstract graph, so DAM will search again on the lower level of abstraction. Assume there are ? levels of abstraction and the cop and the robber occupy distinct nodes up until level m. The original algorithm begins planning at level m. Running the experiments in Section 5 for multiple fractions of m showed that starting at level m/2 is superior. We report the experiments for the latter case.
Figure 2: Example of where the original tie breaking of the cover heuristic computation can cause the robber to remain in v instead of going to w.
If the robber can escape, an abstract goal destination is selected and projected onto the actual map. PRA* [Sturtevant and Buro, 2005] is used to compute a path to that node, which is subsequently followed for one step. Since only the goal destination is projected onto the ground level, DAM can make mistakes when cycles exist in the strategy. For example, consider a cycle with five nodes and an adjacent cop and robber. The solution is to run around this cycle, but after seven steps the robber will reach the initial position of the cop. Hence, when computing with depth seven, the robber will run towards the cop. The remedy is to make DAM refine only one abstract step. However, running the experiments in Section 5 for such a variant showed that the original algorithm, despite its flaws, achieves slightly better results.
Within the present work, we use the same idea of using abstractions for speedup. Our algorithm uses the same policy (m/2) for selecting the first level of abstraction, solves the problem on this level and proceeds to the next lower level if, according to the computed solution, the robber cannot survive long enough. Otherwise, the abstract solution path is refined into a ground-level path.
The Cover heuristic, as a state-of-the-art algorithm for moving target search, has been used for both the cop and the robber [Isaza et al., 2008]. This algorithm computes the number of nodes in the graph that the respective agent can get to before any other agent. It then tries to maximize this area with each move, minimizing the area the opponent can reach.
The original algorithm breaks ties by assigning the nodes on the border between two covered areas to the cop. This causes the heuristic to be inaccurate even for simple problems. [Isaza et al., 2008] used a notion of risk to increase the pursuer's aggressiveness and circumvent this inaccuracy for the cop. As an example for the robber, consider the graph in Figure 2. There are three vertices, u, v, and w. The cop starts on u, the robber on v, and it is the robber's turn. When the robber remains on v, v and w are considered robber cover. If he moves to w, u and v are cop cover (due to the tie-breaking rule) and only w is robber cover. Thus, when maximizing, the robber prefers to stay in v, which is suboptimal. In this work, we modify Cover to eliminate this problem. Vertices are only declared robber cover if he is guaranteed to reach them no matter what the cop does.
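The Figure 2 example can be reproduced with a small sketch. It uses a simplified, distance-based reading of the original tie-breaking rule (a vertex counts as robber cover only if the robber is strictly closer to it than the cop), which is an assumption made for illustration and not the implementation of [Isaza et al., 2008]. The function names are hypothetical, and the graph is assumed to be connected.

```python
from collections import deque

def bfs(graph, start):
    """Unit-cost distances from start to every reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def robber_cover(graph, robber, cop):
    """Vertices the robber is strictly closer to; ties are assigned to the cop."""
    d_r, d_c = bfs(graph, robber), bfs(graph, cop)
    return {v for v in graph if d_r[v] < d_c[v]}

# Figure 2: path u - v - w, cop on u, robber on v, robber to move.
graph = {"u": ["v"], "v": ["u", "w"], "w": ["v"]}
stay = robber_cover(graph, "v", "u")  # robber remains on v
move = robber_cover(graph, "w", "u")  # robber steps to w
print(sorted(stay), sorted(move))     # ['v', 'w'] ['w']
```

Hill climbing on the size of these sets keeps the robber on v (cover of size 2) rather than moving him to w (cover of size 1), which is the suboptimal behavior described above.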
However, we found that no matter how the Cover heuristic is defined, it is easy to construct a simple example where hill climbing would fail for either of the two players. Using notions of ties and untouchable nodes can solve some of the issues but subsequently turns the heuristic into a search algorithm instead of a static heuristic. Thus, we seek to develop a more principled search method instead of trying to patch Cover.
When used for the pursuer, Cover with Risk and Abstraction (CRA) [Isaza et al., 2008] makes use of abstractions to decrease computation time and to scale to large maps. This has not been used for the robber. Since the heuristic is most accurate with full information, using abstractions only trades accuracy against speed. Within our experiments, the Cover heuristic without abstractions already performed poorly in terms of survival time against an optimal cop. Therefore, we did not extend the algorithm to incorporate abstractions.
Optimal move policies for both cops and robbers are studied by [Moldenhauer and Sturtevant, 2009]. They developed algorithms that solve one problem instance, i.e. compute optimal policies for a given initial position. Unfortunately, although well optimized, optimal algorithms do not scale to very large maps and cannot meet the tight computation time constraints of modern computer games. An algorithm that solves a map, i.e. computes a strategy for cop and robber for every possible initial position, was first proposed by [Hahn and MacGillivray, 2006]. We use an improved version that has been used as a baseline in [Moldenhauer and Sturtevant, 2009] to compute optimal solutions offline and to generate a cop that moves optimally within our experiments.
4 TrailMax
We now outline our approach to computing near-optimal move policies for the robber. We will first motivate the algorithm and then provide more details. For ease of understanding, the following ideas will be developed for the game where cop and robber move at the same speed. However, all the definitions and theorems can be extended to games with different speeds.
The robber makes the assumption that the cop knows where he is going to move, i.e. that the cop will play a best response against him. Under this assumption, the robber tries to maximize the time to capture. This can also be interpreted as “running away”, i.e. taking the path that the cop takes longest to intersect. We will now formalize this idea. Let $N[v] = \{w \mid (v,w) \in E(G)\} \cup \{v\}$ denote the closed neighborhood of $v \in G$. Let
\[ P(v) = \{\, p : \mathbb{N} \to V \mid p(0) = v,\ \forall i \ge 0 : p(i+1) \in N[p(i)] \,\} \]
be the set of paths starting in $v$. Given paths $p_r$ and $p_c$ for the robber and cop, respectively, that they will follow regardless of the opponent's actions, we can compute the sum of the numbers of turns both agents take until capture occurs:
\[ T(p_r, p_c) = \min\bigl( \{\, 2t \mid t \ge 0,\ p_c(t) = p_r(t) \,\} \cup \{\, 2t - 1 \mid t \ge 1,\ p_c(t) = p_r(t-1) \,\} \bigr). \]
Definition 1 (TrailMax) Let $v_r \in G$ and $v_c \in G$ be the positions of robber and cop in G. We define
\[ \mathrm{TrailMax}(v_r, v_c) = \max_{p_r \in P(v_r)} \min_{p_c \in P(v_c)} T(p_r, p_c). \tag{1} \]
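The definitions above can be checked on tiny graphs with a brute-force sketch. It truncates the infinite paths in P(v) to a fixed number of steps, which is an assumption made for illustration, and enumerates all path pairs, so it is exponential and only suitable for toy examples. The names `capture_time`, `paths` and `trailmax_value` are hypothetical.

```python
import math

def capture_time(p_r, p_c):
    """T(p_r, p_c) on finite path prefixes; math.inf if no capture occurs in them."""
    for t in range(min(len(p_r), len(p_c))):
        # The cop moves first: after his t-th move the robber has made t - 1 moves.
        if t >= 1 and p_c[t] == p_r[t - 1]:
            return 2 * t - 1
        if p_c[t] == p_r[t]:
            return 2 * t
    return math.inf

def paths(graph, v, steps):
    """All walks of the given length from v, where staying in place is allowed."""
    if steps == 0:
        return [[v]]
    return [[v] + rest
            for w in graph[v] + [v]          # closed neighborhood N[v]
            for rest in paths(graph, w, steps - 1)]

def trailmax_value(graph, v_r, v_c, steps):
    """Max over robber paths of the min over cop paths of the capture time."""
    return max(min(capture_time(p_r, p_c) for p_c in paths(graph, v_c, steps))
               for p_r in paths(graph, v_r, steps))

# Hypothetical example: path graph 0 - 1 - 2 - 3, robber on 2, cop on 0.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(capture_time([2, 3, 3, 3], [0, 1, 2, 3]))  # 5
print(trailmax_value(line, 2, 0, 4))             # 5
```

In this example the robber's best trail is to run to the dead end at vertex 3 and wait, so the cop needs three moves and capture occurs after five turns in total.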
Figure 3: Smallest 1-cop-win graph where the set of moves according to TrailMax (solid) diverges from the set of optimal moves (dashed).
We say G is an octile map if its vertices are positions in a two-dimensional grid and each vertex is connected to its up to eight neighbors via the two horizontals, two verticals and four diagonals. Within our experiments we use octile maps to model the environment.
Recall that a graph G is called n-cop-win if n cops have a winning strategy on G for any initial position of the cops and the robber and when all agents move at the same speed.
Theorem 1 Let G be a 1-cop-win octile map. Let $v_r$ and $v_c$ be the initial positions of robber and cop. Then TrailMax($v_r$, $v_c$) returns the optimal value of the game where the cop and robber move at the same speed.
This theorem also holds when the cop is faster as described in Section 2. However, this requires obvious adjustments of the above definitions and is therefore omitted for readability. Unfortunately, the theorem does not hold for general 1-cop-win or n-cop-win graphs (n ≥ 2).
TrailMax can be used to generate move policies for the robber. For simplicity, the resulting algorithm will be referred to by the same name. Furthermore, a pair ($p_r$, $p_c$) that attains the value in (1) will be called a TrailMax pair. The algorithm computes a TrailMax pair ($p_r$, $p_c$) and then follows the robber's path $p_r$ for k steps (k ≥ 1) regardless of the cop's actions. Afterwards, TrailMax is called again and a new path $p_r$ is computed, hence our notation TrailMax(k). Unfortunately, the natural conjecture that TrailMax(1) yields an optimal strategy for general n-cop-win graphs is false. Depicted in Figure 3 is an example of a 1-cop-win graph where the robber is to move and the optimal move is to remain on his current position, marked with an r. This causes the cop to commit to a direction, after which the robber can run away more effectively. However, according to TrailMax, the robber has to move to either of the indicated adjacent positions.
A TrailMax pair is efficiently computed by simultaneously expanding vertices around the robber's and cop's positions in a Dijkstra-like fashion. Two priority queues are maintained, one for the cop and one for the robber. All nodes of a given cost for the robber are expanded first, because the robber moves immediately after computing a policy. Node expansions for the robber are checked against the cop's expanded nodes to test whether the cop could have already reached that point and captured the robber. If this is the case, the node is discarded. Otherwise, the vertex is declared as robber cover and expanded normally. When taking a node from the queue for the cop, it is always expanded normally.
A visualization is depicted in Figure 4. The grey area indicates the vertices that are declared robber cover but are not expanded anymore since the expansion around the cop's position captured them in a previous turn. Computation ends when all nodes declared as robber cover have been expanded by the cop as well. The last node that is explored by the cop is the goal node the robber will run to. Path generation can easily be done by maintaining pointers to parents when expanding nodes.
Figure 4: Visualization of TrailMax's computation. The grey area is the nodes that have been reached by the robber first, declared as robber cover but will not be expanded anymore since they were captured by the cop in a previous turn.
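The expansion scheme described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: it assumes unit edge costs, so the two priority queues collapse into breadth-first layers, and it models a faster cop by expanding `cop_speed` cop layers per robber layer. All names are hypothetical.

```python
def trailmax_path(graph, v_r, v_c, cop_speed=1):
    """Return a shortest robber path to the robber-cover vertex the cop reaches last."""
    cop_seen = {v_c}          # vertices the cop could have reached so far
    cop_frontier = [v_c]
    parent = {v_r: None}      # robber cover, with parent pointers for path generation
    robber_frontier = [v_r]
    uncaught = set() if v_r == v_c else {v_r}
    goal = v_r
    while uncaught and cop_frontier:
        # Robber layer first: he moves right after computing the policy.
        next_robber = []
        for v in robber_frontier:
            if v in cop_seen:
                continue      # captured in a previous turn, not expanded anymore
            for w in graph[v]:
                if w not in parent and w not in cop_seen:
                    parent[w] = v
                    uncaught.add(w)
                    next_robber.append(w)
        robber_frontier = next_robber
        # Cop layers: cop nodes are always expanded normally.
        for _ in range(cop_speed):
            next_cop = []
            for v in cop_frontier:
                for w in graph[v]:
                    if w not in cop_seen:
                        cop_seen.add(w)
                        next_cop.append(w)
                        if w in uncaught:
                            uncaught.discard(w)
                            goal = w  # latest robber-cover vertex reached by the cop
            cop_frontier = next_cop
    path, v = [], goal
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]

# Hypothetical example: path graph 0 - 1 - ... - 5, robber on 2, cop on 0.
line = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
print(trailmax_path(line, 2, 0))               # [2, 3, 4, 5]
print(trailmax_path(line, 2, 0, cop_speed=2))  # [2, 3, 4]
```

With non-uniform edge costs the layers would have to be replaced by the two priority queues mentioned in the text.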
The above computation finds one goal vertex and a shortest path to it. The path then has to be extended by moves that make the robber remain on the goal vertex until capture. It is not hard to show that this extended shortest path is indeed a solution to (1). Note that there might be many possible goal vertices the robber could run to and many different paths to get to them that fulfill (1). Finding all such vertices is possible by remembering all robber nodes that have not been caught before the cop's last expansion turn. This could potentially be used to take advantage of a suboptimal cop, although we do not study this issue here.
Within computer game maps, edge costs are often approximated to enable faster computation. Under the assumption that path costs can only take a fixed number of values, i.e. buckets can be used within the priority queue and queue access takes constant time, the above algorithm runs in time linear in the size of the graph. Although TrailMax already scales well to large maps (cf. Section 5), our goal is to make computation time as independent of the size of the input graph as possible. Inspired by DAM, we use abstraction to achieve this goal. Starting at an intermediate level of abstraction of the hierarchy relative to the cop and robber positions, TrailMax is computed. If the solution length, as computed by (1), does not exceed a certain value q, then computation proceeds to the next lower level. If it does, the computed abstract path is refined to a ground-level path using PRA*'s refinement, i.e. progressively computing a path on the next lower level that only goes through nodes whose parents are either on or adjacent to the abstract path. In the following, this algorithm is called Dynamic Abstract TrailMax with threshold q and number of steps k for which the solution is followed, hence DATrailMax(q,k).
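The level-selection loop of this scheme can be sketched in a few lines. The sketch covers only the descent through the hierarchy; `solve_at_level` is a hypothetical callback standing in for a TrailMax computation on one abstraction level, level 0 is assumed to be the ground-level map, and the PRA* refinement step is not shown.

```python
def da_trailmax_level(solve_at_level, start_level, q):
    """Descend the hierarchy until the abstract solution is longer than q.

    solve_at_level(level) -> (solution_length, abstract_path) is a hypothetical
    callback; the returned path would still have to be refined to ground level.
    """
    for level in range(start_level, -1, -1):
        length, path = solve_at_level(level)
        if length > q or level == 0:
            return level, path

# Hypothetical solution lengths per level, purely for illustration.
lengths = {3: 4, 2: 8, 1: 12, 0: 30}
level, path = da_trailmax_level(lambda lv: (lengths[lv], ["stub"]), 3, q=10)
print(level)  # 1
```

On the ground level the solution is returned regardless of its length, since no lower level remains.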
5 Experiments
To evaluate our algorithms we compare them to the algorithms described in Section 3 and measure the quality and required computation time in terms of node expansions. We set d = 2, i.e. the cop can take two turns before the robber gets one and can thus move to any location within a radius of 2 around his current location. Preliminary experiments show that greater cop speeds yield the same trends. However, since capture occurs faster, the game becomes easier and less interesting for the robber.
Figure 5: One of the maps used in Baldur's Gate that the experiments were conducted on. The black parts are obstacles, white is traversable.
To generate meaningful statistics we use 20 maps from the commercial game Baldur's Gate as a testbed. The smallest of these maps has 2,638 vertices and the largest 22,216. A plot of a sample map can be found in Figure 5. Furthermore, 1000 initial positions for each map are generated randomly. We choose the positions at random because we want to explore the performance of the algorithms in all scenarios, since in a video game both agents could potentially be spawned anywhere in the map.
We choose octile connections for the map representation, and subsequent levels of abstraction are generated using Clique Abstraction [Sturtevant and Buro, 2005]. To enable effective transposition table lookups in minimax and DAM we set all edge costs to one in all levels of abstraction. Thus, the distance heuristic between two positions (on an abstraction or ground level) becomes the maximum norm of these positions. Furthermore, uniform edge costs mean we are optimizing the number of turns both players take rather than the distance they travel. All the tested algorithms can be used with non-uniform edge costs; only the performance of minimax and DAM is expected to be lower.
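The relation between unit edge costs and the maximum norm can be checked with a short sketch. It assumes an obstacle-free octile grid, on which the maximum norm equals the true number of moves; with obstacles it is only a lower bound. The function names are illustrative.

```python
from collections import deque

def max_norm(a, b):
    """Maximum norm (Chebyshev distance) between two grid positions."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def bfs_distance(width, height, start, goal):
    """Number of moves on an obstacle-free octile grid with all edge costs set to 1."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if (x, y) == goal:
            return dist[(x, y)]
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                n = (x + dx, y + dy)
                inside = 0 <= n[0] < width and 0 <= n[1] < height
                if (dx or dy) and inside and n not in dist:
                    dist[n] = dist[(x, y)] + 1
                    queue.append(n)
    return None

print(max_norm((0, 0), (5, 2)), bfs_distance(6, 6, (0, 0), (5, 2)))  # 5 5
```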
Using an improved version of the algorithm in [Hahn and MacGillivray, 2006], the entire joint state space is solved first, i.e. we compute the values of an optimal game for each tuple of positions of the robber and cop. This is done in an offline computation and is used to generate optimal move policies for the cop as well as to know the optimal value of the game. Generation of these offline solutions took up to 2.5 hours per map.
We study the following target algorithms:
Cover. The target performs hill climbing on the Cover heuristic (cf. Section 3). The heuristic has to be computed in every step and for every possible move.
Greedy. The target performs hill climbing using the distance heuristic. This is extremely fast since distance evaluation is very simple.
Minimax. The target runs minimax with αβ pruning, transposition tables and distance heuristic as evaluation function. We experimented with depths from 1 to 11.
DAM. The target runs dynamic abstract minimax with αβ pruning, transposition tables and distance heuristic as evaluation function (cf. Section 3). We experimented with depths from 1 to 11. The depth is used for computation on every level of abstraction.
RandomBeacons(1-20). The target randomly distributes 40 beacons on the map. It then selects the beacon that is heuristically furthest away from the cop's position and computes a path to this location. The path is followed for k steps before computing a new path, hence RandomBeacons(k). We tested RandomBeacons(k) for k = 1,...,20.
TrailMax(1-20). We tested TrailMax(k) for k = 1,...,20.
DATrailMax(10,1-20). We tested DATrailMax(10,k) for k = 1,...,20. q = 10 was chosen by hand. The question of whether there is a better setting remains for future investigation.
Table 1: Experimental results.
algorithm        optim.  nE/c    nT/c     nE/t   nT/t
Cover            61.9%   4.687   156.831  -      -
RBeacons(1)      64.3%   0.158   1.065    -      -
RBeacons(5)      65.9%   0.159   1.070    0.037  0.248
RBeacons(10)     67.4%   0.160   1.075    0.022  0.148
RBeacons(15)     68.6%   0.161   1.083    0.017  0.117
RBeacons(20)     69.5%   0.162   1.091    0.015  0.102
Greedy           76.0%   0.0002  0.001    -      -
Minimax(5)       78.7%   0.031   0.216    -      -
Minimax(7)       79.2%   0.146   1.027    -      -
Minimax(9)       79.8%   0.499   3.546    -      -
Minimax(11)      80.3%   1.354   9.709    -      -
DAM(5)           88.8%   0.039   0.238    -      -
DAM(7)           88.4%   0.123   0.752    -      -
DAM(9)           87.8%   0.323   1.985    -      -
DAM(11)          87.1%   0.729   4.476    -      -
TrailMax(1)      98.3%   0.502   16.682   -      -
TrailMax(5)      98.0%   0.520   17.301   0.108  3.598
TrailMax(10)     97.7%   0.543   18.060   0.059  1.970
TrailMax(15)     97.5%   0.565   18.769   0.043  1.433
TrailMax(20)     97.3%   0.585   19.436   0.035  1.169
DATrailMax(1)    97.0%   0.101   2.283    -      -
DATrailMax(5)    97.1%   0.104   2.342    0.023  0.515
DATrailMax(10)   97.0%   0.110   2.487    0.014  0.311
DATrailMax(15)   96.8%   0.107   2.395    0.011  0.251
DATrailMax(20)   96.7%   0.106   2.359    0.010  0.225
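As an illustration of the simplest entry in the list of algorithms, the Greedy target can be sketched as one hill-climbing step on the distance heuristic. The sketch assumes grid positions given as (x, y) tuples and the maximum norm as distance heuristic; `greedy_move` and `open_grid_neighbors` are hypothetical names, and the neighbor function below ignores obstacles and map borders.

```python
def max_norm(a, b):
    """Maximum norm between two grid positions, used as distance heuristic."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def greedy_move(robber, cop, neighbors):
    """Stay, or step to the adjacent cell that is heuristically furthest from the cop."""
    candidates = [robber] + list(neighbors(robber))
    return max(candidates, key=lambda cell: max_norm(cell, cop))

def open_grid_neighbors(cell):
    """Eight octile neighbors on an unbounded, obstacle-free grid (illustration only)."""
    x, y = cell
    return [(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if dx or dy]

step = greedy_move((2, 2), (0, 2), open_grid_neighbors)
print(max_norm(step, (0, 2)))  # 3
```

Each call evaluates at most nine candidate positions, which is in line with the very low node counts reported for Greedy in Table 1.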
To evaluate performance, the game is simulated for each initial position on each map. Within these simulations, the target algorithm is called whenever a new move has to be generated. TrailMax, DATrailMax and RandomBeacons are only called when a new path has to be computed, thus the number of turns and the number of algorithm calls differ in this case. All other algorithms are called once per turn and therefore these two numbers are equal. In fact, it is not possible for TrailMax, DATrailMax and RandomBeacons to spread their computation among the turns where the previously computed path is followed, because the future position of the cop is unknown. Nonetheless, when used in computer games, these algorithms will only require computation once every k steps and therefore make the frames during path execution available to other tasks. Therefore, we can also analyze the computation time per turn for these three methods.
Figure 6: Optimality versus node expansions per turn in one game simulation. Averaged over the number of games played in the experiments. Left bottom corner is best, right upper corner is worst. (Axes: Optimality, 0.6 to 1.0; Nodes Expanded per Turn (nE/T), logarithmic. Plotted points: Minimax(11), TM(1), TM(20), DAM(0.5, 11), Beacons(1), Beacons(20), Greedy/Minimax(1), Cover, DATM(10, 20), DAM(0.5, 1), DATM(10, 1).)
We are interested in the following performance measures:
• the expected survival time of the target, measured as a percentage of the optimal survival time (suboptimality),
• the number of node expansions per call to the algorithm within one game simulation (nE/c), and
• for TrailMax, DATrailMax and RandomBeacons, the amortized number of nodes expanded per turn within one game simulation (nE/t).
Similar measures are presented for nodes touched per call (nT/c) and per turn (nT/t). To account for variable-sized maps, the numbers of nodes expanded and touched are further normalized and measured as a percentage of the map size. Nodes expanded counts how many times the neighbors of a node were generated, while nodes touched measures how many times a node is visited in memory.
The results are in Table 1 and are plotted in Figure 6. The x-axis is reversed so the best algorithms are near the origin, with high optimality and few expansions per move. Note also the logarithmic scale on the number of node expansions. A Pareto-optimal boundary is formed by Greedy and the TrailMax algorithms, meaning that each of the other algorithms has both lower optimality and more node expansions per move than some algorithm on the boundary, on average.
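The boundary can be recomputed from a few rows of Table 1 with a small sketch. The selection of rows is ours, for illustration only. For algorithms that are called once per turn, and for TrailMax(1), the per-call value nE/c is used as the per-turn value; for RBeacons(20), TrailMax(20) and DATrailMax(20) the amortized nE/t column is used. The helper names are hypothetical.

```python
# (algorithm, optimality in %, node expansions per turn in % of map size)
rows = [
    ("Cover", 61.9, 4.687),
    ("RBeacons(20)", 69.5, 0.015),
    ("Greedy", 76.0, 0.0002),
    ("Minimax(11)", 80.3, 1.354),
    ("DAM(5)", 88.8, 0.039),
    ("TrailMax(1)", 98.3, 0.502),
    ("TrailMax(20)", 97.3, 0.035),
    ("DATrailMax(20)", 96.7, 0.010),
]

def dominates(a, b):
    """a is at least as good as b on both measures and strictly better on one."""
    return a[1] >= b[1] and a[2] <= b[2] and (a[1] > b[1] or a[2] < b[2])

def pareto_front(rows):
    """Names of all rows that no other row dominates."""
    return [r[0] for r in rows if not any(dominates(o, r) for o in rows)]

print(pareto_front(rows))
# ['Greedy', 'TrailMax(1)', 'TrailMax(20)', 'DATrailMax(20)']
```

On this subset, only Greedy and the TrailMax variants remain undominated.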
Cover clearly performs the worst. Having to compute the heuristic in every step and for every possible move, its computation time is beyond any computer game requirement. Although solutions on abstractions can be computed in less time, Cover is also the worst algorithm with respect to optimality, and optimality decreases when using abstract solutions.
Considering quality, RandomBeacons is the second worst algorithm. This is because it does not play very well in the endgame, i.e. when the target is cornered and is about to be captured. When distributing the beacons, many of them lie in parts of the map that are heuristically far away from the cop. Thus, the robber runs towards these positions. Since he is cornered, this results in running into the cop.