Evolving Warriors for the Nano Core
Ernesto Sanchez
Massimiliano Schillaci
Giovanni Squillero
Abstract—The paper describes the attempt to cultivate
programs to climb the nano hill, a contest of the game called
corewar with exceptionally tight parameters. An existing tool,
called µGP, has been exploited. Two genetic operators were
added to tackle the peculiarities of the objective. The generated
programs compared favorably with others, either manually
written or evolved. µGP autonomously reproduced the same
structure of the current champion of the competition, and
devised a sharp self-modifying program exploiting a completely
new strategy.
I. INTRODUCTION
In August 2004, a program called White Noise challenged
SAL Tiny Hill, one of the hardest corewar contests available on the
internet. White Noise defeated all opponents, becoming the
new King Of The Hill (i.e., the champion). The program
fell from the top in summer 2005 when Larger Than
Infinity was submitted, but became king again 10 days later,
when on-a-tiny-amount-of-speed challenged the hill
(ranks in a corewar competition are updated continuously,
and scores may both increase and decrease). Today, White
Noise is still in the upper half of the hill, fighting to get to
the top again.
The point of interest in this story is that White Noise was
not written by a corewar expert, like the other programs on the
hill, but was cultivated by an evolutionary algorithm called
µGP [1]. The corewar community has traditionally paid a fair
amount of attention to evolvers (as they call all evolutionary
algorithms) [8] [9]; nevertheless, White Noise was the first
evolved beast able to top the difficult SAL Tiny Hill. The
enhanced techniques exploited to evolve it are described
in [2].
This paper further extends the research by targeting a
different type of contest, the so-called nano hills. The rules
used in nano hills make this competition quite interesting
from a purely evolutionary point of view, although it would not
be an appropriate benchmark if the final goal were to evolve a
test program for a microprocessor.
This paper is organized as follows: Section II describes
the game called corewar and the different competitions.
Section III focuses on the enhancements required to the
evolutionary core, while Section IV details the fitness
functions used. Section V describes the experiments. Section
VI concludes the paper.
II. COREWARS
Corewar is a very peculiar game where two or more
programs fight in a virtual-computer memory. Programs are
written in an assembly-like language called redcode and run
on a virtual machine named memory array redcode
simulator (MARS).
A program wins if it causes all processes of the opposing
programs to terminate, remaining in sole possession of the
machine. This is usually accomplished by overwriting the
opponents’ code and making them execute illegal
instructions. Since redcode allows a program to spawn multiple
threads, this is definitely not an easy task. Such programs are
commonly called warriors, stressing the aggressive nature of
the game. Over the years, researchers have developed impressive
warriors and subtle strategies, most labeled with evocative
names such as scanners, vampires, dwarves, and stoners.
The true origin of corewar dates back to Darwin, a game
devised by Vyssotsky, Morris and McIlroy in the early 1960s
at Bell Labs. However, the popularization of the game is due
to Dewdney's column in Scientific American [3] in 1984. In
the same year, Dewdney and Jones rigorously characterized
corewar and redcode in a document titled “Corewar
Guidelines”. The International Corewar Society (ICWS)
updated the redcode in 1986 and 1988, and proposed a new
update in 1994 that, although widely accepted, was never
formally set as the new standard.
Corewar contests are called hills. When a new program is
submitted to a hill, it plays G one-on-one games against each
of the N other programs currently on the hill. Each warrior
gets sw points for each win and st points for each tie (warriors
already present on the hill do not play one another again;
their old scores are recalled). Finally, all
programs are ranked from high to low and the lowest-ranked one is
pushed off the hill. Thus, as long as a program remains on a hill,
it can get to the top as the result of a new challenge.
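The challenge procedure described above can be sketched in a few lines of Python. This is only an illustration: the values chosen for G, sw and st, the function names, and the toy_play stand-in for a real MARS run are all assumptions, not settings of any actual hill. The sketch also assumes that the points earned against the evicted warrior still count in the ranking produced by the challenge.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical values: the text only names the symbols G, sw and st.
G = 10        # games played in each pairing (assumed)
S_WIN = 3     # sw, points per win (assumed)
S_TIE = 1     # st, points per tie (assumed)

Outcome = Tuple[int, int, int]          # (wins, ties, losses) of the first warrior
Points = Dict[Tuple[str, str], int]     # points[(a, b)]: points a earned against b


def pair_points(outcome: Outcome) -> Tuple[int, int]:
    """Convert one pairing into points for the first and the second warrior."""
    wins, ties, losses = outcome
    return S_WIN * wins + S_TIE * ties, S_WIN * losses + S_TIE * ties


def challenge(hill: List[str], points: Points, newcomer: str,
              play: Callable[[str, str, int], Outcome]) -> List[Tuple[str, int]]:
    """Let `newcomer` challenge `hill`; return the ranking of the survivors."""
    for resident in hill:
        # Only the newcomer fights; resident-vs-resident points are recalled.
        mine, theirs = pair_points(play(newcomer, resident, G))
        points[(newcomer, resident)] = mine
        points[(resident, newcomer)] = theirs
    contenders = hill + [newcomer]
    totals = {w: sum(points[(w, o)] for o in contenders if o != w)
              for w in contenders}
    ranking = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    survivors = ranking[:-1]            # the lowest-ranked warrior is pushed off
    hill[:] = [name for name, _ in survivors]
    return survivors


def toy_play(first: str, second: str, games: int) -> Outcome:
    """Toy stand-in for MARS: the warrior with the higher strength wins every game."""
    strength = {"a": 3, "b": 1, "c": 2}
    if strength[first] == strength[second]:
        return 0, games, 0
    return (games, 0, 0) if strength[first] > strength[second] else (0, 0, games)


hill = ["a", "b"]
points: Points = {("a", "b"): 30, ("b", "a"): 0}
print(challenge(hill, points, "c", toy_play))
```

In this toy run the newcomer "c" loses every game to "a" and wins every game against "b", so "a" gains points from the challenge without playing any new game against "b", and "b" is pushed off the hill.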
After two decades, the corewar community is still rather
active on the internet and organizes several hills.
Different hills accept different redcode styles (e.g.,
instruction set or program length) and run games with
different parameters (e.g., number of matches, maximum
number of concurrent warriors or scoring systems). The
dimension of the MARS memory (the core size) profoundly
influences all strategies, and is probably the key parameter.
The most common core size is c = 8,000, followed by
c = 8,192, c = 55,400, and c = 800 (all of them divisible
by 4).
DRAFT: THIS IS NOT THE FINAL COPY
THE PUBLISHED PAPER HAS BEEN EDITED AND REFORMATTED
DOI: 10.1109/CIG.2006.311712
The oldest and most famous server is simply named
KOTH [4] and still hosts seven hills with different settings.
However, the hardest hills are on a server called SAL [5], run
by the Department of Mathematical and Statistical Sciences
of the University of Alberta, Canada. Unlike on other
hills, the source code of warriors posted to SAL is not visible
to all users, and authors who are not willing to expose their
strategies send their latest warriors to this server only,
which helps make the challenge very hard.
A. Tiny Hills
All hills running a core of size c = 800 are called tiny, and
usually do not accept warriors containing more than 20
instructions. Tiny hills are commonly targeted by evolvers
and other automatic optimizers, since the program length
allows a certain flexibility while the search space is not
huge.
Interestingly, before the tiny hills were introduced,
corewar was investigated mainly by humans, writing
programs according to strategies set out in advance.
However, as happens with other games (e.g., go), changing
the space available to the players effectively turns one game
into a fairly different one. Strategies devised to play
effectively in a bigger core do not necessarily fare well in a
tighter environment, so an automated generation method has
a chance to produce some novelty in the field.
In theory, however, a hill with a reduced core is where
humans should achieve the best performance, since they can
take into account only a very small number of independent
elements while planning, but have a fairly deep
capacity for understanding. Nevertheless, White Noise
defeated all human-written warriors for more than one year,
and, later on, Larger Than Infinity, another version of the
same warrior further modified by Zul Nadzri using µGP,
topped the hill again.
B. Nano Hills
Nano hills are played by exceptionally short warriors
(composed of 5 or fewer instructions), in a reduced memory
space (80 locations). The restrictions on the number of child
processes and execution time are also tighter than on common
hills.
These characteristics make these hills even more attractive
for users of evolvers. First, the small size leads to a search
space that is smaller than that associated with other hills,
while still too large to make an exhaustive search practical;
this leads to the interesting situation where an automated
method to generate the warriors has a chance to perform a
significant sampling of the search space, but still needs to
use some heuristics to avoid getting lost.
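A rough count gives a feeling for the size of this search space. The sketch below is a back-of-the-envelope estimate under stated assumptions: the number of opcodes is hypothetical (it depends on the instruction set accepted by the hill), while the 7 modifiers and 8 addressing modes are those of the ICWS'94 draft. Many distinct encodings behave identically in practice, so the figure is only an upper bound.

```python
# All names and the opcode count are assumptions made for illustration.
OPCODES = 16         # assumed number of opcodes accepted by the hill
MODIFIERS = 7        # .A .B .AB .BA .F .X .I (ICWS'94 draft)
ADDRESS_MODES = 8    # # $ @ < > * { } (ICWS'94 draft)
CORE_SIZE = 80       # nano core: each field takes one of 80 values
MAX_LENGTH = 5       # nano warriors have at most 5 instructions


def instructions() -> int:
    """Number of distinct encodings of a single instruction."""
    return OPCODES * MODIFIERS * ADDRESS_MODES ** 2 * CORE_SIZE ** 2


def warriors(max_length: int = MAX_LENGTH) -> int:
    """Number of distinct warriors with 1 to `max_length` instructions."""
    return sum(instructions() ** n for n in range(1, max_length + 1))


print(f"{instructions():,} encodings per instruction")
print(f"about {warriors():.1e} warriors of up to {MAX_LENGTH} instructions")
```

Under these assumptions a single instruction has about 46 million encodings and the space of 5-instruction warriors is on the order of 10^38, far beyond exhaustive search but much smaller than the space of 20-instruction tiny warriors.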
Automated methods are not all the same, however. The
usual metric for a corewar warrior is the outcome of its
confrontations against other warriors: this not only depends
upon the exact composition of the hill, but is also a distinctly
nonlinear function of the warrior’s parameters. Simple
hill-climbing is not guaranteed to find good results.
Evolutionary methods, with their ability to perform both
exploration and exploitation during the search
process, can be suited to the task.
III. WARRIOR EVOLUTION
The µGP is an evolutionary approach to generate Turing-
complete assembly programs. Its main purpose is the
generation of test programs for microprocessors, although it
can be used to tackle a variety of problems [1].
Although continuously updated, the evolutionary core was not
suited to searching for a good warrior in such an environment, and
has been enhanced with two new operators: safe crossover
and scan mutation.
Recombination is certainly an essential operation in an
evolutionary methodology; however, its implementation in
µGP relies on the concept of graph core to avoid disrupting
the structure of the individuals. The small size of programs
leads most of the time to cores that are as big as the entire
individual. This makes crossover degenerate either into a swap of
the two individuals, which is useless, or into a concatenation,
which often produces individuals that exceed the 5-instruction
limit for the nano hill. The purpose of the safe
crossover is to cut through the cores of the
individuals and correctly join the resulting sections.
Warriors for the nano hill are very small programs, whose
functioning depends strictly upon the exact values of all their
constants. It makes sense, then, to be able to fine-tune any
one of them in the search for an optimum. While a local
mutation exists, the strong nonlinearity of the fitness
function makes a long-range search more effective. The scan
mutation answers exactly that need, making it possible to find the
(local) best value for a given parameter, even when the
fitness function is very rough.
A. Safe Crossover
To fully explain the concept of safe crossover, the
plain crossover has to be detailed first. Inside µGP every
individual is represented by a loosely-connected graph. A
recombination operator that simply swaps parts of two
graphs is likely to disrupt the structure of the individuals. To
avoid this, the crossover in µGP has been implemented
resorting to the concept of graph core. Loosely speaking, a
graph core is a self-contained subgraph that has only one
incoming and one outgoing edge. This well-defined
connectivity allows the free interchange of cores without
disrupting the entire graph structure.
In some cases the need to find two compatible cores
inside the individuals is too strong a constraint for the
successful application of the recombination operator. The
purpose of the safe crossover is to relax this
constraint by cutting (safely) through the graph cores during
recombination. To this end the graph nodes are numbered,
and every edge in the graph is transformed into a numeric
offset from the node. After two sequences of nodes are
swapped between the graphs, the edges are restored; any
reference falling outside the node numbering is resolved as a
reference to the first or to the last node in the sequence. This
effectively allows swapping sections from two different
graphs without the need to find compatible subgraphs.
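The mechanism can be illustrated with a minimal Python sketch. It is not the µGP implementation: the Node class, the function names and the representation of an individual as a flat list of nodes with at most one outgoing reference each are assumptions made for the example. The sketch also assumes that an out-of-range reference is clamped to the first or last node of the whole offspring.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    label: str               # e.g. an instruction mnemonic
    target: Optional[int]    # absolute index of the referenced node, if any


Relative = Tuple[str, Optional[int]]   # (label, offset from the node itself)


def to_offsets(nodes: List[Node]) -> List[Relative]:
    """Turn every edge into a numeric offset from its source node."""
    return [(n.label, None if n.target is None else n.target - i)
            for i, n in enumerate(nodes)]


def from_offsets(items: List[Relative]) -> List[Node]:
    """Restore the edges, clamping dangling references to the first or last node."""
    last = len(items) - 1
    restored = []
    for i, (label, offset) in enumerate(items):
        target = None if offset is None else min(max(i + offset, 0), last)
        restored.append(Node(label, target))
    return restored


def safe_crossover(a: List[Node], b: List[Node],
                   cut_a: Tuple[int, int],
                   cut_b: Tuple[int, int]) -> Tuple[List[Node], List[Node]]:
    """Swap the slice a[cut_a[0]:cut_a[1]] with the slice b[cut_b[0]:cut_b[1]]."""
    rel_a, rel_b = to_offsets(a), to_offsets(b)
    (a0, a1), (b0, b1) = cut_a, cut_b
    child_a = rel_a[:a0] + rel_b[b0:b1] + rel_a[a1:]
    child_b = rel_b[:b0] + rel_a[a0:a1] + rel_b[b1:]
    return from_offsets(child_a), from_offsets(child_b)


parent_a = [Node("spl", None), Node("mov", None), Node("djn", 0)]
parent_b = [Node("jmp", 3), Node("add", None), Node("sub", None), Node("dat", None)]
child_a, child_b = safe_crossover(parent_a, parent_b, (2, 3), (0, 1))
print(child_a)
print(child_b)
```

In the example the "jmp" node keeps its offset of +3 when it is moved to the end of the first parent; since that would point past the last node, the reference is resolved to the last node. Symmetrically, the "djn" node, moved to the head of the second parent with an offset of -2, ends up referring to the first node.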
B. Scan Mutation
It must be noted that a single mutation in a 5-line warrior is
likely to change 20% of the code, producing a dramatic
effect. Thus, the mutation strength and even the small
mutation recently introduced [6] are likely to be ineffective.
Conversely, it is sometimes useful, especially near the end
of the evolutionary process, to be able to fine-tune the values
of some parameters in an individual. Although this may be
achieved using the normal forms of mutation (random
mutation and local mutation) a more efficient search
operator allows a faster convergence towards an optimum of
the fitness function.
The operation of the scan mutation is as follows: inside an
individual, a node is selected; one of the parameters for that
node (if any exist) is targeted for scan; a new individual is
generated for every possible value of the parameter. Since
the evolutionary core is equipped with a clone detection and
extermination mechanism, the generation of a new
individual exactly equal to the old one does not affect the
evolutionary process.
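A minimal sketch of this operator in Python is given below. The representation is an assumption made for the example: a warrior is a tuple of (opcode, A-field, B-field) instructions, the scanned parameter is one of the two numeric fields, and its possible values are the 80 addresses of the nano core, normalized to the range 0..79. The remove_clones helper only mimics the effect of clone extermination; it is not the µGP mechanism.

```python
import random
from typing import List, Tuple

Instruction = Tuple[str, int, int]    # (opcode with modes, A-field, B-field)
Warrior = Tuple[Instruction, ...]

CORE_SIZE = 80                        # nano core: field values are taken modulo 80


def scan_mutation(parent: Warrior, rng: random.Random) -> List[Warrior]:
    """Generate one offspring for every possible value of one numeric field."""
    node = rng.randrange(len(parent))     # select a node inside the individual
    field = rng.choice((1, 2))            # select the A-field or the B-field
    offspring = []
    for value in range(CORE_SIZE):
        instruction = list(parent[node])
        instruction[field] = value
        offspring.append(parent[:node] + (tuple(instruction),) + parent[node + 1:])
    return offspring


def remove_clones(population: List[Warrior]) -> List[Warrior]:
    """Drop duplicated individuals, keeping the first copy of each."""
    return list(dict.fromkeys(population))


parent: Warrior = (("spl.a", 44, 18), ("mov.i", 66, 0), ("djn.f", 78, 77))
children = scan_mutation(parent, random.Random(0))
population = remove_clones([parent] + children)
print(len(children), len(population))
```

One of the 80 offspring is necessarily identical to the parent, so after clone removal the population grows by 79 individuals.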
Scan mutation is especially useful when the fitness
function is strongly nonlinear or exhibits a large number of
optima. In these cases a long range search may increase the
performance of the evolutionary approach.
IV. FITNESS FUNCTION
The fitness function plays a fundamental role in every
evolutionary program. Fitness must be able to lead the
evolution toward the desired goal, or at least away from the
less promising region of the search space.
However, due to the peculiar rules of the hills, defining
such a fitness function is not easy. Once a certain program
has entered the hill, its author can help it by submitting new
warriors designed to lose to the first one and to fight
reasonably well against all the others. Such a warrior may be
instantly pushed off the hill, but as a result of its
challenge the first program improves its position.
This is a fairly standard practice among expert redcoders
and it is considered perfectly acceptable. It should also be
remembered that the source of warriors on SAL is not
available, and a great amount of expertise is required to
exploit such team work between programs.
The problem of devising a fitness function is also
made harder by the fact that there are no good repositories
of strong warriors for the nano hills. This lack also negatively
affects the assimilation technique.
Three different fitness functions have been implemented
for the purpose of the experiments, all based on the warriors
downloaded from the koenigstuhl infinite nano hill [7].
A. Fitness A
The first fitness function simply measured the points earned
by the warrior against all programs in the test hill.
This function can be highly ineffective because, unlike
on the tiny hill, the warriors taken from the koenigstuhl infinite
nano hill were not competitive, and evolution may be
biased.
B. Fitness B
Test warriors were ranked and partitioned into 5 different
sets according to their relative strength. The points earned by
the warrior against programs in different sets were
considered separately, and the 5 contributions were used as
terms of strictly decreasing importance for the fitness.
The idea behind this approach is to favor warriors able to
compete well with strong warriors. However, the ranking is
able to measure only the relative strength, and since these
warriors are not a significant sample of the SAL tiny hill it
could be useless.
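One way to read "terms of strictly decreasing importance" is as a lexicographic comparison, which Python tuples provide directly. The sketch below follows that reading; the function names, the equal-sized tiers and the choice of putting the strongest tier first are assumptions, not details taken from the experiments.

```python
from typing import Dict, List, Sequence, Tuple


def partition(ranked: Sequence[str], groups: int = 5) -> List[List[str]]:
    """Split a strongest-first ranking into `groups` contiguous tiers."""
    size, extra = divmod(len(ranked), groups)
    tiers, start = [], 0
    for g in range(groups):
        end = start + size + (1 if g < extra else 0)
        tiers.append(list(ranked[start:end]))
        start = end
    return tiers


def fitness_b(points: Dict[str, float], ranked: Sequence[str],
              groups: int = 5) -> Tuple[float, ...]:
    """One term per tier, strongest tier first; tuples compare lexicographically."""
    return tuple(sum(points[name] for name in tier)
                 for tier in partition(ranked, groups))


ranked = [f"w{i}" for i in range(10)]        # w0 is the strongest test warrior
beats_strong = {name: 10.0 for name in ranked}
beats_strong["w0"] = 20.0
beats_weak = {name: 30.0 for name in ranked}
beats_weak["w0"], beats_weak["w1"] = 10.0, 9.0

print(fitness_b(beats_strong, ranked))       # (30.0, 20.0, 20.0, 20.0, 20.0)
print(fitness_b(beats_weak, ranked))         # (19.0, 60.0, 60.0, 60.0, 60.0)
```

With this ordering the first candidate is preferred even though it earns far fewer points overall, because it does better against the strongest tier.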
C. Fitness C
Test warriors were ranked, and the points earned by the
warrior against all programs were weighted considering the
relative strength of the opponent.
The idea behind this approach is analogous to that of the previous
fitness, as are its drawbacks. However, the distinction between
strong and weak opponents is not fixed, and an erroneous
classification could be less deleterious.
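A weighted sum of this kind can be sketched as follows. The linear weighting used here (1 for the strongest test warrior, decreasing to 1/n for the weakest) is purely an assumption for illustration, as are the names; the actual weights are not specified in the text.

```python
from typing import Dict, Sequence


def fitness_c(points: Dict[str, float], ranked: Sequence[str]) -> float:
    """Sum the points earned against each opponent, weighted by its rank."""
    n = len(ranked)
    # Assumed linear weights: position 0 (strongest) -> 1, last position -> 1/n.
    return sum(points[name] * (n - position) / n
               for position, name in enumerate(ranked))


ranked = ["strong", "average", "weak"]
print(fitness_c({"strong": 30, "average": 30, "weak": 30}, ranked))
print(fitness_c({"strong": 10, "average": 0, "weak": 0}, ranked))
print(fitness_c({"strong": 0, "average": 0, "weak": 10}, ranked))
```

Because every opponent contributes to a single number, swapping two neighbours in the ranking changes the fitness only slightly, whereas in a tiered scheme a misplaced warrior can move a whole term to a different level of importance.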
V. EXPERIMENTAL RESULTS
Three different experiments were run, using the different
fitness functions. All experiments used a population of 300
individuals, generating an offspring of 200 individuals at
each generation. The delta-entropy fitness hole was set to 1.0
(i.e., 100%) to promote diversity. Evolution continued until
the µGP detected a steady state, and lasted approximately
one day each on an AMD-K7 with 1,024GB of RAM, running
Linux.
A. Bob
Exploiting the two new operators and the first fitness, the
evolution of warriors generated by the µGP follows a
distinctive trend. In the early generations the warriors are
composed basically of SPL instructions. Such programs
replicate themselves into the core (SPL stands for split, and
is the instruction for spawning a new process), with no
aggressive strategy. Then, some DJN (decrement and jump
if not zero) instructions appear. Finally, the population is
invaded by warriors composed of SPL, MOV (move) and
DJN, performing a core clear, i.e., systematically writing
illegal instructions on the core. Remarkably, White Noise also
contained a core-clear routine.
;redcode-nano
;name Bob v2.1r1.7408
;author The MicroGP Corewars Collective
org START
START:
mov.i <-30, $-9
spl.a #-36, >18
mov.i >-14, {0
mov.i >-29, {-2
djn.f $-2, $-3
Figure 1: Bob v2.1r1.7408
Warriors evolved using this fitness were all called Bob.
The first one (Figure 1) entered the hill in 6th position,
and later managed to reach 4th with 155.3 points (Table
1).
#   NAME                          SCORE
1   Polarization 05               162.1
2   Resolute                      159.2
3   Master of the Core            158.3
4   Bob v2.1r1.7408               155.3
5   Polarization 04               155.1
6   Bob v2.1r2.6680               153.6
7   Man&Machine                   152.8
8   qEvo[[3]]                     151.5
9   Walking boots                 151.4
10  Shutting Down Evolver Now     151.2
11  rdrc: Alcoholism Malt         151.1
12  rdrc: Repent Linemen          150.9
13  Petro "I'm Old" Warrior [II]  150.6
14  around the core in 80 cycle   150.5
15  the last of the dragons       150.4
16  Go on!                        149.7
17  toy soldier                   149.6
18  rdrc: Laundry OSHA            149.1
19  Petro "I'm Old" Warrior [I]   148.8
20  Ucekupatox                    147.7
21  SuperSentryIV                 147.1
22  My nano Qscan III             146.7
23  rdrc: Delicate Crowbait       146.5
24  NanoWarp                      146.2
25  Cellulose Fiber               146.1
26  rdrc: Borneo Birdie           145.5
27  rdrc: Sportsmen Momentary     144.4
28  Bombus Sylvestris             144.4
29  quick hack                    144.1
30  2247-5163-xt430-9-nano-ev     144.0
Table 1: First 30 positions of the SAL Nano Hill after Bob v2.1r2.6680 challenge
Interestingly, submitting a newer Bob (Bob v2.1r2.6680)
produced the team work mentioned above, pushing the first
Bob to the 4th position.
B. Onions
Far more interestingly (although less productively), using the
second fitness and the assimilation process, the µGP
cultivated a series of warriors named Onions. Figure 2
shows the one called Crazy Onion I.
;redcode-nano
;name Crazy Onion I
;author The MicroGP Corewars Collective
org START
START:
spl.f #23, >57
mov.i >-1, {42
mov.i >23, {72
mov.i {40, {-3
mov.i {25, {50
end
Figure 2: Crazy Onion I
Despite its mediocre rank (18th out of 50 with 148.9
points), it is quite interesting (Table 2).
#   NAME                          SCORE
1   Polarization 05               161.9
2   Resolute                      159.0
3   Master of the Core            157.5
4   Polarization 04               155.7
5   Bob v2.1r1.7408               155.3
6   Bob v2.1r2.6680               153.6
7   Man&Machine                   152.9
8   Shutting Down Evolver Now     152.0
9   rdrc: Repent Linemen          151.4
10  rdrc: Alcoholism Malt         151.3
11  Petro "I'm Old" Warrior [II]  151.1
12  qEvo[[3]]                     150.5
13  walking boots                 150.1
14  rdrc: Laundry OSHA            150.0
15  Go on!                        150.0
16  the last of the dragons       149.4
17  toy soldier                   148.9
18  Crazy Onion I                 148.9
19  Petro "I'm Old" Warrior [I]   148.8
20  around the core in 80 cycle   147.6
21  rdrc: Delicate Crowbait       146.9
22  My nano Qscan III             146.8
23  Ucekupatox                    146.4
24  rdrc: Borneo Birdie           146.1
25  rdrc: Breakaway Carte         145.6
26  rdrc: Blanch Autoclave        144.8
27  Bombus Polaris                144.8
28  SuperSentryIV                 144.8
29  rdrc: Sportsmen Momentary     144.7
30  Bombus Sylvestris             144.4
Table 2: First 30 positions of the SAL Nano Hill after Crazy Onion I challenge
Crazy Onion I contains 4 MOV instructions. It tries to
cover the core with bombs at the maximum available speed.
Since the nano hills allow only 5 child threads, the SPL
instruction is critical, and if it is hit the warrior is defeated.
According to Zul Nadzri, Crazy Onion I is almost
identical to his Polarization 05, the King Of The Hill (KOTH)
of the nano hill. However, no warrior of the Polarization series was
assimilated by the µGP, since their source code is kept secret
by the author.
Crazy Onion I was thought to be able to survive for a long time on
the hill.
C. Small Animals
The most interesting result was produced by the µGP
running with the third fitness and not exploiting assimilation.
Warriors cultivated in this experiment were named after
small animals. The first one (Paedocypris horridus) is
shown in Figure 3.
;redcode-nano
;name Paedocypris horridus
;author The MicroGP Corewars Collective
org START
START:
spl.x #-5, >41
mov.i #37, <2
mov.i {-1, {-2
mov.i >-20, {23
djn.f $-3, <31
end
Figure 3: Paedocypris horridus
Before submitting it, all other µGP generated warriors
were removed to avoid the team work effect. Paedocypris
horridus scored 155.9, ranking 2nd on the hill, just after
Polarization 05 (Table 3).
#   NAME                          SCORE
1   Polarization 05               160.1
2   Paedocypris horridus          155.9
3   Resolute                      155.0
4   Polarization 04               153.9
5   Master of the Core            152.5
6   rdrc: Repent Linemen          151.8
7   Shutting Down Evolver Now     151.7
8   Man&Machine                   151.6
9   the last of the dragons       151.6
10  Petro "I'm Old" Warrior [II]  151.5
11  rdrc: Laundry OSHA            151.3
12  Petro "I'm Old" Warrior [I]   150.3
13  rdrc: Alcoholism Malt         150.0
14  qEvo[[3]]                     148.8
15  Go on!                        148.2
16  toy soldier                   148.2
17  around the core in 80 cycle   147.9
18  My nano Qscan III             147.6
19  rdrc: Delicate Crowbait       147.0
20  rdrc: Blanch Autoclave        145.9
21  Ucekupatox                    145.5
22  Walking boots                 145.3
23  rdrc: Sportsmen Momentary     145.3
24  Merdeka 06                    144.8
25  rdrc: Borneo Birdie           144.8
26  rdrc: Breakaway Carte         144.4
27  2218-6722-xt430-22-nano-e     144.4
28  SuperSentryIV                 144.1
29  Bombus Sylvestris             143.8
30  QuickHack                     143.7
Table 3: First 30 positions of the SAL Nano Hill after Paedocypris horridus challenge
It is quite hard to understand why Paedocypris horridus
won (and kept on winning). According to corewar experts, it
lays a carpet of MOV instructions from 20 locations away, which
eventually combine, overwriting the DJN instruction, and
create a 23-line-long warrior (a SPL followed by 22
MOVs).
Some of the threads execute the newly created code,
resulting in more effective bombing and making the warrior more
difficult to kill.
VI. CONCLUSIONS
The new evolutionary heuristics detailed above have shown
their effectiveness in the field of corewar program
cultivation. The generated warriors compare favorably with
others, either manually written or evolved, on the nano hill.
Still, none of them is KOTH.
This primarily gives credit to the specific expertise of the
people participating in this kind of competition: many
evolved warriors, in fact, are manually tweaked before they
are submitted. This kind of fine tuning requires three
typically human elements that currently cannot be
incorporated in a generic evolutionary method: a deep
understanding of the specific problem; an analysis ability; a
predictive aptitude, to effectively direct further
experimentation. Not to mention the team work between
different warriors of the same author.
However, µGP autonomously reproduced the
structure of the current champion (Crazy Onion I), and
devised a sharp self-modifying warrior exploiting a
completely new strategy (Paedocypris horridus): two
results that could be considered as requiring intelligence.
New experiments are currently being run, in collaboration with
corewar experts.
ACKNOWLEDGMENT
The authors wish to thank Zul Nadzri for his precious
support.
REFERENCES
[1] G. Squillero, “MicroGP — An Evolutionary Assembly Program
Generator”, Journal of Genetic Programming and Evolvable
Machines, Vol. 6, No. 3, 2005, pp. 247-263
[2] F. Corno, G. Squillero, E. Sánchez, “Evolving Assembly Programs:
How Games Help Microprocessor Validation”, IEEE Transactions On
Evolutionary Computation, Vol. 9, 2005, pp. 695-706
[3] A. K. Dewdney, “Computer recreations: In the game called Core War
hostile programs engage in a battle of bits”, Scientific American,
250(5), 1984, pp. 14-22
[4] http://www.koth.org/
[5] http://sal.math.ualberta.ca/
[6] E. Sanchez, M. Schillaci, M. Sonza Reorda, G. Squillero, L. Sterpone,
M. Violante, “New Evolutionary Techniques for Test-Program
Generation for Complex Microprocessor Cores”, Genetic and
Evolutionary Computation Conference, 2005, pp. 2193-2194
[7] http://www.ociw.edu/~birk/COREWAR/koenigstuhl.html
[8] http://students.fhs-hagenberg.ac.at/se/se00001/yace.html
[9] http://users.erols.com/dbhillis/