
Self Modifying Cartesian Genetic Programming: Finding algorithms that calculate pi and e to arbitrary precision

Authors:
  • Machine Intelligence Ltd.

Self Modifying Cartesian Genetic Programming:
Finding Algorithms that Calculate pi and e to Arbitrary
Precision
Simon Harding
Department Of Computer
Science
Memorial University
Newfoundland, Canada
simonh@cs.mun.ca
Julian F. Miller
Department of Electronics
University of York
York, UK
jfm7@ohm.york.ac.uk
Wolfgang Banzhaf
Department Of Computer
Science
Memorial University
Newfoundland, Canada
banzhaf@cs.mun.ca
ABSTRACT
Self Modifying Cartesian Genetic Programming (SMCGP)
aims to be a general purpose form of developmental genetic
programming. The evolved programs are iterated thus
allowing an infinite sequence of phenotypes (programs) to
be obtained from a single evolved genotype. In previous
work this approach has already shown that it is possible to
obtain mathematically provable general solutions to certain
problems. We extend this class in this paper by showing
how SMCGP can be used to find algorithms that converge
to mathematical constants (pi and e). Mathematical proofs
are given that show that some evolved formulae converge to
pi and e in the limit as the number of iterations increases.
Categories and Subject Descriptors
I.2.2 [ARTIFICIAL INTELLIGENCE]: Automatic Programming;
D.1.2 [Software]: Automatic Programming
General Terms
Algorithms
Keywords
Genetic programming, developmental systems
1. INTRODUCTION
Self Modifying Cartesian Genetic Programming (SMCGP)
is a form of developmental genetic programming,
based on Cartesian Genetic Programming [9]. The concept
is that CGP programs, which are directed graphs, contain
not only functions for computation, but also functions that
can change the program during run time [3].
SMCGP has previously been applied to a number of
different tasks including finding scalable, general solutions
to digital circuits [6], finding sequences and mathematical
results [5] and evolving learning algorithms [4].
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
GECCO’10, July 7–11, 2010, Portland, Oregon, USA.
Copyright 2010 ACM 978-1-4503-0072-8/10/07 ...$10.00.
Here, we demonstrate the use of SMCGP to find programs
that approximate the fundamental constants π (3.1415...)
and e (2.7182...). Two different approaches are used:
one is to find a program that acts as a mathematical
approximation, the other is to find a program that outputs
the digits as a sequence.
We provide proofs for two of the evolved formulae (one
for π and one for e) that they rapidly converge to the
constants in the limit of large iterations. We consider this
work to be significant as evolving provable mathematical
results is a rarity in evolutionary computation. Streeter
and Becker used tree-based GP to discover mathematical
approximations to well known mathematical series, such as
the Harmonic series, and also new Padé approximants to
mathematical functions [13]. However, they do not find
exact analytical results that can be shown to converge
in the limit (footnote 1). Schmidt and Lipson used GP to discover
the known Hamiltonians and Lagrangians of mechanical
systems purely by using GP symbolic regression techniques
on data acquired through motion tracking [10]. Schmidt and
Lipson also investigated using GP for solving iterated function
problems (i.e. f(f(x)) = g(x)) and they show that one
evolved function provably makes f(f(x)) converge to $x^2 - 2$
in the limit [11]. Spector et al. showed that GP could be
used to evolve hitherto unknown algebraic expressions that
are important in the mathematics of finite algebras and
are orders of magnitude shorter than those that could be
produced by prior mathematical methods [12].
The plan of the paper is as follows. In section 2 we discuss
the SMCGP technique and the recent improvements made to
the previously published method. We discuss and compare
results for various ways of applying inputs (terminals) to the
SMCGP programs. The SMCGP function set consists of
both computational functions and self-modifying functions.
These are discussed in section 3. We discuss two distinct
methods and results for evolving algorithms that could
approximate π in sections 4 and 5. For one method we
show that SMCGP can rapidly find potentially novel and
fast converging mathematical approximations to π. Our
second approach, where the digits were evolved as a sequence,
is not as effective, but is still a plausible methodology for
approximating π. In section 6 we discuss experiments and
results for the case of approximations to e, and find an
algorithmic approximation that is similar to the well known
Bernoulli equation. We close with conclusions and future
work.
Footnote 1: Indeed, they note on page 281 that finding such results
would be a “striking and exciting application of genetic
programming”.
2. SMCGP
As in CGP, in SMCGP each node in the directed graph
represents a particular function and is encoded by a number
of genes. The first gene encodes the function of the node.
This is followed by a number of connection genes (as in CGP)
that indicate the location in the graph where the node takes
its inputs from. However, SMCGP also has three real-valued
genes which encode parameters that may be required for
the function (primarily self modification (SM) functions use
these, and in many cases they are truncated to integers when
necessary; see later). Section 3 details the available functions
and any associated parameters. In this paper all nodes take
two inputs, hence each node is specified by seven genes.
As in CGP, nodes take their inputs in a feed-forward
manner from either the output of a previous node or from a
program input (terminal). We use relative addressing in
which connection genes specify how many nodes back in
the graph they are connected to. Programs acquire their
inputs through special node functions that we
call INP. Outputs are generated using OUTPUT functions.
The evaluation of a genotype is as follows: The initial
phenotype is a copy of the genotype. This graph is then
executed, and if there are any modifications to be made,
they alter the phenotype graph.
The genotype is invariant during the entire evaluation of
the individual. When executed, the phenotype is initially
a copy of the genotype, and all modifications are made to
the phenotype. In subsequent iterations, the phenotype will
usually gradually diverge from the genotype. The encoded
graph is executed in the same manner as standard CGP, but
with changes to allow for self-modification. The graph is
executed by recursion, starting from the output nodes down
through the functions, to the input nodes. In this way, nodes
that are unconnected are not processed and do not affect the
behavior of the graph at that stage. For function nodes (e.g.
+,-,/,*) the output value is the result of the mathematical
operation on input values.
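The recursive, output-first evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Node` class, the `FUNCTIONS` table, the rule that out-of-range connections read as 0, and the use of the last node as the output are all assumptions made for the example. Connection genes are taken to be at least 1 so that the graph stays feed-forward.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    function: str                                         # function gene
    connections: Tuple[int, int]                          # relative addresses (nodes back)
    params: Tuple[float, float, float] = (0.0, 0.0, 0.0)  # P0, P1, P2


# Illustrative subset of the computational functions.
FUNCTIONS: Dict[str, Callable[[float, float], float]] = {
    "DADD": lambda a, b: a + b,
    "DSUB": lambda a, b: a - b,
    "DMULT": lambda a, b: a * b,
    "AVG": lambda a, b: (a + b) / 2.0,
}


def evaluate(graph: List[Node], index: int, inputs: List[float]) -> float:
    """Evaluate the node at `index` by recursion; unconnected nodes are never visited."""
    if index < 0:
        return 0.0  # assumption: a connection that falls off the graph reads as 0
    node = graph[index]
    if node.function == "INP":
        return inputs[0]  # assumption: a single program input
    if node.function == "CONST":
        return node.params[0]
    a = evaluate(graph, index - node.connections[0], inputs)
    b = evaluate(graph, index - node.connections[1], inputs)
    return FUNCTIONS[node.function](a, b)


graph = [
    Node("INP", (1, 1)),
    Node("CONST", (1, 1), (2.0, 0.0, 0.0)),
    Node("DMULT", (1, 2)),  # 2 * input
    Node("DADD", (1, 3)),   # 2 * input + input
]
result = evaluate(graph, len(graph) - 1, [5.0])
print(result)
```

Because evaluation starts at the output and follows connections backwards, a node that nothing points to costs nothing and cannot influence the result.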
Each active (i.e. expressed) graph manipulation function
(starting on the leftmost node of the graph) is added to a “To
Do” list of pending modifications. After each iteration, the
“To Do” list is parsed, and all manipulations are performed
(provided they do not exceed the number of operations
specified in the user defined “To Do” list length). The
parsing is done in order of the instructions being appended
to the list, i.e. first in is first to be executed. The length
of the list can be limited as manipulations are relatively
computationally expensive to perform. Here we limit
the length to just 2 instructions, which simplifies human
analysis. All graph manipulation functions use extra genes
as parameters. This is described in section 3. Complete
details of SMCGP can be found in [4].
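A small sketch of the “To Do” list handling may help here. It assumes that a pending modification can be modelled as a function from graph to graph, and that modifications beyond the length limit are simply dropped; both are assumptions for illustration, and the names `apply_todo_list` and `pending` are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List

Graph = List[str]
Modification = Callable[[Graph], Graph]


def apply_todo_list(phenotype: Graph,
                    pending: Deque[Modification],
                    max_length: int = 2) -> Graph:
    """Apply at most `max_length` pending modifications, first in, first out."""
    for _ in range(min(max_length, len(pending))):
        modify = pending.popleft()
        phenotype = modify(phenotype)
    pending.clear()  # assumption: anything beyond the limit is discarded
    return phenotype


pending: Deque[Modification] = deque([
    lambda g: g + ["DADD"],  # appended first, so executed first
    lambda g: g[1:],         # executed second
    lambda g: [],            # third entry exceeds the limit of 2 and is never run
])
new_phenotype = apply_todo_list(["COS", "AVG"], pending, max_length=2)
print(new_phenotype)
```

With the limit set to 2, as in the paper, at most two self-modifications are applied per iteration, which keeps the developmental history short enough to analyse by hand.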
We use a (1+4) evolutionary strategy for the experiments
in this paper, as in CGP. We have used a relatively high
(for CGP) mutation rate of 0.1. In these experiments,
for simplicity, we chose to make all the rates the same.
Mutations for the function type and relative addresses
themselves are unbiased; a gene can be mutated to any other
valid value.
For the real-valued genes, the mutation operator can
choose to randomize the value (with probability 0.1) or
add noise (normally distributed, σ = 20). The evolutionary
parameters we have used have not been optimized in any
way, so we expect that better values could be found through
empirical investigation.
Evolution is limited to 10,000,000 evaluations. Trials that
fail to find a solution in this time are considered to have
failed.
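The search loop described in this section can be sketched on a toy problem. The sketch below handles real-valued genes only and is not the authors' code: the randomization range, the acceptance of equal-fitness offspring, the noise standard deviation of 20 (an assumed reading of the text), and the stand-in fitness function are all assumptions.

```python
import random
from typing import Callable, List, Tuple

Genome = List[float]


def mutate_real(value: float, rng: random.Random) -> float:
    """Randomize the gene with probability 0.1, otherwise add Gaussian noise."""
    if rng.random() < 0.1:
        return rng.uniform(-100.0, 100.0)  # assumption: range of a randomized gene
    return value + rng.gauss(0.0, 20.0)    # assumption: sigma = 20


def mutate(genome: Genome, rng: random.Random, rate: float = 0.1) -> Genome:
    """Mutate each gene independently with the given rate."""
    return [mutate_real(g, rng) if rng.random() < rate else g for g in genome]


def one_plus_four(fitness: Callable[[Genome], float],
                  parent: Genome,
                  rng: random.Random,
                  max_evaluations: int = 10_000) -> Tuple[Genome, float]:
    """(1+4) strategy: four offspring per generation, lower fitness is better."""
    parent_fit = fitness(parent)
    evaluations = 1
    while evaluations + 4 <= max_evaluations:
        children = [mutate(parent, rng) for _ in range(4)]
        scored = [(fitness(child), child) for child in children]
        evaluations += 4
        child_fit, child = min(scored, key=lambda pair: pair[0])
        if child_fit <= parent_fit:  # assumption: ties replace the parent
            parent, parent_fit = child, child_fit
    return parent, parent_fit


def toy_fitness(genome: Genome) -> float:
    # Stand-in objective for illustration only: squared distance from the origin.
    return sum(g * g for g in genome)


rng = random.Random(0)
start = [50.0, -30.0, 10.0]
best, best_fit = one_plus_four(toy_fitness, start, rng, max_evaluations=2_000)
print(best_fit <= toy_fitness(start))
```

In the paper the genome also contains function and connection genes, which are mutated to any other valid value; those are omitted here to keep the sketch short.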
3. FUNCTION SET
The function set is defined in two parts. The computational
operations are defined in Table 1. The other part is
the set of modification operators. These are common to all
data types used in SMCGP.
The self-modifying genotype (and phenotype) nodes contain
three double precision numbers, called “parameters”.
In the following discussion we denote these P0, P1, P2. We
denote by x the integer position of the node in the phenotype
graph that contained the self modifying function (i.e. the
leftmost node is position 0). In the definitions of the
SM functions we often need to refer to the values taken by
node connection genes (which are all relative addresses). We
denote the jth connection gene on the node at position i by c_ij.
The modification functions (with their short-hand names)
are defined in Table 1.
4. EVOLVING APPROXIMATIONS TO π
There exist several iterative approaches to approximating π.
For example, the Gregory-Leibniz series:
\[ \pi = \frac{4}{1} - \frac{4}{3} + \frac{4}{5} - \frac{4}{7} + \frac{4}{9} - \dots \]
This series is simple, but requires a large number of
iterations to reach good accuracy. Another method uses
recursion to find an approximation:
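\[ \pi = n\left(\tan\left(\frac{\pi}{n}\right) - \frac{\tan^3\left(\frac{\pi}{n}\right)}{3} + \frac{\tan^5\left(\frac{\pi}{n}\right)}{5} - \frac{\tan^7\left(\frac{\pi}{n}\right)}{7} + \dots\right) \]
for n = 1, 2, ....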
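The slow convergence of the Gregory-Leibniz series mentioned above is easy to check numerically. The following sketch (the function name `leibniz_pi` is ours) prints the absolute error of the partial sums; the error shrinks roughly in proportion to 1/terms, so each extra correct digit costs about ten times more terms.

```python
import math


def leibniz_pi(terms: int) -> float:
    """Partial sum of 4/1 - 4/3 + 4/5 - 4/7 + ... using `terms` terms."""
    return sum(4.0 * (-1) ** k / (2 * k + 1) for k in range(terms))


for terms in (10, 1_000, 100_000):
    print(terms, abs(math.pi - leibniz_pi(terms)))
```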
Curiously, there has been little work on evolving approximators
of π, despite it being a well defined problem with many
human designed solutions to compare against.
In [7], Krohn et al. used an artificial developmental system
based on fractal proteins [1] to produce approximations to
π using two different approaches. The first approach was
to generate the digits as a binary sequence. The second,
and more successful, approach was to use the output of the
developmental system to provide values for the equation:
I
i=1 N
n=2 Bn,i
i
t=1 B1,i
where I is the number of developmental iterations, N is
the number of behavioural genes (an output of the evolved
program) and B_{n,i} is the output of the nth behavioural gene
at iteration i.
| Group | Function (short name) | Description |
|---|---|---|
| Basic | Delete (DEL) | Delete the nodes between (P0 + x) and (P0 + x + P1). |
| Basic | Add (ADD) | Add P1 new random nodes after (P0 + x). |
| Basic | Move (MOV) | Move the nodes between (P0 + x) and (P0 + x + P1) and insert after (P0 + x + P2). |
| Duplication | Overwrite (OVR) | Copy the nodes between (P0 + x) and (P0 + x + P1) to position (P0 + x + P2), replacing existing nodes in the target position. |
| Duplication | Duplication (DUP) | Copy the nodes between (P0 + x) and (P0 + x + P1) and insert after (P0 + x + P2). |
| Duplication | Duplicate Preserving Connections (DU3) | Copy the nodes between (P0 + x) and (P0 + x + P1) and insert after (P0 + x + P2). When copying, this function modifies the c_ij of the copied nodes so that they continue to point to the original nodes. |
| Duplication | Duplicate and scale addresses (DU4) | Starting from position (P0 + x) copy (P1) nodes and insert after the node at position (P0 + x + P1). During the copy, c_ij of copied nodes are multiplied by P2. |
| Duplication | Copy To Stop (COPYTOSTOP) | Copy from x to the next “COPYTOSTOP” or “STOP” function node, or the end of the graph. Nodes are inserted at the position the operator stops at. |
| Connection modification | Shift Connections (SHIFTCONNECTION) | Starting at node index (P0 + x), add P2 to the values of the c_ij of the next P1 nodes. |
| Connection modification | Shift Connections 2 (MULTCONNECTION) | Starting at node index (P0 + x), multiply the c_ij of the next P1 nodes by P2. |
| Connection modification | Change Connection (CHC) | Change the (P1 mod 3)th connection of node P0 to P2. |
| Function modification | Change Function (CHF) | Change the function of node P0 to the function associated with P1. |
| Function modification | Change Parameter (CHP) | Change the (P1 mod 3)th parameter of node P0 to P2. |
| Numeric functions | No operation (NOP) | Passes through the first input. |
| Numeric functions | Add, Subtract, Multiply, Divide (DADD, DSUB, DMULT, DDIV) | Performs the relevant mathematical operation on the two inputs. |
| Numeric functions | Const (CONST) | Returns a numeric constant as defined in parameter P0. |
| Numeric functions | √x, 1/x, Cos, Sin, TanH, Absolute (SQRT, DRCP, COS, SIN, TANH, DABS) | Performs the relevant operation on the first input (ignoring the second input). |
| Numeric functions | Average (AVG) | Returns the average of the two inputs. |
| Numeric functions | Node index (INDX) | Returns the index of the current node, 0 being the first node. |
| Numeric functions | Input count (INCOUNT) | Returns the number of program inputs. |
| Numeric functions | Min, Max (MIN, MAX) | Returns the minimum/maximum of the two inputs. |

Table 1: The function set.
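To make the addressing in Table 1 concrete, the sketch below applies DEL and DUP to a phenotype held as a Python list. It is an illustration only: Table 1 does not say whether "between (P0 + x) and (P0 + x + P1)" includes the end position, so a half-open range is assumed, out-of-range positions are clamped to the graph, and nodes are represented by plain strings. The helper names are hypothetical.

```python
from typing import List, Tuple

Graph = List[str]


def _span(x: int, p0: int, p1: int, size: int) -> Tuple[int, int]:
    """Clamp the half-open range [P0 + x, P0 + x + P1) to the graph (assumed convention)."""
    start = max(0, min(size, x + p0))
    end = max(start, min(size, start + p1))
    return start, end


def sm_delete(graph: Graph, x: int, p0: int, p1: int) -> Graph:
    """DEL: remove the nodes in the addressed range."""
    start, end = _span(x, p0, p1, len(graph))
    return graph[:start] + graph[end:]


def sm_duplicate(graph: Graph, x: int, p0: int, p1: int, p2: int) -> Graph:
    """DUP: copy the addressed nodes and insert them after position P0 + x + P2."""
    start, end = _span(x, p0, p1, len(graph))
    insert_at = max(0, min(len(graph), x + p0 + p2 + 1))
    return graph[:insert_at] + graph[start:end] + graph[insert_at:]


graph = ["A", "B", "C", "D", "E"]
after_delete = sm_delete(graph, x=1, p0=1, p1=2)             # removes "C" and "D"
after_duplicate = sm_duplicate(graph, x=0, p0=1, p1=2, p2=3)  # appends a copy of "B", "C"
print(after_delete)
print(after_duplicate)
```

Because every address is relative to x, the position of the SM node itself, the same genes keep working when the node is copied or moved elsewhere in the graph.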
4.1 Fitness Function
Our fitness function was configured to produce a program
in which each subsequent iteration produced a more
accurate approximation to π. Programs were allowed to
iterate for a maximum of 10 iterations. If the output after
an iteration did not better approximate π, evaluation was
stopped and a large fitness penalty applied. Note that it is
possible that after the 10 iterations the output value diverges
from π, and the quality of the result would therefore worsen.
The fitness score of an individual is defined as the
absolute error of the last output. In addition, the string
representation of the output was checked to ensure that all
digits matched correctly. Using doubles in .Net limits the
precision to 14 decimal places (i.e. 3.14159265358979).
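The fitness evaluation described above can be sketched as below. The evolved program is modelled as a plain function from the iteration number to an output (as in configuration A), the penalty value is arbitrary because the paper only says "a large fitness penalty", and stopping early once the error is exactly zero is an assumption; the string comparison of digits is omitted.

```python
import math
from typing import Callable

PENALTY = 1e9  # assumption: stands in for "a large fitness penalty"


def pi_fitness(program: Callable[[int], float], max_iterations: int = 10) -> float:
    """Absolute error of the last output; penalise any iteration that fails to improve."""
    previous_error = math.inf
    for i in range(max_iterations):
        error = abs(math.pi - program(i))
        if error == 0.0:
            return 0.0      # assumption: stop once double precision is reached
        if error >= previous_error:
            return PENALTY  # output did not move closer to pi
        previous_error = error
    return previous_error


def improving(i: int) -> float:
    return math.pi + 2.0 ** -(i + 1)  # error halves on every iteration


def diverging(i: int) -> float:
    return 3.0 + i                    # second output is worse than the first


print(pi_fitness(improving), pi_fitness(diverging))
```

Lower scores are better, so a program that keeps improving for all ten iterations is ranked by how close its final output is to π.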
Four variants of the fitness function were tested, each
with a different configuration of inputs given to the programs:
A One input: the current iteration. Functions can
be built using the current iteration counter as a
parameter.
B One input: numeric constant (1). The program has
no real input, and therefore has to build a structure
that performs the iterative process.
C Two inputs: the current iteration and last outputted
value. This form can, in some sense, be viewed as
recurrent, as programs can depend on the previous
output.
D Two inputs: numeric constant (1) and last outputted
value. This form can also be viewed as recurrent, as
programs can depend on the previous output.
Figure 1: Visualization of SMCGP program that produces an approximation to π. Each row is a different
time step.
4.2 Results
The statistical results for these experiments are shown in
Table 2. Each experiment was conducted approximately 150
times. The standard deviations are large and overlap, which
means that the algorithms appear to perform similarly.
| Config. | % Success | Avg. Evals | Min. Iterations | Avg. Iterations |
|---|---|---|---|---|
| A | 96.7 | 8,952,441 | 3 | 5.95 |
| B | 99.4 | 2,905,673 | 3 | 5.49 |
| C | 96.2 | 4,953,518 | 3 | 6.78 |
| D | 98.7 | 2,146,348 | 3 | 5.90 |

Table 2: Finding π using various inputs to the
evolved programs. Experiment types: (A) One
input, the current iteration; (B) One input, numeric
constant (1); (C) Two inputs, the current iteration
and last outputted value; (D) Two inputs, numeric
constant (1) and the last outputted value. Iterations
refers to the number of times the program has to be
iterated before it reaches π to 14 decimal places.
4.3 Example π Generator
Figure 1 shows the output of an evolved SMCGP program
that accurately converges to π. Table 3 shows the output of
the program at each time step. As the program is relatively
short, it was possible to extract the evolved generating
function:
\[ f(i) = \begin{cases} \cos(\sin(\cos(\sin(0)))) & i = 0 \\ f(i-1) + \sin(f(i-1)) & i > 0 \end{cases} \qquad (1) \]
Equation 1 is a nonlinear recurrence relation. However, it
can be shown that it converges rapidly to π. When i = 10,
the output matches the first 2048 digits of π.
To prove mathematically that Eqn 1 converges exactly to π
when iterated, we note that π is a
fixed point of Equation 1, since x = x + sin x is obeyed when
x = π. It is stable since the derivative of the iterated map
x + sin x, namely 1 + cos(x), is 0 when x = π.
How rapidly it converges to π can be seen from the following
argument. Suppose at some iteration m, f(m) is close to π.
Then we can write $f(m) = \pi - \delta$, where δ is a small quantity.
Since $\sin(\pi - \delta) = \sin\delta \approx \delta - \frac{\delta^3}{3!}$, Eqn. 1 gives
$f(m+1) = \pi - \delta + \sin(\pi - \delta) \approx \pi - \frac{\delta^3}{3!}$.
So it approaches π cubically.

| Iter. | Output error | Output | Correct digits |
|---|---|---|---|
| 0 | 3.142 | 0 | 0 |
| 1 | 2.475 | 0.666366745392881 | 0 |
| 2 | 0.898 | 2.24379742136643 | 0 |
| 3 | 0.116 | 3.02575192260092 | 0 |
| 4 | 2.589E-4 | 3.14133374812244 | 3 |
| 5 | 2.892E-12 | 3.1415926535869 | 11 |
| 6 | 0 | 3.14159265358979 | 14 |
| 7 | 0 | 3.14159265358979 | 14 |
| 8 | 0 | 3.14159265358979 | 14 |
| 9 | 0 | 3.14159265358979 | 14 |

Table 3: Output from program shown in figure 1.
Output error (to 3 decimal places) is the difference
between π (.Net’s Math.PI) and the output from the
program. Correct digits is the count of the correctly
matching digits after the decimal point. Errors will
appear to be 0 when the actual error is very small.
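The recurrence in Eqn. 1 can be checked directly in double precision. The sketch below (the function name `evolved_pi` is ours) iterates x ← x + sin(x) from the evolved starting value cos(sin(cos(sin(0)))) and prints each error next to the cubic prediction δ³/3! computed from the previous error. Ordinary floats can only confirm about 15 to 16 digits; reproducing the 2048-digit claim would need an arbitrary precision library.

```python
import math
from typing import List


def evolved_pi(iterations: int) -> List[float]:
    """Iterate f(i) = f(i-1) + sin(f(i-1)) starting from cos(sin(cos(sin(0))))."""
    x = math.cos(math.sin(math.cos(math.sin(0.0))))
    values = [x]
    for _ in range(iterations):
        x = x + math.sin(x)
        values.append(x)
    return values


values = evolved_pi(10)
for previous, current in zip(values, values[1:]):
    delta = math.pi - previous
    predicted = delta ** 3 / 6.0  # only meaningful once delta is small
    print(f"{current:.15f}  error={math.pi - current:.3e}  predicted={predicted:.3e}")
```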
5. EVOLVING THE DIGITS OF π AS A SEQUENCE
Evolving patterns has been a focus of much work in
developmental systems (e.g. French flags). Here we take the
view that π could also be viewed as a one dimensional
complex pattern (of repeating digits). Thus we formulate
the task here to be that of finding a program that on each
iteration will output the next digit of π. This works as
follows. The first time the program is executed
it outputs 3. Then the self modification is applied, the
program is executed again, and it should output 1. Then 4, 1, 5,
and so on.
The program inputs were set to the iteration, i, and the
previous output value. The function set used is the same
as that described above but used with an integer data type.
| Iteration | Description |
|---|---|
| 0 | CTS (Copy To Stop) returns the constant 3.29, but as the program is interpreted as integers, the value is truncated to 3. |
| 1 | INP returns the first input, which is the current iteration (1). The CTS node returns the DSQRT (square root) of 1. |
| 2 | CTS returns the truncated value of the constant, i.e. 4. |
| 3 | The CTS node returns the square root of the square root of the current iteration (3). As a truncated integer, this is 1. |
| 4 | Again, the CTS node returns a value (5) from a constant. |
| 5 | Here the output comes from adding two constants, 4 and 5, to return 9. |
| 6 | Here the CTS node connects to a MOV (Move) function which is returning the square root of 6 (as integer), i.e. 2. |
| 7 | The output (6) is the sum (DADD) of 4 and 2 (which is the integer square root of 7). |
| 8 | Here, the output comes from the average (AVG) of 4 and 6 (via the same calculations as iteration 7), to get 5. |
| 9 | Here the input value is 9, so the square root function now outputs 3. The left most CTS function returns 3 (via the MOV node connected to the top input). This is because the order of modification nodes has reached a limit on the ToDo list, and the operation has failed, changing which of the inputs is selected as the output. |

Table 4: Description of operations occurring in figure 2.
The fitness function stopped iterating the program when
either an incorrect digit or a number <0 or >9 was output.
Fitness is defined as the total number of correct digits output
before making a mistake. Evolution was allowed to continue
for 10,000,000 evaluations (or 100 digits of π).
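The digit-sequence fitness just described amounts to counting correct leading digits. A minimal sketch, assuming the program's successive outputs have already been collected into a list (the name `digit_fitness` and the 20-digit target are ours):

```python
from typing import Iterable

PI_DIGITS = "31415926535897932384"  # first 20 digits of pi, enough for this example


def digit_fitness(outputs: Iterable[int], target: str = PI_DIGITS) -> int:
    """Count correct digits before the first mistake or out-of-range output."""
    correct = 0
    for produced, expected in zip(outputs, target):
        if produced < 0 or produced > 9 or produced != int(expected):
            break
        correct += 1
    return correct


print(digit_fitness([3, 1, 4, 1, 5, 9, 2, 7]))
```

Here higher scores are better, in contrast to the error-based fitness used for the numerical approximations.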
5.1 π Results
The experiment was repeated 310 times (with 31 computers
running 10 experiments each). The longest sequence
found was 31 correct digits. The shortest was 5 correct
digits, and the average number of correct digits was 14.
Figure 2 shows the development stages of the first 10
steps of this program, and a description of each step can
be found in table 4. Each step outputs the next digit of π,
starting with 3 in the first step. The program consistently
uses the final node, a Copy To Stop (CTS) function, as its
output. This is because no OUTPUT nodes were used, so
the graph runner defaulted to using the last node in the
graph as output.
6. EVOLVING APPROXIMATIONS TO E
Aside from π, e is one of the most famous constants in
mathematics. There is a well known approximation to e
(often known as Bernoulli’s formula) [8]:
\[ \lim_{y \to \infty} \left(1 + \frac{1}{y}\right)^{y} \]
This limit approaches e rather slowly. However, faster
approximations have recently been found by Brothers and
Knox [2]. For instance:
lim
y→∞(y+ 1)
11y
6(y1)
5y
6(2y+ 1
2yy+1 )8
3
6.1 Results
Using the same fitness function as with π, evolving
solutions for e was found to be significantly harder; the
success rate is shown in table 6. However, high quality
approximations were found, as detailed in the following
section. It is suspected that the use of powers in the
calculation produces nonlinearity in the fitness scoring,
which makes evolution less able to find solutions.

| Iteration | Output error | Output | Digits |
|---|---|---|---|
| 0 | 0.470 | 2.248310938642 | 0 |
| 1 | 4.162E-4 | 2.71786562898987 | 2 |
| 2 | 2.723E-5 | 2.71825460295116 | 4 |
| 3 | 1.722E-6 | 2.71828010695015 | 5 |
| 4 | 1.079E-7 | 2.71828172054962 | 6 |
| 5 | 6.749E-9 | 2.71828182170977 | 8 |
| 6 | 4.219E-10 | 2.71828182803714 | 9 |
| 7 | 2.637E-11 | 2.71828182843268 | 10 |
| 8 | 1.648E-12 | 2.7182818284574 | 11 |
| 9 | 1.030E-13 | 2.71828182845894 | 11 |
| 10 | 6.2178E-15 | 2.71828182845904 | 13 |
| 11 | 4.441E-16 | 2.71828182845904 | 13 |
| 12 | 0 | 2.71828182845905 | 14 |

Table 5: Output from program shown in figure
3. Output error (to 3 d.p.) is the difference
between e (.Net’s Math.E) and the output from
the program. Digits is the count of the correctly
matching digits after the decimal point.

| Config. | Correct digits | Successes | Runs |
|---|---|---|---|
| A | 3.75 | 6 | 528 |
| B | 3.79 | 2 | 532 |
| C | 3.61 | 5 | 542 |
| D | 3.59 | 1 | 526 |

Table 6: Finding e using various inputs to the
evolved programs. The success rate was very low,
with the best performing approach being A with 6
successful solutions found in 528 runs. A successful
solution occurs when 14 digits are matched.
Figure 2: The first 10 developmental steps of a program that produces a πdigits sequence.
6.2 Example e Generator
Figure 3 shows the output of an evolved SMCGP program
that accurately converges to e. Output from the evolved
program is shown in table 5. The program was first tested
using an arbitrary precision maths utility to confirm the
program is correct. At 2048 significant figures of precision,
by iteration 100 the solution is correct to the first 121 digits,
with an error of 4.386165e-122.
At the top of the figure is seen a schematic of the
evolved genotype. The node functions are written above the
nodes. The initial genotype has 20 nodes. The next three
graphs show the phenotypes at the first, second and third
iterations. The ToDo list length was 2 so that only two SM
functions are obeyed in each genotype/phenotype. The SM
functions CHF and DU4 are used. The CHF function has
the effect of changing the second COS function (at position
1) into DADD and the DU4 function results in nodes 1 to
3 being copied after node 3. Thus on each iteration the
graph increases by three nodes: DADD, DADD and AVG
respectively.
Consider the first iteration. The nodes from 4 to 22 form
a block of nodes that merely shifts along at each iteration.
We calculate what function these nodes compute from an
input x (supplied by the previous node). We have labeled
the outputs of the nodes using x with the subscript being
the same as the node label. We obtain the following equations:
\begin{align*}
x_4 &= 2x \\
x_5 &= 2x_4 = 4x \\
x_6 &= x_5 = 4x \\
x_7 &= x_6 x_6 = 16x^2 \\
x_8 &= x_5 + x_7 = 4x + 16x^2 \\
x_9 &= x_8 = 4x + 16x^2 \\
x_{10} &= \sqrt{x_8} = \sqrt{4x + 16x^2} \\
x_{11} &= x_9 / x_7 = 1 + \frac{1}{4x} \\
x_{12} &= x_{11}^{x_{10}} = \left(1 + \frac{1}{4x}\right)^{\sqrt{4x + 16x^2}} \\
x_{17} &= x_{12} \\
x_{18} &= x_{17} = x_{12} \\
x_{19} &= x_{18} = \left(1 + \frac{1}{4x}\right)^{\sqrt{4x + 16x^2}}
\end{align*}
The value of x arises from the COS node at position 0 and
the sequence of DADD, DADD, AVG nodes. The arguments
of COS are zero so it outputs 1. The result of applying
DADD, DADD, AVG is to produce the value 4. So at the
first iteration $x = x_3 = 4$. Each iteration multiplies x by 4.
Thus we can write $x = 4^{it}$, where it is the iteration. Clearly
as the iterations increase x becomes very large. Defining
$y = 4x = 4^{it+1}$ we can write the equation for the output $x_{19}$
as
\[ x_{19} = \left(1 + \frac{1}{y}\right)^{y\sqrt{1 + \frac{1}{y}}} \qquad (2) \]
Since $\sqrt{1 + 1/y}$ tends to 1 as y grows, Eqn 2 tends to the
form of the well-known Bernoulli formula
(see above). Thus we have established mathematically that
the evolved formula converges to e in the limit of large
iterations.
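A quick numerical check of the evolved formula is sketched below. It reads Eqn 2 as x19 = (1 + 1/y)^(y·sqrt(1 + 1/y)) with y = 4^(it+1); under that reading the value for it = 1 (y = 16) agrees with the iteration 1 output in Table 5 (2.71786562898987). The function names are ours, and Bernoulli's formula is evaluated at the same y for comparison.

```python
import math


def bernoulli(y: float) -> float:
    """Bernoulli's formula (1 + 1/y)^y evaluated at a finite y."""
    return (1.0 + 1.0 / y) ** y


def evolved_e(iteration: int) -> float:
    """Eqn 2 as read above, with y = 4^(iteration + 1)."""
    y = 4.0 ** (iteration + 1)
    return (1.0 + 1.0 / y) ** (y * math.sqrt(1.0 + 1.0 / y))


for iteration in (1, 2, 3, 6):
    y = 4.0 ** (iteration + 1)
    print(iteration,
          f"bernoulli error={math.e - bernoulli(y):.3e}",
          f"evolved error={math.e - evolved_e(iteration):.3e}")
```

At the same y the evolved expression is far closer to e than the plain Bernoulli form, because the square root factor adds roughly 1/2 to the exponent and cancels the leading error term.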
7. CONCLUSIONS
SMCGP has been shown to successfully find good approximations
to both π and e. The evolved approximation to e is
a special case of a previously known solution. We were unable
to find a known solution similar to the found π approximation.
We would speculate that this approach should be
able to find novel approximations to both constants; however,
there is obvious difficulty in comparing a large number of
generated solutions to a large body of known solutions.
Figure 3: SMCGP genotype and three iterations of approximation to e
We also offer the problems of approximating π and e
as benchmarks for developmental methods. Clearly they are
difficult problems and they are also intrinsically interesting
(the problem of finding e appears to be especially difficult).
Generating the digits of a sequence such as π
shows that SMCGP can generate programs that produce
arbitrary sequences. We were mildly disappointed that we
were unable to find a program that could generate the digits
precisely; however, the known solutions to this do not operate
in base 10 (footnote 2). Moving to different numeric representations
may allow us to reproduce this type of result.
We also expect that SMCGP could be used to try to
discover hitherto unknown but provable algorithms that sum
mathematical series or converge to particular constants.
It was interesting to see that the self modifying aspects of
SMCGP were used in both the solutions examined. In the
approximation to e, the self modifying functions found a
module that corresponded to a complicated power function.
Conceivably such a function could have been evolved with
a standard implementation of CGP, but we suspect that
the self modification made it easier to discover iterative
solutions. SMCGP can choose to ignore the SM functions,
and revert in operation to CGP. If the SM functions were not
useful then they should be ignored. We hope to investigate this in
future work.
8. ACKNOWLEDGEMENTS
We would like to thank Jon Rowe for his valuable comments.
WB acknowledges funding from Atlantic Canada’s
HPC network ACENET and from NSERC under the Discovery
Grant Program RGPIN 283304-07.
Footnote 2: Such as the BBP Formula (described here:
http://mathworld.wolfram.com/BBPFormula.html), which
operates in base 16.
9. REFERENCES
[1] P. Bentley. Fractal proteins. Genetic Programming and
Evolvable Machines, 5(1):71–101, Mar. 2004.
[2] H. J. Brothers and J. A. Knox. New closed-form
approximations to the logarithmic constant e.
Mathematical Intelligencer, 20(4):25–29, 1998.
[3] S. Harding, J. F. Miller, and W. Banzhaf.
Self-modifying cartesian genetic programming. In
H. Lipson, editor, Genetic and Evolutionary
Computation Conference, GECCO 2007, Proceedings,
London, England, UK, July 7-11, 2007, pages
1021–1028. ACM, 2007.
[4] S. Harding, J. F. Miller, and W. Banzhaf. Evolution,
development and learning with self modifying
cartesian genetic programming. In GECCO ’09:
Proceedings of the 11th Annual conference on Genetic
and evolutionary computation, pages 699–706, New
York, NY, USA, 2009. ACM.
[5] S. Harding, J. F. Miller, and W. Banzhaf. Self
modifying cartesian genetic programming: Fibonacci,
squares, regression and summing. In L. Vanneschi,
S. Gustafson, et al., editors, Genetic Programming,
12th European Conference, EuroGP 2009, Tübingen,
Germany, April 15-17, 2009, Proceedings, volume
5481 of Lecture Notes in Computer Science, pages
133–144. Springer, 2009.
[6] S. Harding, J. F. Miller, and W. Banzhaf. Self
modifying cartesian genetic programming: Parity. In
A. Tyrrell, editor, 2009 IEEE Congress on
Evolutionary Computation, pages 285–292,
Trondheim, Norway, 18-21 May 2009. IEEE
Computational Intelligence Society, IEEE Press.
[7] J. Krohn, P. J. Bentley, and H. Shayani. The challenge
of irrationality: fractal protein recipes for pi. In
GECCO ’09: Proceedings of the 11th Annual
conference on Genetic and evolutionary computation,
pages 715–722, New York, NY, USA, 2009. ACM.
[8] E. Maor. e: The Story of a Number. Princeton
University Press, 1994.
[9] J. F. Miller and P. Thomson. Cartesian genetic
programming. In R. Poli, W. Banzhaf, et al., editors,
Proc. of EuroGP 2000, volume 1802 of LNCS, pages
121–132. Springer-Verlag, 2000.
[10] M. Schmidt and H. Lipson. Distilling free-form natural
laws from experimental data. Science,
324(5923):81–85, 2009.
[11] M. D. Schmidt and H. Lipson. Solving iterated
functions using genetic programming. In GECCO ’09:
Proceedings of the 11th Annual Conference Companion
on Genetic and Evolutionary Computation
Conference, pages 2149–2154. ACM, 2009.
[12] L. Spector, D. M. Clark, I. Lindsay, B. Barr, and
J. Klein. Genetic programming for finite algebras. In
GECCO ’08: Proceedings of the 10th annual
conference on Genetic and evolutionary computation,
pages 1291–1298. ACM, 2008.
[13] M. Streeter and L. A. Becker. Automated discovery of
numerical approximation formulae via genetic
programming. Genetic Programming and Evolvable
Machines, 4(3):255–286, 2003.
586
... -Evolution of large parity circuits (Harding et al. 2009b). -Calculation of "pi" and "e" to any arbitrary precision (Harding et al. 2010b (2015) -Symbolic regression problems (Yazdani and Shanbehzadeh, 2015). -Feature selection and dimension reduction (Yazdani et al. 2017 Miller 2004, 2008), and digital electronics problems (Walker and Miller 2005a,b). ...
... Another extension to CGP is Self-Modifying CGP (SMCGP), which defines self-modifying functions that can be executed over time and has its inspiration from biology. SMCGP has been used to study Fibonacci squares and regression (Harding et al. 2009a), evolution of large parity circuits (Harding et al. 2009b), and calculation of "π " and "e" to any arbitrary precision (Harding et al. 2010b). An extension within SMCGP, called SMCGP in two dimensions (SMCGP-2) , was proposed that uses height and width to define a 2D grid of nodes. ...
Article
Cartesian Genetic Programming (CGP) is a variant of Genetic Programming with several advantages. During the last one and a half decades, CGP has been further extended to several other forms with lots of promising advantages and applications. This article formally discusses the classical form of CGP and its six different variants proposed so far, which include Embedded CGP, Self-Modifying CGP, Recurrent CGP, Mixed-Type CGP, Balanced CGP, and Differential CGP. Also, this article makes a comparison among these variants in terms of population representations, various constraints in representation, operators and functions applied, and algorithms used. Further, future work directions and open problems in the area have been discussed.
... Recurrent CGP [241,242] allows the existence of recurrent connections which can target any node in a graph, facilitating the induction of recursive solutions to problems such as generating the Fibonacci sequence or time-series forecasting. Self-modifying CGP [90] facilitates the inclusion of nodes that can create and delete other nodes, thereby allowing a graph to develop to solve a class of problems, such as computing π or e to arbitrary precision [91]. ...
Thesis
Graphs are a ubiquitous data structure in computer science and can be used to represent solutions to difficult problems in many distinct domains. This motivates the use of Evolutionary Algorithms to search over graphs and efficiently find approximate solutions. However, existing techniques often represent and manipulate graphs in an ad-hoc manner. In contrast, rule-based graph programming offers a formal mechanism for describing relations over graphs. This thesis proposes the use of rule-based graph programming for representing and implementing genetic operators over graphs. We present the Evolutionary Algorithm Evolving Graphs by Graph Programming and a number of its extensions which are capable of learning stateful and stateless digital circuits, symbolic expressions and Artificial Neural Networks. We demonstrate that rule-based graph programming may be used to implement new and effective constraint-respecting mutation operators and show that these operators may strictly generalise others found in the literature. Through our proposal of Semantic Neutral Drift, we accelerate the search process by building plateaus into the fitness landscape using domain knowledge of equivalence. We also present Horizontal Gene Transfer, a mechanism whereby graphs may be passively recombined without disrupting their fitness. Through rigorous evaluation and analysis of over 20,000 independent executions of Evolutionary Algorithms, we establish numerous benefits of our approach. We find that on many problems, Evolving Graphs by Graph Programming and its variants may significantly outperform other approaches from the literature. Additionally, our empirical results provide further evidence that neutral drift aids the efficiency of evolutionary search.
... This allows SMCGP to be applied to classes of problems that non-developmental encodings cannot solve. Indeed, it has been shown that SMCGP could find provably general solutions to certain classes of problems: parity, binary addition [26], computation of π and e [27]. Fig. 2: In ICCGP, nodes are called enzymes. An enzyme has two binding vectors and a shape vector. ...
Article
Cartesian genetic programming, a well-established method of genetic programming, is approximately 20 years old. It represents solutions to computational problems as graphs. Its genetic encoding includes explicitly redundant genes which are well-known to assist in effective evolutionary search. In this article, we review and compare many of the important aspects of the method and findings discussed since its inception. In the process, we make many suggestions for further work which could improve the efficiency of CGP for solving computational problems.
... Higher level functions such as adders and multipliers have been used for data compression [289], cell scheduling [228], and robot navigation [252]. Fine granular CGP is used mostly for benchmarking and for some applications such as pattern recognition [153] [6,9,10], signal processing [240,289,299,311,322,343], computer design [87] [11] and robot navigation [151]. ...
Presentation
Genetic Programming is often associated with a tree representation for encoding expressions and algorithms. However, graphs are also very useful and flexible program representations which can be applied to many domains (e.g. electronic circuits, neural networks, algorithms). Over the years a variety of representations of graphs have been explored, such as Parallel Distributed Genetic Programming (PDGP), Linear-Graph Genetic Programming, Implicit Context Genetic Programming, Graph Structured Program Evolution (GRAPE) and Cartesian Genetic Programming (CGP). Cartesian Genetic Programming (CGP) is probably the best known form of graph-based Genetic Programming. It was developed by Julian Miller in 1999-2000. In its classic form, it uses a very simple integer address-based genetic representation of a program in the form of a directed graph. CGP has been adopted by a large number of researchers in many domains. In a number of studies, CGP has been shown to be comparatively efficient to other GP techniques. It is also very simple to program. Since its original formulation, the classical form of CGP has also undergone a number of developments which have made it more useful, efficient and flexible in various ways. These include the addition of automatically defined functions (modular CGP), self-modification operators (self-modifying CGP), the encoding of artificial neural networks (CGPANNs) and evolving iterative programs (iterative CGP).
Article
In nature, brains are built through a process of biological development in which many aspects of the network of neurons and connections change and are shaped by external information received through sensory organs. From numerous studies in neuroscience, it has been demonstrated that developmental aspects of the brain are intimately involved in learning. Despite this, most artificial neural network (ANN) models do not include developmental mechanisms and regard learning as the adjustment of connection weights. Incorporating development into ANNs raises fundamental questions. What level of biological plausibility should be employed? In this chapter, we discuss two artificial developmental neural network models with differing degrees of biological plausibility. One takes the view that the neuron is fundamental (neuro-centric) so that all evolved programs are based at the level of the neuron, the other carries out development at an entire network level and evolves rules that change the network (holocentric). In the process, we hope to reveal some important issues and questions that are relevant to researchers wishing to create other such models.
Article
Optimized shape design is used for such applications as wing design in aircraft, hull design in ships, and more generally rotor optimization in turbomachinery such as that of aircraft, ships, and wind turbines. We present work on optimized shape design using a technique from the area of Genetic Programming, self-modifying Cartesian Genetic Programming (SMCGP), to evolve shapes with specific criteria, such as minimized drag or maximized lift. This technique is well suited for a distributed parallel system to increase efficiency. Fitness evaluation of the genetic programming technique is accomplished through a custom implementation of a fluid dynamics solver running on graphics processing units (GPUs). Solving fluid dynamics systems is a computationally expensive task and requires optimization in order for the evolution to complete in a practical period of time. In this chapter, we shall describe both the SMCGP technique and the GPU fluid dynamics solver that together provide a robust and efficient shape design system.
Conference Paper
Cartesian Genetic Programming (CGP) is a well-known form of Genetic Programming developed by Julian Miller in 1999-2000. In its classic form, it uses a very simple integer address-based genetic representation of a program in the form of a directed graph. Graphs are very useful program representations and can be applied to many domains (e.g. electronic circuits, neural networks). It can handle cyclic or acyclic graphs. In a number of studies, CGP has been shown to be comparatively efficient to other GP techniques. It is also very simple to program. The classical form of CGP has undergone a number of developments which have made it more useful, efficient and flexible in various ways. These include self-modifying CGP (SMCGP), cyclic connections (recurrent-CGP), encoding artificial neural networks and automatically defined functions (modular CGP). SMCGP uses functions that cause the evolved programs to change themselves as a function of time. This makes it possible to find general solutions to classes of problems and mathematical algorithms (e.g. arbitrary parity, n-bit binary addition, sequences that provably compute pi and e to arbitrary precision, and so on). Recurrent-CGP allows evolution to create programs which contain cyclic, as well as acyclic, connections. This enables application to tasks which require internal states or memory. It also allows CGP to create recursive equations. CGP encoded artificial neural networks represent a powerful training method for neural networks. This is because CGP is able to simultaneously evolve the network's connection weights, topology and neuron transfer functions. It is also compatible with Recurrent-CGP, enabling the evolution of recurrent neural networks. The tutorial will cover the basic technique, advanced developments and applications to a variety of problem domains. It will present a live demo of how the open source cgplibrary can be used.
Conference Paper
Cartesian Genetic Programming (CGP) is an increasingly popular and efficient form of Genetic Programming that was developed by Julian Miller in 1999 and 2000. In its classic form, it uses a very simple integer based genetic representation of a program in the form of a directed graph. Graphs are very useful program representations and can be applied to many domains (e.g. electronic circuits, neural networks). In a number of studies, CGP has been shown to be comparatively efficient to other GP techniques. It is also very simple to program. Since then, the classical form of CGP has been developed and made more efficient in various ways, notably by including automatically defined functions (modular CGP) and self-modification operators (self-modifying CGP). SMCGP was developed by Julian Miller, Simon Harding and Wolfgang Banzhaf. It uses functions that cause the evolved programs to change themselves as a function of time. Using this technique it is possible to find general solutions to classes of problems and mathematical algorithms (e.g. arbitrary parity, n-bit binary addition, sequences that provably compute pi and e to arbitrary precision, and so on). The tutorial will cover the basic technique, advanced developments and applications to a variety of problem domains.
Conference Paper
Cartesian Genetic Programming (CGP) is an increasingly popular and efficient form of Genetic Programming. It is a highly cited technique that was developed by Julian Miller in 1999 and 2000 from some earlier joint work of Julian Miller with Peter Thomson in 1997. In its classic form, it uses a very simple integer based genetic representation of a program in the form of a directed graph. Graphs are very useful program representations and can be applied to many domains (e.g. electronic circuits, neural networks). In a number of studies, CGP has been shown to be comparatively efficient to other GP techniques. It is also very simple to program. Since then, the classical form of CGP has been developed and made more efficient in various ways, notably by including automatically defined functions (modular CGP) and self-modification operators (self-modifying CGP). SMCGP was developed by Julian Miller, Simon Harding and Wolfgang Banzhaf. It uses functions that cause the evolved programs to change themselves as a function of time. Using this technique it is possible to find general solutions to classes of problems and mathematical algorithms (e.g. arbitrary parity, n-bit binary addition, sequences that provably compute pi and e to arbitrary precision, and so on). This tutorial will cover the basic technique, advanced developments and applications to a variety of problem domains. The first edited book on CGP was published by Springer in September 2011. CGP has its own dedicated website: http://www.cartesiangp.co.uk
Article
For e, there exists a straightforward Maclaurin series summation that is quite accurate. In this article, we demonstrate that there exist alternative closed-form approximations to e that are also very accurate.
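As a rough illustration of the Maclaurin series summation mentioned in this abstract, the sketch below sums the standard series e = 1/0! + 1/1! + 1/2! + ... using exact rational arithmetic. The function name `e_partial_sum` and the choice of 20 terms are assumptions made for this example; they are not taken from the cited article, and the article's alternative closed-form approximations are not reproduced here.

```python
from fractions import Fraction
import math


def e_partial_sum(terms: int) -> Fraction:
    """Return the exact partial sum of 1/n! for n = 0 .. terms-1."""
    total = Fraction(0)
    term = Fraction(1)  # 1/0!
    for n in range(terms):
        total += term
        term /= (n + 1)  # turn 1/n! into 1/(n+1)!
    return total


# Hypothetical usage: 20 terms is an arbitrary choice for the illustration.
approx = e_partial_sum(20)
error = abs(float(approx) - math.e)
print(f"approximation: {float(approx)!r}, absolute error: {error:.3e}")
```

Because the terms shrink factorially, a small number of terms already matches double-precision `math.e`; using `Fraction` keeps the partial sum exact until the final conversion to `float`.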
Conference Paper
Self Modifying CGP (SMCGP) is a developmental form of Cartesian Genetic Programming (CGP). It is able to modify its own phenotype during execution of the evolved program. This is done by the inclusion of modification operators in the function set. Here we present the use of the technique on several different sequence generation and regression problems.
Conference Paper
Self modifying CGP (SMCGP) is a developmental form of Cartesian genetic programming (CGP). It differs from CGP by including primitive functions which modify the program. Beginning with the evolved genotype, the self-modifying functions produce a new program (phenotype) at each iteration. In this paper we have applied it to a well known digital circuit building problem: even-parity. We show that it is easier to solve difficult parity problems with SMCGP than either with CGP or modular CGP, and that the increase in efficiency grows with problem size. More importantly, we prove that SMCGP can evolve general solutions to arbitrary-sized even parity problems.
Conference Paper
Self-Modifying Cartesian Genetic Programming (SMCGP) is a form of genetic programming that integrates developmental (self-modifying) features as a genotype-phenotype mapping. This paper asks: Is it possible to evolve a learning algorithm using SMCGP?
Conference Paper
Computational development traditionally focuses on the use of an iterative, generative mapping process from genotype to phenotype in order to obtain complex phenotypes which comprise regularity, repetition and module reuse. This work examines whether an evolutionary computational developmental algorithm is capable of producing a phenotype with no known pattern at all: the irrational number PI. The paper summarizes the fractal protein algorithm, provides a new analysis of how fractals are exploited by the developmental process, then presents experiments, results and analysis showing that evolution is capable of producing an approximate algorithm for PI that goes beyond the limits of precision of the data types used.
Conference Paper
We describe the application of genetic programming (GP) to a problem in pure mathematics, in the study of finite algebras. We document the production of human-competitive results in the discovery of particular algebraic terms, namely discriminator, Pixley, majority and Mal'cev terms, showing that GP can exceed the performance of every prior method of finding these terms in either time or size by several orders of magnitude. Our terms were produced using the ECJ and PushGP genetic programming systems in a variety of configurations. We compare the results of GP to those of exhaustive search, random search, and algebraic methods.
Article
The fractal protein is a new concept intended to improve evolvability, scalability, exploitability and provide a rich medium for evolutionary computation. Here the idea of fractal proteins and fractal proteins with concentration levels are introduced, and a series of experiments showing how evolution can design and exploit them within gene regulatory networks is described.
Conference Paper
An iterated function f(x) is a function that when composed with itself, produces a given expression f(f(x)) = g(x). Iterated functions are essential constructs in fractal theory and dynamical systems, but few analysis techniques exist for solving them analytically. Here we propose using genetic programming to find analytical solutions to iterated functions of arbitrary form. We demonstrate this technique on the notoriously hard iterated function problem of finding f(x) such that f(f(x)) = x^2 - 2. While some analytical techniques have been developed to find a specific solution to problems of this form, we show that it can be readily solved using genetic programming without recourse to deep mathematical insight. We find a previously unknown solution to this problem, suggesting that genetic programming may be an essential tool for finding solutions to arbitrary iterated functions.
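To make the definition of an iterated function concrete, the sketch below numerically checks one candidate f against g(x) = x^2 - 2. The candidate f(x) = 2 cosh(sqrt(2) * arccosh(x / 2)), restricted to x >= 2, is a textbook-style closed form assumed here purely for illustration: it is not the previously unknown solution reported in this abstract, and the helper name `max_relative_error` is hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def g(x: float) -> float:
    """Target expression: g(x) = x^2 - 2."""
    return x * x - 2.0


def f(x: float) -> float:
    """Assumed candidate half-iterate of g, valid only for x >= 2."""
    return 2.0 * math.cosh(SQRT2 * math.acosh(x / 2.0))


def max_relative_error(points) -> float:
    """Largest relative gap between f(f(x)) and g(x) over the sample points."""
    return max(abs(f(f(x)) - g(x)) / abs(g(x)) for x in points)


# Hypothetical sample points, all inside the assumed domain x >= 2.
samples = [2.0, 2.5, 3.0, 5.0, 10.0]
print(f"max relative error: {max_relative_error(samples):.3e}")
```

The check works because writing x = 2 cosh(t) gives g(x) = 2 cosh(2t), so applying the candidate twice multiplies t by sqrt(2) twice. A fitness function for evolving such solutions could be built from a similar error measure, though the abstract does not specify the one actually used.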