Knowl Inf Syst
DOI 10.1007/s10115-008-0184-9
SURVEY PAPER
A survey and taxonomy of performance improvement
of canonical genetic programming
Peyman Kouchakpour ·Anthony Zaknich ·
Thomas Bräunl
Received: 5 December 2007 / Revised: 1 September 2008 / Accepted: 8 November 2008
© Springer-Verlag London Limited 2008
Abstract The genetic programming (GP) paradigm, which applies the Darwinian principle
of evolution to hierarchical computer programs, has been applied with breakthrough success
in various scientific and engineering applications. However, one of the main drawbacks of
GP has been the often large amount of computational effort required to solve complex prob-
lems. Much disparate research has been conducted over the past 25years to devise innovative
methods to improve the efficiency and performance of GP. This paper attempts to provide a
comprehensive overview of this work related to Canonical Genetic Programming based on
parse trees and originally championed by Koza (Genetic programming: on the programming
of computers by means of natural selection. MIT, Cambridge, 1992). Existing approaches
that address various techniques for performance improvement are identified and discussed
with the aim to classify them into logical categories that may assist with advancing further
research in this area. Finally, possible future trends in this discipline and some of the open
areas of research are also addressed.
Keywords Genetic programming ·Computational effort ·Efficiency ·
Performance improvement ·Taxonomy
1 Introduction
In the natural world, there is a wealth of complex and intelligent biological organisms and
creatures. Biological processes have offered countless inspirations, presenting novel ideas
and metaphors to scientists for artificial systems. Evolutionary processes have been the vital
force for the progression of the smallest virus to the most complicated creature. This has
led researchers to view evolution as a powerful concept and to test whether the Darwinian
principles of natural selection can be applied in the silicon world as they appear in the carbon
P. Kouchakpour (B) · A. Zaknich · T. Bräunl
School of Electrical, Electronic and Computer Engineering,
University of Western Australia, Nedlands, Perth, WA, Australia
e-mail: peymank@nortel.com
world, resulting in the birth of evolutionary computation (EC). EC has attracted the attention
of many researchers from diverse backgrounds, motivating new developments and applications
[46,77,78,143,198], to name a few. As shown in Fig. 1, evolutionary computation has
four main traditional variants: evolutionary programming (EP), invented in the mid 1960s;
evolution strategies (ES), developed in the 1970s; genetic algorithms (GA), devised in the mid
1970s; and the youngest stream, genetic programming (GP), formalised and championed
in the 1990s.
All the EC variants are based on the same concept of Darwinian evolution and thus the
underlying idea is the same for all of them. That is, they start off with an initial random popu-
lation of individuals and process this set of candidate solutions simultaneously, using natural
selection and operations such as crossover and mutation to produce new candidate solutions.
All the various dialects of EC are population based and stochastic. They use random initiali-
sation together with architecture altering operations, fitness function, selection mechanisms
and termination conditions to discover a solution. The primary feature that characterises
each EC system into its own stream is how the chromosomes are encoded. In fact, the
different EC systems were organized in [9] based on their type of representation, as
illustrated in Fig. 2. In other words, the main difference is in
the structure undergoing adaptation, i.e. the representation or the genotypes. For example,
the data structures encoding the solution are typically fixed length binary characters in GA,
finite state machines in EP and trees with varying shapes and sizes in GP. Consequently,
the definition of their respective variational operators becomes specific and different. Each
discipline therefore differs in its application area. For example, GP is typically positioned in
machine learning whilst the other disciplines generally pertain to optimisation problems.
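The skeleton shared by all these dialects can be sketched in a few lines. This is an illustrative toy rather than any specific system: the bitstring genotype, the one-max fitness function, the tournament size and the mutation rate are all assumptions chosen only to make the loop self-contained and runnable.

```python
import random

def evolve(pop_size=20, genome_len=12, generations=50, seed=0):
    """Minimal generic evolutionary loop: random initialisation,
    fitness-based selection, crossover, mutation, termination."""
    rng = random.Random(seed)
    # Random initial population (bitstrings stand in for any genotype).
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    fitness = lambda ind: sum(ind)  # toy fitness: count of ones

    for _ in range(generations):
        if any(fitness(i) == genome_len for i in pop):
            break  # termination condition: an ideal individual was found

        def tournament():  # selection: the better of two random picks
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b

        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, genome_len)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.1:              # occasional point mutation
                j = rng.randrange(genome_len)
                child[j] ^= 1
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve()
```

Swapping the genotype and the variation operators (real vectors with Gaussian mutation for ES, parse trees with sub-tree crossover for GP) yields the other dialects without changing the loop.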
This work attempts to present an overview and taxonomy of previous research conducted
to enhance the performance of Genetic Programming. Although this paper very briefly covers
Fig. 1 Evolutionary computation with its main variants (evolutionary programming, evolution strategies, genetic algorithms and genetic programming), descended conceptually from biological evolution
Fig. 2 Alternate taxonomy based on type of representation [9]: evolutionary computation split into static representations (fixed-length binary vectors, fixed-length real vectors, dynamic-length vectors) and dynamic representations (trees, graphs), with genetic algorithms and modern GA, evolutionary strategies, evolutionary programming and modern EP, genetic programming and Tierra placed under their respective representation types
the other variants of the Genetic Programming paradigm, its main focus is on the canonical
or standard GP which was championed by Koza in [124,125] based on tree structures.
The creative ideas proposed by researchers to improve the effectiveness and efficiency of
GP are reviewed and categorised in the following sections. Section 2 briefly discusses the
basic properties that influence evolving populations. Section 3 discusses the overall
modifications and improvements. Section 4 looks at Improved GP with the proposed modifications
that various researchers have put forward and forms the bulk of this paper. In Sect. 5, some of
the GP variants and hybrids are highlighted and briefly discussed, followed by conclusions
in Sect. 6.
2 Evolving populations
Although there exist many variants of evolutionary algorithms, their underlying idea is vir-
tually the same. Given a population of candidate solutions, their fitness can be increased by
environmental pressure through natural selection. Selection and variation operators form the
two basic forces in this system. The role of selection is to distinguish amongst individuals
based on their quality to either allow the better individuals to become parents or to replace the
existing individuals with individuals that have higher quality or fitness. Every individual has
an associated fitness value which represents how good an individual is in solving the prob-
lem at hand. The fitness function forms the basis for selection and facilitates improvements.
Therefore, selection can act as a force pressing on quality improvements. On the other hand,
the role of the variation operators is to create new individuals and hence explore the solutions’
search space. It is important to note that this iterative process is not always guaranteed to
produce individuals with higher fitness. Due to the highly stochastic nature of this system,
it is quite likely to witness the phenomenon of genetic drift, where individuals with higher
fitness could be lost from the population or the population could experience a loss of variety
of certain characteristics.
Candidate solutions within the original context of the problem are referred to as phe-
notypes whilst their encodings in the problem solving space are termed as genotypes. The
representation, which is one of the distinguishing differences amongst the various evolution-
ary algorithms, involves specifying the mapping between the phenotype and genotype space.
Consequently, a range of encoding strategies has been developed for the various evolution-
ary algorithms. For example, the encoding of individuals is via fixed-length character strings
(typically binary) in Genetic Algorithms, real-valued vectors in Evolutionary Programming
and trees in canonical Genetic Programming.
Convergence can be generally interpreted as the point at which the population contains
a considerable number of similar individuals. In this instance, the algorithm is either not
progressing satisfactorily or is approaching a local optimum (premature convergence). At
times, convergence has been interpreted as the point at which the algorithm is approaching
the global or optimal solution. In the former, convergence could be considered to be a serious
problem or weakness as it could be expected that after repeated cycles of the evolutionary
process uniformity may arise sooner or later. Maintaining diversity may be considered a
possible remedy to this problem. The diversity of a population is a measure of the number
of different solutions present in the current generation. There exist numerous possible defi-
nitions of diversity in GP. The term diversity can be referred to as the diversity of genotypes
(structural diversity) or behavioural difference (phenotypes).
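The two notions of diversity can be made concrete with a small sketch. The encoding of "programs" as coefficient pairs is purely illustrative; any hashable genotype and any evaluation function would do.

```python
def structural_diversity(population):
    """Genotypic (structural) diversity: fraction of distinct genotypes.
    Genotypes must be hashable (here, tuples)."""
    return len(set(population)) / len(population)

def behavioural_diversity(population, evaluate, cases):
    """Phenotypic (behavioural) diversity: fraction of distinct output
    vectors over a shared set of fitness cases."""
    signatures = {tuple(evaluate(ind, x) for x in cases) for ind in population}
    return len(signatures) / len(population)

# Tiny illustration: a 'program' is a pair (a, b) meaning a*x + b.
evaluate = lambda ind, x: ind[0] * x + ind[1]
pop = [(1, 0), (1, 0), (2, -1), (0, 1)]
cases = [0, 1, 2]
```

Note that two structurally different programs can be behaviourally identical (and vice versa under other encodings), which is why the literature distinguishes the two measures.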
The no free lunch theorem (NFL) [226] states that no search algorithm is superior to any
other algorithm on average across all possible problems. For example, GA may perform
better than random search for certain problems, whereas random search may outperform GA
for other different problems. The erroneous conclusion should not be made from this that
there is no point in designing better algorithms or improving algorithms. We are not typi-
cally interested in solving all possible problems but rather a certain class of problems that
are suitable for the evolutionary algorithms to tackle. The broad implication of the NFL in
regards to performance improvement of the algorithms is that the modifications may only be
appropriate for certain form of representations or the modifications may provide improved
performance for a specific class of problems. This would mean that the stated improvements
are always accompanied with certain assumptions and limitations.
3 Overview of modifications and improvements
The modifications that have been made to the GP to improve performance or save compu-
tational effort generally fall into three categories, GP variants, hybrids and improved GP as
depicted in Fig. 3 and described below.
The structures within the canonical or standard GP are tree-based. However, the structures
that undergo adaptation within the GP variants are no longer tree-based and significantly devi-
ate from the original canonical GP. As previously mentioned, since the genome or structure
is the primary feature that distinguishes the different variants within EC, the same argument
can be applied here. In other words, these variants of GP can be viewed as separate minor
dialects of EC because the representation is changed but they have the same motivation or
application of a GP. Hybridization of EC with other techniques, generally known as memetic
algorithms (MA) [51], is considered a problem-tailored approach. The combination
of GP with other algorithms or problem specific techniques enriches the GP with knowl-
edge and thereby improves its performance. All the other innovative methods to improve the
Fig. 3 Variants of canonical GP: GP variants, hybrids and improved GP, each encompassing various improvements
performance of GP fall in the last category, termed here as improved GP (IGP), which pre-
serves the tree-based structure of the canonical GP. Although GP variants and hybrids are
very briefly discussed herein, the focus will be based on Improved GP.
Before continuing with the survey of the proposed methods that improve the performance
of GP, a brief note needs to be made of the work performed on the theoretical aspects of
GP. Although these studies investigate why GP performs poorly or well, it is essential to
briefly note that the main work on the theoretical aspects of GP encompasses tree schemata
and the schema theorem of GP. Schemata were traditionally used to explain why GAs work
but have recently been used to show the workings of GP [184,185,202]. The definition of a
schema for GP is much less straightforward than for GAs and several alternative definitions
of GP schema have been proposed in the literature (for details, please refer to [185,186]). A
schema is defined as a similarity template composed of one or multiple trees or fragments of
trees. In some definitions [231] schema components are non-rooted leading to considerable
mathematical difficulties. In this case, a schema can be present multiple times within the
same program departing from the original concept of GA schemata. Rather than representing
subsets of the search space, such definitions focus on the propagation of program compo-
nents within the population. Recently, schemata have been represented by rooted trees or tree
fragments [184,185,202], resulting in easier schema calculations. In these studies, it was
hypothesized that solution outcomes are determined by rooted-tree schemata [202] and that
GP identifies these structures first during a run and then builds upon these structures to form
a particular solution. The efforts highlighted above endeavour to build a theory for GP, based
on the concept of schema. In one case, groups of components that propagate within the pop-
ulation and how their occurrences vary over generations are modelled. In another instance,
subsets of the search space and how the amount of individuals in such subsets varies over
generations are modelled.
4 Improved GP
Figure 4 shows how the various improvements on Standard GP may be organized into
possible categories. Generally the improvements are either implemented to remedy an
Fig. 4 Classification of IGP into different categories: improvements on various components of GP (Sect. 4.1), namely variation operators (Sect. 4.1.1), initialization (Sect. 4.1.2), selection (Sect. 4.1.3), control parameters (Sect. 4.1.4), fitness (Sect. 4.1.5) and termination (Sect. 4.1.6); innovative ideas enhancing GP performance (Sect. 4.2); and solutions to known issues or problems within GP (Sect. 4.3), namely premature convergence, bloat, diversity and closure (Sects. 4.3.1 to 4.3.4)
acknowledged issue within Genetic Programming, such as the bloat phenomenon, lack of
diversity or premature convergence etc., or the improvements provide pioneering modifica-
tions to the components of the GP algorithm. The proposed modifications can usually be
further subdivided into three classes. They could be fixed and predetermined prior to the run
and unaltered during the run. For example, one may determine that a certain variation opera-
tion is beneficial and recommend its use for the entire run. Secondly, the modifications could
be deterministic and static—there is a predetermined rule which specifies how and when a
certain modification will take place. The rule f(g), which specifies how and when a modifi-
cation takes place, is a function of time or generation g. For example, it may be suggested that
the mutation operation should only be used after generation number g>30. Lastly, the mod-
ifications could be based on some feedback mechanism. For example, if a certain condition
occurs, then a particular selection mechanism or operation should take effect.
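These three classes of modification might be sketched as a single parameter-control function. The g > 30 rule follows the example in the text, while the specific rates and the stagnation trigger are illustrative assumptions, not values from the literature.

```python
def mutation_rate(generation, stagnant_gens, scheme="deterministic"):
    """Illustrative operator control under the three classes of
    modification: fixed, deterministic/static f(g), and feedback-driven."""
    if scheme == "fixed":
        # Chosen before the run and never altered during it.
        return 0.05
    if scheme == "deterministic":
        # Static rule f(g): mutation is only used after generation 30.
        return 0.05 if generation > 30 else 0.0
    if scheme == "adaptive":
        # Feedback mechanism: raise mutation when fitness has stagnated.
        return 0.20 if stagnant_gens >= 5 else 0.05
    raise ValueError(scheme)
```

The same three-way split applies equally to selection mechanisms or any other component, with the controlled quantity swapped for the mutation rate used here.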
4.1 Improvements on components of GP
Genetic Programming has a number of main components which define its operation, namely
variation operators, initialization, selection, control parameters, fitness and termination.
Researchers have looked into improving or altering all these components in turn to enhance
the performance of GP, as detailed in the following subsections.
4.1.1 Variation operators
Variation operators are used to create new candidate solutions and are typically divided
based on their arity, e.g. mutation and crossover being unary and binary variation operators
respectively. Although crossover shoulders a great responsibility for the evolution of the GP
algorithm [52,53], the mutation operator plays a significant role in the genetic convergence
process by preventing loss of genetic diversity in the population [140]. The main search
operator in GP is the crossover operator and all the other variation operators are often termed
as secondary operations. The crossover operator is discussed in the following sub-subsection
and the next subsection reviews the secondary operators and all the other newly proposed
operators.
4.1.1.1 Crossover The primary operation for modifying genetic structures in GP is cross-
over. Crossover is a stochastic operator which merges information from generally two parents
to create offspring genotypes. The first crossover operator for GP was defined in [124]. The
effect of crossover was investigated in [172]. It was argued that it had the disadvantage of
producing a high computational cost due to growth of individuals in size and complexity
during the evolution process [223]. This effect, which is known as code bloat, is formed by
an excessive exploration capability of the crossover [155]. It was argued [16] that 75% of
crossover events could be termed lethal and can result in the disruption of building blocks. With
recombination or crossover being considered as the primary operator in GP, many researchers
have looked into ways of modifying it to improve the efficiency.
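For reference, the canonical sub-tree crossover that these works modify can be sketched as follows. The nested-tuple encoding of parse trees (operator first, then children; strings as terminals) is an assumption made for brevity, not the representation of any particular system.

```python
import random

# Programs are nested tuples: (op, child, child, ...) for internal
# nodes, plain strings for terminal leaves.
def nodes(tree, path=()):
    """Enumerate (path, subtree) pairs; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from nodes(child, path + (i,))

def replace(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], subtree),) + tree[i + 1:]

def subtree_crossover(p1, p2, rng):
    """Canonical GP crossover: graft a random subtree of p2 onto a
    random crossover point of p1."""
    path1, _ = rng.choice(list(nodes(p1)))
    _, donor = rng.choice(list(nodes(p2)))
    return replace(p1, path1, donor)
```

Because both crossover points are chosen uniformly at random, nothing constrains the size or context of the grafted subtree, which is exactly what the modifications surveyed below try to remedy.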
As crossover was viewed to be destructive, the brood recombination aimed to decrease this
effect and preserve good building blocks. The “soft brood selection” method [1] generated
a brood by performing crossover over the selected parents N times and then introduced the
best of the brood in the next generation by holding a tournament. The “brood recombination”
was introduced in [220,221], which was a refinement of the soft brood selection. The Brood
Selection Recombination Operator RB(n) produced n pairs of offspring but only kept the
best two of the 2 × n produced offspring using a selection function. As the brood selection
performs multiple samplings of the crossover operator and keeps the best 2 offspring, it can
essentially be viewed as a hill-climbing crossover operator. Although brood selection
increases the computational cost per generation, it can also increase the selection pressure
and therefore the rate of convergence towards an optimal solution for some problems. To
reduce the computational cost a clever approach was implemented [220], where the
evaluation of the new 2 × n offspring is performed on a small portion of the training set rather than all
the test cases. The brood size was further investigated [252] for the brood recombination
crossover method. It was shown that as the brood size increased, the performance improved
and the brood recombination method outperformed the standard crossover method for the
three object classification problems studied. The disruptive nature of crossover was reduced
by the brood recombination, as the children of the destructive crossover events were rejected
by this operator and as a result the building of larger building blocks was promoted.
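Brood recombination as described above might look like this in outline. The list-of-integers "programs", the one-point crossover and the sum fitness are stand-ins for any GP representation, crossover operator and fitness function.

```python
import random

def brood_recombination(p1, p2, crossover, fitness, n, rng):
    """RB(n): apply crossover n times to the same parents, producing
    n pairs, then keep only the best two of the 2*n offspring -- in
    effect a hill-climbing flavour of crossover."""
    brood = []
    for _ in range(n):
        brood.extend(crossover(p1, p2, rng))  # one offspring pair per call
    brood.sort(key=fitness, reverse=True)
    return brood[0], brood[1]

# Toy domain: 'programs' are integer lists, recombined by one-point
# crossover and scored by their sum.
def xover(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

c1, c2 = brood_recombination([1, 1, 1, 1], [9, 9, 9, 9], xover, sum, 5,
                             random.Random(0))
```

Evaluating the brood on only a fraction of the fitness cases, as in [220], would replace `fitness` here with a cheaper estimate while leaving the structure of the operator unchanged.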
Other approaches [187,246] choose good sub-trees or crossover points to swap.
Context-aware crossover [104,153] discovers the best possible crossover site for a sub-tree
and is shown to consistently attain higher fitness. There are similarities between the con-
text-aware crossover and the Brood Crossover in that multiple children are produced during
each crossover event. The destructive effects of standard crossover was minimized in [151]
by placing the selected sub-tree in its best context in the parent tree. The best context was
calculated by using the effect of the placement of the selected sub-tree on the overall fit-
ness of the parent tree and then selecting the placement, which produced the maximum final
fitness.
Some heuristics were added to the standard crossover operator [99] to make it smart.
Here, a form of intelligent heuristic guidance for the GP crossover was proposed. The smart
crossover computed the performance values for sub-trees and used this information to decide
which sub-trees are potential building blocks to be inserted into another sub-tree and which
sub-trees are to be replaced due to their poor performance value.
A homologous crossover operator was introduced in [16], where the exchange is strongly
biased towards very similar chunks of genome. Structural distances are measured by com-
paring genotypes and functional distances by comparing phenotypes. These two measures
are used to determine the probability that the trees are crossed over at a specific node. In this
way, the crossover probabilities are biased by structural and functional features of the trees.
Similarly, a GP 1-Point crossover operator was introduced in [184], which had homologous
overtones. This was based on the 1-point crossover for GAs, where the selection process
involved checking for structural similarities of trees to find points with structural homology.
The Ripple Crossover, examined in [114], was shown to outperform the traditional sub-tree
crossover on two benchmark problems. Although the Ripple Crossover was more explorative
than the single tree node crossover, it was a more disruptive crossover operator and its disrup-
tive nature resulted in a slower convergence. A one-point crossover was introduced in [185],
in which the same crossover point is selected in both parents. Two trees are aligned from
the root nodes and recursively and jointly traversed. Recursion is stopped as soon as an arity
mismatch between the corresponding nodes in the two trees is observed. A random crossover
point is selected from the above identified nodes and the two sub-trees below the common
crossover point are swapped. Some of the interesting features of the one-point crossover are
that it is a simpler form of crossover for GP and it facilitates population convergence by
searching for good partial upper part or structure solutions. Moreover, it does not increase
the depth of the offspring beyond that of their parents.
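A sketch of the one-point crossover just described, under the same assumed nested-tuple encoding (operator first, strings as leaves): the parents are traversed jointly from the root, recursion stops at the first arity mismatch, and the sub-trees below one randomly chosen common point are swapped.

```python
import random

def common_region(t1, t2, path=()):
    """Yield paths where the two trees structurally align; traversal
    stops at the first arity mismatch between corresponding nodes."""
    yield path
    a1 = len(t1) - 1 if isinstance(t1, tuple) else 0
    a2 = len(t2) - 1 if isinstance(t2, tuple) else 0
    if a1 == a2 and a1 > 0:
        for i in range(1, a1 + 1):
            yield from common_region(t1[i], t2[i], path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, sub):
    if not path:
        return sub
    i = path[0]
    return tree[:i] + (put(tree[i], path[1:], sub),) + tree[i + 1:]

def one_point_crossover(t1, t2, rng):
    """Select the same crossover point in both parents and swap the
    sub-trees below it."""
    point = rng.choice(list(common_region(t1, t2)))
    return put(t1, point, get(t2, point)), put(t2, point, get(t1, point))
```

Since the chosen point exists at the same path in both parents, each offspring's depth is bounded by the deeper parent, consistent with the depth property noted above.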
Crossover points are conventionally selected randomly. A depth-dependent crossover for
GP was proposed [103], in which the depth selection ratio was varied according to the depth
of a node. Shallow nodes were favoured as the crossover points and hence larger sub-trees
were swapped. This promoted the accumulation of useful building blocks via the encapsula-
tion of a larger part of a tree. The behaviour of the uniform crossover and point mutation was
examined in [178] presenting a novel representation of function nodes, which allowed the
search operators to make smaller movements around the solution space. It was shown that
the performance on the even-6-parity problem was improved by three orders of magnitude
when compared to the standard GP.
The headless chicken crossover operator, which was studied in [132], uses a selected
program P and a newly randomly generated program R to produce an offspring by replacing
a sub-tree of P with a sub-tree from R until it finds an offspring with greater or
equal fitness to P. The crossover-hill climbing [16] operator is another form of the headless
chicken crossover. In [2,88], the fitness of the individual is the sum of its fitness components
or genes. A gene is periodically added to the individual during the evolution and if it improves
its fitness it is kept, otherwise discarded. Between gene additions, the population evolves by
intergene crossover.
A novel crossover method was proposed [113] using the usage frequency of nodes. Three
crossover techniques were investigated, namely a crossover with crossover points in both
nodes having high usage frequency, with crossover points in a node having high usage fre-
quency and a node having low usage frequency, with crossover points in both nodes having
low usage frequency. It was discovered that their method was promising for speedup in GP.
Many researchers looked into determining crossover points that are likely to be more advan-
tageous. For example, a higher-level analysis of the population as a whole was used in [199]
utilizing statistics gathered over all sub-trees to determine the crossover points or in [10]
selective self-adaptive crossover (SSAC) and self-adaptive multi-crossover (SAMC) meth-
ods were used. The depth-fair crossover (DFC) was introduced in [117], which allowed for
weighting crossover points. It assigned an equal weight to each depth of the tree. Each node
within each depth was given an equal amount of the depth weight. Improvements were also
made to the crossover operator [250] using a measure called looseness to guide the selection
of crossover points rather than choosing them randomly. Improvement was shown over the
headless chicken crossover [132] and the standard crossover.
The latest developments imply that the crossover operator is on its way to becoming a
more powerful and robust operator. It is believed that there is still room for the crossover
operator to improve the quality and efficiency of the search it conducts. This can be achieved
by either combining the current approaches or devising new ways for improvement.
4.1.1.2 Other Operators In addition to the primary genetic operation of crossover, there are
various secondary operations such as mutation, permutation, encapsulation etc. The effect of
various operators, namely mutation, permutation, encapsulation and editing, on GP perfor-
mance was first investigated in [125]. Although it was argued that the subject was certainly
not solved and required further work for a general conclusion, it was shown that for some
selected problems there was no substantial difference in performance when these operators
were included.
The performance improvements in GP provided by automatic defined functions (ADF)
and decimation were compared in [164] using the Santa Fe ant, the lawnmower, the even
3-bit parity and symbolic regression problems. It was concluded that decimation provided
superior improvement in performance over ADF. It should however be noted that it was con-
cluded that ADF was not effective for simple problems [127] and its benefits only became
increasingly evident for complex problems.
To overcome the disruption of building-blocks due to crossover and mutation, an adaptive
program called STructured Representation On Genetic Algorithms for Non-linear Function
Fitting (STROGANOFF) was introduced [96,97]. In [98] an adaptive recombination for a
numerical GP was proposed which was guided by a measure called minimum description
length (MDL). The application of mutation or crossover operators was adaptively controlled
to improve efficiency.
A new operator was introduced in [52,53] to minimize the number of evaluations required
to find an ideal solution by evaluating the observed strengths and weaknesses of selected
individuals within areas of the problem. The motivation in [53] was to intelligently per-
form crossover by discriminating between the portions of each parent that lead to success
and failure. A new GP operator, the memetic crossover, was introduced, which allowed
for an intelligent search of the feature-space. The proposed process involved the identifi-
cation of specific areas of importance within the problem (sub-problem) and in tracking
the nodes executed while observing the individual’s performance as it was evaluated. The
information gathered was then organized by ranking the nodes. Nodes that were executed
were said to participate in the sub-problem. Bad and good nodes were associated with one or
more sub-problem failures and successes respectively. Using the memetic crossover method
[53] individuals were examined to ensure compatibility with respect to sub-problem perfor-
mance. The individuals were regarded as compatible, when a significant sub-problem match
occurred between one of the worst performing nodes in the recipient and one of the best
performing nodes in a potential donor. In this instance, crossover was performed with the
recipient replacing its bad node with the donor’s good node. The Los Altos trail and the royal
tree problem, which can easily be decomposed into well-defined sub-problems, were used
as benchmark problems. For this approach to be significantly advantageous, it was required
that the problem be methodically decomposable into sub-problems. Although memetic
crossover incurred additional processing costs, these were considered negligible when
compared to the time saved through the reduction in the number of evaluations.
The macro-mutation operator (headless chicken crossover) was shown in [132] to out-
perform the traditional GP crossover operator. The pruning genetic operator was proposed
in [168] for removing useless structures from the GP individual. The operation was applied
to randomly selected sub-trees. Redundant node patterns, which are problem dependent and
uniquely defined for each problem, were searched and replaced with an effective terminal
node resulting in efficiency improvement in the GP’s search.
In [14], crossover was between a member of the population and an ancestor tree, which
was a fixed collection of trees. The crossover operator generated only one tree, the population
member with one of its sub-trees replaced by a sub-tree of the ancestor. This variation oper-
ator, which was neither a crossover nor a mutation, used information from two individuals
with only one member belonging to the population. Analysis of mean tree size growth dem-
onstrated that this operation limited parse tree growth because the ancestors did not grow. The
genetic material in the ancestor set did not change and was available indefinitely, implying
that building blocks or information was never lost.
It is suggested that new possible operators or methodologies should be devised that either
promote finding good building blocks or reduce the destructive nature of the currently pro-
posed GP operators. Care should however be exercised that the reduction of the disruptive
effects of current operators, which generate new candidate solutions, is not overindulged,
as it may simply transform the GP paradigm into a simple hill climber, which is not
desirable.
4.1.2 Initialization
Initialization involves the random generation of individuals for the initial population. The
traditional GP tree-creation algorithms GROW, FULL and Ramped Half-and-Half were intro-
duced in [125]. It was shown that Ramped Half-and-Half was the best observed for the Quartic
polynomial, 6-multiplexer, Artificial Ant and Linear equations problems. In [138], a random
initialization, which produced programs of random shapes, was defined.
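The GROW, FULL and Ramped Half-and-Half schemes can be outlined as below. The two-function primitive set, the 30% early-termination probability in GROW and the depth ramp starting at 2 are illustrative assumptions, not parameters prescribed by the original work.

```python
import random

FUNCTIONS = {'+': 2, '*': 2}   # assumed toy primitive set: name -> arity
TERMINALS = ['x', '1']

def grow(depth, rng):
    """GROW: draw from functions and terminals alike, so trees take
    irregular shapes up to the depth limit."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    f = rng.choice(list(FUNCTIONS))
    return (f,) + tuple(grow(depth - 1, rng) for _ in range(FUNCTIONS[f]))

def full(depth, rng):
    """FULL: every branch reaches exactly the maximum depth."""
    if depth == 0:
        return rng.choice(TERMINALS)
    f = rng.choice(list(FUNCTIONS))
    return (f,) + tuple(full(depth - 1, rng) for _ in range(FUNCTIONS[f]))

def ramped_half_and_half(pop_size, max_depth, rng):
    """Alternate GROW and FULL while ramping the depth limit from 2
    up to max_depth, mixing tree shapes and sizes in the population."""
    pop, depths = [], range(2, max_depth + 1)
    for i in range(pop_size):
        d = depths[i % len(depths)]
        method = full if i % 2 == 0 else grow
        pop.append(method(d, rng))
    return pop
```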
The RAND_tree algorithm was introduced in [100] and in [21], where trees were initial-
ized with exact uniform probability from a tree-derivation grammar. Using diverse random
seeds, multiple abbreviated runs were made in [181]. An enriched population was created
using the best member from each abbreviated run. This enriched population was then loaded
together with a full set of randomly generated unique members at the start of a consolidated
run.
Two new tree-creation algorithms, probabilistic tree creation (PTC 1 and PTC 2), were
offered in [145], where an average tree size or a distribution of tree sizes could be specified
with guaranteed probabilities of occurrence for specific terminal and non-terminal functions
within the generated trees. PTC 1 & 2 had very low computational complexity and had
comparable results with the GROW technique.
Further research in formulating novel ways of generating new individuals per run should
be called for, as it is believed that the overall fitness and diversity of the initial population in
the first generation plays a significant role in the success of the later generations within that
run. The main goal should be to enrich the initial population and increase its structural and
behavioural diversity without introducing a large computational effort. Finer initialization
techniques for new runs may emerge by either exploring memory or learned behaviour from
previous runs or introducing problem specific innovations into the initialization stage.
4.1.3 Selection
The role of selection is to differentiate among individuals based on their quality. Individu-
als with higher quality are favoured to be parents participating in a variation operation or
be replacements of an existing individual in the case of recombination. Selection is generally responsible for driving quality improvements and is probabilistic. The performance
characteristics of a repertoire of selection methods, namely proportional selection, ranking
selection, and tournament selection, were investigated for time series prediction in [118].
In [189], the sampling behaviour of tournament selection over multiple generations was
analysed, where the analysis was focused on individuals which did not participate in any
tournament at all, due to not being sampled during the creation of the required tournament
sets. A new selection scheme was proposed in [69], which was based on standard tournament
selection, to encourage genetically dissimilar individuals to undergo genetic operation. It
demonstrated performance improvements of GP for the algebraic symbolic regression prob-
lem. An automatic selection pressure for the tournament selection was investigated in [242]
to improve the efficiency of GP. The number of tournament candidates was dynamically
changed in response to the changing evolutionary process. Using the symbolic regression
and the even-6 parity problems it was shown that this approach could improve the effec-
tiveness and efficiency of GP systems. In [244], the relationship between population size
and tournament size was investigated and a new fitness evaluation saving algorithm, evalu-
ated-just-in-time (Ejit), was proposed which resulted in constant computational savings by
avoiding the evaluation of not-sampled individuals.
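A minimal sketch of standard tournament selection, the baseline that the schemes above modify, may help; the names and the higher-is-better fitness convention are illustrative:

```python
import random

def tournament_select(population, fitnesses, k, rng):
    """Sample k individuals uniformly with replacement; return the fittest.

    Larger k raises selection pressure.  Individuals that happen never to
    be sampled into any tournament are exactly those whose evaluation
    schemes such as Ejit can avoid.
    """
    contenders = rng.choices(range(len(population)), k=k)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]
```

Dynamic-pressure variants, such as the one in [242], can be read as varying `k` over the run in response to the state of the search.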
It is believed that a large training set of fitness cases could slow down GP. Dynamic subset
selection based on a fitness case was proposed in [139], where an appropriate topology-based
subset selection method allowed individuals to be evaluated on a smaller subset of fitness
cases. A topology relationship on the set of fitness cases was created during the evolutionary
search by increasing the strength of the relation between two fitness cases that an individual
could successfully solve. The proposed selection method selected a subset, where the fit-
ness cases were distantly related with respect to the induced topology. Using four different
problems, it was shown that dynamic topology-based selection of fitness cases progressed
on average faster than the stochastic subset sampling.
In the canonical GP, selection pressure is only applied in the selection of parents and the
offspring are simply propagated into the next generation without any selection. In [243], a
many-offspring breeding process with selection pressure applied to the selection of offspring
was investigated. A many-offspring breeding process can be viewed as a standard crossover
that generates a large number of poor offspring in the search for good offspring. Two cross-
over operators were proposed. Firstly, the Ideal Crossover considers all the possible ways of
recombining two selected parents to produce all the possible offspring. It then evaluates all
the offspring and keeps the best two offspring with the highest fitness values. Secondly, the
Partial Crossover, which is similar to the context-aware crossover operator [151], selects a
random point for crossover in one parent P1 but considers all the other nodes in the other parent P2 to produce offspring. The focus of these techniques is to optimise the offspring’s fitness, thereby increasing selection pressure.
A theoretical and empirical study that will increase our understanding of which selection
methodologies can be deemed superior during different stages of the evolutionary process
is recommended. This study can then become the basis for implementing new schemes that
explore dynamic selection of various proposed selection techniques during the run. In addi-
tion, further innovative ways that can result in reducing the computational burden of selection
can be very beneficial for enhancing the performance of GP.
4.1.4 Control of parameters
The genetic programming paradigm is controlled by various control parameters such as the
maximum number of generations (G), the population size (M), the probability of crossover
(Pc), recombination (Pr), mutation (Pm) and the maximum tree depth (D), to name a few.
The issue of parameter control and setting was discussed in [51]. The algorithm param-
eters can either be tuned or adapted. Parameter tuning involves the empirical investigation
of parameter values which will result in good performance before the run. Once the best
suited parameter value for the specific problem is determined, its value is set in advance
and remains unchanged for the duration of the run. Alternatively, the parameters could be
deterministically altered as a function of time/generation during the run or adaptively con-
trolled through some heuristic feedback mechanism resulting in explicit adaptation. On the
other hand, the actual parameters could be encoded into the data structures of the algorithm and evolve, with the adaptation being entirely implicit. This can be summarized as per
Fig. 5.
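The distinction between tuning, deterministic control and explicit adaptation can be illustrated for a single parameter, the crossover probability Pc. All numeric schedules and the feedback signal below are invented for illustration:

```python
def crossover_prob(gen, max_gen, mode, feedback=None):
    """Illustrative parameter-control styles for the crossover rate Pc.

    'tuned'         : fixed before the run (parameter tuning).
    'deterministic' : a blind function of the generation number.
    'adaptive'      : explicit adaptation from a heuristic feedback signal,
                      here an assumed fraction of recent crossovers that
                      improved fitness.
    """
    if mode == "tuned":
        return 0.9
    if mode == "deterministic":
        return 0.9 - 0.4 * gen / max_gen          # anneal Pc from 0.9 to 0.5
    if mode == "adaptive":
        return min(0.95, 0.5 + 0.5 * feedback)    # reward productive crossover
    raise ValueError(mode)
```

Implicit adaptation, the fourth branch of the taxonomy, has no such external formula: the parameter value is carried inside each individual and evolves with it.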
4.1.4.1 Tree size There has been a substantial amount of work performed concerning the
solution’s shape (dynamics of tree size and depth) such as [43,44,47,128,211], to name a
few. The tree size parameter is used to impose a size restriction on individuals. Typically the
tree sizes for the random initial population and for evolved individuals are restricted differently, namely through the maximum initial tree size Di and the maximum created tree size Dc.
These parameters impose restrictions on the maximum allowable depth for a tree. The cor-
relation between average parent tree size and the modification point (crossover or mutation)
was shown in [148]. Both of these were directly linked to the size of the resulting child.
A dynamic tree depth limit was explored in [206,207]. The dynamic limit was initially
set as high as the maximum depth of the initial random trees [206]. Trees which exceeded
this threshold were rejected and replaced by one of their parents, unless the tree in question
was the best individual found so far. In this instance, the dynamic limit was increased to
match the depth of this new best-of-run individual. The dynamic limit was lowered as the
new best-of-run individual allowed such reduction [207]. Moreover, a dynamic tree depth
was adopted in [253] to constrain the complexity of programs. The proposed method was
applied to data fitting and forecasting problems with results indicating improvement over
GP.
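The acceptance rule behind the dynamic tree depth limit of [206,207] can be sketched as follows; this is a simplified reading, with higher-is-better fitness assumed:

```python
def dynamic_depth_accept(child_depth, child_fitness, limit, best_fitness):
    """Dynamic tree depth limit, sketched after the rule described above.

    A child deeper than the current limit is rejected (the caller replaces
    it with one of its parents) unless it beats the best fitness found so
    far, in which case the limit is raised to match its depth.  Returns
    (accepted, new_limit, new_best_fitness).
    """
    if child_depth <= limit:
        return True, limit, max(best_fitness, child_fitness)
    if child_fitness > best_fitness:
        return True, max(limit, child_depth), child_fitness
    return False, limit, best_fitness
```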
Some researchers have looked into limiting the total number of tree nodes of the entire population [228], rather than imposing limits at the individual level. The concept
of resource-limited GP, which was a further development to [228], was introduced in [208].
As the total number of nodes in the population exceeded a predefined limit, resources became
Fig. 5 Taxonomy of parameter adjustment: parameter tuning is performed before the run (static), while parameter adaptation takes place during the run (dynamic) and may be deterministic, explicitly adaptive or implicitly adaptive
scarce and not all offspring were guaranteed to progress. The candidates were queued by fit-
ness and progressed into the next generation on a first come first serve basis. The trees that
required more resources than the amount still available would not survive. The relationship
between size and fitness was not explicitly defined and was a product of the evolutionary
process. A natural side effect of this approach was that the population was automatically
resized. Although these approaches used the same rationale, they operated at different levels
of the GP paradigm, namely acting at the individual level and at the population level. Tree
depth limits imposed a maximum depth to each individual and Resource-limited GP limited
the total amount of resources that the entire population could use.
The two different approaches, tree depth limits and resource-limited GP, were compared
in [210] using symbolic regression, even parity, and artificial ant problems. It was shown
that the resource-limited GP was superior to tree depth limits [210]. A dynamic approach
to resource-limited GP was developed in [209]. The dynamic resource limit was initially
set as high as the amount of resources required for the first generation. The allocation of
trees, sorted according to fitness, continued into the next generation until the resources were
exhausted (as per original resource-limited GP). The rejected individuals would be consid-
ered as candidates for the next generation if the mean population fitness was improved. Hence
the dynamic resource limit was raised as a function of mean population fitness providing the
additionally needed resources. It was shown that the dynamic approach to resource-limited
GP achieved better performance when compared with the static approach and with the tradi-
tional depth limits, using the symbolic regression polynomial problem and Santa Fe artificial
ant problem.
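The survival queue of resource-limited GP can be sketched as follows. This is a simplified reading of [208]; in particular, whether the scan continues past a tree that does not fit is an assumption made here for illustration:

```python
def resource_limited_survivors(offspring, node_budget):
    """Resource-limited GP survival, sketched.

    offspring: list of (fitness, num_nodes) pairs.  Candidates are queued
    by fitness (best first) and admitted while the population-wide node
    budget lasts; a tree needing more nodes than remain does not survive.
    A side effect is that the surviving population is automatically resized.
    """
    survivors, remaining = [], node_budget
    for fit, size in sorted(offspring, key=lambda p: p[0], reverse=True):
        if size <= remaining:
            survivors.append((fit, size))
            remaining -= size
    return survivors
```

The dynamic variant of [209] can be read as raising `node_budget` between generations whenever mean population fitness improves.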
4.1.4.2 Operator and selection probabilities As GP is a completely stochastic process, it
is controlled by various probabilistic control parameters, such as the probability of selecting
an internal point (Pip) as a node for the crossover operation, or the probabilities of crossover and reproduction (Pc and Pr), which determine the fraction of individuals created by each process for the next generation.
In [172], explicitly defined introns (EDIs) were introduced as instruction segments that
were inserted between two nodes of useful code. EDIs changed the probability of crossover
between the two nodes on either side of the EDI improving the convergence properties of
a GP algorithm. Similarly in [30] the probability of selection of every node for crossover
was indirectly adapted through the evolutive introns (EIs). Evolutive introns are explicitly
defined introns, which are artificially generated, with the aim to increase the probability of
selecting good crossover points as the evolutionary process continues. The automatic growth
and shrinking of non-coding segments in the individuals are promoted, thereby adapting the
probabilities of groups of code being protected.
The adaptation of operator probabilities in genetic programming was investigated in [167]
with an attempt to reduce the number of free parameters within GP. Two problems from the
areas of symbolic regression and classification were used to show that the results were bet-
ter than randomly chosen parameter sets and could contest with parameters set as based on
empirical knowledge.
4.1.4.3 Function and terminal sets In the conventional GP, the structures are typically composed of the set of Nfunc functions from the function set F = {f1, f2, ..., fNfunc} and the set of Nterm terminals from the terminal set T = {a1, a2, ..., aNterm}, forming the combined set C = F ∪ T. This combined set defines the set of all the possible structures or elements.
The choice of function and terminal sets can have a significant effect on the GP’s performance
and if the sets are not sufficient to express a solution for a given problem, then GP would not
be able to solve the problem.
The effect of extraneous variables and functions was first studied by [125] and it was shown
that a linear degradation in performance was observed as additional extraneous variables
were added for the cubic polynomial problem. Similar results were obtained for extraneous
functions; nevertheless it was shown that for some specific problems, extraneous functions
improved performance. Furthermore, no substantial difference in performance was observed
for extraneous ephemeral random constants. It was concluded that the question of extraneous
sets was not definitely answered in this study and further experimental and theoretical work
was recommended to be carried out leading to general conclusions. In [229], a systematic
study was conducted of how to select appropriate function sets to optimize performance.
They classified functions into function groups of equivalent functions and showed that a set
that was optimally diverse (that included one function from each function group) was most
appropriate.
4.1.4.4 Population and generation number Population size (M) and the maximum number
of generations (G) are the two major numerical control parameters in GP, with their values
generally dependent on the difficulty of the problem. The role of population size in GP was
first very briefly studied by [125]. In this study, the 6-multiplexer problem was solved with
various population sizes. The study concluded that a larger population size M increased the cumulative probability P(M, i) of satisfying the success predicate of a problem for GP, for generations between generation 0 and generation i. In this study, the population size was
maintained at a constant level and was not varied throughout the run. However, there was
a point after which the cost of a larger population surpassed the benefit achieved from the
increase in P(M,i).
It was shown that using a large population was not always the best way to solve problems
[70]. It was demonstrated in [182] that as problem complexity increases, determination of
the optimal population size becomes more difficult. The control of population size in GP was
first implemented by [215] to improve the algorithm’s robustness and reliability. The plague
operator (first experimented within GA [42]) was introduced in [57,58] to fight bloat in GP,
where individuals were removed at a linear rate to compensate for the increase in individual
size. It was shown that computational effort could be saved and plague allowed a given fit-
ness level to be reached with a smaller computational effort. The decrease in population size
was also studied in [149], where the population size was gradually decreased throughout the
GP run. In Virtual Ramping, the size of the population and the number of generations were
continuously increased [61], reducing premature convergence.
In [122], the population variation (PV) scheme was introduced, where the population
could be increased and/or decreased with a variable profile. The increment or reduction of
population size could take on any flexible profile such as linear, exponential, hyperbolic,
sinusoidal or even random. It was demonstrated that PV significantly improved performance
and showed that the optimum profile was dependent on the problem domain. An investigation
was carried out to determine whether the nature of the “population variation”, i.e. the way
the population was altered during the run, had any significant impact on GP performance in
terms of computational effort. In addition, a novel concept was introduced to ensure that the
computational effort for an unsuccessful run in the PV scheme would never exceed that of
the Standard Genetic Programming. It was shown that the PV algorithm outperformed the
plague operator.
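The static population-variation profiles mentioned above can be sketched as deterministic schedules; the particular interpolation formulas below are illustrative, not those of [122]:

```python
import math

def pv_population_size(gen, max_gen, initial, final, profile):
    """Static population-variation profiles, sketched.

    Returns the population size at generation `gen` under a blind
    deterministic profile interpolating from `initial` to `final`,
    mirroring the linear/exponential/sinusoidal shapes named in the text.
    """
    t = gen / max_gen
    if profile == "linear":
        s = t
    elif profile == "exponential":
        s = (math.exp(t) - 1) / (math.e - 1)
    elif profile == "sinusoidal":
        s = math.sin(t * math.pi / 2)
    else:
        raise ValueError(profile)
    return round(initial + (final - initial) * s)
```

Dynamic population variation replaces the blind argument `gen` with a heuristic feedback signal reflecting the actual progress of the run.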
Static population variation employs a deterministic adaptation approach using simple time-
varying schedules, where the population size is varied according to a deterministic function.
One of the shortcomings of the static population variation scheme is that the population size
is varied by a blind deterministic function. It is more desirable to vary the population size in
an informed way during the run. Using a heuristic feedback mechanism, the population size
can be dynamically varied by taking into account the actual progress of GP in solving the
problem. A technique was introduced in [62] to dynamically vary the size of the population
during the execution of the GP system. The population size was varied “on the run” accord-
ing to some particular events occurring during the evolution. In [123] various new ways to
dynamically vary the population size during the run of the GP system were proposed. The
proposed approach, referred to as dynamic population variation (DPV), extended the work
of [62] and it was shown that DPV was superior to Standard Genetic Programming (SGP),
the plague operator, the dynamic population modification technique in [62] and all the static
PV schemes reported in [122].
Some possible potential research in this area may be a further study into population vari-
ation based on structural and behavioural diversity, which may yield interesting results.
4.1.5 Fitness and objective function
Fitness is the driving force of natural selection and measures the quality of an individual
with respect to how well an individual can solve a given problem. It is generally defined
by an objective or fitness function forming the basis for selection, defining and facilitating
improvements.
Multi-objective techniques [31], which allow the concurrent optimization of several objectives by searching for the so-called Pareto-optimal solutions, were investigated in [20] to evolve
compact programs. There are various multi-objective optimization techniques, such as the strength Pareto evolutionary algorithm (SPEA) [256], SPEA2 (an improved version of SPEA) [258], SPEA2+ [119], NSGA-II [45] and adaptive parsimony pressure [248]. The program size was
considered to be a second independent objective in addition to the program functionality in
[20] and an enhanced version of SPEA proposed in [257] was used. A multi-objective GP
was used in [13] to make improvements on the results.
Novel strategies based on elastic artificial selection (EAS) and improved minimum description length (IMDL) were investigated in [121] for fitness evaluation and selection to create
shorter programs and prevent premature convergence. The effect of tournament selection
and fitness proportionate selection with and without over-selection for particular problems
was investigated in [125]. It was shown that for many problems it was possible to enhance
the performance of GP by greedily over-selecting the fitter individuals in the population.
In [241], the whole population was clustered to reduce the fitness evaluations and improve
the effectiveness of GP. The clustering was performed by a heuristic called fitness-case-equivalence. For each cluster, a cluster representative was selected and its fitness calculated and
directly assigned to other members in the same cluster. Using a clustering tournament selec-
tion method and a series of experiments of symbolic regression, binary classification and
multi-class classification problems, it was shown that the new GP system outperformed the
standard GP on these problems.
A new methodology was proposed [162] to create a new training set of randomly-gener-
ated fitness cases prior to each generation of the GP run instead of using a fixed set of fitness
cases. It was shown that this methodology was mainly useful in reducing the brittleness of GP
when the fixed training population does not adequately represent the full range of difficult
situations of the problem. The fitness function was scaled over time in order to improve
performance in [71]. The motivation behind this approach was that it is often easier to learn
difficult tasks after simpler tasks have been learned.
As the majority of the computational effort in GP is expended in the fitness function, it
is beneficial to avoid invoking the fitness function whenever possible. As the reproduction
operator produces an identical copy of its parent, it is then quite obvious that the fitness
evaluation can be safely avoided. This can result in considerable savings in computation
because reproduction in GP conventionally accounts for the creation of ten percent of the
new individuals. This flagging and caching of the already computed fitness of reproduced
individuals was proposed in [125], provided that there are no varying fitness cases from
generation to generation. A technique was devised in [105], which allowed the GP system
to determine many instances in which invocation of the fitness function could be avoided.
This was achieved through the consideration of the program nodes executed during fitness
evaluation to establish whether a newly generated individual has the same fitness value as
its parent. This could be realized through the identification of dormant nodes [106], which
are program nodes that are never executed, extending a marking method described in [18].
It was shown that this technique, when applied to the multiplexer problem and even-parity
problem, resulted in significant savings in execution time.
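The flagging-and-caching idea for reproduced individuals can be sketched as a memoized evaluation loop. This is a generic sketch, not the specific mechanism of [105] or [106]:

```python
def evaluate_population(population, fitness_fn, cache):
    """Skip re-evaluation of unchanged individuals, sketched.

    `cache` maps a hashable program representation to its stored fitness.
    Individuals copied unchanged by reproduction hit the cache and avoid a
    fitness-function call.  As the text notes, this is only safe when the
    fitness cases do not vary from generation to generation.
    """
    fitnesses, calls = [], 0
    for prog in population:
        key = repr(prog)
        if key not in cache:
            cache[key] = fitness_fn(prog)
            calls += 1
        fitnesses.append(cache[key])
    return fitnesses, calls
```

Dormant-node detection goes further: two syntactically different programs whose executed nodes are identical can also share one evaluation.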
4.1.6 Termination
In traditional GP, a fixed number of generations is usually used as the condition for terminating the evolution process. To address the CPU time-consuming issue and the large amount
of computational resources required for GP, the three different termination criteria of effort,
time and max-generation were examined by [67]. An improved termination criterion was
implemented in [131] to prevent premature termination, when further search may continue
to pay off, or to prevent unnecessarily continuing to search dead-ends when further progress
seems implausible. Here, the run will continue as long as improvements continue to be made.
A maximum number of unproductive generations is used to terminate a run.
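The improvement-based termination rule can be sketched as follows; this is a simplified reading of [131], with higher-is-better fitness assumed:

```python
def should_terminate(best_fitness_history, max_unproductive, target=None):
    """Improvement-based termination, sketched after the rule in the text.

    The run continues while improvements keep arriving; it stops once the
    best fitness has not improved for `max_unproductive` consecutive
    generations, or earlier if an optional success predicate `target` is met.
    """
    if target is not None and best_fitness_history and \
            best_fitness_history[-1] >= target:
        return True
    if len(best_fitness_history) <= max_unproductive:
        return False
    recent = best_fitness_history[-(max_unproductive + 1):]
    return max(recent[1:]) <= recent[0]   # no gain over the window
```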
Examination of different measures for stagnation and premature convergence could be
most promising together with newly invented methodologies to abruptly terminate a run and
commence a new run with the knowledge gained from the previous run. Moreover, a thor-
ough study on a dynamic termination method that is based on the combined structural and
behavioural diversity is suggested.
4.2 Innovative ideas
The improvements detailed in this section contain some pioneering modifications to the
canonical GP algorithm to enhance its performance.
4.2.1 Simplification
An approach to online simplification was introduced in [236,251], where programs were auto-
matically simplified during the evolution using algebraic simplification rules and algebraic
equivalence. The proposed method was tested on the regression and classification problems,
showing its superior performance when compared with the standard GP systems.
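Algebraic simplification during evolution can be illustrated with a tiny rule set; the two rules below are stand-ins for the much larger rule bases used in [236,251]:

```python
def simplify(tree):
    """Online algebraic simplification, sketched with two rewrite rules.

    Trees are nested tuples ('op', left, right) or leaves.  The rules
    x*1 -> x and x+0 -> x (applied bottom-up) are illustrative; real
    systems also exploit algebraic equivalence between subtrees.
    """
    if not isinstance(tree, tuple):
        return tree
    op, a, b = tree[0], simplify(tree[1]), simplify(tree[2])
    if op == "*" and b == 1: return a
    if op == "*" and a == 1: return b
    if op == "+" and b == 0: return a
    if op == "+" and a == 0: return b
    return (op, a, b)
```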
4.2.2 Modularization
The technique of automatically defined functions (ADF) was introduced in [126] to potentially
define useful functions dynamically during a run and to accelerate the discovery of solutions
in GP. The number of fitness evaluations that must be executed [127] can be considered as a
reasonable measure of computational burden. In [127], ADF was shown to allow the discov-
ery and exploitation of regularities, symmetries, similarities and modularity of the problem
environment. It was shown that for simpler versions of problems ADF was not effective but
as the problems were scaled up the increasing benefits became evident.
Evolving modular programs was also investigated in [6,8], where special mutation oper-
ators (compress and expand) defined modules from the developing programs at random,
allowing modular programs to emerge using the genetic library builder (GLiB). An approach
for reusability was proposed in [92] based on ADF, incorporating a library for keeping sub-
routines acquired by ADF. This library preserved knowledge and ensured reusability so that
the acquired subroutines could be shared and reused. The most frequent sub-trees, which
were expected to contain useful partial solutions, were grouped as modules [195]. Such sub-
trees were encapsulated by representing them as atoms in the terminal set. Additionally, a
random sub-tree selection and encapsulation was examined and empirical results illustrated
performance improvement over standard GP. A method for automatically generating useful
subroutines by systematically considering all small trees was presented in [34]. This algo-
rithm moved progressively and systematically through the best trees of a given size and
considered them as candidates for subroutine generation. This algorithm was successfully
tested on the artificial ant problem.
Layered learning has been used for solving GP problems in a hierarchical fashion. The
layered learning approach [39,79,94,95,107,217] decomposes a problem into subtasks, each
of which is then associated with a layer in the problem-solving process. It is believed that
the learning achieved at lower layers when solving the simpler tasks directly facilitates the
learning required in higher subtask layers. Two program architectures are proposed in [108]
for enabling the hierarchical decomposition based on the division of test input cases into
subsets, each dealt with by an independently evolved code segment. The main program
branch includes calls to these new entities via an expanded terminal set. The proposed tech-
nique offered substantial performance improvements over the more established methods such
as the ADF for the even-10 parity problem.
A sub-tree was randomly selected in module acquisition (MA) [7,120] from an individual
and then a part of this sub-tree was extracted as a module and preserved in a library defined
as a new function. This module was protected against blind crossover operations and could
be referred to by other individuals. In adaptive representation GP (AR-GP) [200,201], an
effective sub-tree was selected and added as a new function to the function set to improve
learning efficiency.
For GP to be able to address more demanding larger and more complex solution programs,
it is inevitable for GP to have the ability to scale up. One way to achieve this is through more
efficient modularization practices. Further theoretical and empirical studies that aid in under-
standing the concept of building blocks within GP, their early detection and their further
development and enrichment would certainly be a promising way to explore new schemes
or possible improvements of current methods. In addition, a study that combines the current
approaches may provide new insights in this area.
4.2.3 Other innovative ideas
Double-based genetic algorithm (DGA), which improves the performance of a GA, was
shown to be relevant for the GP paradigm [36]. Two types of doubles were defined based
on permutations on the arguments and permutations on the terminals of the terminal set,
introducing doubles into the population set. The double-based genetic programming paradigm provided a useful extension of the standard GP search procedure and demonstrated its advantages for genetic programming.
The best subtree genetic programming (BSTGP) [163] selects the best sub-tree in order
to provide the solution of the problem. This is different from the canonical GP, where the
fitness of a tree is given by its root node. BSTGP also produces smaller trees as nodes that do
not belong to the best sub-tree are deleted. The proposed approach was tested using a number
of symbolic regression and classification problems showing comparable results to standard
GP.
An approach using a clustering method was described [12] to reorganize subpopulations in
GP, with the goal of producing more highly fit individuals. The initial population P is divided into a number of subpopulations Si after a nominated clustering frequency and according to the genetic similarity of the individuals. The sizes of the subpopulations are proportional
to the average fitness of the individuals they contain. It was shown that a slight speedup
over the canonical GP was observed for the multiplexer, parity and artificial ant problems.
The evaluation of a generation is widely accepted to be the most expensive process in GP.
Sub-tree caching and vectorized evaluation [115] attempted to make this less expensive and
more efficient. Two types of bottom-up and top-down caching were introduced, where the
latter encouraged the caching of big sub-trees and the former encouraged the caching of small
sub-trees. Although the caching of big sub-trees made the evaluation process more efficient,
it was less likely that it could be matched and used again during the evaluation process due
to its larger size.
It should be noted that the other main innovative ideas studied herein and proposed by
various researchers, which are specific to either the GP components and operators or the GP
aspects and concepts that they pertain to, have been grouped, categorized and discussed in
the other sections of this paper.
4.3 Solutions to known issues or problems within GP
The improvements detailed in this section endeavour to remedy an acknowledged issue within
GP, such as the bloat phenomenon or lack of diversity etc.
4.3.1 Closure
The initial population-generating algorithms may not always generate valid individuals. The
grammar-guided genetic programming (GGGP) attempted to address this known closure
problem. The reader is referred to Sect. 5.1.
4.3.2 Premature convergence
Various studies have been conducted to address the issue of premature convergence [172,240,
253]. The issue of premature convergence is mainly addressed by making improvements to the components of GP. The reader is referred to Sect. 4.1.
4.3.3 Diversity
There has been much work which has focused on diagnosing or remedying the loss of diver-
sity within Evolutionary Computation. A new method of approximating the genetic similarity
between two individuals was presented in [69], which used ancestry information to exam-
ine the issue of low population diversity. By defining a new diversity-preserving selection
scheme, genetically dissimilar individuals were selected to undergo genetic operation. This
provided the means to alter the perceived fitness of individuals. The study of how multi-
population GP helps in maintaining phenotypic diversity was conducted in [225]. In [158],
negative correlation was examined to improve diversity and prevent premature convergence.
A study to evaluate the influence of the parallel GP in maintaining diversity in a population
was conducted in [66].
A two-phase diversity control approach was proposed by [240] to prevent the common
problem of the loss of diversity in GP. The loss of diversity was prevented in the early
stage through a refined diversity control (RDC) method with automatically defined functions
(ADF) and a fully covered tournament selection (FCTS) method. RDC was an extension
to general diversity control (GDC), which passed the diversity check if two whole program
trees, with the main tree and ADFs treated as a whole, were not exactly identical in genotype.
RDC treated the main tree and ADF as individual objects and hence both were required to
be unique for the diversity check to pass. FCTS was an extension to the standard tournament
selection (STS). It was argued in [240] that due to the randomness of STS, individuals with
bad fitness may be selected multiple times, while an individual with good fitness may never
be selected. FCTS avoided this issue by excluding individuals that had already been selected.
The proposed methods effectively improved the GP’s performance, resulting in a reduction of the number of generations needed to reach an optimal solution and decreased incidences of
premature convergence.
4.3.4 Bloat and code growth
Many researchers have highlighted the problem of bloat, which is the uncontrolled growth
of the average size of an individual in the population. There exist numerous studies of code
bloat in GP [17,76,116,134,137,156,171,192,213,214,239]. Three principal approaches
were summarized in [133] to prevent bloat: (i) limiting tree depth to some maximum value,
(ii) using parsimony pressure, including multi-objective (MO) methods, and (iii) tailoring
genetic operators, such as size-fair crossover [135,138] or fair mutation [136].
In standard GP this issue is dealt with indirectly by limiting the maximal allowed depth of
individual trees. This can be viewed as unsatisfactory, as it requires knowledge of the
maximum necessary depth in advance of solving the problem. The effects and biases of
size and depth limits on variable length linear structures were explored using empirical and
theoretical analyses in [157]. It was argued in [253] that the increasing size of trees would
reduce the speed of convergence towards a solution and thus affect the fitness of the best
solution. Consequently, the dynamic maximum tree depth was proposed to avoid the typical
undesirable growth of program size. A technique was demonstrated in [161] which significantly
constrained the growth of solutions, i.e. bloat. This method imposed a maximum size
on the created individuals within the population, which solely depended on the size of the
best individual of the population. It was shown that the combination of depth limiting and
methods that punish individuals for excess size was effective [150].
One mechanism for limiting code size is constant parsimony pressure, where larger
programs are penalized by adding a size-dependent term to their fitness [212]. This
technique incorporated program size as an additional, hidden objective. The
application of parsimony pressure was investigated in [72] in order to reduce the complexity
of the solutions. Their results reported that while accuracy on the test sets was preserved
for the binary classification setup, the mean tree size was significantly reduced. Parsimony
pressure has also been used in [146,147] to fight bloat. Parsimony pressure incorporated in
a multi-objective framework has been used by many researchers [197,254,255]. The use
of multi-objective optimization for size control was studied in [40]. Multi-objective tech-
niques in the context of GP were also investigated in [20] to reduce the effects caused by
bloating. The inclusion of the tree size measure as one of the objectives has been found to
be extremely effective at controlling bloat. It was shown in [15] that mutation can be used
to prevent population collapse, a phenomenon where the population in multi-objective GP
rapidly degenerates to trees of a single node; mutation tends to produce a positive mean
increase in tree size per generation, counterbalancing the parsimony pressure exerted by the
fitness-based selection process. In [111], functionality was optimized first and size
afterwards (a two-stage ranking method). The advantage of this was that pressure on size
did not deter GP from discovering good solutions, because pressure was only applied once
an individual had already reached the desired performance. However,
bloating continued in solutions that had not attained the aspired performance.
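The size-penalized fitness of constant parsimony pressure, and the two-stage ranking idea of [111], can be sketched as follows; the coefficient, names and (fitness, size) tuple encoding are our illustrative assumptions, not taken from the cited works:

```python
def parsimonious_fitness(raw_fitness, size, c=0.01):
    """Constant parsimony pressure: subtract a size-dependent term from
    the fitness, so larger programs are penalized (higher is better)."""
    return raw_fitness - c * size

def two_stage_key(individual, target_fitness):
    """Two-stage ranking in the spirit of [111]: optimize fitness first;
    size only exerts pressure once the desired performance is reached."""
    fit, size = individual  # illustrative (fitness, size) encoding
    if fit >= target_fitness:
        return (1, fit, -size)  # solved: prefer smaller among good solutions
    return (0, fit, 0)          # not solved: no pressure on size
```

For example, `max(pop, key=lambda ind: two_stage_key(ind, 0.8))` prefers the smaller of two equally fit solutions only once they meet the target.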
The genetic operators can also be modified to address the problem of bloating, such as the
Deleting Crossover in [19]. In [50], speed improvements were observed by removing introns.
In [183], the issues of overly long solutions and bloat were addressed by maximum homologous
crossover (MHC). Equivalent structures from parents were preserved by aligning them
according to their homology. MHC was tested on a symbolic regression problem and
demonstrated its ability to reduce bloat without inducing any specific biases in the distribution
of sizes, allowing efficient size control during evolution. It was evident from the results that
control of the size was possible while improving performance. The use of multiple crossovers
was explored as a natural means to contain code growth [216]. Multiple crossovers
were performed between two parent trees, where the total number of crossovers occurring
between the two selected parents is dependent on the sum of the sizes of the parents involved.
Three similar multi-crossover algorithms were shown to be a viable choice for containment
of code growth.
It was argued in [238] that a significant problem with GP was the continuous growth
of individuals' size without a corresponding increase in fitness. A self-improvement (SI)
operator was applied in combination with a characteristic based selection strategy to reduce
the effects of code growth. Instead of simply editing out non-functional code the proposed
method selected sub-trees with better fitness. The SI operator selects individuals that have
at least one sub-tree that has a higher fitness value than its original tree (mother individual).
This will result in a fitter individual to replace a less performing individual with the reduction
in depth and node count from its original tree. In other words, the sub-tree is upgraded to a
new individual by removing the branches of the original mother tree. The performance of
the proposed method was validated by testing it on a symbolic regression and a multiplexer
problem showing a substantial reduction of code growth while maintaining the same level
of fitness. It may be argued that the proposed approach is suitable for the above problems
because they are simple and decomposable.
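The SI operator can be sketched roughly as follows, assuming some tree encoding with an enumerable set of sub-trees and a higher-is-better fitness function; this is our own simplification, not the implementation of [238]:

```python
def self_improve(tree, subtrees, fitness):
    """SI-style promotion: if any sub-tree is fitter than the whole tree,
    the best such sub-tree replaces the mother individual, discarding the
    remaining branches and so reducing depth and node count."""
    best = max(subtrees, key=fitness, default=tree)
    return best if fitness(best) > fitness(tree) else tree
```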
It was claimed in [247] that code bloat slowed down the search process, destroyed program
structures, and exhausted computer resources. Non-neutral offspring (NNO) operators
and non-larger neutral offspring (NLNO) operators were proposed to deal with these issues.
An offspring can be distinguished as improved, neutral or worsened with respect to its
fitness in comparison with its parents. Neutral offspring that are larger in size than their
parents (LNOs) can be further separated from those that are smaller than their parents
(SNOs). An LNO has more introns while an SNO has fewer, but both have the same exon
structure as their parents. It is argued that evolutionary processes favour LNOs, resulting
in intron growth with no performance improvement. The first approach, the non-neutral
offspring (NNO) operator, discarded all neutral offspring. The second, called
non-larger neutral offspring (NLNO), kept SNOs and discarded only LNOs. Both approaches
confined intron growth to different degrees. These two kinds of neutral-offspring-controlling
operators were tested on two GP benchmark problems, symbolic regression and the
multiplexer, to verify whether they could successfully apply parsimony pressure. It
was concluded that only NLNO was able to confine code bloat and simultaneously improve
performance.
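The NNO/NLNO rules amount to a simple accept/reject filter applied to each offspring. The sketch below is our own reading of [247]; in particular, the treatment of equal-sized neutral offspring is an assumption:

```python
def nlno_accept(offspring_fitness, parent_fitness, offspring_size, parent_size):
    """NLNO-style filter: keep non-neutral offspring unchanged; among
    neutral offspring, reject those larger than the parent (LNOs) and
    keep the rest (SNOs)."""
    if offspring_fitness != parent_fitness:
        return True                       # improved or worsened: kept
    return offspring_size <= parent_size  # neutral: discard only if larger

def nno_accept(offspring_fitness, parent_fitness):
    """NNO-style filter: discard every neutral offspring."""
    return offspring_fitness != parent_fitness
```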
It was shown in [218] that by eliminating bloat, the performance of GP could be improved.
Some modifications to the selection procedures were presented to eliminate bloat without
deteriorating performance. The relationship of the bloat phenomenon with parallel and
distributed GP has been investigated by many researchers, and positive results have been
obtained in which bloat could be controlled by parallelizing GP [56,73]. It
was shown in [41] that the parallel evolutionary model, specifically the island model, helped
to prevent the bloat phenomenon.
A simple theoretically-motivated method for controlling bloat was introduced in [188],
which was based on the idea of dynamically and strategically creating fitness “holes” in the
fitness landscape that repel the population. These holes were created by zeroing the fitness of a
certain proportion of above-average-length offspring. This meant that among the offspring
violating the length constraint, only a fixed proportion, chosen randomly, were penalized.
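The fitness-hole idea can be sketched as follows, assuming a higher-is-better fitness and an externally supplied mean program size; the parameter names and default proportion are our illustrative choices, not values from [188]:

```python
import random

def fitness_with_holes(program_size, raw_fitness, mean_size, p_hole=0.3,
                       rng=random):
    """Zero the fitness of a random fixed proportion of above-average-length
    offspring, dynamically creating 'holes' that repel the population from
    bloated regions of the search space."""
    if program_size > mean_size and rng.random() < p_hole:
        return 0.0
    return raw_fitness
```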
Three methods for bloat control were presented in [179]: biased multi-objective parsimony
pressure (BMOPP), the Waiting Room, and Death by Size. BMOPP was a variation on the
Pareto-optimization theme which combined lexicographic ordering, Pareto dominance and
a proportional tournament. The latter two methods do not consider parsimony as a part of
the selection process, but instead apply parsimony penalties at other stages of the evolutionary
process. In the Waiting Room, newly created individuals were only permitted to enter the
population after having sat in a “waiting room”, or queue, for a period of time proportional to
their size. This gave smaller individuals a greater opportunity to spread. Death by Size
chose individuals to die and be replaced based on their size.
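Death by Size can be read as size-biased replacement. The sketch below implements it as a size tournament, which is our own minimal interpretation rather than the exact procedure of [179]:

```python
import random

def death_by_size(sizes, k=2, rng=random):
    """Pick the index of an individual to die: sample k candidates at
    random and kill the largest, so bloated programs are replaced more
    often than compact ones."""
    entrants = rng.sample(range(len(sizes)), k)
    return max(entrants, key=lambda i: sizes[i])
```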
Parsimony pressure is traditionally used to reduce the complexity of solutions. In [219]
however, a negative parsimony pressure was applied for a financial portfolio optimization
problem in GP, preferring complex solutions over simpler ones. Negative parsimony
pressure presumed that the principle of Occam’s Razor inhibits evolution [219]. Favourable
results were shown: in some instances it was better to apply negative parsimony
pressure.
Recent studies support the hypothesis that introns emerge predominantly in response to
the destructive effects of the variation operators. Although it may be argued that introns are
useful because they protect good building blocks, they can at the same time drive the entire
population into stagnation through bloat, the explosive and exponential growth of introns.
In this instance, no noticeable improvements can be observed, as the population is merely
exchanging introns during recombination. A potential future
direction within this area would be the formulation of more efficient variation operators or
methodologies that can reduce the destructive nature of existing operators.
5 GP variants and hybrids
It should be noted that this paper is focused on the canonical GP and does not delve into the
other variants and hybrids. A very brief summary and introduction to GP variants and hybrids
(entire Sect. 5) has been included to introduce the reader to these variants, in the hope
that it may offer insights or ideas to guide future improvements in canonical GP.
Variants of GP are differentiated by their different structures [87,125]. There are many
different GP structures such as tree, linear and graph structures with many other forms of
representations being investigated and continuously emerging. For example, in [112] a new
kind of GP structure called the linear-tree structure, together with its own crossover and
mutation operations, was introduced. A novel genetic parallel programming (GPP) paradigm, with a
linear genetic programming representation, was introduced in [141] for evolving parallel pro-
grams. It was observed that parallel programs were more evolvable than sequential programs.
In [32], considerable speed up in evolution was observed using the GPP paradigm, running
on a multi-arithmetic-logic-unit (Multi-ALU) processor (MAP) evolving parallel programs
and then serializing them into a sequential program.
A low-level modularization strategy, called compressed GA (cGA), was presented for
linear genetic programming, based on a substring compression/substitution scheme. The
purpose was to protect building blocks and foster genetic code reuse. There are many other
variants explored by various researchers such as the gene expression programming (GEP),
gene estimated gene expression programming (GEGEP) an extension to GEP [49], Multi
niche parallel GP [74], directed acyclic graphs (DAGS) [82], parallel automatic induction of
machine code with genetic programming (parallel AIM-GP) [174], genetic network program-
ming (GNP) [89], grammar model-based program evolution (GMPE) [102] and many others
[3,11,101,144,203]. There are also many various hybrids that have been researched, such
as genetic programming neural network (GPNN) [193,194], Ant Colony Programming [22],
traceless genetic programming (TGP) [175]. Many researchers have looked into improving
the newly proposed hybrid. For example, in [23] the problem of eliminating introns in ant
colony programming (paradigm based on genetic programming and ant colony system) was
investigated.
5.1 Grammar-guided genetic programming
Grammar-guided genetic programming (GGGP) is an extension to standard GP with the
aim to address the closure problem [130,233,235]. GGGP employs a context-free grammar
(CFG), establishing a formal definition of the syntactical restrictions. The grammar allows
the declarative introduction of language bias, which can reduce the search
space. Individuals are derivation trees that represent solutions belonging to the language
defined by the context-free grammar [230]. GGGP always generates valid individuals (points
or possible solutions) that belong to the search space. In [232], the influence of program
grammars on the efficiency of GP was described. In [249], a new CFG-based representation
was used to separate the search space from the solution space through a genotype-to-phenotype
mapping; this technique was applied to a symbolic regression problem, showing
improvement over a basic GP without a grammar. In [259], extensions to the operators were
made to improve grammar-based evolutionary algorithms.
Many other variants of GGGP have also been investigated. For example, in [90,91] a new
grammar-guided genetic programming system called tree-adjoining grammar guided genetic
programming (TAG3P+) was proposed. It is argued in [165] that standard GP is unable to
search for all tree shapes, namely solutions that require very full or narrow trees. A different
tree-based representation was used by [165] together with new local structural modification
operators (point insertion and deletion) to eliminate this problem. The new representation
was based on tree adjoining grammars (TAGs), which were first proposed in [109], to remove
the fixed-arity limitation of standard GP.
5.1.1 Initialization
A new initialization method was introduced for GGGP in [33]. The Random Branch
tree-generation algorithm [33] guaranteed the generation of trees of a requested size. However,
this algorithm could not produce a well-distributed set of trees, thus having a negative
impact on convergence speed [81]. The Uniform Tree Generation algorithm [21] guar-
anteed the uniform creation of trees of requested tree size but was known to be too complex
[75]. The Grow algorithm was modified in PTC1 and PTC2 to ensure that the trees were
generated around an expected size [145].
A new tree-generation algorithm for GGGP, the grammar-based initialization method (GBIM),
was proposed [75]. A parameter was included to control the maximum size of the trees to
be generated, so that the generated initial populations were well distributed in terms of tree
size. It was shown that the proposed method had a higher convergence speed when compared
with Ramped Half-and-Half, Basic, Random Branch, Uniform and PTC2 tree generation
algorithms for an arithmetical equalities problem and the real-world task of breast cancer
prognosis.
5.1.2 Variation operators
The strong context preserving crossover (SCPC) operator was proposed in [38] to preserve
the context in which the sub-trees occur in the parent trees and to control code bloat. Nodes
with matching coordinates can be selected as crossover points in SCPC, in other words,
restricting crossover to nodes that reside in similar contexts within the individual. In this
case, emphasis was placed on the genotype of the individual (locations of the nodes), not the
phenotype.
Crossover in GP has been blind [87], in contrast to biological crossover, where chromosomes
are matched and aligned with a homologous partner in a process referred to
as meiosis. The Fair crossover operator [35], a modified version of the operator proposed
by Langdon [133], was designed to prevent code bloat. Two original genetic operators,
crossover and mutation, were proposed in [37] for the grammar-guided genetic programming
(GGGP) paradigm. The grammar-based crossover operator (GBC) improved the GGGP per-
formance, by providing a good balance between search space exploration and exploitation.
The grammar-based mutation (GBM) operator generated individuals that matched the syntac-
tical constraints of the CFG that defined the programs. The proposed operators were tested in
two experiments demonstrating a higher convergence speed and a lesser likelihood of being
trapped in local optima.
A new grammar-based crossover (GBX) operator was introduced in [154] for the gram-
mar-guided genetic programming system to prevent code bloat. Moreover, GBX provided
trade-off between exploration and exploitation of the search space. Grammatical evolution
(GE) is an extension of GP that evolves complete programs by using a grammar in
Backus-Naur form (BNF) notation. An important aspect of GE, which distinguishes it
from other GGGP approaches, is its representation of the individual as a linear string, which
is decoded to produce a derivation tree. Various crossover operators were proposed
in [83,84], and in [85] a meta-grammar was introduced into GE, allowing the grammar to
dynamically define functions.
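The linear-string decoding that distinguishes GE can be sketched with the standard codon-mod-rule scheme; this is a minimal reading in which the codon-wrapping used in full GE is omitted, and the toy grammar in the example is ours:

```python
def ge_map(codons, grammar, start):
    """GE-style genotype-to-phenotype mapping: repeatedly expand the
    leftmost non-terminal, choosing the production (codon mod #choices)."""
    seq, i = [start], 0
    while i < len(codons):
        nt = next((j for j, s in enumerate(seq) if s in grammar), None)
        if nt is None:
            break  # fully terminal: a complete program
        choices = grammar[seq[nt]]
        seq[nt:nt + 1] = choices[codons[i] % len(choices)]
        i += 1
    return "".join(seq)
```

With the BNF-like grammar `{"E": [["x"], ["(", "E", "+", "E", ")"]]}`, the codon string `[1, 0, 0]` decodes to `(x+x)`.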
5.2 Parallel genetic programming
The measurement and computation of population fitness consumes a large amount of com-
putational effort and is generally considered time-consuming. Many researchers have looked
into distributing the computational effort needed to calculate fitness, hence parallelizing
GP, which is frequently a focus of the parallel computing community. In addition, many
researchers have investigated the spatial distribution of GP models. The two fields of parallel
GP and spatially-distributed GP models had different goals. The main goal for parallel GP
was often to speed up computation of the fitness evaluations, by having either each individual
or some subpopulation evaluated on a separate processor. In this instance, the population as a
whole was often treated as a panmictic population. The spatially-distributed GP constructed
some form of spatial structure with the intention of maintaining diversity, which could be
executed on the same processor or multiple processors.
Two basic approaches to parallelization were discussed in [125]. In Distributed GP the
population is divided into sub-populations (island model), each assigned to a processor. The
Distributed GP can be implemented on a network of workstations or a parallel computer
where the GP operates on each sub-population separately. A specified percentage of individ-
uals within each sub-population are selected for migration after a certain designated number
of generations. There are many variations of distributed models such as demes, islands and
niching methods. For example, in the island method a population P of M individuals is
divided into N subpopulations (called demes) D1, ..., DN of M/N individuals each. A standard
GP works on each deme, the subpopulations are interconnected according to various
communication topologies, and information is periodically exchanged by migrating individuals
from one subpopulation to another. As a result, various new parameters, such as the number
of subpopulations N, the number of individuals to be migrated (migration rate), the number
of generations after which migration should occur (frequency) and the migration topology
are needed in this methodology. In the second approach, there are no sub-populations or
migrations; steps are executed locally and asynchronously on a distributed basis, with the
independent algorithm tasks distributed to separate processors.
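One migration event in the island model can be sketched as follows, assuming a ring topology, list-based demes and a higher-is-better fitness; the function name, ring choice and replace-worst policy are our illustrative assumptions:

```python
def migrate_ring(demes, migration_rate, fitness):
    """Each deme sends copies of its best individuals to the next deme on
    the ring; immigrants replace the receiving deme's worst individuals."""
    n = max(1, int(len(demes[0]) * migration_rate))
    # collect emigrants from all demes before any deme is modified
    emigrants = [sorted(d, key=fitness, reverse=True)[:n] for d in demes]
    for i, deme in enumerate(demes):
        deme.sort(key=fitness)       # worst individuals first
        deme[:n] = emigrants[i - 1]  # ring: deme i receives from deme i-1
    return demes
```

The migration rate, frequency and topology mentioned above are exactly the extra parameters such a scheme introduces.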
The behaviour of distributed GP with respect to sequential GP was first analysed in
[191]. Three levels of parallelization for the determination of fitness were described in [125],
namely, at fitness cases, at fitness evaluation for different individuals and at independent
runs. In [4], each processor was responsible for the fitness evaluation and breeding of a
subpopulation, increasing the efficiency of GP. It was hypothesized in [225] that distributed
GP outperforms panmictic GP because it maintains diversity.
In [129], it was argued that increases in computational power can be realized by parallelizing
the application. The parallel implementation of GP on a network of processing nodes
was described in [5] that achieved super-linear performance. A divide and conquer strategy
was introduced to increase the probability of success in GP [63], where the search space was
partitioned into smaller regions that were explored independently of each other.
In [190], multi-populations were examined and in [54,59] various control parameters
for multi-population models were systematically studied. Layered genetic programming
(LAGEP) was proposed in [142], which was based on multi-population genetic program-
ming (MGP). This method employed layer architecture to arrange multiple populations. A
layer contains a number of populations. In addition, an adaptive mutation rate tuning method
was proposed to increase the mutation rate. LAGEP achieved comparable results to single
population GP in much less time. In [55], GP was used with several isolated subpopulations,
where individuals were not allowed to communicate across populations.
This methodology was referred to as isolated multipopulation genetic programming (IMGP).
It was shown that although IMGP was not always helpful in obtaining better results, in some
instances better results were obtained than in the classic method.
A fine-grained parallel implementation of GP through a cellular model on distributed-memory
parallel computers, with good performance, was presented in [65]. In the fine-grained
(grid or cellular) model, each individual is associated with a spatial location on a
low-dimensional grid, interacting only with its direct neighbours. Different neighbourhoods can be
defined for the cells. Some examples of the two-dimensional (2-D) neighbourhoods are the
4-neighbour (von Neumann neighbourhood) and 8-neighbour (Moore neighbourhood).
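The two neighbourhoods can be sketched on a toroidal grid (the wrap-around at the edges is a common convention, assumed here):

```python
def neighbours(x, y, width, height, moore=False):
    """Coordinates of a cell's neighbours on a toroidal 2-D grid:
    the 4-neighbour von Neumann neighbourhood by default, or the
    8-neighbour Moore neighbourhood when moore=True."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]       # N, S, W, E
    if moore:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # add diagonals
    return [((x + dx) % width, (y + dy) % height) for dx, dy in offsets]
```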
Various researchers have also attempted to make improvements to the canonical parallel
evolutionary algorithms, e.g. in [80] the speciating island model (SIM) was explored.
In [54,55], the aspect of population size was investigated for the multi-population Parallel
Genetic Programming. It was discovered that an optimal range of values exists to speed up
the search for solutions. In [60], the plague operator was used to enhance the performance of
parallel GP (based on the island model). Individuals were removed every generation, altering
the population size. This compensated for the increase in size of individuals and hence saved
computational effort. Changing population size was also investigated for distributed GP in
[196] to reduce bloat. It was shown that by keeping population sizes as small as possible, the
amount of resources needed was decreased. There have been many other approaches to
parallelizing GP [48,110,169,177,204,222]. An extensive survey on the subject can be found
in [224].
5.3 Graph genetic programming
In a graph genetic programming (GGP) system, the GP operates on graphs. In [166], the
notion of graph isomorphism was discussed and it was empirically shown how using a
canonical graph indexed database (fitness database) can improve performance by reducing
the number of fitness evaluations and thus saving considerable evaluation time.
5.4 Cartesian genetic programming
A new form of GP called Cartesian genetic programming (CGP) was introduced in [159]
in which programs were represented as indexed graphs (rather than as parse trees), encoded
in the form of a linear string of integers. In [237], the CGP programs were represented as
directed acyclic graphs (DAGs), enabling outputs from previous computations to be reused.
An implicit context representation for CGP was described in [28] showing the beneficial
effects of recombination to outperform the conventional Cartesian GP. The computational
efficiency of graph-based Cartesian genetic programming was described in [160].
Cartesian genetic programming was extended by utilizing automatic module acquisition
in [227].
5.5 Page-based genetic programming
Page-based GP [170] is a linearly structured GP (L-GP), where individuals take the form
of a “linear” list of instructions. A page-based linear GP was proposed in [86,87], where
individuals were described in terms of a number of pages. It was shown that page-based
linear GP evolves solutions better than block-based linear GP [173].
5.6 Other representations
The performance of GP was improved by using a data structure coded by binary decision
diagrams (BDDs), reducing storage requirements and accelerating the fitness calculation [245].
BDDs are a compact representation of Boolean functions using directed acyclic graphs. The
entire population was stored as a shared BDD and all genetic operations and fitness calcu-
lations were performed on the BDD. This technique is suitable for problems where only
Boolean variables and functions are involved. BDD-based GP is not practical for problems
where real variables are used, such as symbolic regression. Nevertheless, it can also be
used for integer-based programs by encoding the integers as binary vectors. New crossover,
mutation and evaluation algorithms were developed for BDD [245].
Linear GP, which uses a linear program representation and resembles a conventional GA
except that the chromosome length is allowed to evolve, is also generally used as
an alternative representation to tree-based GP. Some of the works that have used linear
GP are [24,25,64] and [26], to name a few.
A technique to reduce the time and space requirements of GP was proposed in [82]. The
population of parse trees was stored as a directed acyclic graph (DAG). However, it was stated
that this technique cannot be applied to all problems due to restricted program encoding and
bounded fitness cases, such as the Artificial Ant and Cart Centering problems. The number
of nodes stored and evaluated was reduced by a significant factor, resulting in lower space
requirements for storing a population of computer programs. In addition, time savings were
also observed as a result of caching. In the standard sub-tree crossover it is difficult to make
changes near the root, occasionally causing runs to become trapped in local maxima. Based
on these structural limitations a different tree representation, AppGP, was proposed [68]. The
representation of trees and the tree manipulation algorithms were modified in AppGP. All
non-terminal nodes were represented as application (APP) nodes and the AppGP represen-
tation had more nodes than the standard GP representation, providing more potential points
for the application of recombination operators. It was shown that on all of the test problems,
AppGP did no worse than standard GP, and in several instances it outperformed standard GP.
5.7 Memetic algorithms: hybrids
Researchers have attempted to capitalize upon the strategies of other methods by incorpo-
rating them into an enhanced version of GP that outperforms the canonical GP. The GA-P
[93], which is a genetic algorithm and genetic programming hybrid, performed symbolic
regression by combining the conventional GA function optimization strength with the GP
paradigm to evolve complex mathematical expressions. The GA-P was extended in [205].
A genetic algorithm (GA) was embedded into a genetic programming (GP), where each
paradigm operated at different levels within the problem domain [29].
The genetic programming paradigm was hybridized with statistical analysis in [1] to
derive systems of differential equations. A framework for combining GP and inductive logic
programming (ILP) was proposed in [234], which is another form of GGGP. A memetic
algorithm was proposed in [27] evolving heterogeneous populations, in which a GA was
used to optimize the numeric terminals of programs evolved using GP. The GP algorithm
was hybridized with hill climbing and the nature of new crossover algorithms, crossover hill
climbing (XOHC) and crossover with simulated annealing (XOSA), was investigated [176].
It was shown that the hybrids offer added search power and hybridizing GP with hill climbing
yields better results than the standard GP.
6 Conclusions
An overview has been provided of the research work done to improve the performance of
Canonical Genetic Programming based on parse trees. The various techniques of existing
approaches to improve performance were briefly discussed, with an attempt to classify these
proposed methods into various categories. The improvements made to GP were categorized
according to whether they pertained to the various components of GP, contained
innovative ideas enhancing GP’s performance, or resolved known issues highlighted
within GP. Each of these categories was then further subdivided
as appropriate.
A variety of modifications to the crossover operator have been introduced to reduce the
destructive nature of crossover. There have been diverse investigations to discover
methodologies for controlling and setting GP parameters to enhance effectiveness. The issue of bloat
and code growth has been given a lot of attention. Many researchers have proposed different
representations to canonical GP and thereby have departed from the traditional tree based
GP, inventing their own GP variants. In addition, hybrids involving other methods such as
GA and Neural Networks have been successfully implemented.
Although there exists a multitude of proposed methods for improving various aspects of
GP, it should be noted that no single “best” method can exist to improve the performance of
GP. Some promising areas of future research, according to the authors’ opinions, are outlined
as follows. Further investigations into the issue of diversity may still be worth pursuing in
conjunction with performing some new systematic operation to remedy the lack of diversity.
Examination of different measures for stagnation and premature convergence could be most
promising, together with new methodologies to either bring the population out of
stagnation or abruptly terminate a run and commence a new one with the experience and
knowledge gained from the previous run, i.e. incorporating memory. A complete theoretical
analysis of evaluating and measuring difficulty in GP could be pivotal. An understanding of
how to highlight and flag sub-trees that are the main reason for the success of an individual is
crucial, and new operators could be used to expose these highly competitive genes by keeping
an updated genes library and incorporating them into the new individuals. Little research has
been conducted in the field of implicit adaptation or self-adaptive control, i.e. modifications
and parameter control are encoded as a genome and are evolved implicitly with the individ-
ual. The authors believe that further breakthrough successes may be achievable by exploring
self-adaptive control. It should, however be noted that such research has been conducted
in GA. This suggests that possible future advances in GP may stem from examining other
similar fields. In addition, using the insights from other similar fields such as theoretical
population genetics and evolutionary biology may help extend GP approaches.
References
1. Altenberg L (1994) The evolution of evolvability in genetic programming. In: Kinnear KE Jr (ed)
Advances in genetic programming, Chap 3. MIT Press, Cambridge, pp 47–74
2. Altenberg L (1995) Genome growth and the evolution of the genotype-phenotype map. In: Banzhaf W,
Eckman FH (eds) Evolution and biocomputation: computational models of evolution. Springer, Berlin,
pp 205–259
3. Ando S, Sakamoto E, Iba H (2002) Modelling genetic network by hybrid GP. In: Proceedings of the
2002 congress on evolutionary computation, CEC ’02, vol 1, pp 291–296
4. Andre D, Koza JR (1996) Parallel genetic programming: a scalable implementation using the transputer
network architecture. In: Angeline PJ, Kinnear KE Jr (eds) Advances in genetic programming, vol 2,
Chap 16. The MIT Press, Cambridge
5. Andre D, Koza JR (1998) A parallel implementation of genetic programming that achieves super-linear
performance. Inf Sci 106(3–4):201–218
6. Angeline PJ, Pollack JB (1992) The evolutionary induction of subroutines. In: The proceedings of the
fourteenth annual conference of the cognitive science society, pp 236–241
7. Angeline PJ, Pollack JB (1993) Competitive environments evolve better solutions for complex tasks. In:
Proceedings of the fifth international conference on genetic algorithms (ICGA), pp 264–270
8. Angeline PJ, Pollack JB (1994) Coevolving high-level representations. In: Artificial life III, Santa Fe
Institute Studies in the Sciences of Complexity, Proceedings, vol XVI, pp 55–72
9. Angeline PJ (1994) Genetic programming: a current snapshot. In: Proceedings of the third annual con-
ference on evolutionary programming. World Scientific, Singapore, pp 224–232
10. Angeline PJ (1996) Two self-adaptive crossover operators for genetic programming. In: Angeline PJ,
Kinnear KE (eds) Advances in genetic programming, vol 2. MIT Press, Cambridge, pp 89–109
11. Angeline P (1998) Multiple interacting programs: a representation for evolving complex behaviors.
Cybern Syst 29:779–806
12. Antolík J, Hsu WH (2005) Evolutionary tree genetic programming. In: GECCO ’05: proceedings of the
2005 conference on Genetic and evolutionary computation, pp 1789–1790
13. Araujo L (2006) Multiobjective genetic programming for natural language parsing and tagging. In:
Proceedings of parallel problem solving from nature—PPSN IX. Lecture notes in computer science,
pp 433–442
14. Ashlock W, Ashlock D (2005) Single parent genetic programming. In: The 2005 IEEE congress on
evolutionary computation, vol 2, pp 1172–1179
15. Badran KMS, Rockett PI (2007) The roles of diversity preservation and mutation in preventing popu-
lation collapse in multiobjective genetic programming. In: GECCO ’07: proceedings of the 9th annual
conference on Genetic and evolutionary computation, pp 1551–1557
16. Banzhaf W, Nordin P, Keller RE, Francone FD (1998) Genetic programming: an introduction on the
automatic evolution of computer programs and its applications. Morgan Kaufmann Publishers/Dpunkt
Verlag, Menlo Park/Heidelberg
17. Besetti S, Soule T (2005) Function choice, resiliency and growth in genetic programming. In: GECCO
’05: proceedings of the 2005 conference on Genetic and evolutionary computation, pp 1771–1772
18. Blickle T, Thiele L (1994) Genetic programming and redundancy. In: Hopf J (ed) Genetic algorithms
within the framework of evolutionary computation (Workshop at KI-94), Saarbruicken, pp 33–38
19. Blickle T (1996) Evolving compact solutions in genetic programming: A case study. In: Voigt H-M,
Ebeling W, Rechenberg I, Schwefel H-P (eds) PPSN IV. Springer, Heidelberg, pp 564–573
20. Bleuler S, Brack M, Thiele L, Zitzler E (2001) Multiobjective genetic programming: reducing bloat
using SPEA2. In: Proceedings of the 2001 congress on evolutionary computation, vol 1, pp 536–543
21. Böhm W, Geyer-Schulz A (1996) Exact uniform initialization for genetic programming. In: Belew RK,
Vose MD (eds) Foundations of genetic algorithms IV. Morgan Kaufmann, Menlo Park, pp 379–407
22. Boryczka M, Czech ZJ, Wieczorek W (2003) Ant colony programming for approximation problems. In:
Proceedings of genetic and evolutionary computation GECCO 2003, PT 1. Lecture notes in computer
science, pp 142–143
23. Boryczka M (2005) Eliminating introns in ant colony programming. Fundam Informaticae 68(1–2):1–19
24. Brameier M, Banzhaf W (2001a) A comparison of linear genetic programming and neural networks in
medical data mining. IEEE Trans Evolut Comput 5(1):17–26
25. Brameier M, Banzhaf W (2001b) Evolving teams of predictors with linear genetic programming. Genetic
Program Evolvable Mach 2(4):381–407
26. Brameier M, Banzhaf W (2007) Linear genetic programming. Springer, Heidelberg
27. Cagnoni S, Rivero D, Vanneschi L (2005) A purely evolutionary memetic algorithm as a first step towards
symbiotic coevolution. In: The 2005 IEEE congress on evolutionary computation, vol 2, pp 1156–1163
28. Cai XY, Smith SL, Tyrrell AM (2006) Positional independence and recombination in Cartesian genetic
programming. In: Proceedings of genetic programming. Lecture notes in computer science, pp 351–360
29. Cao HQ, Kang LI, Guo T, Chen YP, de Garis H (2000) A two-level hybrid evolutionary algorithm
for modelling one-dimensional dynamic systems by higher-order ODE models. IEEE Trans Syst Man
Cybern B 30(2):351–357
30. Carbajal SG, Martinez FG (2001) Evolutive introns: a non-costly method of using introns in GP. Genetic
Program Evolvable Mach 2:111–122
31. Coello Coello CA (1999) A comprehensive survey of evolutionary-based multiobjective optimization
techniques. Knowl Inf Syst 1(3):129–156
32. Cheang SM, Leung KS, Lee KH (2006) Genetic parallel programming: design and implementation.
Evolut Comput 14(2):129–156
33. Chellapilla K (1997) Evolving computer programs without subtree crossover. IEEE Trans Evolut Com-
put 1(3):209–216
34. Christensen S, Oppacher F (2007) Solving the artificial ant on the Santa Fe trail problem in 20,696 fit-
ness evaluations. In: GECCO ’07: proceedings of the 9th annual conference on Genetic and evolutionary
computation, pp 1574–1579
35. Crawford-Marks R, Spector L (2002) Size control via size fair genetic operators in the push GP genetic
programming system. In: Proceedings of the genetic and evolutionary computation conference, New
York, pp 733–739
36. Collard P, Segapeli JL (1994) Using a double-based genetic algorithm on a population of computer pro-
grams. In: Proceedings of 6th international conference of tools with artificial intelligence, pp 418–424
37. Couchet J, Manrique D, Rios J, Rodriguez-Paton A (2007) Crossover and mutation operators for gram-
mar-guided genetic programming. Soft Computing 11(10):943–955
38. D’haeseleer P (1994) Context preserving crossover in genetic programming. In: Proceedings of
the 1994 IEEE World Congress on computational intelligence, Orlando, vol 1, pp 379–407
39. deGaris H (1990) Genetic programming: building artificial nervous systems using genetically pro-
grammed neural network modules. In: Porter BW et al (eds) Proceedings of the seventh international
conference on machine learning (ICML-90), pp 132–139
40. De Jong E, Pollack J (2003) Multi-objective methods for tree size control. Genet Program Evolv Mach
4:211–233
41. de Vega FF, Gil GG, Pulido JAG, Guisado JL (2004a) Control of bloat in genetic programming by means
of the island model. In: Parallel problem solving from nature—PPSN VIII. Lecture notes in computer
science, pp 263–271
42. de Vega FF, Cantu-Paz E, Lopez JI, Manzano T (2004b) Saving resources with plagues in genetic
algorithms. In: Parallel problem solving from nature—PPSN VIII. Lecture notes in computer science,
pp 272–281
43. Daida JM, Hilss AM (2003) Identifying structural mechanisms in standard genetic programming. In:
Cantú-Paz E et al (eds) GECCO 2003. Springer, Heidelberg, pp 1639–1651
44. Daida JM (2006) Characterizing the dynamics of symmetry breaking in genetic programming. In:
GECCO ’06: proceedings of the 8th annual conference on Genetic and evolutionary computation,
pp 799–806
45. Deb K, Agrawal S, Pratab A, Meyarivan T (2001) A fast elitist non-dominated sorting genetic algorithm
for multi-objective optimization: NSGA-II, Kan-GAL report 200001. Indian Institute of Technology,
Kanpur, India
46. De Falco I, Della Cioppa A, Iazzetta A et al (2005) An evolutionary approach for automatically extract-
ing intelligible classification rules. Knowl Inf Syst 7(2):179–201
47. Dignum S, Poli R (2007) Generalisation of the limiting distribution of program sizes in tree-based
genetic programming and analysis of its effects on bloat. In: GECCO ’07: proceedings of the 9th annual
conference on Genetic and evolutionary computation, pp 1588–1595
48. Dracopoulos DC, Kent S (1996) Speeding up genetic programming: a parallel BSP implementation. In:
Koza JR, Goldberg DE, Fogel DB, Riolo RL (eds) Proceedings of the first annual conference on genetic
programming 1996, July 28–31. MIT Press, Cambridge, pp 125–136
49. Du X, Li YQ, Xie DT, Kang LS (2006) A new algorithm of automatic programming: GEGEP. In:
Proceedings of simulated evolution and learning. Lecture notes in computer science, pp 292–301
50. Eggermont J, Kok JN, Kosters WA (2004) Detecting and pruning introns for faster decision tree
evolution. In: Parallel problem solving from nature—PPSN VIII. Lecture notes in computer science,
pp 1071–1080
51. Eiben AE, Smith JE (2003) Introduction to evolutionary computing, 1st edn. Springer, Natural
Computing Series, pp 129–151
52. Eskridge BE, Hougen DF (2004a) Memetic crossover for genetic programming: evolution through imi-
tation. In: Proceedings of genetic and evolutionary computation GECCO 2004, pt 2. Lecture notes in
computer science, pp 459–470
53. Eskridge BE, Hougen DF (2004b) Imitating success: a memetic crossover operator for genetic program-
ming. In: Congress on evolutionary computation, CEC2004, vol 1, pp 809–815
54. Fernandez F, Tomassini M, Punch WF, Sanchez JM (2000a) Experimental study of multipopulation par-
allel genetic programming. In: Proceedings of genetic programming. Lecture notes in computer science,
pp 283–293
55. Fernandez F, Tomassini M, Sanchez JM (2004b) Experimental study of isolated multipopulation genetic
programming. In: 26th Annual conference of the IEEE industrial electronics society, vol 4, IECON 2000,
pp 2672–2677
56. Fernandez F, Galeano G, Gomez JA, Sanchez JM (2002) Efficient use of computational resources in
genetic programming: controlling the bloat phenomenon by means of the island model. In: IECON 02,
Industrial Electronics Society. IEEE 2002 28th Annual Conference, vol 3, pp 2520–2524
57. Fernandez F, Vanneschi L, Tomassini M (2003a) The effect of plagues in genetic programming: a study of
variable-size populations. In: Proceedings of genetic programming. Lecture notes in computer science,
pp 317–326
58. Fernandez F, Tomassini M, Vanneschi L (2003b) Saving computational effort in genetic programming
by means of plagues. In: Congress on evolutionary computation (CEC’2003). IEEE Press, New York,
pp 2042–2049
59. Fernandez F, Tomassini M, Vanneschi L (2003c) An empirical study of multipopulation genetic pro-
gramming. Genetic Program Evolvable Mach 4(1):21–51
60. Fernandez F, Martin A (2004a) Saving effort in parallel GP by means of plagues. In: Proceedings of
genetic programming. Lecture notes in computer science, pp 269–278
61. Fernandez T (2004) Virtual ramping of genetic programming populations. In: Proceedings of genetic
and evolutionary computation GECCO 2004, PT 2. Lecture notes in computer science, pp 471–482
62. Fernandez F, Tomassini M, Vanneschi L, Cuendet J (2004a) A new technique for dynamic size
populations in genetic programming. In: Congress on evolutionary computation (CEC’2004). IEEE, New
York, pp 486–493
63. Fillon C, Bartoli A (2006) A divide & conquer strategy for improving efficiency and probability of
success in genetic programming. In: Proceedings of genetic programming. Lecture notes in computer
science, pp 13–23
64. Fogelberg C, Zhang M (2005) Linear genetic programming for multi-class object classification. In:
Zhang S, Jarvis R (eds) Proceedings of AI 2005: advances in artificial intelligence, 18th Australian Joint
conference on artificial intelligence, vol 3809, pp 369–379
65. Folino G, Pizzuti C, Spezzano G (2003a) A scalable cellular implementation of parallel genetic pro-
gramming. IEEE Trans Evolut Comput 7(1):37–53
66. Folino G, Pizzuti C, Spezzano G, Vanneschi L, Tomassini M (2003b) Diversity analysis in cellular and
multipopulation genetic programming. In: The 2003 congress on evolutionary computation, CEC ’03,
vol 1, pp 305–311
67. Folino G, Spezzano G (2006) P-CAGE: an environment for evolutionary computation in peer-to-peer
systems. In: Proceedings of genetic programming. Lecture notes in computer science, pp 341–350
68. Freitag MN, Hopper NJ (1999) AppGP: an alternative structural representation for GP. In: Proceedings
of the 1999 congress on evolutionary computation, CEC 99, vol 2, pp 1377–1383
69. Fry R, Tyrrell A (2003) Enhancing the performance of GP using an ancestry-based mate selection
scheme. In: Proceedings of genetic and evolutionary computation GECCO 2003, pt 2. Lecture notes in
computer science, pp 1804–1805
70. Fuchs M (1999) Large Populations are not always the best choice in genetic programming. In: Proceed-
ings of the genetic and evolutionary computation conference GECCO, pp 1033–1038
71. Fukunaga AS, Kahng AB (1995) Improving the performance of evolutionary optimization by
dynamically scaling the evaluation function. Evolut Comput 1:182–187
72. Gagne C, Schoenauer M, Parizeau M, Tomassini M (2006) Genetic programming, validation sets,
and parsimony pressure. In: Proceedings of genetic programming. Lecture notes in computer science,
pp 109–120
73. Galeano G, Fernandez F, Tomassini M, Vanneschi L (2002) Studying the influence of synchronous and
asynchronous parallel GP on programs length evolution. In: Proceedings of the congress on evolutionary
computation, CEC ’02, vol 2, pp 1727–1732
74. Garcia S, Levine J, Gonzalez F (2003) Multi niche parallel GP with a junk-code migration model. In:
Proceedings of genetic programming. Lecture notes in computer science, pp 327–334
75. Garcia-Arnau M, Manrique D, Rios J, Rodriguez-Paton A (2007) Initialization method for grammar-
guided genetic programming. Knowl Based Syst 20(2):127–133
76. Gelly S, Teytaud O, Bredeche N, Schoenauer M (2005) A statistical learning theory approach of
bloat. In: GECCO ’05: Proceedings of the 2005 conference on Genetic and evolutionary computation,
pp 1783–1784
77. Guo H, Nandi AK (2006) Breast cancer diagnosis using genetic programming generated feature. Pattern
Recogn 39(5):980–987
78. Gusikhin O, Rychtyckyj N, Filev D (2007) Intelligent systems in the automotive industry: applications
and trends. Knowl Inf Syst 12(2):147–168
79. Gustafson SM, Hsu WH (2001) Layered learning in genetic programming for a cooperative robot
soccer problem. In: Miller JF et al (eds) Proceedings of EuroGP 2001. Lecture notes in computer science,
vol 2038. Springer, Heidelberg, pp 291–301
80. Gustafson S, Burke EK (2006) The speciating island model: an alternative parallel evolutionary algo-
rithm. J Parallel Distrib Comput 66(8):1025–1036
81. Hao HT, Hoai NX, McKay RB (2004) Does it matter where to start in grammar guided genetic
programming? In: Proceedings of the second Pacific Asian Workshop in Genetic Programming, Cairns,
Australia
82. Handley S (1994) On the use of a directed acyclic graph to represent a population of computer programs.
In: Proceedings of the IEEE conference on evolutionary computation, pp 154–159
83. Harper R, Blair A (2005) A structure preserving crossover in grammatical evolution. In: IEEE congress
on evolutionary computation, pp 2537–2544
84. Harper R, Blair A (2006a) A self-selecting crossover operator. In: IEEE Congress on evolutionary
computation, CEC 2006, pp 1420–1427
85. Harper R, Blair A (2006b) Dynamically defined functions in grammatical evolution. In: IEEE congress
on evolutionary computation, CEC 2006, pp 2638–2645
86. Heywood MJ, Zincir-Heywood AN (2000) Page-based linear genetic programming. In: IEEE interna-
tional conference on systems, man, and cybernetics, pp 3823–3828
87. Heywood MI, Zincir-Heywood AN (2002) Dynamic page based crossover in linear genetic program-
ming. IEEE Trans Syst Man Cybern B Cybern 32(3):380–388
88. Hinchliffe M, Hiden H, McKay B, Willis M, Tham M, Barton G (1996) Modelling chemical process
systems using a multi-gene genetic programming algorithm. In: Koza (ed) Late Breaking papers at the
genetic programming 1996 conference, pp 56–65
89. Hirasawa K, Okubo M, Katagiri H, Hu J, Murata J (2001) Comparison between genetic network
programming (GNP) and genetic programming (GP). In: Proceedings of the 2001 congress on evolutionary
computation, vol 2, pp 1276–1282
90. Hoai NX, McKay RI (2001) A framework for tree adjunct grammar guided genetic programming. In:
Proceedings of the post-graduate ADFA conference on computer science (PACCS 01), pp 93–99
91. Hoai NX, McKay RI, Abbass HA (2003) Tree adjoining grammars, language bias, and genetic program-
ming. In: Proceedings of genetic programming. Lecture notes in computer science, pp 335–344
92. Hondo N, Iba H, Kakazu Y (1996) Sharing and refinement for reusable subroutines of genetic program-
ming. In: Proceedings of IEEE international conference on evolutionary computation, pp 565–570
93. Howard LM, D’Angelo DJ (1995) The GA-P: a genetic algorithm and genetic programming hybrid.
Can J Fish Aquat Sci 10(3):11–15
94. Hsu WH, Gustafson SM (2002) Genetic programming and multi-agent layered learning by
reinforcements. In: Proceedings of GECCO 2002, pp 764–771
95. Hsu WH, Harmon SJ, Rodriguez E, Zhong C (2004) Empirical comparison of incremental reuse strate-
gies in genetic programming for keep-away soccer. In: GECCO 2004, late-breaking papers
96. Iba H, deGaris H, Sato T (1994) Genetic programming using a minimum description length principle.
In: Kinnear KE Jr (ed) Advances in genetic programming. MIT Press, Cambridge
97. Iba H, deGaris H, Sato T (1995a) Temporal data processing using genetic programming. In: Proceedings
of 6th international conference on genetic algorithms, pp 279–286
98. Iba H, Sato T, deGaris H (1995b) Recombination guidance for numerical genetic programming. In:
IEEE international conference on evolutionary computation, vol 1, pp 97–102
99. Iba H, de Garis H (1996) Extending genetic programming with recombinative guidance. In: Angeline
PJ, Kinnear KE Jr (eds) Advances in genetic programming, vol 2, Chap 4. MIT Press, Cambridge, pp
69–88
100. Iba H (1996) Random tree generation for genetic programming. In: Proceedings of the 4th international
conference on parallel problem solving from nature. Lecture notes in computer science, vol 1141,
pp 144–153
101. Imae J, Kikuchi Y, Ohtsuki N, Kobayashi T, Zhai G (2004) Design of nonlinear control systems
by means of differential genetic programming. In: 43rd IEEE conference on decision and control, CDC,
vol 3, pp 2734–2739
102. Ishida CY, Pozo A (2003) Grammatically based genetic programming for mining relational databases.
In: Proceedings of the 23rd international conference of the Chilean Computer Science Society, SCCC 2003,
pp 86–95
103. Ito T, Iba H, Sato S (1998a) Depth-dependent crossover for genetic programming. In: Evolutionary
computation proceedings, the 1998 IEEE World Congress on computational intelligence,
pp 775–780
104. Ito T, Iba H, Sato S (1998b) Non-destructive depth-dependent crossover for genetic programming.
In: Proceedings of the first european workshop on genetic programming. LNCS, vol 1391. Springer,
Heidelberg, pp 71–82
105. Jackson D (2005) Fitness evaluation avoidance in Boolean GP problems. In: The 2005 IEEE congress
on evolutionary computation, vol 3, pp 2530–2536
106. Jackson D (2005) Dormant program nodes and the efficiency of genetic programming. In: GECCO ’05:
proceedings of the 2005 conference on genetic and evolutionary computation, pp 1745–1751
107. Jackson D, Gibbons AP (2007a) Layered learning in Boolean GP problems. In: Proceedings of EuroGP
2007. Lecture notes in computer science, vol 4445. Springer, Heidelberg, pp 148–159
108. Jackson D (2007b) Hierarchical genetic programming based on test input subsets. In: GECCO ’07:
proceedings of the 9th annual conference on genetic and evolutionary computation, pp 1612–1619
109. Joshi AK, Levy LS, Takahashi M (1975) Tree adjunct grammars. J Comput Syst Sci 10:136–163
110. Juillé H, Pollack JB (1996) Massively parallel genetic programming. In: Angeline PJ, Kinnear KE Jr
(eds) Advances in genetic programming, vol 2, Chap 17. MIT Press, Cambridge, pp 339–358
111. Kalganova T, Miller JF (1999) Evolving more efficient digital circuits by allowing circuit layout
evolution and multiobjective fitness. In: Keymeulen AD, Lohn I (eds) Proceedings of the 1st
NASA/DoD workshop on evolvable hardware (EH’99). IEEE Computer Society Press, New York,
pp 54–63
112. Kantschik W, Banzhaf W (2001) Linear-tree GP and its comparison with other GP structures. In: Pro-
ceedings of Genetic Programming. Lecture notes in computer science, pp 302–312
113. Katagami D, Yamada S (1999) Speedup of evolutionary behavior learning with crossover depending
on the usage frequency of a node. In: IEEE International conference on systems, man, and cybernetics,
IEEE SMC ’99 Conference Proceedings, vol 5, pp 601–606
114. Keijzer M, Ryan C, O’Neill M, Cattolico M, Babovic V (2001) Ripple crossover in genetic programming.
In: Proceedings of genetic programming. Lecture notes in computer science, pp 74–86
115. Keijzer M (2004) Alternatives in subtree caching for genetic programming. In: Proceedings of
genetic programming 7th European conference, EuroGP 2004. LNCS, vol 3003. Springer, Heidelberg,
pp 328–337
116. Kennedy CJ, Giraud-Carrier C (1999) A depth controlling strategy for strongly typed evolutionary pro-
gramming. In: Banzhaf W, Daida J, Eiben E et al (eds) GECCO-1999. Morgan Kaufmann, Menlo Park,
pp 1–6
117. Kessler M, Haynes T (1999) Depth-fair crossover in genetic programming. In: SAC ’99: Proceedings
of the 1999 ACM symposium on applied computing, pp 319–323
118. Kim JJ, Zhang BT (1999) Effects of selection schemes in genetic programming for time series prediction.
In: Proceedings of the 1999 congress on evolutionary computation, CEC 99, vol 1, pp 252–258
119. Kim M, Hiroyasu T, Miki M (2004) SPEA2+: improving the performance of the strength Pareto
evolutionary algorithm 2. In: Parallel problem solving from nature—PPSN VIII, pp 742–751
120. Kinnear KE (1994) Alternatives in automatic function definition, comparison of performance in advances
in genetic programming. MIT Press, Cambridge, pp 119–141
121. Korenaga M, Hagiwara M (1998) Modified genetic programming based on elastic artificial selection
and improved minimum description length. In: IEEE international conference on systems, man, and
cybernetics, vol 3, pp 2348–2353
122. Kouchakpour P, Zaknich A, Bräunl T (2007) Population variation in genetic programming. Inf Sci
177(17):3438–3452
123. Kouchakpour P, Zaknich A, Bräunl T (2008) Dynamic population variation in genetic programming. Inf
Sci (to be printed)
124. Koza JR (1989) Hierarchical genetic algorithms operating on populations of computer programs. In:
The 11th international conference on genetic algorithms, ICGA, pp 768–774
125. Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection.
MIT, Cambridge
126. Koza JR, Keane MA, Rice JP (1993) Performance improvement of machine learning via automatic
discovery of facilitating functions as applied to a problem of symbolic system identification, neural
networks. In: IEEE international conference, vol 1, pp 191–198
127. Koza JR (1994) Genetic programming II, automatic discovery of reusable programs. MIT Press,
Cambridge
128. Koza JR (1995) Two ways of discovering the size and shape of a computer program to solve a problem.
In: ICGA, 1995. Morgan Kaufmann, Menlo Park, pp 287–294
129. Koza JR, Bennett FH, Andre D (1999) Genetic programming III: Darwinian invention and problem
solving. Morgan Kaufmann, San Francisco
130. Koza JR, Keane MA, Streeter MJ et al (2005) Genetic programming IV: routine human-competitive
machine intelligence. Kluwer, Norwell
131. Kramer MD, Zhang D (2000) GAPS: a genetic programming system. In: The 24th annual international
computer software and applications conference, COMPSAC 2000, pp 614–619
132. Lang KJ (1995) Hill climbing beats genetic search on a Boolean circuit synthesis problem of Koza’s.
In: Proceedings of the 12th international conference on machine learning, pp 340–343
133. Langdon WB, Poli R (1998) Fitness causes bloat: mutation. In: 1st European Workshop on genetic
programming. Springer, Heidelberg, pp 37–48
134. Langdon WB (1998) The evolution of size in variable length representations. In: ICEC’98. IEEE Press,
New York, pp 633–638
135. Langdon WB (1999) Size fair and homologous tree genetic programming crossovers. In: Proceedings
genetic and evolutionary computation conference, GECCO-99, Washington DC, pp 1092–1097
136. Langdon WB, Soule T, Poli R, Foster JA (1999) The evolution of size and shape. In: Spector L, Langdon
WB, O’Reilly U-M, Angeline PJ (eds) Advances in genetic programming III. MIT Press, Cambridge,
pp 163–190
137. Langdon WB (2000a) Quadratic bloat in genetic programming. In: GECCO, 2000. Morgan Kaufmann,
Cambridge, pp 451–458
138. Langdon WB (2000b) Size fair and homologous tree genetic programming crossovers. Genetic Program
Evolv Mach 1(1/2):95–119
139. Lasarczyk CWG, Dittrich P, Banzhaf W (2004) Dynamic subset selection based on a fitness case topol-
ogy. Evolut Comput 12(2):223–242
140. Lee CY, Yao X (2004) Evolutionary programming using mutations based on the Lévy probability dis-
tribution. IEEE Trans Evolut Comput 8(1):1–13
141. Leung KS, Lee KH, Cheang SM (2003) Parallel programs are more evolvable than sequential programs.
In: Proceedings of Genetic programming. Lecture notes in computer science, pp 107–118
142. Lin JY, Ke HR, Chien BC, Yang WP (2007) Designing a classifier by a layered multi-population genetic
programming approach. Pattern Recogn 40(8):2211–2225
143. Lin WY, Kuo IC (2004) A genetic selection algorithm for OLAP data cubes. Knowl Inf Syst 6(1):83–102
144. Lones MA, Tyrrell AM (2001) Enzyme genetic programming. In: Proceedings of the 2001 congress on
evolutionary computation, vol 2, pp 1183–1190
145. Luke S (2000) Two fast tree-creation algorithms for genetic programming. IEEE Trans Evolut Comput
4(3):274–283
146. Luke S, Panait L (2002a) Fighting bloat with nonparametric parsimony pressure. In: Parallel problem
solving from nature—PPSN VII. Lecture notes in computer science, vol 2439. Springer, Heidelberg, p 411
147. Luke S, Panait L (2002b) Lexicographic parsimony pressure. In: GECCO 2002, Proceedings of the
Genetic and evolutionary computation conference. Springer, Heidelberg, pp 829–836
148. Luke S (2003) Modification point depth and genome growth in genetic programming. Evolut Comput
11(1):67–106
149. Luke S, Balan GC, Panait L (2003) Population implosion in genetic programming. In: Proceedings
of genetic and evolutionary computation GECCO 2003, PT II. Lecture notes in computer science,
pp 1729–1739
150. Luke S, Panait L (2006) A comparison of bloat control methods for genetic programming. Evolut
Comput 14(3):309–344
151. Majeed H, Ryan C (2006a) A less destructive, context-aware crossover operator for GP. In: Collet P
et al (eds) Proceedings of EuroGP 2006. LNCS, vol 3905. Springer, Heidelberg, pp 36–48
152. Majeed H, Ryan C (2006b) Using context-aware crossover to improve the performance of GP. In:
GECCO ’06: proceedings of the 8th annual conference on Genetic and evolutionary computation,
pp 847–854
153. Majeed H, Ryan C (2007) On the constructiveness of context-aware crossover. In: GECCO ’07: pro-
ceedings of the 9th annual conference on genetic and evolutionary computation, pp 1659–1666
154. Manrique D, Marquez F, Rios J, Rodriguez-Paton A (2005) Grammar based crossover operator in
genetic programming. In: Proceedings of artificial intelligence and knowledge engineering applications:
a bioinspired approach, PT 2. Lecture notes in computer science, pp 252–261
155. Manrique D, Ríos J, Rodríguez-Patón A (2006) Evolutionary system for automatically constructing and
adapting radial basis function networks. Neurocomputing, pp 2268–2283
156. McPhee NF, Miller JD (1995) Accurate replication in genetic programming. In: Eshelman L (ed) Genetic
algorithms: proceedings of the sixth international conference (ICGA95). Morgan Kaufmann, Menlo
Park, pp 303–309
157. McPhee NF, Jarvis A, Crane EF (2004) On the strength of size limits in linear genetic programming. In:
Proceedings of genetic and evolutionary computation GECCO 2004, PT 2. Lecture notes in computer
science, pp 593–604
158. McKay R, Abbass HA (2001) Anti-correlation: a diversity promoting mechanism in ensemble learning.
Aust J Intell Inf Process Syst 7(3/4):139–149
159. Miller JF, Thomson P (2000) Cartesian genetic programming. In: Proceedings of genetic programming.
Lecture notes in computer science, pp 121–132
160. Miller JF, Smith SL (2006) Redundancy and computational efficiency in Cartesian genetic program-
ming. IEEE Trans Evolut Comput 10(2):167–174
161. Monsieurs P, Flerackers E (2001) Reducing bloat in genetic programming. In: Proceedings of compu-
tational intelligence: theory and applications. Lecture notes in computer science, pp 471–478
162. Moore FW, Garcia ON (1997) A new methodology for reducing brittleness in genetic programming. In:
Proceedings of the IEEE, aerospace and electronics conference, NAECON, vol 2, pp 757–763
163. Muntean O, Diosan L, Oltean M (2007) Best SubTree genetic programming. In: GECCO ’07: proceed-
ings of the 9th annual conference on genetic and evolutionary computation, pp 1667–1673
164. Nanduri DT, Ciesielski V (2005) Comparison of the effectiveness of decimation and automatically
defined functions. In: Proceedings of knowledge-based intelligent information and engineering systems,
PT 3. Lecture notes in artificial intelligence, pp 540–546
165. Nguyen XH, McKay RI, Essam D (2006) Representation and structural difficulty in genetic program-
ming. IEEE Trans Evolut Comput 10(2):157–166
166. Niehaus J, Igel C, Banzhaf W (2007) Reducing the number of fitness evaluations in graph genetic pro-
gramming using a canonical graph indexed database. Evolut Comput 15(2):199–221
167. Niehaus J, Banzhaf W (2001) Adaption of operator probabilities in genetic programming. In: Proceed-
ings of genetic programming. Lecture notes in computer science, pp 325–336
168. Niimi A, Tazaki E (1999) Extended genetic programming using reinforcement learning operation. In:
Proceedings of IEEE international conference on systems, man, and cybernetics, SMC ’99 conference,
vol 5, pp 596–600
169. Niwa T, Iba H (1996) Distributed genetic programming—empirical study and analysis. In: Koza JR,
Goldberg D, Fogel DB, Riolo RL (eds) Genetic programming 1996: proceedings of the first annual
conference, 28–31 July, Stanford University. MIT Press, Cambridge, pp 339–344
170. Nordin JP (1994) Genetic programming and emergent intelligence. In: Kinnear KE Jr (ed) Advances in
genetic programming, vol 1, Chap 14. MIT Press, Cambridge, pp 311–331
171. Nordin P, Banzhaf W (1994) Complexity compression and evolution. In: Eshelman L (ed) Genetic algo-
rithms: proceedings of the sixth international conference (ICGA95). Morgan Kaufmann, Menlo Park,
pp 310–317
172. Nordin P, Francone F, Banzhaf W (1996) Explicitly defined introns and destructive crossover in genetic
programming. Adv Genetic Program 2:111–134
173. Nordin JP, Banzhaf W, Francone FD (1999a) A compiling genetic programming system that directly
manipulates the machine code. In: Spector L, Langdon WB, O’Reilly U-M, Angeline PJ (eds) Advances
in genetic programming, vol 3, Chap 12. MIT Press, Cambridge, pp 275–299
123
A survey and taxonomy of performance improvement of canonical genetic programming
174. Nordin P, Hoffmann F, Francone FD, Brameier M, Banzhaf W (1999) AIM-GP and parallelism. In:
Proceedings of the 1999 congress on evolutionary computation, CEC 99, vol 2, pp 1059–1066
175. Oltean M (2004) Solving even-parity problems using traceless genetic programming. In: Congress on
evolutionary computation, CEC2004, vol 2, pp 1813–1819
176. O’Reilly UM, Oppacher F (1995) Hybridized crossover-based search techniques for program discovery.
In: IEEE international conference on evolutionary computation, vol 2, pp 573–578
177. Oussaidène M, Chopard B, Pictet OV, Tomassini M (1996) Parallel genetic programming: an application
to trading models evolution. In: Koza JR, Goldberg DE, Fogel DB, Riolo RL (eds) Genetic programming
1996: proceedings of the first annual conference. MIT Press, Cambridge, pp 357–380
178. Page J, Poli R, Langdon WB (1999) Smooth uniform crossover with smooth point mutation in genetic
programming: a preliminary study. In: Proceedings of genetic programming. Lecture notes in computer
science, pp 39–48
179. Panait L, Luke S (2004) Alternative bloat control methods. In: Proceedings of genetic and evolutionary
computation GECCO 2004, PT 2. Lecture notes in computer science, pp 630–641
180. Parent J, Nowe A, Steenhaut K, Defaweux A (2005) Linear genetic programming using a compressed
genotype representation. In: The IEEE congress on evolutionary computation, vol 2, pp 1164–1171
181. Perry JE (1994) The effect of population enrichment in genetic programming. In: Proceedings of the first
IEEE conference on evolutionary computation. IEEE World Congress on Computational Intelligence,
vol 1, pp 456–461
182. Piszcz A, Soule T (2006) Genetic programming: optimal population sizes for varying complexity prob-
lems. In: GECCO ’06: proceedings of the 8th annual conference on genetic and evolutionary computa-
tion, pp 953–954
183. Platel MD, Clergue M, Collard P (2006) Size control with maximum homologous crossover. In: Artificial
evolution. Lecture notes in computer science, pp 13–24
184. Poli R, Langdon WB (1997) A new schema theory for genetic programming with one-point crossover
and point mutation. In: Koza JR, Deb K, Dorigo M, Fogel DB, Garzon M, Iba H, Riolo RL (eds) Genetic
programming 1997: proceedings of the second annual conference (Stanford University, CA, USA).
Morgan Kaufmann, Menlo Park, pp 278–285
185. Poli R, Langdon WB (1998a) Schema theory for genetic programming with one-point crossover and
point mutation. Evolut Comput 6(3):231–252
186. Poli R, Langdon WB (1998b) A review of theoretical and experimental results on schemata in
genetic programming. In: Banzhaf W et al (eds) Proceedings of the first European workshop on genetic
programming, vol 1391, pp 1–15
187. Poli R, Langdon WB (1998c) On the search properties of different crossover operators in genetic
programming. In: Genetic programming 1998: proceedings of the third annual conference. Morgan
Kaufmann, Menlo Park, pp 293–301
188. Poli R (2003) A simple but theoretically-motivated method to control bloat in genetic programming. In:
Proceedings of genetic programming. Lecture notes in computer science, pp 204–217
189. Poli R (2005) Tournament selection, iterated coupon-collection problem, and backward-chaining evolu-
tionary algorithms. In: Wright AH et al (eds) Foundations of genetic algorithms: 8th international
workshop (FOGA). Lecture notes in computer science, vol 3469, pp 132–155
190. Punch WF, Zongker D, Goodman ED (1996) The royal tree problem, a benchmark for single and
multi-population genetic programming. In: Angeline PJ, Kinnear KE Jr (eds) Advances in genetic pro-
gramming, vol 2, Chap 15. The MIT Press, Cambridge, pp 299–316
191. Punch WF (1998) How effective are multiple populations in genetic programming. In: Koza JR,
Banzhaf W, Chellapilla K, Deb K, Dorigo M, Fogel DB, Garzon MH, Goldberg DE, Iba H, Riolo R (eds)
Proceedings of the third annual conference on genetic programming. Morgan Kaufmann, San Mateo,
pp 308–313
192. Ratle A, Sebag M (2001) Avoiding the bloat with probabilistic grammar-guided genetic programming.
In: Collet P, Fonlupt C, Hao J-K, Lutton E, Schoenauer M (eds) Artificial evolution 5th International
conference, Evolution Artificielle, EA 2001, vol 2310. Springer, Heidelberg, pp 255–266
193. Ritchie MD, White BC, Parker JS, Hahn LW, Moore JH (2003) Optimization of neural network archi-
tecture using genetic programming improves detection of gene–gene interactions in studies of human
diseases. BMC Bioinf 4:28
194. Ritchie MD, Coffey CS, Moore JH (2004) Genetic programming neural networks as a bioinformatics
tool for human genetics. In: Proceedings of genetic and evolutionary computation—GECCO 2004, PT
1. Lecture notes in computer science, pp 438–448
195. Roberts SC, Howard D, Koza JR (2001) Evolving modules in genetic programming by subtree encap-
sulation. In: Proceedings of genetic programming. Lecture notes in computer science, pp 160–175
196. Rochat D, Tomassini M, Vanneschi L (2005) Dynamic size populations in distributed genetic program-
ming. In: Proceedings of genetic programming. Lecture notes in computer science, pp 50–61
P. Kouchakpour et al.
197. Rodriguez-Vazquez K, Fonseca CM, Fleming PJ (2004) Identifying the structure of non-linear dynamic
systems using multiobjective genetic programming. IEEE Trans Syst Man Cybern A Syst Humans,
pp 531–547
198. Rodriguez-Vazquez K, Fleming PJ (2005) Evolution of mathematical models of chaotic systems based
on multiobjective genetic programming. Knowl Inf Syst 8(2):235–256
199. Rosca JP, Ballard DH (1994a) Hierarchical self-organization in genetic programming. In: Proceedings
of the 11th international conference on machine learning, pp 251–258
200. Rosca JP, Ballard DH (1994b) Genetic programming with adaptive representations. Technical Report
TR 489, University of Rochester, Computer Science Department, Rochester, NY, USA, pp 1–30
201. Rosca JP, Ballard DH (1996) Discovery of subroutines in genetic programming. In: Angeline P, Kinnear
KE Jr (eds) Proceedings of advances in genetic programming, vol 2, Chap 9. MIT Press, Cambridge,
pp 177–202
202. Rosca JP (1997) Analysis of complexity drift in genetic programming. In: Koza JR, Deb K, Dorigo M,
Fogel DB, Garzon M, Iba H, Riolo RL (eds) Genetic programming 1997: proceedings of the second
annual conference. Morgan Kaufmann, pp 286–294
203. Ryan C, Collins JJ, O’Neill M (1998) Grammatical evolution: Evolving programs for an arbitrary lan-
guage. In: Banzhaf W et al (eds) 1st European workshop on genetic programming. Lecture notes in
computer science, vol 1391. Springer, Heidelberg
204. Salhi A, Glaser H, De Roure D (1998) Parallel implementation of a genetic-programming based tool
for symbolic regression. Inform Process Lett 66(6):299–307
205. Sanchez L (2000) Interval-valued GA-P algorithms. IEEE Trans Evolut Comput 4(1):64–72
206. Silva S, Almeida JS (2003) Dynamic maximum tree depth: a simple technique for avoiding bloat in
tree-based GP. In: Cantu-Paz E, Foster JA, Deb K et al (eds) GECCO-2003. LNCS, Chicago, IL, USA.
Springer, Heidelberg
207. Silva S, Costa E (2004) Dynamic limits for bloat control: variations on size and depth. In: Deb K,
Poli R, Banzhaf W et al. (eds) GECCO-2004, Seattle, WA, USA. LNCS. Springer, Heidelberg,
pp 666–677
208. Silva S, Silva PJN, Costa E (2005) Resource-limited genetic programming: replacing tree depth lim-
its. In: Ribeiro B, Albrecht RF, Dobnikar A et al (eds) ICANNGA-2005, Coimbra, Portugal. Springer,
Heidelberg, pp 243–246
209. Silva S, Costa E (2005a) Resource-limited genetic programming: the dynamic approach. In: GECCO ’05:
proceedings of the 2005 conference on genetic and evolutionary computation, pp 1673–1680
210. Silva S, Costa E (2005b) Comparing tree depth limits and resource-limited GP. In: The 2005 IEEE
congress on evolutionary computation, vol 1, pp 920–927
211. Soule T, Foster JA, Dickinson J (1996) Code growth in genetic programming. In: Genetic programming
1996: proceedings of the first annual conference. MIT Press, Cambridge, pp 215–223
212. Soule T, Foster JA (1999) Effects of code growth and parsimony pressure on populations in genetic
programming. Evolut Comput 6(4):293–309
213. Soule T (2002) Exons and code growth in genetic programming. In: Foster A, Lutton E, Miller J, Ryan
C, Tettamanzi AGB (eds) EuroGP 2002. LNCS, vol 2278. Springer, Heidelberg, pp 142–151
214. Soule T, Heckendorn RB (2002) An analysis of the causes of code growth in genetic programming.
Genetic Program Evol Mach 3:283–309
215. Spinosa E, Pozo A (2004) Controlling the population size in genetic programming. In: Proceedings of
advances in artificial intelligence. Lecture Notes in Artificial Intelligence, pp 345–354
216. Stevens J, Heckendorn RB, Soule T (2005) Exploiting disruption aversion to control code bloat. In:
GECCO ’05: proceedings of the 2005 conference on Genetic and evolutionary computation,
pp 1605–1612
217. Stone P, Veloso M (2000) Layered learning. In: Proceedings of the 17th international conference on machine
learning. Springer, Heidelberg, pp 369–381
218. Streeter MJ (2003) The root causes of code growth in genetic programming. In: Proceedings of genetic
programming. Lecture notes in computer science, pp 443–454
219. Svangard N, Nordin P, Lloyd S (2003) Using genetic programming with negative parsimony pressure
on exons for portfolio optimization. The 2003 congress on evolutionary computation, CEC ’03, vol 2,
pp 1014–1017
220. Tackett WA (1994) Recombination, selection and the genetic construction of computer programs. PhD
dissertation, Department of Electrical Engineering Systems, University of Southern California
221. Tackett WA, Carmi A (1994) The unique implications of brood selection for genetic programming.
In: IEEE World congress on computational intelligence, proceedings of the first IEEE conference on
evolutionary computation, vol 1, pp 160–165
222. Tanev I, Uozumi T, Ono K (2001) Parallel genetic programming: component object-based distributed
collaborative approach. In: Proceedings of 15th international conference on information networking,
pp 129–136
223. Terrio MD, Heywood MI (2002) Directing crossover for reduction of bloat in GP. In: IEEE proceedings
of Canadian conference on electrical and computer engineering, vol 2, pp 1111–1115
224. Tomassini M (1999) Parallel and distributed evolutionary algorithms: a review. In: Neittaanmäki P,
Miettinen K, Mäkelä M, Périaux J (eds) Evolutionary algorithms in engineering and computer science.
Wiley, Chichester
225. Tomassini M, Vanneschi L, Fernandez F, Galeano G (2004) A study of diversity in multipopulation
genetic programming. In: Artificial evolution. Lecture notes in computer science, pp 243–255
226. Wolpert DH, Macready WG (1997) No free lunch theorems for optimization. IEEE Trans Evolut Comput
1(1):67–82
227. Walker JA, Miller JF (2007) The automatic acquisition, evolution and reuse of modules in Cartesian
genetic programming. IEEE Trans Evolut Comput (accepted)
228. Wagner N, Michalewicz Z (2001) Genetic programming with efficient population control for financial
time series prediction. In: Goodman ED (ed) GECCO-2001 late breaking papers, San Francisco, CA,
USA, pp 458–462
229. Wang G, Soule T (2004) How to choose appropriate function sets for genetic programming. In: Pro-
ceedings of genetic programming. Lecture notes in computer science, pp 198–207
230. Whigham PA (1995a) Grammatically-based genetic programming. In: Rosca JP (ed) Proceedings of
the workshop on genetic programming: from theory to real-world applications, Tahoe City, California,
USA, 1995, pp 33–41
231. Whigham PA (1995b) A schema theorem for context-free grammars. In: IEEE conference on evolution-
ary computation, vol 1. IEEE Press, New York, pp 178–181
232. Wieczorek W, Czech ZJ (2000) Grammars in genetic programming. Control Cybern 29(4):1019–1030
233. Wong ML, Leung KS (1995a) Applying logic grammars to induce sub-functions in genetic program-
ming. In: IEEE international conference on evolutionary computation, vol 2, pp 737–740
234. Wong ML, Leung KS (1995b) Combining genetic programming and inductive logic programming using
logic grammars. In: IEEE international conference on evolutionary computation, vol 2, pp 733–736
235. Wong ML, Leung KS (2000) Data mining using grammar based genetic programming and applications.
Kluwer, Boston
236. Wong P, Zhang M (2006) Algebraic simplification of GP programs during evolution. In: GECCO ’06:
proceedings of the 8th annual conference on genetic and evolutionary computation, pp 927–934
237. Woodward JR (2006) Complexity and Cartesian genetic programming. In: Proceedings of genetic pro-
gramming. Lecture notes in computer science, pp 260–269
238. Wyns B, Sette S, Boullart L (2004) Self-improvement to control code growth in genetic programming.
In: Artificial evolution, Lecture notes in computer science, pp 256–266
239. Wyns B, Boullart L, De Smedt PJ (2007) Limiting code growth to improve robustness in tree-based
genetic programming. In: GECCO ’07: proceedings of the 9th annual conference on genetic and evolu-
tionary computation, p 1763
240. Xie HY (2005) Diversity control in GP with ADF for regression tasks. In: AI 2005: advances in artificial
intelligence. Lecture Notes In Artificial Intelligence, pp 1253–1257
241. Xie HY, Zhang MJ, Andreae P (2006a) Population clustering in genetic programming. In: Proceedings
of genetic programming. Lecture notes in computer science, pp 190–201
242. Xie HY, Zhang MJ, Andreae P (2006b) Automatic selection pressure control in genetic programming. In:
Sixth international conference on intelligent systems design and applications, ISDA ’06, vol 1, pp 435–
440
243. Xie H, Zhang M, Andreae P (2007a) An analysis of constructive crossover and selection pressure in
genetic programming, GECCO, pp 1739–1748
244. Xie H, Zhang M, Andreae P (2007b) Another investigation on tournament selection: modelling and
visualisation. GECCO, pp 1468–1475
245. Yanagiya M (1995) Efficient genetic programming based on binary decision diagrams. In: IEEE inter-
national conference on evolutionary computation, vol 1, pp 234–239
246. Yuen CC (2004) Selective crossover using gene dominance as an adaptive strategy for genetic program-
ming. MSc thesis, Intelligent Systems, University College London, UK
247. Zhang L, Nandi AK (2007) Neutral offspring controlling operators in genetic programming. Pattern
Recogn 40(10):2696–2705
248. Zhang BT, Mühlenbein H (1995) Balancing accuracy and parsimony in genetic programming. Evolut
Comput 3(1):17–38
249. Zhang H, Lu YN, Wang F (2003) Grammar based genetic programming using linear representations.
Chin J Elect 12(1):75–78
250. Zhang M, Gao X, Lou W (2006a) Looseness controlled crossover in GP for object recognition. In:
IEEE congress on evolutionary computation, CEC 2006, pp 1285–1292
251. Zhang MJ, Wong P, Qian DP (2006b) Online program simplification in genetic programming.
In: Proceedings of simulated evolution and learning. Lecture notes in computer science, pp 592–600
252. Zhang MJ, Gao XY, Lou WJ, Qian DP (2006c) Investigation of brood size in GP with brood recombina-
tion crossover for object recognition. In: Proceedings of PRICAI 2006: trends in artificial intelligence.
Lecture Notes in Artificial Intelligence, pp 923–928
253. Zhang YQ, Chen HS (2006) Improved approach of genetic programming and applications for data
mining. In: Advances in natural computation, pt 1. Lecture notes in computer science, pp 816–819
254. Zhang Y, Rockett PI (2005) Evolving optimal feature extraction using multi-objective genetic pro-
gramming: a methodology and preliminary study on edge detection. In: Beyer et al (eds) Genetic and
evolutionary computation conference (GECCO 2005), pp 795–802
255. Zhang Y, Rockett PI (2006) Feature extraction using multi-objective genetic programming. In: Jin Y
(ed) Multi-objective machine learning. Springer, Heidelberg
256. Zitzler E, Thiele L (1998) An evolutionary algorithm for multiobjective optimization: the strength pareto
approach. Swiss Federal Institute of Technology (ETH) Zurich, TIK-Report, No 43
257. Zitzler E, Thiele L (1999) Multiobjective evolutionary algorithms: a comparative case study and the
strength pareto approach. IEEE Trans Evolut Comput 3(4):257–271
258. Zitzler E, Laumanns M, Thiele L (2001) SPEA2: improving the performance of the strength pareto
evolutionary algorithm. Technical Report 103, Computer Engineering and Networks Laboratory (TIK),
Swiss Federal Institute of Technology (ETH) Zurich
259. Zvada S, Vanyi B (2004) Improving grammar-based evolutionary algorithms via attributed derivation
trees. In: Proceedings of genetic programming. Lecture notes in computer science, pp 208–219
Author Biographies
Peyman Kouchakpour received his Bachelor of engineering (majoring
in Electronics) and his Honours (majoring in Robotics) from the Uni-
versity of Western Australia (UWA), Perth. He completed his Doctorate
(PhD) in the field of Genetic Programming at UWA. Dr. Kouchakpour
has over 15 years of professional engineering experience working at
world leading companies in telecommunications. He has also taught
many undergraduate courses at the University of Western Australia.
Dr. Kouchakpour’s research interests are artificial intelligence, genetic
programming, robotics, and biomedical engineering.
Anthony Zaknich is currently an Adjunct Associate Professor with
the Centre for Intelligent Information Processing Systems (CIIPS) in
the School of Electrical, Electronic and Computer Engineering at UWA.
From 1990 to 1999 he held the position of Technical Manager for Indus-
try Projects working as a Research Fellow and Lecturer at CIIPS in
the Electrical and Electronics Engineering Department, UWA. His main
work at CIIPS has been involved with supervision, teaching, research and
development related to signal processing and artificial neural networks
at the undergraduate, postgraduate and professional-development levels.
Previously, he was involved in the research and development of under-
water control and acoustic signalling systems in private enterprise, and
also in the establishment of a public company, Nautronix Ltd, producing
and marketing products in these areas for the international market.
Thomas Bräunl is Associate Professor at the University of Western
Australia, Perth, where he founded and directs the Robotics and Auto-
mation Lab and is also Director of the Centre for Intelligent Information
Processing Systems (CIIPS). Professor Bräunl received a Diploma in
Informatics in 1986 from Univ. Kaiserslautern, an MS in Computer Sci-
ence in 1987 from the University of Southern California, Los Angeles,
and a PhD and Habilitation in Informatics in 1989 and 1994, respectively,
from University of Stuttgart. He has worked in the past for BASF and
DaimlerChrysler, has been a Guest Professor at Technical Univ. Munich
and Santa Clara University, and has founded a company for innovative
mobile robot design. Professor Bräunl’s research interests are robotics,
vision, graphics, and concurrency. He is author of several research books
and textbooks and has developed the EyeBot mobile robot family.