Genetic Programming and Evolvable Machines

Published by Springer Nature
Online ISSN: 1573-7632
Print ISSN: 1389-2576
Recent publications
Content generation is one of the major challenges in the modern age. The video game industry is no exception, and the ever-increasing demand for bigger titles containing vast volumes of content has become one of the vital challenges for the content generation domain. Conventional game development, which relies on manual human work, is not cost-efficient, and the need for more intelligent, advanced and procedural methods is evident in this field. In a sense, procedural content generation (PCG) is an NP-hard (non-deterministic polynomial-time hard) optimization problem in which specific metrics should be optimized. In this paper, we use the Estimation of Distribution Algorithm (EDA) to optimize the task of PCG in digital video games. EDA is an evolutionary stochastic optimization method, and the introduction of probabilistic modeling, one of the main features of EDA, into this problem domain is a reliable way to mathematically apply human knowledge to the challenging field of content generation. The results reflect the acceptable performance of the proposed method, which can inform academic PCG research and contribute to the game industry.
When applying evolutionary algorithms to circuit design automation, circuit representation is the first consideration. Several studies have applied different circuit representations. However, they still have some problems, such as limited design ability, meaning that the diversity of evolved circuits is restricted by the circuit representation, and inefficient transformation from the circuit representation into a SPICE (Simulation Program with Integrated Circuit Emphasis) netlist. In this paper, a novel tree-based circuit representation for analog circuits is proposed, which is equipped with an intuitive mapping rule between circuit representation and SPICE netlist that is friendly to three-terminal devices, as well as a suitable crossover operator. Based on the proposed representation, a framework for automated analog circuit design using genetic programming is proposed to evolve both the circuit topology and device values. Three benchmark circuits are used to evaluate the proposed approach, showing that the proposed method is feasible and evolves analog circuits with better fitness and component counts while using fewer fitness evaluations than existing approaches. Furthermore, considering the physical scalability limits of conventional circuit elements and the increased interest in emerging technologies, a memristor-based pulse generation circuit is also evolved with the proposed method. The feasibility of the evolved circuits is successfully verified by circuit simulation. The experimental results show that the evolved memristive circuit is more compact and has better energy efficiency than existing manually designed circuits.
Biological systems are very robust to morphological damage, but artificial systems (robots) are currently not. In this paper we present a system based on neural cellular automata, in which locomoting robots are evolved and then given the ability to regenerate their morphology from damage through gradient-based training. Our approach thus combines the benefits of evolution to discover a wide range of different robot morphologies, with the efficiency of supervised training for robustness through differentiable update rules. The resulting neural cellular automata are able to grow virtual robots capable of regaining more than 80% of their functionality, even after severe types of morphological damage.
Automated neural architecture search (NAS) methods are now employed to routinely deliver high-quality neural network architectures for various challenging data sets and reduce the designer’s effort. The NAS methods utilizing multi-objective evolutionary algorithms are especially useful when the objective is not only to minimize the network error but also to reduce the number of parameters (weights) or power consumption of the inference phase. We propose a multi-objective NAS method based on Cartesian genetic programming for evolving convolutional neural networks (CNN). The method allows approximate operations to be used in CNNs to reduce the power consumption of a target hardware implementation. During the NAS process, a suitable CNN architecture is evolved together with selecting approximate multipliers to deliver the best trade-offs between accuracy, network size, and power consumption. The most suitable 8 × N-bit approximate multipliers are automatically selected from a library of approximate multipliers. Evolved CNNs are compared with CNNs developed by other NAS methods on the CIFAR-10 and SVHN benchmark problems.
Deep Learning has been very successful in automating the feature engineering process, widely applied for various tasks, such as speech recognition, classification, segmentation of images, time-series forecasting, among others. Deep neural networks (DNNs) incorporate the power to learn patterns through data, following an end-to-end fashion and expand the applicability in real world problems, since less pre-processing is necessary. With the fast growth in both scale and complexity, a new challenge has emerged regarding the design and configuration of DNNs. In this work, we present a study on applying an evolutionary grammar-based genetic programming algorithm (GP) as a unified approach to the design of DNNs. Evolutionary approaches have been growing in popularity for this subject as Neuroevolution is studied more. We validate our approach in three different applications: the design of Convolutional Neural Networks for image classification, Graph Neural Networks for text classification, and U-Nets for image segmentation. The results show that evolutionary grammar-based GP can efficiently generate different DNN architectures, adapted to each problem, employing choices that differ from what is usually seen in networks designed by hand. This approach has shown a lot of promise regarding the design of architectures, reaching competitive results with their counterparts.
The two simplified Push solutions generated by random search to the GCD problem
For the past seven years, researchers in genetic programming and other program synthesis disciplines have used the General Program Synthesis Benchmark Suite (PSB1) to benchmark many aspects of systems that conduct programming by example, where the specifications of the desired program are given as input/output pairs. PSB1 has been used to make notable progress toward the goal of general program synthesis: automatically creating the types of software that human programmers code. Many of the systems that have attempted the problems in PSB1 have used it to demonstrate performance improvements granted through new techniques. Over time, the suite has gradually become outdated, hindering the accurate measurement of further improvements. The field needs a new set of more difficult benchmark problems to move beyond what was previously possible and ensure that systems do not overfit to one benchmark suite. In this paper, we describe the 25 new general program synthesis benchmark problems that make up PSB2, a new benchmark suite. These problems are curated from a variety of sources, including programming katas and college courses. We selected these problems to be more difficult than those in the original suite, and give results using PushGP showing this increase in difficulty. We additionally give an example of benchmarking using a state-of-the-art parent selection method, showing improved performance on PSB2 while still leaving plenty of room for improvement. These new problems will help guide program synthesis research for years to come.
The Bayesian Optimization Algorithm (BOA) is one of the most prominent Estimation of Distribution Algorithms. It can detect the correlation between multiple variables and extract knowledge on regular patterns in solutions. Bayesian Networks (BNs) are used in BOA to represent the probability distributions of the best individuals. The BN’s construction is challenging since there is a trade-off between acuity and computational cost to generate it. This trade-off is determined by combining a search algorithm (SA) and a scoring metric (SM). The SA is responsible for generating a promising BN and the SM assesses the quality of such networks. Some studies have already analyzed how this relationship affects the learning process of a BN. However, such investigation had not yet been performed to determine the bond linking the selection of SA and SM and the BOA’s output quality. Acting on this research gap, a detailed comparative analysis involving two constructive heuristics and four scoring metrics is presented in this work. The classic version of BOA was applied to discrete and continuous optimization problems using binary and floating-point representations. The scenarios were compared through graphical analyses, statistical metrics, and difference detection tests. The results showed that the selection of SA and SM affects the quality of the BOA results since scoring metrics that penalize complex BN models perform better than metrics that do not consider the complexity of the networks. This study contributes to a discussion on this metaheuristic’s practical use, assisting users with implementation decisions.
The proper management of diversity is essential to the success of Evolutionary Algorithms. Specifically, methods that explicitly relate the amount of diversity maintained in the population to the stopping criterion and elapsed period of execution, with the aim of attaining a gradual shift from exploration to exploitation, have been particularly successful. However, in the area of Genetic Programming, the performance of this design principle has not been studied. In this paper, a novel Genetic Programming method, Genetic Programming with Dynamic Management of Diversity (GP-DMD), is presented. GP-DMD applies this design principle through a replacement strategy that combines penalties based on distance-like functions with a multi-objective Pareto selection based on accuracy and simplicity. The proposed general method was adapted to the well-established Symbolic Regression benchmark problem using tree-based Genetic Programming. Several state-of-the-art diversity management approaches were considered for the experimental validation, and the results obtained showcase the improvements both in terms of mean square error and size. The effects of GP-DMD on the dynamics of the population are also analyzed, revealing the reasons for its superiority. As in other fields of Evolutionary Computation, this design principle contributes significantly to the area of Genetic Programming.
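The Pareto selection on accuracy and simplicity mentioned above can be illustrated with a minimal sketch. It is not the GP-DMD replacement strategy itself (the distance-based penalties are omitted); it only shows how non-dominated candidates are picked when both error and tree size are minimized. The `Candidate` tuple layout and the sample values are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate summary: (mean squared error, tree size); both are minimized.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [p for p in population if not any(dominates(q, p) for q in population)]


# Illustrative values only, not results from the paper.
population = [(0.10, 25), (0.12, 9), (0.10, 30), (0.50, 3), (0.55, 7)]
front = pareto_front(population)
```

Here `(0.10, 30)` is dropped because `(0.10, 25)` has the same error with a smaller tree, and `(0.55, 7)` is dropped because `(0.50, 3)` is better on both objectives.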
Protein folding is the dynamic process by which a protein folds into its final native structure. This is different to the traditional problem of the prediction of the final protein structure, since it requires a modeling of how protein components interact over time to obtain the final folded structure. In this study we test whether a model of the folding process can be obtained exclusively through machine learning. To this end, protein folding is considered as an emergent process and the cellular automata tool is used to model the folding process. A neural cellular automaton is defined, using a connectionist model that acts as a cellular automaton through the protein chain to define the dynamic folding. Differential evolution is used to automatically obtain the optimized neural cellular automata that provide protein folding. We tested the methods with the Rosetta coarse-grained atomic model of protein representation, using different proteins to analyze the modeling of folding and the structure refinement that the modeling can provide, showing the potential advantages that such methods offer, but also difficulties that arise.
We introduce GPLS (Genetic Programming for Linear Systems) as a GP system that finds mathematical expressions defining an iteration matrix. Stationary iterative methods use this iteration matrix to solve a system of linear equations numerically. GPLS aims at finding iteration matrices with a low spectral radius and a high sparsity, since these properties ensure a fast error reduction of the numerical solution method and enable the efficient implementation of the methods on parallel computer architectures. We study GPLS for various types of system matrices and find that it easily outperforms classical approaches like the Gauss–Seidel and Jacobi methods. GPLS not only finds iteration matrices for linear systems with a much lower spectral radius, but also iteration matrices for problems where classical approaches fail. Additionally, solutions found by GPLS for small problem instances also show good performance for larger instances of the same problem.
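To make the role of the iteration matrix concrete, here is a minimal sketch of a stationary iterative method using the classical Jacobi iteration matrix, one of the baselines named above. It is not GPLS. The function names and the 2 × 2 example system are assumptions for illustration, and the infinity norm is used only as a simple upper bound on the spectral radius.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def jacobi_iteration_matrix(a: Matrix) -> Matrix:
    """Jacobi iteration matrix G = I - D^{-1} A, where D is the diagonal of A."""
    n = len(a)
    return [[0.0 if i == j else -a[i][j] / a[i][i] for j in range(n)] for i in range(n)]


def infinity_norm(g: Matrix) -> float:
    """Maximum absolute row sum; an upper bound on the spectral radius of g."""
    return max(sum(abs(v) for v in row) for row in g)


def stationary_solve(a: Matrix, b: Vector, steps: int = 50) -> Vector:
    """Iterate x <- G x + c with c = D^{-1} b, starting from the zero vector."""
    n = len(a)
    g = jacobi_iteration_matrix(a)
    c = [b[i] / a[i][i] for i in range(n)]
    x = [0.0] * n
    for _ in range(steps):
        x = [sum(g[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
    return x


# Hypothetical diagonally dominant system whose exact solution is x = (2, 1).
a = [[4.0, 1.0], [2.0, 5.0]]
b = [9.0, 9.0]
bound = infinity_norm(jacobi_iteration_matrix(a))
solution = stationary_solve(a, b)
```

Because the bound is below 1, the iteration converges; a smaller spectral radius means the error shrinks faster per step, which is the property GPLS is described as optimizing.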
In this paper we investigate the benefits of applying a multi-objective approach for solving a symbolic regression problem by means of Grammatical Evolution. In particular, we extend previous work, obtaining mathematical expressions to model glucose levels in the blood of diabetic patients. Here we use a multi-objective Grammatical Evolution approach based on the NSGA-II algorithm, considering the root-mean-square error and an ad-hoc fitness function as objectives. This ad-hoc function is based on the Clarke Error Grid analysis, which is useful for showing the potential danger of mispredictions in diabetic patients. In this work, we use two datasets to analyse two different scenarios: What-if and Agnostic, the most common in daily clinical practice. In the What-if scenario, where future events are evaluated, results show that the multi-objective approach improves previous results in terms of Clarke Error Grid analysis by reducing the number of dangerous mispredictions. In the Agnostic situation, with no available information about future events, results suggest that we can obtain good predictions with only information from the previous hour for both Grammatical Evolution and Multi-Objective Grammatical Evolution.
In some situations, the interpretability of machine learning models is as important as model accuracy. Interpretability comes from the need to trust the prediction model, verify some of its properties, or even enforce them to improve fairness. Many model-agnostic explanatory methods exist to provide explanations for black-box models. In the regression task, the practitioner can use white-box or gray-box models to achieve more interpretable results, which is the case of symbolic regression. When using an explanatory method, and since interpretability lacks a rigorous definition, there is a need to evaluate and compare the quality of different explainers. This paper proposes a benchmark scheme to evaluate explanatory methods for regression models, mainly symbolic regression models. Experiments were performed using 100 physics equations with different interpretable and non-interpretable regression methods and popular explanation methods, evaluating the performance of the explainers with several explanation measures. In addition, we further analyzed four benchmarks from the GP community. The results have shown that Symbolic Regression models can be an interesting alternative to white-box and black-box models, capable of returning accurate models with appropriate explanations. Regarding the explainers, we observed that Partial Effects and SHAP were the most robust explanation models, with Integrated Gradients being unstable only with tree-based models. This benchmark is publicly available for further experiments.
In this paper we examine the concept of complexity as it applies to generative and evolutionary art and design. Complexity has many different, discipline specific definitions, such as complexity in physical systems (entropy), algorithmic measures of information complexity and the field of “complex systems”. We apply a series of different complexity measures to three different evolutionary art datasets and look at the correlations between complexity and individual aesthetic judgement by the artist (in the case of two datasets) or the physically measured complexity of generative 3D forms. Our results show that the degree of correlation is different for each set and measure, indicating that there is no overall “better” measure. However, specific measures do perform well on individual datasets, indicating that careful choice can increase the value of using such measures. We then assess the value of complexity measures for the audience by undertaking a large-scale survey on the perception of complexity and aesthetics. We conclude by discussing the value of direct measures in generative and evolutionary art, reinforcing recent findings from neuroimaging and psychology which suggest human aesthetic judgement is informed by many extrinsic factors beyond the measurable properties of the object being judged.
GNGP structure for regular grammars
GNGP structure for context-free grammars
GNGP structure for context-sensitive grammars
GNGP structure for phrase structure (non-restricted) grammars
The Networks of Genetic Processors (NGPs) are non-conventional models of computation based on genetic operations over strings, namely mutation and crossover operations as established in genetic algorithms. Initially, they were proposed as acceptor machines, which are decision problem solvers. In that case, it has been shown that they are universal computing models equivalent to Turing machines. In this work, we propose NGPs as enumeration devices and we analyze their computational power. First, we define the model and we propose its definition as parallel genetic algorithms. Once the correspondence between the two formalisms has been established, we carry out a study of the generation capacity of the NGPs under the research framework of the theory of formal languages. We investigate the relationships between the number of processors of the model and its generative power. Our results show that the number of processors is important to increase the generative capability of the model up to an upper bound, and that NGPs are universal models of computation if they are formulated as generation devices. This allows us to affirm that parallel genetic algorithms working under certain restrictions can be considered equivalent to Turing machines and, therefore, they are universal models of computation.
Syntax tree representation of regular expression ^ab[^c]
The first four solutions generated by the swap operator if applied to the base regular expression (shown on the left-side)
The first four solutions generated by the concatenation operator if applied to the base regular expression (shown on the left-side)
Distributions of fitness for the solutions found by the local search on an instance basis
Regular expression is a technology widely used in software development for extracting textual data, validating the structure of textual documents, or formatting data. Regex Golf is a challenge that consists in finding the smallest possible regular expression given a set of sentences to perform matches and another set not to match. An algorithm capable of meeting the Regex Golf requirements is a relevant contribution to the area of semi-structured document data extraction. In this paper, we propose a heuristic search algorithm based on local search, combined with a regular expression shrinker, to find valid results for Regex Golf problems. An experimental study was conducted to compare the proposed technique with an exact algorithm and a genetic programming algorithm designed for the Regex Golf challenge. The proposed local search was shown to outperform both competing algorithms in six out of fifteen problem instances, tying in another three instances. On the other hand, all algorithms still lack the ability to outperform human software developers in designing regular expressions for the challenge.
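The Regex Golf objective described above can be sketched as a validity check plus a length comparison. This is not the paper's local search or its shrinker; it only shows how a candidate is judged. The function names, the word lists, and the use of `re.search` (a match anywhere in the string) are assumptions for illustration.

```python
import re
from typing import Iterable, List, Optional


def solves_regex_golf(pattern: str, must_match: Iterable[str],
                      must_not_match: Iterable[str]) -> bool:
    """A candidate is valid if it matches every wanted string and no unwanted one."""
    regex = re.compile(pattern)
    return (all(regex.search(s) for s in must_match)
            and not any(regex.search(s) for s in must_not_match))


def shortest_valid(candidates: List[str], must_match: List[str],
                   must_not_match: List[str]) -> Optional[str]:
    """Among the valid candidates, return the shortest one (None if none is valid)."""
    valid = [p for p in candidates if solves_regex_golf(p, must_match, must_not_match)]
    return min(valid, key=len) if valid else None


# Hypothetical instance: match the first list, reject the second.
must_match = ["foot", "food", "fool"]
must_not_match = ["bar", "baz", "fog"]
candidates = ["^foo[tdl]$", "foo", "oo", "o"]
best = shortest_valid(candidates, must_match, must_not_match)
```

The pattern `o` is shorter than `oo` but invalid because it also matches `fog`, so the shortest valid candidate here is `oo`.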
We study both genotypic and phenotypic convergence in GP floating point continuous domain symbolic regression over thousands of generations. Subtree fitness variation across the population is measured and shown in many cases to fall. In an expanding region about the root node, both genetic opcodes and function evaluation values are identical or nearly identical. Bottom up (leaf to root) analysis shows both syntactic and semantic (including entropy) similarity expand from the outermost node. Despite large regions of zero variation, fitness continues to evolve and near zero crossover disruption suggests improved GP systems within existing memory use.
Example posterior probability density function. See text for a further explanation
This paper extends the numerical tuning of tree constants in genetic programming (GP) to the multiobjective domain. Using ten real-world benchmark regression datasets and employing Bayesian comparison procedures, we first consider the effects of feature standardization (without constant tuning) and conclude that standardization generally produces lower test errors, but, contrary to other recently published work, we find a much less clear trend for tree sizes. In addition, we consider the effects of constant tuning, with and without feature standardization, and observe that i) constant tuning invariably improves test error, and ii) usually decreases tree size. Combined with standardization, constant tuning produces the best test error results; tree sizes, however, are increased. We also examine the effects of applying constant tuning only once at the end of a conventional GP run, which turns out to be surprisingly promising. Finally, we consider the merits of using numerical procedures to tune tree constants and observe that for around half the datasets evolutionary search alone is superior, whereas for the remaining half parameter tuning is superior. We identify a number of open research questions that arise from this work.
Time series data is often composed of a multitude of individual, superimposed dynamics. We propose a novel algorithm for inferring time series compositions through evolutionary synchronization of modular networks (ESMoN). ESMoN orchestrates a set of trained dynamic modules, assuming that some of those modules’ dynamics, suitably parameterized, will be present in the targeted time series. With the help of iterative co-evolution techniques, ESMoN optimizes the activities of its modules dynamically, which effectively synchronizes the system with the unfolding time series signal and distributes the dynamic subcomponents present in the time series over the respective modules. We show that ESMoN can adapt modules of different types. Moreover, it is able to precisely identify the signal components of various time series dynamics. We thus expect that ESMoN will be useful also in other domains—including, for example, medical, physical, and behavioral data domains—where the data is composed of known signal sources.
This work uses genetic programming to explore the space of continuous optimisers, with the goal of discovering novel ways of doing optimisation. In order to keep the search space broad, the optimisers are evolved from scratch using Push, a Turing-complete, general-purpose language. The resulting optimisers are found to be diverse, and explore their optimisation landscapes using a variety of interesting, and sometimes unusual, strategies. Significantly, when applied to problems that were not seen during training, many of the evolved optimisers generalise well, and often outperform existing optimisers. This supports the idea that novel and effective forms of optimisation can be discovered in an automated manner. This paper also shows that pools of evolved optimisers can be hybridised to further increase their generality, leading to optimisers that perform robustly over a broad variety of problem types and sizes.
A fundamental aspect of intelligent agent behaviour is the ability to encode salient features of experience in memory and use these memories, in combination with current sensory information, to predict the best action for each situation such that long-term objectives are maximized. The world is highly dynamic, and behavioural agents must generalize across a variety of environments and objectives over time. This scenario can be modeled as a partially-observable multi-task reinforcement learning problem. We use genetic programming to evolve highly-generalized agents capable of operating in six unique environments from the control literature, including OpenAI’s entire Classic Control suite. This requires the agent to support discrete and continuous actions simultaneously. No task-identification sensor inputs are provided, thus agents must identify tasks from the dynamics of state variables alone and define control policies for each task. We show that emergent hierarchical structure in the evolving programs leads to multi-task agents that succeed by performing a temporal decomposition and encoding of the problem environments in memory. The resulting agents are competitive with task-specific agents in all six environments. Furthermore, the hierarchical structure of programs allows for dynamic run-time complexity, which results in relatively efficient operation.
Semantic GP is a promising branch of GP that introduces semantic awareness during genetic evolution to improve various aspects of GP. This paper presents a new Semantic GP approach based on Dynamic Target (SGP-DT) that divides the search problem into multiple GP runs. The evolution in each run is guided by a new (dynamic) target based on the residual errors of previous runs. To obtain the final solution, SGP-DT combines the solutions of each run using linear scaling. SGP-DT presents a new methodology to produce the offspring that does not rely on the classic crossover. The synergy between such a methodology and linear scaling yields final solutions with low approximation error and computational cost. We evaluate SGP-DT on eleven well-known data sets and compare with ϵ-lexicase, a state-of-the-art evolutionary technique, and seven Machine Learning techniques. SGP-DT achieves small RMSE values, on average 23.19% smaller than the one of ϵ-lexicase. Tuning SGP-DT's configuration greatly reduces the computational cost while still obtaining competitive results.
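Linear scaling and residual-based targets, as mentioned above, can be sketched as follows. This is the standard least-squares form of linear scaling (fit an intercept and slope to a program's raw outputs), not the SGP-DT implementation; the function names and sample values are assumptions for illustration.

```python
from typing import List, Tuple


def linear_scaling(outputs: List[float], targets: List[float]) -> Tuple[float, float]:
    """Least-squares intercept and slope so that intercept + slope * output fits the targets."""
    n = len(outputs)
    mean_o = sum(outputs) / n
    mean_t = sum(targets) / n
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    if var_o == 0.0:
        # A constant program cannot be rescaled; fall back to predicting the target mean.
        return mean_t, 0.0
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    slope = cov / var_o
    return mean_t - slope * mean_o, slope


def residual_target(outputs: List[float], targets: List[float]) -> List[float]:
    """Residuals left after scaling; a later run could use them as its new target."""
    intercept, slope = linear_scaling(outputs, targets)
    return [t - (intercept + slope * o) for o, t in zip(outputs, targets)]


# Hypothetical raw program outputs and regression targets (targets = 1 + 2 * outputs).
outputs = [0.0, 1.0, 2.0, 3.0]
targets = [1.0, 3.0, 5.0, 7.0]
intercept, slope = linear_scaling(outputs, targets)
residuals = residual_target(outputs, targets)
```

In this toy case the scaled program fits the targets exactly, so the residual target is zero everywhere; with an imperfect program, the residuals would describe what remains to be explained.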
Cartesian genetic programming (CGP) represents the most efficient method for the evolution of digital circuits. Despite many successful applications, however, CGP suffers from limited scalability, especially when used for evolutionary circuit design, i.e. design of circuits from a randomly initialized population. Considering the multiplier design problem, for example, the 5 × 5-bit multiplier represents the most complex circuit designed by the evolution from scratch. The efficiency of CGP highly depends on the performance of the point mutation operator, however, this operator is purely stochastic. This contrasts with the recent developments in genetic programming (GP), where advanced informed approaches such as semantic-aware operators are incorporated to improve the search space exploration capability of GP. In this paper, we propose a semantically-oriented mutation operator (SOMO^k) suitable for the evolutionary design of combinational circuits. In contrast to standard point mutation modifying the values of the mutated genes randomly, the proposed operator uses semantics to determine the best value for each mutated gene. Compared to the common CGP and its variants, the proposed method converges on common Boolean benchmarks substantially faster while keeping the phenotype size relatively small. The successfully evolved instances presented in this paper include 10-bit parity, 10 + 10-bit adder and 5 × 5-bit multiplier. The most complex circuits were evolved in less than one hour with a single-thread implementation running on a common CPU.
Number of individuals with loops (y-axis) for each generation (x-axis). Each line represents a single run, and the incomplete lines represent runs that succeeded in finding a solution before hitting the generation limit of 300.
Total error values of individuals (x-axis) and whether or not they execute loops (y-axis). Note that the x-axis values for all the subplots are different.
In genetic programming, parent selection methods are employed to select promising candidate individuals from the current generation that can be used as parents for the next generation. These algorithms can affect, sometimes indirectly, whether or not individuals containing certain programming constructs, such as loops, are selected and propagated in the population. This in turn can affect the chances that the population will produce a solution to the problem. In this paper, we present the results of the experiments using three different parent selection methods on four benchmark program synthesis problems. We analyze the relationships between the selection methods, the numbers of individuals in the population that make use of loops, and success rates. The results show that the support for the selection of specialists is associated both with the use of loops in evolving populations and with higher success rates.
Graph representations promise several desirable properties for genetic programming (GP): multiple-output programs, natural representations of code reuse and, in many cases, an innate mechanism for neutral drift. Each graph GP technique provides a program representation, genetic operators and overarching evolutionary algorithm. This makes it difficult to identify the individual causes of empirical differences, both between these methods and in comparison to traditional GP. In this work, we empirically study the behaviour of Cartesian genetic programming (CGP), linear genetic programming (LGP), evolving graphs by graph programming and traditional GP. By fixing some aspects of the configurations, we study the performance of each graph GP method and GP in combination with three different EAs: generational, steady-state and (1 + λ). In general, we find that the best choice of representation, genetic operator and evolutionary algorithm depends on the problem domain. Further, we find that graph GP methods can increase search performance on complex real-world regression problems and, particularly in combination with the (1 + λ) EA, are significantly better on digital circuit synthesis tasks. We further show that the reuse of intermediate results by tuning LGP’s number of registers and CGP’s levels back parameter is of utmost importance and contributes significantly to better convergence of an optimization algorithm when solving complex problems that benefit from code reuse.
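For readers unfamiliar with the (1 + λ) EA named above, a minimal sketch follows. It runs on a bit-string toy problem (OneMax) rather than on any graph GP representation, and the function names, parameter values and tie-acceptance rule are assumptions for illustration. Accepting offspring of equal fitness is a common choice because it permits neutral drift.

```python
import random
from typing import Callable, List, Tuple

Genome = List[int]


def bit_flip(parent: Genome, rng: random.Random) -> Genome:
    """Flip each bit independently with probability 1 / len(parent)."""
    rate = 1.0 / len(parent)
    return [bit ^ 1 if rng.random() < rate else bit for bit in parent]


def one_plus_lambda(fitness: Callable[[Genome], int], parent: Genome,
                    lam: int = 4, generations: int = 300,
                    seed: int = 0) -> Tuple[Genome, int]:
    """Keep one parent; each generation create lam mutants and keep the best if not worse."""
    rng = random.Random(seed)
    best, best_fit = parent, fitness(parent)
    for _ in range(generations):
        offspring = [bit_flip(best, rng) for _ in range(lam)]
        champion = max(offspring, key=fitness)
        champion_fit = fitness(champion)
        if champion_fit >= best_fit:  # ties replace the parent, allowing neutral drift
            best, best_fit = champion, champion_fit
    return best, best_fit


# OneMax toy problem: fitness is the number of ones in the genome.
start = [0] * 20
best, best_fit = one_plus_lambda(sum, start)
```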
Reversible Cellular Automata (RCA) are a particular kind of shift-invariant transformations characterized by dynamics composed only of disjoint cycles. They have many applications in the simulation of physical systems, cryptography, and reversible computing. In this work, we formulate the search of a specific class of RCA – namely, those whose local update rules are defined by conserved landscapes – as an optimization problem to be tackled with Genetic Algorithms (GA) and Genetic Programming (GP). In particular, our experimental investigation revolves around three different research questions, which we address through a single-objective, a multi-objective, and a lexicographic approach. In the single-objective approach, we observe that GP can already find an optimal solution in the initial population. This indicates that evolutionary algorithms are not needed when evolving only the reversibility of such CA, and a more efficient method is to generate at random syntactic trees that define the local update rule. On the other hand, GA and GP proved to be quite effective in the multi-objective and lexicographic approach to (1) discover a trade-off between the reversibility and the Hamming weight of conserved landscape rules, and (2) observe that conserved landscape CA cannot be used in symmetric cryptography because their Hamming weight (and thus their nonlinearity) is too low.
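As a rough illustration of what reversibility means for a cellular automaton, the sketch below tests whether the global update map is a bijection on a small ring of cells. It uses elementary (radius-1, Wolfram-numbered) rules rather than the conserved-landscape rules studied in the paper, and the function names are assumptions. Passing the check for one ring size is only a necessary condition, not a proof of reversibility for every size.

```python
from itertools import product
from typing import Tuple

Config = Tuple[int, ...]


def step(config: Config, rule: int) -> Config:
    """Apply an elementary CA rule (Wolfram numbering) once with periodic boundaries."""
    n = len(config)
    return tuple(
        (rule >> (config[(i - 1) % n] << 2 | config[i] << 1 | config[(i + 1) % n])) & 1
        for i in range(n)
    )


def is_bijective_on_ring(rule: int, n: int) -> bool:
    """True if the 2**n configurations of an n-cell ring all have distinct images."""
    images = {step(config, rule) for config in product((0, 1), repeat=n)}
    return len(images) == 2 ** n


identity_ok = is_bijective_on_ring(204, 6)  # rule 204 copies each cell
shift_ok = is_bijective_on_ring(170, 6)     # rule 170 copies the right neighbour
xor_ok = is_bijective_on_ring(90, 6)        # rule 90 XORs the two neighbours
```

Rule 90 fails the check because the all-zeros and all-ones configurations both map to all zeros, so the map is not injective.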
For many systems of linear equations that arise from the discretization of partial differential equations, the construction of an efficient multigrid solver is challenging. Here we present EvoStencils, a novel approach for optimizing geometric multigrid methods with grammar-guided genetic programming, a stochastic program optimization technique inspired by the principle of natural evolution. A multigrid solver is represented as a tree of mathematical expressions that we generate based on a formal grammar. The quality of each solver is evaluated in terms of convergence and compute performance by automatically generating an optimized implementation using code generation that is then executed on the target platform to measure all relevant performance metrics. Based on this, a multi-objective optimization is performed using a non-dominated sorting-based selection. To evaluate a large number of solvers in parallel, they are distributed to multiple compute nodes. We demonstrate the effectiveness of our implementation by constructing geometric multigrid solvers that are able to outperform hand-crafted methods for Poisson’s equation and a linear elastic boundary value problem with up to 16 million unknowns on multi-core processors with Ivy Bridge and Broadwell microarchitecture.
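The selection step described above relies on non-dominated sorting. A minimal sketch of its core idea, extracting the first non-dominated front when every objective is minimised, could look as follows. The objective pairs (convergence factor, runtime) and their values are made up for illustration and are not taken from the paper.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points):
    """Return the points that no other point dominates (the first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical solvers scored as (convergence factor, runtime in seconds).
solvers = [(0.1, 5.0), (0.2, 3.0), (0.3, 4.0), (0.1, 6.0)]
front = non_dominated(solvers)
```

A full non-dominated sorting would repeat this extraction on the remaining points to obtain the second front, the third front, and so on.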
We introduce and experimentally demonstrate the utility of tag-based genetic regulation, a new genetic programming (GP) technique that allows programs to dynamically adjust which code modules to express. Tags are evolvable labels that provide a flexible mechanism for referencing code modules. Tag-based genetic regulation extends existing tag-based naming schemes to allow programs to “promote” and “repress” code modules in order to alter expression patterns. This extension allows evolution to structure a program as a gene regulatory network where modules are regulated based on instruction executions. We demonstrate the functionality of tag-based regulation on a range of program synthesis problems. We find that tag-based regulation improves problem-solving performance on context-dependent problems; that is, problems where programs must adjust how they respond to current inputs based on prior inputs. Indeed, the system could not evolve solutions to some context-dependent problems until regulation was added. Our implementation of tag-based genetic regulation is not universally beneficial, however. We identify scenarios where the correct response to a particular input never changes, rendering tag-based regulation an unneeded functionality that can sometimes impede adaptive evolution. Tag-based genetic regulation broadens our repertoire of techniques for evolving more dynamic genetic programs and can easily be incorporated into existing tag-enabled GP systems.
An example of a GP-REG individual
Introducing new decision boundaries by window shifts
Averages for RMSE on generalization per spatial severity
Averages for RMSE on generalization per temporal severity
Various machine learning techniques exist to perform regression on temporal data with concept drift occurring. However, there are numerous nonstationary environments where these techniques may fail to either track or detect the changes. This study develops a genetic programming-based predictive model for temporal data with a numerical target that tracks changes in a dataset due to concept drift. When an environmental change is evident, the proposed algorithm reacts to the change by clustering the data and then inducing nonlinear models that describe generated clusters. Nonlinear models become terminal nodes of genetic programming model trees. Experiments were carried out using seven nonstationary datasets and the obtained results suggest that the proposed model yields high adaptation rates and accuracy to several types of concept drifts. Future work will consider strengthening the adaptation to concept drift and the fast implementation of genetic programming on GPUs to provide fast learning for high-speed temporal data.
When dealing with a new time series classification problem, modellers do not know in advance which features could enable the best classification performance. We propose an evolutionary algorithm based on grammatical evolution to attain a data-driven feature-based representation of time series with minimal human intervention. The proposed algorithm can select both the features to extract and the sub-sequences from which to extract them. These choices not only impact classification performance but also allow understanding of the problem at hand. The algorithm is tested on 30 problems outperforming several benchmarks. Finally, in a case study related to subject authentication, we show how features learned for a given subject are able to generalise to subjects unseen during the extraction phase.
Genetic Network Programming (GNP) is a relatively recent evolutionary algorithm that extends Genetic Programming (GP); unlike GP, individuals in GNP have graph structures. The algorithm is mainly used in the decision-making process of agent control problems. It uses a graph to build a flowchart and uses this flowchart as the decision-making strategy that an agent must follow to achieve its goal. One of the most important weaknesses of this algorithm is that crossover and mutation break the structures of individuals during the evolution process. Although this can lead to better structures, it may also break suitable ones and increase the time needed to reach optimal solutions. Meanwhile, prior research in this field has tested GNP only in deterministic environments. However, most real-world problems are stochastic, and this is another issue that should be addressed. In this research, we seek a mechanism that allows GNP to perform better in stochastic environments. To achieve this goal, the evolution process of GNP was modified. In the proposed method, the experience of promising individuals was saved over consecutive generations. Then, to generate offspring in some predefined number of generations, the saved experiences were used instead of crossover and mutation. The experimental results of the proposed method were compared with GNP and some of its versions in both deterministic and stochastic environments. The results demonstrate the superiority of our proposed method in both deterministic and stochastic environments.
Example pipelines for model configurations used in this study. a Stacked NN strategy with no AutoML. b Example of a standard (no neural networks) TPOT pipeline containing a logistic regression classifier and a kernel SVM classifier. c Example of a TPOT-NN pipeline containing two multilayer perceptron estimators
Distributions of accuracy scores for TPOT deployed in various configurations on 6 well-studied public datasets. Each distribution consists of 30 experiments using the same initial TPOT configuration on the same dataset
CPU clock time distributions for training TPOT on each of the 6 evaluation datasets. In most cases, TPOT configurations containing PyTorch neural network estimators require longer to train than “base” TPOT configurations, with the overall effect scaling proportionally with the size of the dataset
Randomly selected pipeline learned when restricting TPOT’s pool of estimators to logistic regression classifiers only. Some redundant components, such as make_pipeline function calls, are omitted. Notably, the structure of this pipeline resembles one of the key components of the popular ResNet architecture, which suggests that other motifs learned by TPOT-NN may be possible to expand into deeper architectures
Automated machine learning (AutoML) and artificial neural networks (ANNs) have revolutionized the field of artificial intelligence by yielding incredibly high-performing models to solve a myriad of inductive learning tasks. In spite of their successes, little guidance exists on when to use one versus the other. Furthermore, relatively few tools exist that allow the integration of both AutoML and ANNs in the same analysis to yield results combining both of their strengths. Here, we present TPOT-NN—a new extension to the tree-based AutoML software TPOT—and use it to explore the behavior of automated machine learning augmented with neural network estimators (AutoML+NN), particularly when compared to non-NN AutoML in the context of simple binary classification on a number of public benchmark datasets. Our observations suggest that TPOT-NN is an effective tool that achieves greater classification accuracy than standard tree-based AutoML on some datasets, with no loss in accuracy on others. We also provide preliminary guidelines for performing AutoML+NN analyses, and recommend possible future directions for AutoML+NN methods research, especially in the context of TPOT.
It is crucial in the field of image steganography to find an algorithm for hiding information by using various combinations of compression techniques. The primary factors in this research are maximizing the capacity and improving the quality of the image. Image quality cannot be compromised beyond a certain level, because a visibly distorted image defeats the purpose of steganography. The second primary factor is maximizing the data-carrying/embedding capacity, which makes the use of this technique more efficient. In this paper, we propose an image steganography tool that uses Huffman Encoding and Particle Swarm Optimization to improve the performance of the information hiding scheme and its overall efficiency. The combined Huffman PSO (HPSO) technique not only offers higher information embedding capacity but also maintains the image quality. The experimental analysis and results on cover images with different sizes of secret messages validate that the proposed HPSO scheme achieves superior results on the metrics Peak Signal-to-Noise Ratio, Mean Square Error, Bit Error Rate, and Structural Similarity Index. It is also robust against statistical attacks.
Designing a Recurrent Neural Network to extract sentiment from tweets is a very hard task. When using memory cells in their design, the task becomes even harder due to the large number of design alternatives and the costly process of finding a performant design. In this paper we propose an original evolutionary algorithm to address the hard challenge of discovering novel Recurrent Neural Network memory cell designs for sentiment analysis on tweets. We used three different tasks to discover and evaluate the designs. We conducted experiments and the results show that the best obtained designs surpass the baselines—which are the most popular cells, LSTM and GRU. During the discovery process we evaluated roughly 17,000 cell designs. The selected winning candidate outperformed the others for the overall sentiment analysis problem, hence showing generality. We made the winner selection by using the cumulated accuracies on all three considered tasks.
Modifying standard gradient boosting by replacing the embedded weak learner in favor of a strong(er) one, we present SyRBo: symbolic-regression boosting. Experiments over 98 regression datasets show that by adding a small number of boosting stages—between 2 and 5—to a symbolic regressor, statistically significant improvements can often be attained. We note that coding SyRBo on top of any symbolic regressor is straightforward, and the added cost is simply a few more evolutionary rounds. SyRBo is essentially a simple add-on that can be readily added to an extant symbolic regressor, often with beneficial results.
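The boosting loop described above can be sketched generically: each stage fits a new model to the residuals of the ensemble so far, and the final prediction is the sum of all stage models. The sketch below is an assumption-based illustration rather than the SyRBo implementation. In particular, `fit_stump` is a deliberately simple stand-in for a symbolic regressor, and plain residual fitting without a learning rate is assumed.

```python
def fit_stump(xs, residuals):
    """Stand-in base learner (assumption): best single-threshold piecewise-constant fit."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right) if right else lm
        err = sum((r - (lm if x <= t else rm)) ** 2 for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, fit, stages=3):
    """Fit `stages` models in sequence, each on the residuals left by the previous ones."""
    models = []
    residuals = list(ys)
    for _ in range(stages):
        model = fit(xs, residuals)
        models.append(model)
        residuals = [r - model(x) for x, r in zip(xs, residuals)]
    return lambda x: sum(m(x) for m in models)


def mse(model, xs, ys):
    """Mean squared error of `model` on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = list(range(10))
ys = [x * x for x in xs]
err_one_stage = mse(boost(xs, ys, fit_stump, stages=1), xs, ys)
err_three_stages = mse(boost(xs, ys, fit_stump, stages=3), xs, ys)
```

Any regressor exposing a comparable fit-then-predict interface could be passed as `fit`, which is what makes this kind of add-on easy to layer on an existing learner.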
This paper describes a new modification of fuzzy cognitive maps (FCMs) for the modeling of autonomous entities that make decisions in a dynamic environment. The paper offers a general design for an FCM adjusted for the decision-making of autonomous agents through the categorization of its concepts into three different classes according to their purpose in the map: Needs, Activities, and States (FCM-NAS). The classification enables features supporting decision-making, such as the easy processing of input from sensors, faster system reactions, the modeling of inner needs, the adjustable frequency of computations in a simulation, and self-evaluation of the FCM-NAS that supports unsupervised evolutionary learning. This paper presents two use cases of the proposed extension to demonstrate its abilities. First, it was implemented in an agent-based artificial life model, where it took advantage of all the above features in the competition for resources, natural selection, and evolution. Second, it was used as the decision-making component for human activity simulation in an ambient intelligence model, where it was combined with a scenario-oriented mechanism, demonstrating its modularity.
Supervised learning by means of Genetic Programming (GP) aims at the evolutionary synthesis of a model that achieves a balance between approximating the target function on the training data and generalising on new data. The model space searched by the Evolutionary Algorithm is populated by compositions of primitive functions defined in a function set. Since the target function is unknown, the choice of function set’s constituent elements is primarily guided by the makeup of function sets traditionally used in the GP literature. Our work builds upon previous research of the effects of protected arithmetic operators (i.e. division, logarithm, power) on the output value of an evolved model for input data points not encountered during training. The scope is to benchmark the approximation/generalisation of models evolved using different function set choices across a range of 43 symbolic regression problems. The salient outcomes are as follows. Firstly, Koza’s protected operators of division and exponentiation have a detrimental effect on generalisation, and should therefore be avoided. This result is invariant of the use of moderately sized validation sets for model selection. Secondly, the performance of the recently introduced analytic quotient operator is comparable to that of the sinusoidal operator on average, with their combination being advantageous to both approximation and generalisation. These findings are consistent across two different system implementations, those of standard expression-tree GP and linear Grammatical Evolution. We highlight that this study employed very large test sets, which create confidence when benchmarking the effect of different combinations of primitive functions on model generalisation. Our aim is to encourage GP researchers and practitioners to use similar stringent means of assessing generalisation of evolved models where possible, and also to avoid certain primitive functions that are known to be inappropriate.
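For readers unfamiliar with the operators compared above, the sketch below contrasts Koza-style protected division, which returns 1 when the denominator is zero, with the analytic quotient, defined as a / sqrt(1 + b²). The function names are chosen for illustration. The point is that protected division is discontinuous and can still explode for denominators near zero, whereas the analytic quotient is smooth and bounded by |a|.

```python
import math


def protected_div(a, b):
    """Koza-style protected division: return 1 when the denominator is zero."""
    return 1.0 if b == 0 else a / b


def analytic_quotient(a, b):
    """Analytic quotient: a / sqrt(1 + b^2), defined and smooth for every b."""
    return a / math.sqrt(1.0 + b * b)


# A denominator close to zero, as might occur on unseen input data.
near_zero = 1e-9
pd_value = protected_div(1.0, near_zero)      # very large magnitude
aq_value = analytic_quotient(1.0, near_zero)  # stays bounded by |a|
```

Large outputs such as `pd_value` on inputs not seen during training are one plausible way in which protected division can hurt generalisation.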
Intraday trading attempts to obtain a profit from the microstructure implicit in price data. Intraday trading implies many more transactions per stock compared to long-term buy-and-hold strategies. As a consequence, transaction costs will have a more significant impact on profitability. Furthermore, the application of existing long-term portfolio selection algorithms to intraday trading cannot guarantee optimal stock selection. This implies that intraday trading strategies may require a different approach to stock selection for daily portfolios. In this work, we assume a symbiotic genetic programming framework that simultaneously coevolves the decision trees and technical indicators to generate trading signals. We generalize this approach to identify specific stocks for intraday trading using stock ranking heuristics: Moving Sharpe ratio and a Moving Average of Daily Returns. Specifically, the trading scenario adopted by this work assumes that a bag of available stocks exists. Our agent then has to identify both which subset of stocks to trade in the next trading day and the specific buy-hold-sell decisions for each selected stock during real-time trading for the duration of the intraday period. A benchmarking comparison of the proposed ranking heuristics with stock selection performed using the well-known Kelly Criterion is conducted, and a strong preference for the proposed Moving Sharpe ratio is demonstrated. Moreover, portfolios ranked by both the Moving Sharpe ratio and a Moving Average of Daily Returns perform significantly better than any of the comparator methods (buy-and-hold strategy, investment in the full set of 86 stocks, portfolios built from random stock selection and Kelly Criterion).
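A ranking heuristic of the kind named above can be sketched as follows: compute a Sharpe-like ratio over the most recent window of daily returns for each stock, then keep the top-ranked stocks for the next trading day. This is an illustrative sketch under stated assumptions (zero risk-free rate, no annualisation, sample standard deviation); the exact definition used in the paper may differ, and the tickers and returns are invented.

```python
from statistics import mean, stdev


def moving_sharpe(returns, window):
    """Mean over standard deviation of the last `window` daily returns (assumed definition)."""
    recent = returns[-window:]
    sd = stdev(recent)
    return mean(recent) / sd if sd > 0 else 0.0


def rank_stocks(daily_returns, window, k):
    """Return the k stock names with the highest moving Sharpe ratio."""
    scores = {name: moving_sharpe(r, window) for name, r in daily_returns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical daily returns for three made-up tickers.
daily_returns = {
    "AAA": [0.010, 0.012, 0.011, 0.009, 0.010],   # steady small gains
    "BBB": [0.050, -0.040, 0.060, -0.050, 0.030],  # volatile
    "CCC": [-0.010, -0.012, -0.008, -0.011, -0.009],  # steady losses
}
selected = rank_stocks(daily_returns, window=5, k=2)
```

Replacing `moving_sharpe` with a plain `mean` over the window would give a moving average of daily returns, the second heuristic mentioned in the abstract.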
Complexity-performance plots (left) and box plots of training and testing errors (right) for the Koza-1 dataset. Legend: individual runs of GPTIPS, mGPTIPS, EFS, FFX, GSGP-Red, median RMSE of — LR, - - - RF, ⋯ SVR (Color figure online)
Complexity-performance plots (both left) and box plots of training and testing errors (both right) for the Korns-11 dataset. The upper plots display the whole results, the lower ones zoom on the dense area around RMSE = 7.8. Legend: individual runs of GPTIPS, mGPTIPS, EFS, FFX, GSGP-Red, median RMSE of — LR, - - - RF, ⋯ SVR (Color figure online)
Complexity-performance plots (left) and box plots of training and testing errors (right) for the S1 dataset. FFX has only a single point because both the sampling of this dataset and FFX are deterministic. Legend: individual runs of GPTIPS, mGPTIPS, EFS, FFX, GSGP-Red, median RMSE of — LR, - - - RF, ⋯ SVR (Color figure online)
Complexity-performance plots (both left) and box plots of training and testing errors (both right) for the S2 dataset. FFX has only a single point because both the sampling of this dataset and FFX are deterministic. The upper plots display the whole results, the lower ones zoom on the dense area around RMSE = 1. Legend: individual runs of GPTIPS, mGPTIPS, EFS, FFX, GSGP-Red, median RMSE of — LR, - - - RF, ⋯ SVR (Color figure online)
Symbolic regression (SR) is a powerful method for building predictive models from data without assuming any model structure. Traditionally, genetic programming (GP) was used as the SR engine. However, for these purely evolutionary methods it was quite hard even to fit the function to the range of the data, and training was consequently inefficient and slow. Recently, several SR algorithms have emerged that employ multiple linear regression. This allows the algorithms to create models with relatively small error right from the beginning of the search. Such algorithms are claimed to be orders of magnitude faster than SR algorithms based on classic GP. However, a systematic comparison of these algorithms on a common set of problems is still missing and there is no basis on which to decide which algorithm to use. In this paper we conceptually and experimentally compare several representatives of such algorithms: GPTIPS, FFX, and EFS. We also include GSGP-Red, which is an enhanced version of geometric semantic genetic programming, an important algorithm in the field of SR. They are applied as off-the-shelf, ready-to-use techniques, mostly using their default settings. The methods are compared on several synthetic SR benchmark problems as well as real-world ones ranging from civil engineering to aerodynamics and acoustics. Their performance is also related to the performance of three conventional machine learning algorithms: multiple regression, random forests and support vector regression. The results suggest that across all the problems, the algorithms have comparable performance. We provide basic recommendations to the user regarding the choice of the algorithm.
Top-cited authors
John R. Koza
  • Stanford University
Sabine Dormann
Andreas Deutsch
  • Technische Universität Dresden
Julian Francis Miller
  • The University of York
Nuno Lourenço
  • University of Coimbra