Project

Novelty Search optimization for NeuroEvolution

Goal: Study of Novelty Search optimization applied with the NeuroEvolution of Augmenting Topologies (NEAT) method to solve deceptive control optimization tasks by building specialized artificial neural networks. The method is studied by applying it to maze solving by a learning agent.

Updates: 13 · Recommendations: 2 · Followers: 9 · Reads: 273

Project log

Iaroslav Omelianenko
added an update
The source code and input data used in the experiment are available on GitHub: https://github.com/yaricom/goNEAT_NS
Feel free to use it for further experiments with neuro-evolution and the implementation of explainable AI systems.
 
Iaroslav Omelianenko
added a research item
The search for novelty is a universal mechanism of biological evolution. We decided to apply it to breeding autonomous artificial intelligent agents, using a neuro-evolution algorithm to conduct the evolutionary process.
Iaroslav Omelianenko
added an update
In this work we tested two approaches to fitness function optimization with the NEAT algorithm: novelty search and objective-based. Novelty search optimization was found to outperform the objective-based method on deceptive tasks where strong local optima are present, such as maze solving. Our experiments are based on two maze environment configurations: a medium and a hard maze.
Medium maze results
With the medium maze configuration, both fitness function optimization methods were able to produce agents that solve the maze:
  • the Novelty Search based agent solved the maze in 10 of 10 trials
  • the Objective-Based agent solved the medium maze in 9 of 10 trials
Novelty search optimization also produced a more energy-efficient and elegant genome for the solver agent. The absolute winner with NS optimization has only 15 neurons with 19 links between them (fitness: 0.984), compared to objective-based optimization, where the best agent has 60 neurons with 214 links (fitness: 0.987). The fitness value describes how close the agent's final position is to the maze exit after 400 time steps (1.0 means an exact match). Full statistics of the experiment are provided in the attached file.
Hard maze results
With the hard maze configuration, the objective-based optimization method failed to produce any agent able to solve the maze. At the same time, Novelty Search based optimization was able to avoid the deceptive strong local optima introduced in the hard maze and produced effective solver agents in fewer than 300 generations over the same ten trial runs.
Conclusion
As shown by the experimental data, Novelty Search optimization, where an agent's fitness is based on the novelty of the solution it finds, considerably outperforms traditional objective-based optimization and was even able to solve a task where the traditional method failed completely.
We believe that novelty search optimization can be successfully applied to produce optimal solver agents in many areas where strong deceptive local fitness optima block traditional objective-based methods from finding an optimal solution, or any solution at all.
The full experiment details and source code can be found at: https://github.com/yaricom/goNEAT_NS
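As a minimal illustration of the novelty metric described above (a Python sketch, not the project's actual Go implementation), a behavior can be scored by its sparseness: the average distance to its k nearest neighbors among the behaviors recorded so far. Here a behavior is simply the agent's final (x, y) position:

```python
import math

def novelty_score(behavior, archive, k=15):
    """Novelty as sparseness: the mean distance to the k nearest
    behaviors recorded so far. High score = unexplored region."""
    dists = sorted(math.dist(behavior, other) for other in archive)
    nearest = dists[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0

# Behaviors are the final (x, y) positions of maze agents.
archive = [(1.0, 1.0), (1.5, 1.2), (9.0, 8.0)]
print(round(novelty_score((1.1, 1.0), archive, k=2), 4))  # 0.2736
```

An agent whose final position lies far from everything in the archive scores high and is rewarded, regardless of how close it came to the exit; that is what lets the search escape deceptive local optima.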
 
Iaroslav Omelianenko
added an update
After running the maze solver against the hard maze configuration with the objective-based fitness function optimization algorithm, we found that it failed to solve the maze in ten consecutive trials. At the same time, the novelty search based fitness function was able to produce maze solvers that crack the hard maze configuration with the same ease as the medium one.
Our experiment demonstrates that Novelty Search based optimization is able to avoid the deceptive strong local optima introduced in the hard maze and produce effective solver agents in fewer than 300 generations over the same ten trial runs.
Attached are renderings of the failed trials, with final solver agent positions marked by colored dots. The color of a dot depends on the species the solver organism belongs to. The top part depicts organisms with fitness greater than 0.8. It can be seen that no organism from all the tested populations was able to exceed this threshold.
 
Iaroslav Omelianenko
added an update
In this experiment we evaluated the performance of a maze agent controlled by an ANN created by the NEAT algorithm with objective-based fitness optimization. This optimization maximizes the solver agent's fitness by following its objective, i.e. the distance from the agent to the exit. As in the previous experiment, the behavior of a navigator is defined as its final position in the maze. The fitness function is then based on the squared Euclidean distance between the agent's final position and the maze exit.
The effect of this fitness function is to reward the solver agent for ending as close to the maze exit as possible.
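A hedged sketch of such an objective-based fitness (the function name and the normalization by a maximum distance are illustrative assumptions, not the repository's exact code): the raw error is the squared Euclidean distance to the exit, mapped so that 1.0 means the agent ended exactly at the exit:

```python
def objective_fitness(final_pos, exit_pos, max_dist):
    """Objective-based fitness: the closer the agent ends to the
    exit, the higher the fitness. max_dist is the largest possible
    distance in the maze, used to normalize into [0, 1]."""
    dx = final_pos[0] - exit_pos[0]
    dy = final_pos[1] - exit_pos[1]
    sq_dist = dx * dx + dy * dy            # squared Euclidean distance
    return 1.0 - (sq_dist ** 0.5) / max_dist

print(objective_fitness((10.0, 10.0), (10.0, 10.0), max_dist=100.0))  # 1.0
```

Because this reward depends only on proximity to the exit, any dead end that is spatially close to the exit becomes a strong local optimum, which is exactly the deception the hard maze exploits.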
The experiment results with the medium-difficulty maze:
Average
  • Winner Nodes: 22.0
  • Winner Genes: 49.0
  • Winner Evals: 62168.0
Mean
  • Complexity: 44.6
  • Diversity: 27.5
  • Age: 222.0
After 248 generations, a near-optimal winner genome configuration was found, able to guide the maze solver agent through the medium maze and approach the maze exit with a spatial error of 1.8%. The artificial neural network produced by this genome has 22 units (neurons), with nine hidden neurons to model the complex learned behavior.
The genotype of the winning agent presented above has a more complicated structure than the near-optimal genome created by Novelty Search based optimization in the first experiment, with more redundant neurons and links. Due to the added complexity, the produced organism is less energy efficient and harder to execute at inference time.
Comparing it with the simulation based on Novelty Search optimization, it can be seen that the agents' final destinations are less evenly distributed through the maze space, and some areas are left completely unexplored.
 
Iaroslav Omelianenko
added an update
Here is a visualization of hard maze solving by all agents until the winner is found. The initial agent position is at the bottom-left and the maze exit at the top-left of the maze. The agents are color coded based on the species they belong to, so each dot of a similar color is the final position of an agent controlled by an organism belonging to the same species.
The top plot shows the final destinations of the most fit agents (fitness >= 0.8), and the bottom plot shows the rest. The fitness of an agent is measured as the distance from its final position to the maze exit after 400 time steps of simulation.
From the plot we can see that the winner species produced organisms that control agents in such a way that their final destinations are evenly distributed through the maze. As a result, it was possible to produce a control ANN able to solve the maze.
 
Iaroslav Omelianenko
added an update
By applying Novelty Search optimization with the NEAT algorithm, we were able to find a near-optimal configuration of an artificial neural network able to control the hard maze solving agent.
After 109 generations, a near-optimal winner genome configuration was found, able to guide the maze solver agent through the hard maze and approach the maze exit with a spatial error of 2.5%. The artificial neural network produced by this genome has only 17 units (neurons), with four hidden neurons to model the complex learned behavior.
The optimal genome configuration was produced by growing three additional hidden units and multiple new links compared to the seed genome. It is interesting to note that the recurrent link at output neuron #13 (angular velocity effector) was routed through two hidden neurons, in contrast with the medium maze, where neuron #13 was simply linked to itself. This may result in more complex learned behavior, especially considering that the link passes through neuron #42, which is affected by the LEFT range finder and the BACK radar. Neuron #42 is also affected by a connection with neuron #643 (affected by the LEFT range finder). As a result, we may assume that the network learned how to steer the agent when the maze exit is behind it and a wall is to its left, i.e. to follow the left wall by moving forward.
Another important point concerns the possible learned behavior encoded by hidden neuron #297: it is affected by the input range finder sensors detecting the distance to obstacles in the RIGHT and FRONT directions. Looking at the maze configuration, we may assume that this neuron learned to avoid the left chamber trap with its extremely strong local optimum of fitness based on the distance to the maze exit.
 
Iaroslav Omelianenko
added an update
We have conducted a series of experiments and found another optimal genome configuration, with only 16 nodes, after 64 generations.
It is interesting to examine the plot of final destinations of maze solver agents controlled by ANNs generated from the population of organisms during evolution. We visualized it by color coding agents depending on the species their source organism belongs to. The fitness of an agent is measured as the relative distance between its final destination and the maze exit after running the simulation for a set number of time steps (400 in our setup).
The initial agent position is at the top-left corner, marked with a green circle, and the maze exit at the bottom-right, marked with a red circle.
The top plot shows the final destinations of the most fit agents (fitness >= 0.8), and the bottom plot shows the rest. The results are given for the experimental run with the winner genome configuration presented above. That experiment produced 32 species, of which the most fit amounted to eight.
It can be seen that the most fit and less fit species demonstrate similar behavior, mostly examining the area nearest the initial agent position. But the most fit agent, at a particular stage of evolution, was able to make a leap and break out of the local optimum trap by investigating farther areas of the map. As a result of this behavior, one of the fittest species was able to produce an organism that solves the maze and finds the exit.
The solution was found after only 16,111 agent evaluations, which is very fast compared to error backpropagation based methods, which require hundreds of thousands of evaluations to find a solution in a similar setup.
 
Iaroslav Omelianenko
added an update
After 281 generations, a near-optimal winner genome configuration was found, able to control the maze solver agent. The artificial neural network produced by this genome has only 17 units (neurons), with three hidden neurons.
During the experiment, novelty search optimization resulted in growing three additional hidden units (neurons) and introducing a recurrent link at one of the output neurons (#13). The recurrent link at the output neuron seems to be extremely important, as it is introduced in each winner genome configuration generated by the solution.
The resulting genome was able to solve the maze and find the exit with a spatial error of about 0.8% at the exit point.
 
Iaroslav Omelianenko
added an update
The maze solving agent has six range finder sensors, four slice radar sensors, and two effectors controlling linear and angular velocity.
Thus the seed genome of the maze solver agent needs to have the following configuration:
  • ten input (sensor) neurons: six for the range finders plus four for the slice radar sensors (blue)
  • two output neurons: effectors controlling linear and angular velocity (red)
  • one hidden neuron to introduce non-linearity (green)
  • one bias neuron to avoid over-saturation when the input neurons are not activated (yellow)
The directed graph of the neural network is attached.
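The seed configuration above can also be written down as data. This is a hypothetical Python representation for illustration only (the sensor names and radar slice labels are assumptions), not the goNEAT_NS genome file format:

```python
# Hypothetical seed-genome description matching the list above:
# ten sensors plus one bias feed one hidden neuron and two effectors.
seed_genome = {
    "inputs": [f"range_finder_{i}" for i in range(6)]
              + [f"radar_{d}" for d in ("FRONT", "LEFT", "BACK", "RIGHT")],
    "bias": ["bias"],
    "hidden": ["hidden_0"],
    "outputs": ["linear_velocity", "angular_velocity"],
}

total_neurons = sum(len(group) for group in seed_genome.values())
print(total_neurons)  # 10 inputs + 1 bias + 1 hidden + 2 outputs = 14
```

Counting all groups gives the 14 neurons of the seed network; evolution then adds hidden units and links on top of this minimal starting structure.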
 
Iaroslav Omelianenko
added an update
It was found that by reducing the drop-off age of the population's species, it is possible to stimulate the generation of more optimal winner solutions. This can be explained by the fact that younger, more novel species have less complexity (nodes plus links) than older ones, so different, less complex topologies are examined.
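As an illustrative sketch (the function and parameter names are hypothetical, not goNEAT_NS's actual API), the drop-off age acts as a stagnation limit: a species whose best fitness has not improved for that many generations is penalized, shifting reproduction toward younger, less complex topologies:

```python
def is_stagnant(generations_since_improvement, dropoff_age):
    """A species whose best fitness has not improved for dropoff_age
    generations is marked stagnant and penalized at reproduction."""
    return generations_since_improvement >= dropoff_age

# Lowering the drop-off age culls stagnant species sooner:
print(is_stagnant(20, dropoff_age=50))  # False: species survives
print(is_stagnant(20, dropoff_age=15))  # True: species penalized
```

With a smaller drop-off age, old species that plateau are removed earlier, freeing population slots for fresh, simpler topologies.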
 
Iaroslav Omelianenko
added an update
With the initial implementation of Novelty Search optimization, it was possible to build an artificial neural network controlling an agent able to solve the normal maze within approximately 300 generations of organisms.
The winner configuration consists of 20 units interconnected by 66 links. The average complexity of the generated ANNs is 53.8, with an average diversity (number of species) of about 34.9. Finding the winner genome required 82,813 evaluations.
Average
  • Winner Nodes: 20.0
  • Winner Genes: 66.0
  • Winner Evals: 82813.0
Mean
  • Complexity: 53.8
  • Diversity: 34.9
  • Age: 332.9
The winner genome configuration and the novelty points archive data are attached. The 'record.dat' file contains records of the maze simulation results for each evaluated organism. The full generations of organisms at steps of 100 epochs, as well as the winner generation, are included as well.
 
Iaroslav Omelianenko
added a project goal
Study of Novelty Search optimization applied with the NeuroEvolution of Augmenting Topologies (NEAT) method to solve deceptive control optimization tasks by building specialized artificial neural networks. The method is studied by applying it to maze solving by a learning agent.