
Instinct and Learning Synergy in Simulated Foraging Using a Neural Network.


Abstract

Instinct and experience are shown to form a potent combination to achieve effective foraging in a simulated environment. A neural network capable of evolving instinct-related neurons and learning from experience is used as the brain of a simple foraging creature that must find food and water in a 3D block world. Instincts provide basic tactics for unsupervised exploration of the world, allowing pathways to food and water to be learned. The combination of both instinct and experience was found to be more effective than either alone. As a comparison, neural network learning also proved superior to Q-Learning on the foraging task.
Thomas E. Portegys
School of Information Technology, Illinois State University
1. Introduction
Foraging is an essential activity for many species,
including some human societies. It thus also provides a
valuable test bed for behavioral simulation with an aim
toward artificial animal intelligence. In some organisms
foraging consists of a combination of instinctive and
learned behaviors. For example, honey bees will search
their environment for nectar sources, learning their
locations through visual cues and communicating this
information to other bees [11]. Ants also forage and use
pheromone signals to mark the location of food sources in
the environment for further exploitation. The
computational field of Ant Colony Optimization (ACO)
[2,3] is largely based on this phenomenon.
Due to their similarity to natural nervous systems,
artificial neural networks seem the most fruitful means of
achieving generalized systems from the solutions of
specific problems like foraging. Thus a neural network
was chosen for the foraging task. Over the past 15 years a
number of other studies have examined foraging and
related problems with neural networks. Zhou and Shen
[12] constructed a system that allows foraging “bugs” to
learn an environment containing food, obstacles, and
competing bugs. Their network learned from exposure to
two-epoch trial cases that shaped an abstract force field
gradient which in turn allowed test cases to be
categorized with similar trials. Erdur and Güngör [4]
follow a theme similar to this project in utilizing a
combination of evolution with a genetic algorithm and
experiential Hebbian learning to modify the neural
network configuration to produce effective foraging as
well as other behaviors. Nolfi and Parisi [5] pointed out
that although evolution is a good way to get reasonable
initial behavior, learning is indispensable to adapt to
specific and changing conditions. Mazes have also been
employed as an environment to investigate goal-seeking
learning in artificial neural networks [1,10].
This study is believed to be novel in the way it
uses instinctive behavior as a means of training
experiential learning. This is a plausible counterpart to the
way simple animals learn, and is therefore a useful
approach to simulating them. The way this works in the
foraging task is as follows: instincts guide the creature to
effectively explore its environment, producing a stream of
stimuli and responses which are then incorporated into
new neurons recording pathways in the environment.
These learned neurons are reinforced by consistent
repetition as well as by association with the acquisition of
food and water goals. Over trials the learned network
often overrides instincts to guide the creature directly
along paths to food and water.
The Mona goal-seeking neural network was used for
this task. Mona has been shown to be capable of
supporting instinct evolution to solve the Monkey and
Bananas Problem [7], as well as effectively learning
mazes requiring the retention of context information over
time [6].
Q-Learning [9], a well-known reinforcement learning
technique that is amenable to stimulus-response search
space tasks, was used as a comparison to the neural
network.
1.1. A brief overview of Mona
Mona is based on the rationale that brains are goal-
seeking entities. It has a simple interface with the
environment: all knowledge of the state of the
environment is absorbed through senses. Responses are
expressed to the environment with the goal of eliciting
sensory inputs which are internally associated with the
reduction of needs.
Events can be drawn from sensors, responses, or the
states of internal neurons, calling for three types of
neurons. Neurons attuned to sensors are receptors, those
associated with responses are motors, and those mediating
other neurons are mediators. Mediators can be structured
in hierarchies representing environmental contexts. A
mediator neuron controls the transmission of need
through and the enablement of its component neurons.
To elucidate by example, consider this somewhat
whimsical task: let Mona be a mouse that has been out
foraging in a house and now wishes to return back to her
mouse-hole in a certain room. For the sake of keeping
peace with her fellow mice, she must not make the
mistake of going into a hole in another room. Figure 1
shows her neural network at this juncture.
Figure 1. Initial mouse network
The triangle-shaped object at the bottom is the receptor
neuron that fires once she has reached her hole; the
inverted triangles are motor neurons that accomplish the
responses of going to the correct room (Go Room), and
going into the hole (Go Hole). The ellipses are mediator
neurons. Each is linked up to a cause and effect event
neuron. The “Hole Ready” mediator is not enabled,
reflecting the importance of not going into a hole in the
wrong room. The “Room Ready” mediator is enabled,
signifying an expectation that if its cause event fires, its
effect will also fire.
The “Home!” receptor neuron has a high goal value,
indicating that it is associated with a need. Because of
this, motive influence propagates into the network,
flowing into motor neurons whose firings will navigate to
the goal. Since the “Hole Ready” neuron is not enabled,
the motive bypasses the “Go Hole” motor neuron in
search of a mediator whose firing will enable “Go Hole”.
Since “Hole Ready” is an effect of “Room Ready”, the
motive flows into the “Go Room” motor via the enabled “Room
Ready” mediator and causes it to fire (double outline). The
flow of motive illustrates how mediators representing
contexts work together. The appropriate context for “Hole
Ready” is “Room Ready”, which means that the latter
should necessarily contribute something to the former in
order to enable it. This something is called a wager. A
wager temporarily modifies the enablement of a mediator
that is the effect event of another mediator. It is called a
wager because the base-level enablement of the wagering
mediator will be evaluated based on subsequent firing of
the effect neuron.
In Figure 2 the “Go Room” cause firing can be
understood as a conditional probability event: given that
Mona is in the correct room (“Room Ready”), she is quite
certain that she can go into her own hole. This is
accomplished by a wager from “Room Ready”, triggered
by “Go Room”, that boosts the enablement of “Hole
Ready”. After this enablement occurs, motive flows into
the “Go Hole” motor neuron, causing it to fire.
Subsequently Mona senses that she is home in her
mouse-hole.
Figure 2. Final mouse network
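The mouse example can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not Mona's actual C++ implementation: enablement is reduced to a boolean, a wager simply switches the effect mediator on, and the names Neuron, Mediator, drive, and fire are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Neuron:
    """A receptor or motor neuron, identified only by its label."""
    name: str


@dataclass(eq=False)
class Mediator(Neuron):
    """Links a cause event to an effect event; may itself be an effect."""
    cause: Optional[Neuron] = None
    effect: Optional[Neuron] = None
    enabled: bool = False


def drive(goal, mediators):
    """Return the motor neuron that motive reaches when seeking `goal`."""
    for m in mediators:
        if m.effect is goal:
            if m.enabled:
                return m.cause          # motive flows into the cause motor
            return drive(m, mediators)  # bypass: seek a context that enables m
    return None


def fire(motor, mediators):
    """Fire a motor; enabled mediators it causes wager on mediator effects."""
    for m in mediators:
        if m.cause is motor and m.enabled and isinstance(m.effect, Mediator):
            m.effect.enabled = True     # simplified wager: boost enablement


home = Neuron("Home!")
go_room, go_hole = Neuron("Go Room"), Neuron("Go Hole")
hole_ready = Mediator("Hole Ready", cause=go_hole, effect=home, enabled=False)
room_ready = Mediator("Room Ready", cause=go_room, effect=hole_ready, enabled=True)
network = [hole_ready, room_ready]

first = drive(home, network)    # motive bypasses Go Hole and reaches Go Room
fire(first, network)            # the wager enables Hole Ready
second = drive(home, network)   # motive now reaches Go Hole
print(first.name, "->", second.name)
```

In this reduced form the wager is permanent; in the description above it is a temporary modification whose outcome is later used to evaluate the wagering mediator.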
2. Description
A muzz is a creature that lives in a 3D block world
such as that shown in Figure 3. The right panel of the
display shows a top view of the world, and the left panel
shows the muzz in the upper left corner of the world as
viewed by another muzz facing it. Blocks are randomly
marked with letters of the English alphabet. Striped ramps
also may lead to platforms of various heights. A
mushroom (a small circular shape from the top view) and
a pool (a larger circular shape) also appear somewhere in
the world as food and water sources for the muzz.
Figure 3. A muzz world
A muzz must forage for mushrooms and water in this
world. A muzz has the following sensory capabilities: 3
sensors for detecting if the way is open to move in the
forward, right, and left directions; a sensor to detect the
terrain in the forward direction: { platform, wall, drop off,
ramp up, ramp down }; and an object sensor for detecting
objects in the forward direction: { mushroom, pool, muzz,
empty, <block letter> }. Its response repertoire consists
of: wait, move forward, turn right or left, eat, and drink.
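As an illustration, the sensory and response interface listed above might be represented as follows. The type and field names (Terrain, Response, Sensors, as_state) are hypothetical and chosen only for this sketch; they are not taken from the Mona source code.

```python
from dataclasses import dataclass
from enum import Enum


class Terrain(Enum):
    PLATFORM = "platform"
    WALL = "wall"
    DROP_OFF = "drop off"
    RAMP_UP = "ramp up"
    RAMP_DOWN = "ramp down"


class Response(Enum):
    WAIT = "wait"
    FORWARD = "move forward"
    TURN_RIGHT = "turn right"
    TURN_LEFT = "turn left"
    EAT = "eat"
    DRINK = "drink"


@dataclass(frozen=True)
class Sensors:
    forward_open: bool   # way open to move forward
    right_open: bool     # way open to the right
    left_open: bool      # way open to the left
    terrain: Terrain     # terrain in the forward direction
    obj: str             # "mushroom", "pool", "muzz", "empty", or a block letter

    def as_state(self):
        """Flatten the sensor readings into a hashable stimulus tuple."""
        return (self.forward_open, self.right_open, self.left_open,
                self.terrain.value, self.obj)


reading = Sensors(True, False, True, Terrain.RAMP_UP, "mushroom")
print(reading.as_state())
```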
A muzz also has 3 needs: food, water, and foraging.
The forage need is based on a fraction of the maximum
food or water need, which means when they are satisfied
the muzz has no need of foraging. Initially, all of the
needs are positive, meaning that they may compete to
drive the muzz’s responses. In other words, a learned path
to a pool may “vie” with a different path to a mushroom.
By attenuating need-derived motives as they drive
through the network, the path to the closest goal will be
preferred. This assumes that the needs for water and
food are equal, which is the case in this study. Once a
need has been satisfied, e.g. by drinking water, only
motives associated with other positive needs will drive
the network toward goals satisfying those needs.
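The preference for the closest goal can be illustrated with a small sketch. It assumes, purely for illustration, that motive is attenuated geometrically by a fixed factor per step along a learned path; the factor 0.9, the forage fraction 0.5, and all function names are hypothetical.

```python
def forage_need(food_need, water_need, fraction=0.5):
    """Forage need as a fraction of the larger of the food and water needs."""
    return fraction * max(food_need, water_need)


def attenuated_motive(need, steps, attenuation=0.9):
    """Motive remaining after driving through `steps` links of a learned path."""
    return need * attenuation ** steps


def preferred_goal(needs, path_lengths, attenuation=0.9):
    """Pick the goal whose path delivers the strongest motive.

    Goals whose need is already satisfied (zero) exert no motive.
    """
    motives = {
        goal: attenuated_motive(needs[goal], steps, attenuation)
        for goal, steps in path_lengths.items()
        if needs.get(goal, 0.0) > 0.0
    }
    return max(motives, key=motives.get) if motives else None


needs = {"food": 1.0, "water": 1.0}
paths = {"food": 7, "water": 3}          # steps along each learned path
before = preferred_goal(needs, paths)    # equal needs: the nearer pool wins
needs["water"] = 0.0                     # the muzz drinks
after = preferred_goal(needs, paths)     # only the food need still drives
print(before, after)
```

With both needs at zero, forage_need also returns zero, matching the statement that a satisfied muzz has no need of foraging.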
2.1. Instincts
Receptor neurons for sensing mushrooms and water
were initially placed into the neural network and given
goal values associated with the reduction of hunger and
thirst respectively. These were terminal goals for learned
mediators (see learning section). Upon sensing a
mushroom a muzz will automatically eat it if it is hungry.
The same goes for a pool and drinking.
Three mediator neurons were also “hard wired” into
the muzz to implement foraging instincts. One of them
associates the “forward open” receptor neuron with the
“move forward” response. The others associate receptors
indicating openings to the right and left with turning right
and left respectively. The goal values of these instinct
mediators determine the probability of expressing the
movement responses. These are critical values, since it is
possible to set these such that foraging fails completely.
For example, if the move forward mediator always
dominates, the muzz will never turn down a side pathway
that may lead to a goal, or will always follow walls and
never explore an open space. If the turn right mediator
dominates, on the other hand, the muzz will rotate
endlessly in an open area. To determine effective settings,
an evolutionary selection procedure (see procedure
section) was used to select the instinct mediator goal
values to produce effective foraging. These values were
evolved in the presence of learning to achieve synergy
between the two.
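The role of the instinct goal values can be illustrated as follows. The sketch assumes that the probability of expressing a movement response is proportional to the goal value of its instinct mediator, restricted to directions the sensors report as open; the text states only that the goal values determine these probabilities, so the proportional rule and the function name choose_movement are assumptions.

```python
import random


def choose_movement(open_dirs, goal_values, rng=random):
    """Pick a movement response among open directions, weighted by goal value.

    open_dirs:   dict mapping "forward"/"right"/"left" to True if the way is open
    goal_values: dict mapping the same keys to non-negative instinct goal values
    """
    candidates = [d for d, is_open in open_dirs.items()
                  if is_open and goal_values.get(d, 0.0) > 0.0]
    if not candidates:
        return "wait"
    weights = [goal_values[d] for d in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


rng = random.Random(0)
all_open = {"forward": True, "right": True, "left": True}

# If the forward instinct always dominates, the muzz never turns.
only_forward = {"forward": 1.0, "right": 0.0, "left": 0.0}
print({choose_movement(all_open, only_forward, rng) for _ in range(100)})

# A mixed setting lets side pathways be explored some of the time.
mixed = {"forward": 3.0, "right": 1.0, "left": 1.0}
print(choose_movement(all_open, mixed, rng))
```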
2.2. Learning
Figure 4. Muzz approaching mushroom
As foraging proceeds, a stream of sensory inputs and
responses is generated. The neural network creates new
receptor and mediator neurons to record these streams.
The Mona neural network prefers to retain mediators that
excel at being reliable/repeatable or lead to need-reducing
goals, which in this task are the mushroom and pool
sensing receptors. In this study mediators were capped at
a maximum of 200, which, coupled with the exploratory
nature of foraging, meant that most learned mediators
were eventually destroyed.
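A minimal sketch of the retention policy is given below. Only the cap of 200 and the two retention criteria (reliability and association with goals) come from the text; the way they are combined into a single strength score, and the names LearnedMediator and prune, are assumptions made for illustration.

```python
from dataclasses import dataclass

MAX_MEDIATORS = 200  # cap used in this study


@dataclass
class LearnedMediator:
    sequence: tuple           # recorded stream of stimuli and responses
    fired: int = 0            # times the cause event fired
    confirmed: int = 0        # times the expected effect followed
    goal_value: float = 0.0   # association with food or water goals

    def reliability(self):
        return self.confirmed / self.fired if self.fired else 0.0

    def strength(self):
        # Hypothetical score: reliability plus goal association.
        return self.reliability() + self.goal_value


def prune(mediators, cap=MAX_MEDIATORS):
    """Keep the `cap` strongest mediators and discard the rest."""
    ranked = sorted(mediators, key=lambda m: m.strength(), reverse=True)
    return ranked[:cap]


# 250 candidates with steadily increasing reliability; the 50 weakest are dropped.
candidates = [LearnedMediator(sequence=(i,), fired=250, confirmed=i)
              for i in range(250)]
kept = prune(candidates)
print(len(kept), min(m.confirmed for m in kept))
```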
As an example, Figure 4 shows a muzz ascending a
ramp toward a mushroom on the platform above. Figure 5
is an annotated snapshot of the mediator controlling this
activity, showing the sequence of stimuli and responses.
Figure 5. Mushroom seeking mediator neuron
2.3. Procedure
An initial population of 40 muzzes was generated and
given random foraging instinct values. For each trial, a
single muzz, mushroom and pool were placed in the
world, and the muzz allowed to forage for 500 response
steps. The fitness of a muzz was a function of whether it
found food and/or water, and of how quickly it did so.
The fittest 20 muzzes were used to create the next
generation through mutation and mating. Mutation
consisted of copying learned neurons into the offspring
and probabilistically (10%) randomizing instinct goal
values. Mating consisted of randomly choosing instinct
goal values from a parent and randomly copying the
strongest neurons from either parent into the offspring
until the maximum of 200 was reached. Since each
neuron is uniquely identified by a recursively computed
MD5 hash, duplicating neurons was prevented. For an
individual evolution run, the world configuration,
consisting of the topography and object locations, was the
same; these varied for different runs. Each evolution run
proceeded for 30 generations. A set of 25 runs was done
for 3 world dimensions: 4x4, 8x8, and 12x12.
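One generation of the procedure above might be sketched as follows. The population size, survivor count, mutation probability, and neuron cap are taken from the text. Several details are assumptions made to keep the sketch short: survivors are carried over unchanged, mutation and mating are chosen with equal probability, instinct goal values lie in [0, 1), mating keeps the strongest pooled neurons rather than sampling them randomly, and a muzz is a plain dictionary with neurons keyed by an id string.

```python
import random

POPULATION = 40
SURVIVORS = 20
MUTATION_RATE = 0.10
MAX_NEURONS = 200
INSTINCTS = ("forward", "right", "left")


def mutate(parent, rng):
    """Copy learned neurons; randomize each instinct goal value with 10% chance."""
    instincts = {k: (rng.random() if rng.random() < MUTATION_RATE else v)
                 for k, v in parent["instincts"].items()}
    return {"instincts": instincts, "neurons": dict(parent["neurons"])}


def mate(a, b, rng):
    """Take each instinct value from a random parent; merge neurons by id."""
    instincts = {k: rng.choice((a, b))["instincts"][k] for k in INSTINCTS}
    pool = {}
    for parent in (a, b):
        for neuron_id, strength in parent["neurons"].items():
            # Identical ids denote identical neurons, so keep a single copy.
            pool[neuron_id] = max(strength, pool.get(neuron_id, strength))
    strongest = sorted(pool.items(), key=lambda kv: kv[1], reverse=True)
    return {"instincts": instincts, "neurons": dict(strongest[:MAX_NEURONS])}


def next_generation(population, fitness, rng):
    """Select the fittest muzzes and refill the population from them."""
    parents = sorted(population, key=fitness, reverse=True)[:SURVIVORS]
    offspring = []
    while len(parents) + len(offspring) < POPULATION:
        if rng.random() < 0.5:
            offspring.append(mutate(rng.choice(parents), rng))
        else:
            a, b = rng.sample(parents, 2)
            offspring.append(mate(a, b, rng))
    return parents + offspring
```

The fitness argument is any callable that scores a muzz, for example one based on the number of steps a trial needed to reach food and water.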
2.4. Q-Learning
The block world presents a search space in which a
stimulus-response stream can take a muzz from an initial
point to a foraging goal. Consequently Q-Learning was chosen as a
comparison to the neural network. Just as for neural
network experiential learning, Q-Learning was initially
guided by foraging instincts. To tune them to work
together, several Q-Learning parameters, shown in Table
1, were evolved in conjunction with instincts. This was
done along the lines of the instinct evolution; hence the
Q-Learning parameters of a mutant muzz were set to
randomized values within minimum and maximum
values. Also, since there were two goals, water and
mushrooms, there were actually two concurrent Q-
Learning processes, each sensitive to one of the goals.
Each contributed to response selection as long as its
respective goal was unsatisfied, which is a mechanism
also incorporated in the neural network. Thus, combined
with instincts, there were possibly three influences on
response selection.
Table 1. Q-Learning parameters
Name              Initial  Minimum  Maximum
Reward            1.0      0.001    5.0
Q value           0.001    0.001    1.0
Learning rate     0.9      0.1      0.9
Rate attenuation  0.9      0.1      0.9
Discount          0.9      0.1      0.9
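For reference, a standard tabular Q-Learning update using the initial values of Table 1 can be sketched as below. This is the textbook update rule, not necessarily the exact variant used in the study. In particular, the interpretation of "rate attenuation" as a per-trial decay of the learning rate is an assumption, as are the class and function names; the instinct influence on response selection is not modeled.

```python
from collections import defaultdict


class QLearner:
    """Tabular Q-Learning for a single goal (food or water)."""

    def __init__(self, reward=1.0, initial_q=0.001, learning_rate=0.9,
                 rate_attenuation=0.9, discount=0.9):
        self.reward = reward
        self.learning_rate = learning_rate
        self.rate_attenuation = rate_attenuation
        self.discount = discount
        self.q = defaultdict(lambda: initial_q)  # keyed by (state, action)

    def update(self, state, action, next_state, actions, reached_goal):
        """Q(s,a) += lr * (r + discount * max_a' Q(s',a') - Q(s,a))."""
        r = self.reward if reached_goal else 0.0
        best_next = 0.0 if reached_goal else max(
            self.q[(next_state, a)] for a in actions)
        key = (state, action)
        self.q[key] += self.learning_rate * (
            r + self.discount * best_next - self.q[key])

    def end_trial(self):
        # Assumed meaning of "rate attenuation": decay the learning rate.
        self.learning_rate *= self.rate_attenuation


def select_response(state, actions, learners, satisfied):
    """Sum Q values over the learners whose goals are still unsatisfied."""
    def total(action):
        return sum(learner.q[(state, action)]
                   for goal, learner in learners.items()
                   if goal not in satisfied)
    return max(actions, key=total)


actions = ["forward", "eat"]
learners = {"food": QLearner(), "water": QLearner()}
learners["food"].update("s0", "forward", "s1", actions, reached_goal=True)
choice = select_response("s0", actions, learners, satisfied=set())
print(choice)
```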
3. Results
For each world dimension setting, the fittest 10 muzzes
for each of the 25 runs were tested, scored, and averaged
under a variety of conditions to create the graphs shown
below. The score was how many response steps out of a
maximum of 500 were needed to get both food and water.
Table 2 provides the legend for the graph symbols.
Table 2. Graph symbol legend
Symbol      Meaning
FI, ~FI     Foraging instincts enabled/not enabled
LC, ~LC     Learning capability enabled/not enabled
LE, ~LE     Learning experience used/not used
QLE, ~QLE   Q-Learning experience used/not used
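The scoring described above can be written compactly. The sketch assumes that a trial in which food or water is never obtained is scored at the 500-step maximum; the function names are hypothetical.

```python
MAX_STEPS = 500


def score(food_step, water_step, max_steps=MAX_STEPS):
    """Steps needed to obtain both food and water; None means never obtained."""
    if food_step is None or water_step is None:
        return max_steps
    return min(max(food_step, water_step), max_steps)


def mean_score(trials):
    """Average the scores of a list of (food_step, water_step) trials."""
    scores = [score(food, water) for food, water in trials]
    return sum(scores) / len(scores)


print(mean_score([(120, 340), (None, 50)]))
```

Lower scores are better, since they mean both goals were reached in fewer response steps.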
Figures 6, 7, and 8 show the 4x4, 8x8, and 12x12
world performances respectively. As observed, scaling
the world for the most part seems to scale the results
accordingly. In the first base case experiment (~FI,~LC),
the muzzes were “lobotomized” by disabling both
foraging instincts and learning capability. In most
configurations, the muzzes were simply unable to locate
food and water within the 500 step limit. Some
configurations placed the muzz, mushroom, and pool
close enough to allow success by making random
responses. In the second (~FI,LC) experiment, only the
learning capability was enabled. This resulted in
performance as poor as the lobotomized muzzes, which is
a stark testimony to the importance of having some tactics
available to engage the environment. The next experiment
(FI,~LC) indicates what a powerful effect the few simple
instincts alone had on task success.
Figure 6. 4x4 World performance (axis label: Foraging Steps)
Figure 7. 8x8 World performance (axis label: Foraging Steps)
Figure 8. 12x12 World performance (axis label: Foraging Steps)
The next experiment (~FI,LE) was interesting and
somewhat unexpected. Here the muzzes first foraged the
world with both instincts and learning enabled. For the
test, foraging instincts were disabled and only the learned
experience was used. The result was performance comparable to
foraging instincts alone. On closer observation, it appears
that not only were a number of environmental paths
learned, but that foraging itself was learned: the muzzes
moved about with exploratory movement patterns. In the
last experiment, the synergistic benefit of instinct and
experiential learning was striking, cutting the time to find
food and water approximately in half relative to either
alone. Looking closer at a number of trials, especially in
the 12x12 world, foraging appears to serve to get the
muzz on to a learned pathway, whereupon learned
behavior can activate to take the muzz directly to a goal.
The Q-Learning performance was unexpectedly poor.
Not only was learning experience running without
foraging instincts highly ineffective in all three world
dimensions, but in the 8x8 and 12x12 worlds it actually
hindered the effectiveness of foraging instincts. While it
was expected that Q-Learning would in some instances be
confounded by redundant sensory states within a goal
path and by the three-way vying for control between
instincts and the two goal-specific Q-Learning processes,
the extent of the degradation was surprising.
4. Conclusion
The use of a few basic hard-wired neurons, tuned by
evolution, has been shown to radically improve foraging
performance. Moreover, the superiority of the
instinct/learning synergy suggests that more ambitious
studies are warranted. For example:
- In order to more closely mimic nature, an
  environment might be constructed that contains
  generalities related to foraging, such as a certain
  type of fruit that grows in proximity to
  environmental cues, such as odors or terrain
  markings. Then creatures might learn more
  generalized patterns related to resource
- The addition of manipulable objects in the
  environment could be used to study such
  behaviors as nest-building.
- The addition of other creatures could be used to
  study social behaviors such as predator/prey
- The embodiment of the creatures in simple
  physical robots would create an opportunity to
  mesh other fields such as pattern recognition and
  kinematics with the neural network.
As a final note, the utility of uniquely identifying each
neuron with an MD5 hash to prevent duplication during
mating should be underscored. What it means is that any
two neurons in different networks having the same id are
recursively structurally identical. One of Mona’s design
goals is to address the critical problem of non-modularity
in classical feed-forward networks [8] by being able to
configure neurons that do specific jobs, something that
biological neurons are also capable of. Imagine the
possibilities of exchanging and even sharing neurons
between networks, something that nature does not design
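The recursive identification scheme can be sketched as follows. The encoding of a neuron as a nested tuple and the exact string that is hashed are assumptions made for this illustration; only the use of a recursively computed MD5 hash comes from the text.

```python
import hashlib


def neuron_id(neuron):
    """Return an MD5 id for a neuron described as a nested tuple.

    A neuron is ("receptor", label), ("motor", label), or
    ("mediator", cause, effect), where cause and effect are neurons.
    """
    kind = neuron[0]
    if kind in ("receptor", "motor"):
        payload = f"{kind}:{neuron[1]}"
    else:
        _, cause, effect = neuron
        # The id of a mediator is derived from the ids of its components.
        payload = f"mediator:{neuron_id(cause)}:{neuron_id(effect)}"
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


# Two networks that independently learn the same structure share the id.
in_network_a = ("mediator", ("motor", "move forward"), ("receptor", "mushroom"))
in_network_b = ("mediator", ("motor", "move forward"), ("receptor", "mushroom"))
different = ("mediator", ("motor", "turn left"), ("receptor", "mushroom"))
print(neuron_id(in_network_a) == neuron_id(in_network_b))
print(neuron_id(in_network_a) == neuron_id(different))
```

Because the id of a mediator depends only on the ids of its components, equal ids imply equal structure all the way down, which is what allows duplicates to be detected during mating.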
The C++/OpenGL source code for Mona and the muzz
world is available at:
ip (zip) or muzz.tgz (tarball).
It can be compiled with either gcc/make or Microsoft
Visual Studio .NET.
5. References
[1] J. Blynel, and D. Floreano, “Exploring the T-Maze:
Evolving Learning-Like Robot Behaviors using CTRNNs”, In
Raidl, G. et al. (Eds.) Applications of Evolutionary Computing,
Heidelberg: Springer Verlag, 2003.
[2] E. Bonabeau, M. Dorigo, and G. Theraulaz, Swarm
Intelligence: From Natural to Artificial Systems. ISBN 0-19-
513159-2, 1999.
[3] M. Dorigo and T. Stützle, Ant Colony Optimization, MIT
Press, 2004.
[4] S. Erdur and T. Güngör, “An Investigation of Artificial
Neural Network Architectures in Artificial Life
Implementations”, International 13th Turkish Symposium on
Artificial Intelligence and Neural Networks (TAINN 2004),
191-199, İzmir, 2004.
[5] S. Nolfi and D. Parisi, “Neural networks in an artificial life
perspective”, In: W. Gerstner, A. Germond, M. Hasler and J.-D.
Nicoud (Eds.), Lecture Notes in Computer Science 1327. Berlin:
Springer-Verlag, pp.733-738, 1997.
[6] T. Portegys, “An Application of Context-Learning in a Goal-
Seeking Neural Network”, The IASTED International
Conference on Computational Intelligence (CI 2005), Calgary,
Canada, 2005.
[7] T. Portegys, “Instinct Evolution in a Goal-Seeking Neural
Network”, The IASTED International Conference on
Computational Intelligence (CI 2006), San Francisco, USA,
2006.
[8] J. Tani and S. Nolfi, “Learning to perceive the world as
articulated: An approach for hierarchical learning in sensory-
motor systems”, Neural Networks, 12(7–8):1131–1141, 1999.
[9] C. Watkins, Learning from Delayed Rewards, Thesis,
University of Cambridge, England, 1989.
[10] B. Yamauchi and R. Beer, “Sequential behavior and
learning in evolved dynamical neural networks”, Adaptive
Behavior, 2(3):219-246, 1994.
[11] S. Zhang, F. Bock, A. Si, J. Tautz, and M.V. Srinivasan,
“Visual working memory in decision making by honey bees”,
Proceedings of the National Academy of Sciences of the United
States of America, 102 (14): 5250-5, 2005.
[12] Z-H Zhou and X-H Shen, “Virtual Creatures Controlled by
Developmental and Evolutionary CPM Neural Networks”,
Intelligent Automation and Soft Computing, 9(1): 23-30, 2003.