I WANT TO PLAY A GAME
Thomas E. Portegys, portegys@gmail.com ORCID 0000-0003-0087-6363
Dialectek, DeKalb, Illinois USA
ABSTRACT
An animal behavior problem in the form of a game is proposed that involves two cooperating birds, a male and a female. The female builds a nest into which she lays an egg. The male's job is to forage in a forest for food for both himself and the female. In addition, the male must fetch stones from a nearby desert for the female to use as nesting material. The game is complete when the nest is built and an egg is laid in it. The game can be run in three modes: manual (user-supplied responses), "autopilot" (self-playing), and bird's brain (a player-supplied machine learning system). The game is intended to serve as a benchmark for evaluating machine learning simulations of animal behavior. Some preliminary results are included.
Keywords: Artificial animal intelligence, machine learning, benchmark.
INTRODUCTION
A game is proposed to simulate two birds, a male and a female, that cooperate in navigation,
foraging, communication, and nest-building tasks. These are tasks commonly found in nature to
ensure survival and reproduction for many animal species. The female builds a nest into which
she lays an egg, winning the game. The male's job is to forage in a forest for food for both himself
and the female. In addition, the male must fetch stones from a nearby desert for the female to
use as nesting material. The game is intended to serve as an animal artificial intelligence
benchmark to evaluate machine learning systems.
The question of why anyone should work on artificial animal intelligence is, at least on the surface, a reasonable one, given our species' unique intellectual accomplishments. Thus,
historically, AI has mostly focused on human-like intelligence, for which there are now
innumerable success stories: games, self-driving cars, stock market forecasting, medical
diagnostics, language translation, image recognition, etc. Yet the elusive goal of artificial general
intelligence (AGI) seems as far off as ever, likely because these success stories lack the “general”
property of AGI, operating as they do within narrow, albeit deep, domains. A language translation
application, for example, does just that and nothing else.
Anthony Zador (2019) expresses this succinctly: "We cannot build a machine capable of building
a nest, or stalking prey, or loading a dishwasher. In many ways, AI is far from achieving the
intelligence of a dog or a mouse, or even of a spider, and it does not appear that merely scaling
up current approaches will achieve these goals."
I am in the camp that believes that achieving general animal intelligence is a necessary, if not
sufficient, condition for AGI. While imbuing machines with abstract thought is the ultimate goal,
in humans there is a massive amount of evolved neurology that underlies this talent.
Hans Moravec (1988) put it this way: “Encoded in the large, highly evolved sensory and motor
portions of the human brain is a billion years of experience about the nature of the world and
how to survive in it. The deliberate process we call reasoning is, I believe, the thinnest veneer of
human thought, effective only because it is supported by this much older and much more
powerful, though usually unconscious, sensorimotor knowledge. We are all prodigious
Olympians in perceptual and motor areas, so good that we make the difficult look easy. Abstract
thought, though, is a new trick, perhaps less than 100 thousand years old. We have not yet
mastered it. It is not all that intrinsically difficult; it just seems so when we do it.”
The bird problem was first introduced by Portegys (2001). The solution was obtained using a
connectionist goal-seeking architecture (Mona) that employed macro-responses, such as “Go to
mate”. The network was also manually coded. In planning to attack the problem again with a hybrid of Mona and Morphognosis (Portegys, 2022) networks, I thought the game might also be of value to other researchers as a problem to solve with a variety of techniques. Reinforcement learning (Kaelbling et al., 1996), with its recent successes in games such as Go (Silver et al., 2016), is one possible candidate.
DESCRIPTION
The game code can be found at: https://github.com/morphognosis/NestingBirds
The game can be run in three modes: manual (user-supplied responses), "autopilot" (self-playing), and bird's brain (supplied by the user). The autopilot mode can be used to generate training data for machine learning systems. The bird brain stub is BirdBrain.java, written in Java. Java provides a simple interface to other languages, such as Python, through its Process class.
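To illustrate the interface, a minimal external brain in Python might look like the sketch below. It assumes a simple line-oriented protocol (one line of sensor values in on standard input, one response token out on standard output per step); the actual message format is determined by BirdBrain.java, and the handling here is a placeholder.

# minimal_bird_brain.py -- illustrative external bird brain stub, assuming
# the Java side launches this script with the Process class and exchanges
# one comma-separated sensor line per step on stdin, expecting a single
# response token on stdout. The protocol and response name below are
# placeholder assumptions, not the actual BirdBrain.java interface.
import sys

def choose_response(sensors):
    # Trivial placeholder policy: always do nothing.
    return "do-nothing"

def main():
    for line in sys.stdin:
        sensors = line.strip().split(",")
        print(choose_response(sensors), flush=True)  # flush so the Java side sees it immediately

if __name__ == "__main__":
    main()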
The program is built with the build_nestingbirds.sh (bat) script, and the graphical interface is run with the run_nestingbirds.sh (bat) script, both found in the work directory. These are the command-line
options:
[-steps <steps> (default=single step)]
[-responseDriver <autopilot | bird> (default=autopilot)]
[-maleInitialFood <amount> (default=200)]
[-femaleInitialFood <amount> (default=200)]
[-maleFoodDuration <amount> (default=200)]
[-femaleFoodDuration <amount> (default=200)]
[-randomizeMaleFoodLevel
(food level probabilistically increases
0-200 upon eating food)]
[-randomizeFemaleFoodLevel
(food level probabilistically increases
0-200 upon eating food)]
[-writeMaleDataset
(write dataset file: male_dataset.csv)]
[-writeFemaleDataset
(write dataset file: female_dataset.csv)]
[-verbose <true | false> (default=false)]
[-randomSeed <seed> (default=4517)]
[-version]
The batch interface is run with the run_nestingbirds_batch.sh (bat) command. These are the
command-line options:
-steps <steps>
[-runs <runs> (default=1)]
[-responseDriver <autopilot | bird> (default=autopilot)]
[-maleInitialFood <amount> (default=200)]
[-femaleInitialFood <amount> (default=200)]
[-maleFoodDuration <amount> (default=200)]
[-femaleFoodDuration <amount> (default=200)]
[-randomizeMaleFoodLevel
(food level probabilistically increases
0-200 upon eating food)]
[-randomizeFemaleFoodLevel
(food level probabilistically increases
0-200 upon eating food)]
[-writeMaleDataset
(write dataset file: male_dataset_<run>.csv)]
[-writeFemaleDataset
(write dataset file: female_dataset_<run>.csv)]
[-verbose <true | false> (default=true)]
[-randomSeed <seed> (default=4517)]
[-version]
ENVIRONMENT
The environment is a 21x21 grid of cells. Each cell has a locale attribute and an object attribute. The locale describes the type of terrain: plain, forest, or desert. These are mutually exclusive. Objects are items to be found in the environment: mouse (food) and stone (nest-building material). These are also mutually exclusive. A forest occupies the upper left of the environment and is populated by mice, which move about randomly within the forest. A desert lies in the lower right, where stones are found at various locations. The birds are initially located on the plain in the center of the environment.
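As a data-structure sketch (an illustration only, not the game's actual representation), the environment can be modeled as a 21x21 array of cells, each holding one mutually exclusive locale value and at most one object:

# Illustrative sketch of the environment grid; class and field names are
# assumptions, not taken from the game code.
from enum import Enum

class Locale(Enum):
    PLAIN = 0
    FOREST = 1
    DESERT = 2

class Item(Enum):
    NONE = 0
    MOUSE = 1   # food
    STONE = 2   # nest-building material

WIDTH, HEIGHT = 21, 21

class Cell:
    def __init__(self, locale=Locale.PLAIN, item=Item.NONE):
        self.locale = locale   # exactly one terrain type per cell
        self.item = item       # at most one object per cell

grid = [[Cell() for _ in range(WIDTH)] for _ in range(HEIGHT)]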
BIRDS
The birds have four components: sensory, internal, needs, and response.
MALE:
Internal state: orientation, food (hunger), has-object (object being carried).
Orientation can be north, south, east, or west. After consuming a mouse, the food state value is set to a parameterized amount, which is decremented each step. When food reaches zero, the male bird is compelled to fly to the forest to forage for a mouse to eat. This compulsion overrides any other current activity. Has-object can be a mouse, a stone, or empty.
Sensory state: locale, object, mate-proximity, female-needs-food, female-needs-stone.
Locale and object pertain to the current location of the bird and the cells in the left, right, and
forward directions. Mate-proximity can be present, left, right, forward, or unknown. Female-
needs-food is activated when the female gives the need-food response in the presence of the male. This is the only time this sense is active; when the male is not in the presence of the female, it is in the off state. A similar process exists for the female-needs-stone sense. Only
one of the female-needs is sensed at a time. Upon sensing female-needs-food, the male is
compelled to forage for a mouse and bring it back to the female to eat. Likewise, female-needs-
stone compels the male to fly to the desert in search of a stone to bring back to the female for
her to build the nest.
Need hierarchy: male needs food, female needs food, female needs stone, follow female around.
Responses:
do-nothing: a no-op response.
eat: eat mouse if has-object is a mouse. If no mouse, this is a no-op.
get: if has-object is empty and an object is visible, pick up the object and set has-object to it.
put: if has-object is not empty and no object is visible, place the object in the current cell and clear has-object.
toss: if has-object is not empty, throw the object away to a random local cell.
move: move forward in the orientation direction. Movement off the grid is a no-op.
turn-right/left: change orientation by 90 degrees.
give-food: if has-object is a mouse and the female is present with an empty has-object, transfer the mouse to the female.
give-stone: if has-object is a stone and the female is present with an empty has-object, transfer the stone to the female.
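As a sketch of how the need hierarchy above might drive response selection in an autopilot-style controller (the function and argument names are illustrative assumptions, not the game's actual code):

# Illustrative need selection for the male, following the hierarchy above:
# own food, female needs food, female needs stone, then follow the female.
def male_need(male_food, female_needs_food, female_needs_stone):
    if male_food == 0:
        return "male-needs-food"     # overrides any other current activity
    if female_needs_food:
        return "female-needs-food"   # forage in the forest, deliver a mouse
    if female_needs_stone:
        return "female-needs-stone"  # fetch a stone from the desert
    return "follow-female"           # default: stay near the mate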
FEMALE:
Internal state: orientation, food (hunger), has-object (object being carried).
Sensory state: locale, object, mate-proximity.
Needs hierarchy: female needs food, female needs stone, build nest, lay egg.
Responses:
do-nothing through turn responses: common with the male.
need-food: when her food reaches zero, the female halts all activity, tossing any stone she is carrying, and responds with need-food until the male arrives and gives her a mouse, at which time she eats the mouse and resumes nesting activity.
need-stone: the female builds a square configuration of stones around the center cell, proceeding
step-by-step to place stones. When she reaches a cell that requires a stone, she will respond with
need-stone until the male arrives with a stone and gives it to her, at which time she will place it
in the cell and move to the next cell in the prescribed configuration.
When the nest is complete, the female will move back to the center cell and lay an egg,
completing the game.
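For concreteness, the ring of nest cells surrounding the center cell can be enumerated as in the sketch below; the coordinates and visiting order are assumptions, not the game's actual placement sequence.

# Illustrative enumeration of the eight nest cells around the center of the
# 21x21 grid; the actual build order in the game may differ.
CENTER = (10, 10)

def nest_cells(center=CENTER):
    cx, cy = center
    return [(cx + dx, cy + dy)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
            if not (dx == 0 and dy == 0)]

# The female visits these cells in turn, responding with need-stone at each
# cell until the male delivers a stone; when all are filled she returns to
# the center cell and lays the egg.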
SCENARIO
The following scenario shows intermittent states of the game in autopilot mode, from initial state
to egg-laying in the completed nest. A video is available here: https://youtu.be/d13hxhltsGg
Figure 1. Beginning of game. Mode set to autopilot. Female is hungry (0 food), male has maximum
food. Initial response for both is “do nothing”. Both are located at center of world. Upper left is
forest with mice (food). Lower right is desert with stones for nest-building.
Figure 2. While co-located, female signals to male with “want food” response. Male flies to forest
and picks up a mouse to feed to her.
Figure 3. Female moves to location of first nesting stone. Male follows her. She signals to male
that she wants a stone. Male flies to desert and picks up a stone.
Figure 4. While carrying stone, male becomes hungry. He tosses stone aside and flies to forest
for mouse.
Figure 5. Male returns to female with stone. Discovers she is hungry. He tosses stone aside and
flies to forest for mouse for her.
Figure 6. Nest completed. Egg laid. Game completed in 512 steps.
GAME SCORING
The bird brain should be trained with a variety of food duration and random seed values to ensure generalization. The following factors play a role in scoring performance:
Nest completed.
Egg laid in nest.
Male and female respond properly to hunger need.
Male and female respond properly to need to procure and place stones.
Avoidance of domain-specific information, e.g., internally recording the coordinates of the birds.
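As an illustration of how these criteria might be combined into a single number, the following sketch weights them equally; the weights and argument names are assumptions, not part of the game specification.

# Hypothetical scoring sketch; the criteria come from the list above, while
# the equal weights and argument names are illustrative assumptions.
def score(nest_completed, egg_laid, hunger_response_rate, stone_response_rate):
    # hunger_response_rate and stone_response_rate are fractions in [0, 1]
    # measuring how often the birds responded properly to those needs.
    s = 0.0
    if nest_completed:
        s += 0.25
    if egg_laid:
        s += 0.25
    s += 0.25 * hunger_response_rate
    s += 0.25 * stone_response_rate
    return s  # use of domain-specific information would be judged separately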
PRELIMINARY RESULTS
An LSTM (long short-term memory) (Hochreiter and Schmidhuber, 1997) recurrent neural network (RNN) was trained and tested on generated datasets using the Keras 2.6.0 Python package.
Create 3 dataset files (<gender>_dataset_<run>.csv):
run_nestingbirds_batch.sh -steps 1000 -runs 3 -writeMaleDataset
-writeFemaleDataset
Train and test RNN with 3 datasets (2 training and 1 testing):
nestingbirds_rnn.sh -gender male -num_datasets 3
-num_test_datasets 1
Both the male and female bird networks fit the training data almost perfectly. On the test dataset, the female, presumably because her behavior patterns are more limited than the male's, performs with near 100% accuracy. The male performs with 80% accuracy.
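For reference, a minimal version of this training setup is sketched below. It assumes the CSV datasets have been preprocessed into one-hot encoded sensor sequences and response labels; the array files, shapes, and hyperparameters are illustrative assumptions, not the settings used in nestingbirds_rnn.sh.

# Illustrative LSTM training sketch using tf.keras; file names, shapes, and
# hyperparameters are assumptions, not the actual nestingbirds_rnn.sh setup.
import numpy as np
from tensorflow import keras

# X: (sequences, steps, sensor features), y: (sequences, steps, responses),
# both one-hot encoded from the male_dataset_<run>.csv files
# (hypothetical preprocessing step).
X = np.load("male_X.npy")
y = np.load("male_y.npy")

model = keras.Sequential([
    keras.layers.LSTM(128, return_sequences=True,
                      input_shape=(X.shape[1], X.shape[2])),
    keras.layers.Dense(y.shape[2], activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=100, batch_size=16)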
CONCLUSION
The bird game is a benchmarking platform for evaluating artificial animal intelligence efforts. Animal intelligence, I believe, is an essential capability underlying artificial general intelligence.
By the way, if you haven’t guessed where the title comes from, it’s from the movie Saw, in which
failure to win the game results in your head exploding.
REFERENCES
Hochreiter, S., Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8),
1735-1780.
Kaelbling, L.P., Littman, M.L., Moore, A.W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237-285. https://doi.org/10.1613/jair.301
Moravec, H. (1988). Mind Children: The Future of Robot and Human Intelligence. (Harvard
University Press).
Portegys, T.E. (2001). Goal-Seeking Behavior in a Connectionist Model. Artificial Intelligence
Review, 16, 225-253. https://doi.org/10.1023/A:1011970925799
Portegys, T.E. (2022). Dynamically handling task disruptions by composing together behavior
modules. arXiv. https://doi.org/10.48550/arXiv.2207.06482
Silver, D., Huang, A., Maddison, C. et al. (2016). Mastering the game of Go with deep neural
networks and tree search. Nature, 529, 484-489. https://doi.org/10.1038/nature16961
Zador, A. (2019). A critique of pure learning and what artificial neural networks can learn from
animal brains. Nature Communications, 10, 3770. https://www.nature.com/articles/s41467-019-11786-6