STRONG RECIPROCITY, SOCIAL STRUCTURE, AND THE EVOLUTION OF
COOPERATIVE BEHAVIOR
by
Shade Timothy Shutters
A Dissertation Presented in Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy
ARIZONA STATE UNIVERSITY
December 2009
STRONG RECIPROCITY, SOCIAL STRUCTURE, AND THE EVOLUTION OF
COOPERATIVE BEHAVIOR
by
Shade Timothy Shutters
has been approved
July 2009
Graduate Supervisory Committee:
Ann Kinzig, Co-Chair
Berthold Hölldobler, Co-Chair
Jürgen Liebig
Charles Perrings
Charles Redman
ACCEPTED BY THE GRADUATE COLLEGE
ABSTRACT
The phenomenon of cooperation is central to a wide array of scientific disciplines.
Not only is it key to explaining some of the most fundamental questions of biology and
sociology, but it is also a cornerstone of understanding and successfully overcoming
social dilemmas at multiple scales of human society. In addition, cooperation is
considered crucial to any hope of long-term sustainable occupation of the earth by
humans. Drawing on a broad interdisciplinary literature from biology, sociology,
economics, political science, anthropology, law, and international policy analysis, this
study uses computational methods to meld two disparate approaches to explaining
cooperation – individual incentives and social structure. By maintaining a high level of
abstraction, the results have broad applicability, ranging from colonies of social amoebae
or ants to corporations or nations interacting in markets and policy arenas. In all of these
cases, actors in a system with no central controller face a trade-off between individual
goals and the needs of the collective. Results from evolutionary simulations of simple
economic games show that when individual incentives, in the form of punishment, are
coupled with social structure, especially complex social networks, cooperation evolves
quite readily despite traditional economic predictions to the contrary. These simulation
results are then synthesized with the experimental work of others to present a challenge to
standard, narrow definitions of rationality. This challenge asserts that, by defining
rational actors as absolute utility maximizers, standard rational choice theory lacks an
evolutionary context and typically ignores the regard that agents may have for others in
their local environment. Such relative considerations become important when the potential
interactions of a society’s individuals are not broad and random, but are governed by
emergent social networks, as they are in real societies. Finally, analysis of the
implications of these findings for the efficacy of international environmental agreements
suggests that conventional strategies for overcoming global social dilemmas may be
inadequate when other-regarding preferences influence national strategies.
for a demon-haunted world
ACKNOWLEDGEMENTS
It is with great humility that I thank my advisor, mentor, and friend, Dr. Ann
Kinzig, who invited me into her lab with only a desire to aid what she perceived to be a
lost but capable student. She and her students, Kris Gade, Bethany Cutts, Steven Metzger
and Maya Kapoor, were my second family during the long years of this research. This is
true too of the extended lab group Ann built around us including Jason Walker, Brad
Butterfield, Gustavo Garduño-Angeles, and Elisabeth Larson.
In addition to committee members Bert Hölldobler, Jürgen Liebig, Charles
Perrings, and Charles Redman, I was guided and encouraged by many others who gave
selflessly of their time, especially J. Marty Anderies. David Hales (University of Delft,
Netherlands) and Kim Hill gave critical feedback on this work. Sebastiano Alessio DelRe
(Università Bocconi, Italy) and John Murphy (University of Arizona) provided assistance
with the simulation program. For much-needed support I give special thanks to Christofer
Bang, Cyd Hamilton, and Mary Laner.
For providing both a unique intellectual atmosphere and limitless administrative
support, I am deeply indebted to ASU’s IGERT in Urban Ecology, Center for Social
Dynamics and Complexity, and School of Life Sciences, each with more staff, faculty,
and students to whom I owe gratitude than I could possibly list here. Long-standing
relationships with the Math & Cognition Group, the Social Insect Research Group, and
the ecoSERVICES Group also helped solidify ideas developed in this dissertation.
Finally, this dissertation would never have reached completion without the
infinite patience and unwavering support of my loving wife, Callen Shutters.
This work was supported through generous fellowships from Arizona State
University’s Integrative Graduate Education and Research Traineeship (IGERT) in Urban
Ecology (NSF Grants # 9987612 and # 0504248) and the U.S. National Science
Foundation. Additional funding was provided by Indiana University-Bloomington,
University of Bologna, Italy, University of Washington-Friday Harbor, University of
Alaska-Fairbanks, The Air Force Office of Scientific Research, and both the Graduate
College and School of Life Sciences at Arizona State University.
Any opinions, findings, conclusions or recommendations expressed in this
material are mine alone and do not necessarily reflect the views of the National Science
Foundation.
TABLE OF CONTENTS
Page
LIST OF TABLES........................................................................................................... xiii
LIST OF FIGURES ...........................................................................................................xv
ABBREVIATIONS ......................................................................................................... xvi
CHAPTER
1. INTRODUCTION .................................................................................................1
2. COOPERATION: PROBLEM DISCUSSION AND BACKGROUND...............7
Early Work on the Evolution of Cooperation......................................................7
Other Theories .....................................................................................................9
Inclusive fitness .........................................................................................10
Direct reciprocity .......................................................................................10
Indirect reciprocity.....................................................................................11
Multi-level selection ..................................................................................12
Strong Reciprocity .............................................................................................13
Rational Choice Theory .....................................................................................16
Game Theory .....................................................................................................17
The public good game................................................................................19
The prisoners dilemma...............................................................................21
The ultimatum and dictator games.............................................................21
Experimental games and rationality...........................................................22
Social Network Theory......................................................................................24
Social network metrics...............................................................................26
Classification of social networks ...............................................................27
Agent-based Modelling......................................................................................30
3. PUNISHMENT AND SOCIAL STRUCTURE: COOPERATION IN A
CONTINUOUS PRISONER’S DILEMMA ...........................................38
Introduction........................................................................................................38
The Simulation Model .......................................................................................40
The continuous prisoners dilemma (CPD).................................................41
Game play..................................................................................................42
Simulation variables and output.................................................................44
Results................................................................................................................45
Control case: no social structure, no punishment ......................................45
Either punishment or social structure alone...............................................45
Punishment and social structure together ..................................................46
Discussion..........................................................................................................46
Cooperation in continuous versus discrete games .....................................46
Localization of interactions and the evolution of altruistic punishment....47
Cooperation and network density ..............................................................48
Social dilemmas and their underlying social structure ..............................49
The anomaly of scale-free networks..........................................................49
Evolutionary dynamics ..............................................................................50
Cooperation in other games.......................................................................51
Future Directions ...............................................................................................52
4. EXTENDING THE CONTINUOUS PRISONERS DILEMMA MODEL: THE
AFTERMATH OF PUNISHMENT ........................................................62
The Detrimental Side of Punishment.................................................................62
Punishment versus payoffs ........................................................................63
The 2nd Order Free-rider Problem.....................................................64
Results and discussion: 2nd order free rider simulations............................67
The Effect of Retaliatory Behavior....................................................................68
Results and discussion of retaliation experiments .....................................69
Summary and Future Directions ........................................................................70
5. PUNISHMENT AND SOCIAL STRUCTURE: THE EVOLUTION OF
FAIRNESS IN AN ULTIMATUM GAME ............................................81
Introduction........................................................................................................81
The question of fairness.............................................................................81
Background........................................................................................................83
The punishment multiplier.........................................................................84
The Simulation Model .......................................................................................86
Results and Discussion ......................................................................................89
Offer rates in the absence of punishment...................................................90
Response of offer rates as M increases ......................................................91
Network type and the transition value of M...............................................94
Future Directions ...............................................................................................95
6. PUNISHMENT, RATIONAL EXPECTATIONS, AND REGARD FOR
OTHERS................................................................................................107
Introduction......................................................................................................107
Interdependent Preferences and Utility............................................................108
Explanations of interdependent preferences ............................................110
Biological fitness and relativity ...............................................................112
Utility, payoffs and fitness: confusion in behavioral experiments ..........113
Relativity and Localized Interactions in Complex Networks..........................115
Altruistic Punishment and Rational Expectations............................................115
The Simulation.................................................................................................116
The continuous prisoners dilemma ..........................................................117
Introduction of altruistic punishment.......................................................118
Social structure.........................................................................................118
The simulation algorithm.........................................................................119
The Model of Relative Payoff Maximization ..................................................120
Experimental Results and Discussion..............................................................122
Occurrence of punishment .......................................................................122
Punishment and number of neighbors......................................................123
Summary..........................................................................................................124
7. INTERDEPENDENT PREFERENCES AND THE EFFICACY OF
INTERNATIONAL ENVIRONMENTAL AGREEMENTS................128
A Model of International Conflict: The Standard Prisoners Dilemma............129
Distinguishing Between Payoffs and Utility....................................................131
Nations As Agents ...........................................................................................134
Relative Payoffs and Implications for Successful IEAs ..................................136
Enhancing the Ability of IEAs to Restructure Dilemmas................................138
Conclusion .......................................................................................................139
8. SUMMARY AND FUTURE DIRECTIONS....................................................143
Summary of Findings.......................................................................................143
Future Directions .............................................................................................144
Dynamic networks ...................................................................................144
Mixed models...........................................................................................145
Getting the game right .............................................................................145
BIBLIOGRAPHY............................................................................................................147
APPENDIX
A. CALCULATION OF APPROXIMATE TRANSITION VALUES IN TABLES
3.5 AND 5.7...........................................................................................165
B. SIMULATION PROGRAM CODE .................................................................167
LIST OF TABLES
Table Page
3.1. Strategy Components Used by Agents in the Continuous Prisoners Dilemma ......54
3.2. Payoffs p in the Continuous Prisoners Dilemma Between i and j with Possible
Punishment of i by k..........................................................................................55
3.3. Simulation Parameters for the Continuous Prisoners Dilemma and their Values ..56
3.4. Mean Ending Contributions on Various Networks With and Without Punishment ......57
3.5. For Regular Networks, Approximate Value of M at which Populations
Transitioned from Low to High Contributions in the Continuous Prisoners
Dilemma.............................................................................................................58
4.1. Payoffs p in the continuous prisoners dilemma between observer k and 2nd order
punisher l............................................................................................................72
4.2. Approximate Value of M at Which Populations Transitioned from Low to High
Contributions in the Continuous Prisoners Dilemma, With and Without
Punishment of 2nd Order Free-riders..................................................................73
4.3. Response of Contribution Rate to Network Type With and Without Retaliation in
the Continuous Prisoners Dilemma ...................................................................74
5.1. Some Commonly-cited Experiments Using Altruistic Punishment and their Values
of the Punishment Multiplier M.........................................................................98
5.2. Strategy Components Used by Agents in the Ultimatum Game.............................99
5.3. Network Types Used in the Ultimatum Game......................................................100
5.4. Payoff Matrix of the Ultimatum Game With 3rd Party Punishment .....................101
5.5. Simulation Parameters Used in the Ultimatum Game..........................................102
5.6. Response of Offers to Network Type in the Ultimatum Game Without 3rd Party
Punishment.......................................................................................................103
5.7. Approximate Value of M at Which Populations Transitioned from Low to High
Offers in the Ultimatum Game ........................................................................104
6.1. Lattice Networks Used for the Continuous Prisoners Dilemma in this Chapter ..125
6.2. Payoffs p in the Continuous Prisoners Dilemma Between i and j With Possible
Punishment of i by k........................................................................................126
6.3. Simulation Results Versus both Standard Economic and Evolutionary Economic
Predictions........................................................................................................127
7.1. Expected Strategies Before and After Restructuring a Prisoners Dilemma .........140
LIST OF FIGURES
Figure Page
2.1. Examples of Directed and Non-Directed Graphs and Their Adjacency Matrices ........34
2.2. Examples of Connected and Non-connected Graphs..............................................35
2.3. Examples of Regular Networks Used in this Dissertation......................................36
2.4. Examples of Non-Regular Networks Used in this Dissertation..............................37
3.1. Results of the Continuous Prisoners Dilemma on Four Different Networks..........59
3.2. Response of Mean Ending Contributions to Increasing M in the Continuous
Prisoners Dilemma.............................................................................................60
3.3. Evolutionary Dynamics of the Continuous Prisoners Dilemma on Small-world
Networks............................................................................................................61
4.1. Prisoners Dilemma Payoffs vs. M at Low Values of M..........................................75
4.2. Prisoners Dilemma Payoffs vs. M at High Values of M.........................................76
4.3. Effect of 2nd-Order Punishment: Lattice Networks ................................................77
4.4. Effect of 2nd-Order Punishment: Other Networks ..................................................78
4.5. Effect of Retaliation on Mean Ending Contributions in the Continuous Prisoners
Dilemma.............................................................................................................79
4.6. Effect of Retaliation on Mean Payoffs in the Continuous Prisoners Dilemma ......80
5.1. Response to Regular Networks of Offers in an Ultimatum Game Without 3rd Party
Punishment.......................................................................................................105
5.2. Response of Ultimatum Game Offers to Increasing M Under Regular Network .........106
7.1. Payoff Matrices for the Standard Prisoners Dilemma ..........................................141
7.2. Weighted Effect of Relative vs. Absolute Payoffs in a Restructured Prisoners
Dilemma...........................................................................................................142
ABBREVIATIONS
Page of first use
CPD Continuous Prisoners Dilemma .......................................................................21
IEA International Environmental Agreement........................................................128
PD Standard Prisoners Dilemma..........................................................................129
CFCs Chlorofluorocarbons ......................................................................................137
SDG Snowdrift Game.............................................................................................145
CHAPTER 1
INTRODUCTION
government…though composed of men subject to all human infirmities,
becomes, by one of the finest and most subtle inventions imaginable, a
composition which is in some measure exempted from all these infirmities
- Mancur Olson (1965)
The history of life is punctuated by the periodic emergence of new hierarchical
levels of organization, among them eukaryotic cells, multi-cellular life, eusociality, and
institutions (Michod 1997). Like the human institution of government described by Olson
above, these emergent levels are made of populations of individuals which become
entities in their own right, often existing far longer than the life spans of the individuals
of which they are comprised. These emergent entities frequently have global attributes
and an evolutionary trajectory not predictable from observation of the component
individuals. At the heart of these emergent levels of living organization lies the
phenomenon of cooperation (Maynard Smith and Szathmáry 1997, Michod 1997), a
mechanism that can facilitate collective action by individuals and make it possible for the
collective to become itself an individual.
This dissertation is an investigation into cooperative behavior – a phenomenon
that remains largely unexplained by science and whose evolution is one of the greatest
questions facing evolutionary biologists (West et al. 2007). A broadly applicable
explanation of cooperation remains elusive despite years of theoretical and empirical
investigation. Some researchers even suggest that the current state of sociobiologic
inquiry is in disarray (Wilson and Wilson 2007). This is due partly to the recurring debate
over the units of selection and simple semantics of social behavior (West et al. 2007),
partly to a resistance by theoreticians to move beyond formal mathematical models
(Bedau 1999, Griffin 2006), and partly to the slow pace of incorporating new insights
from emerging fields such as evolutionary economics, social network theory, and
complexity science (Sawyer 2005).
In addition, researchers are hindered by a narrow definition of cooperation
wherein agents subordinate their individual self-interest to that of a larger group. In other
words, the quest for an explanation of cooperation is often confused with a quest for an
explanation of altruism (West et al. 2007). While not denying the existence of seemingly
altruistic acts, such as a soldier jumping on a grenade to save his comrades, the focus of
this dissertation is on behaviors that allow individuals to benefit a larger group while also
being evolutionarily beneficial to the individual. Though it may seem that an individual is
behaving in the interest of a collective rather than himself, such acts may nonetheless
increase the evolutionary fitness of the individual.
Experimental research on cooperation through the use of laboratory games
generally assumes that participants will attempt to maximize payoffs in a game if the
participants are rational. This is a subtle but important departure from rational choice
theory, in which actors maximize utility, not payoffs (Mas-Colell et al. 1995). In this
study I explore the implications of equating payoffs and utility in this manner. In
addition, I incorporate the biological concept of relative fitness which further confounds
predictions of economic experiments.
Though fundamental questions of cooperation remain unanswered there are other
pressing reasons that justify its investigation. Many environmental problems today
require coordinated, collective action among nations of the world if they are to be solved
(Oldero 2002, Rees 2002, Kaul and Mendoza 2003, Beddoe et al. 2009). As societies,
driven by a desire to avoid their own “tragedy of the commons” (Hardin 1968),
increasingly push for sustainable management of the earth’s resources, international
cooperation is required. Given that sustainability is a desirable policy course for a
growing number of nations (Beddoe et al. 2009), and that cooperation is often a
prerequisite for coordinated global sustainable management, a fundamental
understanding of when and under what conditions cooperation will emerge becomes
imperative.
As defined in this dissertation, cooperation results when individuals act in a
manner that produces some beneficial collective or social outcome, even though those
individuals may have an incentive to cheat or act otherwise. Currently there is no broadly
compelling theory that explains why unrelated individuals choose to cooperate. This is
equally true at the international level where the actors are nations of the global
community that may be called upon to cooperate in multinational initiatives to the
detriment of their own national self-interests. But as Sandler (2004, p. 260) points out,
“nations will sacrifice autonomy only in the most desperate circumstances.”
A long tradition of western philosophy holds that cooperation is obtainable only
through top-down coercion by a central authority. As Hobbes ([1651] 1946) asserted in
his classic work Leviathan, “there must be some coercive power, to compel men equally
to the performance of their covenants, by the terror of some punishment.” However, a
coercive central power is often the least desirable solution to social dilemmas (Ostrom
1990). Furthermore, for environmental problems requiring international cooperation, a
central power does not currently exist that could carry out such enforcement (Sandler
1999, Wagner 2001, Barrett 2003b, 2005, Hodgson 2009).1
On the other hand, overwhelming evidence from both case studies and
experiments does not support the Hobbesian notion that a Leviathan is the only path to
collective action. Instead it shows that cooperation can emerge in the absence of a central
controller despite predictions of economic theory that selfish individuals, left to their own
devices, will fail to cooperate. Attempts to understand these empirical results have
recently led to the concept of strong reciprocity – the idea that cooperation persists
because individuals inflict costly punishment on cheaters and bestow rewards on
cooperators, and in either case receive no benefit in return.
Like cooperation, punishment is ubiquitous among social organisms. Where
cooperating individuals have an incentive to cheat, punishment mechanisms often exist to
deter cheating (Frank 1995). This includes restricting cancer cell growth through
preprogrammed death or senescence (Sharpless and DePinho 2005), toxin release by
colonial bacteria that affects only non-cooperators (Travisano and Velicer 2004), the
destruction of eggs laid by workers in social insect colonies (Foster and Ratnieks 2001),
and enforcement of mating and dominance hierarchies in non-human mammals (Clutton-
Brock and Parker 1995, Dugatkin 1997b). Even the process of cellular meiosis can be
viewed as a form of policing selfish genes (Michod 1996). In humans, punishment and
policing are common across societies and many cultural groups (Marlowe et al. 2008)
1 Abbott et al. (2000) argue that the closest thing currently to an effective global government is the World Trade Organization.
and neurological research suggests this behavior is to some degree genetically coded in
humans (Sanfey et al. 2003, de Quervain et al. 2004, Spitzer et al. 2007).
In this dissertation I examine the ability of punishment mechanisms to induce a
society or group to act cooperatively. In particular I explore the effects of punishment
under different ratios of costs between the punisher and punishee and examine the role
that social network structure plays in the ability of a society to cooperate. In addition, I
examine the effects of both retaliation by a punished agent and punishment of those who
refuse to punish cheaters, both of which are typically ignored in punishment studies
(Clutton-Brock and Parker 1995, Nikiforakis 2008). The primary method of investigation
used in this work is agent-based computer modelling, a computational tool that
incorporates genetic algorithms, heterogeneity, mutation, and selection to simulate
evolutionary trajectories of decision-making behavior.
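The core loop of such an agent-based evolutionary simulation can be sketched as follows. This is a minimal illustration, not the dissertation's actual model: the payoff function, parameter values, and imitation rule are assumptions, and agents sit on a simple ring rather than the richer networks studied later.

```python
import random

# Minimal sketch of an evolutionary agent-based loop: agents on a ring
# play a continuous prisoner's dilemma with a neighbor, low scorers
# imitate higher scorers (selection), and mutation injects variation.

N, GENERATIONS, MUT = 50, 200, 0.05
random.seed(1)
pop = [random.random() for _ in range(N)]  # each agent's contribution in [0, 1]

def payoff(own, other, b=2.0, c=1.0):
    # benefit from the partner's contribution minus the cost of one's own
    return b * other - c * own

for _ in range(GENERATIONS):
    scores = [0.0] * N
    for i in range(N):                 # each agent plays its right-hand neighbor
        j = (i + 1) % N
        scores[i] += payoff(pop[i], pop[j])
        scores[j] += payoff(pop[j], pop[i])
    # selection: each agent imitates a randomly chosen agent with a higher score
    new_pop = []
    for i in range(N):
        j = random.randrange(N)
        new_pop.append(pop[j] if scores[j] > scores[i] else pop[i])
    # mutation: small chance of adopting a fresh random strategy
    pop = [random.random() if random.random() < MUT else x for x in new_pop]

print(f"mean ending contribution: {sum(pop) / N:.2f}")
```

In a sketch like this, with no punishment and no meaningful social structure, mean contributions tend to drift toward zero, which is the baseline economic prediction that the simulations in Chapters 3 through 5 challenge.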
It is important to distinguish between promoting cooperation in expectation that it
will lead to the provision of a public good and promoting cooperation merely for
cooperation’s sake. Though often implied that cooperation is always desirable, the costs
required to facilitate cooperation may outweigh its benefits. In this dissertation I present
evidence to support this point, especially when punishment is the mechanism used to
induce cooperative behavior.
The structure of this dissertation is as follows. Chapter 2 outlines background
information on the problem of cooperation and presents theoretical fundamentals on a
variety of topics meant to facilitate understanding of this study by a wide audience.
Chapters 3 and 4 present empirical results from computer simulation experiments
showing that in structured societies, punishment can increase contributions to a public
good but that it may also lead to detrimental side-effects. Chapter 5 presents simulation
results demonstrating that, unlike cooperation, fairness, defined as a roughly equal
division of resources, is not induced by a combination of punishment and social structure.
Chapter 6 develops a theoretical framework to explain experimental results from this
dissertation and other sources and shows that a model in which agents make decisions
based, in part, on the decisions of others, best explains results. Chapter 7 applies this
theoretical framework to the design of international environmental treaties, primarily
those intended to promote sustainability and the provisioning of global public goods, and
shows that treaties may not work as intended when nations are concerned with relative
position. Finally, Chapter 8 reviews major findings of this dissertation and discusses
possible future research related to this work.
This study not only advances understanding in evolutionary biology, theoretical
sociology, and behavioral economics, but also has practical applicability to transboundary
environmental management and policy science.
CHAPTER 2
COOPERATION: PROBLEM DISCUSSION AND BACKGROUND
Before any discussion on cooperation can proceed it is necessary to address years
of semantic confusion on the topic and to define the terms being discussed. Due in large
part to the multidisciplinary nature of inquiry into cooperative behavior, many terms have
been used interchangeably across the literature. This is especially true of the terms
cooperation, altruism, mutualism, symbiosis, and reciprocity. These terms are often
confused by the same author at different points in his or her career, and at times even
within the same work (West et al. 2007). In advocating a clear distinction between
cooperation and altruism, West et al (2007) define altruism as a behavior that is costly, in
terms of biological fitness, to the individual performing the act but beneficial to another,
while they define cooperation as behavior by an individual that benefits others, in terms
of fitness, and which is evolutionarily selected for because of the benefit it bestows (see
also Travisano and Velicer 2004).
Though West and colleagues attempt to clarify the confusing
semantics of social behavior, they do so almost exclusively from a biological perspective.
For applicability to a broader audience, especially those in social sciences, cooperation in
this dissertation is defined as behavior by an individual that produces a beneficial
collective or social outcome, even though those cooperating individuals may have an
incentive to cheat or act otherwise.
Early Work on the Evolution of Cooperation
Among researchers vexed by cooperative behavior was Charles Darwin, who
could never, to his own satisfaction, reconcile observations of seeming altruism with his
own theory of natural selection (Sulloway 1998). In his seminal work Darwin fretted that
the altruism ubiquitous in social insects presented a difficulty that was “to me insuperable,
and actually fatal to my whole theory” ([1859] 1996, p. 192). Subsequent researchers did
not afford the question high priority since it could easily be explained by the classical notion of group
selection. However, when the theory of group selection was largely discredited in the
1960s (Olson 1965, Williams 1966, Hagen 1992) scientific interest in the topic of
cooperation was renewed (Axelrod and Hamilton 1981).
Two theories emerged at this time as extensions of neo-Darwinian evolution that
were thought to explain most instances of cooperation: Hamilton’s (1964) theory of kin
selection, which was thought to explain cooperation between related individuals
(especially between non-human animals), and Trivers’ (1971) theory of direct reciprocity,
which was thought to explain cooperation between unrelated individuals. These theories
were so influential that many today still assume, though incorrectly, that nearly all
instances of cooperation and altruism can be explained by these two theories (West et al.
2007).
Following the pioneering work of Robert Axelrod in 1981, cooperative phenomena
became a major focus of computational research. In a series of computer simulation
tournaments between various prisoner's dilemma strategies, it was shown that cooperative
strategies could be evolutionarily stable despite the ever present incentive to cheat
(Axelrod and Hamilton 1981). An immense response from researchers followed in which
various parameters and settings of Axelrod’s original model were altered (see Dugatkin
1997a for a detailed review of this work). Subsequent experiments led to a number of
important insights that have helped move the field toward a broadly applicable theory of
the evolution of cooperation.
One modification was the introduction of stochasticity. When strategies were
executed probabilistically instead of deterministically, cooperative outcomes became
much less likely (Nowak 1990). The same was found to be true if agents probabilistically
made mistakes in the execution of their strategies (Hirshleifer and Coll 1988).
A more important modification of Axelrod’s model was the introduction of space.
Axelrod’s original work grew out of evolutionary game theory, in which techniques of
population biology are used to explore evolutionary stability of game situations
(Maynard-Smith 1982). The technique, however, is limited to exploration of equilibrium
points within an infinite, homogeneous, and well-mixed population (Killingback and
Doebeli 1996). Nowak and colleagues were among the first to include spatial explicitness
and to demonstrate that it could lead to qualitatively different outcomes in terms of
cooperation (Nowak and May 1992, Nowak et al. 1994).
Other Theories
The work of Axelrod and his successors is generally classified as a direct
reciprocity theory of cooperation (described below). This is only one of a group of
theories that have emerged in an effort to understand cooperative behavior. Below are brief
descriptions of important theories regarding the evolution of cooperation, along with
major criticisms of each.
Inclusive fitness
Inclusive fitness theory, now often called kin selection, redefines an individual’s
fitness as a product of how well that individual’s genes are propagated, regardless of who
carries the genes (Hamilton 1964). In other words, an individual acting altruistically
toward close relatives can pass on copies of genes through those relatives’ offspring in
addition to its own offspring. An initial requirement that individuals share a common
ancestry was thought to limit the theory’s applicability and later researchers broadened
the definition of related individuals to include those that share particular genes of interest,
regardless of ancestry (West et al. 2007). Despite this broader definition and its
explanatory power regarding social insects and certain animal groups, inclusive fitness
remains unsatisfactory for explaining cooperative behavior that is common between
unrelated individuals in human societies (Di Paolo 1999, Abbot et al. 2001, Wilson
2005).
Direct reciprocity
The idea of direct reciprocity is embodied in the phrase “you scratch my back
now, I’ll scratch yours later”. This theory, formerly (and often still) referred to as
reciprocal altruism, asserts that when a population of agents reciprocates each other’s
cooperative behavior, that population will resist invasion by a selfish strategy (Trivers
1971). Axelrod and Hamilton’s (1981) work with agents playing the iterated prisoner’s
dilemma is one example. However, because its underlying requirements and assumptions
are so restrictive, direct reciprocity has fallen out of favor as a general theory of
cooperation, and few researchers still believe it has applicability beyond certain
situations involving humans (Dugatkin 1997a, West et al. 2007). In addition, direct
reciprocity typically requires long-term repeated interactions and cannot explain
cooperation in anonymous one-shot interactions – a phenomenon growing ever more
prevalent in human societies (Nowak and Sigmund 2005).
Indirect reciprocity
In contrast to direct reciprocity, the idea of indirect reciprocity can be summarized
as “you scratch my back, I’ll scratch someone else’s” (Nowak and Sigmund 2005).
According to this theory, after an agent unconditionally confers a benefit on a second
agent, that second agent will at some later time confer an unconditional benefit on a
third agent, and so on (Leimar and Hammerstein 2001). Like direct reciprocity, indirect reciprocity requires
a lengthy period of repeat interactions, though unlike direct reciprocity, these interactions
must only be within the same group and not with the same individual. This excludes
indirect reciprocity also as an explanation of cooperation in anonymous one-shot
interactions.
A related concept is that of tag recognition, or the so-called “green-beard”
phenomenon, in which a benefit is unconditionally conferred on another, but only to an
interaction partner that exhibits the proper trait or signal (Macy and Skvoretz 1998,
Ostrom 1998, Riolo et al. 2001). This mechanism may lead to the development of
reputation, which has been shown to promote cooperative acts between repeatedly
interacting individuals, both human (Nowak and Sigmund 1998b, Nowak and Sigmund
1998a, Suzuki and Toquenaga 2005) and non-human (Zehavi and Zahavi 1997).
However, the ability of reputation to induce cooperative behavior requires the reliable
reception and interpretation of signals by an individual, which may not occur even when
another individual broadcasts them.
Multi-level selection
Multi-level selection theory, also known variously as demic selection, intrademic
selection, trait-group selection, or new group selection, asserts that a trait’s frequency can
increase in a population because it confers a benefit on a group of individuals, not on the
individuals themselves (West et al. 2007). In contrast to classical group selection, which
was discredited by both evolutionary ecologists and political economists during the
1960s (Olson 1965, Williams 1966, Hagen 1992), the contemporary theory of multi-level
selection is a more general version of classical group selection (Wilson 2007) and
has provided explanatory power for many social insect phenomena as well as cultural
patterns in isolated human populations (Wilson and Hölldobler 2005, Hölldobler and
Wilson 2009).
Multi-level selection theory was initially met with resistance because it required
that selection forces be easily parsed into within-groups and between-groups forces (Price
1970). This, in turn, required that members of a group alternate between periods of their
life histories spent within a clearly demarcated group and periods spent well mixed with
members of other groups (Wilson 1975). Consequently, multi-level
selection was not widely accepted as an explanation of cooperation among organisms that
do not form clearly defined groups, including dynamic human societies. Theories
incorporating the idea of population viscosity sought to remedy this, but with mixed
results (Queller 1992, Mitteldorf and Wilson 2000).
Eventually researchers dispensed with the requirement of clearly defined groups,
which has turned multi-level selection into a powerful theory in several fields. This
broadening of the theory is best summarized by Wilson and Wilson (2007), who assert
that “groups need not have discrete boundaries; the important feature is that social
interactions are local, compared to the size of the total population.” Through this broader
definition this dissertation contributes to the theory of multi-level selection by identifying
complex social networks as the substrate that delivers the required local interactions. This
is discussed further in the conclusion in Chapter 8.
Others have criticized a broader definition of groups on the grounds that it blurs
the distinction between multi-level selection and inclusive fitness, creating confusion and
negatively affecting the ability to execute and interpret research on the evolution of social
behavior (West et al. 2007). Whether this broadening of multi-level selection theory has
been beneficial or counterproductive continues to be a source of contentious debate
(Wilson 2007, West et al. 2008).
Strong Reciprocity
The theories described above have been variously in and out of favor since Darwin
first raised the issue of cooperation. However, the quest for a fundamental understanding
of cooperation through use of simulations and experimental games during the past three
decades has increasingly focused on the concept of strong reciprocity – the idea that
individuals reward others who cooperate and punish those who do not (Gintis 2000,
Bowles and Gintis 2004). These acts of rewarding and punishing are performed
altruistically in that the agent conferring the reward or punishment incurs a cost but
obtains no material benefit in return. Though West et al. (2007) assert there is nothing
altruistic about the punishment and rewards that comprise strong reciprocity, I retain the
terminology here to be consistent with contemporary literature.
Altruistic punishment, in particular, has been shown empirically to induce
cooperative outcomes in social interactions (Fehr and Gächter 2000, 2002, Boyd et al.
2003, Fehr and Fischbacher 2004, Gardner and West 2004, Fowler 2005, Fowler et al.
2005). This finding is echoed by those engaged in statecraft, who assert that punishment
mechanisms are prerequisites for successful international environmental agreements
(Barrett 2003a, b).
Though composed of two principles – punishment and reward – strong reciprocity
research has been dominated by work on punishment, and it is now well-established that
altruistic punishment can increase contributions in public goods games (see below).
Researchers seem sufficiently sure of punishment’s ability to induce cooperation that
they have moved to advocating its use by policy makers, both at local scales, in
institutions governing common pool resources (Ostrom et al. 1992, Ostrom et al. 1994,
Dietz et al. 2003), and at global scales, where non-compliance with environmental
treaties must be deterred without the aid of an independent enforcement authority (Barrett
2003a, b).
An important parameter governing the mechanism of altruistic punishment, and
one that will be referenced throughout this dissertation, is the ratio of costs incurred by
the punishing party to those of the party being punished (Casari 2005). Letting c denote
the cost an individual incurs to punish another, cM is then the fee or sanction imposed
on the punished party, where M is a parameter of the model referred to as the punishment
multiplier. In an evolutionary context, when costs and benefits represent fitness, as M
becomes arbitrarily large there should be some point beyond which providing
punishment could no longer be considered altruistic but is instead evolutionarily
beneficial. M, therefore, becomes an important parameter in understanding outcomes of
punishment experiments.
As noted, altruistic punishment is only one aspect of strong reciprocity, and
though it has dominated research on the topic, the phenomenon of altruistic rewarding
should not be ignored. Those who advocate punishment, such as Ostrom et al. (1994) and
Dietz et al. (2003), briefly discuss the benefits of rewards or incentives in social dilemmas but list
only sanctioning mechanisms in their recommended institutional solutions. One
explanation for the lesser attention paid to rewards may be that experiments have demonstrated the
threat of punishment leads to higher contributions in public goods games than the
promise of rewards (Sefton et al. 2002, Andreoni et al. 2003). In addition, case studies of
successfully managed common pool resources typically credit punishment instead of
rewards (Ostrom 1990, Ostrom et al. 1992, Ostrom et al. 1994), though this may be a
product of researcher preferences or bias.
Economic theory does not predict that reward systems should be inferior to
punishment systems. However, from a biological perspective the disparity is not
unexpected. As stated above, there should be times when punishment is an evolutionarily
beneficial strategy (Shutters 2009). On the other hand, a reward given will always reduce
the fitness of the rewarder relative to the agent receiving the reward.
Rational Choice Theory
Cooperation is the collective result of individual behavior and therefore the result
of a series of individual choices. To begin to understand cooperation requires first an
adequate understanding of theories of choice. By far the dominant paradigm for
explaining how individuals choose among several alternatives is that of rational choice
theory. This theory argues that individuals choosing from a vast set of options first rank
those options in order of preference and then choose the option that is most preferred,
given constraints on the ability to acquire those choices. To be rational is to have the
following properties² with respect to preferences:
1) preferences are complete – given the set of all available consumption choices X,
an individual can consistently rank his preferences for any two choices c1, c2 ∈ X
so that one of the following is true: c1 ≻ c2 (read c1 is preferred to c2), c1 ≺ c2, or
c1 ~ c2 (read the individual is indifferent between c1 and c2);
2) preferences are transitive – given three choices c1, c2, and c3 ∈ X, if c1 ≻ c2 and
c2 ≻ c3, then c1 ≻ c3; and if c1 ~ c2 and c2 ≻ c3, then c1 ≻ c3;
3) preferences are non-satiable – given c = consumption of some good and ε = some
incremental consumption of the same good, (c + ε) ≻ c and u(c + ε) > u(c). In
other words, consuming more of a good is always better in terms of utility (Mas-Colell
et al. 1995).
² Though only properties 1 and 2 are requirements of rational choice theory, property 3 is included as an
important corollary customarily listed as a component of rationality.
Under the preceding restrictions all choices confronting an individual can be
ranked in order of preference. For purposes of formalized models it is preferred to
examine choices in terms of an individual’s utility, which may be defined simply as an
individual’s satisfaction or happiness (Rayo and Becker 2007). Formally however, utility
is a function of consumption u(c) that quantifies preference rankings such that if c1 ≻ c2,
then u(c1) > u(c2) and if c1 ~ c2, then u(c1) = u(c2). Therefore, a rational decision maker,
in choosing the most preferred consumption alternative, maximizes his utility. Note that
in standard rational choice theory the utility function of one individual is assumed to be
independent of preferences and consumption of others, though most economists would
now agree that this is simply a best first approximation of behavior. This point will be
discussed in detail in Chapter 6.
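As a concrete illustration of these definitions, a utility function over a finite choice set induces a complete ranking, and a rational decision maker picks the utility-maximizing option. A minimal Python sketch with hypothetical utility values:

```python
# Hypothetical utilities u(c) over a small consumption set.
u = {"apple": 3.0, "bread": 5.0, "corn": 1.0}

def weakly_prefers(c1, c2):
    """True if c1 is weakly preferred to c2 under the utility function u."""
    return u[c1] >= u[c2]

# Completeness: any two choices can be compared one way or the other.
for c1 in u:
    for c2 in u:
        assert weakly_prefers(c1, c2) or weakly_prefers(c2, c1)

# The rational choice is the most preferred (utility-maximizing) option.
assert max(u, key=u.get) == "bread"
```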
Game Theory
A common methodology for testing theories of choice and rationality is the use of
controlled laboratory games. The economic theory of games provides a highly simplified
framework for analyzing decision making under constraint (see Binmore (1992) or
Osborne (2004) for a comprehensive introduction to game theory). Competitive situations
can be made sufficiently abstract that they become mathematically tractable, are
applicable across species or entities, and are readily simulated by computer applications.
Games may be either one-shot (single stage) games or multi-stage games. A one-
shot game requires strategic reasoning while multi-stage games require that the agent
reason based on what it learns through repeated interactions (Weirich 1998). As learning
models are beyond the scope of this dissertation, simulations are restricted to one-shot
games. Agents’ strategic choices are independent of both their own past choices and of
the past choices of other agents in the population. However, because an agent’s strategy
in any given period is a result of the cumulative effects of evolution in an environment
with other agents, one may validly argue that any agent’s strategy is indirectly driven by
past choices of itself and others.
Games may also be classified as either normal form or extensive form. An
extensive form game describes a series of game plays in advance and thus applies only to
multi-stage games. On the other hand, a normal form game describes a one-shot strategy
and requires the strategy be causally independent of those against which it plays (Weirich
1998). As stated previously, experiments presented in this dissertation consist only of
one-shot games. Therefore, only normal form games are used throughout.
Games may be further classified as either cooperative games or non-cooperative
games. The designation of a game as cooperative or non-cooperative defines whether or
not agents may make binding coalitions before strategies are played and should not be
confused with whether or not the game has a cooperative outcome. In non-cooperative
games agents may not make binding agreements before game play – agreements may be
made in advance but they are non-binding and, therefore, not enforceable. Cooperative
games, on the other hand, permit binding agreements prior to play which allows
coalitions to form. These games have essentially two stages – a decision of whether to
join a coalition or not and then a play of strategy. Experiments in this dissertation use
only non-cooperative games, as a goal of this study is broad applicability to a variety of
social species and at different hierarchical levels of society. In particular, attention is
focused on the international level of human societies where actors are nation states. Both
in non-human societies and at the international level of human societies there is no
effective mechanism or institution that facilitates binding agreements. This is true in
human society despite the existence of the World Court and the United Nations,
institutions considered ineffectual for the purposes of enforcing binding agreements
(Barrett 2003b). Accordingly, games which allow for binding coalitions are excluded in
this dissertation. Following are descriptions of common experimental games relevant to
this study.
The public good game
A public good game consists of n players. Each player i is given an endowment
and then contributes a portion of that endowment xi to a public good pool but keeps the
remainder. Choice of xi by each agent is made strictly independently of the choices of
other agents. In this dissertation initial endowments for players in all games are
standardized to 1 unit so that
[2.1]  xi ∈ [0, 1].
A public good G is created by summing contributions from the n players and multiplying
by some factor r that represents the synergistic effect of cooperation
[2.2]  G = r ∑i=1..n xi.
To make the game meaningful for studying social dilemmas, r must be greater than 1 or
individuals have no incentive to contribute to a public good. Likewise r must be less than
n or individuals have no incentive to retain their endowments. Accordingly,
[2.3]  r ∈ (1, n).
The public good G is then distributed evenly to all n players so that i’s payoff pi equals
what the player did not contribute plus i’s share of the public good:
[2.4]  pi = (1 − xi) + G/n.
Substitution of [2.2] into [2.4] yields
[2.5]  pi = 1 + (r/n − 1) xi + (r/n) ∑j≠i xj.
Since [2.3] requires that r < n, it follows that
[2.6]  (r/n − 1) < 0,
and any positive value for xi in [2.5], ceteris paribus, will decrease player i’s payoff.
In other words, for any given set of contributions by all other participants, an
individual’s payoff is maximized by contributing 0 to the public good. In contrast total
social welfare, measured here as the sum of all payoffs, is maximized when every
individual contributes its entire endowment to the public good.
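The free-rider logic derived in [2.5] and [2.6] is easy to verify numerically. The following Python sketch implements the payoffs of equations [2.2] and [2.4] with hypothetical contribution profiles:

```python
def payoffs(contributions, r):
    """Linear public good game payoffs with endowments standardized to 1:
    each player keeps (1 - x_i) and receives an equal share of the
    public good G = r * sum of contributions ([2.2] and [2.4])."""
    n = len(contributions)
    G = r * sum(contributions)
    return [(1 - x) + G / n for x in contributions]

# 4 players and r = 2, satisfying 1 < r < n as required by [2.3].
coop = payoffs([1, 1, 1, 1], r=2)    # everyone contributes fully
mixed = payoffs([0, 1, 1, 1], r=2)   # player 0 free-rides

assert sum(coop) > sum(mixed)        # full cooperation maximizes welfare...
assert mixed[0] > coop[0]            # ...but the free rider does better
```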
The prisoner's dilemma
The prisoner's dilemma is a reduced version of the public good game played
between only 2 players. Unlike the public good game, it is customary in the standard
prisoner's dilemma to limit strategies to those of full defection (x = 0) or full cooperation
(x = 1), so that each agent simply faces a binary choice – cooperate or defect. Despite this
simplification, the expected outcome of the prisoner's dilemma is the same as that of the
public good game – rational agents attempting to maximize their own individual benefit
will experience the least desirable social outcome.
A hybrid between the public good game and the standard prisoner's dilemma is the
continuous prisoner's dilemma (CPD). In this version of the prisoner's dilemma the game
is still restricted to two players, but the set of allowable strategic choices is expanded to
the entire interval [0, 1] as in the public good game. This allows for a richer set of
possible outcomes while maintaining the analytical simplicity of a 2-person interaction
(see Chapter 3 for a detailed description of the CPD).
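One common linear form of the CPD (not necessarily the exact formulation of Chapter 3) illustrates the tension: each player pays a per-unit cost for cooperating and confers a larger per-unit benefit on the partner. Parameter values below are hypothetical:

```python
def cpd_payoffs(x1, x2, b=2.0, c=1.0):
    """A generic linear continuous prisoner's dilemma: player i pays cost
    c * x_i for its own contribution and receives benefit b * x_j from
    its partner's contribution, with b > c > 0."""
    return b * x2 - c * x1, b * x1 - c * x2

# Mutual full cooperation beats mutual defection in total payoff...
assert sum(cpd_payoffs(1, 1)) > sum(cpd_payoffs(0, 0))
# ...yet against any fixed partner strategy, contributing less pays more.
assert cpd_payoffs(0.0, 0.5)[0] > cpd_payoffs(1.0, 0.5)[0]
```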
The ultimatum and dictator games
Bargaining games comprise another class of experimental games. This category
includes the ultimatum and dictator games, which are primarily used to understand the
evolution of fairness.
The ultimatum game is played between two players i and j and is structured so
that i, the proposer, is given an endowment from which a portion xi must be offered to j,
the responder. The offer may be any portion from 0% to 100% of the endowment. It is
customary when possible to standardize the endowment to 1 so that xi ∈ [0, 1]. The
responder may accept the offer, in which case each player receives her agreed upon share,
or may reject the proposal, in which case both players receive nothing. Economic theory
predicts that a rational responder will accept the smallest possible positive fraction, and
that a rational proposer, knowing this, will offer the smallest possible positive fraction
(Binmore 1992).
In a simplification of the ultimatum game known as the dictator game, the
responder has no ability to react: i simply gives a portion of its endowment to j and the
game ends (Osborne 2004). In this case the economic expectation is that i will offer 0.
This game is often used as a control case for comparisons to ultimatum game results.
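The rational-choice predictions for both games can be reproduced with a simple search over a grid of offers (an illustrative Python sketch; the 0.01 step stands in for an arbitrary smallest money unit):

```python
def ultimatum(offer, threshold):
    """Payoffs (proposer, responder) for an endowment of 1: the responder
    accepts any offer at or above her threshold; otherwise both get 0."""
    return (1 - offer, offer) if offer >= threshold else (0.0, 0.0)

# A payoff-maximizing responder accepts any positive amount, so the
# rational proposer offers the smallest positive fraction available.
offers = [i / 100 for i in range(101)]
best = max(offers, key=lambda x: ultimatum(x, 0.01)[0])
assert best == 0.01

# In the dictator game the responder cannot react, so offering 0 is best.
assert max(offers, key=lambda x: 1 - x) == 0.0
```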
Experimental games and rationality
In the context of controlled economic laboratory games it is customary to discuss
experimental predictions and results in terms of the utility of a subject’s payoffs u(p)
instead of consumption u(c). Since monetary payoffs normally translate easily into
purchasing power, this is a reasonable assumption and will be adopted throughout this
text.
It is another matter entirely that in nearly all experimental economics it is
implied, if not expressly stated, that rational agents are expected to maximize their payoff
p, not their utility from that payoff u(p). This is a subtle but very important distinction.
Experimenters can generally ignore this distinction by assuming that utility is a
monotonic transformation of payoffs so that if p1 > p2, then u(p1) > u(p2). In other words,
agents will maximize utility u(p) if they simply maximize their payoffs p. This
assumption is valid only if payoffs are ordered the same as preferences for those payoffs.
If they are not, then there is no basis to predict game outcomes using strict rational choice
theory.
This becomes a problem in light of numerous studies from experimental
economists that demonstrate human behavior in laboratory games is not always
consistent with payoff maximization. Instead, evidence suggests that a player’s utility is a
function, at least in part, of other players’ payoffs. This phenomenon, termed
interdependent preferences, is examined in detail in Chapter 6.
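One well-known functional form for such interdependent preferences is the inequity-aversion utility of Fehr and Schmidt (1999); it is shown here purely as an illustration of how u(p) can diverge from p, not as the specific model developed in Chapter 6. Parameter values are hypothetical:

```python
def inequity_averse_u(own, other, alpha=0.8, beta=0.4):
    """Fehr-Schmidt style utility for a two-player outcome: utility falls
    with payoff inequality, more steeply when the player is behind
    (alpha) than when ahead (beta). Parameters are illustrative."""
    return own - alpha * max(other - own, 0) - beta * max(own - other, 0)

# A player may prefer a smaller, equal payoff to a larger, unequal one,
# so maximizing u(p) no longer coincides with maximizing p.
assert inequity_averse_u(2.0, 2.0) > inequity_averse_u(2.5, 6.0)
```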
Contributing to this semantic confusion are the ways in which different authors
use the terms payoff and utility. In the majority of behavioral literature cited in this study,
expected payoff is the explicit quantified value (usually monetary) of a possible game
outcome. For example, if choice A in a game earns a player 5 dollars, we would say the
player’s expected payoff for choice A is 5 dollars. Yet in James D. Morrow’s textbook
Game Theory for Political Scientists (1994, p. 351), for example, the author defines the
term payoff as “A player’s utility for an outcome of a game.” In other words, payoffs
equal the utility of payoffs, or p = u(p). Is a payoff the expected value of a game outcome
or the expected utility of that outcome? Without loss of generality I consistently interpret
payoffs in this dissertation as the expected value of a game and not the expected utility.
This allows investigation of a given payoff matrix under different utility functions,
including functions that incorporate interdependent preferences or predispositions for fair
allocations discussed above.
Social Network Theory
The term “social structure” is increasingly confusing to those attempting to
understand individual behavior by examining explicit connections between members of a
society. The term is often used with models that segregate a society into smaller groups
that are merely well-mixed subpopulations, especially multi-level selection models (e.g.
Fryxell et al. 2007). This is still systems-level thinking with multiple, linked pools.
However, the explicit connection pattern between individuals within these societies is
still ignored. Here a narrower definition of social structure is proposed in which the
connections of every member of a society to every other member are explicitly defined.
In other words the social network of the society is defined.
Social network theory emerged from attempts by social scientists to understand
social phenomena in terms of the connection pattern among a society’s individuals (see
Wasserman and Faust (1994) for a comprehensive text on social networks and their
analysis). At the same time but in other academic circles, the mathematical theory of
graphs developed to understand quantitative properties of network structures. This has
unfortunately led to alternate vocabularies and techniques to address what are effectively
equivalent properties and phenomena of social networks. Throughout this dissertation the
following groups of terms from social network theory and graph theory may be used
interchangeably:
(a) social network, network, graph;
(b) connection, link, edge;
(c) agent, node, actor.
A social network is easily represented by a square matrix in which each node is
uniquely identified by its column and row. The presence or absence of a link between
every possible pair of nodes in a society is then recorded at the appropriate matrix entry.
Such a matrix is defined as an adjacency matrix. The simplest such matrix contains only
binary information: aij = aji = 1 if agents i and j are linked and aij = aji = 0 if they are not
(Figure 2.1). Index values may also contain information other than simply 0’s and 1’s, such as the
strength of a link, the cost of maintaining or using a link, probability of interaction, or
any number of other meaningful types of information describing the relationship between
two members of a population.
In addition, the simplest adjacency matrix represents what is known as a non-directed
graph (Figure 2.1a). In a non-directed graph if node i is linked to node j, then j is
linked to i. This need not always be the case. In some circumstances it is important to
distinguish between the link from i to j and the link from j to i. This is done by means of a
directed graph (Figure 2.1b). Simulation experiments in this dissertation use only simple
binary, non-directed graphs.
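A small example makes the representation concrete: a binary, non-directed network on four hypothetical agents, stored as a Python list-of-lists adjacency matrix (illustrative only; the dissertation's own code was written in Java):

```python
# a[i][j] = 1 if agents i and j are linked, 0 otherwise.
a = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
n = len(a)

# A binary non-directed graph must be symmetric and have no self-links.
assert all(a[i][j] == a[j][i] for i in range(n) for j in range(n))
assert all(a[i][i] == 0 for i in range(n))
```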
A further designation of graphs is whether they are connected or non-connected.
A connected graph is a network in which every node is reachable by every other node,
regardless of how many links must be traversed (Figure 2.2a). If there exists
any node that is not reachable by every other node, then the network is not a connected
graph (Figure 2.2b). As this dissertation is concerned with the ability of behavior to
propagate throughout a society or population, only connected graphs are considered.
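Connectedness is easily tested with a breadth-first search from any node: the graph is connected exactly when every node is reached. A minimal Python sketch:

```python
from collections import deque

def is_connected(a):
    """True if every node is reachable from every other node, via a
    breadth-first search over the binary adjacency matrix a."""
    n = len(a)
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if a[i][j] and j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == n

path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]     # 3-node path: connected
split = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]    # node 2 isolated: not
assert is_connected(path) and not is_connected(split)
```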
Social network metrics
There exists a myriad of quantitative descriptors of social networks, many of
which require laborious algorithms to compute. At least four are especially relevant to
the work of this dissertation.
Degree - The measure of degree can be confusing because it can describe both an
individual node and an entire network. The degree of any node i in an undirected network
is simply the number of direct links from i to other nodes. In other words it is a count of
how many neighbors to which i is directly connected in a non-directed network. In
contrast, the degree of a network (not just a single node) is the average degree of all
nodes in the network. In other words if di = number of neighbors to which i is linked,
then the degree of the entire network with n nodes can be represented as
[2.7]  d̄ = (1/n) ∑i=1..n di.
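Both notions of degree follow directly from the adjacency matrix; a short Python sketch of the per-node count and the network average of [2.7]:

```python
def node_degree(a, i):
    """Number of direct neighbors of node i in a binary adjacency matrix."""
    return sum(a[i])

def network_degree(a):
    """Average degree across all nodes, per equation [2.7]."""
    return sum(node_degree(a, i) for i in range(len(a))) / len(a)

# A 4-node star: the hub has degree 3 and each leaf has degree 1.
star = [[0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
assert node_degree(star, 0) == 3
assert network_degree(star) == 1.5    # (3 + 1 + 1 + 1) / 4
```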
Geodesic and Eccentricity - The geodesic of two nodes is the shortest distance between
them. When links are unweighted, so that they are all of equal length or value, the
geodesic is simply the least number of links required to travel between two nodes. The
geodesic (di,j) is synonymous with the popular notion of degrees of separation between
nodes i and j. A related measure is a node’s eccentricity, which is simply the largest
geodesic between that node and all other nodes in the network.
Diameter of a graph - The diameter of a graph D is the largest eccentricity value among
the nodes of the graph (and therefore the largest geodesic among the nodes). This can be
a useful measure of how close the members of a society are to each other.
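For unweighted graphs, all three measures fall out of breadth-first search distances; a Python sketch:

```python
from collections import deque

def geodesics(a, i):
    """Shortest link-count distances from node i to every reachable node
    (breadth-first search over an unweighted adjacency matrix)."""
    dist, queue = {i: 0}, deque([i])
    while queue:
        u = queue.popleft()
        for v in range(len(a)):
            if a[u][v] and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def eccentricity(a, i):
    """Largest geodesic between node i and any other node."""
    return max(geodesics(a, i).values())

def diameter(a):
    """Largest eccentricity (and hence largest geodesic) in the graph."""
    return max(eccentricity(a, i) for i in range(len(a)))

# A 4-node path 0-1-2-3: three links separate the endpoints.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
assert geodesics(path, 0)[3] == 3
assert eccentricity(path, 1) == 2
assert diameter(path) == 3
```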
Clustering coefficient - The clustering coefficient is a measure of the density of local
connections around a node. Specifically it measures how many neighbors of a node are
connected to other neighbors of the same node. This measure is standardized by the
number of possible connections between neighbors.
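A hedged sketch of this standardized measure (illustrative Java; the dissertation's own implementation is in Appendix B and may differ):

```java
// Sketch: local clustering coefficient of node i — the fraction of pairs of
// i's neighbors that are themselves linked to one another.
public class Clustering {
    static double clusteringCoefficient(boolean[][] adj, int i) {
        int n = adj.length;
        // collect i's neighbors
        int[] nbrs = new int[n];
        int k = 0;
        for (int j = 0; j < n; j++) if (adj[i][j]) nbrs[k++] = j;
        if (k < 2) return 0.0;               // no neighbor pairs to evaluate
        int linked = 0;
        for (int a = 0; a < k; a++)
            for (int b = a + 1; b < k; b++)
                if (adj[nbrs[a]][nbrs[b]]) linked++;
        // standardize by the k*(k-1)/2 possible connections between neighbors
        return 2.0 * linked / (k * (k - 1));
    }
}
```

A node whose neighbors form a clique has coefficient 1; a node on a simple path, whose two neighbors are unlinked, has coefficient 0.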
Classification of social networks
The following are descriptions of important classes of social networks or graphs
used in this dissertation along with descriptions of how they are generated. Graphical
representations of networks used in this research are presented in Figure 2.3 and Figure
2.4. The actual computer algorithms used to generate networks for this dissertation were
written in the Java programming language and are presented in Appendix B.
Random networks - A random network, also known as an Erdős-Rényi graph, is generated
simply by starting with a fixed number of nodes and then connecting pairs of nodes at
random until the desired total number of links in the network is reached (Figure 2.4a). In
this dissertation an extra step is added after a random network is generated to ensure that
the graph is connected (see above). If the random generation process results in a non-
connected graph, additional links are added until it is connected, resulting in an
insignificant variation in total number of links among these networks. Random networks
have the feature of low degrees of separation, or a small network diameter, but also have
low clustering coefficients.
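The generation procedure just described might be sketched as follows (illustrative Java; the connectivity-repair step discussed above is omitted for brevity):

```java
import java.util.Random;

// Sketch of the Erdős-Rényi construction described above: link random
// distinct, not-yet-linked node pairs until the target link count is reached.
public class RandomNetwork {
    static boolean[][] generate(int n, int links, Random rng) {
        boolean[][] adj = new boolean[n][n];
        int added = 0;
        while (added < links) {
            int i = rng.nextInt(n), j = rng.nextInt(n);
            if (i != j && !adj[i][j]) {
                adj[i][j] = adj[j][i] = true;  // undirected: symmetric entries
                added++;
            }
        }
        return adj;
    }
}
```

A full implementation would follow this with a connectivity check and, if needed, the extra links described above.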
Regular network - A regular network is often represented as a grid structure. All nodes
have the same degree, or number of neighbors, and are arranged in a regular repeating
pattern. In addition, such structures are toroidal, meaning that they have no edges but
instead wrap around onto themselves, like the surface of a torus. Regular networks
typically have high clustering coefficients but also high degrees of separation.
The two most commonly used regular networks in simulations are the von
Neumann graph (Figure 2.3c), in which every node is connected to four neighbors in a
toroidal grid, and the Moore graph (Figure 2.3e), in which every node is connected to
eight neighbors in a toroidal grid. Hexagonal networks (Figure 2.3d) are also used but
with less frequency.
Regular networks may also be one-dimensional instead of two-dimensional. In
this case the resulting network is linear and, because it is also toroidal, is known as a
ring (Figures 2.3a & b).
Random regular networks - Random regular networks have elements in common with
both regular grid structures and randomly generated graphs. Like regular networks, all
nodes in a random regular network have the same degree. However, links are generated
randomly instead of in a regular, repeating pattern (Figure 2.4f). As in the generation of an
Erdős-Rényi network, pairs of nodes are linked at random, but only until a node reaches
some predetermined degree. At that point, the node becomes ineligible for further links.
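One simple, if inefficient, way to realize this construction is to link random eligible pairs and restart whenever no eligible pair remains (a hypothetical sketch, not the Appendix B algorithm, which may differ):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of the random regular construction: link random eligible pairs until
// every node reaches the target degree; restart if the process dead-ends.
public class RandomRegular {
    static boolean[][] generate(int n, int degree, Random rng) {
        while (true) {
            boolean[][] adj = new boolean[n][n];
            int[] d = new int[n];
            boolean stuck = false;
            while (!stuck) {
                // nodes still eligible for further links
                List<Integer> open = new ArrayList<>();
                for (int i = 0; i < n; i++) if (d[i] < degree) open.add(i);
                if (open.isEmpty()) return adj;  // all nodes at target degree
                stuck = true;
                // look for an eligible, unlinked pair; give up after a few tries
                for (int tries = 0; tries < 100; tries++) {
                    int i = open.get(rng.nextInt(open.size()));
                    int j = open.get(rng.nextInt(open.size()));
                    if (i != j && !adj[i][j]) {
                        adj[i][j] = adj[j][i] = true;
                        d[i]++; d[j]++;
                        stuck = false;
                        break;
                    }
                }
            }
            // fell through: restart with a fresh graph
        }
    }
}
```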
Small world networks - Also known as a Watts-Strogatz graph, small-world networks
show features of both random networks and regular networks (Figures 2.4d & e) and
represent the dominant interaction pattern observed in human societies (Watts and
Strogatz 1998). Watts-Strogatz graphs are generated by starting with a regular network
and then randomly cutting and relinking ties in the network with a rewiring probability p.
Watts and Strogatz found that over a certain range of p, networks are generated that exhibit
two features long sought by social scientists: low degrees of separation and high
clustering coefficients.
In this dissertation small world networks are generated by starting with a
population structured in a linear ring (Figure 2.3a). This is known as the ring substrate
method of generation.
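A sketch of the ring-substrate construction (illustrative Java, not the Appendix B code; here p is the rewiring probability and r the neighborhood radius, as in footnote 6):

```java
import java.util.Random;

// Sketch of the Watts-Strogatz procedure on a ring substrate: build a ring in
// which each node links to its r nearest neighbors on each side, then rewire
// each link with probability p to a randomly chosen node.
public class SmallWorld {
    static boolean[][] generate(int n, int r, double p, Random rng) {
        boolean[][] adj = new boolean[n][n];
        // ring substrate: node i links to i+1 .. i+r (mod n)
        for (int i = 0; i < n; i++)
            for (int k = 1; k <= r; k++) {
                int j = (i + k) % n;
                adj[i][j] = adj[j][i] = true;
            }
        // random rewiring: detach one end of a link and reattach it elsewhere
        for (int i = 0; i < n; i++)
            for (int k = 1; k <= r; k++) {
                int j = (i + k) % n;
                if (adj[i][j] && rng.nextDouble() < p) {
                    int m = rng.nextInt(n);
                    if (m != i && !adj[i][m]) {
                        adj[i][j] = adj[j][i] = false;
                        adj[i][m] = adj[m][i] = true;
                    }
                }
            }
        return adj;
    }
}
```

With p = 0 the ring substrate is returned unchanged; as p grows the graph approaches an Erdős-Rényi random network.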
Scale-free networks - Also known as Barabási-Albert graphs, scale-free networks are
characterized by a power-law distribution of nodal degrees and are ubiquitous in the real
world, from metabolic pathways to river drainage patterns to the hyperlink patterns of the
world wide web (Barabási and Albert 1999). Scale-free networks are generated by
growth through preferential attachment (Figure 2.4b & c). That is, the network is
“grown” by adding new nodes one at a time until the desired number of total nodes is
reached. As a new node is added, the point within the existing network at which it is
attached is probability-based. The probability that any one node in the existing network
will be the point of new attachment is proportional to the number of existing links that
node already has (for a discussion of preferential attachment based on factors other than
number of existing links see Ko et al. 2008).
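The preferential-attachment growth process can be sketched with a degree-proportional "endpoint pool": a node appears in the pool once for every link it holds, so drawing uniformly from the pool selects attachment points in proportion to degree (illustrative Java; the seed structure and names are my assumptions, not the Appendix B code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of growth by preferential attachment: each new node attaches to m
// existing nodes, chosen with probability proportional to their current degree.
public class ScaleFree {
    static List<int[]> generate(int n, int m, Random rng) {
        List<int[]> links = new ArrayList<>();
        // endpoint pool: each node appears once per link it holds, so drawing
        // uniformly from the pool is degree-proportional selection
        List<Integer> pool = new ArrayList<>();
        // seed: a small complete core of m+1 nodes
        for (int i = 0; i <= m; i++)
            for (int j = i + 1; j <= m; j++) {
                links.add(new int[]{i, j});
                pool.add(i); pool.add(j);
            }
        for (int node = m + 1; node < n; node++) {
            List<Integer> targets = new ArrayList<>();
            while (targets.size() < m) {
                int t = pool.get(rng.nextInt(pool.size()));
                if (!targets.contains(t)) targets.add(t);  // m distinct targets
            }
            for (int t : targets) {
                links.add(new int[]{node, t});
                pool.add(node); pool.add(t);
            }
        }
        return links;
    }
}
```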
Though scale-free networks as described above are sometimes classified as
simply another form of small world network (e.g. Buchanan 2002), in this dissertation I
draw a sharp distinction between the two. While scale-free networks may have the feature
of low degrees of separation in common with small world networks, the clustering
coefficient of a scale-free network is rarely high and more closely resembles that of a
random network.
Agent-based Modelling
Because we live in a world of continuous change, standard theories of behavior
are of little use in understanding dynamic behavior (North 2005). Not only are traditional
mathematical methods of understanding dynamic systems limited to a small number of
special cases, but those methods may have little potential for understanding how
individual behavior affects emergent properties at higher scales (Anderies 2002, Harrison
and Singer 2006). Therefore, argues North (2005), researchers must dispense with a quest
for elegant mathematical equilibria if their goals are to truly understand behavior in a
changing world (see also Janssen 2002).
In some cases the overzealous pursuit of mathematically simple models has led to
unrealistic representations of a system (Harrison and Singer 2006). In the case of early
models of group selection, simplifying assumptions made to allow formalized treatment
led to vehement rejection of group selection theories and may have set back unbiased
research on the topic for over 30 years (Wilson and Wilson 2007).
Prior to the availability of sufficient computational power, systems of interacting
parts were often analyzed through sets of difference or differential equations. As
computing power grew, methods of systems dynamics emerged in which computers were
used to iteratively solve complex systems of equations. However, systems level analysis
is still limited to discovering and describing macro-level, aggregate phenomena, and are
not concerned with heterogeneous attributes of the individual components of a system
(Sawyer 2005). While aggregate mathematical models have worked remarkably well for
systems such as gas molecules, they may create particularly unrealistic representations of
biological systems (Bedau 1999). This is especially true of societies (Sawyer 2005,
Harrison 2006).
Since the early 1990s agent-based modelling has emerged as a preferred method
of investigating such dynamic processes when analytical methods become intractable; it
is ideal for creating models in which social structure is explicitly acknowledged or
individuals are not atomistic clones of one another (Drogoul and Ferber 1994, Bedau
1999, Janssen 2002, Sawyer 2005, Wilson and Wilson 2007). In hierarchical terms, the
modelled unit moves from the system to the individual entities that comprise the system.
Thomas Schelling’s (1971) model of residential segregation, which exhibited aggregate social patterns
not predicted by equations, is often acknowledged as the first fruitful use of agent-based
modelling in the social sciences. Nowak and May further demonstrated the shortcomings
of systems level models by showing that restricting agents to a defined structure of
possible interactions led to very different outcomes than when the same system was
modelled as a well-mixed aggregate (Nowak and May 1992).
Agent-based modelling has achieved acceptance in part from a growing
realization of the limitations of reductionist science (Sawyer 2005) and by renewed
attempts to use holistic or organic approaches to understanding a social system as a whole
(Sawyer 2005, Harrison 2006). To fully explore the evolution of cooperation in systems
of social agents, computational social simulation, or agent-based modelling, is used as the
primary method of investigation in this dissertation. Unlike laboratory experiments, the
use of agent-based simulations allows careful control over factors that may confound
empirical studies, such as emotion, reputation, visual cues, anonymity, or cultural
influences (Cederman 2001). This control allows researchers to single out cultural and
other factors that may be most important to facilitating cooperation and allows almost
unlimited creativity in designing virtual experiments. More importantly, agent-based
models go beyond the capabilities of mathematical analysis to allow investigation of
dynamic systems far from stable equilibrium points. In particular agent-based modelling
is used in this study to explore behavior in populations of agents that play dynamic,
evolving strategies in various economic game situations.
However, it should be noted that this dissertation does not advocate simulations of
networked populations as a panacea for understanding the evolution of social behavior.
For instance, certain social aspects of eusocial insect colonies are described well by
systems of differential equations (Reeve and Hölldobler 2007) even though there is
evidence that colony interaction patterns exhibit characteristics of social networks
(Fewell 2003). Instead the explanatory power of this dissertation is most applicable to
those superorganisms whose internal structure is clearly described by complex social
networks.
One criticism of agent-based computer simulations is that their results are often
produced from proprietary programs or cannot otherwise be independently verified
(Bedau 1999). The appropriate response to this is to verify that results are independent of
software platform and that results are reproducible by sufficiently informed colleagues.
This can be accomplished by supplying colleagues with only pseudo-code, or a concise
description of a simulation, and confirming that they are able to reproduce results
(Edmonds and Hales 2003). When possible, simulation programs developed for this
dissertation have been independently recreated by fellow researchers3 with sufficient
expertise in agent-based modelling using only pseudo-code as a guide.
The simulation algorithms developed for this dissertation4 will continue to have
research uses well beyond those of this project. In addition to follow-up questions that
may arise from this research, seemingly unrelated questions about the evolution of
cooperation may be explored easily once models such as these are standardized.
3 Preliminary findings of this dissertation were successfully replicated by Dr. Francois Bousquet of
CIRAD, France, using the CORMAS modeling platform (http://cormas.cirad.fr/indexeng.htm), and by S.
Alessio Delre, of the University of Groningen, Netherlands, using the C programming language.
4 Simulations were written in Java 1.5.0 (http://java.sun.com/) using the Eclipse 3.2 software development
kit (http://www.eclipse.org/). See Appendix B for detailed code.
1 2 3 4 5 6
1- 0 0 1 0 0 0
2- 0 0 0 1 1 0
3- 1 0 0 0 1 0
4- 0 1 0 0 1 1
5- 0 1 1 1 0 1
6- 0 0 0 1 1 0
(a) Example of a non-directed graph with 6 nodes and its representation as an adjacency
matrix. Note that the adjacency matrix of a non-directed graph is symmetrical about the
main diagonal.
1 2 3 4 5 6
1- 0 1 0 0 1 0
2- 0 0 0 0 0 1
3- 0 0 0 0 1 0
4- 0 1 0 0 1 1
5- 0 0 1 0 0 0
6- 1 0 1 1 0 0
(b) Example of a directed graph with 6 nodes and its representation as an adjacency
matrix. The adjacency matrix of a directed graph need not be symmetrical.
Figure 2.1. Examples of directed and non-directed graphs and their adjacency matrices.
(a) A connected graph with 20 nodes. Every node is reachable by every other node.
(b) A non-connected graph. No node is reachable by every other node. Note that node 6 is
reachable only by three other nodes, while node 14 is not reachable by any other node.
Figure 2.2. Examples of connected and non-connected graphs. Because this dissertation
examines cases where interactions take place between agents, only connected graphs are
used throughout.
(a) (b) (c)
(d) (e) (f)
Figure 2.3. Examples of regular networks used in this dissertation. Each example network
is composed of 64 nodes. (a) A ring with neighborhood radius = 1. (b) A ring with
neighborhood radius = 2. (c) A von Neumann lattice. (d) A hexagonal lattice. (e) A
Moore lattice. (f) A complete graph. Examples c, d, and e are known as regular lattices
and are actually toroidal, meaning they bend back around on themselves to make a
single surface with no edges.
(a) (b) (c)
(d) (e) (f)
Figure 2.4. Examples of non-regular networks used in this dissertation. Each example
network is composed of 64 nodes. (a) A random network with probability of link = 0.20.
(b) A scale-free network with one link per new node. (c) A scale-free network with two
links per new node. (d) A small-world network using a ring substrate with neighborhood
radius = 2 and probability of rewire = 0.2. (e) A small-world network using a ring
substrate with neighborhood radius = 2 and probability of rewire = 0.05. (f) A random
regular network with 4 links per node.
CHAPTER 3
PUNISHMENT AND SOCIAL STRUCTURE:
COOPERATION IN A CONTINUOUS PRISONERS DILEMMA
Introduction
The phenomenon of cooperative behavior remains unexplained in several
branches of science. Though a number of mechanisms have been proposed to explain at
least some observable instances of cooperation (Hamilton 1964, Trivers 1971, Axelrod
and Hamilton 1981, Wilson and Sober 1994, Fehr et al. 2002, Foster et al. 2004) they
invariably apply to limited cases or special circumstances (see Chapter 2, Alternate
Theories for further discussion).
Arguably the leading contemporary explanation for the evolution of cooperation
is the phenomenon of altruistic punishment. Altruistic punishment occurs when an
individual incurs a cost to punish another without receiving any material benefit in return
(Fehr and Gächter 2002). This mechanism has been shown repeatedly to induce
cooperative behavior in laboratory experiments with humans, where subjects often pay to
punish players that are not even in the same game as the punishee (Ostrom et al. 1992,
Fehr and Gächter 2000, 2002, Andreoni et al. 2003, Gürerk et al. 2006).
From an economic perspective, however, the phenomenon of altruistic
punishment is just as irrational as cooperation, and as an explanation of cooperation in
one-shot anonymous interactions it only shifts the question from “why should an
individual cooperate?” to “why should an individual altruistically punish?” It therefore
remains to demonstrate a causal mechanism for altruistic punishment if it is to explain the
evolution of cooperation. One mechanism currently proposed as the key to altruistic
punishment is cultural group selection (Richerson and Boyd 2005, Hagen and
Hammerstein 2006). This dissertation offers evidence for at least one mechanism leading
to the evolution of altruistic punishment – social structure.
Like other explanations of cooperation, many theories of altruistic punishment are
limited by the fact that they are framed in terms of evolutionary game theory (Maynard-
Smith 1982) and fail to address the social structure governing interactions between actors
(Jackson and Watts 2002). Such explanations assume a system is homogeneous or well-
mixed and that members of the system interact randomly with each other with equal
probability. These system dynamics models have limited applicability to groups of social
organisms (Sawyer 2005, Griffin 2006, Harrison and Singer 2006).
Previous simulations have shown that adding simple two-dimensional space leads
to very different behavior than simple well-mixed population models (Schelling 1971,
Nowak and May 1992). Despite a growing tendency to include social structure in
simulation models, there is still a bias for use of overly simplistic, regular two-
dimensional lattices. Real-world network structures of cooperating agents are known to
be far from well-mixed, yet neither do they conform neatly to the regular pattern of a
lattice (Barabási and Albert 1999, Amaral et al. 2000, Dorogovtsev and Mendes 2003).
Social scientists, on the other hand, have long acknowledged the association
between complex social networks and cooperation (Oliver 1984, Marwell et al. 1988,
Gould 1993, Chwe 1999, 2000). Yet only recently has the simulation and modelling
community begun to move beyond regular lattice structures to explore the important role
that complex social networks play in the evolution of cooperation (Santos and Pacheco
2005, Santos et al. 2006b, Chen et al. 2007, Olfati-Saber et al. 2007).
It remains, then, to explore the combined effects of punishment and social structure
and whether their combination may lead to a broadly applicable explanation of
cooperation. As Michael Chwe (1999) lamented,
Collective action has been studied in two largely disjoint approaches, one
focusing on the influence of social structure and another focusing on the
incentives for individual participation. These approaches are often seen as
competing or even opposed.
This chapter attempts to bridge these two approaches by focusing on altruistic
punishment as the mechanism for the evolution of cooperation and by demonstrating how
incorporating social structure may make punishment a viable mechanism for the
evolution of cooperation. Agent-based computer simulations were conducted to test the
ability of altruistic punishment to induce cooperation on a variety of social network
structures.
The Simulation Model
To test the ability of altruistic punishment to induce cooperation, a punishment
option was incorporated into simulations of the continuous prisoners dilemma played out
on a variety of social networks. These networks included a complete graph, representing
a well-mixed system, several regular lattices representative of some of the first
computational simulations and often analogous to spatial explicitness, and more
sophisticated complex social networks, such as scale-free5 and small-world networks6,
representative of many real-world processes in physical, biological, and social systems.
5 This simulation uses a Barabási-Albert (1999) type algorithm to create scale-free networks by preferential
growth. That is, the network is “grown” by adding new nodes one at a time until the desired population
level is reached. As a new node is connected, the point within the existing network at which a new node is
attached is probability-based. The probability that any one node in the existing network will be the point of
The continuous prisoners dilemma (CPD)
In the classic prisoners dilemma players are limited to two choices - cooperate or
defect. Here that requirement is relaxed and players are able to select a level of
cooperation at any point on a continuum between full cooperation and full defection. This
presents an arguably more realistic picture of choices facing those in social dilemmas
(Sandler 1999, Killingback and Doebeli 2002). In this dissertation the set of contribution
choices is standardized to the interval [0,1] so that 0 = full defection and 1 = full
cooperation. This is known as the continuous prisoners dilemma (CPD).
The CPD can also be thought of as a simplified version of the public goods game
described in Chapter 2 with only 2 players, i and j. When n = 2, [2.4] is modified so that
i’s payoff becomes
[3.1] pi = 1 – xi + r(xi + xj)/2; r ∈ (1,2).
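Equation 3.1 is simple enough to state directly in code (a minimal sketch; the class and method names are mine):

```java
// Sketch of the CPD payoff in Eq. 3.1: a player keeps its unspent endowment
// and receives half of the multiplied joint contribution.
public class CpdPayoff {
    // xi, xj in [0,1]; r in (1,2)
    static double payoff(double xi, double xj, double r) {
        return 1.0 - xi + r * (xi + xj) / 2.0;
    }
}
```

For r = 1.5, mutual defection yields 1.0 each and mutual full cooperation yields 1.5 each, yet contributing 0 against a full cooperator yields 1.75: for any opponent contribution, a payoff-maximizing agent contributes 0.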
The addition of altruistic punishment introduces a third player to the game, the observer k,
and the possibility of punishment further modifies the potential payoffs. The CPD payoff
matrix used in this chapter is presented in Table 3.2.
new attachment is proportional to the number of existing connections that node already has. An important
parameter to consider when growing these networks is the number of links that a new node makes to the
existing network. Results in this study were obtained using scale-free networks grown by nodes that linked
to two nodes of the existing network. Supplementary simulations with single-link attachment showed no
appreciable difference in results.
6 This simulation uses the Watts-Strogatz (1998) algorithm for creating small-world networks by random
rewiring of a regular ring structure. That is, the algorithm begins with a simple circle of connected agents
(a ring substrate) then randomly rewires links from an adjacent node in the circle to one selected at random
anywhere in the population. An important parameter governing this algorithm is the radius of agents in the
initial circle that are considered neighbors. Let r = the neighborhood radius which equals the number of
links between an agent and what constitutes a neighbor. Given any agent A, when r = 1, only the two
agents on either side of A are considered neighbors; when r = 2, two agents in each direction from A (four
total) are considered neighbors, and so forth. Results presented in this study for small-world networks were
obtained using a ring substrate with r = 2. Supplementary simulations using a ring substrate with r = 1
showed no appreciable difference in results.
Like the public goods game, for any given contribution by an opponent, an
individual’s payoff is maximized by contributing 0 to the public good. This is the
expected rational choice or Nash equilibrium of the prisoners dilemma (Binmore 1992).
The dilemma arises, however, because total social welfare is maximized when both
individuals cooperate fully. Theory predicts that, given rational agents, each player in the
CPD will contribute 0 to the public good and that, regardless of the amount of the agent’s
contribution, an observing neighbor will never pay to punish (Fehr and Gächter 2000).
Game play
A simulation run begins with the creation of a social network. Let N(V,E) be a
connected network where V is the array of vertices or nodes and E is the array of edges or
links. Each node is occupied by a single agent i consisting of strategy (xi, ti, ci) where xi =
the contribution i makes to the public good when playing against j, ti = the contribution
below which the agent will punish another agent in a game being observed by i, and ci =
the cost that i is willing to incur to punish the observed agent when the observed agent’s
contribution is too low (Table 3.1). In other words ti determines if agent i will punish and
ci determines how much agent i will punish. Each strategy component xi, ti, ci ∈ [0,1] and
is generated randomly from a uniform distribution at the beginning of each simulation.
To control for other factors that might contribute to the maintenance of cooperation, such
as interaction history or reputation, the model does not allow recognition of or memory of
other agents within the population. Every game is effectively one-shot and anonymous.
During a single CPD game an agent i initiates the encounter by randomly
selecting j from its neighborhood, which unless otherwise indicated, consists of all nodes
one link away from i in the given network type. Agents are given their endowment of one
unit from which each simultaneously contributes a portion to a public good. Payoffs are
then calculated using the payoff matrix in Table 3.2. The initiating player i then randomly
selects a second neighbor k, who is tasked with observing and evaluating i’s contribution.
If k judges the contribution to be too low (xi < tk), k pays ck to punish i in the amount of
ckM, where M is the relative strength of punishment referred to here as the punishment
multiplier. Each agent initiates three CPD games during a single generation of the
simulation and each simulation run proceeds for 10,000 generations.
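One game round as described above might be sketched as follows (hypothetical Java; the Agent class and its field names are illustrative, not those of Appendix B):

```java
// Sketch of a single game round: i and j contribute simultaneously, payoffs
// follow Eq. 3.1, then observer k punishes i at cost ck if xi < tk.
public class GameRound {
    static class Agent {
        double x, t, c;   // contribution, punishment threshold, punishment cost
        double p;         // payoff accumulated this generation
    }

    static void playRound(Agent i, Agent j, Agent k, double r, double M) {
        // simultaneous contributions to the public good (Eq. 3.1)
        i.p += 1.0 - i.x + r * (i.x + j.x) / 2.0;
        j.p += 1.0 - j.x + r * (i.x + j.x) / 2.0;
        // observation & punishment: k pays c_k to reduce i's payoff by c_k * M
        if (i.x < k.t) {
            k.p -= k.c;
            i.p -= k.c * M;
        }
    }
}
```

For example, a full defector facing a full cooperator earns 1.75 from the game but, if observed by a punisher with c = 0.2 under M = 2, loses 0.4 of it, while the observer pays 0.2.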
Each generation consists of three routines – game play, observation &
punishment, and selection & reproduction. During each routine an agent interacts only
with its immediate neighbors as defined by the network type and all interactions take
place in parallel. The payoff variable p for each agent tallies the costs and payoffs an
agent experiences during a generation. Because this model depicts the elementary case in
which parents do not differentially provision resources for their offspring, p = 0 for each
agent at the beginning of a new generation7.
Following game play and punishment agents compete with one another for the
right to pass offspring to the next generation. During this reproduction routine each agent
i randomly selects a neighbor j with which to compare respective payoffs accumulated
during the generation. If pi > pj, i’s strategy remains at i’s node in the next generation.
However, if pi < pj, j’s strategy is copied onto i’s node for the next generation. In the
event that pi = pj, a coin toss determines the prevailing strategy. As strategies are copied
7Many species, especially social animals, do contribute to the success of their offspring through resource
provisioning or parental care. See (Wilson 2000) for examples.
to the next generation each of the three strategy components of every agent is subject to
mutation with a probability m = 0.10. If selected for mutation, Gaussian noise is added to
the component with mean = 0 and std. dev. = 0.01. Should mutation drive a component’s
value outside [0,1] the value is adjusted back to the closer boundary value.
Simulation variables and output
The important parameter governing the mechanism of altruistic punishment is the
ratio of costs incurred by the punishing party to those of the party being punished (Casari
2005, Shutters 2008). Defined above as the punishment multiplier M, this parameter is
analogous to the strength or efficiency of punishment and, along with network type, is the
independent variable in these simulations. The dependent variables of interest are the
mean contribution and the mean payoff which evolve in a population after 10,000
generations. The mean contribution represents the population’s level of cooperativeness
while the mean payoff represents the population’s social welfare.
Data were collected in two sets. In the first data set, 100 simulation replications
were conducted at M = 0.0 and then at subsequent values of M in increments of 0.5, up to
M = 6.0 (Table 3.2). This allowed for an analysis of variances in outcomes for a given
simulation parameter set. In the second data set, a parameter sweep of M was conducted
so that a single simulation was run at 0.0 and at subsequent values of M in increments of
0.01 up to M = 6.0 (Figure 3.1). This allowed for an analysis of the effect of M at higher
resolution but at the cost of no replications.
Results
Control case: no social structure, no punishment
For control purposes the initial population was simulated on a complete graph
with no altruistic punishment. As predicted by rational choice theory, the population
evolved contribution rates of approximately 0. In the absence of both punishment and
social structure no cooperation was exhibited.
Either punishment or social structure alone
In the second set of simulations, populations were subjected to alternate
treatments of either social structure or punishment. First, with punishment disabled
simulations were run on a variety of network structures (Table 3.4). Unlike similar
experiments with fairness in the ultimatum game (Chapter 5), social structure alone did
not drive outcomes away from the Nash equilibrium (Table 3.2), and no cooperation evolved.
Only in the anomalous case of scale-free networks did contributions deviate from the
expected contribution rate of approximately 0.
Next, using a well-mixed complete graph analogous to no social structure,
simulations were run in which punishment was enabled. Despite having the ability to
punish one another, populations lacking structure continued to evolve to the Nash
equilibrium with increasing M (Figure 3.1a). Neither social structure alone nor
punishment alone was sufficient to induce the population to evolve away from non-
cooperative behavior. Again these results concur with rational expectations.
Punishment and social structure together
In the final round of simulations, the CPD was played using both structured
populations and altruistic punishment for which the punishment multiplier M was
systematically varied. Results are presented in Figure 3.1. With increasing M, punishment
eventually led to nearly full cooperation on a Moore lattice and a small-world network.
Under these network types as M increased, populations underwent a rapid transition from
contributions ~ 0 to contributions ~ 1 (Figure 3.1b & 3.1c). This flip from nearly full
defection to nearly full cooperation occurred also in supplemental simulations run on the
following social structures: von Neumann lattice, hexagonal lattice, and linear (ring)
structures (Table 3.4). Response curves for these structures were so similar to those of the
Moore lattice and small-world network that their figures are excluded for the sake of
brevity.
Interestingly, populations using scale-free networks neither evolved to the Nash
equilibrium nor showed any significant response to the introduction of altruistic
punishment. Though Figure 3.1d reveals a slight positive trend in mean contributions
under a scale-free network with increasing M (R2 = 0.007), the trend is not significant
(Spearman rank order correlation, p = 0.086).
Discussion
Cooperation in continuous versus discrete games
In treatments with social structure and no punishment, populations playing the
continuous prisoners dilemma evolved contribution levels of approximately 0. This
presents an interesting contrast to early simulation work of Nowak and May (1992), in
which spatially arrayed populations played the standard prisoners dilemma. Recall that in
the standard prisoners dilemma agents are restricted to only two choices – cooperate or
defect (see Chapter 2 above). In other words the standard prisoners dilemma is the
discrete choice counterpart of the continuous prisoners dilemma. Nowak and May found
that under certain parameter settings populations evolved to an equilibrium mixture of
cooperators and defectors. This may indicate an important difference between continuous
and discrete games and should be explored further. However, Nowak and May also used
a non-toroidal, finite plane, which gives rise to edge effects that are not present in the
current study. Furthermore, their simulations used initial populations of 90% cooperators,
whereas this study’s initial population uses a uniform distribution of public good
contribution levels.
Localization of interactions and the evolution of altruistic punishment
Though altruistic punishment is now accepted as a mechanism for maintaining
cooperation, it remains to explain the evolution of the punishment mechanism itself. This
is true because, just as an agent contributing in the prisoners dilemma receives a lower
payoff than those that do not contribute, an agent that punishes receives a lower payoff
than those that do not punish.
Social structure, through its restriction of agents to local interactions, offers one
possible explanation for the rise of the seemingly irrational phenomenon of altruistic
punishment. Results are robust when moving from artificial social structures, such as
regular lattices, to stochastically generated small-world networks which more realistically
represent complex human interaction patterns. Furthermore, because social networks are
not exclusive to human societies (Fewell 2003, Lusseau and Newman 2004, Flack et al.
2006) this finding may have broad applicability wherever social organisms engage in
costly punishment.
Under the parameters of these simulations it is clear that punishment, as a
mechanism for the evolution of cooperation, is only a viable explanation in the presence
of structured populations. It would appear that Chwe is correct in his call for a melding of
those exploring social structure and those studying individual incentives as mechanisms
for collective action.
Cooperation and network density
These results also shed light on a contemporary debate among network scientists
regarding the role that social networks play in facilitating cooperation. In particular there
has been lively discussion on the role of “dense” networks, or what is defined in this
study as complete (or nearly complete) networks. A long-held belief is that when a
population is more densely connected the likelihood of cooperation increases (Marwell
and Oliver 1993, Opp and Gern 1993, Jun and Sethi 2007). On the other hand, recent
research suggests the opposite and shows that dense networks inhibit cooperation in a
structured population (Flache and Macy 1996, Flache 2002, Takács et al. 2008). Results
from this dissertation support the latter view. Simulations using the maximally dense
complete network never evolved cooperation even when the punishment multiplier was
set to the unrealistic value of M = 5,000. Instead, cooperation evolved only on sparsely
linked networks (Table 3.4, Table 3.5).
The view that increasing network density adversely affects cooperation is further
supported by the results from regular networks. Though full cooperation eventually
evolved on each of the regular networks, the severity of punishment (measured as the
magnitude of M) required to move the population from defectors to cooperators increased
as the density of the network increased (Table 3.5). In other words, the more densely a
network was linked, measured as the number of neighbors per agent in a regular lattice,
the stronger the punishment required to evolve cooperation (Figure 3.2). This finding is
in direct contrast to a study by Jun and Sethi, who conclude that “dense networks are more
conducive to the evolution of cooperation” (Jun and Sethi 2007, p. 625).
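Density, measured here as the number of neighbors per agent in a regular lattice, can be normalized to the fraction of all possible links present. A minimal illustrative calculation (the function name is hypothetical, not taken from the dissertation's code):

```python
def lattice_density(neighbors_per_agent, n_agents=400):
    """Fraction of all possible links present in a regular network.

    In a regular lattice every agent has the same degree, so density is
    simply degree / (N - 1); the complete graph has density 1.
    """
    return neighbors_per_agent / (n_agents - 1)

# The regular structures of Table 3.5, from sparsest to densest:
for name, k in [("Linear", 2), ("von Neumann", 4), ("Hexagonal", 6),
                ("Moore", 8), ("Complete", 399)]:
    print(f"{name}: density = {lattice_density(k):.4f}")
```

Even the densest lattice used here (Moore, 8 neighbors) occupies only about 2% of the possible links among 400 agents, which makes the contrast with the complete graph stark.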
Social dilemmas and their underlying social structure
Results from these simulations reveal that it may be possible to classify social
dilemmas based on the social structure under which they occur. If so, it may give policy
makers a new tool by creating a system of institutional recommendations for each
structural class of social dilemma. For instance, social dilemmas occurring in a society
characterized by a small-world network may be amenable to institutionalized
punishment. Dilemmas characterized by interactions following a scale-free network or a
highly dense network may require other institutional solutions.
The anomaly of scale-free networks
An unexpected simulation result was the response to punishment of populations
embedded in scale-free networks. Unlike other social networks used in this experiment,
populations on scale-free networks appear to be unresponsive to punishment even as M
increases. To ensure that results were not due to an inadequate sampling of the model’s
parameter space, simulations were run on scale-free networks at M = 5,000 but again
resulted in no convergence.
Another possible explanation for these results is that convergence on scale-free
networks takes longer to emerge. Therefore, simulations were re-run at M = 1.5 but
extended to 200,000 generations. However, even after extending the evolutionary period
by 20 times, no convergence in contribution rates occurred.
These results suggest that there are features unique to scale-free networks that
should be identified through further investigation. This is especially important given that
scale-free architecture is common in nature and is known to exist in widely diverse
organic systems, from cellular signal transduction pathways to the world wide web
(Barabási and Albert 1999).
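For reference, the scale-free architecture at issue can be sketched with a Barabási-Albert-style preferential attachment process; m = 2 links per new node follows Table 3.3, but the code itself is an illustrative sketch, not the dissertation's implementation.

```python
import random

def scale_free(n=400, m=2, seed=1):
    """Barabási-Albert-style scale-free network (illustrative sketch).

    Each new node attaches up to m links to existing nodes chosen with
    probability proportional to their current degree, producing the
    hub-dominated degree distribution characteristic of scale-free graphs.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    targets = list(range(m))   # start by attaching to m seed nodes
    repeated = []              # node list weighted by current degree
    for v in range(m, n):
        for w in set(targets):             # deduplicate repeated draws
            adj[v].add(w)
            adj[w].add(v)
            repeated.extend([v, w])
        # degree-proportional choice: nodes appear in `repeated` once per link
        targets = [rng.choice(repeated) for _ in range(m)]
    return adj
```

The `repeated` list implements preferential attachment without explicit probabilities: a node with twice the degree appears twice as often and is therefore twice as likely to gain the next link, which is what concentrates connectivity in a few hubs.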
Evolutionary dynamics
Results presented thus far have consisted of a population’s mean contribution at
the end of 10,000 simulated generations. Before ending discussion it is important to
examine the evolutionary trajectory through time. While it is true that one advantage of
evolutionary computer simulations is the ability to store data of every interaction during
every generation, the immense data processing and storage requirements that would be
needed to completely analyze evolutionary dynamics were beyond the scope of this
dissertation. However, a small subset of such trajectories is presented as examples of
population dynamics over time. Complete generation-level data was collected for three
individual simulations run on small-world networks. Simulations were selected to give
examples of a population that evolved to full defection at low strength of punishment (M
= 0.5), a population that evolved to full cooperation at higher strength of punishment (M
= 3.0), and a population that evolved to an intermediate level of cooperation at M = 1.75,
corresponding to the chaotic transition range in Figure 3.1c.
Results of this brief survey are presented in Figure 3.3. Simulations run at M = 0.5
and M = 3.0 converged to full defection and full cooperation respectively within the first
400 generations. In the intermediate range near the transition between defection and
cooperation (M = 1.75), the population’s mean contribution rate did not converge over
time to either cooperation or defection but instead drifted in a random fashion. To
ascertain whether the population was simply converging more slowly at punishment
strengths near the transition point, the simulation run at M = 1.75 was extended to
100,000 generations but still exhibited no convergence in contribution rate.
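The variation that drives these trajectories can be sketched from the parameters in Table 3.3: each strategy component mutates with probability m = 0.1 by adding Gaussian noise with mean 0 and standard deviation 0.01. The sketch below is illustrative only; clamping mutated values to [0, 1] is an assumed detail consistent with the component ranges in Table 3.1.

```python
import random

def mutate(strategy, m=0.1, sigma=0.01, rng=random):
    """Mutate a strategy tuple (x, t, c) per the parameters of Table 3.3.

    Each component independently mutates with probability m by adding
    Gaussian noise (mean 0, s.d. sigma). Clamping the result to [0, 1]
    is an assumption matching the component ranges in Table 3.1.
    """
    return tuple(
        min(1.0, max(0.0, s + rng.gauss(0.0, sigma))) if rng.random() < m else s
        for s in strategy
    )
```

With noise this small, populations drift in tiny steps, which is consistent with the slow, wandering trajectory observed near the transition value of M.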
Cooperation in other games
It is important to acknowledge that the prisoners dilemma is but one of many 2-person
games used to explore and understand social dilemmas. Though it is more
commonly used than others, it is likely not representative of all social dilemmas.
Manipulation of the payoff structure in the prisoners dilemma leads to several other
games with alternative equilibria and expected outcomes. Hauert (2001) gives a
comprehensive description of different games that arise when ordinality of payoffs
changes. It is likely that all these alternate games have applicability to at least some real
world social dilemmas.
One alternative game enjoying increased attention from scientists in recent years
is the snowdrift game, also known as the chicken game or the hawk-dove game. Unlike
the prisoners dilemma the snowdrift game has two Nash equilibria, neither of which is
the least socially desirable outcome (defect-defect). In recent simulations of the snowdrift
game on a regular (von Neumann) network, results showed that, in contrast to the
prisoners dilemma, spatially structuring the population actually inhibits cooperation
(Hauert and Doebeli 2004).
Future Directions
A limitation of this study is that comparisons were made between different
networks based on a nominal classification scheme. While this does allow for a test of
significance through ANOVA, it is less desirable than a general linear model in which
CPD contributions could be regressed against one or more numerical descriptors of the
underlying networks. As stated above, several such quantitative statistics exist to describe
social networks (Wasserman and Faust 1994). However, the computational requirements
to calculate such statistics for even a single randomly generated network are extensive
and the exploration of evolutionary space required to generate a meaningful linear model
would require the random generation and quantitative measure of hundreds or even
thousands of networks. There is currently no feasible way to accomplish this high-
volume quantitative analysis in addition to the computational requirements of the CPD
simulations themselves. To move forward with evolutionary network science such as that
presented in this dissertation, it is important that such a computational solution be
developed.
Second, even though several nominal classifications of networks were used in this
experiment the simulated worlds remain essentially flat and one-dimensional. An
approach more representative of the complexities of real-world societies would be the use
of multiple hierarchically nested networks. For instance, a model of metapopulations may
place populations at each node of a network. However, each population may itself be
made of multiple interacting actors arranged in their own network. The same is true for
models of international relations: networked nations are made of networks of people. In
addition, there is no reason why an actor at one hierarchical level may not interact with
an actor at another level. While this adds multiple layers of complexity to such models, it
should also lead to a much richer array of outcomes for analysis and hypothesis testing.
Table 3.1
Strategy components used by agents in the continuous prisoners dilemma
Component Description
x contribution to public good
t threshold for punishment
c amount or cost of punishment
Note: x, t, c ∈ [0, 1]
Table 3.2
Payoffs p in the continuous prisoners dilemma between i and j with possible punishment
of i by k
xi ≥ tk xi < tk
k punishes i? no yes
pi 1 – xi + r(xi + xj)/2 1 – xi + r(xi + xj)/2 – ckM
pj 1 – xj + r(xi + xj)/2 1 – xj + r(xi + xj)/2
pk 0 – ck
Note: see Tables 3.1 and 3.3 for description of variables
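The payoffs in Table 3.2 can be written as a short function. This is an illustrative sketch, not the dissertation's code; the default r = 1.5 and the example value of M follow Table 3.3.

```python
# Sketch of the Table 3.2 payoffs (illustrative names, not the original code).
# Agents i and j contribute x_i, x_j in [0, 1] to a public good; observer k
# punishes i (paying c_k, costing i an amount c_k * M) whenever i's
# contribution falls below k's threshold t_k.

def cpd_payoffs(x_i, x_j, t_k, c_k, r=1.5, M=4.0):
    """Return (p_i, p_j, p_k) for one game with a possible punishment event."""
    public = r * (x_i + x_j) / 2.0      # each player's share of the public good
    p_i = 1.0 - x_i + public
    p_j = 1.0 - x_j + public
    p_k = 0.0
    if x_i < t_k:                        # k judges i's contribution too low
        p_i -= c_k * M                   # sanction scaled by the multiplier M
        p_k -= c_k                       # punishing is costly to k itself
    return p_i, p_j, p_k
```

Mutual defection (x = 0) yields 1.0 per game per player absent punishment, while mutual full contribution (x = 1) yields 1.5, reflecting the standardized return of r = 0.75 per unit contributed noted in Table 3.3.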
Table 3.3
Simulation parameters for the CPD and their values
Parameter Values
The population size (N) 400
The number of generations a single simulation run 10,000
The number of games initiated by each agent
in one generation 3
The range for strategy component values (x, t, c) [0, 1]
The probability of strategy component mutation (m) 0.1
The (mean, standard deviation) of Gaussian noise
added to a mutated strategy component (0, 0.01)
The punishment multiplier (M) 0.0 to 6.0a
The public good multiplier (r) 1.5b
The probability of rewire for small-world networks 0.05
The number of links per new node in scale-free networks 2
a In increments of 0.01.
b An alternative representation of the public good multiplier standardizes r by the
number of players per game. Stated in this manner, r is bounded by 0.5 < r < 1 for the
prisoners dilemma and is fixed at r = 0.75 in all cases in this dissertation.
Table 3.4
Mean ending contributions on various networks with and without punishment
Mean contribution (std. dev.)
Network type M = 0.0 M = 4.0
Complete graph 0.003 (0.001) 0.030 (0.010)
Regular graphs
Moore 0.004 (0.001) 0.990 (0.017)
Hexagonal 0.005 (0.001) 0.997 (0.002)
von Neumann 0.005 (0.001) 0.998 (0.001)
Linear 0.006 (0.001) 0.996 (0.002)
Complex, real-world graphs
Small-world 0.006 (0.001) 0.997 (0.001)
Scale-free 0.490 (0.310) 0.666 (0.232)
Other graphs
Random 0.023 (0.012) 0.455 (0.284)
Random regular 0.005 (0.001) 0.999 (0.001)
Note: in each case number of replications = 100
Table 3.5
For regular networks, approximate value of M at which populations transitioned from low
to high contributions in the continuous prisoners dilemma
Network type Number of neighbors Approx. transition value of M a
Linear 2 1.5
von Neumann 4 1.8
Hexagonal 6 2.2
Moore 8 2.8
Complete N – 1 N/Ab
a See Appendix A for method of approximating transition values.
b A transition did not occur on the complete graph with increasing M. This was true even
at values as high as M = 5,000.
Figure 3.1 Results of the continuous prisoners dilemma on four different networks. Mean
contributions vs. M are presented for populations on (a) complete network, (b) a regular
(Moore) lattice, (c) a small-world network, and (d) a scale-free network. Each dot
represents the population’s mean contribution in the 10,000th generation of a single
simulation run. Simulations on small-world networks clearly demonstrate a transition
effect as M increases. Scale-free networks exhibit no such effect.
[Figure 3.2 appears here: mean ending contribution vs. punishment multiplier M, with
panel (a) plotting the Linear, von Neumann, Hexagonal, Moore, and Complete networks,
and panel (b) plotting the Random, Scale-free, Small-world, and Random-regular
networks.]
Figure 3.2. Response of mean ending contributions to increasing M in the continuous
prisoners dilemma. (a) on regular network structures, and (b) on other networks. For
networks that experience a rapid transition from low to high contributions, approximate
transition values are listed in Table 3.5.
Figure 3.3. Evolutionary dynamics of the continuous prisoners dilemma on small-world
networks. (a) Through 200 generations. (b) Through 10,000 generations. While
simulations run at M = 0.5 and M = 3.0 converged to full defection and full cooperation
respectively within 200 generations, the simulation run at M = 1.75 did not converge to
any value even after 10,000 generations. The end point of each curve in (b) corresponds
to a single point in Figure 3.1c.
CHAPTER 4
EXTENDING THE CONTINUOUS PRISONERS DILEMMA MODEL:
THE AFTERMATH OF PUNISHMENT8
In this chapter, the continuous prisoners dilemma (CPD) simulation developed in
Chapter 3 is modified to answer a series of supplemental questions. These questions are
related to the premise that in real world situations, or action arenas, the unilateral
punishment of a non-cooperator is rarely the final interaction in a social dilemma.
Therefore, experiments in this chapter investigate what happens after punishment takes
place.
The Detrimental Side of Punishment
Thus far this dissertation has demonstrated that punishment can induce a
structured population to cooperate, provided that the punishment multiplier is sufficient.
However, it remains to examine whether such punishment-induced cooperation has a
favorable impact on social welfare, measured as the sum of all individual payoffs in the
population. Laboratory experiments have shown that even when punishment leads to
increased contribution rates in a public good game, it may consistently decrease overall
social welfare in the form of total payoffs (Sefton et al. 2002). This occurs because the
fees collected from those wishing to inflict punishment, as well as the sanctions collected
from those being punished, are not redistributed by the experimenter and may be greater
than the benefits from increased contributions to the public good.
Unfortunately this has led to ambiguity in the literature regarding the efficacy of
altruistic punishment. For example, in a ground-breaking laboratory experiment by Fehr
8 This chapter is based, in part, on (Shutters 2008).
and Gächter (2000), participants played the public goods game for 20 rounds and had the
ability to punish others after each round. Treatments in which punishment was allowed
led to higher contributions than when punishment was not allowed and the authors
concluded that, since free-riding was deterred, punishment had facilitated cooperation.
Yet in 18 of 20 rounds with punishment, average payoffs to all participants were actually
lower than without punishment. So while punishment induced higher contributions to the
public good it led to decreased social welfare.
Herrmann et al (2008) found that in some human societies, those contributing to a
public good are punished just as frequently as non-contributors. This “antisocial
punishment”, as the authors call it, can be so strong that it destroys the ability of
punishment to facilitate cooperative outcomes.
Therefore, it is with great care and caution that scientists should approach policy
makers to advocate the use of punishment as some have done (Ostrom et al. 1992,
Ostrom et al. 1994, Barrett 2003b, a, Dietz et al. 2003). Using this argument, an
oppressive and coercive central power that punishes those whose views do not concur
could be considered a source of cooperation as long as it deters free-riding, but this may
come at terrible cost to individual liberty (Marlowe et al. 2008).
Punishment versus payoffs
In the previous chapter, simulations of the CPD were used to examine the effect
of punishment on contributions to a public good. However, it is prudent to also
investigate the effect that punishment has on payoffs. Close examination of the rapid
transition in contributions that occur in small-world and regular networks reveals that
mean payoffs actually drop as M increases but before the transition occurs (Figure 4.1).
This suggests that unless M is sufficiently high, altruistic punishment can actually lead to
decreased social welfare. In the complete network, where there is no transition to high
contributions, mean payoffs simply continue to decrease with increasing M.
Once the transition occurs to cooperative behavior, further increases in M beyond
its transition value again decrease total payoffs (Figure 4.2). These results indicate that
once a society achieves widespread cooperation some level of punishment persists, and
suggest that there is an optimal strength of punishment at the point just beyond the
transition to full cooperation. Any attempt to craft institutions that promote punishment
as a mechanism for inducing cooperation will face a practical problem of attempting to
find this optimal formula for punishment. At worst, a poorly crafted punishment regime
will lead to worse payoffs than without punishment.
The 2nd Order Free-rider Problem
To facilitate the provisioning of a public good it is often the case that institutional
solutions are implemented to deter free-riding. However, these institutions are themselves
public goods and the question then arises of how these institutions are maintained
(Hodgson 2009). What deters free-riding in the provisioning of deterrence institutions?
This is the essence of what is known as the 2nd order free-rider problem (Okada 2008).
For example, it is common for human societies to employ police to enforce laws.
A police force is tasked with detection and punishment of 1st order free-riders. However,
what incentives exist to ensure that members of a police force carry out their duties?
Without deterrents and/or incentives it is expected that a rational enforcer would collect
wages but then rely on fellow police officers to carry out enforcement of laws – a costly
endeavor in terms of individual risk to the enforcement officer (Oliver 1980).
This scenario may continue for several levels, each with a new free-rider
dilemma. If a police force should create an internal affairs department to ensure that its
members are carrying out their enforcement duties, we then ask what incentives do
internal affairs agents have to carry out their internal enforcement duties?
It is expected that individuals that cooperate but that do not punish others –
individuals Heckathorn (1998) refers to as private cooperators – will have an
evolutionary advantage over those that both cooperate and punish. This has been shown
in experimental games where those that cooperate but do not punish receive the highest
payoffs (Dreber et al. 2008). However, if punishers are responsible for cooperative
outcomes but are evolutionarily inferior to those that do not punish, it is expected that
they will evolve out of the population, taking any hope for general cooperation with
them. Therefore, even if we conclude that cooperation is maintained in a society by the
tendency of individuals to inflict costly punishment on non-cooperators, it remains to
explain in an evolutionary context how these punishers could out-compete other
cooperators that do not punish.
In addition to those that cooperate but do not participate in enforcement, there
may also be individuals that cheat or defect with regard to provisioning a public good but
participate actively in sanctioning other cheaters (hypocritical cooperators). This is a
further complication that contributes to expected frailty of 2nd order cooperation
(Heckathorn 1998).
This 2nd order problem is of no concern once the society has reached a population
of all cooperators, as punishers no longer reduce their fitness to punish. But as shown in
Figure 3.3 even those societies that evolve to full cooperation pass through evolutionary
periods in which members of the society contribute less than a fully cooperative amount.
Yet the fact that populations with punishers do achieve full cooperation despite passing
through periods of lower cooperation, indicates that social structure alone may create the
feedbacks necessary to overcome the 2nd order (and higher) free-rider dilemma. This
concurs with Hodgson (2009, p. 145) who states that to understand 2nd order institutions
“explanations must ultimately devolve on individuals and their interactions.” Full
cooperation evolves despite the prospect that some cooperators may not contribute to the
punishment of non-cooperators. As in Panchanathan and Boyd (2004), the 2nd order
free-rider problem appears to have been solved without intervention.
Therefore, it remains to answer the question, what effect does 2nd-order
punishment have in these simulations? It may be that the strength of punishment required
to achieve cooperation under different social structures is affected by whether or not 2nd
order free riders are subject to punishment. On one hand it is intuitive to predict that,
since punishment of 1st order cheaters led to cooperative behavior, further punishment of
2nd order cheaters may lead to cooperation at even lower values of the punishment
multiplier M. However, laboratory experiments with human subjects have demonstrated
the opposite and suggest that 2nd order punishment can inhibit the emergence of
cooperation (Denant-Boemont et al. 2007).
To test the effects of 2nd order punishment simulations were conducted of
populations playing the CPD described in Chapter 3. The simulation was modified so that
when a game is played between i and j and observer k, a new agent l simultaneously
evaluates k’s punishment behavior (Table 4.1). To assess k’s general predisposition to
punish, l compares its punishment threshold tl to k’s threshold tk. If k is generally more
lenient on low offers compared to l (tk < tl), l punishes k. In simpler terms, the newly
introduced agent l is ensuring that the punisher is doing its job.
As in Chapter 3 a sweep of the parameter M was conducted on several networks
to determine the effect of this additional 2nd order enforcement.
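The evaluation step described above can be sketched as follows (illustrative names, not the dissertation's code): l punishes k whenever tk < tl, paying cl to sanction k by clM, exactly mirroring the 1st order punishment cost structure.

```python
# Sketch of the 2nd-order punishment step added in this chapter. Agent l
# compares its punishment threshold t_l with observer k's threshold t_k;
# if k is more lenient (t_k < t_l), l pays c_l to sanction k by c_l * M.

def second_order_payoffs(t_k, t_l, c_l, M=4.0):
    """Return payoff adjustments (delta_k, delta_l) from l's evaluation of k."""
    if t_k < t_l:                 # k tolerates lower contributions than l would
        return -c_l * M, -c_l     # k is sanctioned; l bears the cost of punishing
    return 0.0, 0.0
```

Note that this sanction depends only on the two thresholds, not on whether k's neighbors actually contributed, which is the property the results below turn on.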
Results and discussion: 2nd order free rider simulations
Figures 4.3 and 4.4 present comparisons of simulations on several networks with
and without punishment of 2nd order free riders. Contrary to expectations, the ability to
punish 2nd order free-riders meant that a higher M was required to induce
cooperative behavior in a population. In other words, punishment needed to be more
severe to achieve cooperation than in Chapter 3 when punishment of 2nd order free riders
was not allowed (Table 4.2).
This is likely due to the fact that 2nd order punishment is not based on whether the
punishment recipient was a cooperator or defector, but on whether the recipient was a
punisher or not. In the presence of a 2nd order punishment institution, simply contributing
to a public good is no longer sufficient to guarantee freedom from punishment. Many
cooperative agents that would have otherwise helped move a population to full
cooperation in Chapter 3 may have been injured through sanctions in the present
experiment, and would therefore decrease the overall effectiveness of punishment.
The Effect of Retaliatory Behavior
Another often unacknowledged drawback to punishment is the phenomenon of
retaliation. Studies have shown that humans and other animals do not take kindly to being
punished and often retaliate at a cost both to themselves and their punisher (Molm 1989a,
b, Clutton-Brock and Parker 1995, Saijo and Nakamura 1995, Hopfensitz and Reuben
2005). This can inhibit the punishment of free-riding and ultimately negate the
cooperative effects of punishment (Nikiforakis 2008). However, previous research on
punishment has rarely considered the potential consequences of retaliation (Fon and
Parisi 2005, Denant-Boemont et al. 2007).
In the previous chapter, simulation experiments revealed outcomes that may be
achieved under a variety of social structures when altruistic punishment is allowed.
However, the ability to punish was limited to a single act by a 3rd party. In the current
experiment the CPD simulation used in Chapter 3 is modified to allow a punished agent
to retaliate against its punisher.
To examine the effects of retaliation on cooperative outcomes, agent behavior was
modified so that agents automatically retaliate after being punished by paying an amount
s ∈ [0, 1] to have their punisher sanctioned by an amount sM. Because the amount of
retaliation s may be 0, agents may evolve so that they effectively do not retaliate, even
when punished. Three different rules were implemented for calculating how much a
punished agent should spend on retaliation:
(1) s equals the same amount the punished agent would have spent to punish a
low contributor (s = c). This assumes that a single strategy component
dictates how much an agent will spend to punish another regardless of the
reason for punishing.
(2) s is a new, independently evolving strategy component (s is independent). In
this case acts of retaliation are assumed to be independent of other
punishment acts by an agent.
(3) s equals the amount the agent contributes to the public good in the CPD (s =
x). This reflects the idea that both punishment and retaliation are non-self
interested behaviors, and so may be governed by the same strategy
component.
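The three rules can be summarized in a small dispatch function; this is an illustrative sketch (the parameter names are hypothetical), not the dissertation's implementation.

```python
def retaliation_amount(rule, c, x, s_independent):
    """Amount s a punished agent spends on retaliation under each rule.

    rule 1: s = c (one strategy component governs all punishing acts)
    rule 2: s is an independently evolving strategy component
    rule 3: s = x (retaliation tied to the public-good contribution)
    The parameter names here are illustrative placeholders.
    """
    if rule == 1:
        return c
    if rule == 2:
        return s_independent
    if rule == 3:
        return x
    raise ValueError("rule must be 1, 2, or 3")
```

Under any rule, the punisher is then sanctioned by s * M, so an agent that evolves s = 0 effectively opts out of retaliation.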
Results and discussion of retaliation experiments
Using retaliation rule 1 (s = c) cooperation did not evolve on any network. The
ability to retaliate led to the collapse of cooperation that evolved when there was no
retaliation. Likewise with retaliation rule 2 (s is independent), full defection evolved on
all social structures.
However, in simulations using retaliation rule 3 (s = x), results were more
complex. As with simple punishment, simulations on networks other than the complete
network underwent a rapid transition from low to high contributions with increasing M.
However, contributions did not transition to full cooperation as before (Table 4.3) but
instead plateaued at a value between full cooperation and full defection depending on the
network (Figure 4.5). In addition, payoffs initially increased with increasing M but
eventually evolved to levels even below the Nash equilibrium payoffs of the base CPD,
in which there is no punishment or retaliation (Figure 4.6).
These results present a challenge to the explanation of cooperation based on
punishment because humans often do retaliate after being punished (Hopfensitz and
Reuben 2005, Nikiforakis 2008). However, the results, at least under retaliation rule 3,
also do not show full defection. While the presence of retaliatory behavior may present a
barrier to full cooperation it does not preclude some intermediate level of contributions to
a public good provided there is a sufficient strength of punishment.
Despite an intermediate level of public good contributions, results are ambiguous
regarding cooperation. Though free-riding was partially deterred and contributions to the
public good evolve to some positive level, social welfare eventually evolved to levels
lower than the worst possible outcome in the absence of punishment and retaliation. In
other words, populations with the option to retaliate fared worse than populations with no
punishment at all, even though those with retaliation had a partially provisioned public
good and those without punishment had none.
It is precisely this type of outcome that should lead policy makers to scrutinize
punishment mechanisms before they are incorporated into policies designed to foster
cooperation. Their efforts may only result in the illusion of cooperation through increased
compliance but at the cost of decreased social welfare. Perhaps this helps to explain the
existence of institutional policies such as that of the United States Department of Labor,
which implements methods for discouraging retaliation (USDL 2009).
Summary and Future Directions
This chapter demonstrates the potential danger of generalizing about the benefits
of using punishment to induce cooperation. On the other hand it points to numerous
potential questions that may be foci of future studies. First, while this study has thus far
used a form of the prisoners dilemma, future studies should duplicate these types of
simulations on a wide variety of 2-player games to ascertain a more general nature of
punishment in social dilemmas and to inform policy makers of potential adverse effects
of institutionalized punishment.
Second, as stated above, in experiments with punishment collected fees and fines
are routinely removed from the experimental system without further consideration. It is
more likely in real world situations that collected penalties and fines are redistributed, to
some degree, to the society from which they are collected – either to those who did not
defect or to all society members. Future simulations should include and explore a variable
that allows redistribution of collected sanctions and fees.
In addition, simulation results presented here regarding 2nd-order punishment
should be coupled with laboratory experiments to validate the detrimental nature of such
supplemental punishment.
Finally, a rich suite of questions is posed regarding retaliation. Three methods for
determining how to retaliate are presented in this study. Others surely await discovery
and testing.
Table 4.1
Payoffs p in the continuous prisoners dilemma between observer k and 2nd order punisher
l
tk < tl tk ≥ tl
pk – clM 0
pl – cl 0
Note: see Tables 3.1 and 3.3 for description of variables
Table 4.2
Approximate value of M at which populations transitioned from low to high contributions
in the continuous prisoners dilemma, with and without punishment of 2nd order free-
riders
Network type (No. of neighbors) Approx. transition value of M a
without 2nd order punishment with 2nd order punishment
Linear (2) 1.5 1.6
von Neumann (4) 1.8 2.8
Hexagonal (6) 2.2 4.1
Moore (8) 2.8 5.7
Complete (N – 1) N/Ab N/Ab
a See Appendix A for method of approximating transition values.
b A transition did not occur on the complete graph with increasing M under either
treatment. This was true even at values as high as M = 5,000.
Table 4.3
Response of contribution rate to network type with and without retaliation in the
continuous prisoners dilemma
Mean contribution (std. dev.)
Network type Without Retaliationa With Retaliation (type 3)b
Complete graph 0.030 (0.010) 0.036 (0.050)
Regular graphs
Moore 0.990 (0.017) 0.065 (0.018)
Hexagonal 0.997 (0.002) 0.115 (0.035)
von Neumann 0.998 (0.001) 0.277 (0.064)
Linear 0.996 (0.002) 0.949 (0.015)
Complex, real-world graphs
Small-world 0.997 (0.001) 0.525 (0.066)
Scale-free 0.666 (0.232) 0.644 (0.231)
Other graphs
Random 0.455 (0.284) 0.126 (0.035)
Random regular 0.999 (0.001) 0.129 (0.040)
Note: in both treatments, punishment of low contributors is allowed
a Mean contribution over 100 runs at M = 4 (see table 3.4 above).
b Mean contribution over 100 runs at each of M = 10, 15, 20, 25, 30 (N = 500).
Figure 4.1. Prisoners dilemma payoffs vs. M at low values of M. The rational expectation
for selfish individuals is that mean payoff = 6.0. As punishment is introduced, however,
under both regular and small-world networks, payoffs fall below expectations. Payoffs
continue to fall until the value of M reaches a threshold point (Table 3.5) at which
payoffs jump to near the full cooperative value. Under a complete network, payoffs are
always lower under a punishment regime than without.
[Figure 4.2 appears here: mean ending payoff vs. punishment multiplier M for the Moore
lattice and a small-world network.]
Figure 4.2. Prisoners dilemma payoffs vs. M at high values of M. The rational
expectation for a population of fully cooperating individuals is that mean payoff = 9.0.
However, having made the transition to cooperative contributions with increasing M
(Table 3.5), payoffs steadily decline with increasingly potent punishment. This same
trend is observed with increasing M before the transition to cooperative contributions
(Figure 4.2) and suggests that crafting an optimal punishment institution may be difficult.
[Figure: four panels (a)–(d), titled Linear, von Neumann, Hexagonal, and Moore, plotting Mean Ending Contribution (0–1) vs. Punishment Multiplier M (0–6), with and without 2nd-order punishment.]
Figure 4.3. Effect of 2nd-order punishment: lattice networks. Introducing punishment of
2nd-order free-riders leads to less cooperative contributions at any given value of M. This
effect becomes more pronounced as network density increases.
[Figure: four panels (a)–(d), titled Scale-free, Complete, Small-world, and Random, plotting Mean Ending Contribution (0–1) vs. Punishment Multiplier M (0–6), with and without 2nd-order punishment.]
Figure 4.4. Effect of 2nd-order punishment: other networks. In cases other than scale-free
networks, allowing punishment of 2nd-order free-riders leads to less cooperative
contributions at any given value of M.
[Figure: Mean Ending Contribution (0–1) vs. Punishment Multiplier M (0–30) for Linear, von Neumann, Hexagonal, Moore, and Small-world networks.]
Figure 4.5. Effect of retaliation on mean ending contributions in the continuous prisoners
dilemma. In populations given the option to retaliate, neither full cooperation nor full
defection evolved in the above networks. These data come from simulations using
retaliation type 3, in which the amount s an agent spends on retaliation equals the amount
x the agent contributes to the public good. Using retaliation types 1 and 2, cooperation
collapses completely and full defection evolves on all networks.
[Figure: Mean Ending Payoffs (5.0–6.5) vs. Punishment Multiplier M (0–30) for von Neumann, Hexagonal, Moore, and Complete networks.]
Figure 4.6. Effect of retaliation on mean payoffs in the continuous prisoners dilemma. In
populations given the option to retaliate, mean payoffs eventually evolved, with
increasing M, to levels lower than the lowest possible payoffs without punishment and
retaliation (dashed line at p = 6.0). These data come from simulations using retaliation
type 3, in which the amount s an agent spends on retaliation equals the amount x the
agent contributes to the public good.
CHAPTER 5
PUNISHMENT AND SOCIAL STRUCTURE:
THE EVOLUTION OF FAIRNESS IN AN ULTIMATUM GAME9
Introduction
Kazemi and Eek (2008) assert that there are two antecedents to cooperative outcomes
of social dilemmas: the provisioning of public goods and the allocation of public goods.
Whereas questions of provisioning are often associated with cooperation, questions of
allocation are concerned with fairness. Though research on social dilemmas has been
dominated by cooperation and provisioning questions (Kazemi and Eek 2008), research
should encompass both antecedents of cooperative outcomes to better facilitate the
resolution of social dilemmas. Accordingly, chapters 3 and 4 of this dissertation investigate
cooperation through experiments with public goods provisioning, demonstrating that
social structure, coupled with punishment, has an important influence on cooperative
outcomes. In this chapter I present experimental results on the allocation step and discuss
whether punishment and social structure similarly affect the ability of a population to
evolve fairness behavior.
The question of fairness
Despite a voluminous literature addressing fair allocations, authors rarely attempt
to define the term fairness. This is partly because concepts of fairness are often culturally
contextual norms and may vary among individual groups (Kazemi and Eek 2008).
However, a general definition is warranted to adequately discuss abstract questions of
9 This chapter is a modified version of Shutters (2008).
fairness. Here I adopt the definition proposed by Varian (1974), in which a fair allocation
is one that is both Pareto efficient and equitable. Being Pareto efficient means an allocation
cannot be altered without decreasing the payoff of at least one participant. Being
equitable, says Varian, means no participant prefers the allocation of another participant.
This definition may explain why many authors, while declining to explicitly define
fairness, nevertheless typically imply that a fair allocation is one resulting in
approximately equal shares to participating parties (Nowak et al. 2000, Henrich et al.
2001).
In the 2-player ultimatum game used in this chapter and described below, every
allocation resulting from an accepted offer is Pareto efficient. In other words,
disregarding cases where an offer is rejected, the ultimatum game is a zero-sum game –
no player can increase his payoff without decreasing the payoff of his opponent. This
satisfies the first criterion for fairness. Regarding those allocations that are also equitable,
all simulated agents used in this study begin each generation with the same resource
endowment and compete in the same reproduction algorithm for the ability to pass
offspring into the next generation. Therefore, an agent will benefit from the higher share
of an allocation and, in this sense, will prefer an opponent’s allocation if it is larger. The
only point at which neither agent would prefer the other's allocation is an equal split.
Thus, in the following simulated ultimatum game, a fair allocation is one in which each
player receives a 50% share.
Background
Since cooperation often means overcoming an incentive to cheat or free-ride, the
emergence of cooperation among unrelated individuals remains largely unexplained in
the life and social sciences. A long history of explanations includes kin selection
(Hamilton 1964, Rothstein and Pierotti 1988, Wilson 2005), direct and indirect
reciprocity (Trivers 1971, Nowak and Sigmund 1998a, Riolo et al. 2001, Killingback and
Doebeli 2002, Nowak and Sigmund 2005), and multi-level selection (Wilson and Sober
1994, Goodnight 2005, Reeve and Hölldobler 2007). However, these explanations often
require assumptions such as close genetic relationships, small populations, or repeated
interactions in order for cooperation to evolve (Fowler 2005). Recent findings suggest
that strong reciprocity – the altruistic punishing of cheaters and altruistic rewarding of
cooperators – may provide an alternative and more general explanation. In particular
altruistic punishment by third-party observers has been shown to play a positive role in
maintaining cooperation (Fehr and Gächter 2000, Gintis 2000, Henrich and Boyd 2001,
Fehr et al. 2002, Fehr and Gächter 2002, Boyd et al. 2003, Bowles and Gintis 2004, Jaffe
2004, Shinada et al. 2004, Fowler 2005).
The effectiveness of punishment as a mechanism for cooperation has long been
debated, with some suggesting that it may simply lead to a destructive cycle of costly
retaliation (Molm 1994). However, researchers now seem sufficiently sure of
punishment’s ability to induce cooperation that they have moved to advocating its use by
policy makers, both at local scales, in institutions governing common pool resources
(Ostrom et al. 1992, Ostrom et al. 1994, Dietz et al. 2003, Anderies et al. 2004), and at
international scales, in agreements designed to provision global public goods (Sandler
1992, Wagner 2001, Barrett 2003b, a, 2005).
The punishment multiplier
An important parameter governing the mechanism of altruistic punishment is the
ratio of costs incurred by the punishing party to those of the party being punished. Letting
c denote the cost that an individual incurs to punish another, the fee or sanction imposed
on the punished actor is cM, where M is the punishment multiplier. As M becomes arbitrarily
large there should be some point at which it is no longer altruistic to provide punishment
but instead strategically beneficial. M, therefore, becomes an important parameter in
understanding outcomes of punishment experiments. However, though any experiment
that uses a punishment mechanism implies a value of M under which the experiment
operates, explanations of how researchers set this parameter are largely absent. To my
knowledge, even in those studies in which researchers explicitly state their value of M, the
authors rarely offer an explanation of the choice or demonstrate the effects of altering the
parameter (e.g. Fehr and Gächter 2000, 2002, Andreoni et al. 2003, Boyd et al. 2003,
Brandt et al. 2003, Gürerk et al. 2006).
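The cost mechanics behind M can be made concrete with a short sketch. This is my own illustration of the multiplier's arithmetic, not code from any of the studies cited:

```python
def apply_punishment(punisher_payoff, punished_payoff, c, M):
    """The punisher pays a cost c; the punished agent is docked c*M,
    where M is the punishment multiplier."""
    return punisher_payoff - c, punished_payoff - c * M

# With M = 3.0, spending 1 unit on punishment removes 3 units from the target:
print(apply_punishment(10.0, 10.0, c=1.0, M=3.0))  # (9.0, 7.0)
```

At M = 1 punishment is a pure mutual loss; as M grows, each unit spent inflicts proportionally greater harm, which is why the choice of M shapes outcomes so strongly.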
For example, in their influential paper on altruistic punishment, Fehr and Gächter
(2000) demonstrate that humans in anonymous, one-shot interactions will punish low
contributors in a public-goods game. In their experiment M = 3.0 yet the authors offer no
explanation for this choice. Likewise, Andreoni et al. (2003) set M = 5.0 in their
experimental ultimatum games using punishment and rewards, but state only that they
chose their ratio so as to ensure that punishment would take place.
Table 5.1 lists a number of recent studies on altruistic punishment and the values
of M used in each study. Two of these studies, Fehr and Gächter (2000) and Masclet et al
(2003), did not use a fixed value of M but instead used a non-linear function of the
amount paid by the punisher to determine the amount deducted from the punishee. This
leads to further confounding issues which are addressed in detail by Casari (2005).
Although researchers often neglect to discuss their selection of M,
policy-makers who choose to implement punishment mechanisms must include some
definition of the costs incurred by punisher and punishee. Even if policy-makers are unable to
directly set a common value of M for a crafted punishment institution, they may still be
able to influence its value. Before researchers promote the application of such
mechanisms to social dilemmas, a better understanding is warranted of how cooperative
outcomes respond to punishment mechanisms under varying values of M. This is
especially true since researchers have shown that if punishment is excessive, it can lead
to worse outcomes, in terms of total payoffs, than if no punishment were present (Fehr
and Rockenbach 2003).
To test the ability of altruistic punishment to elicit fair allocations, simulations of
the ultimatum game were conducted on a variety of network structures while
systematically varying the parameter M. Perhaps owing to its sheer simplicity, the ultimatum
game has grown in popularity until it has come to rival the prisoners dilemma as the
preferred game-theoretical framework for studying cooperative phenomena (Nowak et al.
2000). The game is played by two agents i and j that must decide how to split an
endowment. The proposer i initiates the game by offering a percentage of the endowment
to the responder j. j then either accepts the division, in which case each agent collects its
agreed upon share, or j rejects the division, in which case both agents receive 0. In either
event the game ends. Economic theory predicts that, given rational agents, j will accept
the smallest positive amount possible and that i, knowing this, will therefore offer the
smallest amount possible. Tests of this economic expectation among non-humans have
been inconclusive, with evidence both rejecting (Silk et al. 2005, Jensen et al. 2007) and
supporting (Brosnan and de Waal 2003, Burkart et al. 2007) the existence of fairness
behavior among unrelated primates. However, fairness behavior among humans is well-
established and subjects across many cultures have shown a strong propensity to offer
fair allocations (i.e. ~ 40-50% of the endowment) and reject unfair offers in experimental
ultimatum games (Roth et al. 1991, Nowak et al. 2000, Henrich et al. 2001).
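The rules just described can be condensed into a few lines. This is a minimal sketch of a single 2-player round, assuming the responder accepts offers at or above its threshold; the function name is mine:

```python
def ultimatum_round(offer, acceptance_threshold, endowment=1.0):
    """One 2-player ultimatum game; returns (proposer payoff, responder payoff)."""
    if offer >= acceptance_threshold:   # responder j accepts the proposed split
        return endowment - offer, offer
    return 0.0, 0.0                     # rejection: both players receive 0

# A fair 50% offer accepted, then a 5% offer rejected by a demanding responder:
print(ultimatum_round(0.50, 0.20))  # (0.5, 0.5)
print(ultimatum_round(0.05, 0.20))  # (0.0, 0.0)
```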
The Simulation Model
In simplest terms, this model simulates a population of agents in a toroidal space
playing the ultimatum game against one another. Under various parameter settings,
agents are endowed with the ability to altruistically punish a neighbor after assessing the
neighbor’s game-play behavior. This punishment is accomplished through introduction of
a third party to the game, the observer k. When in the role of k, an agent observes a game
being played by two other agents and may reduce its own fitness in order to punish what
it perceives to be a low offer.
Each agent consists of a strategy (x, α, t, c) where x = the amount that an agent in
the role of i will offer; α = the offer threshold above which an agent acting as j will accept
an offer; t = the offer below which an agent acting as k will punish i in a game under
observation; and c = the cost that an agent acting as k is willing to incur to punish i for
offering too little (Table 5.2). Each of the four strategy components holds a value on the
continuous interval [0,1] and is generated randomly from a uniform distribution at the
beginning of each simulation. To control for other factors that might contribute to the
maintenance of cooperation, such as interaction history or reputation, the model does not
allow recognition of or memory of other agents within the population (though repeated
interactions are possible since interactions are restricted to a small local neighborhood).
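Agent initialization under these rules can be sketched as follows; the dictionary layout is my own shorthand for the strategy tuple (x, α, t, c) and the fitness variable p:

```python
import random

def new_agent():
    """An agent's strategy (x, alpha, t, c), each component drawn uniformly
    from [0, 1], plus a fitness accumulator p that starts each generation at 0."""
    return {
        "x": random.random(),      # offer made when acting as proposer i
        "alpha": random.random(),  # acceptance threshold when acting as responder j
        "t": random.random(),      # offer level below which observer k punishes
        "c": random.random(),      # cost k is willing to incur to punish
        "p": 0.0,                  # accumulated payoff (fitness)
    }

agent = new_agent()
```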
Following initialization, a simulation proceeds through a number of generations,
each of which consists of three routines – game play, observation & punishment, and
selection & reproduction. In each routine an agent interacts only with its immediate
neighbors as defined by the network type (Table 5.3) and all interactions take place in
parallel. In addition to its strategy, each agent is described by a fitness variable p, which
is simply an accumulation of the costs and payoffs an agent experiences during a
generation. Because only relative fitness is considered during selection and because this
model uses the elementary case in which parents do not differentially provision resources
for their offspring, p = 0 for each agent at the beginning of a new generation.
The present model most closely resembles the model of Page et al. (2000) who
also demonstrated that a simulated population playing the ultimatum game could evolve
fair allocations under a linear population structure (or what the authors refer to as a one-
dimensional spatial ultimatum game). The authors did briefly discuss implications of a
von Neumann neighborhood but the focus of the work was on the effect of varying the
population size and the radius of an agent’s neighborhood.
Another closely related model by Killingback and Studer (2001) produced results
not in agreement with those of the current simulation. The authors simulated a modified
ultimatum game in which fair allocations evolved when the population was structured on
a hexagonal lattice. However, in an effort to model a “collaborator’s dilemma,” their
modifications to the ultimatum game were extensive enough that it is unreasonable to
expect outcomes similar to those from the standard ultimatum game.
During the game play routine, each agent i plays the role of proposer and
randomly selects, with replacement, three responders from its neighborhood with which
to play a game. After receiving a standardized endowment of 1 per game, i initiates each
game by making its offer of xi to j who then evaluates the offer. If the offer is above j’s
acceptance threshold the offer is accepted and pj increases by xi while pi increases by 1 –
xi. If the offer is below the threshold it is rejected and both pi and pj remain unchanged.
After making offers to three neighbors, i selects three neighbors k to observe each
of those games. These observers, chosen from the same neighborhood as responders, are
selected with replacement and evaluate xi in the observed game. Provided that the offer is
not below k’s punishment threshold, pi and pk do not change. However, if the offer falls
below k’s punishment threshold k punishes i. In so doing pk is reduced by ck while pi is
reduced by ckM, where M is the punishment multiplier described above. Punishment of i
is independent of whether xi is actually accepted by j in the observed game. Payoffs for
the ultimatum game are listed in Table 5.4.
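The game-play and observation routines can be sketched together. This is an illustrative reconstruction, assuming agents are plain Python dicts with keys x, alpha, t, c, and p; the function name is mine:

```python
import random

def play_and_observe(i, neighborhood, M):
    """Agent i proposes to three responders drawn (with replacement) from its
    neighborhood; each game is watched by an observer from the same neighborhood.
    Punishment is independent of whether the offer is accepted."""
    for _ in range(3):
        j = random.choice(neighborhood)   # responder
        k = random.choice(neighborhood)   # third-party observer
        if i["x"] >= j["alpha"]:          # offer accepted
            i["p"] += 1.0 - i["x"]
            j["p"] += i["x"]
        if i["x"] < k["t"]:               # observer deems the offer too low
            k["p"] -= k["c"]              # punisher pays c
            i["p"] -= k["c"] * M          # punished agent loses c*M

# One proposer with a single neighbor, so the random draws are deterministic:
i = {"x": 0.5, "alpha": 0.4, "t": 0.6, "c": 0.1, "p": 0.0}
j = {"x": 0.5, "alpha": 0.4, "t": 0.6, "c": 0.1, "p": 0.0}
play_and_observe(i, [j], M=2.0)
print(round(i["p"], 2), round(j["p"], 2))  # 0.9 1.2
```

In the worked example, each of the three games is accepted (0.5 ≥ 0.4) and punished (0.5 < 0.6), so i nets 0.5 − 0.1·2 per game and j nets 0.5 − 0.1.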
Finally, each generation ends with a selection & reproduction routine during
which each agent i randomly selects a neighbor j with which to compare payoffs. If pi >
pj the strategy occupying i’s node remains and passes to the next generation. If pj is
greater, j’s strategy is copied to i’s node. In the event that pi = pj a coin toss determines
which strategy occupies i’s node in the next generation. Once agents of the next
generation are determined, each strategy component of every agent is subjected
independently to mutation with a probability of m = 0.1. If selected for mutation a
number randomly drawn from a Gaussian distribution with mean = 0 and standard
deviation = 0.01 is added to the mutated trait. In the event that mutation causes the value
of a trait to fall outside the interval [0,1] the trait is set to the closer endpoint (either 0 or
1). A single run continues in this manner for 30,000 generations and, as the model is not
deterministic, is replicated 100 times to complete a single simulation.
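The selection and mutation rules described above can be sketched as follows, assuming agents are plain dicts with a payoff key p; the helper names and constants are my own labels for the stated parameters m = 0.1 and standard deviation 0.01:

```python
import random

MUTATION_RATE = 0.1   # per-trait mutation probability m
MUTATION_SD = 0.01    # standard deviation of the Gaussian mutation noise

def mutate(strategy):
    """Each trait mutates independently; mutated values are clamped to [0, 1]."""
    for trait in ("x", "alpha", "t", "c"):
        if random.random() < MUTATION_RATE:
            noisy = strategy[trait] + random.gauss(0.0, MUTATION_SD)
            strategy[trait] = min(1.0, max(0.0, noisy))

def select(i, j):
    """Imitate-the-better: the higher-payoff strategy occupies i's node;
    a tie is settled by a coin toss."""
    if j["p"] > i["p"] or (j["p"] == i["p"] and random.random() < 0.5):
        return dict(j)   # j's strategy is copied onto i's node
    return i
```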
For each network type, simulations were run starting with M = 0 and thereafter at
increments of 0.5 until M = 6.0, by which point simulations that converged to a
population-wide offer value had all done so. Model parameters are summarized in Table
5.5. The dependent variables of interest are the mean offer and the mean payoff which
evolve in a population after 30,000 generations. The mean offer represents the
population’s level of fairness while the mean payoff represents the population’s social
welfare.
Results and Discussion
Results revealed three major trends worthy of discussion: (1) even without
punishment, a negative correlation exists between the number of neighbors per agent and
the mean offer rate to which a population evolves, (2) in simulations with some social
structure an abrupt transition occurs from relatively low mean offers to offers of nearly
100% as M increases, and (3) a correlation exists between the number of neighbors per
agent and the value of M at which the transition from low to high offers occurs.
Offer rates in the absence of punishment
It has been well demonstrated that spatial explicitness in computer simulations
can lead to outcomes significantly different from those of unstructured or well-mixed
populations (Nowak and May 1992, Nowak et al. 1994, Killingback and Doebeli 1996,
Killingback and Studer 2001). However, there are a number of ways in which a
population can be spatially explicit and it is important for researchers to show how
different social structures can lead to different results. In this experiment care was taken
to show not only the effect of introducing punishment but also to show how that effect
differed among different spatial structures of the agent population. These spatial
structures included a complete network in which every agent is a neighbor to every other
agent in the population (see for example Riolo et al. 2001).
Initial simulations were conducted without any ability to punish by third-party
observers. Agents simply played a standard ultimatum game under a number of different
neighborhood structures (Table 5.6). Under these conditions the offer rate to which a
population evolved was correlated with the neighborhood structure of that population. In
particular, ending mean offers increased as the number of neighbors per agent decreased
(Figure 5.1). Only under a linear population structure did offers approach fair allocations.
This result is in agreement with Page et al. (2000) who found that agents playing the
ultimatum game in a linearly structured population evolved approximately fair
allocations.
However, though mean offers fell in response to increasing numbers of neighbors,
offers remained significantly greater than the economic expectation of ~ 0. Only when
populations lacked social structure (complete network), and agents could interact with
any other agent in the population with equal probability, did ending offers approximate
the Nash equilibrium. In fact, when additional simulations were run on a complete
network with M as high as 5,000, offers did not deviate from the Nash equilibrium.
This result indicates that the structure of local interactions may matter more for
cooperation than the overall population size and may be as important, if not more so, than
punishment. More importantly it suggests that cooperative outcomes are possible even in
a large anonymous population provided that local small-size clustering is allowed. This
conclusion concurs with similar research highlighting the importance of small group or
neighborhood size (Olson 1965, Page et al. 2000, Ifti et al. 2004).
Response of offer rates as M increases
By increasing M above 0 agents became endowed with the ability to punish one
another. Following this introduction of punishment, agents initially evolved offer rates
equal to those in the absence of punishment. In other words at relatively low values of M,
punishment had no discernible effect on simulation outcomes. However, as M continued
to increase, populations on each network type other than the complete network eventually
encountered a threshold value of M at which a rapid transition in offer rates occurred.
these transitions agents went from offering relatively low percentages to offering
approximately 99% of their endowments in an attempt to be reproductively successful
(Figure 5.2). Other than in the case of the linearly structured population discussed above,
offers neither before nor after these transitions fell into a range that could be considered
fair allocations. Instead, offers before the transition were below fair offers and those
after were well above them.
Despite several studies in which strong reciprocity is shown to induce
cooperation, third-party punishment failed to lead to fair allocations in these simulations
(Figure 5.2). This outcome is contrary to the experimental results of Fehr and
Fischbacher (2003) in which anonymous human observers routinely paid to punish those
who offered below 50% in a laboratory ultimatum game. A likely explanation for this
difference is that cultural factors, which were explicitly excluded in the present
simulation model, played a significant role in outcome of the Fehr and Fischbacher
experiment. This concurs with conclusions drawn from ultimatum games played between
chimpanzees that considerations of fairness are limited to humans (Jensen et al. 2007).
An important aspect of these transitions in mean ending offers is that the population
appears to flip between only two basins of attraction – those offers which evolve when M
is zero or relatively low, and offers of ~99% when M is relatively high. A possible
explanation of this rapid transition is through consideration of agents’ relative fitness.
Because reproductive success is explicitly a function of payoffs in this model, a
consideration of relative fitness means a consideration of relative payoffs. For a given
agent i let pi equal the agent's absolute payoff and let
p
equal the mean payoff of i’s n
neighbors.
p
nis then equal to the sum payoffs of agent i's neighbors. i’s relative payoff
before punishing another might be represented as
[5.1] pn
pi.
Using previous definitions of the punishment multiplier M and the cost of punishment c,
the agent's relative payoff after a single instance of punishment can then be described as

[5.2] (p_i − c) / (p̄n − cM).

In order for punishment to be evolutionarily beneficial, it is expected that i's relative
payoff will be greater after an act of punishment, so that

[5.3] (p_i − c) / (p̄n − cM) > p_i / (p̄n),

which simplifies to

[5.4] p_i / (p̄n) > 1/M.
From [5.4] it is clear that as M increases there should be some threshold value at which
punishment switches from being a detrimental strategy to one that is beneficial. Once this
condition is met an observer actually benefits, in terms of relative payoff, from punishing
its neighbor and there is no reason mathematically to limit the amount of punishment.
Therefore, as M increases, the observed transitions in offer rates are not unexpected.
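The threshold behavior implied by [5.4] can be checked numerically. A quick sketch with arbitrary illustrative payoffs (the function name is mine):

```python
def punishment_pays(p_i, neighbor_sum, c, M):
    """True when i's relative payoff rises after one act of punishment, i.e.
    (p_i - c) / (neighbor_sum - c*M) > p_i / neighbor_sum."""
    return (p_i - c) / (neighbor_sum - c * M) > p_i / neighbor_sum

# With p_i = 5 and neighbors' payoffs summing to 40, p_i / (p-bar * n) = 0.125,
# so by [5.4] punishment should pay only when M exceeds 1 / 0.125 = 8:
print(punishment_pays(5.0, 40.0, c=0.5, M=6.0))   # False
print(punishment_pays(5.0, 40.0, c=0.5, M=10.0))  # True
```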
In general this result suggests that the ratio of costs between punisher and
punishee can dramatically affect experimental outcomes and that researchers should be
conscientious about the selection of M in both experiments and simulations. Because of
cost constraints it is perhaps expected that laboratory experiments with punishment
would not undertake a broader exploration of the parameter M. However, this is less of a
concern with computer models and future studies that make use of simulations could
easily include a sensitivity analysis of punisher/punishee cost ratios.
Furthermore, it is unlikely that such a punishment mechanism as that
implemented in this simulation would be well-received as an institutional solution to
promote cooperation among humans. Instead, this result supports the notion that
punishment – taken too far – can be counterproductive (see Molm 1994).
Network type and the transition value of M
A further result of this experiment is that the approximate value of M at which a
rapid transition in offers occurred was dependent on the neighborhood type. Transitions
occurred at higher values of M as the number of neighbors per agent, or average degree,
increased. Approximate values of these transitions are presented in Table 5.7.
A possible explanation for this trend is provided by extending the relative fitness
model described in equation [5.4] to demonstrate not only that a transition is expected,
but also at what value of M such a transition might occur. Given that the expected payoff
of any agent drawn randomly from the population = E(p), the expected relative payoff
[5.1] for any agent with n neighbors can be simplified as
[5.5] E(p) / (nE(p)) = 1/n.
Substituting the expected relative payoff of any agent [5.5] into equation [5.4] reveals
that punishment is expected to become an evolutionarily beneficial strategy when
[5.6] M > n,
and that as punishment becomes a preferred strategy it will drive a rapid transition from
low offers to high offers.
A cursory inspection of the approximate transition points from low offers to high
offers under different neighborhood types (Table 5.7) indicates that equation [5.6] likely
presents a highly oversimplified model. It is more likely that the term p̄n, the part of
relative payoff describing exactly what an agent's payoff is relative to, is much more
complex than a simple sum of immediate neighbors' payoffs. A more general
representation of the relative payoff of agent i would be

[5.7] p_i / Σ_{j≠i} θ_j p_j,

where θ_j is a weight indicating the degree to which the payoff of every other agent j in
the population affects agent i. For immediate neighbors this weight may be relatively
high and for distant members of the population it may be 0. However, a benefit of this
more general formulation of relative payoff is that it accommodates complex interactions,
such as the possibility that distant members of the population influence an agent, or the
possibility that immediate neighbors, despite their proximity, have negligible impact.
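The generalized formulation in [5.7] is straightforward to compute. A small sketch with hypothetical weights (the function name and values are mine):

```python
def relative_payoff(p_i, other_payoffs, weights):
    """Relative payoff of agent i per [5.7]: p_i divided by the theta-weighted
    sum of the other agents' payoffs."""
    return p_i / sum(th * p for th, p in zip(weights, other_payoffs))

# Two immediate neighbors weighted fully, one distant agent weighted 0:
r = relative_payoff(2.0, [4.0, 4.0, 100.0], [1.0, 1.0, 0.0])
print(r)  # 0.25
```

Setting all weights for immediate neighbors to 1 and all others to 0 recovers the simple p_i / (p̄n) of equation [5.1].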
Future Directions
The different neighborhood structures used in this experiment (Table 5.3) were
chosen because they have been used frequently in similar simulations for many decades.
However, it is often overlooked that these symmetric and convenient neighborhood types
are but a tiny subset of the vast number of possible ways that agents may be connected in
a population. It is therefore prudent to embed research such as this in social network
theory, with its many tools for analyzing a vast number of possible structures. This paper
has merely explored possible relationships between the number of neighbors per agent
and mean ending offers. However it is probable that a better explanation for observed
correlations is much more complex and that results are driven by subtler elements of the
network structure.
A prime candidate for extending this work into social network theory is a better
determination of what constitutes the denominator p̄n in the equation for relative fitness
[5.1], or alternatively how each agent in a population is weighted in [5.7]. With a more
accurate description of what exactly it is that an agent's fitness is relative to, the current
model can better explain and predict the effects of altruistic punishment on fair
allocations and cooperation.
This study also suggests questions for experimentalists. Though costs may be an
inhibitive factor, it would be worthwhile to study the effects of systematically varying the
punishment multiplier M in a controlled laboratory setting. A replication of this
simulation with live subjects may isolate cultural or other factors that allow third-party
punishment to induce fair allocations and may better guide policy makers in designing
institutional solutions to social dilemmas.
In addition, laboratory ultimatum games may be designed to further explore the
role of neighborhood structure on offer rates. Like the current simulation, this may be
done without including 3rd party punishment. Participants may remain anonymous with
the experimenter controlling the array of possible interactions between participants.
Again, this would complement the current study by helping to separate the effect that
culture has on offer rates from effects due strictly to the way in which the population is
structured.
Table 5.1
Some commonly-cited experiments using altruistic punishment and their values of the
punishment multiplier M
Laboratory experiments (value of M):
Dreber et al. (2008): 4.0
Herrmann et al. (2008): 3.0
Gürerk et al. (2006): 3.0
Fehr and Fischbacher (2004): 3.0
Shinada et al. (2004): 3.0
Andreoni et al. (2003): 5.0
Fehr and Fischbacher (2003): 3.0
Masclet et al. (2003): function
Fehr and Gächter (2002): 3.0
Fehr and Gächter (2000): function
Ostrom et al. (1992): 2.0, 4.0

Simulations and models (value of M):
Brandt et al. (2006): 1.2
Fowler (2005): 2.0, 3.0
Gardner and West (2004): 30.0
Jaffe (2004): 1.0
Brandt et al. (2003): 1.5
Boyd et al. (2003): 4.0
Table 5.2
Strategy components used by agents in the ultimatum game
Component Description
x amount offered to a responder
α amount below which an offer is rejected
t amount below which to punish an observed offer
c amount to spend on punishment
Note: x, α, t, c ∈ [0, 1].
Table 5.3
Network types used in the ultimatum game
Network type Description of neighbors Number of neighbors
Linear left, right 2
von Neumann left, right ,up, down 4
Hexagonal up, down, diagonals 6
Moore left, right, up, down, diagonals 8
Complete every other agent N – 1
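As an illustration of how such neighborhoods are typically built on a toroidal lattice (a sketch assuming the 625 agents of this model occupy a 25 x 25 torus; the model's actual implementation is not given here):

```python
def von_neumann_neighbors(row, col, side):
    """The four von Neumann neighbors (left, right, up, down) of a cell
    on a toroidal side-by-side lattice; edges wrap around."""
    return [
        (row, (col - 1) % side), (row, (col + 1) % side),
        ((row - 1) % side, col), ((row + 1) % side, col),
    ]

# On a 25 x 25 torus (N = 625), a corner cell's neighborhood wraps:
print(von_neumann_neighbors(0, 0, 25))  # [(0, 24), (0, 1), (24, 0), (1, 0)]
```

Hexagonal and Moore neighborhoods extend the same modular-arithmetic idea to the diagonal cells.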
Table 5.4
Payoff matrix of the ultimatum game with 3rd party punishment
                xi ≥ αj (accepted)            xi < αj (rejected)
                xi ≥ tk       xi < tk         xi ≥ tk       xi < tk
Proposer i      1 − xi        1 − xi − ckM    0             −ckM
Responder j     xi            xi              0             0
Observer k      0             −ck             0             −ck
Note: see Table 5.2 for an explanation of variables used in the payoff functions.
Table 5.5
Simulation parameters used in the ultimatum game
Parameter Values
The population size (N) 625a
The number of generations in a single simulation run 30,000
The number of runs 100
The number of games initiated by each agent
in one generation