All content in this area was uploaded by Ze Hong on Aug 27, 2021
Title: Combining Conformist and Payoff Bias in Cultural Evolution: An Integrated Model for
Human Decision Making
Author: Ze Hong a1
Author Affiliations:
a Department of Human Evolutionary Biology, Harvard University, 11 Divinity Avenue, 02138,
Cambridge, MA, United States
1 To whom correspondence should be addressed: ze_hong@g.harvard.edu
1. Introduction
Unlike most animals, humans obtain a tremendous amount of information from conspecifics
(Richerson & Boyd, 2005). The transmission of information in human societies has been
extensively studied as an evolutionary process both theoretically (Boyd & Richerson, 1985;
Feldman & Cavalli-Sforza, 1976; Kendal et al., 2009) and empirically (Henrich & Henrich,
2010; Mesoudi, 2008). Much research has focused on transmission biases, the psychological
tendencies of individuals to favor specific cultural variants over others (Henrich &
McElreath, 2003). These biases can result in evolutionary dynamics that significantly differ from
genetic transmission as genetic material can only be passed from parents to offspring whereas
cultural information can flow through multiple transmission channels (from non-parents, peers,
etc.) (Creanza et al., 2017).
Transmission biases often allow for the adaptive evolution of culture (Kendal et al., 2018)
and may themselves be viewed as having a genetic basis and thus be subject to natural selection
(Laland, 2004; Mesoudi, 2005). Much effort has been devoted to examining the conditions under
which various transmission biases evolve (Kendal et al., 2009; Muthukrishna et al., 2016);
among the proposed biases, conformist bias and payoff bias have received particular attention
(Boyd & Richerson, 2009; Denton et al., 2020; McElreath et al., 2008; Whitehead & Richerson,
2009). In the cultural evolution literature, conformist bias refers to a specific kind of frequency-
dependent copying strategy where individuals adopt the most common cultural variant with
probability that is higher than its actual frequency in the population (Boyd & Richerson, 1985;
Henrich & Boyd, 1998). Payoff biased imitation, on the other hand, has been discussed in both
economics (Schlag, 1998) and cultural evolution (Boyd & Richerson, 1985, 2009; Mesoudi &
O’Brien, 2008), and generally refers to the type of copying strategy where the probability of
adopting a cultural variant depends on some observed payoff (the same bias has also been called
"success bias" (Baldini, 2012) and "indirect bias" in Boyd & Richerson's (1985) original
formulation). In some recent theoretical models, it has been used to describe the trait-adoption
strategy where the probability of adopting a particular cultural variant is positively related (e.g.
directly proportional) to its relative payoff (Baldini, 2012; Kendal et al., 2009).
Typically, the different transmission biases and the associated learning rules are treated as
distinct strategies favored in different environmental contexts. For example, conformist
transmission has been shown to be favored (compared to unbiased frequency dependent
transmission) when the number of traits involved is large (Nakahashi et al., 2012) or the
population size is large (Perreault et al., 2012). Similarly, payoff bias has been suggested to be
favored when the high-payoff variant is rare and the payoff information is not very stochastic
(Baldini, 2012). Such treatment allows for evolutionarily stable strategy (ESS) analysis and has
provided much insight for both understanding how our ancestral environmental conditions might
have shaped our learning psychology and how such learning psychology might flexibly respond
to specific situations that individuals may encounter in their lifetime. In reality, however, humans
likely possess a suite of learning strategies and the actual decision making in a given situation
may involve more than one strategy. In other words, different types of learning are not
psychologically distinct processes (Heyes, 1994; Plotkin, 1988), and instead of employing
individual strategies in particular learning instances, humans may combine multiple strategies
into a single decision making calculus (Perreault et al., 2012).
However, there are many ways to combine or integrate different learning strategies into a
single strategy (hereafter referred to as "integrated strategy"). Many existing cultural
evolutionary models treat these integrated strategies as having a step-like structure; that is,
individuals may utilize strategy 1 by default but will switch to strategy 2 if certain criteria are
met. For example, individuals may first attempt payoff biased imitation but will fall back to some
frequency dependence if observed payoffs are tied (McElreath et al., 2008). Similarly, Boyd &
Richerson (1995) model a situation where individuals first compare the payoffs of two variants at
a cost (individual learning) and will imitate (social learning) if the payoff difference is not
sufficiently large. In a more general case, Enquist et al. (2007)'s “critical social learning” strategy
has the same structure: individuals attempt social learning first and will perform individual
learning if socially acquired behavior is deemed unsatisfactory by some standards. Relatedly, in
an earlier seminal work on conformist transmission (Henrich & Boyd, 1998), social learning and
individual learning are attempted at different stages of individuals' life cycle. In contrast, much
less attention (with the notable exception of Perreault et al. (2012)) has been paid to the kind of
integrated strategies where information produced by transmission biases is processed
simultaneously.
Existing models have also primarily focused on comparing individual learning and social
learning, with the goal of examining the conditions under which either kind of learning is
evolutionarily advantageous. It is worth noting that “social learning” is not a single strategy but
refers to a multitude of ways in which individuals acquire information from others in the
community, and much less effort has been devoted to understanding how various social learning
biases (such as conformist bias and payoff bias) may interact with one another and their relative
importance in influencing trait adoption decisions. Given our species' enormous reliance on
social learning, a closer examination of these different types of social learning strategies may be
particularly informative.
In this paper, I build upon and complement existing work by proposing a simple model
where individuals process frequency and socially acquired payoff information simultaneously in
a single decision-making calculus. To my knowledge, there has not been any empirical evidence
that humans utilize one learning strategy first and then another in a sequential manner when
deciding what cultural variant to adopt, and I argue that my model setup is a more realistic
description of the actual psychological mechanism of human decision making for two reasons.
First, decades of research in cognitive psychology have demonstrated that humans
have rich cognitive structures that process input information in rather sophisticated ways, in
contrast with blunt stimulus-response behaviorism (Greenwood, 1999; Miller, 2003; Pekala &
Pekala, 1991). Second, humans are capable of integrating different kinds of information
in a single inferential process to respond flexibly and adaptively to a multi-dimensional
environment (Angelaki et al., 2009; Kayser & Shams, 2015). Instead of identifying some optimal
solution, I aim to illustrate the advantage of utilizing both frequency dependent bias (including
conformist bias) and payoff bias and briefly discuss the implications for human social learning.
While the ultimate elucidation of the exact mechanisms of human information processing and
decision making is likely to require breakthroughs in neurobiology and brain science, theoretical
models that take an evolutionary approach can provide potential direction and guidance given
that millions of years of evolution presumably has equipped humans with some adaptive design
of information acquisition and processing (Richerson, 2019).
2. Model and Results
In this stylized model, agents face a decision of adopting one of the two dichotomous cultural
variants (C1 and C2) that have associated payoffs which can be observed with error. I propose a
straightforward algebraic way of combining frequency and payoff information into a single
probabilistic decision making equation, and examine the evolutionary dynamics of cultural
variants under such decision making strategy. I then consider the evolution of the relative
importance (referred to as "weights" in the model) that individuals place on observed frequency
and payoff under various conditions.
The present model differs from previous theoretical work in that it assumes the
environment is constant and therefore one cultural variant is strictly superior to the other
regarding its payoff. While it is true that the payoff/fitness benefit of many cultural traits depends
on environmental states (e.g. variant A confers higher payoff in state 1 but relatively lower
payoff in state 2) (Richerson, 2019) which presumably selected for our capacity for social
learning, once the cultural capacity is evolved it needs to deal with the myriad of cultural
variants whose payoffs do not necessarily depend on environments. In other words, there exist
cultural traits where one variant is simply better (has higher payoff) on average than another
across different environmental states. This is especially true in the domain of technology;
examples include the replacement of stone tools by bronze/iron tools (Edmonds, 2003) and the
numerous technological breakthroughs during the industrial revolution (Thackray, 1970). Indeed,
cumulative improvement in technological variants would be difficult to imagine if all variants
confer exactly the same average payoff. The large repertoire of cultural items in human
populations means that naive individuals may often encounter situations where they need to
evaluate alternative cultural variants and decide which one(s) to adopt, and this creates a
selective environment in which individuals with a decision-making apparatus that increases their
chance of adopting the high-payoff variants would be favored by natural selection.
2.1. Baseline Model
For analytic convenience, I take the typical assumptions of asexual reproduction and
non-overlapping generations (Day & Bonduriansky, 2011; see footnote 1). Naive agents randomly sample a number
of cultural models from the parental generation and make their adoption decision based on both
the number of models possessing C1/C2 and the payoff of C1/C2.
Denote the numbers of C1 and C2 models in the sample by $n_1$ and $n_2$, and the payoffs of C1
and C2 by $s_1$ and $s_2$, respectively. Assuming the payoff observation error $\epsilon$ is normally
distributed with mean 0 and variance $\sigma^2$ for both variants, define a naive agent's probability
of adopting C1 as
$$
\Pr(\text{C1 adopted}) =
\begin{cases}
1 & \text{if } n_2 = 0 \\
0 & \text{if } n_1 = 0 \\
\dfrac{n_1 \cdot w_a + (s_1 + \epsilon) \cdot w_o}{(n_1 + n_2) \cdot w_a + (s_1 + \epsilon + s_2 + \epsilon) \cdot w_o} & \text{otherwise}
\end{cases}
\tag{1}
$$
where $n_1$ and $n_2$ represent the number of sampled models who possess C1 and C2
respectively, and $w_a$ and $w_o$ represent the weights attached to observed variant frequency
and observed payoff. Note that $w_a$ and $w_o$ can theoretically be any real number, while in
practice their values need to be non-negative to be sensible. Since the observed payoff error
$\epsilon$ has mean 0, it will be omitted in subsequent analytic formulations. This particular way of
constructing the probability of adopting C1 ensures that $\Pr(\text{C1 adopted})$ is properly bounded
between 0 and 1, and the relative importance of observed frequency and payoff can be flexibly
adjusted. Note that in the special case where one of the weights is zero, equation (1) becomes
either frequency dependent transmission ($w_o = 0$) or payoff biased transmission ($w_a = 0$).
In equation (1), the frequency-dependent component of $\Pr(\text{C1 adopted})$ is unbiased
(conformist bias will be added later) and the payoff-dependent component follows a version of
the "proportional imitation rule" (Schlag, 1998). In this case, naive individuals compare C1 and
C2's payoffs, which proportionally contribute to the overall probability of adopting C1/C2.
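As a minimal sketch (not the simulation code used for this paper), equation (1) can be written as a short Python function. The function name and the `sigma` and `rng` arguments are illustrative choices, and the sketch assumes that the two payoffs are observed with independent error draws, which equation (1) does not state explicitly.

```python
import random


def prob_adopt_c1(n1, n2, s1, s2, w_a, w_o, sigma=0.0, rng=random):
    """Probability that a naive agent adopts C1, following equation (1).

    Assumption (not stated in the text): each payoff is observed with its
    own independent N(0, sigma^2) error draw.
    """
    if n2 == 0:  # only C1 models were sampled
        return 1.0
    if n1 == 0:  # only C2 models were sampled
        return 0.0
    # Observed payoffs; no noise is drawn when sigma is zero.
    obs1 = s1 + (rng.gauss(0.0, sigma) if sigma > 0 else 0.0)
    obs2 = s2 + (rng.gauss(0.0, sigma) if sigma > 0 else 0.0)
    numerator = n1 * w_a + obs1 * w_o
    denominator = (n1 + n2) * w_a + (obs1 + obs2) * w_o
    return numerator / denominator
```

With `w_o = 0` the function reduces to the sample frequency $n_1/(n_1+n_2)$, and with `w_a = 0` and `sigma = 0` it reduces to the relative payoff $s_1/(s_1+s_2)$, matching the two special cases noted above.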
Footnote 1: These assumptions will be relevant in the agent-based simulation later in the paper.
First, let us examine the change in C1 frequency from one generation to the next under
such an adoption rule. Let the frequency of C1 at a given time be $p$. As the individuals choose
their models randomly from the parental generation, the number of models with cultural variant
C1 should follow a binomial distribution (Boyd & Richerson, 1985). In the next generation, the
frequency of C1, $p'$, is therefore
$$
p' = \sum_{n_1=1}^{n-1} \left[ \frac{n_1 \cdot w_a + s_1 \cdot w_o}{n \cdot w_a + (s_1 + s_2) \cdot w_o} \right] \cdot \binom{n}{n_1} \cdot p^{n_1} \cdot (1-p)^{n-n_1} + p^n
\tag{2}
$$
where $n$ represents the total number of sampled models ($n = n_1 + n_2$) (see footnote 2).
Simplifying equation (2), we have
$$
p' = \frac{n \cdot p \cdot w_a + \left( s_1 - (1-p)^n \cdot s_1 + p^n \cdot s_2 \right) \cdot w_o}{n \cdot w_a + (s_1 + s_2) \cdot w_o}
\tag{3}
$$
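The step from equation (2) to equation (3) can be checked with two standard binomial identities. The following is a sketch of that check, with $q = 1 - p$ introduced here only as shorthand.

```latex
% Binomial theorem and binomial mean, with the n_1 = 0 and n_1 = n terms removed:
\sum_{n_1=1}^{n-1} \binom{n}{n_1} p^{n_1} q^{n-n_1} = 1 - p^n - q^n ,
\qquad
\sum_{n_1=1}^{n-1} n_1 \binom{n}{n_1} p^{n_1} q^{n-n_1} = np - n p^n .

% Substituting into equation (2):
p' = \frac{w_a \,(np - n p^n) + s_1 w_o \,(1 - p^n - q^n)}{n w_a + (s_1 + s_2) w_o} + p^n .

% Writing p^n over the common denominator adds n w_a p^n + (s_1 + s_2) w_o p^n
% to the numerator, so the p^n terms in w_a cancel and those in w_o leave s_2 w_o p^n:
p' = \frac{n p \, w_a + \left( s_1 - q^n s_1 + p^n s_2 \right) w_o}{n w_a + (s_1 + s_2) w_o} .
```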
To identify possible equilibria, we can simply set $p' = p$. However,
analytically solving equation (3) can be unwieldy; according to the Abel-Ruffini theorem, there is
no general solution in radicals for polynomial equations of degree five or higher. Further
simplifying and rearranging equation (3), we get
$$
p^* = \frac{s_1 - (1-p^*)^n \cdot s_1 + (p^*)^n \cdot s_2}{s_1 + s_2}
\tag{4}
$$
where $p^*$ denotes the equilibrium C1 frequency. Note that when the number of sampled
models $n$ is relatively large, we may ignore terms with $(p^*)^n$ and $(1-p^*)^n$, and therefore
equation (4) becomes
$$
p^* = \frac{s_1}{s_1 + s_2}
\tag{5}
$$
Footnote 2: When $w_o = 0$ (i.e. the probability of adopting cultural variants is affected only by their
frequency), equation (2) is a special case of equation (5) in Denton et al. (2020).
The accuracy of the approximation in equation (5) depends on the magnitude of $n$,
and a thorough exploration of the parameter space to check its validity can be found in the
Supplemental Material. Equation (5) shows that the frequency of C1 at equilibrium is determined
by the relative payoff of the two cultural variants, independent of weights and number of models
sampled (given the approximation assumption that $n$ is large). This makes intuitive sense, as
unbiased frequency dependent transmission itself does not change the relative frequency of
cultural variants. In the following sections, equation (5) will also be used as a baseline condition
to contrast with more complex situations that include additional parameters.
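The recursion and its large-$n$ approximation can be checked numerically. The sketch below is illustrative rather than the code behind the figures: it evaluates the sum in equation (2) and the closed form in equation (3), and locates the interior root of equation (4) by bisection. The function names, the bisection bounds, and the assumption of a single interior root with $p' > p$ to its left are mine.

```python
from math import comb


def next_freq_sum(p, n, s1, s2, w_a, w_o):
    """Equation (2): expected C1 frequency in the next generation."""
    denom = n * w_a + (s1 + s2) * w_o
    total = p ** n  # samples containing only C1 models
    for n1 in range(1, n):
        adopt = (n1 * w_a + s1 * w_o) / denom
        total += adopt * comb(n, n1) * p ** n1 * (1 - p) ** (n - n1)
    return total


def next_freq_closed(p, n, s1, s2, w_a, w_o):
    """Equation (3): closed form of the same recursion."""
    numer = n * p * w_a + (s1 - (1 - p) ** n * s1 + p ** n * s2) * w_o
    return numer / (n * w_a + (s1 + s2) * w_o)


def interior_equilibrium(n, s1, s2, lo=1e-6, hi=1 - 1e-6, iterations=100):
    """Bisection on equation (4), written as g(p) = RHS - p.

    Assumes g > 0 just above p = 0, g < 0 just below p = 1, and a single
    interior root, which holds for the parameter values used here.
    """
    def g(p):
        return (s1 - (1 - p) ** n * s1 + p ** n * s2) / (s1 + s2) - p

    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p_star = interior_equilibrium(n=10, s1=1.0, s2=1.2)
approx = 1.0 / (1.0 + 1.2)  # equation (5)
```

For $n = 10$, $s_1 = 1$ and $s_2 = 1.2$, `p_star` and `approx` should differ only slightly, in line with the claim that equation (5) is already a good approximation at this sample size.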
Figure 1 provides a graphical illustration of the relationship between the current frequency of
cultural variant C1 and the change in frequency using both analytic computation (equation 2) and
agent-based simulation (equation 1). As can be seen, when $n = 10$, equation (5) already
provides a good approximation, as the $p' - p$ curve crosses 0 right at $\frac{s_1}{s_1 + s_2}$. This
stable polymorphic equilibrium ($p' - p$ crossing 0 with negative slope) exists in most cases in
addition to the obvious equilibrium states of $p = 0$ and $p = 1$. Note that both $p = 0$ and
$p = 1$ are unstable equilibria, meaning they can only be maintained in the absence of
innovation/invading variants, and a slight deviation would push $p$ away from these equilibrium
states. This suggests that in the absence of additional forces, C1 and C2 will co-exist in the
population regardless of the initial population composition (so long as the population does not
entirely consist of C1 or C2) under unbiased frequency dependence and proportional imitation
based on observed payoff.
Figure 1: Relationship between current frequency $p$ and change in frequency $p' - p$ ($p'$ denotes the
frequency in the next generation) under different parameter combinations. $s_1$ is fixed at 1, and both
$w_a$ and $w_o$ are set to 1. Analytic computation and agent-based simulation are represented by black
solid lines and red dotted lines respectively, and the approximated equilibrium values according to
equation (5), $\frac{s_1}{s_1 + s_2}$, are marked by dotted blue lines.
2.2 Adding Conformist Bias
We now include a conformist bias parameter. Classically, conformist bias has been modeled as
the probability of adopting the most common variant being its actual frequency plus a positive
value (denoted by $D$). Following the notation scheme in equation (1), an individual's
probability of adopting C1 in the presence of conformist transmission bias thus becomes
$$
\Pr(\text{C1 adopted}) =
\begin{cases}
1 & \text{if } n_2 = 0 \\
0 & \text{if } n_1 = 0 \\
\dfrac{(n_1 + D) \cdot w_a + s_1 \cdot w_o}{(n_1 + n_2) \cdot w_a + (s_1 + s_2) \cdot w_o} & \text{if } n_1 \neq 0 \wedge n_2 \neq 0 \wedge n_1 > n_2 \\
\dfrac{(n_1 - D) \cdot w_a + s_1 \cdot w_o}{(n_1 + n_2) \cdot w_a + (s_1 + s_2) \cdot w_o} & \text{if } n_1 \neq 0 \wedge n_2 \neq 0 \wedge n_1 < n_2
\end{cases}
\tag{6}
$$
If we set $w_o = 0$, i.e. agents make trait adoption decisions based only on frequency
information, equation (6) closely resembles the classic result in the cultural evolution literature,
where Boyd and Richerson (1985) solved the special case of $n = 3$. Here we are
interested in how conformist bias may affect the adoption of cultural variants when both payoff
and frequency enter the individual's decision-making calculus. Again, we first look at the
relationship between current frequency and change in frequency under different $D$
conditions. Given that the current frequency of C1 is $p$, the frequency of C1 in the next
generation, $p'$, can be expressed similarly to equation (2):
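$$
p' = \sum_{n_1=0}^{n}
\begin{cases}
\left[ \dfrac{n_1}{n} \right] \cdot \dbinom{n}{n_1} \cdot p^{n_1} \cdot (1-p)^{n-n_1} & \text{if } n_1 = 0 \vee n_1 = n \\
\left[ \dfrac{n_1 \cdot w_a + s_1 \cdot w_o}{n \cdot w_a + (s_1 + s_2) \cdot w_o} \right] \cdot \dbinom{n}{n_1} \cdot p^{n_1} \cdot (1-p)^{n-n_1} & \text{if } n_1 = n/2 \\
\left[ \dfrac{(n_1 + D) \cdot w_a + s_1 \cdot w_o}{n \cdot w_a + (s_1 + s_2) \cdot w_o} \right] \cdot \dbinom{n}{n_1} \cdot p^{n_1} \cdot (1-p)^{n-n_1} & \text{if } n_1 > n/2 \\
\left[ \dfrac{(n_1 - D) \cdot w_a + s_1 \cdot w_o}{n \cdot w_a + (s_1 + s_2) \cdot w_o} \right] \cdot \dbinom{n}{n_1} \cdot p^{n_1} \cdot (1-p)^{n-n_1} & \text{if } n_1 < n/2
\end{cases}
\tag{7}
$$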
where $D$ represents the magnitude of conformist bias. Equation (7) is simply the specification
of the probability of adopting C1 under various sample composition conditions. Note that when
$n_1 = 0 \vee n_1 = n$, agents never get to experience the alternative variant and its payoff; therefore
the $\frac{s_1 \cdot w_o}{(s_1 + s_2) \cdot w_o}$ component is absent in the frequency calculation of C1 in the next generation.
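Equation (7) can be evaluated directly by summing over sample compositions. The following Python sketch is illustrative only (function names are mine), and it applies the first case of equation (7) whenever the sample is homogeneous, so that $n_1 = n$ takes precedence over $n_1 > n/2$. Note that for large $D$ the bracketed expression can exceed 1 for some sample compositions; the sketch does not clip it.

```python
from math import comb


def adopt_prob_conformist(n1, n, s1, s2, w_a, w_o, D):
    """Bracketed adoption probability in equation (7) for a sample with n1 C1 models."""
    if n1 == 0 or n1 == n:
        return n1 / n  # homogeneous sample: no payoff comparison is possible
    if 2 * n1 > n:
        bias = D       # C1 is the majority variant in the sample
    elif 2 * n1 < n:
        bias = -D      # C1 is the minority variant in the sample
    else:
        bias = 0.0     # tie: unbiased frequency dependence
    return ((n1 + bias) * w_a + s1 * w_o) / (n * w_a + (s1 + s2) * w_o)


def next_freq_conformist(p, n, s1, s2, w_a, w_o, D):
    """Equation (7): expected C1 frequency in the next generation."""
    return sum(
        adopt_prob_conformist(n1, n, s1, s2, w_a, w_o, D)
        * comb(n, n1) * p ** n1 * (1 - p) ** (n - n1)
        for n1 in range(n + 1)
    )


# Change in frequency at p = 0.9 with equal payoffs, without and with conformity.
change_unbiased = next_freq_conformist(0.9, 10, 1.0, 1.0, 1.0, 1.0, 0.0) - 0.9
change_strong = next_freq_conformist(0.9, 10, 1.0, 1.0, 1.0, 1.0, 2.0) - 0.9
```

Setting `D = 0` recovers the unbiased recursion, while `D = 2` with equal payoffs gives a positive change at $p = 0.9$, the push towards fixation of the common variant.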
As in the case of unbiased transmission, we first examine the change in frequency of C1,
$p' - p$, as a function of the current frequency of C1, $p$. Figure 2 shows the expected change in
frequency under different conditions, and a few remarkable features should be noted. First, as
pointed out previously, there is always a stable polymorphic equilibrium ($p' - p = 0$) when the
conformist bias parameter $D$ is zero, which is determined by the relative payoff difference of
the two cultural variants as well as the number of models sampled. On the other hand, when the
conformist bias parameter $D$ is sufficiently large ($D = 2$), no stable polymorphic equilibrium exists and
the population will always move towards either $p = 0$ or $p = 1$ depending on the initial
frequency. The intuition here is that because the most common variant is disproportionately
favored under conformist transmission, a strong conformist bias will tend to push the common
variant towards fixation.
Figure 2: Relationship between current frequency $p$ and change in frequency $p' - p$ in the presence of
conformist bias under different parameter combinations. $s_1$ is fixed at 1, and both $w_a$ and $w_o$ are set to
1. The reference line $y = 0$ is marked by the solid blue line. All values are computed according to equation
(7).
What is particularly interesting, however, is that when $D$ is of intermediate magnitude
($D = 0.6$ and $D = 1.2$) and the relative payoff difference between C1 and C2 is large, the
change in frequency can be entirely negative, meaning that the cultural variant that confers the higher
payoff (C2) will reach fixation regardless of the initial population composition. If we use C1 and
C2's payoffs as an index of fitness, this result suggests that moderate conformist transmission
bias can increase population mean fitness when individuals take both frequency and payoff into
account in trait adoption decisions. Intermediate conformist bias can both resist the invasion of a cultural
variant with lower payoff and allow for the spread of a cultural variant with higher payoff, because
proportional imitation based on relative payoff favors the high-payoff variant but does not push it
to fixation, as the low-payoff variant still has some probability of being adopted. As the
frequency of the high-payoff variant increases, conformist transmission may push it towards
fixation once it becomes the more common variant ($p > 0.5$).
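The "entirely negative" case can be illustrated numerically. The sketch below evaluates $p' - p$ from equation (7) on a grid of $p$ values for one intermediate bias value. The payoff $s_2 = 3$ is a value I chose for illustration and is not necessarily one of the values plotted in Figure 2; the function name and the grid are likewise illustrative.

```python
from math import comb


def delta_p(p, n, s1, s2, w_a, w_o, D):
    """Change in C1 frequency, p' - p, with p' taken from equation (7)."""
    denom = n * w_a + (s1 + s2) * w_o
    p_next = 0.0
    for n1 in range(n + 1):
        if n1 in (0, n):
            adopt = n1 / n  # homogeneous sample
        else:
            # +D if C1 is the sample majority, -D if the minority, 0 on a tie
            bias = D if 2 * n1 > n else (-D if 2 * n1 < n else 0.0)
            adopt = ((n1 + bias) * w_a + s1 * w_o) / denom
        p_next += adopt * comb(n, n1) * p ** n1 * (1 - p) ** (n - n1)
    return p_next - p


grid = [i / 100 for i in range(1, 100)]  # interior frequencies 0.01 ... 0.99

# Intermediate conformity (D = 1.2) with a large payoff advantage for C2 (s2 = 3).
changes = [delta_p(p, 10, 1.0, 3.0, 1.0, 1.0, 1.2) for p in grid]
all_negative = all(change < 0 for change in changes)

# For contrast: without conformity (D = 0) a rare C1 still increases.
rare_c1_change = delta_p(0.05, 10, 1.0, 3.0, 1.0, 1.0, 0.0)
```

Under these illustrative values the change in C1 frequency is negative across the grid, so C2 spreads from any interior starting point, whereas with $D = 0$ a rare C1 increases in frequency and the population remains polymorphic.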
2.3 Evolution of Information Weights
Since the combination of payoff bias and conformist bias may lead to better population level
outcomes (in terms of the adoption of the cultural variant with higher payoff), how would natural
selection operate on the relative weight placed on observed frequency and payoff? In this section
we allow the weights ($w_a$ and $w_o$) to evolve, and track their evolutionary trajectories under
various conditions.
Denote the fitness benefits that $s_1$ and $s_2$ confer as $f(s_1)$ and $f(s_2)$; the
expected fitness $z_i$ of a naive individual $i$ is thus
$$
z_i = \Pr(\text{C1 adopted} \mid w_{ai}, w_{oi}, p) \cdot f(s_1) + \left[ 1 - \Pr(\text{C1 adopted} \mid w_{ai}, w_{oi}, p) \right] \cdot f(s_2)
\tag{8}
$$
In the simplest case, setting $f(s_1) = s_1$ and $f(s_2) = s_2$ and fixing $w_o$ to be 1, we have
$$
z_i = \Pr(\text{C1 adopted} \mid w_{ai}, p) \cdot s_1 + \left[ 1 - \Pr(\text{C1 adopted} \mid w_{ai}, p) \right] \cdot s_2
\tag{9}
$$
Here $\Pr(\text{C1 adopted} \mid w_{ai}, p)$ denotes the overall probability of adopting C1 across
different population compositions; note that this probability is affected by both the individual's
weight $w_{ai}$ and the frequency of C1, $p$, in the population. Figure 3 provides a graphical
illustration of fitness as a function of the relative magnitude of $w_a$, where $p$ denotes the
frequency of C1 in the population, and the fitness values $z_i$ are computed according to
equation (9). Note that these fitness values are only comparable with each other within specific $s$
and $D$ settings (individual graphs in the panel of Figure 4). There are two points worth
noting: first, for a given payoff ratio between $s_1$ and $s_2$, $w_a$ always has the
opposite fitness effect when $p$ is large (0.9) vs. when $p$ is small (0.1). This means that whether a
particular weight value increases or decreases fitness crucially depends on the frequency of the
cultural variants in the population; only in the extreme case where one of the dichotomous
variants reaches fixation does fitness $z_i$ become independent of $w_{ai}$ (as can be seen from
equation 9). Second, the effect of the $w_a/w_o$ ratio on fitness is not very pronounced, especially
when the magnitude of this ratio is relatively large. In Figure 3, for example, when $w_a/w_o$
reaches around 2 (frequency information is valued twice as much as payoff information), there is
little further change in fitness $z$ as the ratio keeps increasing, suggesting that selection is
unlikely to drive either weight to 0.
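The frequency dependence of the effect of $w_a$ can be sketched for the unbiased case. The Python below is a simplified illustration, not the computation behind Figure 3: it restricts attention to $D = 0$ so that the closed form in equation (3) can stand in for $\Pr(\text{C1 adopted} \mid w_{ai}, p)$, and the default values ($n = 10$, $s_1 = 1$, $s_2 = 1.2$, $w_o = 1$) and function names are illustrative.

```python
def prob_c1_unbiased(p, n, s1, s2, w_a, w_o):
    """Equation (3): overall probability of adopting C1 when D = 0."""
    numer = n * p * w_a + (s1 - (1 - p) ** n * s1 + p ** n * s2) * w_o
    return numer / (n * w_a + (s1 + s2) * w_o)


def expected_fitness(w_a, p, n=10, s1=1.0, s2=1.2, w_o=1.0):
    """Equation (9): expected fitness of an individual with frequency weight w_a."""
    pr_c1 = prob_c1_unbiased(p, n, s1, s2, w_a, w_o)
    return pr_c1 * s1 + (1 - pr_c1) * s2


# C2 has the higher payoff here, so fitness falls as Pr(C1 adopted) rises.
fitness_common_c1 = [expected_fitness(w, p=0.9) for w in (0.5, 1.0, 2.0, 4.0)]
fitness_rare_c1 = [expected_fitness(w, p=0.1) for w in (0.5, 1.0, 2.0, 4.0)]
```

In this restricted setting, increasing `w_a` lowers expected fitness when the low-payoff variant C1 is common ($p = 0.9$) and raises it when C1 is rare ($p = 0.1$), with diminishing changes at larger weights.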
Figure 3: Relationship between weight attached to observed frequency $w_a$ and fitness $z$ in the presence of
conformist bias under different parameter combinations. $s_1$ is fixed at 1, $w_o$ is set to 1, and $n = 10$.
All values are computed according to equations (7) and (9), where $\Pr(\text{C1 adopted} \mid w_{ai}) = p'$.
To fully explore the evolutionary dynamics, I construct an agent-based simulation that
allows the weights to evolve. In particular, I track the temporal changes of both the frequency of
C1 ($p$) and the weight on observed frequency ($w_a$). Recall that the weights are genetically
transmitted under asexual reproduction, and the life cycle of agents is modeled as a simple
Wright-Fisher process with selection (Ewens, 2012), where the population is of constant size
$N$ and an agent's probability of contributing to the gene pool (reproduction) is proportional to
its fitness. In order to better visualize the evolutionary dynamics, I increase the selective pressure
by recomputing the fitness of individuals as their original fitness to the power of $\alpha$ (see
Supplemental Material for more detail). I ran a large number of simulations to fully explore the
parameter space (see Supplemental Material for simulation setup and parameter value details),
and Figure 4 shows the evolutionary trajectory of the frequency of C1 along two dimensions, the
initial frequency of C1 ($p_{initial}$) and the payoff of C2 ($s_2$). What is immediately noticeable
is that the value of $w_a$ changes primarily in the beginning in each condition and reaches
equilibrium after a certain number of generations. The reason is that once either C1 or C2
reaches fixation, individuals' fitness becomes a constant and everyone has the same fitness value.
Therefore, $w_a$ contributes to fitness only when there is a mixture of both C1 and C2 variants
in the population. The most informative condition in Figure 4 is perhaps $p_{initial} = 0.9$ and
$s_2 = 1.2$ (top right graph). Here, the starting frequency of C1 is high (0.9) and the "invading"
variant C2 has a higher payoff. $w_a$ quickly declines across all $D$ conditions but starts to
increase when C1's frequency drops below 0.5 for both $D = 1$ and $D = 2$. This is because
when the high-payoff variant is initially rare, it is better to rely more on payoff information (larger
$w_o$, thus relatively smaller $w_a$) as the relative payoff advantage is not matched by its
frequency in the population. As the high-payoff variant's frequency increases and it becomes the
more common variant, we have a different situation: in the case of unbiased frequency dependent
transmission ($D = 0$) its relative payoff and relative frequency become the same and thus
weights do not affect variant adoption probability anymore; in the case of conformist biased
transmission ($D = 1$ and $D = 2$) the better strategy becomes to rely more on frequency
information, as the contribution from the frequency component ($\frac{n_1 + D}{n_1 + n_2}$) to the overall
probability of adopting the high-payoff variant is higher (due to conformist bias) than the
contribution from the payoff component ($\frac{s_1}{s_1 + s_2}$).
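The reproduction step of the Wright-Fisher process described above can be sketched as fitness-proportional sampling with replacement. This is a hedged illustration and not the simulation code reported in the Supplemental Material: the function name is mine, the sketch covers only the selection step with fitness raised to the power $\alpha$, and it deliberately omits any mutation of the weights, since that part of the setup is specified only in the Supplemental Material.

```python
import random


def wright_fisher_step(weights, fitness, alpha=1.0, rng=random):
    """Sample the next generation's w_a values under selection.

    weights : list of parental w_a values (one per agent)
    fitness : list of parental fitness values z_i, in the same order
    alpha   : exponent used to strengthen selection (fitness ** alpha)

    The population size stays constant, and each offspring inherits the
    weight of a parent drawn with probability proportional to fitness ** alpha.
    """
    scaled = [f ** alpha for f in fitness]
    return rng.choices(weights, weights=scaled, k=len(weights))


parents = [0.5, 1.0, 2.0, 4.0]
offspring = wright_fisher_step(parents, fitness=[1.00, 1.05, 1.10, 1.15], alpha=5.0)
```

Each call returns a new list of the same length whose entries are copies of parental weights, with fitter parents over-represented on average.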
Figure 4: Temporal evolutionary trajectories of C1 frequency ($p$) and weight on frequency information
($w_a$) under various initial frequency and payoff conditions. $s_1$ and $w_o$ are set to 1, and $N = 1000$. Other
parameter values are specified in the Supplemental Material.
In the three other conditions in Figure 4, the change in $w_a$ is relatively slight, as the
populations rather quickly reach fixation. The general trend is however the same: if the
high-payoff variant is common, then large $w_a$ is favored until fixation; conversely, if the
low-payoff variant is common, then small $w_a$ is favored until fixation. This result is consistent
with Baldini's (2012) conclusion that payoff bias favors rare variants; if the high-payoff variant is
rare, then a stronger reliance on payoff (small $w_a$) would be more adaptive. The magnitude of
change in $w_a$, on the other hand, depends on the time that the population stays in mixed
composition; the longer the population consists of both C1 and C2, the more $w_a$ changes.
It should be noted that the above analyses focus on one pair of dichotomous variants and
assume neither passive mutation nor active innovation of variants (i.e. C1 and C2 never mutate
into each other). In the presence of either mutation or innovation that prevents the population
from reaching complete fixation, the relative magnitude of $w_a$ will matter insofar as the
population is polymorphic. Assuming the population reaches near fixation and the dominant
variant has high payoff, a constant supply of the low-payoff variant due to mutation/innovation will
cause an increase in $w_a$; that is to say, individuals that place more weight on frequency will enjoy a
fitness advantage. On the other hand, if the low-payoff variant reaches near fixation for whatever
reason yet cannot drive the high-payoff variant to extinction, then $w_a$ will decrease,
meaning those who rely more on payoff would have higher fitness. The second scenario is less
likely because normally we would expect the high-payoff variant to reach near fixation through a
number of mechanisms. Therefore, if we consider only one pair of dichotomous variants, a
stronger reliance on observed frequency ($w_a$) would be favored on average in settings where
the transmission fidelity is not 100% due to transmission errors or certain individuals
consciously experimenting with different variants. However, we need to keep in mind that 1) as $w_a$
gets larger, the marginal fitness benefit that it confers declines dramatically, as can be seen in
Figure 3, and 2) though not as common, there are cases where the low-payoff variant dominates
the population and some consideration of the payoff difference would be advantageous.
3. Discussion
3.1 Combining Frequency and Payoff Bias in Social Learning
Although there has been no lack of theorizing about transmission biases in cultural evolution, most
published work has treated these biases as distinct strategies and aims to identify evolutionarily
stable strategies. In this paper I show the evolutionary dynamics of a pair of dichotomous
cultural variants when individuals combine frequency and payoff information into a single
decision-making calculus, and how natural selection may have acted on the weights of the
two information input sources.
Unbiased frequency-dependent transmission leads the population into a state that
resembles Hardy-Weinberg equilibrium in population genetics (Meirmans, 2018) where
frequencies of variants remain constant. Conformist bias by definition favors the more common
variant and pushes the frequencies of dichotomous variants towards the absorbing states $p = 0$
and $p = 1$. Intuitively, the caveat of conformist-biased transmission is that a previously
adaptive cultural variant may become non-adaptive due to environmental change, and as a result
the population may get stuck in a sub-optimal condition. Some theoretical work even suggests
that relying on conformist transmission alone may lead to population collapse under certain
circumstances (Whitehead & Richerson, 2009).
Payoff bias on the other hand looks like an attractive alternative learning strategy as it
directly compares different cultural variants and favors the one with higher payoff. Assuming
payoff is statistically associated with fitness (see Baldini (2012) for instances where payoff may
be quite dissociated from fitness), a reliance on payoff information may confer fitness advantage
by adopting the high-payoff variant. However, payoff may be noisy and assessing it may involve
a cost (Nakahashi et al., 2012). My model shows that a straightforward algebraic combination of
these biases may prove a superior strategy: payoff bias tends to increase the frequency of the
high-payoff variant, and once it passes 0.5 conformist bias can quickly push it towards
fixation.
The effect of conformist bias's magnitude on the evolutionary dynamics is worth reiterating.
If it is too small it cannot effectively help the high-payoff variant that is already common in the
population to reach fixation, and if it is too large a rare high-payoff variant may never have a
chance to pass 0.5 in frequency. In this integrated decision-making calculus, the adaptive
synergistic interaction between the two biases requires the magnitude of $D$ to be within a
particular range. In fact, $D$'s magnitude likely matters whenever conformist bias is combined
in this way with some additional mechanism that makes the adoption of the variant with higher
payoff more likely, as the initial spread of the high-payoff variant is always suppressed by strong
conformity.
In an earlier seminal paper on conformist transmission, Henrich and Boyd (1998)
conclude that maximal conformity is favored under a broad range of conditions. That this
conclusion differs from mine is not surprising, as the results of simulation studies that explore
the evolution of conformity depend sensitively on the precise design of the simulations (Denton
et al., 2020). In this case, the different conclusions are due to a crucial difference in the model
setup: the payoffs that individuals obtain in Henrich and Boyd (1998) are a result of independent
individual learning and as such individuals are always comparing the payoffs of different cultural
variants, whereas in my model there is no individual experimentation and payoff information is
always obtained culturally, and therefore the payoff of the variant not possessed by one's cultural
models is simply not available to the naive individual for payoff evaluation. This means that in
the present setup, if one variant reaches high frequency, most naive individuals may never
experience the payoff of the alternative variant, and in cases where the rare variant happens to
have higher payoff, the population may nonetheless push the common variant (with lower payoff)
towards fixation in the presence of strong conformist bias.
Therefore, the population-level consequences of conformity hinge upon the extent to
which accurate payoff information can be obtained, either through independent individual
learning or through some other payoff-revealing mechanism. While individual trial-and-error learning
certainly occurs, I suggest that the payoff information of many cultural items cannot be
realistically evaluated and compared by potentially costly individual learning for two reasons.
First, naïve individuals may not even be aware of the existence of alternative variants if they do
not observe others possessing these variants; second, “trying” both variants and comparing their
payoffs may not always be feasible; for example, comparing the efficacy of two illness
treatments through individual learning requires one to intentionally get herself ill twice
(presumably at different times). In fact, Henrich and Boyd (1998) do show that when individual
learning is highly error-prone (small ρ), conformist bias does not evolve to large values. My
analysis here thus complements previous work by showing that in cases where payoff
information is obtained culturally, a very strong conformist bias may not be optimal at the
population level.
In general, integrated strategies tend to be more adaptive than single strategies used alone. In the
approach of Enquist et al. (2007), for example, “critical social learners” (individuals whose
strategy is to attempt social learning first and to perform individual learning if social learning
proves unsatisfactory) are more likely to obtain the more advantageous variant than individual
learners. However, in the absence of conformist bias the advantageous variant rarely reaches
fixation in the population, and a stable polymorphic equilibrium of the two cultural variants
exists. In light of the results of the present model, it is likely that critical social learners (as
well as individuals with other integrated strategies) would perform even better in terms of their
probability of obtaining the high-payoff variant if they incorporated some form of conformist bias
in their trait adoption decisions. This is because the conformist component in their decision-making
favors the more frequent variant, and a large proportion of critical social learners in the
population is likely to make the high-payoff variant the more frequent one. It should be kept in
mind, however, that integrated strategies may be more computationally intensive and thus
require costly neural machinery. The present model does not assume any cost of employing
integrated strategies, and further theoretical work may take this factor into consideration, ideally
in light of the neurological mechanisms of human learning.
3.2 How Many Cultural Models to Learn from? The Effect of n
Although most published theoretical models include some mention of the effect of n,
systematic treatment remains scant (see Denton et al. (2020) and Perreault et al. (2012) for
some exceptions). This is probably because the role n plays differs under different sets of
assumptions. In my model, n is directly involved in the approximation that leads to equation
(5). Here a large n ensures that most naïve agents will have both C1 and C2 models in their
sample; in other words, a relatively large n ensures that naïve agents have a chance to
experience the payoffs of both the C1 and C2 variants. When n is small, the rare variant
suffers a disadvantage disproportionate to its frequency, because its payoff is seldom observed, and
therefore the evolutionary dynamics are affected more by the frequency-dependent component. In the
most extreme case, where n = 1 (each agent randomly picks one model from the parental
generation), payoff information becomes entirely irrelevant and we end up with a special case of
unbiased frequency-dependent transmission.
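The role of n in exposing naïve agents to both variants can be made concrete with a short calculation. The sketch below assumes that each of the n models is drawn independently and at random from a large parental generation in which C1 has frequency p (an assumption about the sampling scheme, not a reproduction of the derivation of equation (5)). Under that assumption, the probability that a sample contains at least one C1 model and at least one C2 model is 1 - p^n - (1 - p)^n. The function name is illustrative.

```python
def prob_both_variants(p, n):
    """Probability that n independently sampled models include both variants.

    p : frequency of variant C1 in the parental generation
    n : number of cultural models sampled by a naive agent
    Assumes independent random draws from a large population.
    """
    return 1.0 - p ** n - (1.0 - p) ** n


if __name__ == "__main__":
    p = 0.1  # C1 is the rare variant
    for n in (1, 2, 5, 10, 20, 50):
        print(f"n = {n:2d}: P(sample contains both C1 and C2) = {prob_both_variants(p, n):.4f}")
```

With n = 1 the probability is zero, so no agent can compare payoffs, which corresponds to the extreme case discussed above; the probability rises towards one as n grows.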
Assuming no cost is incurred, sampling more individuals to learn from should always
lead to better inferences, particularly from a Bayesian perspective. Perreault et al. (2012) point
out that in their model a larger n sometimes leads to worse inference, but suggest that this is
an artefact of agents' relatively simple priors. In reality, people likely do not rigidly sample a
fixed number of individuals but rather make trait adoption decisions based on idiosyncratic
personal experiences, which include both individual learning and social informational inputs.
Therefore, n may be determined not by evolved human preferences but by external factors
such as population size and interconnectedness, both of which have been shown to be important
in human cultural evolution (Henrich, 2004). As such, demographic details of our ancestral
population may be needed to better understand the evolutionary dynamics and outcomes of
social learning strategies as affected by the number of models picked.
3.3 Evolution of Information Weights and Implications for Modeling Human Decision-making
In this stylized setting, then, how would natural selection operate on the weights of frequency and
payoff information, respectively? My results show that when one cultural variant strictly confers
a higher payoff than the other, the direction and magnitude of change in the relative weights
depend crucially on population composition (in particular, on whether the high-payoff variant is the
common variant or not). Therefore, given that humans likely encounter many situations during their
lifetime in which they need to adopt one cultural variant among a number of candidates,
it is unlikely that there is a single optimal weight across different learning situations.
What is clear, however, is that both wa and wo are likely to be positive. In other words, it
is better to take both frequency and payoff into consideration, though the degree to which
frequency and payoff matter may vary in domain-specific ways.
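To make the idea of weighting the two kinds of information concrete, the following is a purely hypothetical decision rule; it is not the calculus used in the present model. It assumes that a naïve agent observes a sample of (variant, payoff) pairs, forms a frequency signal and a payoff signal from that sample, and passes their weighted sum through a logistic function. The names w_freq and w_payoff are illustrative stand-ins for the two weights discussed above, payoffs are assumed to be positive, and a variant absent from the sample cannot be adopted.

```python
import math


def adoption_probability(sample, w_freq, w_payoff):
    """Probability of adopting variant 1 under a hypothetical weighted rule.

    sample   : list of (variant, payoff) pairs, variant in {1, 2}, payoff > 0
    w_freq   : illustrative weight on frequency information
    w_payoff : illustrative weight on payoff information
    """
    payoffs_1 = [pay for variant, pay in sample if variant == 1]
    payoffs_2 = [pay for variant, pay in sample if variant == 2]
    # A variant that is not observed cannot be adopted or evaluated.
    if not payoffs_1:
        return 0.0
    if not payoffs_2:
        return 1.0
    # Frequency signal in [-1, 1]: positive when variant 1 is the majority in the sample.
    freq_signal = (len(payoffs_1) - len(payoffs_2)) / len(sample)
    # Payoff signal in (-1, 1): positive when variant 1 has the higher mean observed payoff.
    mean_1 = sum(payoffs_1) / len(payoffs_1)
    mean_2 = sum(payoffs_2) / len(payoffs_2)
    payoff_signal = (mean_1 - mean_2) / (mean_1 + mean_2)
    score = w_freq * freq_signal + w_payoff * payoff_signal
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    # Variant 1 is rare in the sample but has the higher observed payoff.
    sample = [(1, 2.0)] + [(2, 1.0)] * 4
    for w_freq, w_payoff in ((1.0, 0.0), (0.0, 1.0), (1.0, 3.0)):
        prob = adoption_probability(sample, w_freq, w_payoff)
        print(f"w_freq = {w_freq}, w_payoff = {w_payoff}: P(adopt variant 1) = {prob:.3f}")
```

In this toy rule, an agent relying on frequency alone tends to reject the rare high-payoff variant, whereas a sufficiently large payoff weight reverses that preference, which is one way to see why the appropriate balance may depend on the learning situation.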
Intuitively, it seems a bad idea to completely ignore frequency or payoff information
when it is available, and humans may adaptively and flexibly evaluate these different types of
information depending on their prior beliefs and the specificities of the situation. In a way, what
is presented in this paper is a proof-of-concept model showing that some way of combining
observed frequency and observed payoff in a single decision-making calculus can be more
advantageous than strategies that discard either type of information. My own fieldwork suggests that both
types of information often feed into the same inferential process; for example, when evaluating
some healing practice, the Wa and Yi people in southwest China frequently use "many people in
the community use it" and "it worked on my friends" as reasons for its efficacy (Hong,
unpublished). In the domain of technology where means-ends reasoning dominates, transmission
biases often manifest themselves as cognitive processes of integrating information from different
sources into a single inference, i.e. the cultural variants' efficacy or effectiveness.
Lastly, I shall address an obvious concern: why not just compare payoffs when making
trait adoption decisions? In addition to the aforementioned noise and cost, payoff bias may not be
a panacea that applies to all learning situations, for two other reasons. First, many cultural traits
do not have obvious payoffs associated with them. In fact, people often do not understand why
particular actions are performed (Henrich, 2016) or the causal mechanisms underlying seemingly
purposive actions (Derex et al., 2019). Second, people may simply associate the wrong payoff
with cultural practices. For example, people in small-scale societies have tried all
kinds of methods to induce rain (Frazer, 1890), and payoff-biased imitation would lead one
nowhere, as none of the methods had any real influence on the weather. Thus, the applicability of
payoff bias as a general learning mechanism may be limited, and sole reliance on it can be
non-adaptive.
Actual human information processing and decision making are complicated and likely
affected by a wide range of factors. Future evolutionary theorizing of human social learning may
benefit from explicitly considering various types of information feeding into the same
computational process and treating human decision making not as rigidly implementing
pre-programmed rules but as flexibly influenced by context. Models constructed with more
psychological realism may then be empirically tested and the explanatory as well as predictive
power of different learning models could be contrasted and compared to enhance our
understanding of the important phenomenon of human social learning.
4 Conclusion
I have presented a model showing that in settings where one cultural variant strictly confers
higher payoff than the other variant, combining frequency-dependent bias and payoff bias in a
single decision-making calculus can be more advantageous than employing either strategy alone,
and that the magnitude of the conformist bias may be particularly important in resisting the
invasion of low-payoff variants and helping the spread of high-payoff variants. The insights here
are generally applicable when conformist bias is coupled with some other mechanism that favors
the adoption of the high-payoff variant.
Code Availability
The graphical representations of all equations were created using Python 3.7. The final agent-based
simulation for the evolution of epistemic weights (shown in Figure 4) was created using Julia 1.5.0.
All code is available at https://github.com/kevintoy/epistemic_weight_evo.
References
Angelaki, D. E., Gu, Y., & DeAngelis, G. C. (2009). Multisensory integration: psychophysics,
neurophysiology, and computation. In Current Opinion in Neurobiology.
https://doi.org/10.1016/j.conb.2009.06.008
Baldini, R. (2012). Success-biased social learning: Cultural and evolutionary dynamics.
Theoretical Population Biology. https://doi.org/10.1016/j.tpb.2012.06.005
Boyd, R., & Richerson, P. J. (1985). Culture and the evolutionary process. University of Chicago
Press.
Boyd, R., & Richerson, P. J. (2009). Voting with your feet: Payoff biased migration and the
evolution of group beneficial behavior. Journal of Theoretical Biology.
https://doi.org/10.1016/j.jtbi.2008.12.007
Creanza, N., Kolodny, O., & Feldman, M. W. (2017). Cultural evolutionary theory: How culture
evolves and why it matters. Proceedings of the National Academy of Sciences of the United
States of America. https://doi.org/10.1073/pnas.1620732114
Day, T., & Bonduriansky, R. (2011). A Unified Approach to the Evolutionary Consequences of
Genetic and Nongenetic Inheritance. The American Naturalist.
https://doi.org/10.1086/660911
Denton, K. K., Ram, Y., Liberman, U., & Feldman, M. W. (2020). Cultural evolution of
conformity and anticonformity. Proceedings of the National Academy of Sciences of the
United States of America. https://doi.org/10.1073/pnas.2004102117
Derex, M., Bonnefon, J.-F., Boyd, R., & Mesoudi, A. (2019). Causal understanding is not
necessary for the improvement of culturally evolving technology. Nature Human Behaviour,
1. http://www.nature.com/articles/s41562-019-0567-9
Edmonds, M. (2003). Stone tools and society: Working stone in Neolithic and Bronze Age Britain.
In Stone Tools and Society: Working Stone in Neolithic and Bronze Age Britain.
https://doi.org/10.4324/9780203481080
Ewens, W. J. (2012). Mathematical population genetics 1: theoretical introduction (Vol. 27).
Springer Science & Business Media.
Feldman, M., & Cavalli-Sforza, L. (1976). Cultural and biological evolutionary processes,
selection for a trait under complex transmission. Theoretical Population Biology.
https://doi.org/10.1016/0040-5809(76)90047-2
Frazer, J. G. (1890). The Golden Bough: A Study in Comparative Religion, Volume 2 (Vol. 2).
Macmillan.
Greenwood, J. D. (1999). Understanding the “cognitive revolution” in psychology. Journal of
the History of the Behavioral Sciences. https://doi.org/10.1002/(SICI)1520-
6696(199924)35:1<1::AID-JHBS1>3.0.CO;2-4
Henrich, J., & Boyd, R. (1998). The Evolution of Conformist Transmission and the Emergence
of Between-Group Differences. Evolution and Human Behavior.
https://doi.org/10.1016/S1090-5138(98)00018-X
Henrich, J. (2004). Demography and Cultural Evolution: How Adaptive Cultural Processes
Can Produce Maladaptive Losses—The Tasmanian Case. American Antiquity.
https://doi.org/10.2307/4128416
Henrich, J. (2016). The secret of our success: How culture is driving human evolution,
domesticating our species, and making us smarter. Princeton University Press.
https://psycnet.apa.org/record/2016-18797-000
Henrich, J., & Henrich, N. (2010). The evolution of cultural adaptations: Fijian food taboos
protect against dangerous marine toxins. Proceedings of the Royal Society B: Biological
Sciences. https://doi.org/10.1098/rspb.2010.1191
Henrich, J., & McElreath, R. (2003). The Evolution of Cultural Evolution. In Evolutionary
Anthropology. https://doi.org/10.1002/evan.10110
Heyes, C. M. (1994). Social learning in animals: Categories and mechanisms. In Biological
Reviews of the Cambridge Philosophical Society. https://doi.org/10.1111/j.1469-
185X.1994.tb01506.x
Kayser, C., & Shams, L. (2015). Multisensory Causal Inference in the Brain. In PLoS Biology.
https://doi.org/10.1371/journal.pbio.1002075
Kendal, J., Giraldeau, L. A., & Laland, K. (2009). The evolution of social learning rules: Payoff-
biased and frequency-dependent biased transmission. Journal of Theoretical Biology.
https://doi.org/10.1016/j.jtbi.2009.05.029
Kendal, R. L., Boogert, N. J., Rendell, L., Laland, K. N., Webster, M., & Jones, P. L. (2018).
Social Learning Strategies: Bridge-Building between Fields. In Trends in Cognitive
Sciences. https://doi.org/10.1016/j.tics.2018.04.003
Laland, K. N. (2004). Social learning strategies. In Learning and Behavior.
https://doi.org/10.3758/bf03196002
McElreath, R., Bell, A. V., Efferson, C., Lubell, M., Richerson, P. J., & Waring, T. (2008).
Beyond existence and aiming outside the laboratory: Estimating frequency-dependent and
pay-off-biased social learning strategies. Philosophical Transactions of the Royal Society B:
Biological Sciences. https://doi.org/10.1098/rstb.2008.0131
Meirmans, P. G. (2018). Hardy-Weinberg equilibrium. In Encyclopedia of Ecology.
https://doi.org/10.1016/B978-0-12-409548-9.10555-X
Mesoudi, A. (2005). The Transmission and Evolution of Human Culture. Evolution.
Mesoudi, A. (2008). An experimental simulation of the “copy-successful-individuals” cultural
learning strategy: adaptive landscapes, producer-scrounger dynamics, and informational
access costs. Evolution and Human Behavior.
https://doi.org/10.1016/j.evolhumbehav.2008.04.005
Mesoudi, A., & O’Brien, M. J. (2008). The cultural transmission of Great Basin projectile-point
technology I: An experimental simulation. American Antiquity.
https://doi.org/10.1017/S0002731600041263
Miller, G. A. (2003). The cognitive revolution: A historical perspective. In Trends in Cognitive
Sciences. https://doi.org/10.1016/S1364-6613(03)00029-9
Muthukrishna, M., Morgan, T. J. H., & Henrich, J. (2016). The when and who of social learning
and conformist transmission. Evolution and Human Behavior.
https://doi.org/10.1016/j.evolhumbehav.2015.05.004
Nakahashi, W., Wakano, J. Y., & Henrich, J. (2012). Adaptive Social Learning Strategies in
Temporally and Spatially Varying Environments: How Temporal vs. Spatial Variation,
Number of Cultural Traits, and Costs of Learning Influence the Evolution of Conformist-
Biased Transmission, Payoff-Biased Transmissio. Human Nature.
https://doi.org/10.1007/s12110-012-9151-y
Pekala, R. J. (1991). The Cognitive Revolution in Psychology. In Quantifying
Consciousness. https://doi.org/10.1007/978-1-4899-0629-8_4
Perreault, C., Moya, C., & Boyd, R. (2012). A Bayesian approach to the evolution of social
learning. Evolution and Human Behavior.
https://doi.org/10.1016/j.evolhumbehav.2011.12.007
Plotkin, H. C. (1988). The role of behavior in evolution. MIT Press.
Richerson, P. J. (2019). An integrated Bayesian theory of phenotypic flexibility. Behavioural
Processes. https://doi.org/10.1016/j.beproc.2018.02.002
Richerson, P. J., & Boyd, R. (2005). Not by genes alone: How culture transformed human
evolution. University of Chicago Press.
Schlag, K. H. (1998). Why Imitate, and If So, How? Journal of Economic Theory.
https://doi.org/10.1006/jeth.1997.2347
Thackray, A. (1970). Science and Technology in the Industrial Revolution. History of Science.
https://doi.org/10.1177/007327537000900104
Whitehead, H., & Richerson, P. J. (2009). The evolution of conformist social learning can cause
population collapse in realistically variable environments. Evolution and Human Behavior.
https://doi.org/10.1016/j.evolhumbehav.2009.02.003