A Prolegomena to Nonlinear Empiricism
in Human Evolutionary Ecology
Charles Efferson1,2,4
Peter J. Richerson1,3,4
1Graduate Group in Ecology, University of California, Davis
2Institute for Empirical Research in Economics, University of Zürich
3Graduate Group in Animal Behavior, University of California, Davis
4Department of Environmental Science and Policy, University of California, Davis
Experimental Nonlinear Dynamics
“ . . . in nonlinear systems one should not necessarily expect data to be simply ‘fuzzy’
versions of some attractor. Instead, one is likely to see dynamics that are mixtures of
attractors, transients, and even unstable entities.”
Cushing et al. (2003)
“One of the features of nonlinearity is that responses are not (necessarily) proportional to
disturbances.”
Cushing et al. (2003)
Consider a world with a large population of organisms and two alternative
behaviors, A and B. Call the proportion of individuals with behavior A at a particular
point in time t the frequency q_t. Further assume this frequency changes deterministically
from one period to the next according to some convex combination of linearity and conformity1,

q_{t+1} = λq_t + (1 − λ){q_t + q_t(1 − q_t)(2q_t − 1)},
where λ ∈ (0, 1). In short, with probability λ individuals maintain their current
probability of choosing A in the next period, while with probability 1 − λ individuals
exhibit a tendency to select the most common current behavior in the next period. In the
long run this system will persist in one of only three states. If the initial frequency of A
1 Strictly speaking, a model of this sort cannot be fully deterministic in a finite population. Feasible values
of q_t will form a lattice in the sense that, in a population of size N, the variable q_t can only take N + 1
different values. Model (x) places no such restrictions on the possible values of q_t, and thus it can only be
approximately deterministic. In a nonlinear experimental environment, one should not trivialize this
distinction, as it can prove critical to understanding the predictive power of a model (see Cushing et al.
2003, ch. 5). For present expository purposes, assume N is large enough to ignore lattice effects.
in the population is less than one half, the long-run frequency is q̂ = 0. If the initial
frequency of A is greater than one half, the long-run frequency is q̂ = 1. Lastly, if the
initial frequency of A is exactly one half, then it will remain so indefinitely.
This enumeration of steady states, however, glosses over their stability properties.
The steady states q̂ = 0 and q̂ = 1 are locally stable. They are attractors in the sense that,
if the system starts sufficiently close to the attractor, the system moves toward the
attractor. For example, if we start with an initial value at t = 0 of q_0 = 0.2, the system
tends toward the q̂ = 0 steady state through time. An initial value of q_0 = 0.8, in
contrast, means the system tends toward the q̂ = 1 steady state. The steady state q̂ = 1/2,
however, is unstable. For any value of q_t = 1/2 ± ε, where ε > 0, the system moves
away from the steady state q̂ = 1/2 no matter how small ε.
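These stability properties are easy to check numerically. Below is a minimal Python sketch of the deterministic map in model (x); the function names and the choice λ = 0.5 are illustrative assumptions, not part of the original model.

```python
# Deterministic conformity model (x), assuming the update
# q_{t+1} = q_t + (1 - lam) * q_t * (1 - q_t) * (2 * q_t - 1).

def step(q, lam):
    """One deterministic update of the frequency of behavior A."""
    return q + (1.0 - lam) * q * (1.0 - q) * (2.0 * q - 1.0)

def iterate(q0, lam, periods):
    """Iterate the map from initial frequency q0 for a number of periods."""
    q = q0
    for _ in range(periods):
        q = step(q, lam)
    return q

lam = 0.5  # illustrative value in (0, 1)
print(iterate(0.2, lam, 200))          # tends toward the attractor q = 0
print(iterate(0.8, lam, 200))          # tends toward the attractor q = 1
print(iterate(0.5, lam, 200))          # remains at the unstable state 1/2
print(iterate(0.5 + 1e-6, lam, 2000))  # a tiny perturbation eventually escapes
```

Starting exactly at 1/2 the system never moves, while any perturbation ε > 0, however small, eventually carries it to one of the attractors.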
In a deterministic world, this distinction causes no practical problems. One need
only characterize the steady states, and all falls into place. Alternatively and more
realistically, imagine a setting where model (x) yields highly accurate but not perfect
predictions. In this case the residual variation is small, and for many purposes the
deterministic model (x) might serve admirably. In effect the deterministic signal is so
strong relative to any remaining effects, which we may choose to model stochastically,
that few interpretive problems arise. If given several replicate data sets with initial
conditions q_0 = 1/2 − ε for some small ε > 0, we recognize some probability that a given
time series will tend toward q̂ = 1 rather than q̂ = 0. But as long as ε is not too small,
and as long as determinism is strong enough, we nonetheless expect the time series to
tend toward q̂ = 0 in general. In other words, in the long run we expect dynamical data
to be “fuzzy” versions of some appropriately modeled deterministic attractor.
But what if residual variation is large in some particular sense? This difficulty is
intrinsic to high-dimensional systems. Many phenomena involve dozens, hundreds, or
even thousands of relevant causal variables. As scientists, however, we need to restrict
the set of variables under consideration in some way. Given a particular data set,
restricting the set of explanatory variables increases the residual variation. In a static
linear environment, this increased residual variation may not hamper insight. A linear
trend can remain a linear trend even in the face of substantial increases in residual
dispersion. In a dynamical nonlinear environment, however, small differences can have
large effects. In model (x), for instance, q_t = 0.49 yields a prediction about the future
state of the system that is wildly different from the prediction under q_t = 0.51. As a
consequence, if residual variation is large enough, one essentially loses the ability to
predict if one only focuses on deterministic attractors. This difficulty is intrinsic to high-
dimensional nonlinear dynamical systems. High dimensionality ensures that a tractable
and thus typically low-dimensional model leaves a noteworthy amount of residual
variation. Nonlinear dynamics, in turn, ensure that small differences, like those due to
residual variation, for example, can yield disproportionately large differences in future
dynamics. Together, these two forces suggest that empirical time series need not be
“fuzzy” versions of a deterministic attractor even if this attractor derives from a model
that summarizes important deterministic forces in the empirical system.
Consider, for example, the hypothetical time series in figure (x). An
unconsidered focus on the attractors of (x) would lead us to conclude the model is
fundamentally incompatible with the data. As a strictly deterministic model, (x) offers no
basis for making sense of such a time series. We could alternatively treat (x) as a
conditional expectation for a nonlinear stochastic model. For example, assume we model
the probability that in the next period A_{t+1} out of N individuals will choose behavior A,
conditioned on the frequency of A in the current period, q_t. Then our stochastic
nonlinear dynamical model with conditional predictions takes the form

P(A_{t+1} | q_t) ~ binomial(q_t + (1 − λ)q_t(1 − q_t)(2q_t − 1), N).
Note that this model, although stochastic, preserves the structure of (x) under conditional
expectations in the following sense,
E[q_{t+1} | q_t] = (1/N) E[A_{t+1} | q_t] = q_t + (1 − λ)q_t(1 − q_t)(2q_t − 1).
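To see how the stochastic version behaves, here is a minimal simulation sketch; the function names, the seed, and the parameter choices (λ = 0.9, N = 100) are illustrative assumptions. With λ near 1 the deterministic pull away from the unstable state 1/2 is weak, so a series can linger near 1/2 before settling near one of the attractors.

```python
# Stochastic sketch of model (x): the number of individuals choosing A in the
# next period is a binomial draw with success probability given by the
# deterministic map. All names and parameter values are illustrative.
import random

def simulate(q0, lam, N, periods, rng):
    """Simulate the frequency of A under binomial sampling of N individuals."""
    q = q0
    series = [q]
    for _ in range(periods):
        p = q + (1.0 - lam) * q * (1.0 - q) * (2.0 * q - 1.0)
        a_next = sum(1 for _ in range(N) if rng.random() < p)  # binomial(p, N)
        q = a_next / N
        series.append(q)
    return series

rng = random.Random(1)  # arbitrary seed for reproducibility
series = simulate(0.49, lam=0.9, N=100, periods=300, rng=rng)
print(series[-1])
```

Note that q = 0 and q = 1 are absorbing in this sketch: once every individual chooses the same behavior, the binomial success probability is 0 or 1 and the series stays put.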
The difference in terms of confronting data (e.g. figure x) with models, however, is
fundamental. In model (x) stochasticity can move the system away from the attractor.
Moreover, under the right circumstances the system can end up near the unstable steady
state at q̂ = 1/2. This unstable entity can thereby affect the dynamics of the system
transiently as in figure x. Further, one can show that as λ increases the deterministic
forces drawing the system away from q̂ = 1/2 grow weaker, and this fact will protract the
transient dynamics under the influence of this unstable steady state. Now the time series
in figure x is at least interpretable under the deterministic forces summarized by model x.
When we embed these nonlinear deterministic forces in a stochastic model as in (x), we
can tentatively conclude conformity effects are compatible with the data. Initially the
system tends toward the attractor at q̂ = 1, but subsequent stochastic effects place the
system near the unstable entity, q̂ = 1/2, which in turn affects dynamics transiently as
q_t lingers near 1/2 under the potentially weak forces attracting the system to other steady
states. At some point the system enters the basin of attraction for q̂ = 0 and escapes the
transient pull of the unstable steady state. Deterministic conformity forces come to
dominate again, and the system tends toward this latter attractor.
In practical terms, this interpretation is intuitive. We recognize that a
deterministic model (x) yields a sharp distinction between q_t = 0.49 and q_t = 0.51, but
we can also imagine the response of an actual group of 100 people with two alternative
behaviors. If, for some reason, they end up with 49 out of 100 choosing behavior A at a
given point in time, and if they know this fact, we still would not be surprised if the
system moved to 100 out of 100 choosing A instead of 0 out of 100. A large, probably
extremely large, number of unconsidered forces affecting decision making could produce
this outcome even if conformity is important. If conformity does play some role, the
probability the system will move toward 0 is greater than the probability it will move
toward 100. No matter how small this tendency, if we could replicate the precise
situation above enough times, we could detect the effects of conformity simply by
counting replicates. Nonetheless, as stochasticity becomes more important when we
substitute low-dimensional systems for high-dimensional reality, and as threshold effects
enter the fray due to nonlinearity, the number of necessary replicates becomes
impractically large. The crux lies here: the dynamical interaction of nonlinearity,
determinism, and strong residual stochasticity presents unique problems to those who
would integrate theory and empiricism in the evolutionary social sciences.
Interestingly and not coincidentally, population biologists face the same
difficulties. Recently a group of ecologists and applied mathematicians, Jim Cushing,
Robert Costantino, Brian Dennis, Robert Desharnais, Shandelle Henson, and Aaron King
(collectively known as “The Beetles”), has developed a provocative set of tools for
integrating theory and empiricism in experimental nonlinear stochastic environments.
They study the population dynamics of the flour beetle, Tribolium castaneum, and their
task is a demanding one. Specifically, beetles in the genus Tribolium exhibit a taste for
eating conspecifics, and this cannibalistic tendency creates strong nonlinear interactions
between the different stages in the life cycle. The resulting range of dynamical behaviors
in controlled laboratory environments runs from simple steady states to chaos and in
between includes an exotic and surprising collection of intermediate dynamical regimes.
Cushing et al. (2003) recently summarized a decade’s worth of research to develop and
use a single low-dimensional model to predict, qualitatively and quantitatively, the
myriad dynamical regimes they observe in their experimental Tribolium populations.
Their achievement is stunning. Figure x provides a summary of some of their results.
(Pete, do we have an interest in reproducing some of their graphs? One has to get
permission for this, right? If there are any graphs we’d like to reproduce, I’d say they’re
from Cushing et al. I don’t know if you’ve looked at this book or not, but some of their
graphs are nothing short of mind-blowing. They would show, I think, better than
anything we can say, just how far a concerted research program in experimental nonlinear
dynamics can go, and they would suggest the possibilities with regard to micro-society
experiments. The one problem is that many readers might not understand the graphs. I
don’t have a good sense about this. On the other hand, I’d think anybody could look at
the graphs and see that the predictions match the observations very closely, something
that rarely happens in ecology.) This kind of predictive accuracy rarely, if ever, happens
in ecology, and it is a testimony to how far a sustained research program in experimental
nonlinear dynamics can go.
With regard to the study of human evolutionary ecology, the importance of
Cushing et al. (2003) is primarily methodological. We describe some of their key
methods below and discuss their implications specifically with regard to the experimental
study of social learning and cultural evolution.
Low-dimensional models from first principles. Many phenomena in biology and the
social sciences are high-dimensional in the sense that even in simple systems more causal
variables play a role than we would like to address. As a consequence, a key question,
and one that will figure prominently in our discussion of model selection below, is how
to find a suitable low-dimensional surrogate for examining high-dimensional phenomena.
All else being equal, in nonlinear stochastic settings a
mechanistic model can typically do more with less than a corresponding
phenomenological model. (Why is this? Cushing et al. claim that this is true, and it
seems intuitively compelling to me. But why? See what follows.) Phenomenological
models, though they bring the advantage of flexibility across various applications, also
bring important costs. In particular, as observed time series become more complex,
phenomenological models typically require an increasingly opaque array of nonlinear
terms to capture the variation. In the limit, one can end up with a senseless model that
does an admirable job of summarizing observed variation. Mechanistic models, in
contrast, . . . (I’m losing it here . . .)
Conditional predictions. Because small differences in parameter values and the values
of state variables can yield large differences in dynamical behavior in stochastic
nonlinear models, one needs some way to prevent even small errors in prediction from
accumulating through time. Nonlinearity increases the chances that, even if a model
predicts well, compounding errors of this sort will lead to predictions that diverge wildly
from the observed time series. As a consequence, Cushing et al. (2003) rely exclusively
on conditional predictions. They take the state of the system in one time period as given
by the actual observed state and predict one period forward only. In essence, if X_t ∈ R^N
is a vector of random state variables with expected values determined by f: R^N → R^N,
predictions take the following form,

E[X_{t+1} | x_t] = f(x_t),

where x_t are the observed values of the state variables at t. In practice, one uses the
observed values at t – 1 to predict the values at t, and then one uses the observed values at
t to predict the values at t + 1. This resetting process continues recursively over the entire
data set, and allows one to estimate parameters in the model and make predictions
without the potentially serious consequences of stochastic errors accumulating in a
nonlinear environment through time.
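In code, this resetting scheme amounts to predicting each observation from its observed predecessor only. A minimal sketch, assuming the deterministic conformity map of model (x) as the one-step predictor; the observed series here is hypothetical illustrative data, not from any experiment.

```python
# One-step-ahead conditional prediction: each prediction is made from the
# *observed* state, so stochastic errors never accumulate across periods.

def f(q, lam=0.5):
    """Deterministic conformity map of model (x); lam is illustrative."""
    return q + (1.0 - lam) * q * (1.0 - q) * (2.0 * q - 1.0)

def one_step_predictions(observed, model):
    """Predict each observation from its observed predecessor only."""
    return [model(x) for x in observed[:-1]]

observed = [0.40, 0.36, 0.30, 0.27, 0.20, 0.14]   # hypothetical data
predicted = one_step_predictions(observed, f)
residuals = [o - p for o, p in zip(observed[1:], predicted)]
print(predicted)
print(residuals)
```

The residuals computed this way are one-step errors, which is what one would fit or evaluate a stochastic nonlinear model against, rather than letting the model free-run from the initial condition.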
Different deterministic entities and their effects. As our exercise with figure x
demonstrates, when stochasticity is important in nonlinear empirical settings one cannot
focus solely on the attractors of a given model. Deterministic entities with different
stability properties can interact with stochastic effects to yield time series with a
historically contingent character. Cushing et al. (2003) find evidence for this repeatedly.
Their solution is to model deterministic and stochastic forces together in ways rooted
firmly in the biology of their study organism. Moreover, once a given model is in place,
they develop various model-specific approaches to examining the effects of different
deterministic entities when stochasticity is important. More generally, Cushing et al.
(2003) do not average the time series from replicate experiments in search of average
trends. Although such a procedure may be useful in some situations, one must recognize
the potential importance of history in a dynamical world, like our own, with a thorough
mix of nonlinearity, determinism, and stochasticity. Averaging loses the information
inherent in history.
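A quick numerical illustration of the point, assuming the binomial version of model (x) described above; the parameter values and seed are arbitrary. Each replicate starts at the unstable state 1/2 and is eventually absorbed near 0 or 1, yet the cross-replicate average hovers near 1/2 and resembles no single history.

```python
# Averaging replicate series can hide history: every replicate is absorbed at
# one of the attractors, but the average across replicates sits in between.
# All names and parameter values are illustrative.
import random

def final_frequency(q0, lam, N, periods, rng):
    """Run one replicate of the binomial conformity model; return its endpoint."""
    q = q0
    for _ in range(periods):
        p = q + (1.0 - lam) * q * (1.0 - q) * (2.0 * q - 1.0)
        q = sum(1 for _ in range(N) if rng.random() < p) / N
    return q

rng = random.Random(7)  # arbitrary seed
finals = [final_frequency(0.5, 0.5, 100, 500, rng) for _ in range(200)]
average = sum(finals) / len(finals)
# Individual replicates end near 0 or near 1; the average sits in between.
print(average)
```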
Bifurcation experiments. A bifurcation is a change in the deterministic attractor of a
dynamical system as some parameter of interest changes. Although Cushing et al. (2003)
focus repeatedly on the need to expand the study of nonlinear dynamical systems beyond
the study of attractors, the fact remains that their core model is amazingly accurate at
predicting the dynamics of their experimental Tribolium populations, and thus an analysis
of attractors constitutes a crucial part of their general research program. Moreover,
although removing data certainly requires some justification, one can remove transients
from a time series if the particular task at hand requires a focus on attractors. In this
regard Cushing et al. (2003) make effective use of what they call “bifurcation
experiments.” Essentially, given a model of an experimental situation, at least some of
the parameters in the model will be under experimental control. One can vary one of
these parameters theoretically, and the result is a bifurcation diagram. This diagram is a
description of how the set that attracts the dynamical system changes as the parameter of
interest changes. One can then select appropriate values of this parameter from the
bifurcation diagram to implement as different experimental treatments. If the observed
sequence of bifurcations matches the predicted sequence, the bifurcation experiment
represents a severe and successful test of the theoretical model. Figure x shows the
results of such a bifurcation experiment from Cushing et al. (2003). As the figure shows,
their core 3-dimensional model has a remarkable ability to predict changes in the
behavior of experimental Tribolium dynamics as some parameter, in this case the rate at
which adults cannibalize pupae, changes. We suspect that the experimental study of
cultural evolution is a long way from the successful use of such a technique, but we would
like to put it forward as a long-term goal. Before such a goal can be realized, however,
bifurcation analysis must become a central part of the theoretical study of cultural
evolutionary dynamics. We have yet to see a complete bifurcation analysis of a cultural
evolutionary model (Pete, is this true? Is it important?).
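As a concrete illustration of the procedure, the sketch below computes a crude bifurcation diagram for the textbook logistic map rather than the Tribolium model of Cushing et al., whose three-dimensional structure is beyond a short example; all names and parameter values are illustrative. For each value of the control parameter, one discards the transient and records the points that remain, which approximate the attractor.

```python
# Crude bifurcation-diagram sketch using the logistic map
# x_{t+1} = r * x * (1 - x) as a stand-in for a model under experimental
# control. Sweeping r traces the classic period-doubling sequence.

def logistic(x, r):
    return r * x * (1.0 - x)

def attractor_points(r, x0=0.2, burn_in=1000, keep=64, tol=1e-6):
    """Iterate past the transient, then collect (rounded) attractor points."""
    x = x0
    for _ in range(burn_in):        # discard the transient
        x = logistic(x, r)
    points = set()
    for _ in range(keep):           # sample the attractor
        x = logistic(x, r)
        points.add(round(x / tol) * tol)  # cluster numerically close values
    return sorted(points)

# A fixed point, a 2-cycle, and a 4-cycle as r increases:
for r in (2.5, 3.2, 3.5):
    print(r, attractor_points(r))
```

Plotting the collected points against a fine grid of r values yields the bifurcation diagram; in a bifurcation experiment, one would instead choose a handful of r values as treatments and compare the observed dynamical regimes with the predicted sequence.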