The traveling wave approach to asexual evolution: Muller’s ratchet and speed of adaptation

Igor M. Rouzine^a,*, Éric Brunet^b, and Claus O. Wilke^c

a Department of Molecular Biology and Microbiology, Tufts University, 136 Harrison Avenue, Boston, MA 02111

b Laboratoire de Physique Statistique, École Normale Supérieure, 24 rue Lhomond, 75230 Paris Cédex 05, France

c Section of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cell and Molecular Biology, University of Texas, Austin, TX 78712, USA

Abstract

We use traveling-wave theory to derive expressions for the rate of accumulation of deleterious

mutations under Muller’s ratchet and the speed of adaptation under positive selection in asexual

populations. Traveling-wave theory is a semi-deterministic description of an evolving population,

where the bulk of the population is modeled using deterministic equations, but the class of the highest-

fitness genotypes, whose evolution over time determines loss or gain of fitness in the population, is

given proper stochastic treatment. We derive improved methods to model the highest-fitness class

(the stochastic edge) for both Muller’s ratchet and adaptive evolution, and calculate analytic

correction terms that compensate for inaccuracies which arise when treating discrete fitness classes

as a continuum. We show that traveling-wave theory makes excellent predictions for the rate of

mutation accumulation in the case of Muller’s ratchet, and makes good predictions for the speed of

adaptation in a very broad parameter range. We predict the adaptation rate to grow logarithmically

in the population size until the population size is extremely large.

1 Introduction

“I was observing the motion of a boat which was rapidly drawn along a narrow channel

by a pair of horses, when the boat suddenly stopped—not so the mass of water in the

channel which it had put in motion; it accumulated round the prow of the vessel in a

state of violent agitation, then suddenly leaving it behind, rolled forward with great

velocity, assuming the form of a large solitary elevation, a rounded, smooth and well-

defined heap of water, which continued its course along the channel apparently

without change of form or diminution of speed.” (Russell, 1845)

One of the fundamental models of population genetics is that of a finite, asexually reproducing

population of genomes consisting of a large number of sites with multiplicative contribution

to the total fitness of the genome. This model has been studied for decades, and has presented

substantial challenges to researchers trying to solve it analytically. Even in its most basic

formulation, where each site is under exactly the same selective pressure, the model has not

*Corresponding author. email: igor.rouzine@tufts.edu. phone: 617-636-6759. fax: 617-636-4086.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers

we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting

proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could

affect the content, and all legal disclaimers that apply to the journal pertain.

NIH Public Access

Author Manuscript

Theor Popul Biol. Author manuscript; available in PMC 2009 February 1.

Published in final edited form as:

Theor Popul Biol. 2008 February ; 73(1): 24–46.

NIH-PA Author Manuscript


been fully solved to this day. For the special case of vanishing back mutations, the model

reduces to the problem of Muller’s ratchet (Muller, 1964; Felsenstein, 1974). A tremendous

amount of research effort has been directed at this problem (Haigh, 1978; Pamilo et al.,

1987; Stephan et al., 1993; Higgs and Woodcock, 1995; Gordo and Charlesworth, 2000a,b;

Rouzine et al., 2003). Other special cases of this model are mutation–selection balance when

the forward and back mutation rates are equal (Woodcock and Higgs, 1996), and the speed of

adaptation under various conditions (Tsimring et al., 1996; Kessler et al., 1997; Gerrish and

Lenski, 1998; Orr, 2000; Rouzine et al., 2003; Wilke, 2004).

In 1996, Tsimring et al. pioneered a new approach to studying the multiplicative multi-site

model. They described the evolving population as a localized traveling wave in fitness space,

using partial differential equations developed to describe wave-like phenomena in physical

systems. A traveling wave is a localized profile that moves at near-constant speed while keeping a near-constant shape (physicists also refer to such phenomena as solitary waves). We can envision a population as

a traveling wave of the distribution of the mutation number over genomes if the relative mutant

frequencies in the population stay approximately constant while the population shifts as a

whole. For example, a population may have specific abundances of sequences at one, two,

three, or more mutations away from the least loaded class at all times, but the least-loaded class

moves at constant speed by one mutation every ten generations.

Encouraging results by Tsimring et al. (1996) were based on two strong approximations.

Firstly, all fitness classes, including the best-fit class, were described deterministically,

neglecting random effects due to finite population size. Finite population size was introduced

into the problem as a cutoff of the effect of selection at the high-fitness edge, when the size of

a class becomes less than one copy of a genome. Secondly, Tsimring et al. (1996) approximated

the traveling wave profile with a function continuous in fitness (or mutation load). In these

approximations, Tsimring et al. (1996) demonstrated the existence of a continuous set of waves

with different speeds. The cutoff condition determined the choice of a specific solution and

the dependence of the speed on the population size.

Rouzine et al. (2003) confirmed the qualitative conclusions by Tsimring et al. (1996) and

refined their quantitative results in two ways, by taking into account the random effects acting

on the smallest, best-fit class, and showing that, in a broad parameter range, approximating the

logarithm of the wave profile as a smooth function of the fitness is a much better approximation

than approximating the wave profile itself as a smooth function. It was shown that the

substitution rate increases logarithmically with the population size, until the deterministic

single-site limit is reached at extremely large population sizes and the theory breaks down.

The purpose of the present paper is to show that the results by Rouzine et al. (2003) can still

be improved regarding both the treatment of the high-fitness edge and the deterministic part

of the fitness distribution. Because the original work was presented in an extremely condensed

format, we will here re-derive the general theory in detail. Then, we will present improved

treatments of the stochastic edge that lead to accurate predictions for the substitution rate

defined as the average gain in beneficial alleles per genome per generation. We consider in

detail two opposite parametric limits, Muller’s ratchet (when beneficial mutation events are

not important) and adaptive evolution (when deleterious mutations are not important).

Because the range of validity of our approach caused some confusion in the literature, we

discuss it in detail in the main text and Appendix. Briefly, both in Muller’s ratchet and in the

adaptation regime, we assume that the total number of sites is large, and the selection coefficient

s is small. The population size N should be sufficiently large, so that the difference in the

mutational load between the least-loaded and average genomes is much larger than 1. In other

words, a large number of sites are polymorphic at any time. For adaptive evolution, the

condition corresponds to the average substitution rate V being much larger than $s/\ln(V/U_b)$, where $U_b$ is the beneficial mutation rate, or population sizes being much larger than $\sqrt{s/U_b^3}/\ln(V/U_b)$. We also assume that V is much larger than $U_b$. In smaller populations, adaptation occurs by isolated selective sweeps at different sites, and one-site models apply. Two-site models of adaptation, such as the clonal interference theory (Gerrish and Lenski, 1998; Orr, 2000; Wilke, 2004), can be used to describe the narrow transitional interval in N. Further, if the deleterious mutation rate per genome is much larger than s, and the average population fitness is sufficiently high, an additional broad interval of N appears, where deleterious mutations accumulate (Muller’s ratchet). Using numeric simulations and analytic estimates, we demonstrate good accuracy of our results in a very broad parameter range relevant for various asexual organisms, including asexual RNA and DNA viruses, yeast, some plants, and fish.

The manuscript is organized as follows. In Section 2, we describe the model and the general

method to derive evolutionary dynamics in terms of the fitness distribution. In Sections 3 and 4, we consider in detail two particular cases, Muller’s ratchet and the process of adaptation, respectively, and test analytic results with computer simulations. In Section 5, we discuss our findings.

2 Traveling-wave theory

2.1 Model assumptions

We consider a multi-site model of L sites, where each site can be in two states, i.e., carry one of two alleles, either advantageous or deleterious. The deleterious allele reduces the overall fitness of a genome by a factor of $1 - s$, where $s \ll 1$ is the selective disadvantage per site. We assume that there are no biological interactions between sites (epistasis), so that the fitness of a genome with k deleterious alleles (mutational load) is $(1-s)^k \approx e^{-ks}$. We refer to the frequency of sequences with mutational load k in the population as $f_k$, and write the population-average fitness as $w_{\mathrm{av}} = \sum_k e^{-sk} f_k$. We introduce $k_{\mathrm{av}}$, the mutational load for which a sequence’s fitness is exactly equal to the population mean fitness, as given by $w_{\mathrm{av}} = e^{-s k_{\mathrm{av}}}$. For small s, $k_{\mathrm{av}}$ is approximately the average mutational load in the population, $k_{\mathrm{av}} \approx \sum_k k f_k$. We denote the mutational load of the best-fit sequence in the population as $k_0$. Note that in general $k_0 \neq 0$, that is, the best-fit sequence in the population is not the sequence with the overall highest possible fitness.
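The quantities defined above can be computed directly from a list of class frequencies. The following is a minimal sketch, assuming the frequencies are stored in a Python list indexed by mutational load; the function name `fitness_summary` and the example frequencies are illustrative assumptions, not values from this work.

```python
import math

def fitness_summary(f, s):
    """Return (w_av, k_av, k_mean, k0) for class frequencies f[k] and selection coefficient s."""
    total = sum(f)
    f = [x / total for x in f]                       # enforce sum_k f_k = 1
    # Population-average fitness, w_av = sum_k exp(-s k) f_k
    w_av = sum(math.exp(-s * k) * fk for k, fk in enumerate(f))
    # Load whose fitness equals the mean fitness: w_av = exp(-s k_av)
    k_av = -math.log(w_av) / s
    # Plain average load, which approximates k_av when s is small
    k_mean = sum(k * fk for k, fk in enumerate(f))
    # Mutational load of the best-fit (least-loaded) occupied class
    k0 = next(k for k, fk in enumerate(f) if fk > 0)
    return w_av, k_av, k_mean, k0

# Hypothetical frequencies for loads k = 0..6 (illustration only)
f = [0.0, 0.0, 0.1, 0.2, 0.4, 0.2, 0.1]
w_av, k_av, k_mean, k0 = fitness_summary(f, s=0.01)
```

In this example the best-fit class sits at k0 = 2 rather than 0, and k_av lies slightly below the plain average load, with the two agreeing more closely as s decreases.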

We assume that an allele can mutate into an opposite allele with a small probability μ. For the

sake of simplicity of the derivation, we assume that the mutation rate is low, so that there is,

at most, one mutation per genome per generation. The case when multiple mutations per

genome are frequent takes place at very large population sizes when the average substitution

rate is high, larger than one new beneficial allele per generation. Rouzine et al. (2003)

considered this more general case and showed that there is no essential change in the final

expression for the substitution rate. Thus, for finite population size, the assumption of a single

mutation per genome per round of replication is not limiting (see also the next subsection).

In real genomes, s varies between sites. Moreover, Gillespie (1983, 1991) and Orr (2003)

argued that the distribution of s differs between sites with beneficial and deleterious alleles. A

viable genome represents a highly-fit, non-random selection of alleles, so that deleterious

mutations should generally have larger effects than beneficial mutations. For the same reason,

these authors predicted that the effective distribution of s for beneficial alleles should have a

universal exponential form. In the present work, we do not consider variation in s. Instead, we use a simplified model that counts toward the total number of sites L only those sites (with either deleterious or beneficial alleles) whose selection coefficient is on the order of the same

typical value s, and approximate all selection coefficients at these sites with a constant s. The

choice of s and, hence, of the set of included sites depends on the time scale of evolution under

consideration. Strongly deleterious mutations with effects much larger than s are cleared

rapidly from a population. Strongly beneficial mutations are fixed at the early stages of

evolution. Note that in the “clonal interference” approximation (Gerrish and Lenski, 1998),

which considers competition between two beneficial clones emerging at two sites, variation of

s must be taken into account to make continuous adaptation possible. In contrast, in the present

theory, which allows new beneficial clones to grow inside of already existing clones, the

importance of variation in s is less obvious. We hope to address this matter elsewhere.

We discuss the validity of our approach in detail for the limits of Muller’s ratchet and adaptation

and give a summary of the central simplifications—numbered 1 through 6 and referenced

throughout the text—in the Appendix. Note that we can verify the validity of the various

assumptions only after the fact, once we have obtained our final results. All simplifications are

asymptotically exact, i.e., are based on the existence of small dimensionless parameters. The

most limiting requirements are that the high-fitness tail of the distribution is long, and that the distribution is far from the unloaded and fully loaded (best-fit and least-fit possible) genomes.

2.2 General approach

The first idea underlying the “solitary wave” approach is to classify all genomes in a

population according to their fitness (mutational load), regardless of specific locations of

deleterious alleles in a genome, and focus on evolution of fitness classes. The second idea is

to describe evolution of most fitness classes deterministically. To take into account the effects

of finite population size, such as genetic drift and randomness of mutation events, only one

class with the highest fitness is described stochastically using the standard two-allele diffusion

approach. The best-fit class is considered a minority “allele” in a population, and all other sequences are considered the majority “allele”. The reason why stochastic effects can be

neglected already for the next-to-best class is that, in a broad parameter range, the fitness

distribution decreases exponentially towards the stochastic edge. Hence, the next-to-best class

is large enough to neglect stochastic effects, especially in the adaptation regime (see estimates

in Sections 3 and 4).

We note that neglecting stochastic effects completely and considering the limit of infinite

population is not correct. As we show below, stochastic processes acting on the best-fit class

limit the overall evolution rate and make it dependent on the population size. Even for a modest

number of sites (L = 15–20), the substitution rate is predicted to reach the true deterministic

limit only in astronomically large populations not found in nature (Tsimring et al., 1996;

Rouzine et al., 2003; Desai and Fisher, 2007). For the same reason, the assumption of one

mutation per genome we made in our model is not a limiting factor and, as shown by Rouzine

et al. (2003), does not change much in the final expression for the average substitution rate.

The formal procedure consists of several steps, as follows.

i. The frequencies of all fitness classes excluding the best-fit class are described by a deterministic balance equation.

ii. The equation is shown to have a traveling wave solution with an arbitrary speed (the average substitution rate).

iii. The leading front of the wave (high-fitness tail of the distribution) is shown to end abruptly at a point, expressed in terms of the wave speed.

iv. The difference between the values of the fitness distribution at the center and the edge is expressed in terms of the wave speed.

v. The value at the center is found from the normalization condition.

vi. Because the biological justification for the lack of genomes beyond the high-fitness cutoff is the finite size of the population, the cutoff point is identified with the stochastic edge.

vii. To determine the wave speed as a function of the population size, the average frequency of the least-loaded class is estimated from the classical diffusion result and matched to the deterministic cutoff value.

2.3 Equation for the deterministic part of the fitness distribution

We proceed with the first step. On the basis of our model assumptions and neglecting multiple

mutation events per generation per genome, we can write the deterministic time evolution of

the frequency $f_k(t)$ of genomes with mutational load k as

$$f_k(t+1) = \frac{1}{w_{\mathrm{av}}(t)}\Big(e^{-s(k-1)}\mu(L-k+1)f_{k-1}(t) + e^{-s(k+1)}\mu(k+1)f_{k+1}(t) + e^{-sk}(1-\mu L)f_k(t)\Big), \tag{1}$$

where k runs from 0 to L and $f_{-1}(t) \equiv f_{L+1}(t) \equiv 0$. By definition, $\sum_k f_k(t) = 1$ for all times t. Now we introduce the total per-genome mutation rate $U = \mu L$, and the ratio of beneficial mutation rate per genome to the total mutation rate, $\alpha_k = \mu k/U = k/L$. Inserting these expressions into Eq. (1), expressing $w_{\mathrm{av}}$ in terms of $k_{\mathrm{av}}$, expanding $e^{-sx} \approx 1 - sx$ (which we are allowed to do under the condition that $s|k - k_{\mathrm{av}}| \ll 1$, Simplification 1), and neglecting all terms proportional to $sU$, we arrive at:

$$f_k(t+1) = U(1-\alpha_{k-1})f_{k-1}(t) + U\alpha_{k+1}f_{k+1}(t) + \big[1 - U - s(k-k_{\mathrm{av}})\big]f_k(t). \tag{2}$$
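Before any linearization, the update of Eq. (1) can be iterated numerically as written. The sketch below is one possible implementation, assuming the frequencies for k = 0..L are held in a Python list; the function name `step_eq1` and all parameter values are illustrative assumptions rather than settings used in this work.

```python
import math

def step_eq1(f, s, mu):
    """Advance class frequencies f[0..L] by one generation using Eq. (1)."""
    L = len(f) - 1
    # Population-average fitness, w_av = sum_k exp(-s k) f_k
    w_av = sum(math.exp(-s * k) * fk for k, fk in enumerate(f))
    new_f = []
    for k in range(L + 1):
        # Gain from class k-1: a deleterious mutation at one of its L-k+1 unmutated sites
        gain_del = math.exp(-s * (k - 1)) * mu * (L - k + 1) * f[k - 1] if k > 0 else 0.0
        # Gain from class k+1: a back mutation at one of its k+1 deleterious sites
        gain_ben = math.exp(-s * (k + 1)) * mu * (k + 1) * f[k + 1] if k < L else 0.0
        # Genomes of class k that reproduce without mutating
        stay = math.exp(-s * k) * (1.0 - mu * L) * f[k]
        new_f.append((gain_del + gain_ben + stay) / w_av)
    return new_f

# Hypothetical parameters (assumptions for illustration)
L, s, mu = 100, 0.01, 1e-4
f = [0.0] * (L + 1)
f[30] = 1.0                      # every genome starts with load k = 30
for _ in range(200):
    f = step_eq1(f, s, mu)
total = sum(f)                   # should stay equal to 1 up to rounding
```

Because the three mutation outcomes for a class sum to its pre-mutation weight, dividing by the average fitness keeps the frequencies normalized at every generation, which gives a simple sanity check on any implementation.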

As mentioned before, we consider the case when the traveling wave is far from the sequence with the highest possible fitness, k = 0, as given by the condition $|k - k_{\mathrm{av}}| \ll k_{\mathrm{av}}$. Therefore, $\alpha_k$ depends only slowly (linearly) on k, so we are allowed to replace $\alpha_k$ by $\alpha \equiv \alpha_{k_{\mathrm{av}}}$ (Simplification 2), and find

$$f_k(t+1) - f_k(t) = U(1-\alpha)f_{k-1}(t) + U\alpha f_{k+1}(t) - \big[U + s(k-k_{\mathrm{av}})\big]f_k(t). \tag{3}$$
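The linearized balance of Eq. (3) is equally simple to iterate. The sketch below assumes a constant α, approximates k_av by the plain average load (the small-s approximation stated in the model assumptions), and assumes the distribution stays away from both ends of the array; the function name `step_eq3` and the parameter values are hypothetical.

```python
def step_eq3(f, s, U, alpha):
    """Advance class frequencies by one generation using the linearized Eq. (3)."""
    n = len(f)
    # Small-s approximation: k_av is taken as the average mutational load
    k_av = sum(k * fk for k, fk in enumerate(f))
    new_f = []
    for k in range(n):
        below = f[k - 1] if k > 0 else 0.0        # f_{k-1}, zero outside the array
        above = f[k + 1] if k < n - 1 else 0.0    # f_{k+1}, zero outside the array
        change = (U * (1.0 - alpha) * below       # deleterious mutations arriving from k-1
                  + U * alpha * above             # beneficial mutations arriving from k+1
                  - (U + s * (k - k_av)) * f[k])  # mutation out of k plus selection
        new_f.append(f[k] + change)
    return new_f

# Hypothetical parameters (assumptions for illustration)
s, U, alpha = 0.001, 0.01, 0.2
f = [0.0] * 401
f[200] = 1.0                     # start far from both ends of the array
for _ in range(100):
    f = step_eq3(f, s, U, alpha)
total = sum(f)
mean_load = sum(k * fk for k, fk in enumerate(f))
```

With k_av taken as the average load, the selection term sums to zero over all classes, so the total frequency is conserved as long as the end classes stay empty. For α below 1/2 the average load drifts upward under mutation, with selection opposing the drift in proportion to the variance of the distribution.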

Our goal is to turn this expression into a continuous differential equation. As the mutation rates are low, $f_k(t)$ evolves very slowly in time, and we can write $f_k(t+1) \approx f_k(t) + \partial f_k(t)/\partial t$ (Simplification 3).

We need to be more careful when making a continuous approximation for $f_k(t)$ as a function of k. As we show below, $f_k(t)$ changes rapidly with k in the important high-fitness tail. Therefore, the Taylor expansion of $f_k(t)$ is not justified. However, the logarithm of $f_k(t)$ is a smooth function of k in a broad parameter range, provided the “lead” of the distribution is large (Simplification 4). [The lead is the difference in number of mutations from the population center to the least-loaded class in the population (Desai and Fisher, 2007).] Therefore, a better approximation, which represents an improvement on the work by Tsimring et al. (1996), is to do the Taylor expansion on $\ln f_k$ and write $f_{k+1}(t) = f_k(t)\exp[\partial \ln f_k(t)/\partial k]$. With these approximations, and after introducing a rescaled time $d\tau = U\,dt$ and a rescaled selection coefficient $\sigma = s/U$, we find

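$$\frac{\partial \ln f_k(t)}{\partial \tau} = (1-\alpha)\,e^{-\partial \ln f_k(t)/\partial k} + \alpha\,e^{\partial \ln f_k(t)/\partial k} - \sigma(k - k_{\mathrm{av}}) - 1. \tag{4}$$

This nonlinear partial differential equation describes the deterministic movement of the population in fitness space over time.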
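The advantage of expanding the logarithm of the distribution rather than the distribution itself can be checked on a simple example. The sketch below assumes a Gaussian-shaped profile purely for illustration (the center, variance, and evaluation point are hypothetical choices, not results of this work) and compares two estimates of the ratio f_{k+1}/f_k deep in the high-fitness tail.

```python
import math

CENTER, VARIANCE = 50.0, 10.0    # hypothetical profile parameters

def log_profile(k):
    """ln f_k for an assumed Gaussian-shaped profile (unnormalized)."""
    return -(k - CENTER) ** 2 / (2.0 * VARIANCE)

def slope(k):
    """Derivative of ln f_k with respect to k for the same profile."""
    return -(k - CENTER) / VARIANCE

k = 20                           # a class far in the high-fitness tail
exact = math.exp(log_profile(k + 1) - log_profile(k))   # true f_{k+1} / f_k
log_taylor = math.exp(slope(k))  # first-order expansion of ln f_k
lin_taylor = 1.0 + slope(k)      # first-order expansion of f_k itself

rel_err_log = abs(log_taylor - exact) / exact
rel_err_lin = abs(lin_taylor - exact) / exact
```

For these assumed values the estimate based on ln f_k is off by roughly 5%, while the direct expansion of f_k misses the ratio by well over half, which illustrates why the continuum limit is taken on the logarithm.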

(Note that, for sufficiently large N, the assumptions of Simplifications 1 and 3 may be violated,

so that technically, we can neither expand fitness in k nor replace discrete time with continuous
