
Copyright © 2007 by the Genetics Society of America

DOI: 10.1534/genetics.107.075697

Analytical Description of Mutational Effects in Competing

Asexual Populations

Daniel Pinkel1

Comprehensive Cancer Center and Department of Laboratory Medicine, University of California,

San Francisco, California 94143

Manuscript received May 7, 2007

Accepted for publication October 1, 2007

ABSTRACT

The adaptation of a population to a new environment is a result of selection operating on a suite of

stochastically occurring mutations. This article presents an analytical approach to understanding the

population dynamics during adaptation, specifically addressing a system in which periods of growth are

separated by selection in bottlenecks. The analysis derives simple expressions for the average properties of

the evolving population, including a quantitative description of progressive narrowing of the range of selection coefficients of the predominant mutant cells and of the proportion of mutant cells as a function of time.

A complete statistical description of the bottlenecks is also presented, leading to a description of the sto-

chastic behavior of the population in terms of effective mutation times. The effective mutation times are

related to the actual mutation times by calculable probability distributions, similar to the selection coef-

ficients being highly restricted in their probable values. This analytical approach is used to model recently

published experimental data from a bacterial coculture experiment, and the results are compared to those of

a numerical model published in conjunction with the data. Finally, experimental designs that may improve

measurements of fitness distributions are suggested.

The adaptation of asexual populations to a new environment is the result of selection operating on the suite of stochastically occurring mutations, each of which may confer a different selective advantage. The mutations occur throughout time, so that multiple clones of mutant cells are present simultaneously. Mathematical modeling of the population dynamics typically employs numerical simulation to calculate a particular instance of the system, sampling probability distributions to include stochastic effects. The calculations may involve detailed tracking of each mutant clone through the history of the population. Running the model many times allows determination of its characteristic behavior as a function of the parameters describing mutation and selection. Estimates of the values of these parameters in a living system are obtained by comparisons of the statistical properties of the model with those of experimental data.

By contrast, this article presents an analytical description of the population dynamics. The analysis begins by establishing the identity of the average behavior of sequential finite cultures separated by bottlenecks with the growth of an exponentially expanding effectively infinite culture. Consideration of the infinite system provides analytical expressions for characteristic properties of the finite populations. The results include quantitative descriptions of growth of the proportion of

mutant cells with time and of the accompanying nar-

rowing of the frequency distribution of their selection

coefficients. Next, the stochastic behavior of finite sys-

tems is considered, resulting in a comprehensive and

convenient description of the selection in the bottle-

necks. The stochastic description is then used to de-

velop a model of coculture experiments such as the one

recently published by Hegreness and Shoresh (HS)

(Hegreness et al. 2006). Application of the results ob-

tained for the average behavior is used to simplify the

model and facilitate its comparison with experimental

data. Finally, the results are discussed and alternate

experimental designs that may allow better measure-

ment of the fitness distribution of the mutant cells are

suggested. Significant additional information concerning the analysis is presented in the supplemental materials at http://www.genetics.org/supplemental/, including

an Excel workbook for calculation of the statistical

distributions that are developed in this article.

THE EXPERIMENTAL SYSTEM

Consideration of the experimental system of HS

(Figure 1a) provides concrete motivation for the analysis presented in the remainder of this article. Approximately 2 × 10^5 cells were seeded into a culture, one-half

labeled with yellow fluorescent protein (YFP) and one-

half with cyan fluorescent protein (CFP). Otherwise the

1Address for correspondence: University of California, Box 0808, San

Francisco, CA 94143. E-mail: pinkel@cc.ucsf.edu

Genetics 177: 2135–2149 (December 2007)


ancestral cells were identical. Mutations with selection

coefficients s occurred stochastically in both popula-

tions of ancestral cells with a probability distribution

ρ(s) and an overall frequency μ per generation per cell.

The cells grew exponentially for 24 hr, and the culture

was then sampled to seed the next passage with ≈2 × 10^5

cells. After this bottleneck, exponential growth re-

sumed, followed by sampling to seed the next daughter

culture after another 24 hr, etc. The full experiment

lasted ≈40 days or ≈450 doublings for the starting

(ancestral) cells. Figure 1a illustrates the growth of the

CFP and YFP populations just before and after the kth

bottleneck for a time early in the series before the mu-

tant populations have become significant. Each growth

phase lasts a time τ = 11.7 population doublings and results in an e^{rτ} ≈ 3300-fold increase in cell number, where r = ln(2). At the bottleneck, a sampling of e^{−rτ} ≈ 1/3300 of the mutant and ancestral cells proceeded to

the subsequent daughter culture. The rest of the cells

were discarded.
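The growth and dilution arithmetic described above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `expansion_factor` is an assumption made here, and the growth period is written as `tau` to keep it distinct from elapsed time.

```python
import math

R = math.log(2)  # growth-rate constant r = ln(2); time is counted in ancestral doublings


def expansion_factor(tau_doublings: float) -> float:
    """Fold increase in cell number over one growth phase, e^(r*tau) = 2**tau."""
    return math.exp(R * tau_doublings)


if __name__ == "__main__":
    tau = 11.7  # doublings between bottlenecks, as stated in the text
    fold = expansion_factor(tau)
    print(f"expansion factor e^(r*tau): {fold:.0f}")
    print(f"fraction carried through each bottleneck: {1.0 / fold:.2e}")
```

The computed factor is close to the rounded value of 3300 quoted in the text.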

YFP/CFP fluorescence ratios in each of 72 series of

such cultures were measured at each bottleneck, pro-

viding a measure of the relative numbers of the two cell

types as a function of time. Because of the stochastic

nature of the mutational process and the passage of

mutants through the bottlenecks, one expects that

under some conditions the ratio may change with time,

differing in each series of cultures. Figure 1b shows

examples of four of the possible time courses for the

ratio, using green and red to distinguish the popula-

tions. The ancestral cells in both populations are repre-

sented as light colors, and the mutants (of any s) as dark

colors. This distinction is made for illustrative purposes

only since ancestral and mutant cells cannot be distin-

guished by their fluorescence. While the colors are

illustrated as separated, in the actual experiment the

two populations are thoroughly mixed.

In the first example, mutant cells are assumed to

come to prominence initially in the "green" population.
This time is indicated by a green T_1, defined in our
analysis as the time when the proportions of mutant and

ancestral cells in the green population are equal. The

overall growth rate of the green population detectably

increases, and green cells begin to overgrow the "red"

ones. Mutant cells with approximately the same s as

those in the green population are assumed to come to

prominence in the red population at a later time,

indicated by the red T_1. After this time, the green and

red populations tend toward the same growth rate, and

thus their ratio stabilizes. Mutant cells completely dom-

inate both populations with time, as indicated by the

darkening colors. This growth pattern corresponds to

Figure 1.—Schematic of the YFP/CFP cocul-

ture experiment. (a) The exponential expansion

of the YFP and CFP populations (distinguished

by green and red colors) during the growth peri-

ods before and after the kth bottleneck at a time

early in the experiment when mutant cells are

present at low abundance. The cultures begin with N_0 of each type of cell and exponentially expand for time τ so that there are N_0 e^{rτ} of each. At the bottleneck N_0 of each type of cell is removed and transferred to the next culture. The remainder is discarded. In the actual experiment, τ = 11.7 doublings, and the expansion factor is e^{rτ} ≈ 3300. (b) Four representative time courses for

the behavior of the color ratios. Ancestral cells

are indicated by light colors and mutants (of

any selection coefficient) by dark colors. The col-

ors are drawn separately for illustrative purposes,

but in the actual system the cells are uniformly

mixed. Schematic growth curves for the mutant

and ancestral populations during three of the

growth periods for the first series of cultures

are shown in boxes above that series. The behav-

ior of each culture series is described in the text.

The red and green T_1's indicate the times when

mutant and ancestral cells are of equal abun-

dance in the respective populations.

2136D. Pinkel


curves 2 and 3 in Figure 6, a and b. Graphs above series 1

schematically show the growth of the mutant and an-

cestral populations during three cultures of this series.

In the second example, red mutants completely over-

take the culture before green mutants of sufficiently

large s come to prominence, although green mutations

with low s will have occurred. This corresponds to curve

1 in Figure 6, a and b. In the third example, red mutants

are assumed to come to prominence first and the

fluorescence ratio begins to change in favor of red. At

a later time, mutant cells with larger s come to prom-

inence in the green population, and the ratio changes

in favor of green. Still later, mutants that have a selection

coefficient equivalent to the green population come to

prominence in the red population, and the ratio sta-

bilizes. The initial part of such behavior is shown by

curve 5 in Figure 6b. Subsequent variations in the ratio

may occur as mutants with ever larger s come to prominence in the two populations. But if ρ(s) has a sufficiently defined maximum s, then the ratio will finally

stabilize when both populations are dominated by

mutant cells with this maximal selection coefficient un-

less one population completely displaces the other as

shown in example 2. In the final example, roughly

equivalent mutants arise in both populations at about

the same times, so that the ratio remains constant as

both populations become dominated by mutant cells

with selection coefficients tending toward the maxi-

mum possible. This is the behavior that always occurs if

the size of the cultures is sufficiently large so that many

mutations occur.

THE ANALYTICAL FRAMEWORK

Imagine an arbitrarily large ensemble of initial cultures, each starting with N_0 cells (Figure 2a). Only one

cell population is shown for clarity. If additional popu-

lations are present as in the coculture experiment, their

behaviors will be statistically independent. Each initial

culture is sampled after growth time τ, but instead of

seeding only one next passage culture as in the actual

experiment, all of the material from each initial culture

Figure 2.—The conceptual experi-

ment. (a) An arbitrarily large number of initial cultures, each containing N_0 cells, are started at t = 0. Two such cultures, h and h + 1, are shown. After time τ, the time of the first bottleneck, the cells in each initial culture are divided among e^{rτ} daughter cultures so that each receives N_0 cells. Thus no cells are discarded. After each subsequent time interval τ, each daughter seeds e^{rτ} next-generation daughters. During each growth phase mutants received from the previous culture expand and de novo mutants randomly occur. At each bottleneck there is statistical variation in the number of mutant and ancestral cells received by the daughters. This process is indicated by the inset in the oval where m(s) and δm(s) indicate the average number and the fluctuation from the average of the mutants transmitted to the next daughter, and N and δN indicate the same for the ancestral cells. The time (in population doublings) and the number of the bottleneck are indicated along the bottom, assuming 11.7 doublings between bottlenecks. (b) The shape of W(s, t), normalized to its value at s = 0.1, is shown as a function of s for representative values of t. As t increases the weight of this function rapidly moves to larger s. The distribution of mutant cells in the pooled culture of a is given by ρ(s)W(s, t). For each of the possible paths through a series of daughters, the selection coefficients of the mutants will differ due to the stochastic variation, but the powerful shape of W(s, t) ensures that those that become prominent will be confined to a small range of s almost regardless of the shape of ρ(s). If ρ(s) is a constant out to some s_max, then W(s, t) directly gives the expected distribution of the mutants. The values of s, termed s_10, above which 90% of the mutant cells occur are indicated on the graphs for this case (see Equation 8).

Analysis of Asexual Competition 2137


is used to seed e^{rτ} daughter cultures (3300 in the case of

the specific experiment in question). The details of this

exhaustive sampling are illustrated in the oval inset in

Figure 2a for the kth bottleneck. Each of the (k + 1)st cultures receives N + δN ancestral and m(s) + δm(s) mutant cells, where the δ's indicate stochastic fluctuations that affect the individual cultures. The actual

experiment is equivalent to selecting 72 of the initial

cultures, and for each of these selecting a sequence of

daughters, thereby producing 72 series with 40 sequen-

tial cultures in each. Two such series are indicated in

Figure 2a by the shaded boxes. Examination of Figure

2a shows that the time dependence of the characteristics

of the mutant cells averaged over a large number of

series of daughters would be identical to the behavior of

the total mutant population if all of the daughters were

pooled. This pooled culture is just a population in un-

bounded exponential growth, which can be analyzed

with straightforward approaches. Since the structure of

this conceptual experiment preserves cell lineages, it

also provides the basis for the subsequent calculation of

the stochastic properties of the system.

This conceptual experiment is, of course, impossible

to implement. Using the parameters of the actual ex-

periment, by the 13th of the 40 days, the volume of cul-

ture medium required for the progeny of a single initial

culture would fill a ball with a radius about 22 times that

of our solar system out to Pluto, and the radius would be
increasing faster than the speed of light. Coincidentally,
this is about the characteristic time, T_1 ≈ 150 doublings,

that mutants became equal in abundance to the ances-

tral cells in the actual HS experiment (see below).

DESCRIPTION OF THE AVERAGE CHARACTERISTICS

Basic relationships: Consider one of the cell popula-

tions in the infinite pooled culture of Figure 2a. Its

development is described by standard equations for

exponential growth. Let N_∞(t) and M_∞(s, t) be the numbers of ancestral and mutant cells at time t. Then

$$\dot{N}_\infty(t) = (1 - \mu)\, r N_\infty(t) \cong r N_\infty(t) \tag{1a}$$

$$\dot{M}_\infty(s,t) = \mu \rho(s) N_\infty(t) + r(1+s) M_\infty(s,t), \tag{1b}$$

where the dots indicate the derivative with respect to time, μ is the overall mutation rate per generation for the ancestral cells, ρ(s) is the probability distribution for the selection coefficients s of the mutants, and r = ln(2). Equation 1a describes the increase of ancestral cells due to division and the decrease due to conversion to mutants. Since μ ≪ 1, it is neglected in what follows. The coefficient r scales time so that it is measured in units of the doubling time for the ancestral cells. Equation 1b describes the increase in the number of mutant cells with selection coefficient s through de novo mutation and the division of existing mutants with a rate 1 + s times that of the ancestral cells. In the real world,

singly mutant cells are susceptible to additional muta-

tions. The possibility of multiple mutations raises com-

plex modeling issues that are discussed in section 2 of

the supplemental materials at http://www.genetics.org/

supplemental/. Multiple mutants are neglected in what

follows.

This continuous growth model (with bottlenecks

introduced below) does not include one important

component of the actual experiment. In the real experiment the cultures enter a stationary phase where growth stops prior to seeding the daughter cultures. As mutants

become prominent in the population and the overall

growth rate increases somewhat, this stationary phase

will be reached earlier during each passage. By contrast,

the model allows expansion to continue during this

stationary period. The error introduced by this simpli-

fication is small for the HS experiment. As is shown

below, the maximum selection coefficient for the mutants in the experiment is ≈0.1. Thus, after cultures become dominated by mutants, in the time equivalent to 11.7 doublings of the ancestral cells the model allows about one additional doubling per passage (rsτ ≈ 1). Therefore, during the late phases of the experiment the

timescale in the model may be somewhat accelerated

compared to that of the actual cultures. However, in-

clusion of this small effect in the analysis is not war-

ranted given the noise in the experimental data with

which it will be compared. The presence of the sta-

tionary phase raises additional interpretive issues, since

as indicated in the discussion the selective advantage of

mutants may be expressed by changes in their behavior

as they cease and resume proliferation.

The solution of Equation 1a is N_∞(t) = N_∞ e^{rt}, where N_∞ is the number of ancestral cells at t = 0. Inserting N_∞(t) into Equation 1b allows solving for M_∞(s, t). It is convenient for the subsequent discussion to calculate R_m(s, t), the ratio of mutant cells with selection coefficient s to the total number of ancestral cells, because this describes how significant the mutant population has become:

$$R_m(s,t) = M_\infty(s,t)/N_\infty(t) \quad\text{or}\quad M_\infty(s,t) = N_\infty(t)\,R_m(s,t).$$

Inserting this into Equation 1b and using Equation 1a gives dR_m/dt = μρ(s) + rsR_m. With no mutants present at t = 0, one finds

$$R_m(s,t) = \mu\rho(s)\,(e^{rst} - 1)/(rs) = \mu\rho(s)\,W(s,t), \tag{2}$$

where

$$W(s,t) = (e^{rst} - 1)/(rs).$$

It is immediately apparent that the distribution of selection coefficients found in the mutant cells at time t is given by the product of ρ(s) with the weight factor
W(s, t).
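The weight factor of Equation 2 is simple to evaluate numerically. The sketch below is illustrative only: the function names `weight` and `mutant_ratio` are assumptions made here, and `math.expm1` is used so that the small-st limit W ≈ t is reproduced without round-off problems.

```python
import math

R = math.log(2)  # r = ln(2); time t is measured in ancestral doublings


def weight(s: float, t: float) -> float:
    """W(s, t) = (e^(r s t) - 1) / (r s), with the limit W = t at s = 0."""
    if s == 0.0:
        return t
    return math.expm1(R * s * t) / (R * s)


def mutant_ratio(s: float, t: float, mu: float, rho) -> float:
    """R_m(s, t) = mu * rho(s) * W(s, t); rho is any callable density in s."""
    return mu * rho(s) * weight(s, t)


if __name__ == "__main__":
    # Shape of W normalized to its value at s = 0.1, in the manner of Figure 2b.
    for t in (1.0, 11.7, 100.0, 200.0):
        shape = [weight(s, t) / weight(0.1, t) for s in (0.025, 0.05, 0.075, 0.1)]
        print(t, [f"{v:.3g}" for v in shape])
```

Running the loop shows the normalized weight at low s collapsing toward zero as t grows, which is the narrowing described in the text.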


At small values of st, W(s, t) ≈ t, and R_m(s, t) ≈ μtρ(s), indicating the buildup of de novo mutations linearly with time and with a distribution in s given by ρ(s). As s and/or t increase, W(s, t) increases dramatically due to the relative expansion of mutations that occurred early in the cultures. Figure 2b shows the shape of W(s, t) for 0 < s < 0.1 and times t = 1, 11.7 (the first bottleneck in the actual experiment), 100, and 200 population doublings. For plotting convenience these graphs have been normalized to the values of W(s, t) at s = 0.1. As time progresses, the weight of this function becomes concentrated at higher values of s. Thus for almost any shape of ρ(s) that one might choose, the width of the range of selection coefficients that are prominent in the mutant population will become narrower as time progresses. For most shapes of ρ(s) the "effective" selection coefficients will fall within a narrow range near the maximum s that is possible. An alternate derivation of Equation 2, which has the flexibility to address more complex systems, is given in section 1 of the supplemental materials at http://www.genetics.org/supplemental/.

Given Equation 2, the ratio of the number of mutant cells with selection coefficients between 0 and some value of s to the total number of ancestral cells at time t, denoted by P_m(s, t), can be calculated. Thus

$$P_m(s,t) = \int_0^s R_m(s',t)\,ds' = \mu \int_0^s \rho(s')\,W(s',t)\,ds'. \tag{3}$$

Integrating over the whole range of s gives P_m(∞, t), the ratio of the total number of mutant cells to the ancestral cells at time t. The fraction of mutant cells with selection coefficients between any values s_1 and s_2 is then given by

$$Q(s_1,s_2) = [P_m(s_2,t) - P_m(s_1,t)]/P_m(\infty,t). \tag{4}$$

These relationships allow determination of the average behavior of the system for various assumptions concerning μ and ρ(s). The range of effective selection coefficients and the characteristic times T_1 and T_100, for which the number of mutant cells is respectively equal to, or 100 times greater than, the ancestral population are now compared for two specific choices of ρ(s).

Comparison of uniform and delta-function distributions for ρ(s): Consider first a uniform distribution ρ(s) = 1/s_max for 0 ≤ s ≤ s_max and 0 otherwise. The product ρ(s)W(s, t) is W(s, t)/s_max for 0 ≤ s ≤ s_max and so has the shape of W(s, t) up to s_max, whereupon it drops to 0. Figure 2b shows the shape of this function, indicating that as time increases the predominant mutants have selection coefficients progressively closer to s_max. Applying Equations 3 and 4 allows calculation of the range of s of the effective mutations:

$$P_m(s,t) = \mu \int_0^s \rho(s')\,W(s',t)\,ds' = \frac{\mu}{s_{\max}} \int_0^s \frac{e^{rs't} - 1}{rs'}\,ds' = \frac{\mu}{r s_{\max}} \int_0^{rst} \frac{e^z - 1}{z}\,dz = \frac{\mu}{r s_{\max}}\, I(rst). \tag{5}$$

The integral I(rst) can be evaluated numerically in a straightforward manner. Figure 3a shows a graph of Log10[I(rst)] along with a regression fit for the region 5 < rst < 50. Using this fit as an approximation gives

$$\begin{aligned}
\log_{10}[P_m(s,t)] &= \log_{10}[\mu/(r s_{\max})] + \log_{10}[I(rst)] \\
&\approx \log_{10}[\mu/(r s_{\max})] + 0.39\,rst - 0.50 + 0.0005\,(rst)^2 + \dots \\
&\approx \log_{10}[\mu/s_{\max}] + 0.27\,st - 0.34
\end{aligned} \tag{6}$$

or equivalently

$$P_m(s,t) \approx \frac{0.32\,\mu\,e^{0.9\,rst}}{r s_{\max}}.$$

The use of the linear approximation for the fit to Log10[I(rst)] is particularly accurate for 7 < rst < 16, for which the deviation of the approximate value of Log10[P_m(s, t)] from the true value is <0.1. This corresponds to times on the order of ≈100 to ≈230 generations for s ≈ 0.1. Section 4 of the supplemental materials

Figure 3.—Approximate evaluation of the integral I(rst).

(a) The black line shows the numerical evaluation of the integral in Equation 5 for 0.1 < rst < 50. The red line, coincident with the black line, shows a second-order fit to the values for 5 < rst < 50. The coefficients of the fit are sensitive to the exact range of the curve that is being fitted, but these are representative. (b) Evaluation of I(rst) for 0 < rst < 3 (black line). The result of integrating a Taylor series expansion of the integrand is shown in red. The formula for this curve is shown on the graph and used in section 4 of the supplemental materials at http://www.genetics.org/supplemental/.


at http://www.genetics.org/supplemental/ and Figure

3b present an approximation to Equation 5 and I(rst)

for low values of rst.
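One way to evaluate I(x) = ∫_0^x (e^z − 1)/z dz numerically is to integrate the Taylor series of the integrand term by term, which gives the sum of x^n/(n·n!) over n ≥ 1. The Python sketch below takes that route as an assumption of convenience; it is not necessarily how the curves in Figure 3 were produced, and the function names are illustrative. It then compares Log10[I(x)] with the linear part of the fit quoted in Equation 6.

```python
import math


def integral_I(x: float, tol: float = 1e-15) -> float:
    """I(x) = integral from 0 to x of (e^z - 1)/z dz, for x >= 0.

    Uses the term-by-term integrated Taylor series sum_{n>=1} x^n / (n * n!).
    All terms are non-negative, so there is no cancellation.
    """
    total = 0.0
    x_pow_over_fact = 1.0  # holds x^n / n! for the current n
    n = 0
    while True:
        n += 1
        x_pow_over_fact *= x / n
        contribution = x_pow_over_fact / n
        total += contribution
        # Stop once the terms are past their peak (n > x) and negligible.
        if n > x and contribution <= tol * total:
            return total


def linear_fit_log10_I(x: float) -> float:
    """Linear part of the regression fit quoted in Equation 6."""
    return 0.39 * x - 0.50


if __name__ == "__main__":
    for x in (5.0, 7.0, 10.0, 16.0, 30.0, 50.0):
        exact = math.log10(integral_I(x))
        print(f"rst = {x:5.1f}  log10 I = {exact:7.3f}  linear fit = {linear_fit_log10_I(x):7.3f}")
```

Within 7 < rst < 16 the two columns differ by less than 0.1, in line with the accuracy statement in the text.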

The range of s containing 90% of the mutant cells is

arbitrarily defined as the range of effective selection

coefficients. Since Pm(s, t) is monotonically increasing,

the effective range can be obtained by determining the

selection coefficient, s_10, for which 10% of mutants have lower s. Setting Q(0, s_10) = 0.1 in Equation 4 yields s_10. Since P_m(0, t) = 0, one finds

$$\begin{aligned}
\log_{10} Q(0, s_{10}) &= -1 \\
&= \log_{10}[P_m(s_{10},t)] - \log_{10}[P_m(s_{\max},t)] \\
&= 0.27\,s_{10}t - 0.27\,s_{\max}t.
\end{aligned} \tag{7}$$

Thus within the range of validity of the approximation for Log10[I(rst)],

$$s_{\max} - s_{10} = \Delta s = \frac{1}{0.27\,t} \quad\text{or}\quad \frac{\Delta s}{s_{\max}} = \frac{1}{0.27\,s_{\max}t}. \tag{8}$$

Thus if s_max = 0.1, at t = 100 doublings, 90% of the mutant cells have s between 0.063 and 0.1, while by t = 200 the same proportion will be between 0.083 and 0.1. The values of s_10 are indicated on the graphs in Figure 2b.

The highly peaked distribution in s of the mutant population at long times even though ρ(s) was assumed to be flat suggests that some aspects of the behavior of cultures of these cells would be similar to a system in which one assumed a delta-function distribution of selection coefficients. Using ρ(s) = δ(s − x) and a mutation rate of μ_δ in Equation 3, one finds, analogous to Equation 5,

$$P_m(\infty,t) = \frac{\mu_\delta\,(e^{rxt} - 1)}{rx} \cong \frac{\mu_\delta\,e^{rxt}}{rx} \tag{9}$$

or

$$\log_{10}[P_m(\infty,t)] = \log_{10}[\mu_\delta\,(e^{rxt} - 1)/(rx)] \approx \log_{10}(\mu_\delta/x) + 0.30\,xt + 0.16 \quad \text{for } rxt \gg 1.$$

This result for the delta function is identical in form to the (approximate) result for the uniform distribution of ρ(s) found in Equation 6. Evaluating Equation 6 at s = s_max so that the entire mutant population is represented, Equations 6 and 9 are quantitatively identical if one sets x = 0.9s_max and μ_δ = 0.29μ. Thus after the initial period where they are clearly distinct, the total number of mutant cells evolves with approximately the same time course for both of these forms for ρ(s). The delta-function approximation is appropriate for any ρ(s) that results in a highly peaked shape for ρ(s)W(s, t) at long times. The scaling of the overall mutation rates and the "equivalent" selection coefficients will depend on the details of the shape of ρ(s) and may also depend on which aspects of the cultures are being modeled. Some critical aspects of real systems cannot be modeled by the delta-function approximation, as shown in Figure
6 and the related text.
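The stated equivalence between the uniform and delta-function cases (x = 0.9 s_max and a delta-function mutation rate of 0.29 times the uniform one) can be checked numerically. The sketch below is a hedged illustration: the function names are assumptions made here, and the uniform case uses the exponential form of the Equation 6 approximation rather than the exact integral.

```python
import math

R = math.log(2)  # r = ln(2)


def ratio_uniform_approx(s_max: float, t: float, mu: float) -> float:
    """Approximate P_m(s_max, t) for uniform rho(s): 0.32 mu e^(0.9 r s_max t) / (r s_max)."""
    return 0.32 * mu * math.exp(0.9 * R * s_max * t) / (R * s_max)


def ratio_delta(x: float, t: float, mu_delta: float) -> float:
    """P_m(infinity, t) for rho(s) = delta(s - x): mu_delta (e^(r x t) - 1) / (r x)."""
    return mu_delta * math.expm1(R * x * t) / (R * x)


if __name__ == "__main__":
    mu, s_max = 1e-5, 0.12  # illustrative values in the range discussed in the text
    for t in (100.0, 150.0, 200.0):
        uniform = ratio_uniform_approx(s_max, t, mu)
        delta = ratio_delta(0.9 * s_max, t, 0.29 * mu)
        print(f"t = {t:5.0f}  uniform/delta = {uniform / delta:.3f}")
```

The printed ratio stays within about 1% of unity, reflecting the rounding of 0.32 × 0.9 to 0.29.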

Characteristic times: There are at least two times in the history of the culture that are interesting to calculate. The most relevant is T_1, the time when the numbers of mutant and ancestral cells in the culture are equal. While this differs in each series of finite cultures as shown in Figure 1b, it has a well-defined value for the effectively infinite system of Figure 2a. This characteristic time, T_1, is the time when P_m(∞, T_1) = 1, which can be determined from Equation 3 for any assumed ρ(s). For the specific case of the uniform fitness distribution, P_m(∞, T_1) = P_m(s_max, T_1) = 1 or Log10[P_m(s_max, T_1)] = 0. Similarly, for a delta-function distribution, Log10[P_m(∞, T_{1δ})] = 0. Using Equations 6 and 9 one finds

$$\begin{aligned}
T_1 &= (0.34 - \log_{10}[\mu/s_{\max}])/(0.27\,s_{\max}) && \text{(uniform distribution)} \\
T_{1\delta} &= (-0.16 - \log_{10}[\mu_\delta/x])/(0.30\,x) && \text{(delta-function distribution).}
\end{aligned} \tag{10}$$

The second relevant time is the time for "fixation" of the mutations, which following HS is defined as the time when mutants are 100 times the abundance of ancestral cells. At this time for the uniform distribution P_m(s_max, T_100) = 100, or Log10[P_m(s_max, T_100)] = 2, while similarly for the delta-function distribution Log10[P_m(∞, T_{100δ})] = 2. Thus from Equations 6 and 9,

$$\begin{aligned}
T_{100} &= (2.34 - \log_{10}[\mu/s_{\max}])/(0.27\,s_{\max}) && \text{(uniform distribution)} \\
T_{100\delta} &= (1.84 - \log_{10}[\mu_\delta/x])/(0.30\,x) && \text{(delta-function distribution).}
\end{aligned} \tag{11}$$

The range of selection coefficients that are prominent in the population at these times for the uniform distribution can be calculated using Equation 8. Using μ = 10^{−5} and s_max ≈ 0.12, values used by HS for their simulation of a uniform ρ(s) described in their Figure 1B, Equation 10 yields T_1 = 136 generations. At that time 90% of the mutant cells are in the range Δs/s_max ≈ 0.23, or 0.089 < s < 0.12. Similarly, from Equation 11 T_100 = 198 generations, and at that time Δs/s_max = 0.16, or 0.10 < s < 0.12. Thus at these times 90% of the mutant cells have selection coefficients very near the
maximum possible.
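The worked numbers above follow directly from Equations 8, 10, and 11 for the uniform distribution. A minimal Python sketch, with function names that are assumptions made for illustration, is:

```python
import math


def time_to_ratio_uniform(mu: float, s_max: float, log10_ratio: float = 0.0) -> float:
    """Time at which Log10[P_m(s_max, t)] equals log10_ratio, from the linear form of Equation 6.

    log10_ratio = 0 gives T_1 (Equation 10); log10_ratio = 2 gives T_100 (Equation 11).
    """
    return (0.34 + log10_ratio - math.log10(mu / s_max)) / (0.27 * s_max)


def relative_width(s_max: float, t: float) -> float:
    """Delta s / s_max from Equation 8: relative width of the range holding 90% of mutants."""
    return 1.0 / (0.27 * s_max * t)


if __name__ == "__main__":
    mu, s_max = 1e-5, 0.12  # values attributed in the text to the HS uniform simulation
    t1 = time_to_ratio_uniform(mu, s_max)
    t100 = time_to_ratio_uniform(mu, s_max, log10_ratio=2.0)
    print(f"T_1   = {t1:.0f} generations, relative width = {relative_width(s_max, t1):.2f}")
    print(f"T_100 = {t100:.0f} generations, relative width = {relative_width(s_max, t100):.2f}")
```

Because these formulas rest on the linear fit, they inherit its range of validity (roughly 7 < rst < 16).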

STOCHASTIC EFFECTS

Differences in behavior among different series of cultures are due to stochastic variation in the times t_mi new mutants arise, the corresponding selection coefficients s_i of the mutants, and the effects of the selection bottlenecks. This section demonstrates that the stochastic effect of the multiple bottlenecks is equivalent to altering the actual mutation times t_mi to effective times t*_mi and derives probability distributions for t*_mi given t_mi, s, and the length of the growth periods, τ. This formalism, coupled with statistical sampling of the mutation times on the basis of the mutation rate and sampling of the selection coefficients on the basis of ρ(s),


allows a complete description of the stochastic behavior

of the system. The analysis proceeds by first calculating

the statistics of the behavior of a single-mutant clone

with a defined mutation time and selection coefficient

and then combining multiple mutants to describe the

system completely.

Stochastic behavior of a single-mutant clone: Figure 4 shows the details of the behavior of the progeny of a mutant cell expanding through a series of daughter cultures such as described in the conceptual experiment of Figure 2a. No cells are discarded. Assume that the mutation occurred at time 0 ≤ t_m ≤ τ, where τ is the length of each of the culture periods. At the kth bottleneck, t = kτ, and the average number of mutants, m(kτ), transferred to each daughter is the total number of progeny from this mutation divided by the total number of daughter cultures:

$$m(k\tau) = \frac{e^{r(1+s)(k\tau - t_m)}}{e^{rk\tau}} = e^{rsk\tau - r(1+s)t_m} = m\,e^{rs(k-1)\tau}, \qquad m = e^{-rt_m}\,e^{rs(\tau - t_m)}. \tag{12}$$

m is the average number of mutant progeny that are

transferred to each daughter at the first bottleneck.

Due to statistical fluctuations in the bottlenecks, the

actual number of cells each daughter receives is dis-

tributed around this average. As illustrated in Figure 4,

after the second and subsequent bottlenecks, daughters

with the same number of mutants can descend from

different predecessors, so that the probability, G*_k(n), of a daughter receiving n mutant cells after the kth bottleneck requires summing over these multiple possibilities. Suppose that q mutants were transferred into a daughter after the (k − 1)st bottleneck. At the next bottleneck, the kth, these have expanded so that on average q e^{rsτ} mutant cells will be distributed to each of its e^{rτ} daughters. The statistical distribution in the number n received by these daughters is given by a probability distribution p(n : q e^{rsτ}). But the probability of having a culture with q initial cells is given by G*_{k−1}(q). Thus,

$$G^*_k(n) = \sum_{q=0}^{\infty} p(n : q e^{rs\tau})\,G^*_{k-1}(q) = \sum_{q=0}^{\infty} \frac{(q\lambda)^n e^{-q\lambda}}{n!}\,G^*_{k-1}(q), \tag{13}$$

where λ = e^{rsτ}, k ≥ 2, and the Poisson probability distribution has been used because it is appropriate,

Figure 4.—Schematic of the stochastic trans-

mission of the progeny of a single mutation

through two bottlenecks. The boxes indicate in-

dividual cultures that are related to each other

according to the exhaustive sampling scheme de-

scribed in Figure 2a. A mutation is assumed to oc-

cur at some time during the initial culture period

and on average m mutant progeny are passed to

each of its e^{rτ} daughters. The daughters are arranged in order of the number of mutant cells they receive, which is indicated within the boxes. The small arrows indicate the additional daughter cultures that received the same number of mutant cells. The proportion of the first-generation daughters receiving an actual number, q, is given by a suitable probability distribution such as Poisson. Between the first and the second bottlenecks these mutants expand relative to the ancestral cells by a factor e^{rsτ}. These will again be probabilistically distributed to their daughters. From the second bottleneck onward, daughters with a particular number of cells, n, descend from multiple predecessor cultures. The probability of having n mutants after bottleneck k, G*_k(n), is given by summing over these possibilities, as shown in Equation 13. The ratio of the actual number of mutant cells in an individual culture after the kth bottleneck to the average number for that time is ε_k (Equation 14).


at least initially, when the number of mutant cells is

substantially smaller than the total number of cells. As

shown below, this condition is met for the entire time

period during which stochastic variation is important.

As can be seen from examination of Figure 4, G*_1(n) = p(n : m(τ)), where from Equation 12, m(τ) = e^{rsτ − r(1+s)t_m} = λe^{−r(1+s)t_m}. Evaluation of Equation 13 proceeds by using G*_1(n) to calculate G*_2(n), etc. The supplemental materials at http://www.genetics.org/supplemental/ contain an Excel workbook that performs this calculation, as well as the calculations of the related statistical distributions discussed below. Note that G*_k(n) depends on τ, s,
and t_m through m and λ.
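The recursion of Equation 13 can also be carried out in a few lines of Python. The sketch below is not the workbook from the supplemental materials; it is a minimal illustration in which the function names and the truncation of the sums at a finite `n_max` are assumptions made here. The Poisson probabilities are computed in log space to avoid overflow at large n.

```python
import math

R = math.log(2)  # r = ln(2)


def poisson_pmf(n: int, mean: float) -> float:
    """Poisson probability of n events for the given mean (a mean of 0 puts all mass on n = 0)."""
    if mean == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def first_bottleneck(s: float, t_m: float, tau: float, n_max: int) -> list:
    """G*_1(n) = p(n : m(tau)) with m(tau) = e^(r s tau - r (1 + s) t_m), for n = 0..n_max."""
    mean = math.exp(R * s * tau - R * (1.0 + s) * t_m)
    return [poisson_pmf(n, mean) for n in range(n_max + 1)]


def next_bottleneck(prev: list, s: float, tau: float) -> list:
    """One step of Equation 13: G*_k(n) = sum_q p(n : q * lambda) G*_{k-1}(q), truncated at n_max."""
    lam = math.exp(R * s * tau)
    n_max = len(prev) - 1
    return [
        sum(poisson_pmf(n, q * lam) * prev[q] for q in range(n_max + 1))
        for n in range(n_max + 1)
    ]


if __name__ == "__main__":
    s, t_m, tau = 0.09, 1.0, 11.7  # parameters quoted for Figure 5a
    g = first_bottleneck(s, t_m, tau, n_max=200)
    for k in range(1, 4):
        mean_n = sum(n * p for n, p in enumerate(g))
        print(f"k = {k}: P(n = 0) = {g[0]:.3f}, mean n = {mean_n:.2f}")
        g = next_bottleneck(g, s, tau)
```

The truncation must be large enough that negligible probability lies above `n_max`; since the mean grows by a factor λ per bottleneck, `n_max` has to be raised as k increases.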

The basic behavior of Equation 13 is easily understood. Beginning with a Poisson distribution after the first bottleneck, the distribution progressively broadens after subsequent bottlenecks. Figure 5a shows plots of $G^*_k(n)$ for $k = 1$–7 when $t_m = 1$ and $s = 0.09$. Note that $G^*_k(n)$ varies smoothly and progressively more slowly as $k$ increases, except between $n = 0$ and 1, where there is a pronounced discontinuity that increases with increasing $k$. Initially the broadening of $G^*_k(n)$ is due to the combination of stochastic events in the bottlenecks and the expansion of the mutant clone. As the number of mutant cells increases with time, stochastic variation in the number transferred from a particular culture to its immediate daughters eventually becomes insignificant compared to the mean number transferred, since $\Delta n / n \sim n^{-1/2}$. Thus

it is expected that the important stochastic variation

induced by the bottlenecks occurs early in the experi-

ment. This expectation can be quantitatively described

in the following manner.

Let $\varepsilon_k(n)$ be the ratio of the actual number of mutant cells $n$ in a particular culture to the average number (Equation 12) after the $k$th bottleneck. Then

\[ n = \varepsilon_k(n)\, m(k\tau) = \varepsilon_k(n)\, e^{rsk\tau - r(1+s)t_m}. \tag{14} \]

The values of $\varepsilon_k(n)$ for daughters receiving different numbers of mutants after the second bottleneck are illustrated in Figure 4. The average value of $\varepsilon_k(n)$ is 1. Although $n$ is rigorously an integer so that only specific values of $\varepsilon_k(n)$ can occur, the slow variation of $G^*_k(n)$ with $n$ (Figure 5a) allows an accurate description of the system to be obtained by treating $n$ as a continuous variable and defining $G_k(n)$ as a continuous probability density that has the values $G^*_k(n)$ for integer values of $n$. For noninteger $n$, $G_k(n)$ can be obtained by interpolation. This approximation is valid for $n \ge 1$, but not for $n = 0$ due to the substantial discontinuity in $G^*_k(n)$ between $n = 0$ and 1. Thus in the calculations that follow, probabilities corresponding to $n = 0$ will be given by $G^*_k(0)$, while those corresponding to other values of $n$ will be based on calculations that treat parameters as continuous. Using this approximation, which is increasingly accurate as $k$ increases, the probability distribution,

Figure 5. Statistical distributions for $n$, $\varepsilon$, and the effective mutation time $t^*_m$. (a) The probability $G^*_k(n)$ of having $n$ progeny of a mutant after $k$ bottlenecks for $s = 0.09$, $\tau = 11.7$, and $t_m = 1$. The value of $k$ for each curve is shown by the orange numbers. Note that the distribution broadens to higher values of $n$ with increasing $k$, and its maximum magnitude for $n > 0$ correspondingly decreases. For the chosen parameters, the probability that a culture will have 0 progeny of the mutant (indicated by the arrow) is basically established at the first bottleneck and subsequently increases only slightly. (b) Probability distribution for $\varepsilon_k$ after each bottleneck. Distributions are shown for $k = 1$–7, for $t_m = 1$, $\tau = 11.7$, and $s = 0.09$. Symbols corresponding to each $n$ are shown for the first three bottlenecks; for larger $k$ they are so closely spaced that solid curves are shown. Note that differences become imperceptible for $k > 4$ and that the width is significantly broader than a Poisson distribution, which is shown in blue for comparison. The probability of a culture receiving 0 cells hardly increases after the first bottleneck for these parameters. These values are given by $Q^*_k(0) = G^*_k(0)$. (c) Dependence of the probability distribution $F(t^*_m)$ of the effective mutation times, $t^*_m$, on the actual mutation time, $t_m$, for $\tau = 11.7$ and $s = 0.1$. As a mutation occurs at later times, the probability that a daughter culture will receive 0 progeny increases, as shown by the boxes at $t^*_m = \infty$. The curves for finite $t^*_m$ are plotted after normalization by $1 - F^*(\infty)$, where $F^*(\infty)$ is the probability of having $n = 0$ progeny of the mutation. Note that the curves move to slightly higher values of $t^*_m$ as the mutation time increases, but still overlap 0.


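$Q_k(\varepsilon_k)$, for $\varepsilon_k$, after each bottleneck can be calculated using Equations 12–14,

\[ G_k(n)\,dn = G_k(n)\,\frac{dn}{d\varepsilon_k}\,d\varepsilon_k \equiv Q_k(\varepsilon_k)\,d\varepsilon_k \;\Rightarrow\; Q_k(\varepsilon_k) = G_k(n(\varepsilon_k))\, m(k\tau), \tag{15} \]

where $m(k\tau)$ is given by Equation 12. Equation 15 holds for $\varepsilon_k > 0$. The probability of having $\varepsilon_k = 0$, which corresponds to $n = 0$, is $Q^*_k(0) = G^*_k(0)$.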

Figure 5b shows the behavior of $Q_k(\varepsilon_k)$ and $Q^*_k(0)$ for $k = 1$–7, $s = 0.09$, $\tau = 11.7$, and $t_m = 1$. (The distributions for other parameter values can be calculated using the Excel sheet in the supplemental materials at http://www.genetics.org/supplemental/.) Note that the shape changes substantially for the first few bottlenecks, but for $k \ge 4$ it becomes constant. This is the quantitative

description of the previous statement that once the

number of progeny of the mutated cell becomes large,

no significant additional variation is introduced by

the subsequent bottlenecks. Thus a single distribution,

$Q(\varepsilon)$, calculable from first principles, can be used to

describe the behavior of the mutants after sufficient

time. The distribution for each mutant clone depends

on the actual time the mutation occurred, the selection

coefficient, and the length of the growth periods. In this

example calculation, the stable distribution is reached

by the analysis of cultures containing up to 100 mutant

cells, 1000-fold less than the number of ancestral cells

present in the actual experiment. Thus statistical sta-

bility is reached well before the time T1. For compari-

son, Figure 5b also shows a Poisson distribution scaled

so that its maximum is located at the same position as

the maximum of the limiting mutant distribution. The

enhanced broadening due to the series of bottlenecks is

evident.

The fact that a stable, readily calculable probability

distribution can be used to describe the progeny of each

mutation after several bottlenecks allows a particularly

simple general description of the stochastic behavior of

the entire system. Consider the ratio, $\bar{R}(t)$, of the average number of mutant cells to the number of ancestral cells per culture:

\[ \bar{R}(t) = \frac{m(t)}{N_0} = \frac{e^{r(1+s)(t - t_m)}}{N_0\, e^{rt}} = \frac{e^{rst - r(1+s)t_m}}{N_0}. \tag{16} \]

The actual ratio in the daughter cultures, $R(t)$, will differ from this average ratio by the factor $\varepsilon$ that has developed due to stochastic variation in the earlier bottlenecks. Therefore, for times long enough after the mutation for statistical stability to be established, one has

\[ R(t) = \varepsilon\, \bar{R}(t). \tag{17} \]

Therefore,

\[ R(t) = \frac{\varepsilon\, e^{rst - r(1+s)t_m}}{N_0} = \frac{e^{rst - r(1+s)t_m + \ln \varepsilon}}{N_0} = \frac{e^{rs(t - t^*_m)}}{N_0\, e^{r t^*_m}}, \tag{18} \]

where $t^*_m = t_m + \Delta t_m$ is the effective time the mutation occurred, adjusted from the actual $t_m$ by $\Delta t_m = -\ln(\varepsilon)/[r(1+s)]$ due to the stochastic effects in the bottlenecks, and $N_0 e^{r t^*_m}$ is the size of the ancestral population at the effective mutation time. Negative values of $\Delta t_m$ indicate a stochastic fluctuation that increases the abundance of a mutant clone relative to its average value; i.e., it appears that the mutation occurred earlier than it actually did. Positive values indicate a stochastic decrease relative to the average, since the effective occurrence time was later than the actual time. If a mutant is lost from a series of cultures, $\Delta t_m = \infty$. Note that Equation 18 assumes that $0 \le t_m \le \tau$.
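The equivalence of the first and last forms of Equation 18 can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, r = ln 2 is an assumption (time in ancestral generations), and the clone parameters and N0 = 10^5 are made-up example values.

```python
import math

R = math.log(2)  # assumed growth rate: time measured in ancestral generations

def effective_mutation_time(t_m, eps, s, r=R):
    """t*_m = t_m - ln(eps) / (r (1 + s)); a lost clone (eps = 0) gives infinity."""
    if eps == 0:
        return math.inf
    return t_m - math.log(eps) / (r * (1 + s))

def ratio_actual(t, t_m, eps, s, n0, r=R):
    """First form of Equation 18: eps times the average ratio of Equation 16."""
    return eps * math.exp(r * s * t - r * (1 + s) * t_m) / n0

def ratio_effective(t, t_eff, s, n0, r=R):
    """Last form of Equation 18, written with the effective mutation time."""
    return math.exp(r * s * (t - t_eff)) / (n0 * math.exp(r * t_eff))

# Hypothetical clone that is twice as abundant as average after the early bottlenecks.
t_eff = effective_mutation_time(t_m=1.0, eps=2.0, s=0.09)
print("effective mutation time:", t_eff)
print(ratio_actual(50.0, 1.0, 2.0, 0.09, 1e5), ratio_effective(50.0, t_eff, 0.09, 1e5))
```

A clone with ε > 1 receives an effective mutation time earlier than its actual one, and both expressions give the same mutant-to-ancestral ratio.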

Calculation of the probability distribution, $F(t^*_m)$, for $t^*_m$ allows a statistical description of the behavior of the mutant cells. Using the continuous variable approximation discussed prior to Equation 15,

\[ G_k(n)\,dn = G_k(n)\left|\frac{dn}{dt^*_m}\right| dt^*_m \equiv F_k(t^*_m)\,dt^*_m \;\Rightarrow\; F_k(t^*_m) = G_k(n)\, n\, r(1+s) \tag{19a} \]

or

\[ F_k(t^*_m) = G_k(n)\,\frac{n}{m(k\tau)}\, m(k\tau)\, r(1+s) = Q_k(\varepsilon_k)\,\varepsilon_k\, r(1+s) \;\to\; Q(\varepsilon)\,\varepsilon\, r(1+s), \tag{19b} \]

where the relationships among the various parameters required to evaluate Equation 19 are $t^*_m = t_m - \ln(\varepsilon)/[r(1+s)] = t_m - \ln(n/m(k\tau))/[r(1+s)]$, and Equation 15 was used to relate $F_k(t^*_m)$ to $Q_k(\varepsilon_k)$. Equation 19b demonstrates that $F_k(t^*_m)$ becomes independent of $k$ after several bottlenecks since $Q_k(\varepsilon_k)$ becomes independent of $k$. As a practical matter, $F_k(t^*_m)$ is calculated by choosing $n$ and $m(k\tau)$ for any bottleneck after stabilization of the distributions, with $m(k\tau)$ given by Equation 12. Equation 19 is valid for finite $t^*_m$. For $t^*_m = \infty$, which corresponds to having $n = 0$ progeny of the mutant, $F^*(\infty) = G^*_k(0)$. The stabilization of the distribution for $t^*_m$ after several bottlenecks means that the $t^*_m$ of a mutation in a particular daughter culture remains the same for all of its descendant cultures. Thus the effective mutation times provide a suitable basis for describing the stochastic character of the long-term evolution of the system.

Figure 5c shows the behavior of $F(t^*_m)$ and $F^*(\infty)$ for $t_m = 0$, 1, 2, and 5, with $s = 0.1$ and $\tau = 11.7$. Given the assumed parameters, if a mutation actually occurs at $t_m = 0$, then on average approximately 2.25 mutant cells from this clone will be transferred to each daughter at the first bottleneck, so that most will receive at least one cell. For larger $t_m$, the proportion of daughter lineages that receive no progeny of the mutant (i.e., have $t^*_m = \infty$) increases, correspondingly reducing the magnitude of


$F(t^*_m)$ for finite $t^*_m$. Therefore, to conveniently visualize the behavior of $F(t^*_m)$ for different $t_m$ in one figure, $F(t^*_m)/(1 - F^*(\infty))$ has been plotted. Note that the distributions of effective mutation times for those daughter cultures that have at least one cell with this mutation include $t^*_m = 0$ and have widths of several doubling times. Thus if the progeny of a mutation are not lost in a series of cultures, they behave as if their founding mutation occurred near the beginning of the culture period in which they originated. As $t_m$ increases the distributions move somewhat toward higher values of $t^*_m$, but eventually stabilize except for a decrease in overall magnitude that has been normalized in this plot. The stabilization comes about because, as $t_m$ increases, eventually all daughter cultures receive either 0 or 1 cell at the first bottleneck, and those that receive a cell subsequently develop with similar statistical behavior.
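The figure of roughly 2.25 mutant cells per daughter follows directly from Equation 12 evaluated at the first bottleneck, $m(\tau) = e^{rs\tau - r(1+s)t_m}$. The short sketch below reproduces it for the mutation times of Figure 5c. It assumes r = ln 2 and a Poisson distribution at the first bottleneck; the function names are hypothetical. The zero-cell probability it reports covers only the first bottleneck, so it is a lower bound on $F^*(\infty)$, which also includes losses at later bottlenecks.

```python
import math

def mean_mutants_first_bottleneck(t_m, s, tau, r=math.log(2)):
    """Equation 12 at t = tau: average mutant cells passed to each daughter."""
    return math.exp(r * s * tau - r * (1 + s) * t_m)

def fraction_without_mutants(t_m, s, tau, r=math.log(2)):
    """Poisson probability that a daughter receives no mutant cell at the first bottleneck."""
    return math.exp(-mean_mutants_first_bottleneck(t_m, s, tau, r))

for t_m in (0, 1, 2, 5):
    m = mean_mutants_first_bottleneck(t_m, s=0.1, tau=11.7)
    p0 = fraction_without_mutants(t_m, s=0.1, tau=11.7)
    print(f"t_m={t_m}: mean mutants per daughter = {m:.3f}, P(no mutants) = {p0:.3f}")
```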

Description of multiple mutations: Equations 18 and 19 allow a complete description of the multiple mutations that occur during growth in the cultures. The ratio of the total number of mutant to ancestral cells in a daughter culture, $\hat{P}_m(t)$, is obtained by summing over all mutations. Using Equation 18,

\[ \hat{P}_m(t) = \sum_{k=0}^{K} \sum_{i=1}^{I_k} \frac{e^{r s_{ki}(t - t^*_{mki})}}{N_0\, e^{r(t^*_{mki} - k\tau)}} \tag{20a} \]

\[ \hat{P}_m(t) = \frac{1}{N_0} \sum_{k=0}^{K} \sum_{i=1}^{I_k} e^{-r s_{ki} k\tau}\, \frac{e^{r s_{ki}(t - \hat{t}^*_{mki})}}{e^{r \hat{t}^*_{mki}}}, \tag{20b} \]

where the $ki$th mutation occurs during the growth period after the $k$th bottleneck at time $t_{mki}$ and with selection coefficient $s_{ki}$ and effective mutation time of $t^*_{mki}$, and the times in the denominator of Equation 20a are adjusted to account for the fact that after each bottleneck the culture returns to having $N_0$ ancestral cells. To simplify interpretation of Equation 20a and calculation of the effective mutation times, let $\hat{t}_{mki} = t_{mki} - k\tau$ and correspondingly $\hat{t}^*_{mki} = t^*_{mki} - k\tau$, where $\hat{t}_{mki}$ is the actual mutation time and $\hat{t}^*_{mki}$ is the effective mutation time measured relative to the time of the $k$th bottleneck. Thus the distributions for the $\hat{t}^*_{mki}$ can be calculated using Equation 19, employing the adjusted mutation times $\hat{t}_{mki}$.

The evaluation of Equation 20 formally requires random sampling from the mutation rate distribution times the instantaneous population of ancestral cells to obtain the mutation times, assigning a selection coefficient to each mutation by sampling from the distribution $\rho(s)$, and finally assigning the corresponding effective mutation times by sampling from the appropriate $F(t^*_m)$ calculated from Equation 19. However, many aspects of its general behavior can be understood much more simply, as discussed in section 3 of the supplemental materials at http://www.genetics.org/supplemental/.

Note that $\hat{P}_m(t)$ is the finite-system analog of the previously derived $P_m(\infty, t)$ from Equation 3. $P_m(\infty, t)$ is the average value of Equation 20 over multiple finite cultures.

ANALYSIS OF THE YFP/CFP COMPETITION EXPERIMENT

Calculation of the fluorescence ratio: Equation 20 allows a full description of the stochastic behavior of the YFP/CFP cocultures. In the coculture the differentially labeled populations grow independently and are independently subject to the statistics of mutation formation and bottlenecks. Thus, if $N_Y$ and $N_C$ are the total numbers of YFP and CFP cells, respectively, and $N_{YA}$ and $N_{CA}$ are the corresponding numbers of ancestral cells, then

\[
\begin{aligned}
N_Y &= N_{YA}\,[1 + \hat{P}_{mY}(t)] \\
N_C &= N_{CA}\,[1 + \hat{P}_{mC}(t)] \\
\mathrm{Log}_{10}(N_Y/N_C) &= \mathrm{Log}_{10}\!\left[\frac{1 + \hat{P}_{mY}(t)}{1 + \hat{P}_{mC}(t)}\right] + \mathrm{Log}_{10}(N_{YA}/N_{CA}).
\end{aligned}
\tag{21}
\]

At the beginning of an experiment $\mathrm{Log}_{10}(N_Y/N_C)$ remains constant (equal to 0 if the initial population sizes are equal) until a sufficient number of mutant cells arise such that $\hat{P}_{mY}(t)$ and/or $\hat{P}_{mC}(t) \sim 1$. Around this time, whose characteristic value is $\bar{T}_1$, variations in $\mathrm{Log}_{10}(N_Y/N_C)$ may develop. The behavior depends on the experimental design and mutational properties of the YFP and CFP populations, which are incorporated into evaluation of Equation 20.

While Equation 20 appears complex, it can be sub-

stantially simplified in a manner that preserves its quan-

titative accuracy and facilitates understanding of the

essential factors that affect the behavior of the cultures.

The simplification results from recognizing that only a

small subset of the terms is significant and that key

parameters are restricted in their values. First, most of

the terms in Equation 20 are equal to 0 because the

effective mutation time for most mutants is infinite, as

indicated by the approach of $F^*(\infty)$ to 1 as $t_{mki}$ increases during a culture period (Figure 5c) (i.e., most are lost due to "drift" in the bottlenecks). Additionally, the nonzero terms have effective mutation times clustered near 0, as also indicated by Figure 5c. Moreover, Equation 2 shows that as time progresses the cells that constitute an appreciable proportion of the mutant population will have selection coefficients in the range where $\rho(s)W(s, t)$ has significant magnitude, which is typically very narrow. Finally, a mutant clone originating at an effective time $t^*_{mki}$ contributes substantially to Equation 20 only if its selection coefficient is larger than any of the selection coefficients of mutations with smaller $t^*_{mki}$. Therefore the

most significant contributors to the mutant population


come from the ordered subset of mutations $[t^*_{mki}, s_{ki}]$, where both monotonically increase and the $s_{ki}$ are restricted to a narrow range just below the maximum available selection coefficient, while the $t^*_{mki}$ are near 0. Thus as time passes clonal succession occurs, with the population being dominated by mutants with selection coefficients tending toward the largest available. Section 4 of the supplemental materials at http://www.genetics.org/supplemental/ calculates the average number of newly arising mutant cells transferred from a culture to its daughters, elucidating the buildup of the total mutant population.

Consider the YFP/CFP coculture at the time period around $\bar{T}_1$, the time when $\hat{P}_{mY}(t)$ and $\hat{P}_{mC}(t) \sim 1$ on average. The behavior of the population ratio depends on the stochastically generated differences in $\hat{P}_{mY}(t)$ and $\hat{P}_{mC}(t)$, which are related to how densely the ranges of effective mutation times and selection coefficients are sampled in the two mutant populations. Imagine that $N_0$ for both the YFP and the CFP populations is very large, so that at $\bar{T}_1$, which is independent of $N_0$, many terms are required to make $\hat{P}_m(t) \sim 1$. Note that $\hat{P}_m(t)$ contains the factor $1/N_0$ (Equation 20) so that increasingly more terms are required as $N_0$ increases. Biologically the increase in the number of terms comes from the proportional increase in the number of mutations that occur due to the larger population size. But if there are many significant terms, then the intervals of both the effective mutation times and selection coefficients between sequential terms in the ordered subset $[t^*_{mki}, s_{ki}]$ must be small due to the limited ranges of the effective values. Although the specific values in $[t^*_{mki}, s_{ki}]$ will differ for the YFP and CFP populations due to the stochastic effects, the behavior of the $\hat{P}_m(t)$'s is insensitive to these differences and for both populations approaches that of $P_m(\infty, t)$ given by Equation 3. Thus the YFP and CFP populations evolve indistinguishably in terms of cell numbers, and the population ratio remains constant. All cocultures will appear to behave the same, and no indication of the internal stochastic differences will be measurable.

As $N_0$ is decreased, fewer terms are required in Equation 20 to result in $\hat{P}_m(t) \sim 1$ at $t \sim \bar{T}_1$, so the intervals between sequential terms in $[t^*_{mki}, s_{ki}]$ increase. The stochastic differences between the exact terms included in $[t^*_{mki}, s_{ki}]$ for the two populations lead to increasing possibilities for differential behavior between $\hat{P}_{mY}(t)$ and $\hat{P}_{mC}(t)$, so that greater variation in the population ratio will occur among a group of "identical" cocultures. The easily measurable effects will include increased variability in the $T_1$'s, increased magnitude of the ratio variations that develop, and a larger proportion of cultures in which the ratio variations are so large that one population effectively displaces the other. If the initial YFP and CFP populations differ substantially in size, for example, one is "small" and the other is "large," then their stochastic behaviors may be quite different in character. Section 5 of the supplemental materials at http://www.genetics.org/supplemental/ presents a rough method of estimating the number of significant terms in Equation 20 or, conversely, estimating the population size above which ratio variations are not expected.

The effect of population size on the stochastic variations in the population ratio can be appreciated by examination of the experimental data of HS (reproduced in section 6 of the supplemental materials at http://www.genetics.org/supplemental/). If their initial population size were increased by a factor of 10, the expected ratio behavior could be estimated by averaging randomly selected sets of 10 of their experimental curves, after first properly transforming the data. Clearly the ratio deviations would begin at approximately the same time, indicating the constancy of $\bar{T}_1$, but would be of substantially reduced magnitude.

Equation 22 shows Equation 21 with the first few nonzero terms explicitly displayed,

\[
\mathrm{Log}_{10}(N_Y/N_C) = \mathrm{Log}_{10}\!\left[
\frac{1 + (1/N_{0Y})\left(e^{-r s_{Y1} k_{Y1} \tau}\, e^{r s_{Y1}(t - \hat{t}^*_{Y1})}/e^{r \hat{t}^*_{Y1}} + e^{-r s_{Y2} k_{Y2} \tau}\, e^{r s_{Y2}(t - \hat{t}^*_{Y2})}/e^{r \hat{t}^*_{Y2}} + \cdots\right)}
{1 + (1/N_{0C})\left(e^{-r s_{C1} k_{C1} \tau}\, e^{r s_{C1}(t - \hat{t}^*_{C1})}/e^{r \hat{t}^*_{C1}} + e^{-r s_{C2} k_{C2} \tau}\, e^{r s_{C2}(t - \hat{t}^*_{C2})}/e^{r \hat{t}^*_{C2}} + \cdots\right)}
\right], \tag{22}
\]

where the offset term due to unequal populations has been neglected. As just discussed, in the regime where substantial population variations occur only a few terms are important. Figure 6a shows the behavior of Equation 22 under the assumptions that the initial population sizes are equal, $\rho(s)$ is a delta function, and only one mutant clone is contributing significant progeny in each population. Since all the selection coefficients are equal, if ratio fluctuations occur due to differences in the effective mutation times, the ratio will stabilize at some constant value after mutant cells dominate both populations. Mathematically the function always reaches a stable value, curve 2. However, this value may be so extreme that it represents the extinction of one of the populations, as illustrated by curve 1. The parameters for the curves are shown at the bottom of Figure 6.
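The single-clone case of Equation 22 is simple enough to evaluate directly. The sketch below is an illustration under stated assumptions, not the calculation used for Figure 6: each population carries one mutant clone that arose in the first growth period (k = 0, so the factor $e^{-rsk\tau}$ equals 1), r = ln 2, and N0 = 10^5 together with the clone parameters are hypothetical example values. With equal selection coefficients the ratio should level off at $r(1+s)(t^*_C - t^*_Y)\,\mathrm{Log}_{10}(e)$; with unequal coefficients the late-time slope should be $(s_Y - s_C)\, r\, \mathrm{Log}_{10}(e)$ per generation.

```python
import math

R = math.log(2)  # assumed growth rate: time measured in ancestral generations

def mutant_to_ancestral_ratio(t, s, t_eff, n0, r=R):
    """One term of Equation 20 for a clone arising in the first growth period (k = 0)."""
    return math.exp(r * s * (t - t_eff)) / (n0 * math.exp(r * t_eff))

def log10_population_ratio(t, clone_y, clone_c, n0, r=R):
    """Equation 22 with a single (s, effective mutation time) clone per population."""
    p_y = mutant_to_ancestral_ratio(t, clone_y[0], clone_y[1], n0, r)
    p_c = mutant_to_ancestral_ratio(t, clone_c[0], clone_c[1], n0, r)
    return math.log10((1.0 + p_y) / (1.0 + p_c))

N0 = 1e5  # hypothetical bottleneck population size

# Equal selection coefficients, different effective mutation times: a plateau.
same_s = [log10_population_ratio(t, (0.10, 0.0), (0.10, 3.0), N0) for t in (0, 150, 300, 600)]
print("equal s:", [round(v, 3) for v in same_s])

# Unequal selection coefficients: a persistent late-time slope.
diff_s = [log10_population_ratio(t, (0.11, 0.0), (0.09, 0.0), N0) for t in (500, 600)]
print("late slope per generation:", (diff_s[1] - diff_s[0]) / 100.0)
```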

The appropriateness of Equation 22 for describing

real experiments can be assessed by comparison of its

behavior to the experimental data of HS. The overall

shapes of the curves in the time period where they begin to depart from $\mathrm{Log}_{10}(N_Y/N_C) = 0$ and the rough magnitudes of the ranges of variation of the curves for the model and the experimental data are qualitatively similar. Thus a model containing single significant mutant clones in the two populations, and employing effective mutation times consistent with the range defined by Figure 5c, reasonably describes these aspects of the data. However, Figure 6a does not properly describe the long-term behavior of the ratio.

The experimental data clearly show that as mutant cells become dominant in both populations, the slope of $\mathrm{Log}_{10}(N_Y/N_C)$ becomes small, but it is not typically 0.

This is a clear indication of differences in the selection


coefficients of the YFP and CFP mutants that are most

prevalent in that time period. This behavior is modeled

very well by employing differences in both the selection

coefficients and the effective mutation times in Equa-

tion 22, as shown in Figure 6b. All but one of the curves

in Figure 6b use one exponential term (one mutant

clone) to describe the mutants in each population. The

slopes of the curves after the mutants have overgrown

the ancestral cells are directly related to selection coef-

ficients of the dominant mutant clones. Taking the

derivative of Equation 22 and assuming only one expo-

nential term in the numerator and the denominator,

one finds

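\[ s_Y - s_C = \frac{\Delta \mathrm{Log}_{10}(N_Y/N_C)}{r\,(\Delta t)\,\mathrm{Log}_{10}(e)} = 3.32\,\frac{\Delta \mathrm{Log}_{10}(N_Y/N_C)}{\Delta t}. \tag{23} \]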

If more than one clone has significant prevalence in

these populations, then the effective selection coeffi-

cients are weighted averages of the several clones. How-

ever, if more than one clone is prominent, then one

expects the slope to progressively change with time as

the one with the highest selection coefficient increases

in prevalence relative to the others. This is illustrated by

curve 4 in Figure 6b. Note that on the timescale of the

actual experiment, the change in slope due to clonal

succession is very subtle, since the selection coefficients

are necessarily closely spaced.
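Equation 23 converts a measured late-time slope into a difference of selection coefficients. The helper below is a minimal sketch with a hypothetical function name; it assumes r = ln 2 per generation, which is the value that makes $1/[r\,\mathrm{Log}_{10}(e)]$ equal to the factor 3.32 quoted in Equation 23. The slope used in the example is an invented number, not a value read from the HS data.

```python
import math

def selection_difference(delta_log10_ratio, delta_t, r=math.log(2)):
    """Equation 23: s_Y - s_C from the change in Log10(N_Y/N_C) over delta_t generations."""
    return delta_log10_ratio / (r * delta_t * math.log10(math.e))

# The prefactor 1 / (r Log10(e)) for r = ln 2.
print("prefactor:", selection_difference(1.0, 1.0))

# Hypothetical trace: Log10(N_Y/N_C) rises by 0.6 over 100 generations.
print("s_Y - s_C:", selection_difference(0.6, 100.0))
```

Setting one selection coefficient to 0, as is done in the text for the estimate of the maximum selection coefficient, turns the same expression into an estimate of the other coefficient.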

In summary, if the population sizes are sufficiently large, multiple mutations will contribute significant proportions of mutant cells at times on the order of $\bar{T}_1$. This corresponds to having many significant terms in Equation 20, and ratio variations will be of low amplitude. If the population sizes are sufficiently small, then Equation 22 with only one exponential term in the numerator and the denominator describes the experimental system, and one expects substantial ratio variations. Moreover, the smaller the population sizes are, the larger the proportion of culture series that will be entirely overtaken by either the YFP or the CFP cells, e.g., curve 1 in Figure 6, a and b.

Measurement of bacterial characteristics from the

experimental data: As indicated in the discussion fol-

lowing Equation 22, the experimental data from HS are

qualitatively described by a very simple model contain-

ing only one prominent mutant clone in the YFP and

CFP populations. Quantitative determination of biolog-

ical parameters of the bacteria based on the model is

Figure 6. Characteristic behavior of $\mathrm{Log}_{10}(N_Y/N_C)$ as a function of time. (a) If all mutants have the same selection coefficient, $\rho(s)$ equals a delta function, and two basic behaviors are possible for $\mathrm{Log}_{10}(N_Y/N_C)$. The ratio either departs from 0 and continues to increase (or decrease) until one population displaces the other (curve 1) or stabilizes at a constant value when mutants come to dominate both populations (curve 2). The effective mutation times $t^*_m$ and the selection coefficients for the mutant clones for both a and b are shown at the bottom. (b) If the distribution of selection coefficients has finite width, then the behavior is more complex. The curves show the behavior of Equation 22 with either one or two (curve 4 only) mutant clones dominating $\hat{P}_m(t)$ for each population. In curve 1, a mutant occurs first in the YFP population, and it completely dominates the culture. Curve 3 shows the behavior if mutants with the same selection coefficient occur in both populations but at different effective times. This is unlikely to be stable as time progresses unless both selection coefficients are at the maximum available value, since a clone with higher $s$ will eventually become dominant. Curve 4 shows two significant mutant clones in the YFP population with slightly different values of $s$ and effective mutation times and a single significant clone in the CFP population with a value of $s$ between those in the YFP mutants. In the YFP population the mutant with larger $s$ must occur at a later effective time or the lower-$s$ mutant would never affect the behavior of the culture. Note that the curve departs from 0 in the positive direction and bends over to develop a negative slope, which slowly becomes less negative as the larger-$s$ clone becomes more dominant in the YFP population. Curve 5 shows the behavior with single prominent mutant clones in both populations. Finally, curve 6 shows the effect of single mutants in each population with nearly identical selection coefficients and effective mutation times. The curve hardly departs from 0.


straightforward in this case. Once these parameters are

obtained, they can be checked to determine if they are

quantitatively consistent with the single-mutant clone

description.

Estimation of $T_1$: $T_1$ can be determined by examining the timing of the initial departures of $\mathrm{Log}_{10}(N_Y/N_C)$ from 0. $T_1$ is the time at which $\hat{P}_m(t) = 1$ for a particular population. If this happens in the YFP population significantly prior to the CFP population, then $T_1$ is just the time when $\mathrm{Log}_{10}(N_Y/N_C) = 0.3$ (or $-0.3$ if the reverse occurs). More generally, both $\hat{P}_{mY}(t)$ and $\hat{P}_{mC}(t)$ have some significant value, so that $|\mathrm{Log}_{10}(N_Y/N_C)|$ will be $< 0.3$ at $t \sim \bar{T}_1$. Thus a reasonable estimate for $T_1$ is the time when many of the experimentally measured traces of $\mathrm{Log}_{10}(N_Y/N_C)$ from a collection of culture series have significant departures from 0, but do not fully reach $\pm 0.3$. For the experimental data of HS, $T_1 \approx 150$ generations. A more complex statistical fitting of Equation 21 to the data would produce a better estimate for $\bar{T}_1$. The value of such a procedure depends on the noise level of the data.
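The threshold of 0.3 comes from Equation 21: when the mutants in one population equal its ancestral cells while the other population is still essentially all ancestral, the ratio is $\mathrm{Log}_{10}(2) \approx 0.30$. A minimal sketch follows; the function name and the example values of the mutant-to-ancestral ratios are hypothetical.

```python
import math

def log10_ratio(p_my, p_mc, n_ya=1.0, n_ca=1.0):
    """Equation 21: Log10(N_Y/N_C) from the mutant-to-ancestral ratios of each population."""
    return math.log10((1.0 + p_my) / (1.0 + p_mc)) + math.log10(n_ya / n_ca)

print(log10_ratio(1.0, 0.0))   # YFP mutants reach parity first
print(log10_ratio(0.0, 1.0))   # CFP mutants reach parity first
print(log10_ratio(1.0, 0.5))   # both populations carry mutants: magnitude below 0.3
```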

Estimation of the maximum effective selection coefficient for the mutations: The maximum value of $s$ can be estimated by examining the population-ratio curves for all of the experimental cultures to find the maximum slope of $\mathrm{Log}_{10}(N_Y/N_C)$ after its initial departure from 0. This typically occurs in a culture series where one population completely and rapidly overtakes the other, presumably because a mutation with a near-maximum selection coefficient had a very early effective occurrence time (or perhaps was preexisting in the population). This approach leads to an estimate of $s_{\max} \approx 0.11$, using Equation 23 with one selection coefficient set equal to 0. To assist with the analysis, lines with various slopes have been added to the data figures of HS that are reproduced in the supplemental materials at http://www.genetics.org/supplemental/.

The analysis assuming a single-mutant clone is self-consistent. Using the procedure in section 5 of the supplemental materials at http://www.genetics.org/supplemental/, Equation S8 estimates that there are typically 1.1 significant mutant clones in both populations. Of course, some cultures will by chance have more than one significant mutant clone in one or both populations, but as mentioned before, these cultures will typically have relatively low population-ratio deviations, and initial slopes of the curves will be relatively small compared to the others.

Estimation of the width of the effective selection coefficients: Examination of the slopes of $\mathrm{Log}_{10}(N_Y/N_C)$ for the experimental data at times after mutants dominate both populations finds approximately 32 culture series with $0 < |s_Y - s_C| < 0.02$ and only 5 with $0.02 < |s_Y - s_C| < 0.04$. (Given the large number of curves contained in the HS figures, and the measurement noise, it is difficult to be precise in these estimates.) Therefore most of the mutants must have nearly identical selection coefficients, and these must be very close to $s_{\max}$ given the large slope of the curves during their initial departure from 0. Thus one would estimate that most of the selection coefficients of the mutants that are significant in these experiments fall into the range of 0.09–0.11. This is just the behavior expected for a system with an effective mutation distribution described by $\rho(s)W(s, t)$ from Equation 2. Given the strong dependence of $W(s, t)$ on $s$, and the noise in the data, it is very difficult to determine much about the actual shape of $\rho(s)$ for the bacteria. Alternate experimental designs that may reveal more details of $\rho(s)$ are proposed in the discussion.

Estimation of the mutation rate: Finally, assuming a uniform distribution for the selection coefficients for new mutations, Equation 10a finds that $\mu \approx 10^{-5}$ if $T_1 \approx 150$ generations. The numerical value obtained for $\mu$ depends on the assumed form for $\rho(s)$. The inferred mutation rate would be much lower if a larger proportion of the weight of $\rho(s)$ were at higher selection coefficients and would be much higher if $\rho(s)$ were preferentially weighted toward 0. For comparison, if $\rho(s)$ were a delta function, the estimate from Equation 10b for the mutation rate would be approximately $8.5 \times 10^{-7}$. Since all shapes of $\rho(s)$ lead to a very narrow distribution of effective selection coefficients, the value of the mutation rate for the delta-function distribution is an estimate of an effective mutation rate for the system; the other mutations that may occur will have limited discernible effect on the cultures. Therefore it represents a "lower bound" estimate of the actual mutation rate for these cells.

DISCUSSION

This article has presented an analytical description of

competing asexual populations that highlights the es-

sential processes underlying their behavior. Beginning

with a qualitative consideration of the competing popu-

lations (Figure 1), it introduced a hypothetical con-

struction that established the relationship of a single

large culture in unbounded exponential growth to a set

of embedded sequential finite cultures that are sepa-

rated by bottlenecks (Figure 2). The finite cultures are

subject to stochastic variation due to the mutational

processes and the random assortment of mutants at the

bottlenecks, but their average behavior is identical to

the "infinite" culture. Since the infinite culture is described

by the standard equations of exponential

growth, average characteristics of the stochastic pro-

cesses could be readily calculated.

Consideration of the infinite, continuously expanding system immediately yielded the result that the distribution of selection coefficients in the mutant cells is given by $\rho(s)W(s, t)$ (Equation 2). As time increases, $W(s, t)$ increases more strongly with increasing $s$ (Figure 2b) so that the distribution of effective selection coefficients


becomes progressively more confined to a narrow range near the maximum $s$ permitted by $\rho(s)$. Since this must also be true on average for the embedded finite cultures, Equation 2 is the analytical description of the "equivalence principle" that HS found through extensive numerical simulation.

The qualitative description of the behavior of a series

of finite cultures (Figure 1b) highlighted the impor-

tance of the times T1, when the mutant and ancestral

populations are of equal magnitude, for determining

whendiscerniblevariationsinthepopulationratiosmay

begin to occur in competition experiments. Analysis of

the infinite culture also allowed determination of the

characteristic value for this time, T1: Thus the distribu-

tion of selection coefficients that are prominent in the

mutant populations, and the characteristic times at

which ratio variations might occur in competition

experiments, could be calculated without consideration

of the details of the stochastic processes.

The stochastic aspects of the cultures were addressed

by analyzing the fate of the progeny of a single mutation,

again considering a hypothetical experiment in which

all of the cells in a culture are transferred to daughter

cultures (Figure 4). This analysis demonstrated that

after a sufficient number of bottlenecks, the progeny of

a mutated cell expand in a lineage of daughter cultures

as a simple exponential, but with an apparent mutation

time that is shifted from the actual mutation time

according to readily calculable statistical distributions

(Equations 15 and 19 and Figure 5). Summing over

multiple mutants provided a complete description of

the system (Equation 20).

Full evaluation of Equation 20 requires substantial

effort. However, the results of the average value calculations,

specifically those for T1 and the demonstration

of the narrow range of effective selection coefficients,

and the restricted range of effective mutation times

derived from the statistical analysis, permitted a quali-

tative understanding of the basic behavior of Equation

20. These considerations lead to a dramatically simpli-

fied single-clone (or several-clone) description of a

system of competing populations (Equation 22), which

is applicable to the experimental regime where varia-

tions in the population ratios are prominent. Thus the

interplay between a detailed statistical analysis and

simple calculations of average values of various param-

eters for infinite populations leads to a tractable

analytical description of competition experiments.

The analysis of experimental data to extract quanti-

tative estimates of parameters of the mutational process

is straightforward if the population sizes are small

enough so that only a single exponential term is required

in Equation 22 to represent the mutant cells. The

maximum slope that the population ratio achieves after

its initial departure from horizontal gives an estimate of

the maximum value of the selection coefficient available

to the mutant cells, and the average time at which nonzero

slopes appear allows an estimate of T1. If the population

sizes are intermediate, so that ratio changes are

observable but progeny of multiple mutations contrib-

ute, then the maximum deviations in the ratios are

reduced since mutant clones become significant in both

populations at about the same time and with about the

same values of selection coefficients. In this case extrac-

tion of the biological parameters from the data would

require a more complex statistical fitting process.
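
For the single-clone regime, the estimation procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the fitting procedure applied to the HS data. It assumes that one labeled population carries a single mutant clone, so that its size relative to the other population is 1 + exp(s(t - T1)), with time measured in units in which a mutant of coefficient s gains a factor exp(s) per unit time. The exact form of Equation 22 may differ, and the function names and parameter values are hypothetical.

```python
import numpy as np

def log_ratio(t, s, t1):
    """ln(population ratio) when one population carries a single mutant clone (assumed form)."""
    return np.log1p(np.exp(s * (t - t1)))

def estimate_s_and_t1(t, y):
    """Estimate s from the maximum slope and T1 from the time the ratio reaches 2."""
    slopes = np.diff(y) / np.diff(t)
    s_hat = slopes.max()
    # At t = T1 mutant and ancestral cells are equal in number, so the ratio is 2.
    t1_hat = np.interp(np.log(2.0), y, t)   # y is increasing, as np.interp requires
    return s_hat, t1_hat

# Noise-free synthetic curve with hypothetical parameter values.
t = np.linspace(0.0, 400.0, 401)
y = log_ratio(t, s=0.11, t1=150.0)
s_hat, t1_hat = estimate_s_and_t1(t, y)
print(f"estimated s = {s_hat:.3f}, estimated T1 = {t1_hat:.1f}")
```

With noisy measurements the maximum of the point-to-point slopes would be biased upward, so a straight-line fit to the late, linear part of the curve would be a more robust choice than the simple maximum used here.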

Application of the single-mutant clone version of

Equation 22 to the experimental data of HS estimates

that the maximum selection coefficient of the mutations

is s ≈ 0.11 and that the mutation rate is m = 10^-5 or

8.5 × 10^-7, depending on whether one assumes that r(s)

is uniform or a delta function. By contrast, HS found

s ≈ 0.054 and m ≈ 10^-6.7 = 2 × 10^-7 when using their

procedures to fit a delta-function model to their data

(their Figure 3B). The reasons for the considerable dif-

ferences in the estimates produced by the two modeling

approaches require elucidation.

One of the benefits of the analytical approach taken

here is that it reveals general relationships that may be

obscured in more complex computational analyses. For

example, exponential fitness distributions, r(s) = (1/k)e^(-ks),

have been postulated as being relevant to real

biological systems (Orr 2003). However, if the expected

behavior of a culture with such a distribution is calculated

using Equation 3, one finds that at times t > k/r

the product r(s)W(s, t) is unbounded with increasing s.

Thus the fraction of mutant cells becomes infinite at

finite times. This is a physically unreasonable result and

indicates that in real systems the fitness distribution

must go to 0 more rapidly than exponentially. An exponential

truncated at some maximum s has the desired

properties, and indeed truncation is assumed in the

distributions proposed by Orr (2003).
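
The divergence can be made explicit with a short calculation. As a sketch, assume that for large s the weighting function behaves as W(s, t) ∝ e^{rst}/s, where r without an argument is the growth rate that appears in the condition on t, and r(s) is the fitness distribution. This asymptotic form is an assumption used here for illustration; the exact expression is the one defined earlier in the article.

```latex
r(s)\,W(s,t) \;\propto\; \frac{1}{k}\,e^{-ks}\cdot\frac{e^{rst}}{s}
             \;=\; \frac{e^{(rt-k)s}}{k\,s}
```

For t < k/r the exponent is negative and the product decays with increasing s. For t > k/r the exponent is positive, the product grows without bound, and its integral over s diverges. Truncating r(s) at a maximum s removes the divergence because the integral then extends over a finite range only.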

The information concerning the shape of r(s) that

can be obtained from the HS measurements is limited

because the experimental behavior is so strongly dom-

inated by the limited range of selection coefficients

defined by r(s)W(s, t). Altering aspects of the experi-

mental design may allow one to obtain more informa-

tion. For example, one could substantially reduce the

population sizes in the experiment and correspond-

ingly increase the number of culture series. Such a

change would not affect the characteristic time, T1, at

which the mutant populations will become significant,

but would dramatically increase the variation in the T1’s

for the YFP and CFP populations in each culture series.

This would increase the fraction of culture series in

which one population completely displaces the other,

e.g., the curves labeled “1” in Figure 6, a and b. These

displacements are predominantly due to single-mutant

clones, and determination of the frequency with which

different slopes are obtained during the displacement

process provides a direct measure of the shape of r(s).

This design should allow measurement of selection

coefficients that are significantly below the maximum

since the small population size would give these mutants

a greater chance of becoming dominant prior to ap-

pearance of another mutant clone with larger s. The

smaller the population size is, the lower the values of s

that could be addressed in such an experiment, but the

larger the number of series of cultures that would be

needed.

Another possible approach would be to perform

single-cell comparative growth experiments on a mas-

sive scale. This could be done by first growing several

series of large cultures with all bacteria containing the

same label, say YFP, for a time on the order of T1.

Approximately half of the cells in these mass cultures

would be mutant, with the mutant population distributed

in s according to r(s)W(s, t). Growing the cultures for

different times would provide different proportions of

mutant and ancestral cells and somewhat different

frequency distributions given the change in W(s, t) with

time. Shorter growth times would yield a broader dis-

tribution of selection coefficients for the mutants but a

greater proportion of the cells would be normal, while

longer times would produce a greater proportion of

mutant cells but a narrower frequency distribution for s.

Selection of a multitude of single YFP cells from the

mass cultures and individually comparing their growth

rates to normal CFP-labeled cells would allow measurement

of the shape of r(s)W(s, Tm), where Tm is the

length of the initial mass culture. Division by W(s, Tm)

would provide a determination of r(s). The range of s

for which a reasonable estimate could be obtained

would depend on the number of mutants that are

measured and the accuracy of the measurements.
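
The division step can be sketched as follows. This is an illustration only. It assumes W(s, t) proportional to (exp(r s t) - 1)/s, which is the form obtained when mutants arise at a constant rate per ancestral cell during exponential growth; the article's own expression for W should be substituted. It also works with an idealized, noise-free histogram, and all names and parameter values are hypothetical.

```python
import numpy as np

def weight(s, t, growth_rate=1.0):
    """Assumed form of W(s, t); the article's own expression should be substituted."""
    return np.expm1(growth_rate * s * t) / s

def recover_r(s_bins, observed, t_m):
    """Divide the observed distribution of s by W(s, Tm) and renormalize to unit sum."""
    r_est = observed / weight(s_bins, t_m)
    return r_est / r_est.sum()

# Idealized check: start from a known r(s), form r(s) W(s, Tm), then recover r(s).
s_bins = np.linspace(0.005, 0.1, 20)       # bin centers, hypothetical
true_r = np.exp(-30.0 * s_bins)            # example shape: truncated exponential
true_r /= true_r.sum()
t_m = 100.0                                # duration of the mass culture, hypothetical
observed = true_r * weight(s_bins, t_m)
observed /= observed.sum()                 # what the single-cell assays would measure
print(np.allclose(recover_r(s_bins, observed, t_m), true_r))
```

With real measurements, bins at low s would contain few mutants because W(s, Tm) is small there, so the recovered r(s) would be noisiest in exactly the range that the division amplifies most.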

The better the experimental means of measuring the

two cell populations is, the shorter the culture time

required for the measurement and the more accurate

the result. Thus high-sensitivity fluorescence monitor-

ing of growth in microtiter plates, or single-cell analyt-

ical procedures such as flow cytometry or fluorescence

microscopy, or perhaps highly parallel microfluidic

measuring systems, would be beneficial. Ideally one

might add one YFP cell to one ancestral CFP cell and

determine the relative numbers at several time points

during a single growth phase without the introduction

of any bottlenecks. However, it is important to note that

such an approach would measure the selection coef-

ficients under conditions different from those of the

original selection. Specifically, if a mutation provided its

advantage during the initial mass culture by increasing

the ability of cells to divide as the culture enters and/or

leaves stationary phase, then that advantage will not be

properly quantified.

In conclusion, the analytical approach presented

here allows a qualitative and quantitative interpretation

of the behavior of many aspects of competing asexual

populations without the need to perform detailed nu-

merical simulations. Analytical expressions for the nar-

rowing of the range of effective selection coefficients

and for various characteristic times during the adapta-

tion of a population to new selective conditions are

presented. The stochastic behavior of the population is

accurately described, and the critical aspects of the

description are highlighted to allow a simplified quan-

titative interpretation of experimental data. This analysis

may be useful in considering alternate experimental

designs that enhance probing of specific aspects of the

adaptation process.

LITERATURE CITED

Hegreness, M., N. Shoresh, D. Hartl and R. Kishony, 2006 An

equivalence principle for the incorporation of favorable mutations

in asexual populations. Science 311: 1615–1617.

Orr, H. A., 2003 The distribution of fitness effects among beneficial

mutations. Genetics 163: 1519–1526.

Communicating editor: A. D. Long
