
Copyright 1998 by the Genetics Society of America

Maximum Likelihood Estimation of Population Growth Rates

Based on the Coalescent

Mary K. Kuhner, Jon Yamato and Joseph Felsenstein

Department of Genetics, University of Washington, Seattle, Washington 98195

Manuscript received February 24, 1997

Accepted for publication January 2, 1998

ABSTRACT

We describe a method for co-estimating 4N_eμ (four times the product of effective population size and neutral mutation rate) and population growth rate from sequence samples using Metropolis-Hastings sampling. Population growth (or decline) is assumed to be exponential. The estimates of growth rate are biased upwards, especially when 4N_eμ is low; there is also a slight upwards bias in the estimate of 4N_eμ itself due to correlation between the parameters. This bias cannot be attributed solely to Metropolis-Hastings sampling but appears to be an inherent property of the estimator and is expected to appear in any approach which estimates growth rate from genealogy structure. Sampling additional unlinked loci is much more effective in reducing the bias than increasing the number or length of sequences from the same locus.

The genealogical structure of a sample from a population contains information about that population's history. The distribution of coalescence times (times at which two of the sampled individuals have a common ancestor) depends on the effective population size N_e: in a diploid population the distribution is proportional to 4N_e. Since coalescence times cannot be directly observed in most cases, but only inferred from the accumulation of mutations, we rescale time proportional to the per-site neutral mutation rate μ. Thus, though we cannot estimate 4N_e directly, we can estimate the product 4N_eμ, which we will call Θ.

If the population size has changed over time, the distribution of coalescence times will differ from its expectation in a population where Θ is constant, and in principle this should be detectable. In particular, if the population has been growing the most rootward branches will be relatively short, whereas if it has been shrinking the most rootward branches will be relatively long.

We have previously described a method for estimating Θ in a population of constant size (Kuhner et al. 1995), using Metropolis-Hastings sampling (Metropolis et al. 1953; Hastings 1970) of genealogies. The basic strategy is to sample genealogies based on their posterior probability with regard to the data and a trial value of Θ, and then use the sampled genealogies to evaluate the relative likelihood of other values of Θ. This importance sampling approach concentrates the sampled genealogies in regions of high posterior probability, which is much more efficient than using random genealogies and avoids the bias of using only a single genealogy reconstruction. This algorithm is implemented in our Coalesce program.

In this paper we extend the Metropolis-Hastings genealogy sampling approach to the case of a population experiencing exponential growth or decline. In this case population size is represented by two parameters: the exponential growth rate g and the present-day value of Θ (that is, the value at the time when the organisms were sampled). The parameters are not independent: the more rapidly a population has grown, the larger its current size is expected to be compared to its "average" size. We have written a program, Fluctuate, which implements this sampler.

Both analytic and simulation results show that the estimate of the growth rate g is biased upwards when a finite number of individuals are analyzed. At least two factors are at work in this bias: the nonlinear relationship between coalescence times and the estimate of g, and truncation of the coalescent distribution, in genealogies of finite numbers of individuals, by the bottommost coalescence. There is also a smaller upwards bias in Θ due to the correlation between the two parameters. The bias in these estimators can most effectively be reduced by sampling multiple loci.

The method proposed by Griffiths and Tavaré (1994) for estimation of growth rate uses a different strategy for defining and sampling genealogies, but shares a common mathematical rationale. It should therefore experience the same bias. Further testing will be required to compare the effectiveness of these two methods. Other approaches to estimating growth, such as the pairwise measures of Slatkin and Hudson (1991) and Rogers and Harpending (1992), use less of the information present in the data and should be less efficient (Felsenstein 1992); the genealogical methods are at a particular advantage when the growth rate g is low or negative, a case in which pairwise methods tend to fail due to the confounding influence of the genealogical structure (Slatkin and Hudson 1991).

Corresponding author: Mary K. Kuhner, University of Washington, Department of Genetics, Box 357360, Seattle, WA 98195-7360. E-mail: mkkuhner@genetics.washington.edu

Genetics 149: 429–434 (May, 1998)

MATERIALS AND METHODS

A series of genealogies generated under a given Θ_0 and g_0 can be used to determine the likelihood L(Θ,g) for other values of Θ and g. For each genealogy G a product is taken over all coalescence intervals i: in each interval, k is the number of lineages in the genealogy during that interval, t_s is the time at the tipward end, and t_e is the time at the rootward end. Note that these are not rescaled times:

$$P(G \mid \Theta, g) = \prod_i \frac{e^{g t_e}}{\Theta} \exp\left[\frac{k(k-1)}{\Theta g}\left(e^{g t_s} - e^{g t_e}\right)\right].$$
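As a rough illustration of how this interval product might be evaluated, the Python sketch below computes its logarithm for one genealogy and then averages probability ratios over a set of sampled genealogies, which is the reweighting that yields L(Θ,g)/L(Θ_0,g_0) later in this section. The function names and the representation of a genealogy as a list of (k, t_s, t_e) tuples are assumptions made for this example and are not taken from Fluctuate; the log-sum-exp step is one common way to limit overflow, not necessarily the approach used by the authors.

```python
import math


def log_prob_genealogy(intervals, theta, g):
    """Log of the interval product P(G | theta, g), constants omitted.

    intervals: list of (k, t_s, t_e) tuples, with k lineages between the
    tipward time t_s and the rootward time t_e (unscaled times). This
    layout is an assumption of the sketch.
    """
    total = 0.0
    for k, t_s, t_e in intervals:
        if g == 0.0:
            # Limit g -> 0 of (e^{g t_s} - e^{g t_e}) / g.
            growth_term = t_s - t_e
        else:
            # expm1 keeps precision when g * t is small; very large g * t
            # can still overflow in this simple version.
            growth_term = (math.expm1(g * t_s) - math.expm1(g * t_e)) / g
        total += g * t_e - math.log(theta) + k * (k - 1) * growth_term / theta
    return total


def log_relative_likelihood(genealogies, theta, g, theta0, g0):
    """log[L(theta, g) / L(theta0, g0)] as a mean of probability ratios."""
    log_ratios = [
        log_prob_genealogy(G, theta, g) - log_prob_genealogy(G, theta0, g0)
        for G in genealogies
    ]
    # Log-sum-exp: factor out the largest term before exponentiating.
    peak = max(log_ratios)
    mean_scaled = sum(math.exp(r - peak) for r in log_ratios) / len(log_ratios)
    return peak + math.log(mean_scaled)
```

Working with logarithms is a design choice of the sketch: the ratio of two interval products can be far outside floating-point range even when its logarithm is modest.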

The Metropolis-Hastings genealogy sampler for constant-sized populations (Kuhner et al. 1995) works by a two-phase process. It begins with an initial genealogy and an initial value of Θ, called Θ_0. In the first phase, a new genealogy is created by locally rearranging the previous genealogy in proportion to the coalescent prior probability P(G|Θ_0) (given by Kingman 1982a,b). In the second phase, this genealogy is accepted or rejected based on P(D|G), the probability of the sequence data on the genealogy. This is equivalent to sampling from the posterior probability, which is proportional to P(G|Θ_0)P(D|G). This process is repeated, with samples taken from it at intervals to produce a set of genealogies from which a maximum likelihood estimate of Θ can be made. The estimate is most efficient when Θ_0 is close to Θ, so it is useful to run several iterations of the sampler, using the estimated Θ of each iteration as the starting Θ_0 of the next.
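The two-phase step can be illustrated with a deliberately simplified Python sketch. Here a single real number stands in for the genealogy, proposals are drawn independently from a standard normal "prior" rather than by local rearrangement, and a toy normal density plays the role of P(D|G); all of these, along with the function names, are assumptions of the example. It is meant to show only the logic of the acceptance rule: when proposals come from the prior, accepting on the ratio of data likelihoods samples from the posterior.

```python
import math
import random


def log_data_likelihood(x, datum):
    # Toy stand-in for log P(D|G): unit-variance normal centred on x.
    return -0.5 * (datum - x) ** 2


def run_chain(datum, steps, sample_every, rng):
    """Run one chain and return the states kept at regular intervals."""
    x = rng.gauss(0.0, 1.0)  # initial state drawn from the toy prior
    samples = []
    for step in range(1, steps + 1):
        # Phase 1: propose a new state from the prior.
        proposal = rng.gauss(0.0, 1.0)
        # Phase 2: accept with probability min(1, P(D|new) / P(D|old)).
        log_ratio = (log_data_likelihood(proposal, datum)
                     - log_data_likelihood(x, datum))
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = proposal
        if step % sample_every == 0:
            samples.append(x)
    return samples
```

For this toy model the posterior is normal with mean datum/2, so the mean of the retained samples gives a simple check that the acceptance rule behaves as intended.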

Like most calculations involving the coalescent, these equations hold exactly only in the limit as the population size N goes to infinity: in practice the approximation involved should be insignificant as long as the number of individuals sampled is less than the square root of the population size.

Mutational model: We used the DNA/RNA sequence model of Felsenstein (1981), which allows unequal base frequencies and transition/transversion bias, extended as in Felsenstein and Churchill (1996) to allow for variable rates among sites and auto-correlation of those rates. It is simple to substitute any other mutational model for which P(D|G) can be calculated: for example, models appropriate to protein or microsatellite data. The algorithm as designed does not estimate parameters of the mutational model.

Scaling for population growth: When the size of the population changes exponentially through time, the coalescent prior becomes P(G|Θ_0,g_0), where g_0 is a trial value of the exponential growth rate g. (Positive values of g indicate population growth, and negative values indicate decline.) The units of g are 1/μ generations.

In order to sample coalescence times from this prior, we use a time rescaling under which it becomes identical to the simpler constant-population prior. Time is scaled proportional to growth, so that the same expected amount of coalescence occurs in one unit of time regardless of population size. Under this transformation, the coalescent structure of the genealogy becomes identical to the constant-population expectation.
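A minimal Python sketch of such a rescaling follows. It assumes that time is measured as a positive quantity τ backwards from the present and that Θ at time τ is Θe^{−gτ}, so that the relative coalescence rate is e^{gτ} and the rescaled time is its integral, (e^{gτ} − 1)/g. This is one reading, consistent with the Slatkin and Hudson relation quoted in this section if t there is taken to be negative for past times; the function names are hypothetical.

```python
import math


def to_rescaled(tau, g):
    """Rescaled time for a backward time tau >= 0 under growth rate g."""
    if g == 0.0:
        return tau
    return math.expm1(g * tau) / g


def from_rescaled(T, g):
    """Backward time for a rescaled time T >= 0.

    Returns math.inf when no finite time corresponds to T, which can
    only happen for g < 0.
    """
    if g == 0.0:
        return T
    if 1.0 + g * T <= 0.0:
        return math.inf
    return math.log1p(g * T) / g
```

Under these assumptions, with g = −10 any rescaled time of 0.1 or more maps to infinite ordinary time, which is the situation in which a proposed genealogy would have to be rejected.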

The rescaled time T is derived from the original time t by the following relation (Slatkin and Hudson 1991). The negative sign in the exponent is due to the fact that we are considering times previous to the present:

$$T = \frac{1}{g}\left(1 - e^{-gt}\right).$$

This rescaled time is then substituted for ordinary time in constructing rearrangements of the genealogy. In cases where g is less than zero, some proportion of the rescaled times will correspond to infinite ordinary time. Our implementation rejects genealogies which contain infinite times, on the grounds that their likelihood for biologically reasonable data will tend to be very small. An upwards bias may be created by this procedure, but in practice it should be trivial.

The formula for P(G|Θ,g) can be shown to be equivalent to that given in Griffiths and Tavaré (1994), bearing in mind that they scaled time in units of N generations rather than 1/μ generations and they considered a haploid rather than a diploid case: they also retained some combinatorial constants which we omit, since we are concerned only with ratios of probabilities.

This probability is then corrected for the importance sampling function P(D|G)P(G|Θ_0,g_0) (where n is the number of genealogies sampled):

$$\frac{L(\Theta, g)}{L(\Theta_0, g_0)} \approx \frac{1}{n}\sum_G \frac{P(G \mid \Theta, g)}{P(G \mid \Theta_0, g_0)}.$$

[The terms P(D|G) drop out as they are the same for all values of Θ and g.]

The maximum of this function, which is a joint maximum likelihood estimate of Θ and g, can be found by standard methods. Technical difficulties are often encountered due to arithmetic overflow in exponentiation and the characteristic curving-ridge shape of the likelihood surface.

Multiple loci: The likelihoods can be multiplied together across unlinked loci to generate an overall multi-locus likelihood. Doing so should greatly improve the efficiency of the estimate, especially for g, since doubling the number of loci doubles the amount of information available about the most rootward parts of the genealogy (which are the most informative for growth rate, since they represent the population size most divergent from the modern-day size). Adding additional sequences mainly adds information about the most tipward parts of the genealogy, which contain relatively little information about growth.

If the loci to be combined cannot be assumed to have the same values for the parameters, this must be taken into account when combining them. It is reasonable to assume that the population growth rate affects all loci equally (barring selection), but both the neutral mutation rate μ and the effective population size N_e can vary among loci (for example, N_e is lower for a mitochondrial locus than for a nuclear one). This can easily be accommodated if the relative values of the parameters for different loci are known (or can be assumed): we simply replace N_e and μ with appropriate locus-dependent functions when calculating the multi-locus likelihood. In the future, a method for dealing with unknown variability in μ among loci could be developed by assuming Gamma distributions for the parameters and integrating over the range of possibilities.

Assessing the accuracy of the estimate: An advantage of likelihood methods is that information about the accuracy of the estimate can be gleaned from the likelihood curve. We will consider the confidence interval as the set of all parameter values which would not be rejected (via a likelihood ratio test) at the given level. Asymptotically, as the number of loci approaches infinity, the shape of the likelihood curve becomes Gaussian (normal) and we can construct a variance for it using a χ² metric with two degrees of freedom (Cox and Hinkley 1974, p. 314). Using this approach, the area of the parameter space in which the log likelihood is no more than three units below the maximum can be taken as a rough 95% confidence interval.
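The three-unit rule described under "Assessing the accuracy of the estimate" can be checked with a short calculation. For two degrees of freedom the χ² distribution function has the closed form 1 − e^{−x/2}, so the 95% point is −2 ln 0.05 ≈ 5.99; since the likelihood ratio statistic is twice the difference in log likelihood, the corresponding drop is about 3.0 log units. The helper names below are illustrative only.

```python
import math


def chi2_two_df_quantile(level):
    # With 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2).
    return -2.0 * math.log(1.0 - level)


def log_likelihood_drop(level):
    # The likelihood ratio test compares 2 * (lnL_max - lnL) with the quantile.
    return 0.5 * chi2_two_df_quantile(level)


def in_confidence_region(lnL, lnL_max, level=0.95):
    """True if a parameter pair with log likelihood lnL is not rejected."""
    return lnL_max - lnL <= log_likelihood_drop(level)
```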


Such confidence intervals will be approximate at best for finite numbers of loci. It is not obvious a priori whether bias present in the maximum likelihood parameter estimates will also strongly affect the confidence intervals. We have not solved this problem analytically, but we can assess the usefulness of the approximate confidence intervals by simulation.

Simulation procedures: Each simulation consisted of 100 replicates. Genealogies of 25 sequences were randomly generated according to given values of Θ and g, and DNA data were generated randomly from these genealogies using a Kimura 2-parameter model (Kimura 1980) with a transition/transversion ratio of 2.0. In the following description a "step" is the construction of a single genealogy; a "chain" is a set of such genealogies used to make a parameter estimate, which can then be used to set initial parameters for the following chain. For both the exponential-growth program Fluctuate and our constant population size program Coalesce (used for comparison), we used the following search strategy: for each locus, 10 short chains of 1000 steps each were run, followed by 2 long chains of 15,000 steps each, sampling every 20th step. We provided the programs with the correct transition/transversion ratio. For initial estimates of Θ we used Watterson's estimate (Watterson 1975); for initial estimates of g we arbitrarily chose 1.0. Initial genealogies were generated using Phylip programs (Felsenstein 1993, version 3.5c): Dnadist to produce corrected distances from the sequence data, and Neighbor to generate Unweighted Pair-Group Method using Averages (UPGMA) genealogies from these distances.

We also performed simulations in which we made maximum likelihood estimates assuming that the true genealogy was known without error. This is equivalent to using infinitely long sequences, since with such sequences the Metropolis-Hastings sampler should unerringly generate the true genealogy. We have called these results "infinite sites" in the Tables.

For each estimation, we noted whether or not the log likelihood for the true Θ and g was within three units of the log likelihood at the maximum, i.e., whether or not the truth could be rejected at the approximate 95% level.
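The random generation of genealogies for given Θ and g can be sketched as follows. This is not the authors' simulation code: it assumes that with k lineages the coalescence rate at backward time t is k(k − 1)e^{gt}/Θ (the rate implied by the interval product in MATERIALS AND METHODS), draws each waiting time by inverting the cumulative rate, and records only the coalescence times, not the tree topology or the sequence data. The function name is hypothetical.

```python
import math
import random


def simulate_coalescence_times(n, theta, g, rng):
    """Coalescence times, back from the present, for n sampled sequences.

    Returns n - 1 increasing times. Entries are math.inf from the point
    at which the remaining lineages fail to coalesce in finite time,
    which can only happen when g < 0.
    """
    times = []
    t = 0.0
    for k in range(n, 1, -1):
        if math.isinf(t):
            times.append(math.inf)
            continue
        e = rng.expovariate(1.0)            # unit-rate exponential draw
        scaled = theta * e / (k * (k - 1))  # waiting time if g were zero
        if g == 0.0:
            t = t + scaled
        else:
            # Solve (k(k-1)/(theta*g)) * (e^{g t_new} - e^{g t}) = e.
            arg = math.exp(g * t) + g * scaled
            t = math.log(arg) / g if arg > 0.0 else math.inf
        times.append(t)
    return times
```

In this sketch the chance of an infinite time grows with the magnitude of the product of Θ and a negative g, because the term g × scaled is proportional to gΘ.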

RESULTS

Table 1 shows results from simulation tests of Fluctuate. We do not present results for the case of Θ = 0.01, g = 100 with finite numbers of sites because data sets simulated at these values frequently contained no variable sites. On theoretical grounds we expect an invariant data set to produce a zero estimate of Θ and an indeterminate estimate of g (all values are equally likely).

Cases where g is negative entail the possibility that infinite time will be required for coalescence when simulating the genealogy. The probability that this will happen depends on the product of Θ and g. In practice, the case of Θ = 0.01, g = −10.0 could be simulated (less than 1% failure to coalesce), but in the case of Θ = 0.1 a substantial fraction of simulated genealogies failed to coalesce in finite time, and so no results are presented.

In general, estimates of g showed a strong upwards bias, decreasing somewhat with number of sites and more markedly with number of loci. The only exception was the case of Θ = 0.1, g = 100 in which the estimates appear biased downwards with finite amounts of data, possibly due to saturation of variable sites. The standard deviation of g was much less for high true values of Θ than for low ones, even with infinite numbers of sites.

Estimates of Θ also tended to be biased upwards, in contrast to the constant-population case, in which they appear nearly unbiased (Kuhner et al. 1995).

With few exceptions, doubling the number of loci was more effective in reducing bias and standard deviation than doubling the number of sites.

In most cases the true values of Θ and g were rejected at the 95% level slightly more often than the desired 5%.

Table 2 shows comparable results, for the case in which the true g was zero, from the program Coalesce (Kuhner et al. 1995) which uses a similar Metropolis-Hastings strategy but does not allow changes in population size. Examination of the results suggests that adding growth as a parameter approximately doubles the standard deviation of Θ.

TABLE 1

Fluctuate simulation results

                Θ = 0.01                         Θ = 0.1
True g    Loci  bp = 500   bp = 1000  bp = ∞     bp = 500   bp = 1000  bp = ∞

A. Estimate of Θ
g = −10   1     0.013      0.014      0.010      ND         ND         ND
g = −10   2     0.011      0.012      0.010      ND         ND         ND
g = 0     1     0.012      0.013      0.011      0.112      0.107      0.113
g = 0     2     0.011      0.012      0.010      0.104      0.106      0.104
g = 100   1     ND         ND         0.011      0.097      0.092      0.110
g = 100   2     ND         ND         0.011      0.103      0.097      0.106

B. Standard deviation of Θ
g = −10   1     0.006      0.009      0.002      ND         ND         ND
g = −10   2     0.003      0.004      0.002      ND         ND         ND
g = 0     1     0.004      0.009      0.002      0.032      0.027      0.028
g = 0     2     0.003      0.004      0.002      0.023      0.017      0.018
g = 100   1     ND         ND         0.003      0.043      0.038      0.031
g = 100   2     ND         ND         0.002      0.033      0.029      0.021

C. Estimate of g
g = −10   1     165.8      257.3      50.0       ND         ND         ND
g = −10   2     71.3       130.2      27.8       ND         ND         ND
g = 0     1     145.9      360.2      45.1       14.6       6.2        12.7
g = 0     2     82.1       128.4      46.8       5.1        6.2        5.3
g = 100   1     ND         ND         227.4      73.6       69.7       119.1
g = 100   2     ND         ND         187.1      53.4       52.7       110.2

D. Standard deviation of g
g = −10   1     286.6      567.1      116.9      ND         ND         ND
g = −10   2     149.2      463.2      67.8       ND         ND         ND
g = 0     1     248.3      1215.1     95.3       19.3       15.1       15.7
g = 0     2     144.6      298.5      88.1       8.1        10.2       8.5
g = 100   1     ND         ND         214.8      73.6       69.7       45.9
g = 100   2     ND         ND         144.7      53.4       52.7       27.8

E. Number of samples (out of 100) in which the true values were rejected at the 95% level
g = −10   1     15         6          1          ND         ND         ND
g = −10   2     7          12         2          ND         ND         ND
g = 0     1     16         10         2          2          4          3
g = 0     2     13         7          5          9          1          3
g = 100   1     ND         ND         6          6          5          8
g = 100   2     ND         ND         7          7          4          3

Estimates of Θ and g based on 100 simulated data sets each, with 25 sequences of the given number of base pairs. Columns headed bp = ∞ were created by assuming that the genealogy could be reconstructed without error. Table 1E shows the number of times that the true values of Θ and g could be rejected at the nominal 95% level, out of 100 data sets. ND, not determined.

DISCUSSION

Why is the estimate of g biased? We have identified two processes that contribute to this bias. Both are intrinsic to the estimation of exponential growth from genealogical data and are not due to the Metropolis-Hastings sampler itself: they can be shown in simple cases that do not require any of the Metropolis-Hastings machinery.

One component of the bias results from the nonlinear relationship between the coalescence times and the estimate of g. A simple two-sequence case provides a concrete demonstration. In genealogies of two tips where the true growth rate is zero and Θ is known without error, the distribution of the coalescence time t follows directly from coalescent theory (Kingman 1982a,b). Centiles of this distribution can then be used to make a distribution of ĝ values (Table 3). The distribution of ĝ is highly skewed, with a mean far above the true value. Essentially, the nonlinear relationship between t and ĝ transforms variance in t into bias in ĝ. Thus, bias is expected not only in our method but in any method that uses t (or measurements depending on it, such as number of mutations) as a basis for estimating exponential growth. For example, the star-phylogeny method of Slatkin and Hudson (1991), which counts variable sites, shows a similar upwards bias; we have confirmed this in simulation tests (data not shown).

However, even in the absence of variability in coalescence times some bias is present. Table 4 shows results based on analysis of a "perfect" coalescent genealogy, in which each interval has exactly its expected length; there is no variance in t. A bias is clearly visible in Table 4, although the 95% confidence intervals do include the true value. This component of the bias results from the fact that any genealogy with finite tips truncates the distribution of coalescence times; it has a "final coalescence" at the root, prior to which no further information is available. This presents likelihood estimation with an attractive hypothesis involving a population bottleneck at the time of the final coalescence; such a hypothesis has high likelihood because it maximizes the probability of the final event. Attraction towards this degenerate hypothesis produces a bias in ĝ.
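The two-tip argument can be reproduced numerically. The Python sketch below is an illustration under stated assumptions, not the calculation behind Table 3: it takes the pairwise coalescence rate to be 2e^{gt}/Θ (the k = 2 case of the interval product in MATERIALS AND METHODS), treats Θ as known, and finds the value of g that maximizes the likelihood of a single coalescence time t. Setting the derivative of gt − 2(e^{gt} − 1)/(Θg) to zero gives (t/Θ)h(gt) = 1 with h(x) = 2(xe^x − e^x + 1)/x², which is solved by bisection. The function names and the choice of 100 centile midpoints are hypothetical, and because the text does not state the value of Θ used, the output is not expected to match the tabulated figures.

```python
import math


def h(x):
    """2 * (x*e^x - e^x + 1) / x^2, with its limiting value 1 at x = 0."""
    if abs(x) < 1e-4:
        return 1.0 + 2.0 * x / 3.0 + x * x / 4.0  # series near zero
    return 2.0 * (x * math.exp(x) - math.exp(x) + 1.0) / (x * x)


def g_hat(t, theta):
    """Maximum likelihood g from one pairwise coalescence time t."""
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # h is increasing, so the root lies above mid when the product is < 1.
        if (t / theta) * h(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / t


def g_hat_distribution(theta, n_centiles=100):
    """Sorted estimates of g at centile midpoints of t when the true g is 0."""
    estimates = []
    for i in range(n_centiles):
        q = (i + 0.5) / n_centiles
        t = -0.5 * theta * math.log(1.0 - q)  # quantile of Exp(rate 2/theta)
        estimates.append(g_hat(t, theta))
    return sorted(estimates)
```

In this sketch the estimate is zero only when t equals Θ, twice the expected coalescence time, so most centiles give a positive ĝ and the short coalescence times give very large ones; this is the skew described in the text.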

Correctness of the sampler: It is difficult to prove a complex computer program correct, but we tested Fluctuate in several ways to help assure ourselves that the observed bias was not due to program error. If the sampler is run with 100% acceptance (that is, the data are ignored and every proposed genealogy accepted) the genealogies produced should be an autocorrelated but otherwise random sample from a coalescent distribution with the given Θ and g. We examined large samples of such genealogies and found them consistent with the random coalescent (data not shown). We also tested the sampler with g = 0 and found its results substantively identical to our previous program Coalesce which dealt with the constant-population case (data not shown). Based on these tests, we believe the sampler to be correct. In any case, as is shown in Tables 3 and 4, bias would be expected in a perfectly functioning sampler.

TABLE 2

Coalesce (constant-population) simulation results

            Θ̂                     SD of Θ
Loci        500 bp     1000 bp    500 bp     1000 bp

A. Low Θ (0.01), low g (0)
One locus   0.0097     0.0099     0.0042     0.0034
Two loci    0.0102     0.0101     0.0028     0.0025

B. High Θ (0.1), low g (0)
One locus   0.0982     0.1006     0.0191     0.0237
Two loci    0.1052     0.1116     0.0184     0.0167

Estimates of Θ and g based on 100 simulated data sets each, with 25 sequences of the given number of base pairs. SD, standard deviation.

TABLE 3

Theoretical results for tree of two tips

Loci    Mean ĝ    SD of mean    Median ĝ
1       20.3      75.4          2.2
2       3.1       12.3          0.8
3       1.3       3.6           0.5
100     0.02      0.07          0.01

The expected distribution of t for trees of two tips was determined, and centiles of the distribution used to construct a distribution for ĝ. Given values are mean, standard deviation, and median of ĝ for one, two, and three loci. The true value of g was 0.0. Θ was assumed to be known without error. The result for 100 loci is an approximation based on 1000 replications using values of t drawn at random from the distribution for each locus.

TABLE 4

Results from perfectly coalescent genealogies

No. of tips   Θ̂         Lower     Upper     ĝ        Lower     Upper
10            1.2093    0.7454    2.5514    1.012    −2.283    3.458
100           1.0200    0.9463    1.1127    0.497    −1.751    2.079
1000          1.0026    0.9913    1.0143    0.422    −1.634    1.892
10000         1.0003    0.9988    1.0018    0.409    −1.610    1.860

Estimates of Θ and g, and upper and lower approximate 95% confidence limits, for "perfect" genealogies of the given number of sequences. True Θ, 1.0; true g, 0.0.

Overcoming bias: Given that this method (and other methods involving use of t to estimate g) has bias, how can the most accurate results be obtained? Tables 3 and 4 show clearly that adding additional sites or sequences is ineffectual, whereas adding additional unlinked loci rapidly reduces the bias. Each new locus will provide additional information about the region of the early branchings, thereby fleshing out this part of the distribution, and the independent variation in coalescence times among loci helps counteract the bias introduced by non-linearity.

It appears that the small bias seen in Θ is a consequence of correlation between Θ and g, since it does not appear when g is held constant at zero (as in Coalesce). One positive aspect of these findings is that it is quite possible to estimate current Θ accurately even if the population has been growing or shrinking; the bias in Θ is small even when g is far from zero.

Future directions: Real biological populations often grow or decline in ways more complicated than simple exponential growth, but the bias in the estimator interferes with attempts to fit more complex models. For example, one could imagine fitting a two-stage model with exponential growth followed by a steady-state period; however, because of the sparseness of the rootward part of the genealogy this model would be attracted to wrong solutions featuring very rapid early growth. It is possible that using a sufficiently large number of loci would allow such models to work.

Since relatively little power is available for estimating growth, attempts to differentiate between different models of growth (for example, exponential versus geometric or linear) are unlikely to succeed with reasonably sized data sets. In principle, however, this method could accommodate any growth model for which the time transformation can be worked out.

The algorithm can readily be adapted to data types other than nucleotide sequence data, such as protein sequences, allozyme alleles, or restriction site polymorphisms, as long as an appropriate evolutionary model is available.

It is possible to extend this family of algorithms by including recombination, which will greatly facilitate the analysis of nuclear loci. This may also allow a single long locus to provide some of the advantages of multiple loci, since recombination turns the single genealogy into several partially correlated genealogies. However, the algorithm with recombination will be technically challenging due to the more complex data structures and rearrangement scheme required. Griffiths and Marjoram (1996) have developed an alternative approach to genealogical sampling in the presence of recombination, which is also computationally demanding: it will be interesting to compare these approaches in the future.

Availability of software: The Metropolis-Hastings Monte Carlo algorithm described here is available from the authors as program Fluctuate in the package Lamarc, which uses the same input/output formats as the Phylip package. (The program is written in C and can be obtained by anonymous ftp at evolution.genetics.washington.edu in directory pub/lamarc or via the World Wide Web at http://evolution.genetics.washington.edu/lamarc.html.)

We thank Monty Slatkin and Simon Tavaré for helpful discussion and Peter Beerli for assistance in finding maxima of likelihood surfaces. We also thank the Organizing Committee of the fourth annual meeting of the Society of Molecular Biology and Evolution for inviting the first author to a highly productive meeting. This research was supported by National Science Foundation grants BIR-8918333 and DEB-9207558 and National Institutes of Health grant 2-R55GM41716-04 (all to J.F.).

LITERATURE CITED

Cox, D. R., and D. V. Hinkley, 1974 Theoretical Statistics. Chapman and Hall, London.

Felsenstein, J., 1981 Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17: 368–376.

Felsenstein, J., 1992 Estimating effective population size from samples of sequences: inefficiency of pairwise and segregating sites as compared to phylogenetic estimates. Genet. Res. 59: 139–147.

Felsenstein, J., 1993 Phylip (Phylogeny Inference Package) version 3.5c. Distributed by the author. Department of Genetics, University of Washington, Seattle.

Felsenstein, J., and G. Churchill, 1996 A Hidden Markov Model approach to variation among sites in rate of evolution. Mol. Biol. Evol. 13: 93–104.

Griffiths, R. C., and P. Marjoram, 1996 Ancestral inference from samples of DNA sequences with recombination. J. Comput. Biol. 3: 479–502.

Griffiths, R. C., and S. Tavaré, 1994 Sampling theory for neutral alleles in a varying environment. Philos. Trans. R. Soc. Lond. B Biol. Sci. 344: 403–410.

Hastings, W. K., 1970 Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57: 97–109.

Kimura, M., 1980 A simple model for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16: 111–120.

Kingman, J. F. C., 1982a The coalescent. Stochastic Processes and Their Applications 13: 235–248.

Kingman, J. F. C., 1982b On the genealogy of large populations. J. Appl. Probab. 19A: 27–43.

Kuhner, M., J. Yamato and J. Felsenstein, 1995 Estimating effective population size and mutation rate from sequence data using Metropolis-Hastings sampling. Genetics 140: 1421–1430.

Metropolis, N., A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller and E. Teller, 1953 Equations of state calculations by fast computing machines. J. Chem. Phys. 21: 1087–1092.

Rogers, A. R., and H. Harpending, 1992 Population growth makes waves in the distribution of pairwise genetic differences. Mol. Biol. Evol. 9: 552–569.

Slatkin, M., and R. R. Hudson, 1991 Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 129: 555–562.

Watterson, G. A., 1975 On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7: 256–276.

Communicating editor: S. Tavaré