Tests of parallel molecular evolution in a long-term experiment with Escherichia coli
Robert Woods*, Dominique Schneider†, Cynthia L. Winkworth‡, Margaret A. Riley§, and Richard E. Lenski*
Departments of *Zoology and Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI 48824;
†Laboratoire Adaptation et Pathogénie des Microorganismes, Université Joseph Fourier, Institut Jean Roget, F-38041 Grenoble, France;
‡Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520; and
§Department of Biology, University of Massachusetts, Amherst, MA 01003
Communicated by John R. Roth, University of California, Davis, CA, April 10, 2006 (received for review January 22, 2006)
The repeatability of evolutionary change is difficult to quantify
because only a single outcome can usually be observed for any
precise set of circumstances. In this study, however, we have
quantified the frequency of parallel and divergent genetic changes
in 12 initially identical populations of Escherichia coli that evolved
in identical environments for 20,000 cell generations. Unlike pre-
vious analyses in which candidate genes were identified based on
parallel phenotypic changes, here we sequenced four loci (pykF,
nadR, pbpA-rodA, and hokB/sokB) in which mutations of unknown
effect had been discovered in one population, and then we
compared the substitution pattern in these “blind” candidate
genes with the pattern found in 36 randomly chosen genes. Two
candidate genes, pykF and nadR, had substitutions in all 11 other
populations, and the other 2 in several populations. There were
very few cases, however, in which the exact same mutations were
substituted, in contrast to the findings from conceptually related
work performed with evolving virus populations. No random
genes had any substitutions except in four populations that
evolved defects in DNA repair. Tests of four different statistical
aspects of the pattern of molecular evolution all indicate that
adaptation by natural selection drove the parallel changes in these
candidate genes.
bacterial evolution | mutation | natural selection | parallel evolution
Parallel evolution and convergent evolution occur when two
or more lineages independently evolve similar or identical
features. Parallel evolution and convergent evolution are usually
distinguished on the basis that parallelism involves changes in
homologous features among closely related organisms, whereas
convergence can involve changes in different antecedent fea-
tures among more distantly related organisms (1–3). Both
parallel evolution and convergent evolution provide strong
evidence that the derived similarities resulted from adaptation
by natural selection, provided the state-space of possible changes
is so large that it is improbable that the observed similarities
arose by a purely random process. There are many compelling
examples of parallel evolution in nature, including recent studies
of lizard morphology (4) and fish behavior (5), showing that
certain phenotypes evolved repeatedly when separate popula-
tions independently colonized similar environments. Also, some
pathogens exhibit striking parallel genomic changes, including
multiple HIV lineages that substituted similar mutations con-
ferring antiviral drug resistance (6) and several strains of
Escherichia coli that independently acquired similar virulence factors
by horizontal transfer (7).
Yet, despite these and other compelling examples of parallel
evolution (8–10), it has proven difficult to quantify evolution-
ary repeatability. In principle, even the most basic quantifi-
cation of parallel evolution would include the number of
potential instances of parallel outcomes, which could be
compared with the actual number seen. In practice, however,
the number of potential instances is rarely given and difficult
to ascertain. For example, undetected extinctions of other
populations that had evolved different, but ultimately unsuc-
cessful, adaptations might cause an upward bias to an estimate
of the extent of parallelism. Also, comparative studies cannot
generally exclude subtle differences in selective environments
or in founding genotypes as causes of divergent evolutionary
outcomes, which produce a downward bias in any estimate of
evolutionary repeatability. Thus, it is difficult to know the
denominator that corresponds to the number of potential cases
of parallel outcomes to compare with the actual number
observed. However, well designed evolution experiments over-
come these problems because the number of independent
populations is set by the experimenter, and systematic envi-
ronmental differences are precluded by an appropriate design.
Moreover, in experiments with microorganisms, replicate pop-
ulations can be founded by single haploid individuals, such that
there is no initial genetic variation and therefore any paral-
lelism must depend on the independent origin, as well as fate,
of variants (11, 12). Such experiments have now provided many
examples of both parallel and divergent evolution affecting
both phenotypic and genetic properties (13–31).
In a landmark study, Wichman et al. (24) examined parallel
evolution at the genetic level in φX174, a DNA virus with 11
genes and a 5.4-kb genome. Two populations were propagated
for 10 days on a novel host strain, and the viral genomes were
sequenced before and after the experiment. That study found 29
mutations, of which 14 were identical in the two populations.
Qualitatively similar results were obtained by Bull et al. (19) with
several additional populations of φX174 by using a more complex
experimental design. However, it is not known whether
much larger genomes, encoding more complex organisms and
having potentially many more targets of selection, would show
similarly strong parallelism at the sequence level.
To address that issue, we sought to examine the extent of
genetic parallelism in a long-term experiment with 12 popula-
tions of E. coli (11, 13, 25, 32), a bacterium with some 4,300 genes
and a genome of 4,500 kb (33). We (27, 29, 31) have previously
reported three cases of parallel substitutions affecting the rbs
operon, the spoT gene, and two genes involved in DNA topology
in these populations. Importantly, however, all three cases
depended on first finding that there had been parallel phenotypic
changes in ribose catabolism, global expression profiles, and
DNA supercoiling, respectively. Therefore, although these cases
provide clear examples of genetic parallelism, they represent a
nonrandom sample that might not reflect the overall extent of
parallel changes in the genome as a whole. In this study, by
contrast, we pursue an approach that is completely independent
of any known parallel phenotypic changes to address the extent
of genetic parallelism in a statistically unbiased manner. Our
specific approach is to compare the pattern of mutational
substitutions, in several candidate genes in which we have found
mutations of unknown effect in one population (34), with the
pattern observed in many other genes that were chosen completely
at random (35). Here, we use the idea of “candidate gene” to mean
only that a mutational substitution was previously found in that
gene in one population, not that the gene was investigated based
on parallel phenotypic changes related to its function. Thus, they
might also be called “blind” candidate genes.

Conflict of interest statement: No conflicts declared.
Data deposition: The sequences reported in this paper have been deposited in the GenBank
database (accession nos. AY849930–AY849933 and AY625099–AY625134).
To whom correspondence should be addressed. E-mail: lenski@msu.edu.
© 2006 by The National Academy of Sciences of the USA
www.pnas.org/cgi/doi/10.1073/pnas.0602917103 | PNAS | June 13, 2006 | vol. 103 | no. 24 | 9107–9112 | EVOLUTION

The 12 replicate populations all were founded by the same
ancestral strain and grown in identical environments for 20,000
generations. They evolved similar, but not identical, changes in
various aspects of their performance, morphology, and physiol-
ogy (11, 13, 15, 18, 25, 27, 29, 31, 32). Also, four of the
populations became “mutators” by evolving defects in their
DNA repair pathways, which caused large increases in their
spontaneous mutation rates (20, 25); the distinction between
mutator and nonmutator populations is important for some of
our analyses. In previous work (34), we discovered and charac-
terized four insertion mutations, which provide the basis for our
present study, by comparing the genomic fingerprints of two
evolved populations (neither a mutator) with their common
ancestor. The position of these four mutations in the phylogenies
of the populations in which they arose indicated that they were
eventually substituted, which suggested that they either were
beneficial themselves or, alternatively, had hitchhiked with un-
known beneficial mutations (34, 36). Therefore, although we
suspected these mutations might be beneficial, we chose each
candidate gene based on a single mutation to avoid any bias
toward parallel evolution. Population Ara+1 substituted IS150
insertion mutations in nadR, hokB/sokB, and upstream of pbpA-rodA;
and population Ara−1 substituted an IS150 insertion in pykF.
These four loci thus became the blind candidate genes for further
investigation. In this study, they were sequenced in clones
sampled from all 12 populations after 20,000 generations. The
resulting sequence data were then used to test whether evolution
was parallel at the levels of genes and the mutations within them
and to quantify the extent of any parallelism.
Results and Discussion
Little Parallelism at the Level of Mutations in the Candidate Genes.
We sequenced the four candidate loci (pykF, nadR, hokB/sokB,
and pbpA-rodA) in the ancestor and clones sampled at generation
20,000 from all 12 populations. Fig. 1 shows the physical extent
of sequencing and marks the locations for all of the mutations
found in the sequenced regions; some 7,150 bp were sequenced
for each evolved clone. A total of 40 mutations were found
including the 4 IS150 insertion mutations previously discovered
and 36 additional mutations, all of which were confirmed by
resequencing. It is impossible to prove that these mutations were
absolutely fixed in these populations. In particular, frequency-
dependent selection (37) and clonal interference (38) may
sustain minority populations and give rise to situations in which
beneficial mutations reach high frequency without being substi-
tuted. To examine this issue further, PCR and restriction frag-
ment length polymorphism (RFLP) assays were developed to
test numerous clones from multiple generations in two popula-
tions that together had seven mutations in candidate genes
(R.W., unpublished data). All seven cases provide compelling
evidence for substitution before 20,000 generations; for example,
the pbpA mutation in population Ara+1 was present in all 48 clones
tested from a sample taken at generation 5,000 (39). We
conclude, therefore, that the vast majority of the mutations
found in candidate genes were substituted. Table 4, which is
published as supporting information on the PNAS web site, lists
the precise locations and other molecular details for each
mutation. Two of the 36 newly discovered mutations lie outside
the 4 candidate genes and their known regulatory regions; both
occur in or near ydcA, a gene of unknown function near the
hokB/sokB locus. These two mutations were excluded from our
main statistical tests because their relevance for quantifying
parallel evolution was unclear, although this decision had no
effect on these tests (as explained later).
Fig. 1. Mutations substituted by 20,000 generations in four candidate genes in 12 experimental populations of E. coli. Lighter regions indicate protein-coding
sequences for and near pykF (A), nadR (B), pbpA-rodA (C), and hokB/sokB (D). Long bars below indicate the range sequenced; short bars show scale (200 bp).
Each arrow marks a mutation; the number shows the affected population. The mutations in and near ydcA are of unknown relevance. *, An IS150 insertion, and,
for populations −1 and +1, these were the original mutations used to identify the candidate genes. A second symbol marks a 1-bp deletion. §, A synonymous
mutation. All others in the coding regions are nonsynonymous, except for a 1-bp insertion in ydcA in population −2.

Of the 38 mutations found in the candidate genes, there were
35 distinct mutations. Only two mutations were found in multiple
populations: three populations had identical G→T substitutions
at position 901 in pykF, and two other populations had the same
A→G mutation at position 902 in nadR. The 66 possible pairs of
the 12 E. coli populations shared, on average, only 2.1% of their
mutations. By contrast, the two φX174 virus populations studied
by Wichman et al. (24) shared almost half of their mutations.
Parallelism at the level of mutations was evidently much less
common in these evolving bacteria than in the previously studied
viruses.
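The averaging over pairs of populations can be made concrete with a short sketch. It assumes, purely for illustration, that sharing for a pair is measured as the number of mutations found in both populations divided by the number of distinct mutations found in either one; the text does not state the exact formula used. The population names and mutation labels are hypothetical placeholders, not data from this study.

```python
from itertools import combinations

def mean_pairwise_sharing(mutations_by_population):
    """Average over all population pairs of |A & B| / |A | B| (assumed metric)."""
    fractions = []
    for a, b in combinations(mutations_by_population.values(), 2):
        union = a | b
        fractions.append(len(a & b) / len(union) if union else 0.0)
    return sum(fractions) / len(fractions)

# Hypothetical toy data: three populations, one mutation shared by two of them
toy = {
    "pop1": {"geneA:m1", "geneB:m2"},
    "pop2": {"geneA:m1", "geneB:m3"},
    "pop3": {"geneA:m4", "geneB:m5"},
}
toy_mean = mean_pairwise_sharing(toy)

# Number of distinct pairs among 12 populations
n_pairs = len(list(combinations(range(12), 2)))
```

With 12 populations there are 66 such pairs, which matches the count given in the text. In the toy data only one of the three pairs shares anything, so the average is pulled well below the sharing seen in that single pair.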
Extensive Parallelism at the Level of the Candidate Genes. Turning
from mutational identity to the level of the affected genes, the
pattern is very different. Every population had exactly one
nonsynonymous point mutation in both pykF (Fig. 1A) and nadR
(Fig. 1B), with the exception of the focal populations that
contained the previously discovered insertion mutations that led
to identification of these candidate genes. One synonymous
mutation was found in pykF, and none in nadR. The two other
candidates also yielded mutations, although not in every popu-
lation. Besides the focal population’s insertion upstream of
pbpA-rodA, five others had mutations in the upstream region, the
pbpA gene, or both (Fig. 1C). For hokB/sokB, three other
populations had insertions similar to that in the focal population
(Fig. 1D). The many independent substitutions in the candidate
genes suggest parallelism, but it is necessary to demonstrate that
the numbers are above those expected by chance. For example,
perhaps most genes accumulated mutations after 20,000 gener-
ations, such that finding mutations in a candidate gene in several
or even all of the populations is statistically unremarkable.
To that end, we compared the results of sequencing the
candidate genes with the results of sequencing 500-bp regions
in each of 36 randomly chosen genes for the same 12 populations
(35). Only six substitutions in total were found in these random
genes; three of these substitutions were synonymous point
mutations, and the other three were nonsynonymous point
mutations. Moreover, all of these substitutions in random genes
were found in the four evolved mutator populations.
Statistical Tests Support Parallelism at the Level of Candidate Genes.
We performed four distinct tests of the hypothesis that natural
selection drove parallel changes in the candidate genes. First, we
compared the overall substitution rates between the candidate
and randomly chosen genes, with an expectation of a higher rate
in the candidates because they accumulated substitutions by
selection as well as by drift. Consistent with that expectation, all
12 populations had higher substitution rates in the candidate
genes (Table 1), which is highly unlikely by chance (sign test,
P = 0.0002). This result is unaffected by excluding the insertions used
to identify the candidates in populations Ara−1 and Ara+1; both of
these focal populations had substitutions in one or more genes
whose candidacy was identified in the other population, and
neither of them had any substitutions in the random genes.
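The first test can be reproduced from the counts in Table 1. The sketch below converts each count to a rate per 1,000 bp, counts the populations in which the candidate genes have the higher rate, and computes a sign-test probability. A one-sided test is assumed here because it reproduces the reported value; the text does not say which form was used, and the function names are illustrative.

```python
from math import comb

RANDOM_BP = 18_374     # total bases sequenced in the 36 random genes
CANDIDATE_BP = 7_150   # total bases sequenced in the 4 candidate genes

# (random-gene mutations, candidate-gene mutations) for the 12 populations,
# in the row order of Table 1
counts = [(0, 3), (0, 6), (0, 3), (3, 3), (0, 4), (0, 2),
          (0, 4), (0, 2), (1, 3), (0, 2), (0, 2), (2, 4)]

def rate_per_kb(n_mutations, n_bp):
    """Substitutions per 1,000 bp sequenced."""
    return 1000 * n_mutations / n_bp

def one_sided_sign_test(successes, n):
    """P(at least `successes` of n comparisons favor one side) if both sides are equally likely."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n

higher_in_candidates = sum(
    rate_per_kb(cand, CANDIDATE_BP) > rate_per_kb(rand, RANDOM_BP)
    for rand, cand in counts
)
p_value = one_sided_sign_test(higher_in_candidates, len(counts))
print(higher_in_candidates, round(p_value, 4))  # 12 0.0002
```

All 12 comparisons favor the candidate genes, so the probability is simply (1/2)^12, about 0.00024.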
Second, if mutations in the candidate genes were beneficial,
then we would expect to see an excess of nonsynonymous
substitutions relative to synonymous substitutions. Three of 6
point mutations in the randomly chosen genes were synonymous,
but only 1 of 27 was synonymous in the candidate genes (Table
2). This difference is significant (Fisher’s exact test, P = 0.0136)
and also supports the hypothesis that the mutations substituted
in the candidate genes were beneficial.
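As a check on the second test, Fisher's exact probability can be computed directly from the four counts in Table 2. This is a minimal sketch that assumes a one-tailed test (the tail in which the candidate genes have few synonymous changes); the function name is illustrative, and only the standard library is used.

```python
from math import comb

def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher's exact P for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins whose top-left cell is a or smaller.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    smallest = max(0, row1 + col1 - n)   # smallest top-left count the margins allow
    ways = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(smallest, a + 1))
    return ways / comb(n, row1)

# Table 2: rows = candidate, random; columns = synonymous, nonsynonymous
p_table2 = fisher_one_tailed(1, 26, 3, 3)
# Same comparison with the three nonpoint coding mutations counted as nonsynonymous
p_with_nonpoint = fisher_one_tailed(1, 29, 3, 3)
print(round(p_table2, 4), round(p_with_nonpoint, 4))  # 0.0136 0.0104
```

Under the one-tailed assumption, the first value agrees with the probability reported for Table 2, and the second agrees with the value reported when the nonpoint mutations in coding regions are included.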
Third, theory predicts that the substitution rate for neutral
mutations should scale with the mutation rate (40), whereas the
substitution rate for beneficial mutations is subject to diminish-
ing returns in large asexual populations owing to clonal inter-
ference (41). Recall that four of the populations became muta-
tors and had mutation rates 100-fold higher than the other
eight populations. We expect to observe more nonbeneficial
substitutions in the mutator populations and consequently a
relatively higher proportion of beneficial mutations in the non-
mutator populations. Indeed, all 6 substitutions in the random
genes were found in the mutator populations, whereas half of the
30 point mutations in the candidate genes were substituted in the
nonmutator populations (Table 3). This difference is significant
in the direction expected if the candidate genes experienced
selection favoring new alleles (Fisher’s exact test, P = 0.0279).
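Because the random genes had no substitutions at all in nonmutator populations, the observed table is already the most extreme one possible in that direction, so a one-tailed Fisher probability reduces to a single hypergeometric term. As a worked check, assuming the test was one-tailed: 21 of the 36 point mutations were in mutators, and all 6 random-gene mutations fall among those 21.

```latex
\[
P \;=\; \frac{\binom{21}{6}\binom{15}{0}}{\binom{36}{6}}
  \;=\; \frac{54\,264}{1\,947\,792}
  \;\approx\; 0.0279
\]
```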
Table 1. Number of mutations in random and candidate genes in 12 E. coli populations after 20,000 generations

              Random genes (18,374 bp total)          Candidate genes (7,150 bp total)
Population    No. of mutations   Rate per 1,000 bp    No. of mutations   Rate per 1,000 bp
Ara−1         0                  0.000                3 (2*)             0.420 (0.280*)
Ara−2         0                  0.000                6                  0.839
Ara−3         0                  0.000                3                  0.420
Ara−4         3                  0.163                3                  0.420
Ara−5         0                  0.000                4                  0.559
Ara−6         0                  0.000                2                  0.280
Ara+1         0                  0.000                4 (1*)             0.559 (0.140*)
Ara+2         0                  0.000                2                  0.280
Ara+3         1                  0.054                3                  0.420
Ara+4         0                  0.000                2                  0.280
Ara+5         0                  0.000                2                  0.280
Ara+6         2                  0.109                4                  0.559

Numbers of mutations and substitution rates are pooled across 36 random
genes and 4 candidate genes. Populations Ara−2, Ara−4, Ara+3, and Ara+6 became
mutators; all others retained the ancestral mutation rate.
*Excluding the mutations (one in Ara−1, three in Ara+1) used to identify the
candidate genes.

Table 2. Candidate and random genes differ in relative abundance of synonymous and nonsynonymous point mutations

Gene         Synonymous   Nonsynonymous
Candidate    1            26
Random       3            3

Table 3. Candidate and random genes differ in relative abundance of point mutations in mutator and nonmutator populations

Gene         Mutator   Nonmutator
Candidate    15        15
Random       6         0

Fourth, if mutations in the candidate genes were neutral, then
the numbers of substitutions in the populations should follow a
Poisson distribution if all populations had the same mutation
rate, or they would be substantially clumped in the mutator
populations given the differences in mutation rates. By contrast,
if the substituted mutations were beneficial, and if different
mutations in the same gene conferred functionally similar
benefits, then we would expect a more uniform distribution of
mutations. Unlike the first three statistical tests, this test is
independent of the evolutionary forces affecting the randomly
chosen genes. The distributions are the most uniform possible
given the numbers of mutations in two of the candidate genes
(Fig. 2). For nadR, there were 12 substitutions, with each
population having exactly 1; the likelihood of this distribution by
chance is $12!/12^{12} < 0.0001$. For pykF, the chance of 11
populations having 1 mutation and 1 having 2 mutations is
$< 0.0004$. Moreover, these calculations are very conservative
because the four mutator populations should push strongly away
from uniformity, making the observed distributions that much
more unexpected. The other two candidate genes do not deviate
significantly from the Poisson distribution, but that outcome may
simply reflect fewer mutations in those genes and the very
conservative nature of this test.
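The uniformity probabilities for nadR and pykF can be sketched as an occupancy calculation. The sketch assumes that each substitution is equally likely to land in any of the 12 populations, independently of the others, and that it does not matter which population carries the extra pykF mutation. The function name is illustrative.

```python
from collections import Counter
from math import factorial

def occupancy_pattern_probability(pattern, n_populations=12):
    """Chance that mutations dropped uniformly at random give this pattern.

    `pattern` lists how many mutations each population ends up with, in any
    order; which population receives which count is left unspecified.
    """
    assert len(pattern) == n_populations
    n_mutations = sum(pattern)
    # Ways to deal the labelled mutations into populations with these exact counts
    deals = factorial(n_mutations)
    for count in pattern:
        deals //= factorial(count)
    # Ways to choose which populations receive which counts
    labelings = factorial(n_populations)
    for multiplicity in Counter(pattern).values():
        labelings //= factorial(multiplicity)
    return labelings * deals / n_populations ** n_mutations

p_nadR = occupancy_pattern_probability([1] * 12)          # one mutation in every population
p_pykF = occupancy_pattern_probability([1] * 11 + [2])    # 11 with one, 1 with two
```

Under these assumptions the nadR pattern has probability 12!/12^12, roughly 0.00005, and the pykF pattern roughly 0.00035. Both fall below the figures of 0.0001 and 0.0004 given in the text.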
Statistical Tests Are Robust with Respect to Criteria for Data Inclusion.
Regarding the excess of nonsynonymous substitutions in the
candidate genes, Table 2 includes point mutations in protein-
coding regions only, with 27 such mutations in the candidate
genes and 6 in the random genes. Three additional nonpoint
mutations occurred in the protein-coding regions of the candi-
date genes, including two IS-element insertions and one 1-bp
deletion, and these mutations could be viewed as nonsynony-
mous because they change the resulting amino acid sequence. If
these additional mutations are included in the analysis, the
outcome remains significant (P = 0.0104).
With respect to the observed excess of substitutions in the
nonmutator populations among the candidate genes, Table 3
includes 30 point mutations in the candidate genes and 6 point
mutations in the random genes. There were eight additional
mutations in candidate genes, including seven insertions and one
1-bp deletion. One of these insertions was in a mutator popu-
lation, and all others were in nonmutator populations. If these
additional mutations are included, the outcome remains
significant (P = 0.0211).
Regarding the two mutations found in or near ydcA, we chose
both candidate and random genes a priori, as explained. The
ydcA mutations do not fit into either category, and therefore they
were excluded from our main analyses. It is possible that ydcA is
related to hokB/sokB, given its proximity and unknown
functionality, but it is also possible these loci have nothing to do with
each other. If we include ydcA with the random genes, it would
not weaken any of our four tests and, in fact, strengthens one of
them. The first test (Table 1) compares the density of mutations
found in random and candidate genes; adding one random
mutation to both Ara−2 and Ara+6 would not change the fact that
all 12 populations have a higher substitution density in candidate
genes. The second test (Table 2) is unaffected because neither
ydcA mutation counts as synonymous or nonsynonymous; one is
outside the coding region, and the other is not a point mutation.
The third test (Table 3) compares the mutator and nonmutator
populations. Both ydcA mutations are in mutator populations,
and including it as a random gene would strengthen that already
significant result. Finally, the fourth test (Fig. 2) is unaffected
because noncandidate genes do not enter into the analysis. If,
instead, we include ydcA with the hokB/sokB candidate locus,
the third test would be slightly weakened but remain significant,
whereas the other tests would not be affected.
Alternative Hypotheses Are Inconsistent with One or More Tests. The
four tests individually and collectively support the hypothesis
that parallel evolution in the candidate genes was driven by
natural selection favoring the mutant alleles, and their conclu-
sions are robust when we use different criteria for including
ambiguous data. The first two tests are also consistent with the
alternative hypothesis that the candidate genes had relaxed
selective constraints, such that they could accumulate mutations
without adverse effects. However, the third and fourth tests
clearly reject this alternative hypothesis because, if it were true,
mutator populations should accumulate disproportionately
more substitutions in candidate genes, and substitutions would
not be uniform across populations. Another alternative is that
the candidate genes might contain “hot spots” with mutation
rates much higher than the genome-wide average. This alterna-
tive also runs counter to the test comparing mutator and
nonmutator populations, unless one further supposes that these
hypermutable sites are independent of the repair pathways that
became defective in the mutators. However, substitutions in
three of the four candidate loci (pykF, nadR, and pbpA) include
transitions and transversions as well as the original insertions,
whereas the substitutions in the random genes occurred only in
the mutators and all of them had signatures reflecting specific
defects in DNA repair (35). The extreme uniformity of substi-
tution number in pykF and nadR, coupled with the multiplicity
of mutational targets in these genes, also contradicts the hot-spot
hypothesis. The several IS insertions in hokB/sokB, and the
absence of other types of mutations, might indicate an increased
rate of those mutations at that locus, but such a bias, if it exists,
does not contradict the possibility that the substitutions are also
beneficial (27).
Possible Functional Bases for Beneficial Effects of Mutations in the
Candidate Genes. The four tests collectively provide compelling
evidence that the mutations that were substituted in the candi-
date genes are beneficial in the environment used in the evolu-
tion experiment. We do not know, however, the functional bases
for their beneficial effects. At first glance, the fact that the
candidates were first identified by IS-element insertions in the
focal populations might suggest that the beneficial mutations are
knockouts. However, a more nuanced view is appropriate for
several reasons. First, most of the mutations found in the other
populations are unlikely to act as knockouts, with the probable
exception of the several IS insertions in hokB/sokB and one
frameshift mutation in pykF (Table 4). Even if the originally
discovered mutation in a gene were a knockout, other popula-
tions may have substituted mutations with more subtle effects.
Second, the original IS insertion affecting pbpA-rodA is not in the
reading frame but, instead, sits in the upstream regulatory
region, where IS elements can exert subtle effects on gene
expression (42). Third, in the case of nadR, the affected protein
is bifunctional with both repressor and transport domains (34,
43). A knockout of one function could leave the other function
intact; and, in the case of the repressor function, a knockout
would elevate expression of the de-repressed genes.
Fig. 2. Distribution of numbers of substitutions in the 12 populations for the
four candidate genes. Observed distributions are shaded. Poisson distributions
with the same mean as the observed distribution are shown in outline.

The following hypotheses suggest how mutations in the candidate
genes might enhance fitness in the environment of the
long-term evolution experiment (34), although we emphasize
that they require further testing and other explanations may also
be plausible. The pykF gene encodes one of two pyruvate kinases
that catalyze the conversion of phosphoenolpyruvate (PEP) and
ADP into pyruvate and ATP. PEP is also used by the phospho-
transferase system (PTS) to transport glucose into the cell. By
slowing the conversion of PEP to pyruvate, mutations in pykF
might make more PEP available to drive the PTS-mediated
uptake of glucose, which is the limiting resource in the environ-
ment of the long-term evolution experiment. As noted, nadR
encodes a bifunctional protein that is involved in several aspects
of the metabolism of NAD, a key metabolite important in many
different pathways. Several genes involved in NAD synthesis and
recycling are repressed by the NadR protein, and mutations in
nadR might increase their expression and the resulting intracel-
lular concentration of NAD. The evolved bacteria have higher
maximum growth rates as well as shorter lags after the daily
transfers into fresh medium (15, 29), and increased levels of
NAD might be beneficial in achieving one or both of these
advantages (44). Alternatively, changes in NAD regulation might
improve the control of oxidative stress in the experimental
environment (45). The hokB/sokB locus is one of several loci in
E. coli related to the hok/sok locus of plasmid R1; hok encodes
a toxin and sok an antisense RNA that blocks translation of the
toxin. Together, these activities kill any cells that lose the
plasmid, a function that may benefit the plasmid but is obviously
harmful to the bacteria. Inactivation of hokB/sokB would therefore
benefit the bacteria (in the absence of plasmids), and indeed
other copies of hok/sok loci in E. coli contain insertion elements
that have presumably inactivated them (46, 47). Finally, the
pbpA-rodA operon encodes two essential proteins that are
involved with peptidoglycan synthesis and coupling cell-wall
synthesis to the overall cell cycle (48). All 12 populations evolved
much larger cell volumes (13), which may require altered rates
of peptidoglycan synthesis, changes in the timing of its synthesis
in relation to the cell cycle, or both.
Conclusions and Future Directions. Our results demonstrate that
evolution in these 12 E. coli populations was often parallel at the
level of genes, but only rarely were the substitutions identical at
the base pair level. This latter point stands in sharp contrast with
results obtained in two replicate populations of virus φX174,
where almost half of the substitutions were identical (24). We are
tempted to suggest that this difference in parallelism reflects
differences in genome size and complexity, but that explanation
is by no means proven. Thus, it would be interesting to have
comparably precise experiments for many other viruses and
bacteria as well as archaea, single-celled eukaryotes, and mul-
ticellular animals and plants to evaluate comparatively whether
increasing genomic and functional complexity leads to less
repeatable outcomes of molecular and phenotypic evolution.
In this study, we also observed variation in the extent of
parallelism among the candidate genes, with nadR and pykF
exhibiting more evolutionary repeatability than pbpA and
hokB/sokB. These differences might indicate important functional
interactions among loci or, alternatively, that those genes under
stronger selection may converge more quickly on beneficial
substitutions than those experiencing weaker selection. Future
work, including additional generations and genetic manipula-
tions, may reveal whether the gene-level differences between
populations will erode or be sustained. If the fitness effects of the
mutations at these loci are additive, or at least do not change
sign, then we would expect all of the populations to eventually
get mutations at each of these loci. However, if there are epistatic
interactions such that mutations at some of the genes are no
longer beneficial on certain evolved backgrounds, then this
gene-level divergence could be sustained indefinitely. Genetic
manipulations that produce isogenic constructs differing by
single known mutations will also be useful for examining the
physiological mechanisms responsible for the beneficial effects
of the mutations in the candidate genes (29, 31). Finally, recent
technological advances have led to substantial reductions in the
cost of whole-genome sequencing and resequencing (49, 50), so
that it will become feasible to sequence entire genomes from
several or all of the populations in this long-term evolution
experiment.
Methods
Background on the Long-Term Evolution Experiment. Twelve popu-
lations were started from the same ancestral strain of E. coli B,
except that 6 of the populations were founded from an Ara− variant
and 6 from an Ara+ variant. These populations are designated
A−1 to A−6 and A+1 to A+6 according to this
marker, which is neutral in the experimental environment (11).
The populations were serially propagated in identical glucose-limited
environments for 20,000 cell generations (3,000 days),
with population sizes fluctuating daily between 5 × 10⁶ and 5 × 10⁸
cells. The populations achieved similar, although not identical,
fitness gains (11, 13, 25). Also, all 12 populations evolved
large increases in average cell size (13), certain catabolic abilities
were lost in parallel (25, 27), and global gene-expression profiles
showed similar changes in the two populations that were exam-
ined in this regard (29). Four populations evolved defects in
DNA repair pathways, which caused 100-fold increases in their
spontaneous mutation rates (20, 25). An in-depth review of this
long-term evolution experiment can be found elsewhere (32).
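The correspondence between 3,000 days and roughly 20,000 generations can be checked with a short calculation. The sketch below is illustrative only and is not part of the original methods. It assumes that each daily cycle consists of regrowth by binary fission from the stated daily minimum (5 × 10⁶ cells) to the stated daily maximum (5 × 10⁸ cells), with no cell death; the function and variable names are hypothetical.

```python
import math

def generations_per_day(min_cells: float, max_cells: float) -> float:
    """Doublings needed to regrow from the daily minimum to the daily maximum.

    Assumes pure binary fission and no cell death (a simplification).
    """
    return math.log2(max_cells / min_cells)

# Daily population bounds and duration as stated in the text.
per_day = generations_per_day(5e6, 5e8)   # log2(100)
days = 3000
total_generations = per_day * days

print(f"{per_day:.2f} generations per day")
print(f"{total_generations:.0f} generations in {days} days")
```

Under these assumptions the 100-fold daily regrowth gives about 6.6 doublings per day, which over 3,000 days comes to just under 20,000 generations, consistent with the rounded figure quoted above.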
Sequencing the Four Candidate Genes. An earlier study (34) of two
focal populations used restriction fragment length polymorphism-IS
genomic “fingerprinting” to find mutations that were
caused by IS-element insertions and that had been substituted in
one focal population or the other. Four IS150 insertions were
then characterized with respect to their site of insertion in pykF,
nadR, hokB/sokB, and upstream of pbpA-rodA. These four genes
became the blind candidate genes for this study, and each was
sequenced in the ancestral strain and in clones sampled at
generation 20,000 from all 12 evolved populations. Evolved
clones were chosen at random from single-colony isolates. The
candidate genes vary in length, with the extent of sequencing
shown in Fig. 1. The ancestral sequences for these genes were
deposited in GenBank with accession numbers AY849930–AY849933.
Sequences of Randomly Chosen Genes. We previously chose 36
genes at random from the E. coli genome, and we sequenced
500-bp regions in each gene from the ancestor and 2 clones
randomly sampled from each of the 12 populations after 20,000
generations (35). The ancestral nucleotide sequences for these
regions were deposited in GenBank with accession numbers
AY625099–AY625134. The total length sequenced from each
clone was 18,374 bp. A total of eight mutations was found in the
samples; the precise details of each mutation are provided
elsewhere (table 3 in ref. 35). In cases where one or both clones
had a mutation in a particular gene, that region was sequenced
for three more clones randomly sampled from the same population.
Four mutations, including three in population A−4 and
one in population A+3, were present in all five clones. They are
counted as substitutions in Table 1 of this paper. The other four
mutations were in population A+6, and all were polymorphic,
with two present in four of the five sampled clones and two
others in only one clone. For the analyses in this paper, A+6 is
considered to have two substitutions in the random genes, which
corresponds to the number of cases in which a mutation reached
majority status, and which also equals the summed frequency
across these four polymorphisms.
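The two counting rules described above for the population with only polymorphic mutations can be verified directly. The following is a minimal sketch, not code from the study; the variable names are hypothetical, and the clone counts are those given in the text (two mutations in four of five clones, two mutations in one of five clones).

```python
from fractions import Fraction

clones_sampled = 5
# Number of sampled clones carrying each of the four polymorphic mutations.
carriers = [4, 4, 1, 1]

# Frequency of each mutation among the sampled clones.
frequencies = [Fraction(c, clones_sampled) for c in carriers]

# Rule 1: count mutations that reached majority status (frequency > 1/2).
majority_count = sum(1 for f in frequencies if f > Fraction(1, 2))

# Rule 2: sum the frequencies across all four polymorphisms.
summed_frequency = sum(frequencies)

print(majority_count)      # 2
print(summed_frequency)    # 2
```

Both rules give two substitutions: 4/5 + 4/5 + 1/5 + 1/5 = 2, and exactly two of the four mutations exceed a frequency of one half.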
We thank T. Cooper, D. Hall, F. Moore, E. Ostrowski, D. Schemske, and
T. Schmidt for discussions and comments on our article. This work was
supported by grants from the National Science Foundation.
Woods et al. PNAS | June 13, 2006 | vol. 103 | no. 24 | 9111
1. Simpson, G. G. (1953) The Major Features of Evolution (Columbia Univ. Press,
New York).
2. Harvey, P. H. & Pagel, M. (1991) The Comparative Method in Evolutionary
Biology (Oxford Univ. Press, Oxford).
3. Futuyma, D. J. (1998) Evolutionary Biology (Sinauer, Sunderland, MA).
4. Losos, J. B., Jackman, T. R., Larson, A., de Queiroz, K. & Rodríguez-Schettino,
L. (1998) Science 279, 2115–2118.
5. Rundle, H. D., Nagel, L., Boughman, J. W. & Schluter, D. (2000) Science 287,
306–308.
6. Crandall, K. A., Kelsey, C. R., Imamichi, H., Lane, H. C. & Salzman, N. P.
(1999) Mol. Biol. Evol. 16, 372–382.
7. Reid, S. D., Herbelin, C. J., Bumbaugh, A. C., Selander, R. K. & Whittam, T. S.
(2000) Nature 406, 64–67.
8. Stewart, C. B., Schilling, J. W. & Wilson, A. C. (1987) Nature 330, 401–404.
9. Conway Morris, S. (2003) Life’s Solution (Cambridge Univ. Press, Cambridge,
U.K.).
10. Jones, C. D. & Begun, D. J. (2005) Proc. Natl. Acad. Sci. USA 102, 11373–11378.
11. Lenski, R. E., Rose, M. R., Simpson, S. C. & Tadler, S. C. (1991) Am. Nat. 138,
1315–1341.
12. Elena, S. F. & Lenski, R. E. (2003) Nat. Rev. Genet. 4, 457–469.
13. Lenski, R. E. & Travisano, M. (1994) Proc. Natl. Acad. Sci. USA 91, 6808–6814.
14. Rosenzweig, R. F., Sharp, R. R., Treves, D. S. & Adams, J. (1994) Genetics 137,
903–917.
15. Vasi, F., Travisano, M. & Lenski, R. E. (1994) Am. Nat. 144, 432–456.
16. Travisano, M., Mongold, J. A., Bennett, A. F. & Lenski, R. E. (1995) Science
267, 87–90.
17. Mongold, J. A., Bennett, A. F. & Lenski, R. E. (1996) Evolution 50, 35–43.
18. Travisano, M. & Lenski, R. E. (1996) Genetics 143, 15–26.
19. Bull, J. J., Badgett, M. R., Wichman, H. A., Huelsenbeck, J. P., Hillis, D. M.,
Gulati, A., Ho, C. & Molineux, I. J. (1997) Genetics 147, 1497–1507.
20. Sniegowski, P. D., Gerrish, P. J. & Lenski, R. E. (1997) Nature 387, 703–705.
21. Rainey, P. B. & Travisano, M. (1998) Nature 394, 69–72.
22. Burch, C. L. & Chao, L. (1999) Genetics 151, 921–927.
23. Ferea, T. L., Botstein, D., Brown, P. O. & Rosenzweig, R. F. (1999) Proc. Natl.
Acad. Sci. USA 96, 9721–9726.
24. Wichman, H. A., Badgett, M. R., Scott, L. A., Boulianne, C. M. & Bull, J. J.
(1999) Science 285, 422–424.
25. Cooper, V. S. & Lenski, R. E. (2000) Nature 407, 736–739.
26. Notley-McRobb, L. & Ferenci, T. (2000) Genetics 156, 1493–1501.
27. Cooper, V. S., Schneider, D., Blot, M. & Lenski, R. E. (2001) J. Bacteriol. 183,
2834–2841.
28. Riehle, M. M., Bennett, A. F. & Long, A. D. (2001) Proc. Natl. Acad. Sci. USA
98, 525–530.
29. Cooper, T. F., Rozen, D. E. & Lenski, R. E. (2003) Proc. Natl. Acad. Sci. USA
100, 1072–1077.
30. Zhong, S., Khodursky, A., Dykhuizen, D. E. & Dean, A. M. (2004) Proc. Natl.
Acad. Sci. USA 101, 11719–11724.
31. Crozat, E., Philippe, N., Lenski, R. E., Geiselmann, J. & Schneider, D. (2005)
Genetics 169, 523–532.
32. Lenski, R. E. (2004) Plant Breed. Rev. 24, 225–265.
33. Blattner, F. R., Plunkett, G., Bloch, C. A., Perna, N. T., Burland, V., Riley, M.,
Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., et al. (1997)
Science 277, 1453–1462.
34. Schneider, D., Duperchy, E., Coursange, E., Lenski, R. E. & Blot, M. (2000)
Genetics 156, 477–488.
35. Lenski, R. E., Winkworth, C. L. & Riley, M. A. (2003) J. Mol. Evol. 56, 498–508.
36. Papadopoulos, D., Schneider, D., Meier-Eiss, J., Arber, W., Lenski, R. E. &
Blot, M. (1999) Proc. Natl. Acad. Sci. USA 96, 3807–3812.
37. Rozen, D. E., Schneider, D. & Lenski, R. E. (2005) J. Mol. Evol. 61, 171–180.
38. Gerrish, P. J. & Lenski, R. E. (1998) Genetica 102/103, 127–144.
39. Woods, R. J. (2005) Ph.D. dissertation (Michigan State University, East
Lansing).
40. Kimura, M. (1983) The Neutral Theory of Molecular Evolution (Cambridge
Univ. Press, Cambridge, U.K.).
41. De Visser, J. A. G. M., Zeyl, C. W., Gerrish, P. J., Blanchard, J. L. & Lenski,
R. E. (1999) Science 283, 404–406.
42. Mahillon, J. & Chandler, M. (1998) Microbiol. Mol. Biol. Rev. 62, 725–774.
43. Penfound, T. & Foster, J. W. (1999) J. Bacteriol. 181, 648–655.
44. Grose, J. H., Bergthorsson, U. & Roth, J. R. (2005) J. Bacteriol. 187, 2774–2782.
45. Grose, J. H., Joss, L., Velick, S. F. & Roth, J. R. (2006) Proc. Natl. Acad. Sci.
USA 103, 7601–7606.
46. Pedersen, K. & Gerdes, K. (1999) Mol. Microbiol. 32, 1090–1102.
47. Schneider, D., Duperchy, E., Depeyrot, J., Coursange, E., Lenski, R. E. & Blot,
M. (2002) BMC Microbiol. 2, 18.
48. Begg, K. J. & Donachie, W. D. (1998) J. Bacteriol. 180, 2564–2567.
49. Shendure, J., Porreca, G. J., Reppas, N. B., Lin, X., McCutcheon, J. P.,
Rosenbaum, A. M., Wang, M. D., Zhang, K., Mitra, R. D. & Church, G. M.
(2005) Science 309, 1728–1732.
50. Margulies, M., Egholm, M., Altman, W. E., Attiya, S., Bader, J. S., Bemben,
L. A., Berka, J., Braverman, M. S., Chen, Y.-J., Chen, Z., et al. (2005) Nature
437, 376–380.
9112 | www.pnas.org/cgi/doi/10.1073/pnas.0602917103 Woods et al.

... Two independent fast-growing colonies from each culture were sequenced, revealing mutations in pykF (6 of 12 strains, from 4 of 6 cultures), rpoB or rpoC (4 of 12 strains, 3 of 6 cultures), along with a series of less common mutations (Table 1). Mutations at the pykF [29], rpoB and rpoC [5,6] loci are consistent with other studies of E. coli in minimal medium, and are directly associated with faster growth (Fig 2B). Analysis of the DNA sequencing data using the breseq package [30] demonstrated that mutations in pykF were a consequence of IS5 insertions, transition mutations, and single base deletions. ...
... That stated, we can still learn from the similarities our evolved WT has to previous evolved WT strains and the differences between our evolved deletion mutants and our evolved WT strains. Our evolved WT strains show frequent insertion elements and frameshift mutations disrupting pykF, seen previously in the evolution of WT E. coli B [29], in contrast with other studies of WT E. coli MG1655 cultivated in minimal medium [5,6], which show prevalent rpoB point mutations. Our evolved deletion strains and evolved WT differ in the extent of pykF mutation and in the location of the rpoB and rpoC mutations. ...
... This shared mutation class in independently evolved ∆entC-sup strains suggest that modulating metabolite flow through isochorismate synthase by changing menF expression is important to restore growth in cells lacking entC. Mutations within or upstream of the pykF gene, which encodes pyruvate kinase I, are associated with increased growth rate across a variety of E. coli genetic backgrounds [6,29,69,70], and were identified in our wild-type adaptive evolution experiments (Table 1). However, pykF mutations are rare in our adaptively-evolved mutant strains, in which rpoB/rpoC are by far the most common rescue mutations. ...
Preprint
Cell growth is determined by substrate availability and the cell's metabolic capacity to assimilate substrates into building blocks. Metabolic genes that determine growth rate may interact synergistically or antagonistically, and can accelerate or slow growth, depending on the genetic background and environmental conditions. We evolved a diverse set of Escherichia coli single-gene deletion mutants with a spectrum of growth rates and identified mutations that generally increase growth rate. Despite the metabolic differences between parent strains, mutations that enhanced growth largely mapped to the core transcription machinery, including the β\beta and β\beta' subunits of RNA polymerase (RNAP) and the transcription elongation factor, NusA. The structural segments of RNAP that determine enhanced growth have been previously implicated in antibiotic resistance and in the control of transcription elongation and pausing. We further developed a computational framework to characterize how the transcriptional changes that occur upon acquisition of these mutations affect growth rate across strains. Our experimental and computational results provide evidence for cases in which RNAP mutations shift the competitive balance between active transcription and gene silencing. This study demonstrates that mutations in specific regions of RNAP are a convergent adaptive solution that can enhance the growth rate of cells from distinct metabolic states.
... The increasing scale of experimental evolution, thanks to new levels of throughput and parallelism, provides unprecedented opportunities to understand evolutionary dynamics and mechanisms. Unifying principles have emerged about evolution in a single environment, including that replicate populations show parallel adaptation in recurrent genes, operons or pathways [1][2][3][4][5] , with successive mutations having a saturating effect on fitness, resulting in declining adaptability over time 6,7 . However, laboratory studies have mostly examined static environments, which have relatively temporally constant and spatially uniform conditions and which often select for mutations that exhibit costly fitness tradeoffs if conditions change [8][9][10][11][12] . ...
... Fitness inference. We inferred fitness of mutants by solving equation (1) for all mutants at each timepoint for each condition and replicate: f i+1 = f i e (s−s ) , where f i+1 represents the frequency of a lineage at timepoint i + 1, which changes exponentially from timepoint i if its fitness s differs from mean population fitness s . We measure mean population fitness at each timepoint using ~50 uniquely barcoded neutral strains, which were isolated and identified as neutral, or having fitness equal to that of the ancestor 62 . ...
... We take the average of equation (3) over all neutral lineages at each timepoint i + 1 of each condition and replicate, and plug the result into equation (1) to determine the fitness of each mutant: ...
Article
Full-text available
Evolution in a static laboratory environment often proceeds via large-effect beneficial mutations that may become maladaptive in other environments. Conversely, natural settings require populations to endure environmental fluctuations. A sensible assumption is that the fitness of a lineage in a fluctuating environment is the time average of its fitness over the sequence of static conditions it encounters. However, transitions between conditions may pose entirely new challenges, which could cause deviations from this time average. To test this, we tracked hundreds of thousands of barcoded yeast lineages evolving in static and fluctuating conditions and subsequently isolated 900 mutants for pooled fitness assays in 15 environments. Here we find that fitness in fluctuating environments indeed often deviates from the time average, leading to fitness non-additivity. Moreover, closer examination reveals that fitness in one component of a fluctuating environment is often strongly influenced by the previous component. We show that this environmental memory is especially common for mutants with high variance in fitness across tested environments. We use a simple mathematical model and whole-genome sequencing to propose mechanisms underlying this effect, including lag time evolution and sensing mutations. Our results show that environmental fluctuations impact fitness and suggest that variance in static environments can explain these impacts.
... Environment dictates adaptive trajectories of populations [1][2][3][4][5]. Identical environments tend to drive adaptive parallelism [6][7][8][9][10][11][12][13][14][15][16][17][18], while populations evolving in different environments often exhibit adaptive divergence [4,[19][20][21][22][23][24]. The collateral effects of adaptation, in similar or non-identical, pleiotropic effects of adaptation, in a range of non-synonymous carbon environments, are largely predictable. ...
... Adaptation in identical environments has been studied in the past [6][7][8][9][10][11][12][13][14][15][16][17][18]36], and both parallelism and divergence have been observed depending on the evolutionary context. But how does the likelihood of these possibilities change with minute changes in the environment? ...
... Rights reserved so ubiquitously and so early in glucose adaptation experiments remains a puzzle. It has been suggested that the diminished function of PykF observed in the Lenski adaptation experiments is beneficial insofar as it redirects PEP to increase the import rate of glucose via the phosphotransferase system (PTS) 11 . We propose that loss of flux-mediation through glycolysis resulting from the inactivation of the pykF gene provides an additional benefit of increasing the efficiency of the enzymes in lower glycolysis by increasing enzyme saturation. ...
... Why mutations in such an important enzyme should appear so ubiquitously and so early in glucose adaptation experiments remains a puzzle. It has been suggested that the diminished function of PykF observed in the Lenski adaptation experiments is beneficial insofar as it redirects PEP to increase the import rate of glucose via the phosphotransferase system (PTS) 11 . We propose that loss of flux-mediation through glycolysis resulting from the inactivation of the pykF gene provides an additional benefit of increasing the efficiency of the enzymes in lower glycolysis by increasing enzyme saturation. ...
Article
Full-text available
Adaptive laboratory evolution experiments provide a controlled context in which the dynamics of selection and adaptation can be followed in real-time at the single-nucleotide level. And yet this precision introduces hundreds of degrees-of-freedom as genetic changes accrue in parallel lineages over generations. On short timescales, physiological constraints have been leveraged to provide a coarse-grained view of bacterial gene expression characterized by a small set of phenomenological parameters. Here, we ask whether this same framework, operating at a level between genotype and fitness, informs physiological changes that occur on evolutionary timescales. Using a strain adapted to growth in glucose minimal medium, we find that the proteome is substantially remodeled over 40 000 generations. The most striking change is an apparent increase in enzyme efficiency, particularly in the enzymes of lower-glycolysis. We propose that deletion of metabolic flux-sensing regulation early in the adaptation results in increased enzyme saturation and can account for the observed proteome remodeling.
... To date, most empirical information on fitness landscapes in biological applications has come from studies of RNA (e.g., Schuster 1995;Huynen et al. 1996;Fontana and Schuster 1998), proteins (e.g., Lipman and Wilbur 1991;Martinez et al. 1996;Rost 1997), viruses (e.g., Chao 1999, 2004), bacteria (e.g., Elena and Lenski 2003;Woods et al. 2006), and artificial life (e.g., Lenski et al. 1999;Wilke et al. 2001). The three paradigmatic landscapes -rugged, single-peak, and flat -emphasizing particular features of fitness landscapes have been the focus of most of the earlier theoretical work (reviewed in Kauffman 1993;Gavrilets 2004). ...
Preprint
We study how correlations in the random fitness assignment may affect the structure of fitness landscapes. We consider three classes of fitness models. The first is a continuous phenotype space in which individuals are characterized by a large number of continuously varying traits such as size, weight, color, or concentrations of gene products which directly affect fitness. The second is a simple model that explicitly describes genotype-to-phenotype and phenotype-to-fitness maps allowing for neutrality at both phenotype and fitness levels and resulting in a fitness landscape with tunable correlation length. The third is a class of models in which particular combinations of alleles or values of phenotypic characters are "incompatible" in the sense that the resulting genotypes or phenotypes have reduced (or zero) fitness. This class of models can be viewed as a generalization of the canonical Bateson-Dobzhansky-Muller model of speciation. We also demonstrate that the discrete NK model shares some signature properties of models with high correlations. Throughout the paper, our focus is on the percolation threshold, on the number, size and structure of connected clusters, and on the number of viable genotypes.
... A-WW5 56 13 47 116 In this study, the probability of non-synonymous mutations was higher than that of synonymous mutations, indicating that non-synonymous mutations were dominant in the mutant A-WW5. In general, non-synonymous mutations are mostly beneficial mutations [39,40]. By modifying 21 representative genes in the genome of Saccharomyces cerevisiae, Shen et al. found that most synonymous mutations are harmful [41]. ...
Article
Full-text available
To investigate the effect and mechanism of plasma-activated water (PAW) on Aspergillus niger, PAW was prepared using a needle array–plate dielectric barrier discharge plasma system. The concentrations of long-lived reactive oxygen and nitrogen species (RONS), namely, H2O2, NO2−, and NO3−, in the PAW were 48.76 mg/L, 0.046 mg/L, and 172.36 mg/L, respectively. Chemically activated water (CAW) with the same concentration of long-lived RONS was also prepared for comparison. A. niger A32 was treated with PAW and CAW. After treatment, the treated strains were observed and analyzed with scanning electron microscopy (SEM) and transmission electron microscopy (TEM) to screen probable mutants. The results indicated that the pH, conductivity, and ORP values of PAW were 2.42, 1935 μS/cm, and 517.07 mV, respectively. In contrast, the pH and ORP values of CAW were 6.15 and 301.73 mV, respectively, which differed significantly from those of PAW. In addition, the conductivity of CAW showed no change. SEM and TEM analyses revealed that A. niger A32 treated with CAW exhibited less damage compared with the control. In contrast, A. niger A32 treated with PAW showed significant shrinkage, deformation, and exudate attachment over time. Following PAW treatment, after four passages, a high cellulase-producing stable mutant strain A-WW5 was screened, exhibiting a filter paper enzyme activity of 29.66 U/mL, a cellulose endonuclease activity of 13.79 U/mL, and a β-glucosidase activity of 27.13 U/mL. These values were found to be 33%, 38%, and 2.1% higher than those of the original fungus sample, respectively. In total, 116 SNPs and 61 InDels were present in the genome of the mutant strain A-WW5. The above findings indicate that the impact of PAW on A. niger is not only attributed to long-lasting H2O2, NO2−, and NO3− particles but also to other short-lived active particles; PAW is expected to become a new microbial breeding mutagen.
... The resulting pattern has been called "parallel" or "convergent" [1][2][3][4][5]. Replication occurs at different levels of biological tissue, including genes, pathways, networks, univariate and multivariate phenotypes, ecological traits, and biological communities, and may lead to replicated evolution of species or ecotypes [6][7][8][9]. Indeed, the genetic mechanisms underlying parallel evolution are often unclear in many studies of repeated evolution. ...
Article
Full-text available
Background The parallel evolution of similar traits or species provides strong evidence for the role of natural selection in evolution. Traits or species that evolved repeatedly can be driven by separate de novo mutations or interspecific gene flow. Although parallel evolution has been reported in many studies, documented cases of parallel evolution caused by gene flow are scarce by comparison. Aquilegia ecalcarata and A. kansuensis belong to the genus of Aquilegia , and are the closest related sister species. Mutiple origins of A. ecalcarata have been reported in previous studies, but whether they have been driven by separate de novo mutations or gene flow remains unclear. Results In this study, We conducted genomic analysis from 158 individuals of two repeatedly evolving pairs of A. ecalcarata and A. kansuensis . All samples were divided into two distinct clades with obvious geographical distribution based on phylogeny and population structure. Demographic modeling revealed that the origin of the A. ecalcarata in the Eastern of China was caused by gene flow, and the Eastern A. ecalcarata occurred following introgression from Western A. ecalcarata population. Analysis of Treemix and D -statistic also revealed that a strong signal of gene flow was detected from Western A. ecalcarata to Eastern A. ecalcarata. Genetic divergence and selective sweep analyses inferred parallel regions of genomic divergence and identified many candidate genes associated with ecologically adaptive divergence between species pair. Comparative analysis of parallel diverged regions and gene introgression confirms that gene flow contributed to the parallel evolution of A. ecalcarata . Conclusions Our results further confirmed the multiple origins of A. ecalcarata and highlighted the roles of gene flow. These findings provide new evidence for parallel origin after hybridization as well as insights into the ecological adaptation mechanisms underlying the parallel origins of species.
... Plant species colonizing harsh ecological niches can evolve characteristics that help them adapt to local challenges. Parallel evolution occurs when two lineages independently evolve these characteristics and is considered a signal of positive selection with low fitness costs (Woods et al., 2006;Kreiner et al., 2019;Konečná et al., 2021). For these reasons, identifying these mechanisms for traits of agricultural interest provides valuable knowledge to increase crop tolerance to environmental threats under a changing climate. ...
Preprint
Full-text available
Soil salinization poses a significant threat to crop production impacting one fifth of all cultivated land. The Cape Verde Islands are located 600 km from the coast of Africa and are characterized by high salinity soils and inland water sources. In this study we find that Arabidopsis thaliana plants native to these islands accumulate a metabolite that protects them from salt stress. We partially characterized this metabolite as glucuronyl-mannose. We find that the ability to produce glucuronyl-mannose evolved independently in two different islands from the same archipelago through mutations in the same gene, an alpha glycosidase protein that we named GH38cv. These cases of parallel evolution suggest positive selection towards the increase of salt tolerance with low fitness costs. Indeed, plants carrying derived alleles of GH38cv do not present growth defects or low defenses under normal conditions, but show better germination rates, longer roots and better hydric status than non-mutated plants when exposed to salt stress. These findings provide a knowledge-based method to develop salt resilient crops using natural mechanisms, which could be attractive both to conventional and organic agriculture.
Article
Full-text available
Darwinian evolution has given rise to all the enzymes that enable life on Earth. Mimicking natural selection, scientists have learned to tailor these biocatalysts through recursive cycles of mutation, selection and amplification, often relying on screening large protein libraries to productively modulate the complex interplay between protein structure, dynamics and function. Here we show that by removing destabilizing mutations at the library design stage and taking advantage of recent advances in gene synthesis, we can accelerate the evolution of a computationally designed enzyme. In only five rounds of evolution, we generated a Kemp eliminase—an enzymatic model system for proton transfer from carbon—that accelerates the proton abstraction step >10⁸-fold over the uncatalyzed reaction. Recombining the resulting variant with a previously evolved Kemp eliminase HG3.17, which exhibits similar activity but differs by 29 substitutions, allowed us to chart the topography of the designer enzyme’s fitness landscape, highlighting that a given protein scaffold can accommodate several, equally viable solutions to a specific catalytic problem.
Article
Full-text available
Adaptation in an environment can either be beneficial, neutral or disadvantageous in another. To test the genetic basis of pleiotropic behaviour, we evolved six lines of E. coli independently in environments where glucose and galactose were the sole carbon sources, for 300 generations. All six lines in each environment exhibit convergent adaptation in the environment in which they were evolved. However, pleiotropic behaviour was observed in several environmental contexts, including other carbon environments. Genome sequencing reveals that mutations in global regulators rpoB and rpoC cause this pleiotropy. We report three new alleles of the rpoB gene, and one new allele of the rpoC gene. The novel rpoB alleles confer resistance to Rifampicin, and alter motility. Our results show how single nucleotide changes in the process of adaptation in minimal media can lead to wide-scale pleiotropy, resulting in changes in traits that are not under direct selection.
Article
Full-text available
Twelve populations of the bacterium Escherichia coli were propagated for 2,000 generations in a seasonal environment, which consisted of alternating periods of feast and famine. The mean fitness of the derived genotypes increased by approximately 35% relative to their common ancestor, based on competition experiments in the same environment. The bacteria could have adapted, in principle, by decreasing their lag prior to growth upon transfer to fresh medium (L), increasing their maximum growth rate (V(m)), reducing the the concentration of resource required to support growth at half the maximum rate (K(s)), and reducing their death rate after the limiting resource was exhausted (D). We estimated these parameters for the ancestor and then calculated the opportunity for selection on each parameter. The inferred selection gradients for V(m) and L were much steeper than for K(s) and D. The derived genotypes showed significant improvement in V(m) and L but not in K(s) or D. Also, the numerical yield in pure culture of the derived genotypes was significantly lower than the yield of the common ancestor, but the average cell size was much larger. The independently derived genotypes are somewhat more variable in these life-history traits than in their relative fitnesses, which indicates that they acquired different genetic adaptations to the seasonal environment. Nonetheless, the evolutionary changes in life-history traits exhibit substantial parallelism among the replicate populations.
Fisher’s geometric model of adaptive evolution argues that adaptive evolution should generally result from the substitution of many mutations of small effect because advantageous mutations of small effect should be more common than those of large effect. However, evidence for both evolution by small steps and for Fisher’s model has been mixed. Here we report supporting results from a new experimental test of the model. We subjected the bacteriophage ϕ6 to intensified genetic drift in small populations and caused viral fitness to decline through the accumulation of a deleterious mutation. We then propagated the mutated virus at a range of larger population sizes and allowed fitness to recover by natural selection. Although fitness declined in one large step, it was usually recovered in smaller steps. More importantly, step size during recovery was smaller with decreasing size of the recovery population. These results confirm Fisher’s main prediction that advantageous mutations of small effect should be more common. We also show that the advantageous mutations of small effect are compensatory mutations whose advantage is conditional (epistatic) on the presence of the deleterious mutation, in which case the adaptive landscape of ϕ6 is likely to be very rugged.
Following an environmental change, the course of a population's adaptive evolution may be influenced by environmental factors, such as the degree of marginality of the new environment relative to the organism's potential range, and by genetic factors, including constraints that may have arisen during its past history. Experimental populations of bacteria were used to address these issues in the context of evolutionary adaptation to the thermal environment. Six replicate lines of Escherichia coli (20°C group), founded from a common ancestor, were propagated for 2000 generations at 20°C, a novel temperature that is very near the lower thermal limit at which it can maintain a stable population size in a daily serial transfer (100-fold dilution) regime. Four additional groups (32/20, 37/20, 42/20, and 32-42/20°C groups) of six lines, each with 2000 generation selection histories at different temperatures (32, 37, 42, and daily alternation of 32 and 42°C), were moved to the same 20°C environment and propagated in parallel to ascertain whether selection histories influence the adaptive response in this novel environment. Adaptation was measured by improvement in fitness relative to the common ancestor in direct competition experiments conducted at 20°C. All five groups showed improvement in relative fitness in this environment; the mean fitness of the 20°C group after 2000 generations increased by about 8%. Selection history had no discernible effect on the rate or final magnitude of the fitness responses of the four groups with different histories after 2000 generations. The correlated fitness responses of the 20°C group were measured across the entire thermal niche. There were significant tradeoffs in fitness at higher temperatures; for example, at 40°C the average fitness of the 20°C group was reduced by almost 20% relative to the common ancestor. We also observed a downward shift of 1-2°C in both the upper and lower thermal niche limits for the 20°C selected group. 
These observations are contrasted with previous observations of a markedly greater rate of adaptation to growth near the upper thermal limit (42°C) and a lack of trade-off in fitness at lower temperatures for lines adapted to that high temperature. The evolutionary implications of this asymmetry are discussed.
Natural selection plays a fundamental role in most theories of speciation, but empirical evidence from the wild has been lacking. Here the post-Pleistocene radiation of threespine sticklebacks was used to infer natural selection in the origin of species. Populations of sticklebacks that evolved under different ecological conditions show strong reproductive isolation, whereas populations that evolved independently under similar ecological conditions lack isolation. Speciation has proceeded in this adaptive radiation in a repeatable fashion, ultimately as a consequence of adaptation to alternative environments.
Life's Solution builds a persuasive case for the predictability of evolutionary outcomes. The case rests on a remarkable compilation of examples of convergent evolution, in which two or more lineages have independently evolved similar structures and functions. The examples range from the aerodynamics of hovering moths and hummingbirds to the use of silk by spiders and some insects to capture prey. Going against the grain of Darwinian orthodoxy, this book is a must-read for anyone grappling with the meaning of evolution and our place in the Universe. Simon Conway Morris is the Ad Hominem Professor in the Earth Science Department at the University of Cambridge and a Fellow of St. John's College and the Royal Society. His research focuses on the study of constraints on evolution, and the historical processes that lead to the emergence of complexity, especially with respect to the construction of the major animal body parts in the Cambrian explosion. His previous books include The Crucible of Creation (Getty Center for Education in the Arts, 1999), and he is co-author of Solnhofen (Cambridge, 1990). Hb ISBN (2003) 0-521-82704-3
Motoo Kimura, as founder of the neutral theory, is uniquely placed to write this book. He first proposed the theory in 1968 to explain the unexpectedly high rate of evolutionary change and very large amount of intraspecific variability at the molecular level that had been uncovered by new techniques in molecular biology. The theory - which asserts that the great majority of evolutionary changes at the molecular level are caused not by Darwinian selection but by random drift of selectively neutral mutants - has caused controversy ever since. This book is the first comprehensive treatment of this subject and the author synthesises a wealth of material - ranging from a historical perspective, through recent molecular discoveries, to sophisticated mathematical arguments - all presented in a most lucid manner.
Molecular methods are used widely to measure genetic diversity within populations and determine relationships among species. However, it is difficult to observe genomic evolution in action because these dynamics are too slow in most organisms. To overcome this limitation, we sampled genomes from populations of Escherichia coli evolving in the laboratory for 10,000 generations. We analyzed the genomes for restriction fragment length polymorphisms (RFLP) using seven insertion sequences (IS) as probes; most polymorphisms detected by this approach reflect rearrangements (including transpositions) rather than point mutations. The evolving genomes became increasingly different from their ancestor over time. Moreover, tremendous diversity accumulated within each population, such that almost every individual had a different genetic fingerprint after 10,000 generations. As has been often suggested, but not previously shown by experiment, the rates of phenotypic and genomic change were discordant, both across replicate populations and over time within a population. Certain pivotal mutations were shared by all descendants in a population, and these are candidates for beneficial mutations, which are rare and difficult to find. More generally, these data show that the genome is highly dynamic even over a time scale that is, from an evolutionary perspective, very brief.